Thursday, May 16, 2019
The Relationship Between Life Expectancy at Birth and GDP Per Capita
The relationship between Life Expectancy at birth and GDP per capita (PPP)

Word count: 2907

Section 1 Introduction

In a given country, Life Expectancy at birth is the expected number of years of life from birth. Gross domestic product (GDP) per capita is defined as the market value of all final goods and services produced within a country in one year, divided by the size of the population of that country. The main objective of the present project is to establish the existence of a statistical relation between Life Expectancy at birth (y) and GDP per capita (x).

First, in Section 2 we present the data, taken from an official government source, containing Life Expectancy at birth and GDP per capita of 48 countries in the year 2003. We put this data in a table ordered alphabetically, and at the end of the section we perform some basic statistical analysis of these data. These statistics include the mean, median, modal class and standard deviation, for both Life Expectancy and GDP per capita. In Section 3 we find the regression line which best fits our data and the corresponding correlation coefficient r.

It is natural to ask whether there is a non-linear model which better describes the statistical relation between GDP per capita and Life Expectancy. This question is studied in Section 4, where we check whether a logarithmic relation of type $y = A\ln(x + C) + B$ is a better model. In Section 5 we perform a chi square test to get evidence of the existence of a statistical relation between the variables x and y. In the last section of the project, other than summarizing the obtained results, we present several possible directions for further investigation.
Section 2 Data collection

The following table shows the GDP per capita (PPP) (in US Dollars), denoted $x_i$, and the mean Life Expectancy at birth (in years), denoted $y_i$, in 48 countries in the year 2003. The data has been collected through an online website (2). According to this website it represents official world records.

No.  Country                     GDP per capita (x_i)   Life Expectancy at birth (y_i)
 1   Argentina                   11200                  75.48
 2   Australia                   29000                  80.13
 3   Austria                     30000                  78.17
 4   Bahamas, The                16700                  65.71
 5   Bangladesh                   1900                  61.33
 6   Belgium                     29100                  78.29
 7   Brazil                       7600                  71.13
 8   Bulgaria                     7600                  71.08
 9   Burundi                       600                  43.02
10   Canada                      29800                  79.83
11   Central African Republic     1100                  41.71
12   Chile                        9900                  76.35
13   China                        5000                  72.22
14   Colombia                     6300                  71.14
15   Congo, Republic of the        700                  50.02
16   Costa Rica                   9100                  76.43
17   Croatia                     10600                  74.37
18   Cuba                         2900                  76.08
19   Czech Republic              15700                  75.18
20   Denmark                     31100                  77.01
21   Dominican Republic           6000                  67.96
22   Ecuador                      3300                  71.89
23   Egypt                        4000                  70.41
24   El Salvador                  4800                  70.62
25   Estonia                     12300                  70.31
26   Finland                     27400                  77.92
27   France                      27600                  79.28
28   Georgia                      2500                  64.76
29   Germany                     27600                  78.42
30   Ghana                        2200                  56.53
31   Greece                      20000                  78.89
32   Guatemala                    4100                  65.23
33   Guinea                       2100                  49.54
34   Haiti                        1600                  51.61
35   Hong Kong                   28800                  79.93
36   Hungary                     13900                  72.17
37   India                        2900                  63.62
38   Indonesia                    3200                  68.94
39   Iraq                         1500                  67.81
40   Israel                      19800                  79.02
41   Italy                       26700                  79.04
42   Jamaica                      3900                  75.85
43   Japan                       28200                  80.93
44   Jordan                       4300                  77.88
45   South Africa                10700                  46.56
46   Turkey                       6700                  71.08
47   United Kingdom              27700                  78.16
48   United States               37800                  77.14

Table 1 GDP per capita and Life Expectancy at birth in 48 countries in 2003 (source: reference 2)

Statistical analysis

First we compute some basic statistics of the data collected in the above table.

Basic statistics for the GDP per capita

Mean:

$$\bar{x} = \frac{1}{48}\sum_{i=1}^{48} x_i \approx 12900$$

(The unrounded value is 617500/48 ≈ 12865, which is the value used for the deviations in Table 5.)

In order to compute the median, we need to order the GDP values:

600, 700, 1100, 1500, 1600, 1900, 2100, 2200, 2500, 2900, 2900, 3200, 3300, 3900, 4000, 4100, 4300, 4800, 5000, 6000, 6300, 6700, 7600, 7600, 9100, 9900, 10600, 10700, 11200, 12300, 13900, 15700, 16700, 19800, 20000, 26700, 27400, 27600, 27600, 27700, 28200, 28800, 29000, 29100, 29800, 30000, 31100, 37800.

The median is obtained as the middle value of the two central values (the 24th and the 25th):

$$\text{Median} = \frac{7600 + 9100}{2} = 8350$$

In order to compute the modal class, we need to split the data in classes. If we consider classes of USD 1000 (0-999, 1000-1999, ...) we get the following table of frequencies:

Class          Frequency      Class          Frequency
0-999          2              19000-19999    1
1000-1999      4              20000-20999    1
2000-2999      5              21000-21999    0
3000-3999      3              22000-22999    0
4000-4999      4              23000-23999    0
5000-5999      1              24000-24999    0
6000-6999      3              25000-25999    0
7000-7999      2              26000-26999    1
8000-8999      0              27000-27999    4
9000-9999      2              28000-28999    2
10000-10999    2              29000-29999    3
11000-11999    1              30000-30999    1
12000-12999    1              31000-31999    1
13000-13999    1              32000-32999    0
14000-14999    0              33000-33999    0
15000-15999    1              34000-34999    0
16000-16999    1              35000-35999    0
17000-17999    0              36000-36999    0
18000-18999    0              37000-37999    1

Table 2 Frequencies of GDP per capita with classes of USD 1000

With this choice of classes, the modal class is 2000-2999 (with a frequency of 5). If instead we consider classes of USD 5000 (0-4999, 5000-9999, ...), the modal class is the first one, 0-4999 (with a frequency of 18).
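The GDP statistics above can be reproduced with a few lines of Python. This is only a sketch: the list `GDP` is copied by hand from Table 1, and the helper name `class_frequencies` is a hypothetical name introduced here for illustration, not something used in the original project (which relied on a spreadsheet).

```python
from collections import Counter
from statistics import mean, median

# GDP per capita (USD) of the 48 countries, in the order of Table 1.
GDP = [
    11200, 29000, 30000, 16700, 1900, 29100, 7600, 7600, 600, 29800,
    1100, 9900, 5000, 6300, 700, 9100, 10600, 2900, 15700, 31100,
    6000, 3300, 4000, 4800, 12300, 27400, 27600, 2500, 27600, 2200,
    20000, 4100, 2100, 1600, 28800, 13900, 2900, 3200, 1500, 19800,
    26700, 3900, 28200, 4300, 10700, 6700, 27700, 37800,
]


def class_frequencies(values, width):
    """Count values per class [k*width, (k+1)*width); keys are lower bounds.

    Hypothetical helper, introduced only for this illustration.
    """
    return Counter((v // width) * width for v in values)


gdp_mean = mean(GDP)      # arithmetic mean of the 48 values
gdp_median = median(GDP)  # average of the 24th and 25th ordered values

# Modal class = class with the highest frequency, as (lower bound, count).
modal_1000 = class_frequencies(GDP, 1000).most_common(1)[0]
modal_5000 = class_frequencies(GDP, 5000).most_common(1)[0]

print("mean:", round(gdp_mean, 2))
print("median:", gdp_median)
print("modal class (width 1000):", modal_1000)
print("modal class (width 5000):", modal_5000)
```

Changing the `width` argument shows directly how the modal class depends on the choice of class size, which is the point made in the paragraph above.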
Class          Frequency
0-4999         18
5000-9999      8
10000-14999    5
15000-19999    3
20000-24999    1
25000-29999    10
30000-34999    2
35000-40000    1

Table 3 Frequencies of GDP per capita with classes of USD 5000

Standard deviation:

$$S_x = \sqrt{\frac{1}{48}\sum_{i=1}^{48}(x_i - \bar{x})^2} \approx 11100$$

Basic statistics for the Life Expectancy

Mean:

$$\bar{y} = \frac{1}{48}\sum_{i=1}^{48} y_i = 70.13$$

As before, in order to compute the median, we need to order the Life Expectancies:

41.71, 43.02, 46.56, 49.54, 50.02, 51.61, 56.53, 61.33, 63.62, 64.76, 65.23, 65.71, 67.81, 67.96, 68.94, 70.31, 70.41, 70.62, 71.08, 71.08, 71.13, 71.14, 71.89, 72.17, 72.22, 74.37, 75.18, 75.48, 75.85, 76.08, 76.35, 76.43, 77.01, 77.14, 77.88, 77.92, 78.16, 78.17, 78.29, 78.42, 78.89, 79.02, 79.04, 79.28, 79.83, 79.93, 80.13, 80.93.

The median is obtained as the middle value of the two central values (the 24th and the 25th):

$$\text{Median} = \frac{72.17 + 72.22}{2} = 72.195$$

To find the modal class of Life Expectancy we consider classes of one year (class 41 contains the values from 41.00 to 41.99, and so on). The table of frequencies is the following:

Class  Frequency    Class  Frequency    Class  Frequency    Class  Frequency
41     1            51     1            61     1            71     5
42     0            52     0            62     0            72     2
43     1            53     0            63     1            73     0
44     0            54     0            64     1            74     1
45     0            55     0            65     2            75     3
46     1            56     1            66     0            76     3
47     0            57     0            67     2            77     4
48     0            58     0            68     1            78     5
49     1            59     0            69     0            79     5
50     1            60     0            70     3            80     2

Table 4 Frequencies of Life Expectancy at birth with classes of 1 year

It appears from the table above that there are three modal classes: 71, 78 and 79 (each with a frequency of 5).

Standard deviation:

$$S_y = \sqrt{\frac{1}{48}\sum_{i=1}^{48}(y_i - \bar{y})^2} = 10.31$$

The standard deviations $S_x$ and $S_y$ have been found using the following table of data:

Country                    GDP     Life exp.  (x - x̄)   (x - x̄)^2   (y - ȳ)   (y - ȳ)^2   (x - x̄)(y - ȳ)
Argentina                  11200   75.48      -1665      2770838      5.35      28.64       -8907.60
Australia                  29000   80.13      16135      260351671    10.00     100.03      161374.34
Austria                    30000   78.17      17135      293622504    8.04      64.66       137790.17
Bahamas, The               16700   65.71      3835       14710421     -4.42     19.53       -16947.75
Bangladesh                 1900    61.33      -10965     120222088    -8.80     77.42       96474.63
Belgium                    29100   78.29      16235      263588754    8.16      66.61       132501.29
Brazil                     7600    71.13      -5265      27715838     1.00      1.00        -5271.16
Bulgaria                   7600    71.08      -5265      27715838     0.95      0.90        -5007.93
Burundi                    600     43.02      -12265     150420004    -27.11    734.88      332477.52
Canada                     29800   79.83      16935      286808338    9.70      94.11       164294.71
Central African Republic   1100    41.71      -11765     138405421    -28.42    807.63      334334.75
Chile                      9900    76.35      -2965      8788754      6.22      38.70       -18443.41
China                      5000    72.22      -7865      61851671     2.09      4.37        -16446.81
Colombia                   6300    71.14      -6565      43093754     1.01      1.02        -6638.43
Congo, Republic of the     700     50.02      -12165     147977088    -20.11    404.36      244614.57
Costa Rica                 9100    76.43      -3765      14172088     6.30      39.71       -23721.58
Croatia                    10600   74.37      -2265      5128338      4.24      17.99       -9604.66
Cuba                       2900    76.08      -9965      99292921     5.95      35.42       -59301.73
Czech Republic             15700   75.18      2835       8039588      5.05      25.52       14322.40
Denmark                    31100   77.01      18235      332530421    6.88      47.35       125482.46
Dominican Republic         6000    67.96      -6865      47122504     -2.17     4.70        14887.57
Ecuador                    3300    71.89      -9565      91481254     1.76      3.10        -16845.62
Egypt                      4000    70.41      -8865      78580838     0.28      0.08        -2493.16
El Salvador                4800    70.62      -8065      65037504     0.49      0.24        -3961.73
Estonia                    12300   70.31      -565       318754       0.18      0.03        -102.33
Finland                    27400   77.92      14535      211278338    7.79      60.70       113249.07
France                     27600   79.28      14735      217132504    9.15      83.75       134847.48
Georgia                    2500    64.76      -10365     107424588    -5.37     28.82       55644.86
Germany                    27600   78.42      14735      217132504    8.29      68.74       122175.02
Ghana                      2200    56.53      -10665     113733338    -13.60    184.93      145025.00
Greece                     20000   78.89      7135       50914171     8.76      76.76       62515.17
Guatemala                  4100    65.23      -8765      76817921     -4.90     24.00       42935.50
Guinea                     2100    49.54      -10765     115876254    -20.59    423.90      221629.32
Haiti                      1600    51.61      -11265     126890838    -18.52    342.94      208606.00
Hong Kong                  28800   79.93      15935      253937504    9.80      96.06       156187.00
Hungary                    13900   72.17      1035       1072088      2.04      4.17        2113.54
India                      2900    63.62      -9965      99292921     -6.51     42.36       64856.98
Indonesia                  3200    68.94      -9665      93404171     -1.19     1.41        11488.77
Iraq                       1500    67.81      -11365     129153754    -2.32     5.38        26351.63
Israel                     19800   79.02      6935       48100004     8.89      79.05       61664.52
Italy                      26700   79.04      13835      191418754    8.91      79.41       123290.86
Jamaica                    3900    75.85      -8965      80363754     5.72      32.73       -51288.62
Japan                      28200   80.93      15335      235175004    10.80     116.67      165641.67
Jordan                     4300    77.88      -8565      73352088     7.75      60.08       -66386.23
South Africa               10700   46.56      -2165      4685421      -23.57    555.49      51016.52
Turkey                     6700    71.08      -6165      38002088     0.95      0.90        -5864.06
United Kingdom             27700   78.16      14835      220089588    8.03      64.50       119146.94
United States              37800   77.14      24935      621775004    7.01      49.16       174828.44

Table 5 Statistical analysis of the data collected in Table 1

From the last column we can compute the covariance of the GDP and Life Expectancy:

$$S_{xy} = \frac{1}{48}\sum_{i=1}^{48}(x_i - \bar{x})(y_i - \bar{y}) = 73011.16$$

Section 3 Linear regression

We start our investigation by studying the line of best fit of the data in Table 1. This will allow us to see whether there is a relation of linear dependence between GDP and Life Expectancy. The regression line for the variables x and y is given by the following formula:

$$y - \bar{y} = \frac{S_{xy}}{S_x^2}(x - \bar{x})$$

By using the values found above we get

$$y = 62.51 + 0.5926 \times 10^{-3}\, x$$

The Pearson correlation coefficient is

$$r = \frac{S_{xy}}{S_x S_y} = 0.6380$$

The following graph shows the data in Table 1 together with the computed line of best fit.

Figure 1 Linear regression.

The value of the correlation coefficient, r ≈ 0.6, is evidence of a moderate positive linear correlation between the variables x and y. On the other hand, it is apparent from the graph above that the relation between the variables is not exactly linear. In the next section we will try to speculate on the reason for this non-linear relation and on what type of statistical relation can exist between GDP per capita and Life Expectancy.

Section 4 Logarithmic regression

As explained in reference 3, the main reason for this non-linear relationship between GDP per capita and Life Expectancy is that people consume both needs and wants. People consume needs in order to survive.
Once a person's needs are satisfied, they can then spend the rest of their money on non-necessities. If everyone's needs are satisfied, then any increase in GDP per capita would barely affect Life Expectancy.

There are various other reasons that one can think of to explain the non-linear relationship between GDP per capita and Life Expectancy. For example, the GDP per capita is the average wealth, while one should also consider how the global wealth is distributed among the population of a given country. With this in mind, to have a more complete picture of the statistical relation between the economy of a country and Life Expectancy, one should also take into consideration other economic parameters, such as the inequality index, that describe the distribution of wealth among the population. Moreover, the wealth of the population is not the only factor affecting Life Expectancy: one should also take into account, for example, the governmental policies of a nation towards health and poverty. For example Cuba, a country with a very low GDP per capita ($2900), has a relatively high Life Expectancy (76.08 years), mostly due to the fact that the government provides basic needs and health assistance to the population. Some of these aspects will be discussed in the next section.

Let us try to guess what could be a reasonable relation between the variables x (GDP per capita) and y (Life Expectancy). According to the above observations we can consider the total GDP as formed by two values, $x = x_n + x_w$, where $x_n$ denotes the part of wealth spent on necessities and $x_w$ denotes the part spent on wants. It is reasonable to make the following assumptions:

1. The Life Expectancy depends linearly on the part of wealth spent on necessities:

$$y = a x_n + b \qquad (1)$$

2. The fraction $x_n/x$ of wealth spent on necessities is close to 1 when x is close to 0 (if one has a small amount of money, he/she will spend most of it on necessities), and is close to 0 when x is very large (if one has a very large amount of money, he/she will spend only a small fraction of it on necessities).

3. We make the following choice for $x_n$ as a function of x, satisfying the above requirements:

$$x_n = \frac{\ln(cx + 1)}{c} \qquad (2)$$

where c is some positive parameter.

This function is chosen mainly for two reasons. On one hand it satisfies the requirements described in assumption 2. Indeed, the graph of the ratio

$$\frac{x_n}{x} = f(x) = \frac{\ln(cx + 1)}{cx}$$

is shown in Figure 2.

Figure 2 Graph of the function $y = \ln(cx + 1)/(cx)$, for c = 0.5 (blue), 1 (black) and 10 (red).

The blue, black and red lines correspond respectively to the choice of parameter c = 0.5, 1 and 10. As it appears from the graph, in all cases f(x) tends to 1 as x approaches 0, and f(x) is small for large values of x. On the other hand, the function chosen allows us to use the statistical tools at our disposal in the Excel software to derive some conclusions about the statistical relation between x and y. This is what we are going to do next.

First we want to find the relation between x and y under the above assumptions. Putting together equations (1) and (2) we get

$$y = \frac{a}{c}\ln(cx + 1) + b \qquad (3)$$

which shows that there is a logarithmic dependence between x and y. Since $\ln(cx + 1) = \ln c + \ln(x + 1/c)$, equation (3) can be rewritten in the following equivalent form if we denote $A = a/c$, $B = b + (a/c)\ln c$, $C = 1/c$:

$$y = A\ln(x + C) + B \qquad (4)$$

We can now study the curve of type (4) which best fits the data in Table 1, using the statistical tools of the Excel spreadsheet. Unfortunately Excel allows us to plot only a curve of type $y = A\ln(x) + B$ (i.e. an equation of type (4) where C is equal to 0). For this choice of C, we obtain the logarithmic curve of best fit shown in Figure 3, together with the corresponding value of the correlation coefficient $r^2$.
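A fit of type (4) for a fixed C is just an ordinary least-squares line in the transformed variable $u = \ln(x + C)$, so it can be sketched without a spreadsheet. The code below is a minimal illustration, assuming the data pairs are copied by hand from Table 1; the function names `fit_line` and `fit_log` are hypothetical names chosen here, and the project itself used Excel's built-in trendline rather than this code.

```python
import math

# (GDP per capita in USD, Life Expectancy in years), copied from Table 1.
DATA = [
    (11200, 75.48), (29000, 80.13), (30000, 78.17), (16700, 65.71),
    (1900, 61.33), (29100, 78.29), (7600, 71.13), (7600, 71.08),
    (600, 43.02), (29800, 79.83), (1100, 41.71), (9900, 76.35),
    (5000, 72.22), (6300, 71.14), (700, 50.02), (9100, 76.43),
    (10600, 74.37), (2900, 76.08), (15700, 75.18), (31100, 77.01),
    (6000, 67.96), (3300, 71.89), (4000, 70.41), (4800, 70.62),
    (12300, 70.31), (27400, 77.92), (27600, 79.28), (2500, 64.76),
    (27600, 78.42), (2200, 56.53), (20000, 78.89), (4100, 65.23),
    (2100, 49.54), (1600, 51.61), (28800, 79.93), (13900, 72.17),
    (2900, 63.62), (3200, 68.94), (1500, 67.81), (19800, 79.02),
    (26700, 79.04), (3900, 75.85), (28200, 80.93), (4300, 77.88),
    (10700, 46.56), (6700, 71.08), (27700, 78.16), (37800, 77.14),
]


def fit_line(u, v):
    """Least-squares line v = slope*u + intercept, plus Pearson's r.

    Uses the population (divide-by-n) variances and covariance,
    matching the formulas for S_x, S_y and S_xy in the text.
    """
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    var_u = sum((a - mean_u) ** 2 for a in u) / n
    var_v = sum((b - mean_v) ** 2 for b in v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / n
    slope = cov / var_u
    intercept = mean_v - slope * mean_u
    return slope, intercept, cov / math.sqrt(var_u * var_v)


def fit_log(data, C=0.0):
    """Fit y = A*ln(x + C) + B by regressing y on u = ln(x + C)."""
    u = [math.log(x + C) for x, _ in data]
    v = [y for _, y in data]
    return fit_line(u, v)  # (A, B, r)


xs = [x for x, _ in DATA]
ys = [y for _, y in DATA]

slope, intercept, r_lin = fit_line(xs, ys)  # linear model of Section 3
A, B, r_log = fit_log(DATA, C=0.0)          # logarithmic model, C = 0

print("linear:      y = %.4g + %.4g x,   r = %.4f" % (intercept, slope, r_lin))
print("logarithmic: y = %.4g ln(x) + %.4g, r = %.4f, r^2 = %.4f"
      % (A, B, r_log, r_log ** 2))
```

Calling `fit_log(DATA, C=10)` or `fit_log(DATA, C=100)` repeats the fit for other positive values of C, which is the same shift of the independent variable that one would otherwise do by hand in the spreadsheet.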
Figure 3 Logarithmic regression.

To find the analogous curve of best fit for a given value of C (positive, arbitrarily chosen) we can simply add C to all the x values and redo the same plot as for C = 0 with the new independent variable $x_1 = x + C$. We omit the graphs containing the curve of best fit for all the possible values of C and simply report, in the following table, the correlation coefficient r for some appropriately chosen values of C.

C       r
0.00    0.77029
0.01    0.77029
0.1     0.77028
1       0.77025
10      0.76991
100     0.76666

Table 8 Correlation coefficient $r^2$ for the curve of best fit $y = A\ln(x + C) + B$, for some values of C.

The above data indicate that the optimal choice of C is between 0.00 and 0.01, since in this case r is the closest to 1. Comparing the results obtained with the linear regression (r ≈ 0.6) and the logarithmic regression (r ≈ 0.8), we can conclude that the latter appears to be a better model to describe the relation between GDP per capita and Life Expectancy, since the value of the correlation coefficient is significantly bigger.

In Figure 3, one data point is very far from the curve of best fit, so we may decide to discuss it separately and do the regression without it. This data point corresponds to South Africa, with a GDP per capita of 10700 and a Life Expectancy at birth of 46.56 (much lower than any other country with a comparable GDP). It is reasonable to think that this anomaly is due to the peculiar history of South Africa which, after the end of apartheid, had to face uncontrolled violence. It is therefore difficult to fit this country in a statistical model and we can decide to remove it from our data. Doing so, we get the following new plot.

Figure 4 Logarithmic regression for the data in Table 1 excluding South Africa.

The new, higher value of the correlation coefficient r indicates that, excluding the anomalous data of South Africa, there is a strong positive logarithmic correlation between GDP per capita and Life Expectancy at birth.

Section 5 Chi square test ($\chi^2$ test)

We conclude our investigation by making a chi square test. This will allow us to confirm the existence of a relation between the variables x and y. For this purpose we formulate the following null and alternative hypotheses.

H0: GDP and Life Expectancy are not correlated.
H1: GDP and Life Expectancy are correlated.

* Observed frequency

The observed frequencies are obtained directly from the data in Table 1, by counting the countries whose GDP per capita is below or above the mean $\bar{x}$ and whose Life Expectancy is below or above the mean $\bar{y}$. Note that 30 countries have a GDP per capita below $\bar{x}$ and 15 countries have a Life Expectancy below $\bar{y}$, so the rows of the table correspond to y and the columns to x.

                Below mean x   Above mean x   Total
Below mean y    14             1              15
Above mean y    16             17             33
Total           30             18             48

Table 6 Observed frequencies for the chi square test

* Expected frequency

The expected frequencies are obtained by the formula

$$f_e = \frac{(\text{column total}) \times (\text{row total})}{\text{total sum}}$$

                Below mean x   Above mean x   Total
Below mean y    9.375          5.625          15
Above mean y    20.625         12.375         33
Total           30             18             48

Table 7 Expected frequencies for the chi square test.

We can now calculate the chi square variable:

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 8.85$$

In order to decide whether or not we accept the alternative hypothesis H1, we need to find the number of degrees of freedom (df) and to fix a significance level α. The number of degrees of freedom is

df = (number of rows - 1) × (number of columns - 1) = 1

The corresponding critical values of chi square, depending on the choice of the significance level α, are given in the following table (see reference 4):

df    α = 0.10   α = 0.05   α = 0.025   α = 0.01   α = 0.005
1     2.706      3.841      5.024       6.635      7.879

Table 9 Critical values of chi square with one degree of freedom.

Since the value of chi square is greater than any of the above critical values, we conclude that even at the significance level α = 0.005 we can reject H0 and accept the alternative hypothesis H1: GDP and Life Expectancy are related.

The above test shows that there is some relation between the two variables x (GDP per capita) and y (Life Expectancy at birth). Our purpose is to further investigate this relation.
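The chi square computation above is short enough to check by hand, but it can also be sketched in code. The following is an illustrative sketch only: the 2 × 2 list `observed` is typed in from the observed-frequency table, and `chi_square` is a hypothetical helper name, not a function from any statistics library.

```python
# Observed frequencies, row by row, as in the observed-frequency table
# (margins are recomputed below rather than typed in).
observed = [
    [14, 1],
    [16, 17],
]


def chi_square(obs):
    """Return (statistic, expected table, degrees of freedom).

    Expected frequency of a cell = row total * column total / grand total.
    """
    row_totals = [sum(row) for row in obs]
    col_totals = [sum(col) for col in zip(*obs)]
    total = sum(row_totals)
    expected = [[rt * ct / total for ct in col_totals] for rt in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(obs, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(obs) - 1) * (len(obs[0]) - 1)
    return stat, expected, df


chi2, expected, df = chi_square(observed)

# Critical values for df = 1, copied from the table in the text.
critical = {0.10: 2.706, 0.05: 3.841, 0.025: 5.024, 0.01: 6.635, 0.005: 7.879}

print("expected:", expected)
print("chi square = %.2f with df = %d" % (chi2, df))
for alpha, value in critical.items():
    print("alpha = %s: reject H0? %s" % (alpha, chi2 > value))
```

The statistic does not depend on which variable is placed on the rows and which on the columns, so transposing `observed` gives the same value.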
Section 6 Conclusions

Interpretation of results

Our study of the statistical relation between GDP per capita and Life Expectancy brings us to the following conclusions. As the chi square test shows, there is definitely some statistical relation between the two variables (at a significance level α = 0.005). The study of linear regression shows that there is a moderate positive linear correlation between the two variables, with a correlation coefficient r ≈ 0.64. This linear model can be greatly improved by replacing the linear dependence with a different type of relation. In particular we considered a logarithmic relation between the variables x (GDP) and y (Life Expectancy). With this new relation we get a correlation coefficient r ≈ 0.77. In fact, if we remove the data related to the anomalous country of South Africa (which should be discussed separately and does not fit well in our statistical analysis), we get an even higher correlation coefficient. This is evidence of a strong positive logarithmic dependence between x and y.

Validity and areas of improvement

Of course, one possible improvement of this project would be to consider a much more extended collection of data on which to do the statistical analysis. For example, one could consider a larger number of countries or data related to different years (other than 2003), and one could even think of studying data referring to local regions within a single country. All this can be found in the literature, but we decided to restrict ourselves to the data presented in this project because we considered it sufficient as an application of the mathematical and statistical tools used in the project.

A second, probably more interesting, possible improvement of the project would be to consider other economic factors that can affect the Life Expectancy at birth of a country.
Indeed, the GDP per capita is just a measure of the average wealth of a country and it does not take into account the distribution of the wealth. There are, however, several economic indices that measure the dispersion of wealth in the population and could be considered, together with the GDP per capita, as a factor influencing Life Expectancy. For example, it would be interesting to study a linear regression model in which the dependent variable y is the Life Expectancy and with two (or more) independent variables $x_i$, one of which should be the GDP per capita and another could be, for example, the Gini inequality index (measuring the dispersion of wealth in a country). This would have been very interesting but, perhaps, it would have been out of context in a project studying GDP per capita and Life Expectancy.

Probably the most important direction of improvement of the present project is related to the somewhat arbitrary choice of the logarithmic model used to describe the relation between GDP and Life Expectancy. Our choice of the function $y = A\ln(x + C) + B$ was mainly dictated by the statistical package at our disposal in the Excel software used in this project. Nevertheless we could have considered different, and probably more appropriate, choices of functional relations between the variables x and y. For example, we could have considered a mixed linear and hyperbolic regression model of type $y = A + Bx + C/(x + D)$, as is sometimes considered in the literature (see reference 4).

Bibliography

1. Gapminder World. Web. 4 Jan. 2012. <http://www.gapminder.org>.
2. GDP per Capita (PPP) vs. Infant Mortality Rate. Index Mundi Country Facts. Web. 4 Jan. 2012. <http://www.indexmundi.com/g/correlation.aspx?v1=67>.
3. Life Expectancy at Birth versus GDP per Capita (PPP). Statistical Consultants Ltd. Web. 4 Jan. 2012. <http://www.statisticalconsultants.co.nz/weeklyfeatures/WF6.html>.
4. Table: Chi-Square Probabilities. Faculty & Staff Webpages. Web. 4 Jan. 2012. <http://people.richland.edu/james/lecture/m170/tbl-chi.html>.