PREDICTIVE MODEL & EDA ON WHO LIFE EXPECTANCY DATA
Most people want a long, healthy life and try to keep fit in order to increase their life expectancy. However, life expectancy depends on many factors, and some of them, such as birthplace, gender, and family background, are beyond a person's control. Others, such as addictions and BMI, can be changed through personal choices. This project aims to predict the average life expectancy in each nation, so that governments and the WHO can take action to improve it.
The dataset describes the following factors, which affect average life expectancy:
1 Year
2 Status
3 Country
4 Adult_Mortality
5 Infant_Deaths
6 Alcohol
7 Percentage_Exp
8 HepatitisB
9 Measles
10 BMI
11 Under_Five_Deaths
12 Polio
13 Tot_Exp
14 Diphtheria
15 HIV/AIDS
16 GDP
17 Population
18 thinness_1to19_years
19 thinness_5to9_years
20 Income_Comp_Of_Resources
21 Schooling
There is a major difference in life expectancy between developing and developed nations. This can be seen in the chart below.
The following countries have the highest life expectancy on average.
The correlation matrix of all features is also plotted, as shown in the figure below.
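The two exploratory steps above (comparing developed and developing nations, and computing the correlation matrix) can be sketched with pandas. This is a minimal sketch, not the project's actual code: the column name `Life_Expectancy` is assumed, and the four rows are toy values invented purely for illustration, not taken from the WHO dataset.

```python
import pandas as pd

# Toy rows invented for illustration only; NOT values from the WHO dataset.
# "Life_Expectancy" is an assumed name for the target column.
df = pd.DataFrame({
    "Status": ["Developed", "Developed", "Developing", "Developing"],
    "Schooling": [16.0, 15.0, 9.0, 10.0],
    "Adult_Mortality": [60.0, 70.0, 250.0, 230.0],
    "Life_Expectancy": [81.0, 80.0, 62.0, 64.0],
})

# Average life expectancy for developed vs. developing nations.
mean_by_status = df.groupby("Status")["Life_Expectancy"].mean()

# Pairwise Pearson correlations between the numeric columns.
corr = df.select_dtypes("number").corr()

print(mean_by_status)
print(corr.round(2))
```

On the real data, `corr` could be passed to a heatmap plotting function to produce a correlation matrix diagram.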
After that, the data is explored and normalized using utilities from Keras and scikit-learn, and finally split into training and testing datasets, with 20% of the total data reserved for testing.
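The 80/20 split and the normalization step might look like the sketch below, assuming scikit-learn's `train_test_split` and `StandardScaler`. The feature matrix here is random placeholder data and the choice of scaler is an assumption. Note that this sketch fits the scaler on the training portion only, a common precaution against information leaking from the test set, which differs slightly from the order described above.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Random placeholder data standing in for the real features and target.
rng = np.random.default_rng(42)
X = rng.normal(loc=10.0, scale=3.0, size=(100, 5))
y = rng.normal(loc=70.0, scale=8.0, size=100)

# Reserve 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training rows only, then apply it to both sets.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.shape, X_test_scaled.shape)
```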
The following models are used to predict life expectancy.
1) Linear Regression Model
2) Mixed Effect Model
3) Deep Neural Networks
Linear Regression Model
First, linear regression is used to predict life expectancy. Performance is measured in terms of RMSE (Root Mean Squared Error). The RMSE for linear regression was 4.9438. The chart below shows the relation between residuals and predicted values.
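A minimal sketch of this step, assuming scikit-learn's `LinearRegression`, is given here. The data is synthetic (a known linear relation plus noise), so the printed RMSE is unrelated to the 4.9438 reported above. The `residuals` array is the quantity that would be plotted against the predicted values.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data: y = 70 + 3*x1 - 2*x2 + noise (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 70.0 + 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
predictions = model.predict(X_test)

# RMSE is the square root of the mean squared error on the test set.
rmse = np.sqrt(mean_squared_error(y_test, predictions))

# Residuals (true minus predicted) for a residuals-vs-predicted plot.
residuals = y_test - predictions

print(f"RMSE: {rmse:.4f}")
```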
Mixed Effect Model
The mixed effect model proved very effective for this dataset, with an RMSE of 1.556047. The chart below shows the relation between residuals and predicted values.
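The write-up does not say which library or grouping variable was used for the mixed effect model. The sketch below is therefore only an assumption: it uses `mixedlm` from statsmodels with a random intercept per `Country`, on synthetic data in which each country has its own baseline. The column name `Life_Expectancy` is also assumed. The RMSE values are computed in-sample and are unrelated to the figures reported above.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic panel: 20 countries x 15 years, each country with its own
# baseline offset. Illustrative only; not the WHO data.
rng = np.random.default_rng(0)
rows = []
for i in range(20):
    offset = rng.normal(scale=5.0)  # country-specific baseline shift
    for _ in range(15):
        schooling = rng.normal()
        life = 70.0 + offset + 2.0 * schooling + rng.normal(scale=1.0)
        rows.append({"Country": f"C{i}", "Schooling": schooling,
                     "Life_Expectancy": life})
df = pd.DataFrame(rows)
y = df["Life_Expectancy"].to_numpy()

# Pooled ordinary least squares, ignoring the country grouping.
slope, intercept = np.polyfit(df["Schooling"], y, 1)
rmse_pooled = np.sqrt(np.mean((y - (intercept + slope * df["Schooling"])) ** 2))

# Mixed model: fixed effect for Schooling, random intercept per country.
result = smf.mixedlm("Life_Expectancy ~ Schooling", data=df,
                     groups=df["Country"]).fit()

# fittedvalues include the predicted random effect of each country.
rmse_mixed = np.sqrt(np.mean((y - result.fittedvalues.to_numpy()) ** 2))

print(f"pooled RMSE: {rmse_pooled:.3f}, mixed RMSE: {rmse_mixed:.3f}")
```

Because rows from the same country share a baseline, the random intercept absorbs most of the variation that the pooled regression treats as noise.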
Deep Neural Networks
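The neural network is a stack of fully connected layers. Assuming a Keras Sequential model, it is designed as follows.
from tensorflow import keras
from tensorflow.keras import layers
model = keras.Sequential([
    layers.Dense(128, activation='relu'),
    layers.Dense(64, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(16, activation='relu'),
    layers.Dense(1),  # single output: predicted life expectancy
])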
The table below shows some predictions from the neural network alongside the actual life expectancy values.
Index  Predictions  True
568      65.390266  64.5
569      66.062790  59.2
570      73.869949  73.3
571      71.569748  78.0
The RMSE for the DNN is 3.1332.
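As a worked illustration of how RMSE is computed, the sketch below applies the formula to the four prediction rows listed above. It covers only those four rows, so the result (about 4.73) is not expected to match the RMSE reported for the full test set.

```python
import math

# The four (prediction, true value) pairs listed in the table above.
predictions = [65.390266, 66.062790, 73.869949, 71.569748]
actual = [64.5, 59.2, 73.3, 78.0]

# Square each error, average the squares, then take the square root.
squared_errors = [(p - a) ** 2 for p, a in zip(predictions, actual)]
rmse = math.sqrt(sum(squared_errors) / len(squared_errors))

print(f"RMSE over the four sample rows: {rmse:.3f}")
```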
In this particular case, the mixed effect model (RMSE 1.556047) is superior to both ordinary linear regression (RMSE 4.9438) and the neural network architecture used here (RMSE 3.1332), since it takes into account the dependence within the data.