Goodness-of-fit of the Model

Measures of goodness of fit are statistical tools used to assess how closely the fitted response obtained from the postulated model agrees with the observed data. The fit is good when the fitted and observed values agree closely.

Likelihood-Ratio Test

The likelihood-ratio test (LRT) is the most common test for assessing the overall goodness of fit of a logistic regression model. It tests the joint significance of a set of explanatory variables and is appropriate for a wide variety of statistical models. The test is based on the ratio of the maximized value of the likelihood function for the full model (Lful) to the maximized value of the likelihood function for the reduced model (Lred).

The likelihood-ratio test statistic is given by:

$$LRT = -2\,(l_{red} - l_{ful}),$$

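where $l_{red}$ and $l_{ful}$ are the maximized log-likelihoods of the reduced and full models, respectively (Hosmer and Lemeshow, 2011). Under the null hypothesis that the reduced model is adequate, LRT approximately follows a $\chi^2$ distribution with degrees of freedom equal to the number of parameters omitted from the full model.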
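As an illustration of the formula above, the following minimal sketch computes the LRT statistic and an approximate p-value from two log-likelihoods. The function names, the hand-written chi-square tail routine, and the log-likelihood values are assumptions made for this example and do not come from the study.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma function.
    Adequate for illustration; very small p-values lose precision.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (a + n)
        total += term
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def likelihood_ratio_test(loglik_reduced, loglik_full, df):
    """Return (LRT, p-value), where LRT = -2 * (l_red - l_ful)."""
    lrt = -2.0 * (loglik_reduced - loglik_full)
    return lrt, chi2_sf(lrt, df)


# Hypothetical values: the full model adds two predictors to the reduced model.
lrt, p_value = likelihood_ratio_test(loglik_reduced=-120.5, loglik_full=-115.0, df=2)
print(f"LRT = {lrt:.3f}, p-value = {p_value:.4f}")
```

A small p-value suggests that the additional predictors in the full model improve the fit beyond what chance alone would explain.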

Hosmer-Lemeshow Test

The Hosmer-Lemeshow test evaluates the goodness of fit of the model by ordering subjects by their predicted probabilities, dividing them into 10 groups of approximately equal size, and comparing the observed number of events in each group with the number predicted by the logistic regression model. The test statistic is similar to a $\chi^2$ statistic. Because the observations are partitioned into groups of approximately equal size, groups with very low observed and expected frequencies are less likely to occur. A smaller difference between the observed and predicted counts indicates better model fit. The Hosmer-Lemeshow test statistic is given by:

$$\hat{C} = \sum_{k=1}^{g} \frac{(O_k - E_k)^2}{V_k},$$

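where $E_k = n_k p_k$, $V_k = n_k p_k (1 - p_k)$, $g$ is the number of groups, $O_k$ is the observed number of events in the $k$th group, $n_k$ is the number of subjects in the $k$th group, and $p_k$ is the average predicted probability of an event in that group. This test statistic has approximately a $\chi^2$ distribution with $(g - 2)$ degrees of freedom (Agresti, 1996).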
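The grouping and summation described above can be sketched as follows. This is a minimal illustration, assuming that subjects are sorted by fitted probability and split into g groups of nearly equal size. The function name and the simulated data are hypothetical, and tied probabilities are not given any special treatment.

```python
import random


def hosmer_lemeshow(y, p, g=10):
    """Return (C_hat, degrees of freedom) for 0/1 outcomes y and fitted probabilities p."""
    order = sorted(range(len(y)), key=lambda i: p[i])  # sort subjects by fitted probability
    n = len(order)
    stat = 0.0
    for k in range(g):
        idx = order[k * n // g:(k + 1) * n // g]  # k-th group of nearly equal size
        n_k = len(idx)
        p_k = sum(p[i] for i in idx) / n_k        # average fitted probability in the group
        o_k = sum(y[i] for i in idx)              # observed number of events
        e_k = n_k * p_k                           # expected number of events
        v_k = n_k * p_k * (1.0 - p_k)             # variance term in the denominator
        stat += (o_k - e_k) ** 2 / v_k
    return stat, g - 2


# Simulated (hypothetical) data: outcomes are drawn from the stated probabilities,
# so the "model" is correct by construction and the statistic should be moderate.
rng = random.Random(0)
p = [rng.uniform(0.05, 0.95) for _ in range(200)]
y = [1 if rng.random() < p_i else 0 for p_i in p]
c_hat, df = hosmer_lemeshow(y, p)
print(f"C_hat = {c_hat:.3f} on {df} degrees of freedom")
```

The resulting statistic would then be compared with a chi-square distribution with g - 2 degrees of freedom. A large value relative to that distribution indicates lack of fit.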

3.5.5 Statistical tests of individual parameters

Wald test

The Wald test is commonly used to test the significance of the individual logistic regression coefficient of each independent variable, that is, to test the null hypothesis that a particular logit (effect) coefficient is zero, $H_0: \beta_i = 0$, against the alternative $H_1: \beta_i \neq 0$. The Wald test statistic is:

$$W = \frac{\hat{\beta}_i^2}{\mathrm{var}(\hat{\beta}_i)}$$

For large samples, this test statistic has an approximate chi-square distribution with one degree of freedom (Menard, 2002). The likelihood-ratio test and the score test can also be used to test the null hypothesis $H_0: \beta_i = 0$. All three tests exploit the large-sample normality of maximum likelihood estimators. For small to moderate sample sizes, the likelihood-ratio test is usually more reliable than the Wald test (Agresti, 1996).
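The Wald statistic for a single coefficient can be sketched as below. This is a minimal example that assumes the coefficient estimate and its standard error are already available from a fitted model. The function name and the numeric inputs are hypothetical. The p-value uses the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable.

```python
import math


def wald_test(beta_hat, std_error):
    """Return (W, p-value) for H0: beta_i = 0, with W = beta_hat^2 / var(beta_hat)."""
    w = beta_hat ** 2 / std_error ** 2
    # Chi-square(1) upper tail: P(X > w) = P(|Z| > sqrt(w)) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(w / 2.0))
    return w, p_value


# Hypothetical estimate and standard error for one predictor.
w, p_wald = wald_test(beta_hat=0.8, std_error=0.4)
print(f"W = {w:.3f}, p-value = {p_wald:.4f}")
```

Because the Wald statistic relies on the estimated standard error, it can behave poorly when the sample is small or the estimate is large, which is consistent with the preference for the likelihood-ratio test noted above.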