Institut für Arbeitsmarkt- und Berufsforschung





Missing data in the record linkage of process and survey data



"To compare different missing data techniques, in this paper I use a survey in which participants were, among other things, asked for permission to combine the survey with administrative data (record linkage). For those who refused permission, I set their survey answers to missing, creating pseudo-missing data generated by an empirically relevant but unknown mechanism (in contrast to the statistical simulation of a missing data process). OLS regression is performed using Complete Case Analysis (CCA), Multiple Imputation (MI) and two versions of Heckman's Sample Selection Model (SSM) to correct for the pseudo-missing data. The results are compared to a regression based on the complete data set (benchmark), which gives the 'true' regression parameters. Results: All missing data techniques under analysis show only small deviations from the benchmark. If only one independent variable contains missing values, MI performs best. If the dependent variable has missing information, CCA and the Two-Step SSM perform better than MI. If missing data is a problem in many or all independent variables, all techniques except for the Maximum Likelihood SSM perform equally well." (Author's abstract, IAB-Doku)
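
The benchmark-versus-CCA comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's analysis: the data are simulated rather than taken from the survey, the consent mechanism (consent more likely for larger x) is a hypothetical assumption chosen only for illustration, and MI and the Heckman models are omitted. The names ols_simple, consent, benchmark and cca are likewise illustrative.

```python
import math
import random


def ols_simple(xs, ys):
    """Return (intercept, slope) of an OLS fit with a single regressor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 5000

# Simulated "survey": y = 1 + 2*x + noise (assumed data-generating process).
x = [rng.gauss(0, 1) for _ in range(n)]
y = [1.0 + 2.0 * xi + rng.gauss(0, 1) for xi in x]

# Hypothetical linkage consent: probability of consent rises with x.
consent = [rng.random() < 1 / (1 + math.exp(-xi)) for xi in x]

# Benchmark: regression on the complete data set.
benchmark = ols_simple(x, y)

# Complete Case Analysis: treat refusers' answers as missing and drop them.
x_cc = [xi for xi, c in zip(x, consent) if c]
y_cc = [yi for yi, c in zip(y, consent) if c]
cca = ols_simple(x_cc, y_cc)

print("benchmark (intercept, slope):", benchmark)
print("CCA       (intercept, slope):", cca)
print("cases kept by CCA:", len(x_cc), "of", n)
```

Because consent in this sketch depends only on the regressor, CCA should stay close to the benchmark; a mechanism that depends on the outcome itself would be the case where selection models become relevant.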


Bibliographical information

Krug, Gerhard (2009): Missing data in the record linkage of process and survey data: An empirical comparison of selected missing data techniques. (IAB-Discussion Paper, 07/2009), Nürnberg, 29 p.