Dear list,

I have some problems with time-series data and missing values of time-invariant 
information such as sex or birth date.

Assume a data frame (d) structured like this:

id      birth           sex     year of observation
1       NA              NA      2006
1       01.01.1976      male    2007
1       NA              NA      2008
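
To make the example reproducible, the data frame could be built roughly as 
follows. This is only a sketch: the column name `year` (for "year of 
observation") is an assumption, and birth is stored as text in the 
day.month.year format that the "%d.%m.%Y" pattern in the loop further down 
expects.

```r
# Hypothetical reconstruction of the example data; the column names are assumptions.
d <- data.frame(
  id    = c(1, 1, 1),
  birth = c(NA, "01.01.1976", NA),   # text in day.month.year format
  sex   = c(NA, "male", NA),
  year  = c(2006, 2007, 2008),       # year of observation
  stringsAsFactors = FALSE
)
```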

I am looking for a way to replace the missing values.

Right now my solution to this problem is very slow in R:



d$birth2 <- as.Date(rep(NA, nrow(d))) # result column, initially all missing

for (i in seq_len(nrow(d))) { # for all observations
    if (!is.na(d$birth[i])) { # birth of observation i is not missing
        d$birth2[i] <- as.Date(d$birth[i], "%d.%m.%Y")
    } else {
        # birth of observation i is missing:
        # take the first known value from another year of the same id
        d$birth2[i] <- as.Date(d$birth[d$id == d$id[i] & !is.na(d$birth)],
                               "%d.%m.%Y")[1]
    }
}

Result:


id      birth           sex     year of observation     birth2
1       NA              NA      2006                    1976-01-01
1       01.01.1976      male    2007                    1976-01-01
1       NA              NA      2008                    1976-01-01

Unfortunately, the data consist of more than 20,000 observations per year.
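
One vectorized direction might be to look up the first known value per id 
with match() instead of looping over the rows. The following is only a 
sketch under assumptions: the helper name `fill_by_id` and the columns 
`birth2` and `sex2` are made up for illustration, the example data are 
invented, and it assumes that each id has at most one distinct non-missing 
value (otherwise the first one found is used).

```r
# Invented example data with two ids; birth is text in day.month.year format.
d <- data.frame(
  id    = c(1, 1, 1, 2, 2),
  birth = c(NA, "01.01.1976", NA, "15.06.1980", NA),
  sex   = c(NA, "male", NA, NA, "female"),
  year  = c(2006, 2007, 2008, 2007, 2008),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for every row, return the first non-missing value
# observed for the same id (NA if the id has no known value at all).
fill_by_id <- function(value, id) {
  ok    <- !is.na(value)            # rows where the value is known
  first <- !duplicated(id[ok])      # first known row of each id
  known_id    <- id[ok][first]
  known_value <- value[ok][first]
  known_value[match(id, known_id)]  # look up each row's id
}

d$birth2 <- as.Date(fill_by_id(d$birth, d$id), format = "%d.%m.%Y")
d$sex2   <- fill_by_id(d$sex, d$id)
```

The idea is that match() does a single lookup for all rows at once, so the 
data frame is not subset again for every observation.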

Does anybody know a better way?

Thanks

Kind regards

Andreas Kunzler
____________________________
Bundeszahnärztekammer (BZÄK)
Chausseestraße 13
10115 Berlin

Tel.: 030 40005-113
Fax:  030 40005-119

E-Mail: a.kunz...@bzaek.de 

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
