Hi R users:
I have the British Household Panel Survey (BHPS) in .tab format. I want to feed it through the Amelia package (which will be an interesting job in itself), but first I need to convert the various types of missing value (coded from about -9 to -1) to a more generic NA code. I've written the following function to do this:

    BHPS.converter <- function(from="D:/Data/BHPS/UKDA-5151-tab/tab/",
                               to="D:/BHPS/NA/", ext="tab") {
      from.files <- dir(from, pattern=paste(".",ext,"$",sep=""))
      existing.to.files <- dir(to, pattern=paste(".",ext,"$",sep=""))
      still.to.do.index <- 1:length(from.files)
      still.to.do.index <- still.to.do.index[-match(existing.to.files, from.files)]
      obs.to.do <- length(still.to.do.index)
      for (i in 1:obs.to.do) {
        temp.table <- read.delim(paste(from, from.files[still.to.do.index[i]], sep=""))
        print(paste("read:", from.files[still.to.do.index[i]]))
        temp.table[temp.table < 0] <- NA
        write.table(temp.table, file=paste(to, from.files[still.to.do.index[i]], sep=""))
        print(paste("written:", from.files[still.to.do.index[i]]))
      }
      rm(i, from.files, existing.to.files, still.to.do.index, obs.to.do, temp.table)
    }

It checks for existing files in the "to" directory (which holds the files whose negative values have already been converted to NA), because when I tried this conversion previously it got about halfway through and then crashed.

The problem is that it crashes *this time* too, without printing a message to say it has read a single file. The file it gets stuck on is about 75 MB in size.

I am using a dual-core 3.2 GHz Pentium D processor with 2 GB of memory (and 2 GB of virtual memory), and (unfortunately) Windows XP.

Questions:

1) Any general tips on how to increase the amount of memory available to process the file?
2) Can you see a more efficient way of doing what I'm doing?
3) What's the best way of coding for multiple forms of NA? The BHPS code -8 (meaning inapplicable, not routed for this respondent) should really be distinguished from other forms of nonresponse.

Thanks,
Jon

P.S. Apologies if this is slightly too vague/long-winded.

Jon Minton
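For concreteness, the recoding and file-skipping steps described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested solution for the BHPS files: the names recode_missing() and convert_dir() are hypothetical, the codes are assumed to be negative numbers with -8 meaning "inapplicable", and the files are assumed to be tab-delimited with a header row.

```r
# Hypothetical helper: recode negative missing-value codes in one data frame.
# Assumption: any negative number is a missing code; `inapplicable` (-8 here)
# is the one code whose positions we want to remember.
recode_missing <- function(df, inapplicable = -8) {
  num_cols <- names(df)[vapply(df, is.numeric, logical(1))]
  # Logical flags marking where the "inapplicable" code occurred, so that
  # distinction is not lost once everything becomes NA.
  inapp <- as.data.frame(lapply(df[num_cols], function(x) {
    !is.na(x) & x == inapplicable
  }))
  # Work one column at a time rather than building the full-size logical
  # matrix that `df < 0` creates.
  df[num_cols] <- lapply(df[num_cols], function(x) {
    x[!is.na(x) & x < 0] <- NA
    x
  })
  list(data = df, inapplicable = inapp)
}

# Hypothetical driver: convert every file in `from` not yet present in `to`.
convert_dir <- function(from, to, ext = "tab") {
  pattern <- paste0("\\.", ext, "$")
  # setdiff() also behaves sensibly when `to` is still empty.
  todo <- setdiff(dir(from, pattern = pattern), dir(to, pattern = pattern))
  for (f in todo) {
    tab <- read.delim(file.path(from, f))
    out <- recode_missing(tab)$data
    write.table(out, file = file.path(to, f), sep = "\t",
                quote = FALSE, row.names = FALSE)
    rm(tab, out)
    gc()  # release the previous file's memory before reading the next one
  }
  invisible(todo)
}
```

Two caveats on the sketch. Indexing with `x[-match(...)]` returns an empty vector when `match()` finds nothing, which is why `setdiff()` is used here instead. And if the -8 distinction is not needed, the `na.strings` argument of read.delim() may be able to map the codes to NA at read time, which would avoid the recoding pass altogether.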
______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.