df is a very large data frame with arrival estimates for many flights (df$flightfact) made at random times (df$PredTime). The error of each estimate is df$dt. I want to know the prediction error at each minute before landing, so I interpolate each flight's errors onto a one-minute grid. The code below works, but it is very slow and dominates the total run time. I tried using split(), but that rapidly ate up my 12 GB of memory. Is there a better R way of doing this?
Thanks,
Jim Rome

flights = table(df$flightfact[1:dim(df)[1], drop=TRUE])
nflights = length(flights)
flights = as.data.frame(flights)
times = data.frame()
# Split by flight
for(i in 1:nflights) {
    tf = df[as.numeric(df$flightfact)==flights[i,1],]  # This flight
    # check for at least 2 entries
    if(dim(tf)[1] < 2) {
        next
    }
    idf = interpolateTimes(tf)
    times = rbind(times, idf)
}

# Interpolate the times to every minute for 60 minutes
# Return a new data frame
interpolateTimes = function(df) {
    x = as.numeric(seq(from=0,to=60))  # The times to interpolate to
    dti = approx(as.numeric(df$PredTime), as.numeric(df$dt), x,
                 method="linear", rule=1:1)
    # Make a new data frame of interpolated values
    idf = data.frame(time=dti$x, error=dti$y,
                     runway=rep(df$lrw[1], length(dti$x)),
                     flight=rep(df$flightfact[1], length(dti$x)))
    return(idf)
}
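Two likely costs in the loop above are the full-column comparison done once per flight and the repeated rbind() onto a growing data frame. The following is only a sketch of one possible alternative, not a measured fix: it splits the row indices by flight instead of splitting the data frame itself, collects the per-flight results in a list, and binds them once at the end. The toy df is hypothetical and exists only to make the example self-contained; the column names are the ones used in the post.

```r
# Hypothetical toy data standing in for the real df.
set.seed(1)
df <- data.frame(
  flightfact = factor(rep(c("AA1", "BB2", "CC3"), times = c(5, 4, 1))),
  PredTime   = c(0, 15, 30, 45, 60, 5, 20, 40, 55, 10),
  dt         = rnorm(10),
  lrw        = rep(c("09L", "27R", "09L"), times = c(5, 4, 1))
)

# Same interpolation as in the post: every minute from 0 to 60,
# NA outside the observed range (rule = 1).
interpolateTimes <- function(tf) {
  dti <- approx(as.numeric(tf$PredTime), as.numeric(tf$dt), xout = 0:60,
                method = "linear", rule = 1)
  data.frame(time = dti$x, error = dti$y,
             runway = tf$lrw[1], flight = tf$flightfact[1])
}

# Split integer row indices by flight; this is much smaller than
# splitting the whole data frame.
idx <- split(seq_len(nrow(df)), df$flightfact, drop = TRUE)

# Keep only flights with at least 2 entries, as in the original loop.
idx <- idx[lengths(idx) >= 2]

# Interpolate each flight, then bind all pieces in a single call
# instead of growing 'times' inside the loop.
pieces <- lapply(idx, function(i) interpolateTimes(df[i, , drop = FALSE]))
times <- do.call(rbind, pieces)
rownames(times) <- NULL
```

With a very large number of flights, the final do.call(rbind, ...) may itself become the slow step; data.table::rbindlist(pieces) is one option worth trying there, though I have not timed either on data of this size.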