df is a very large data frame with arrival estimates for many flights
(df$flightfact) at random times (df$PredTime). The error of each estimate
is df$dt.
I want to know the prediction error at each minute before landing. The
code below works, but it is very slow and dominates the run time. I tried
using split(), but that rapidly ate up my 12 GB of memory. Is there a
better way to do this in R?

Thanks,
Jim Rome

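    # Count estimates per flight, dropping unused factor levels
    flights = table(df$flightfact[, drop=TRUE])
    nflights = length(flights)
    flights = as.data.frame(flights)
    flightid = as.character(df$flightfact)    # flight labels, computed once
    times = data.frame()
    # Split by flight
    for(i in seq_len(nflights)) {
        # Compare by flight label, not by the factor's integer code
        tf = df[flightid == as.character(flights[i,1]),]    # This flight
        # Check for at least 2 entries
        if(nrow(tf) < 2) {
            next
        }
        idf = interpolateTimes(tf)
        times = rbind(times, idf)
    }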

    # Interpolate the times to every minute for 60 minutes
    # Return a new data frame
    interpolateTimes = function(df) {
        x = as.numeric(seq(from=0, to=60))    # The times to interpolate to
        dti = approx(as.numeric(df$PredTime), as.numeric(df$dt), x,
                     method="linear", rule=1)
        # Make a new data frame of interpolated values
        idf = data.frame(time=dti$x, error=dti$y,
                         runway=rep(df$lrw[1], length(dti$x)),
                         flight=rep(df$flightfact[1], length(dti$x)))
        return(idf)
    }
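
One possible direction is sketched here, not as a tested solution for the real data. The loop above subsets the whole data frame once per flight and grows times with rbind() on every pass. The sketch instead splits only the row indices by flight, interpolates each group, and binds the pieces once at the end. The function name interpolateAll and its minutes argument are made up for illustration. It assumes the columns flightfact, PredTime, dt and lrw described above, and that each flight kept has at least two distinct PredTime values.

```r
# Sketch: interpolate every flight onto minutes 0..60 in one pass.
# interpolateAll and minutes are hypothetical names, not from the original code.
interpolateAll = function(df, minutes = 0:60) {
    # Split the row indices rather than the data frame, to limit memory use
    idx = split(seq_len(nrow(df)), df$flightfact, drop = TRUE)
    # Keep only flights with at least 2 estimates
    idx = idx[lengths(idx) >= 2]
    pieces = lapply(idx, function(rows) {
        # Linear interpolation; rule = 1 gives NA outside the observed times
        y = approx(as.numeric(df$PredTime[rows]), as.numeric(df$dt[rows]),
                   xout = minutes, method = "linear", rule = 1)$y
        data.frame(time = minutes, error = y,
                   runway = df$lrw[rows[1]],
                   flight = df$flightfact[rows[1]])
    })
    # Bind all per-flight pieces once instead of growing a data frame
    do.call(rbind, pieces)
}
```

Calling times = interpolateAll(df) should give the same columns as the loop version. Whether it fits in memory on the full data set would need to be checked, since the list of per-flight pieces is held until the final rbind().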
