On 3/20/2010 11:52 AM, Daniel Malter wrote:

If the flight identifiers runway$Flight and oooi$Flight are unique (i.e.
each identifier occurs only once in each dataset), you could use
merge() to bind the datasets together by matching the two. See,

?merge

Also, I see an OnDate variable in both datasets. So if Flight does not
provide unique identification, maybe Flight and OnDate together do, which
can also be handled in merge.

Let us know if that solves the problem.

Best,
Daniel 
-----------------------------------
Alas, the flight names are not unique (they fly each day). You would think that 
the OnDate would be the same, but flights arriving at midnight could appear on 
different days, which is why I am using seconds past 1/1/1970.

Will merge work with different length dataframes? Perhaps I could do it in 
multiple steps, assuming that the dates were the same, and then fixing the 
errors?
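
On the question of different-length data frames: merge() does not require the two inputs to have the same number of rows. The sketch below uses small made-up frames (the values and the ArrivalGate/Runway contents are invented for illustration) and joins on Flight and OnDate together, keeping every oooi row via all.x = TRUE. It assumes OnDate agrees between the two sources, which the midnight-arrival case above can violate.

```r
# Toy stand-ins for the real data; all values are made up for illustration.
oooi <- data.frame(
  Flight      = c("AA100", "AA100", "UA200"),
  OnDate      = as.Date(c("2010-03-01", "2010-03-02", "2010-03-01")),
  ArrivalGate = c("A1", "A2", "B7"),
  stringsAsFactors = FALSE
)
runway <- data.frame(
  Flight = c("AA100", "UA200"),
  OnDate = as.Date(c("2010-03-01", "2010-03-01")),
  Runway = c("27L", "09R"),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every oooi row; rows without a partner in runway
# get NA in the Runway column instead of being dropped.
merged <- merge(oooi, runway[, c("Flight", "OnDate", "Runway")],
                by = c("Flight", "OnDate"), all.x = TRUE)
print(merged)
```

Rows left with NA in Runway are the ones that would need the second, time-based pass.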

And I found out that abs() will not take difftime as an argument. I hope I can 
multiply a difftime by itself and check that way.
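
One way to sidestep the difftime question is to convert to plain numbers before taking the absolute value. A minimal sketch with two made-up timestamps (t1, t2 and the result names are hypothetical): difftime() with units = "secs" fixes the unit explicitly, because subtracting two POSIXct values lets R choose the unit automatically.

```r
# Two made-up arrival times on either side of midnight.
t1 <- as.POSIXct("2010-03-01 23:58:00", tz = "UTC")
t2 <- as.POSIXct("2010-03-02 00:03:00", tz = "UTC")

# Force seconds, drop the difftime class, then take the absolute value.
gap_secs <- abs(as.numeric(difftime(t1, t2, units = "secs")))

# Equivalent route: POSIXct is stored as seconds since 1970-01-01 UTC.
gap_alt <- abs(as.numeric(t1) - as.numeric(t2))

within_hour <- gap_secs < 3600
```

Working on plain numeric seconds also avoids depending on which arithmetic operations difftime supports in a given R version.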

And to use sqldf, it looks as if I have to read the source data files directly 
into sqldf to use it. It has to make a database. In that case, wouldn't I be 
better doing the whole thing in a database?

Jim

> names(oooi)
 [1] "FltOrigDt"               "MkdCrrCd"              
 [3] "MkdFltNbr"               "DprtTrpnStnCd"         
 [5] "ArrTrpnStnCd"            "ActualOutLocalTimestamp"
 [7] "ActualOffLocal"          "ActualOnLocal"         
 [9] "ActualInLocal"           "ArrivalGate"           
[11] "DepartureGate"           "Flight"                
[13] "OnDate"                  "MinutesIntoDay"        
[15] "OnHour"                  "pt"  


> names(runway)
 [1] "OnDateTime"     "IATA"           "ICAO"           "Flight"       
 [5] "AircraftType"   "Tail"           "Arrived"        "STA"          
 [9] "Runway"         "From.To"        "Delay"          "OnDate"       
[13] "MinutesIntoDay" "pt"   

These sets have several hundred thousand rows.

In both sets, pt is a POSIXct for the arrival time (from different
sources). They are not identical, but surely should be within an hour of
each other (hopefully a lot less), and the Flight fields must be the
same. So
(abs(runway$pt - oooi$pt) < 3600) & (runway$Flight == oooi$Flight)
should pick out the corresponding rows in the two data sets (if there is
a match).

What I need to do is to take the Runway from runway and insert it into
the oooi df for the correct flight.
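
A minimal sketch of one possible approach, not a tested solution for the real data: pair rows by Flight with merge(), keep only pairs whose pt values are under an hour apart, pick the closest runway row for each oooi row, and copy Runway back. The toy values, the row_id helper column, and the names cand and best are assumptions made for illustration.

```r
# Toy data; the real frames have the columns listed above.
oooi <- data.frame(
  Flight = c("AA100", "AA100", "UA200", "DL300"),
  pt = as.POSIXct(c("2010-03-01 23:58:00", "2010-03-02 23:50:00",
                    "2010-03-01 12:00:00", "2010-03-01 08:00:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)
runway <- data.frame(
  Flight = c("AA100", "AA100", "UA200"),
  pt = as.POSIXct(c("2010-03-02 00:03:00", "2010-03-02 23:55:00",
                    "2010-03-01 12:10:00"), tz = "UTC"),
  Runway = c("27L", "27R", "09R"),
  stringsAsFactors = FALSE
)

# Hypothetical helper key so each oooi row can be found again after merging.
oooi$row_id <- seq_len(nrow(oooi))

# Pair every oooi row with every runway row that has the same Flight.
cand <- merge(oooi[, c("row_id", "Flight", "pt")],
              runway[, c("Flight", "pt", "Runway")],
              by = "Flight", suffixes = c(".oooi", ".runway"))

# Keep only pairs whose arrival times are less than an hour apart.
cand$gap <- abs(as.numeric(cand$pt.oooi) - as.numeric(cand$pt.runway))
cand <- cand[cand$gap < 3600, ]

# For each oooi row keep the candidate with the smallest gap.
cand <- cand[order(cand$row_id, cand$gap), ]
best <- cand[!duplicated(cand$row_id), ]

# Copy Runway into oooi; rows with no match within the hour get NA.
oooi$Runway <- best$Runway[match(oooi$row_id, best$row_id)]
print(oooi)
```

Merging on Flight alone pairs each arrival with every other day's arrival of the same flight, so with several hundred thousand rows the intermediate table could become large. Processing one date range at a time is one way to keep it manageable.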

What is the best way to do this in R?

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
