sqldf package:

> n <- 200
> x <- data.frame(id=sample(1:10,n, TRUE), timex=runif(n))
> y <- data.frame(id=sample(1:15, n, TRUE), timey=runif(n), runway=seq(n))
> require(sqldf)
> sqldf("select x.id, timex, y.id, timey, y.runway from x join y
+     where x.id = y.id and abs(timex - timey) < 0.005")
   id      timex id      timey runway
1   7 0.39110674  7 0.38859025     61
2   7 0.58829365  7 0.58581915    133
3   9 0.34610180  9 0.34866404     65
4   2 0.72069416  2 0.72080087      7
5   3 0.75491843  3 0.75640271     37
6   5 0.86930840  5 0.87373981     15
7   3 0.07181249  3 0.06935293    108
8   6 0.55273483  6 0.55286014    101
9  10 0.50377737 10 0.50563669    139
10  1 0.17325424  1 0.17129662     32
11  3 0.09013624  3 0.09281639    112
12  8 0.04805349  8 0.04618661    196
13  2 0.19428938  2 0.19260035     50
14 10 0.16194595 10 0.16565594     44
15  7 0.51193601  7 0.51352435     14
16  3 0.02331951  3 0.02119733     64
17  9 0.69456540  9 0.69376281     39
18  2 0.20070366  2 0.20432466    134
19 10 0.50438411 10 0.50563669    139
20  4 0.25271897  4 0.25211036     72
>


On Sat, Mar 20, 2010 at 10:20 AM, James Rome <jamesr...@gmail.com> wrote:

> I have two data frames of flight data,  but they have very different
> numbers of rows. They come from different sources, so the data are not
> identical.
>
> > names(oooi)
>  [1] "FltOrigDt"               "MkdCrrCd"
>  [3] "MkdFltNbr"               "DprtTrpnStnCd"
>  [5] "ArrTrpnStnCd"            "ActualOutLocalTimestamp"
>  [7] "ActualOffLocal"          "ActualOnLocal"
>  [9] "ActualInLocal"           "ArrivalGate"
> [11] "DepartureGate"           "Flight"
> [13] "OnDate"                  "MinutesIntoDay"
> [15] "OnHour"                  "pt"
>
> > names(runway)
>  [1] "OnDateTime"     "IATA"           "ICAO"           "Flight"
>  [5] "AircraftType"   "Tail"           "Arrived"        "STA"
>  [9] "Runway"         "From.To"        "Delay"          "OnDate"
> [13] "MinutesIntoDay" "pt"
>
> These sets have several hundred thousand rows.
>
> In both sets, pt is a POSIXct for the arrival time (from different
> sources). They are not identical, but surely should be within an hour of
> each other (hopefully a lot less), and the Flight fields must be the
> same. So
> (abs(runway$pt - oooi$pt) < 3600) & (runway$Flight == oooi$Flight)
> should pick out the corresponding rows in the two data sets (if there is
> a match).
>
> What I need to do is to take the Runway from runway and insert it into
> the oooi df for the correct flight.
>
> What is the best way to do this in R?
>
> Thanks,
> Jim Rome
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to