Re: [R] Comparing dates in two large data frames

2021-04-10 Thread Rui Barradas

Hello,

The following solution seems to work and is fast, as findInterval is.
It first finds, for each value of df1$Time, the last df2$start that is
not greater than it (df2$start is already sorted, as findInterval
requires). It then uses that index to check that the Time is not
greater than the corresponding df2$end.
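
To make the two steps concrete, here is a minimal sketch on toy numeric
data. The vectors starts, ends and times are hypothetical and not part
of the original data; plain numbers stand in for the POSIXct values.

```r
# Toy data (hypothetical, for illustration only)
starts <- c(10, 20, 30)   # must be sorted in non-decreasing order
ends   <- starts + 2      # spans: [10, 12], [20, 22], [30, 32]
times  <- c(5, 10, 11, 15, 22, 31, 40)

# Step 1: for each time, the index of the last start that is <= that time
# (0 means the time lies before the first start)
inx <- findInterval(times, starts)

# Step 2: a time is inside a span if it has a candidate span (inx > 0)
# and does not exceed the end of that span
inside <- logical(length(times))
has_span <- inx > 0
inside[has_span] <- times[has_span] <= ends[inx[has_span]]

print(data.frame(times, inx, inside))
```

Here 15 gets index 1 but lies past the end of the first span (12), so
it is not inside any span, while 22 sits exactly on the end of the
second span and counts as inside.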

I checked against a small subset of df1 and the results were right.


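# FALSE by default: Times before the first start are in no timespan
result <- logical(nrow(df1))
# index of the last df2$start that is <= each Time (0 if there is none)
inx <- findInterval(df1$Time, df2$start)
not_zero <- inx != 0
# a Time is within a timespan if it is not greater than that span's end
result[not_zero] <- df1$Time[not_zero] <= df2$end[ inx[not_zero] ]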
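
A check like the one mentioned above could look roughly like the
following sketch. It rebuilds smaller data of the same shape (one week
instead of five months, 300 spans; these sizes and the seed are
arbitrary choices), compares the findInterval result with a slow
brute-force test, and stores the answer in a new column. The column
name in_span is only an illustration.

```r
set.seed(42)  # arbitrary seed, only to make the sketch repeatable

# Smaller data in the same shape as df1 and df2 (sizes are assumptions)
t0  <- as.POSIXct("2020-01-01", tz = "UTC")
t1  <- as.POSIXct("2020-01-08", tz = "UTC")
df1 <- data.frame(Time = seq(t0, t1, by = "10 min"))
start <- sort(sample(seq(t0, t1, by = "1 min"), 300))
df2 <- data.frame(start = start, end = start + 120)

# findInterval approach
result <- logical(nrow(df1))
inx <- findInterval(df1$Time, df2$start)
not_zero <- inx != 0
result[not_zero] <- df1$Time[not_zero] <= df2$end[inx[not_zero]]

# Brute-force reference: test every Time against every timespan
brute <- vapply(seq_len(nrow(df1)), function(i) {
  any(df1$Time[i] >= df2$start & df1$Time[i] <= df2$end)
}, logical(1))

# Both approaches should agree on every row
stopifnot(identical(result, brute))

# Store the answer as a new column of df1
df1$in_span <- ifelse(result, "yes", "no")
print(table(df1$in_span))
```

Note that the two agree here because all timespans have the same
length, so the span with the latest start also has the latest end. If
overlapping spans of different lengths were possible, comparing
against the running maximum of the ends, e.g.
cummax(as.numeric(df2$end))[inx], would probably be the safer test.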


Hope this helps,

Rui Barradas


At 12:06 on 10/04/21, Kulupp wrote:

[...]


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Comparing dates in two large data frames

2021-04-10 Thread Kulupp

Dear all,

I have two data frames (df1 and df2) and for each timepoint in df1 I 
want to know: is it within any of the timespans in df2? The result 
(e.g. "no" or "yes", or 0 and 1) should be shown in a new column of df1.


Here is the code to create the two data frames (the size of the two data 
frames is approx. the same as in my original data frames):


# create data frame df1
ti1 <- seq.POSIXt(from=as.POSIXct("2020/01/01", tz="UTC"), 
to=as.POSIXct("2020/06/01", tz="UTC"), by="10 min")

df1 <- data.frame(Time=ti1)

# create data frame df2 with random timespans, i.e. start and end dates
start <- sort(sample(seq(as.POSIXct("2020/01/01", tz="UTC"), 
as.POSIXct("2020/06/01", tz="UTC"), by="1 mins"), 5000))

end   <- start + 120  # POSIXct arithmetic is in seconds: 2-minute spans
df2 <- data.frame(start=start, end=end)

Everything I tried (ifelse combined with sapply or for loops) has been 
very slow. Thus, I am looking for a reasonably fast solution.


Thanks a lot in advance for any hints!

Cheers,

Thomas
