You nailed it, it was a time zone (tz) issue. Thanks!
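
For the archives, here is a minimal sketch of a per-object alternative to changing the session time zone: re-attach the "tzone" attribute to the index of the merged series. It reuses the toy series x from the reply quoted below. The names m and idx are only illustrative, and it assumes the zoo package is installed.

```r
library(zoo)

# Toy series with a UTC index, as in the quoted reply
x <- zoo(1:2, as.POSIXct(c("2012-01-01 00:00:00", "2012-01-01 01:00:00"),
                         format = "%Y-%m-%d %H:%M:%S", tz = "UTC"))

m <- merge(x, x)

# Re-attach the time zone to the merged index. The underlying instants are
# unchanged; only the zone used for printing and formatting is affected.
idx <- index(m)
attr(idx, "tzone") <- "UTC"
index(m) <- idx

format(index(m), "%Y-%m-%d %H:%M:%S")
```

Setting Sys.setenv(TZ = "UTC"), as suggested in the reply below, is probably still the simpler fix when the whole session can run in UTC.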


> Date: Mon, 1 Oct 2012 09:41:35 +0200
> From: achim.zeil...@uibk.ac.at
> To: vindo...@hotmail.com
> CC: r-help@r-project.org
> Subject: Re: [R] merge.zoo returns unmatched dates
> 
> On Mon, 1 Oct 2012, Vindoggy ! wrote:
> 
> >
> > Sorry for the lack of reproducible data, but this seems to be a problem 
> > inherent to my dataset and I can't figure out where the issue is.
> >
> > I have several data frames set up as time series with identical POSIXct
> > date formats. If I keep the original data in data frame format and merge
> > them using base merge, everything is perfect and everyone is happy.
> >
> > If I transform the data frames to zoo objects and then do a merge.zoo, the
> > data seem to become uncoupled from the original data. Even more unusual is
> > that some dates in the new merged data set are prior to the original data
> > set. I've attempted below to show what this looks like, and I hope someone
> > has a suggestion as to what may be causing the problem.
> >
> > Here is one set of data in data.frame format
> >
> > head(Vup)
> >                 Date Velocity_m/s
> > 1 2010-01-21 07:42:00     1.217943
> > 2 2010-01-21 07:43:00     1.624395
> > 3 2010-01-21 07:44:00     1.526379
> > 4 2010-01-21 07:45:00     1.456831
> > 5 2010-01-21 07:46:00     1.245390
> > 6 2010-01-21 07:47:00     1.374330
> >
> > str(Vup)
> > 'data.frame':    7168 obs. of  2 variables:
> > $ Date        : POSIXct, format: "2010-01-21 07:42:00" "2010-01-21 07:43:00" ...
> > $ Velocity_m/s: num  1.22 1.62 1.53 1.46 1.25 ...
> >
> > And here is a second in data.frame format:
> >
> > head(PAS)
> >                 Date               PAS
> > 1 2010-01-21 05:01:00   0.0013938
> > 2 2010-01-21 05:02:00   0.0015331
> > 3 2010-01-21 05:03:00   0.0016725
> > 4 2010-01-21 05:04:00   0.0016725
> > 5 2010-01-21 05:05:00   0.0012265
> > 6 2010-01-21 05:06:00   0.0015889
> >
> > str(PAS)
> > 'data.frame':    5520 obs. of  2 variables:
> > $ Date       : POSIXct, format: "2010-01-21 05:01:00" "2010-01-21 05:02:00" ...
> > $ PAS: num  0.00139 0.00153 0.00167 0.00167 0.00123 ...
> >
> >
> >
> > Using zoo:
> >
> > PASmin <- zoo(as.matrix(PAS[, 2]),
> >               as.POSIXct(PAS[, 1], format = "%Y-%m-%d %H:%M:%S", tz = "UTC"))
> >
> > str(PASmin)
> > 'zoo' series from 2010-01-21 05:01:00 to 2010-01-27 13:01:00
> >  Data: num [1:5520, 1] 0.00139 0.00153 0.00167 0.00167 0.00123 ...
> > - attr(*, "dimnames")=List of 2
> >  ..$ : NULL
> >  ..$ : chr "PAS"
> >  Index:  POSIXct[1:5520], format: "2010-01-21 05:01:00" "2010-01-21 05:02:00" "2010-01-21 05:03:00" ...
> >
> >
> >
> >
> > ADP_UPmin <- zoo(as.matrix(Vup[, 2]),
> >                  as.POSIXct(Vup[, 1], format = "%Y-%m-%d %H:%M", tz = "UTC"))
> >
> > str(ADP_UPmin)
> > 'zoo' series from 2010-01-21 07:42:00 to 2010-01-26 20:12:00
> >  Data: num [1:7168, 1] 1.22 1.62 1.53 1.46 1.25 ...
> > - attr(*, "dimnames")=List of 2
> >  ..$ : NULL
> >  ..$ : chr "UP_Velocity_m/s"
> >  Index:  POSIXct[1:7168], format: "2010-01-21 07:42:00" "2010-01-21 07:43:00" "2010-01-21 07:44:00" ...
> >
> >
> > And if I merge the two zoo objects I get this:
> >
> > M<-merge(ADP_UPmin,PASmin)
> > head(M)
> >
> >                    UP_Velocity_m/s       PAS
> > 2010-01-20 21:01:00              NA 0.0013938
> > 2010-01-20 21:02:00              NA 0.0015331
> > 2010-01-20 21:03:00              NA 0.0016725
> > 2010-01-20 21:04:00              NA 0.0016725
> > 2010-01-20 21:05:00              NA 0.0012265
> > 2010-01-20 21:06:00              NA 0.0015889
> >
> > str(M)
> > 'zoo' series from 2010-01-20 21:01:00 to 2010-01-27 05:01:00
> >  Data: num [1:8499, 1:2] NA NA NA NA NA NA NA NA NA NA ...
> > - attr(*, "dimnames")=List of 2
> >  ..$ : NULL
> >  ..$ : chr [1:2] "UP_Velocity_m/s" "PAS"
> >  Index:  POSIXct[1:8499], format: "2010-01-20 21:01:00" "2010-01-20 21:02:00" "2010-01-20 21:03:00" ...
> >
> >
> > For some reason I cannot figure out, even though both the PAS data frame
> > and the PAS zoo object start at 2010-01-21 05:01:00, once merged the PAS data
> > starts on the previous day, at 2010-01-20 21:01:00. The actual numeric data look
> > good, but both variables have now come uncoupled from the time series dates
> > (the Velocity data is similarly uncoupled). And as stated before, doing a
> > non-zoo merge on the data.frame data works fine.
> >
> > Anyone got any ideas what's going on?
> 
> My guess is that you create both zoo series with time zone UTC but that
> the TZ attribute gets lost upon the merge. Then, the times are displayed in
> your system's time zone (which you haven't told us), which apparently is
> several hours behind UTC.
> 
> On my system (which is in CET) I can create a series with UTC times
> 
> R> x <- zoo(1:2, as.POSIXct(c("2012-01-01 00:00:00",
> +    "2012-01-01 01:00:00"), format = "%Y-%m-%d %H:%M:%S", tz = "UTC"))
> R> x
> 2012-01-01 00:00:00 2012-01-01 01:00:00
>                    1                   2
> 
> The times are in UTC as requested, but the c() method drops the time zone
> attribute. See ?c.POSIXct.
> 
> R> time(x)
> [1] "2012-01-01 00:00:00 UTC" "2012-01-01 01:00:00 UTC"
> R> c(time(x))
> [1] "2012-01-01 01:00:00 CET" "2012-01-01 02:00:00 CET"
> 
> Hence:
> 
> R> merge(x, x)
>                      x x
> 2012-01-01 01:00:00 1 1
> 2012-01-01 02:00:00 2 2
> 
> But you can set the time zone of your R session to UTC, which gives the
> desired result:
> 
> R> Sys.setenv(TZ = "UTC")
> R> merge(x, x)
>                      x x
> 2012-01-01 00:00:00 1 1
> 2012-01-01 01:00:00 2 2
> 
> hth,
> Z
> 

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
