I have applied these patches to R-devel and in my limited testing they appear to work as desired. I have to say that I never ran into the problem these patches were meant to solve so I may not be the best person to do the testing.
-roger Marc Schwartz wrote: > Hi all, > > Apologies for the delay in my engaging in this thread. I was traveling > this week. > > The problem that Gabor raised was caused by the patch that I submitted > to fix a problem with the referenced functions when using 'months' and > 'years' as the interval. The prior versions were problematic: > > https://stat.ethz.ch/pipermail/r-devel/2008-January/048004.html > > The patch fixed the error, but since I used hist.Date() as the reference > model and did not note the subtle difference in cut.Date() relative to > specifying the breaks increment value, this functionality was lost when > the same modification was made to the code in cut.Date(). > > Roger's patch helps, but does not totally remedy the situation. One also > needs to modify the method used for specifying the max value 'end' for > the breaks in order to include the max 'x' Date value in the result. > > Hence, I am attaching proposed patches against R-devel for > base:::dates.R and base:::datetime.R. > > I am also attaching a patch for tests:::reg-tests-1.R to add a check for > this situation to the regression tests that were also added subsequent > to that prior set of patches that I had submitted. > > If perhaps Roger and Gabor could so some testing on these patches before > they are considered for inclusion into the R-devel tree, it would be > helpful to check to see if I have missed something else here. > > Thanks for raising this issue. > > Regards, > > Marc Schwartz > > Roger D. Peng wrote: >> Seems changes in r44116 force the interval to be single months (or >> years) instead of whatever the user specified. I think the attached >> patches correct this. >> >> Interestingly, 'cut' and 'seq' allow for the 'breaks' specification to >> be something like "3 months" but the documentation for 'hist' does not >> allow for this type of specification. >> >> -roger >> >> Gabor Grothendieck wrote: >>> cut.Date and cut.POSIXt indicate that the breaks argument >>> can be an integer followed by a space followed by "year", etc. >>> but it seems the integer is ignored. >>> >>> For example, I assume that breaks = "3 months" is supposed >>> to cut it into quarters but, in fact, it cuts it into months as if >>> 3 had not been there. >>> >>>> d <- seq(Sys.Date(), length = 12, by = "month") >>>> cut(d, "3 months") >>> [1] 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01 2008-08-01 >>> 2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01 2009-02-01 >>> Levels: 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01 >>> 2008-08-01 2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01 >>> 2009-02-01 >>>> cut(as.POSIXct(d), "3 months") >>> [1] 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01 2008-08-01 >>> 2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01 2009-02-01 >>> Levels: 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01 >>> 2008-08-01 2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01 >>> 2009-02-01 >>>> cut(as.POSIXlt(d), "3 months") >>> [1] 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01 2008-08-01 >>> 2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01 2009-02-01 >>> Levels: 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01 >>> 2008-08-01 2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01 >>> 2009-02-01 >>> > > ------------------------------------------------------------------------ > > --- datesORIG.R 2008-03-20 14:25:13.000000000 -0500 > +++ dates.R 2008-03-20 14:38:21.000000000 -0500 > @@ -322,17 +322,19 @@ > if(valid == 3) { > start$mday <- 1 > end <- as.POSIXlt(max(x, na.rm = TRUE)) > - end <- as.POSIXlt(end + (31 * 86400)) > + step <- ifelse(length(by2) == 2, as.integer(by2[1]), 1) > + end <- as.POSIXlt(end + (31 * step * 86400)) > end$mday <- 1 > - breaks <- as.Date(seq(start, end, "months")) > + breaks <- as.Date(seq(start, end, breaks)) > } else if(valid == 4) { > start$mon <- 0 > start$mday <- 1 > end <- as.POSIXlt(max(x, na.rm = TRUE)) > - end <- as.POSIXlt(end + (366 * 86400)) > + step <- ifelse(length(by2) == 2, as.integer(by2[1]), 1) > + end <- as.POSIXlt(end + (366 * step * 86400)) > end$mon <- 0 > end$mday <- 1 > - breaks <- as.Date(seq(start, end, "years")) > + breaks <- as.Date(seq(start, end, breaks)) > } else { > start <- .Internal(POSIXlt2Date(start)) > if (length(by2) == 2) incr <- incr * as.integer(by2[1]) > > > ------------------------------------------------------------------------ > > --- datetimeORIG.R 2008-03-20 14:25:20.000000000 -0500 > +++ datetime.R 2008-03-20 15:25:49.000000000 -0500 > @@ -727,17 +727,19 @@ > if(valid == 6) { > start$mday <- 1 > end <- as.POSIXlt(max(x, na.rm = TRUE)) > - end <- as.POSIXlt(end + (31 * 86400)) > + step <- ifelse(length(by2) == 2, as.integer(by2[1]), 1) > + end <- as.POSIXlt(end + (31 * step * 86400)) > end$mday <- 1 > - breaks <- seq(start, end, "months") > + breaks <- seq(start, end, breaks) > } else if(valid == 7) { > start$mon <- 0 > start$mday <- 1 > end <- as.POSIXlt(max(x, na.rm = TRUE)) > - end <- as.POSIXlt(end + (366 * 86400)) > + step <- ifelse(length(by2) == 2, as.integer(by2[1]), 1) > + end <- as.POSIXlt(end + (366 * step* 86400)) > end$mon <- 0 > end$mday <- 1 > - breaks <- seq(start, end, "years") > + breaks <- seq(start, end, breaks) > } else { > if (length(by2) == 2) incr <- incr * as.integer(by2[1]) > maxx <- max(x, na.rm = TRUE) > > > ------------------------------------------------------------------------ > > --- reg-tests-1ORIG.R 2008-03-20 09:18:19.000000000 -0500 > +++ reg-tests-1.R 2008-03-20 15:15:56.000000000 -0500 > @@ -5025,7 +5025,7 @@ > ## was about 0.0005 in 2.6.1 patched > > > -## tests of problems fixed by Marc Schwarz's patch for > +## tests of problems fixed by Marc Schwartz's patch for > ## cut/hist for Dates and POSIXt > Dates <- seq(as.Date("2005/01/01"), as.Date("2009/01/01"), "day") > months <- format(Dates, format = "%m") > @@ -5036,20 +5036,32 @@ > stopifnot(identical(hist(Dates, "month", plot = FALSE)$counts, mn)) > # Test cut.Date() for months > stopifnot(identical(as.vector(table(cut(Dates, "month"))), mn)) > +# Test cut.Date() for 3 months > +stopifnot(identical(as.vector(table(cut(Dates, "3 months"))), > + as.integer(colSums(matrix(c(mn, 0, 0), nrow = 3))))) > # Test hist.Date() for years > stopifnot(identical(hist(Dates, "year", plot = FALSE)$counts, ty)) > # Test cut.Date() for years > stopifnot(identical(as.vector(table(cut(Dates, "years"))),ty)) > +# Test cut.Date() for 3 years > +stopifnot(identical(as.vector(table(cut(Dates, "3 years"))), > + as.integer(colSums(matrix(c(ty, 0), nrow = 3))))) > > Dtimes <- as.POSIXlt(Dates) > # Test hist.POSIXt() for months > stopifnot(identical(hist(Dtimes, "month", plot = FALSE)$counts, mn)) > # Test cut.POSIXt() for months > stopifnot(identical(as.vector(table(cut(Dtimes, "month"))), mn)) > +# Test cut.POSIXt() for 3 months > +stopifnot(identical(as.vector(table(cut(Dtimes, "3 months"))), > + as.integer(colSums(matrix(c(mn, 0, 0), nrow = 3))))) > # Test hist.POSIXt() for years > stopifnot(identical(hist(Dtimes, "year", plot = FALSE)$counts, ty)) > # Test cut.POSIXt() for years > stopifnot(identical(as.vector(table(cut(Dtimes, "years"))), ty)) > +# Test cut.POSIXt() for 3 years > +stopifnot(identical(as.vector(table(cut(Dtimes, "3 years"))), > + as.integer(colSums(matrix(c(ty, 0), nrow = 3))))) > ## changed in 2.6.2 > > -- Roger D. Peng | http://www.biostat.jhsph.edu/~rpeng/ ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel