I think '[.POSIXlt' and '[<-.POSIXlt' don't need to normalize out-of-range values. They only need to make all components the same length, so that extraction or replacement works correctly for an arbitrary index.
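To illustrate what "same length only" could mean, here is a minimal sketch. The helper name recycle_lt() is made up for this example and is not part of base R; it only recycles the components and leaves every value, including out-of-range ones, untouched. It does not try to preserve per-element names.

```r
# Hypothetical helper (not in base R): recycle all components of a
# "POSIXlt" object to a common length, without normalizing any value.
recycle_lt <- function(x) {
  stopifnot(inherits(x, "POSIXlt"))
  att <- attributes(x)                  # component names, class, tzone, ...
  comps <- unclass(x)
  n <- max(lengths(comps))              # target length: the longest component
  comps <- lapply(comps, rep_len, length.out = n)
  attributes(comps) <- att              # restore class and other attributes
  comps
}

# A "ragged" object: 'sec' has length 2, everything else length 1,
# and sec = 75 is out of range.
ragged <- .POSIXlt(list(sec = c(0, 75), min = 45L, hour = 21L,
                        mday = 6L, mon = 11L, year = 116L,
                        wday = 2L, yday = 340L, isdst = 0L), tz = "UTC")
full <- recycle_lt(ragged)
lengths(unclass(full))   # all components now have the same length
unclass(full)$sec        # values are unchanged; 75 is still 75
```

With components of equal length, component-wise indexing as done in '[.POSIXlt' picks the same positions from every field, which is all that extraction needs.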
I have thought of adding an optional argument to 'as.POSIXlt' for when it is applied to a "POSIXlt" object. Possible names:
normalize
adjust
fixup

To allow recycling only, without changing content, the argument could be a choice instead of TRUE or FALSE, like
fixup = c("none", "balance", "normalize")
, where "normalize" implies "balance", or
adjust = c("none", "length", "content", "value")
, where "content" and "value" are synonymous.

By the way, Inf in the 'sec' component is out-of-range!

For 'gmtoff', NA or 0 should be used when the offset is unknown. A known 'gmtoff' may be positive, negative, or zero. The documentation says

     ‘gmtoff’ (Optional.) The offset in seconds from GMT: positive
          values are East of the meridian.  Usually ‘NA’ if unknown,
          but ‘0’ could mean unknown.

dlt <- .POSIXlt(list(sec = c(-999, 10000 + c(1:10, -Inf, NA)) + pi,
                     # "out of range", non-finite, fractions
                     min = 45L, hour = c(21L, 3L, NA, 4L),
                     mday = 6L, mon = c(11L, NA, 3L),
                     year = 116L, wday = 2L, yday = 340L, isdst = 1L))

as.POSIXct(dlt)[1] is NA on Linux with a timezone without DST, for example after
Sys.setenv(TZ = "EST")

----------------
>>>>> Martin Maechler
>>>>>     on Wed, 12 Oct 2022 10:17:28 +0200 writes:

>>>>> Kurt Hornik
>>>>>     on Tue, 11 Oct 2022 16:44:13 +0200 writes:

>>>>> Davis Vaughan writes:
>>> I've got a bit more information about this one. It seems like it
>>> (only? not sure) appears when `TZ = "UTC"`, which is why I didn't see
>>> it before on my Mac, which defaults to `TZ = ""`. I think this is at
>>> least explainable by the fact that those "optional" fields aren't
>>> technically needed when the time zone is UTC.

>> Exactly.
>> Debugging `[<-.POSIXlt` with

>>     x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))
>>     Sys.setenv(TZ = "UTC")
>>     x[1] <- NA

>> shows we get into

>>     value <- unclass(as.POSIXlt(value))
>>     if (ici) {
>>         for (n in names(x)) names(x[[n]]) <- nms
>>     }
>>     for (n in names(x)) x[[n]][i] <- value[[n]]

>> where

>>     Browse[2]> names(value)
>>     [1] "sec"   "min"   "hour"  "mday"  "mon"   "year"  "wday"  "yday"  "isdst"
>>     Browse[2]> names(x)
>>      [1] "sec"    "min"    "hour"   "mday"   "mon"    "year"   "wday"   "yday"
>>      [9] "isdst"  "zone"   "gmtoff"

>> Without having looked at the code, the docs say

>>      ‘zone’ (Optional.) The abbreviation for the time zone in force at
>>           that time: ‘""’ if unknown (but ‘""’ might also be used for
>>           UTC).

>>      ‘gmtoff’ (Optional.) The offset in seconds from GMT: positive
>>           values are East of the meridian.  Usually ‘NA’ if unknown,
>>           but ‘0’ could mean unknown.

>> so perhaps we should fill with the values for the unknown case?

>> -k

> Well,

> I think you both know I'm in the midst of dealing with these
> issues, to fix both
>    [.POSIXlt  and
>    [<-.POSIXlt

> Yes, one needs a way to not only "fill" the partially filled
> entries but also to *normalize* out-of-range values
> (say negative seconds, minutes > 60, etc)

> All this is available in our C code, but not on the R level,
> so yesterday, I wrote a C function to be called via .Internal(.)
> from a new R function that provides this.

> Provisionally called
>    balancePOSIXlt()

> because it both balances the 9 to 11 list-components of POSIXlt
> and it also puts all numbers of (sec, min, hour, mday, mon)
> into a correct range (and also computes correct wday and yday numbers).
> but I'm happy for proposals of better names.

> I had contemplated validatePOSIXlt() as alternative, but then
> dismissed that as in some sense we now do agree that
> "imbalanced" POSIXlt's are not really invalid ..

> ..
> and yes, to Davis: Even though I've spent so many hours with
> POSIXlt, POSIXct and Date during the last week, I'm still
> surprised more often than I like by the effects of timezone
> settings there.

> Martin

I have committed the new R and C code now, defining balancePOSIXlt(),
to get feedback from the community.

I've extended the documentation in help(DateTimeClasses), and
notably factored out the description of POSIXlt mentioning the
"ragged" and "out-of-range" cases.

This needs more testing and experiments, and I have not
announced it in NEWS yet.

Planned next is to use it in [.POSIXlt and [<-.POSIXlt
so they will work correctly.

But please share your thoughts, propositions, ...

Martin

[snip]

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel