>>>>> Kurt Hornik >>>>> on Tue, 11 Oct 2022 16:44:13 +0200 writes:
>>>>> Davis Vaughan writes:

>> I've got a bit more information about this one. It seems like it
>> (only? not sure) appears when `TZ = "UTC"`, which is why I didn't see
>> it before on my Mac, which defaults to `TZ = ""`. I think this is at
>> least explainable by the fact that those "optional" fields aren't
>> technically needed when the time zone is UTC.

> Exactly.

> Debugging `[<-.POSIXlt` with

>   x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))
>   Sys.setenv(TZ = "UTC")
>   x[1] <- NA

> shows we get into

>   value <- unclass(as.POSIXlt(value))
>   if (ici) {
>     for (n in names(x)) names(x[[n]]) <- nms
>   }
>   for (n in names(x)) x[[n]][i] <- value[[n]]

> where

>   Browse[2]> names(value)
>   [1] "sec" "min" "hour" "mday" "mon" "year" "wday" "yday" "isdst"
>   Browse[2]> names(x)
>   [1] "sec" "min" "hour" "mday" "mon" "year" "wday" "yday"
>   [9] "isdst" "zone" "gmtoff"

> Without having looked at the code, the docs say

>   ‘zone’ (Optional.) The abbreviation for the time zone in force at
>     that time: ‘""’ if unknown (but ‘""’ might also be used for
>     UTC).

>   ‘gmtoff’ (Optional.) The offset in seconds from GMT: positive
>     values are East of the meridian. Usually ‘NA’ if unknown,
>     but ‘0’ could mean unknown.

> so perhaps we should fill with the values for the unknown case?

> -k

Well, I think you both know I'm in the midst of dealing with these
issues, to fix both [.POSIXlt and [<-.POSIXlt.

Yes, one needs a way to not only "fill" the partially filled entries
but also to *normalize* out-of-range values (say negative seconds,
minutes > 60, etc).

All this is available in our C code, but not on the R level, so
yesterday I wrote a C function to be called via .Internal(.) from a
new R function that provides this.

Provisionally called balancePOSIXlt() because it both balances the
9 to 11 list-components of POSIXlt and also puts all numbers of
(sec, min, hour, mday, mon) into a correct range (and also correctly
computes the wday and yday numbers), but I'm happy for proposals of
better names.
I had contemplated validatePOSIXlt() as alternative, but then
dismissed that as in some sense we now do agree that "imbalanced"
POSIXlt's are not really invalid ..

.. and yes, to Davis: Even though I've spent so many hours with
POSIXlt, POSIXct and Date during the last week, I'm still surprised
more often than I like by the effects of timezone settings there.

Martin

>> I can reproduce this now on my personal Mac:

>> ```
>> x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))
>> Sys.setenv(TZ = "")
>> x[1] <- NA
>> x
>> #> [1] NA
>>
>> x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))
>> Sys.setenv(TZ = "America/New_York")
>> x[1] <- NA
>> x
>> #> [1] NA
>>
>> x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))
>> Sys.setenv(TZ = "UTC")
>> x[1] <- NA
>> #> Error in x[[n]][i] <- value[[n]] : replacement has length zero
>> x
>> #> [1] "2013-01-31 CST"
>> ```

>> Here are `sessionInfo()` and `Sys.getenv("TZ")` outputs for 3 GitHub
>> Actions platforms where the bug exists (note they all set `TZ = "UTC"`!):

>> Linux:

>> ```
>> > sessionInfo()
>> R version 4.2.1 (2022-06-23)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>> Running under: Ubuntu 18.04.6 LTS
>>
>> Matrix products: default
>> BLAS: /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
>> LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so
>>
>> locale:
>> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
>> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
>> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
>> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> loaded via a namespace (and not attached):
>> [1] compiler_4.2.1
>>
>> > Sys.getenv("TZ")
>> [1] "UTC"
>> ```

>> Mac:

>> ```
>> > sessionInfo()
>> R version 4.2.1 (2022-06-23)
>> Platform: x86_64-apple-darwin17.0 (64-bit)
>> Running under: macOS Big Sur ... 10.16
>>
>> Matrix products: default
>> BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
>> LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
>>
>> locale:
>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> loaded via a namespace (and not attached):
>> [1] compiler_4.2.1
>>
>> > Sys.getenv("TZ")
>> [1] "UTC"
>> ```

>> Windows:

>> This is the best I can get you, sorry (remote worker issues), but note that
>> it does also say `tz UTC` like the others.

>> ```
>> version  R version 4.2.1 (2022-06-23 ucrt)
>> os       Windows Server x64 (build 20348)
>> system   x86_64, mingw32
>> ui       RTerm
>> language (EN)
>> collate  English_United States.utf8
>> ctype    English_United States.utf8
>> tz       UTC
>> date     2022-10-11
>> ```

>> And here is my Mac where the bug doesn't show up by default because
>> `TZ = ""`:

>> ```
>> > sessionInfo()
>> R version 4.2.1 (2022-06-23)
>> Platform: x86_64-apple-darwin17.0 (64-bit)
>> Running under: macOS Big Sur ... 10.16
>>
>> Matrix products: default
>> BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
>> LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
>>
>> locale:
>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> loaded via a namespace (and not attached):
>> [1] compiler_4.2.1
>>
>> > Sys.getenv("TZ")
>> [1] ""
>> > Sys.timezone()
>> [1] "America/New_York"
>> ```

>> -Davis

>> On Thu, Oct 6, 2022 at 9:33 AM Davis Vaughan <da...@rstudio.com> wrote:

>>> Hi all,
>>>
>>> I have found another POSIXlt bug while I've been fiddling around with it.
>>> This one only appears on specific OSes, because it has to do with the fact
>>> that the `gmtoff` field is optional, and isn't always used on all OSes.
>>> It also doesn't seem to be specific to r-devel, I think it has been there
>>> awhile.
>>>
>>> Here is the bug:
>>>
>>> ```
>>> x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))
>>>
>>> # Oh no!
>>> x[1] <- NA
>>> #> Error in x[[n]][i] <- value[[n]] : replacement has length zero
>>> ```
>>>
>>> If you look at the objects, you can see that `x` has a `gmtoff` field, but
>>> `NA` (when converted to POSIXlt, which is what `[<-.POSIXlt` does) does not:
>>>
>>> ```
>>> unclass(x)
>>> #> $sec
>>> #> [1] 0
>>> #>
>>> #> $min
>>> #> [1] 0
>>> #>
>>> #> $hour
>>> #> [1] 0
>>> #>
>>> #> $mday
>>> #> [1] 31
>>> #>
>>> #> $mon
>>> #> [1] 0
>>> #>
>>> #> $year
>>> #> [1] 113
>>> #>
>>> #> $wday
>>> #> [1] 4
>>> #>
>>> #> $yday
>>> #> [1] 30
>>> #>
>>> #> $isdst
>>> #> [1] 0
>>> #>
>>> #> $zone
>>> #> [1] "CST"
>>> #>
>>> #> $gmtoff
>>> #> [1] -21600
>>> #>
>>> #> attr(,"tzone")
>>> #> [1] "America/Chicago" "CST" "CDT"
>>>
>>> unclass(as.POSIXlt(NA))
>>> #> $sec
>>> #> [1] NA
>>> #>
>>> #> $min
>>> #> [1] NA
>>> #>
>>> #> $hour
>>> #> [1] NA
>>> #>
>>> #> $mday
>>> #> [1] NA
>>> #>
>>> #> $mon
>>> #> [1] NA
>>> #>
>>> #> $year
>>> #> [1] NA
>>> #>
>>> #> $wday
>>> #> [1] NA
>>> #>
>>> #> $yday
>>> #> [1] NA
>>> #>
>>> #> $isdst
>>> #> [1] -1
>>> #>
>>> #> attr(,"tzone")
>>> #> [1] "UTC"
>>> ```
>>>
>>> The problem seems to be that `[<-.POSIXlt` assumes that if the field was
>>> there in `x` then it must also be there in `value`:
>>>
>>> https://github.com/wch/r-source/blob/e10a971dee6a0ab851279c183cc21954d66b3be4/src/library/base/R/datetime.R#L1303-L1304
>>>
>>> But this isn't the case for the `NA` value that was converted to POSIXlt.
>>>
>>> I can't reproduce this on my personal Mac, but it affects the Linux, Mac,
>>> and Windows machines we use for the lubridate CI checks through GitHub
>>> Actions.
>>>
>>> Thanks,
>>> Davis
>>>
>>> ______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
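
For readers following the thread, Kurt's suggestion of filling the optional fields with their documented "unknown" values can be sketched at the R level. This is only an illustration under stated assumptions, not the fix that went into base R: `assign_fields()` and `fill_optional_fields()` are hypothetical helper names, the two field lists are typed in by hand from the `unclass()` output quoted in the thread (so the result does not depend on the R version or on `TZ`), and the fill values (`""` for `zone`, `NA` for `gmtoff`) are taken from the documentation excerpt quoted above.

```r
# Field lists copied by hand from the unclass() output quoted in the thread,
# so the example behaves the same on every R version and TZ setting.
x <- list(sec = 0, min = 0L, hour = 0L, mday = 31L, mon = 0L, year = 113L,
          wday = 4L, yday = 30L, isdst = 0L, zone = "CST", gmtoff = -21600L)
value <- list(sec = NA_real_, min = NA_integer_, hour = NA_integer_,
              mday = NA_integer_, mon = NA_integer_, year = NA_integer_,
              wday = NA_integer_, yday = NA_integer_, isdst = -1L)

# Hypothetical helper: the field-by-field loop quoted from `[<-.POSIXlt`.
assign_fields <- function(x, value, i) {
  for (n in names(x)) x[[n]][i] <- value[[n]]
  x
}

# Hypothetical helper: add any field that `template` has but `value` lacks,
# using the documented "unknown" values ("" for zone, NA for gmtoff).
fill_optional_fields <- function(value, template) {
  n <- length(value[[1L]])
  for (f in setdiff(names(template), names(value))) {
    value[[f]] <- switch(f,
      zone   = rep("", n),
      gmtoff = rep(NA_integer_, n),
      stop("no fill value known for field: ", f)
    )
  }
  value
}

# Without filling, the loop fails because value[["zone"]] is NULL.
err <- tryCatch(assign_fields(x, value, 1L),
                error = function(e) conditionMessage(e))

# With filling, every field of x has a counterpart and the loop completes.
res <- assign_fields(x, fill_optional_fields(value, x), 1L)
```

Here `err` holds the error message from the unfilled assignment, while `res` is the full 11-field list with the first element set to the "unknown" values. The sketch only covers the "fill" half of the problem; the normalization of out-of-range values that Martin describes is not attempted.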