[ https://issues.apache.org/jira/browse/SPARK-31702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-31702:
---------------------------------
    Description:
Old POSIXlt, POSIXct and Date become corrupt in SparkR. For example, see below:

{code}
# Non-existent timestamp in hybrid Julian and Gregorian Calendar
showDF(createDataFrame(as.data.frame(list(list(POSIXct=as.POSIXct("1582-10-10 00:01:00"), POSIXlt=as.POSIXlt("1582-10-10 00:01:00"))))))
{code}

{code}
+-------------------+-------------------+
|            POSIXct|            POSIXlt|
+-------------------+-------------------+
|1582-09-30 00:33:08|1582-09-30 00:33:08|
+-------------------+-------------------+
{code}

See https://docs.google.com/document/d/1an3Mzv6s0naO4mDwGFHJ48gLT--6EliA1GG3kbgBymo/edit?usp=sharing

Note that the results seem to have been wrong since the very first implementation. The cause seems to be that the R side uses the Proleptic Gregorian calendar while the JVM side uses the hybrid Julian and Gregorian calendar.

  was:
Old POSIXlt, POSIXct and Date become corrupt in SparkR. For example, see below:

{code}
# Non-existent timestamp in hybrid Julian and Gregorian Calendar
showDF(createDataFrame(as.data.frame(list(list(POSIXct=as.POSIXct("1582-10-10 00:01:00"), POSIXlt=as.POSIXlt("1582-10-10 00:01:00"))))))
{code}

{code}
+-------------------+-------------------+
|            POSIXct|            POSIXlt|
+-------------------+-------------------+
|1582-09-30 00:33:08|1582-09-30 00:33:08|
+-------------------+-------------------+
{code}

See https://docs.google.com/document/d/1Upf6c5fNM59Q6nko-ipjLLae86x9mBejwuXshii-Azg/edit?usp=sharing

Note that the results seem to have been wrong since the very first implementation. The cause seems to be that the R side uses the Proleptic Gregorian calendar while the JVM side uses the hybrid Julian and Gregorian calendar.
> Old POSIXlt, POSIXct and Date become corrupt due to calendar difference
> -----------------------------------------------------------------------
>
>                 Key: SPARK-31702
>                 URL: https://issues.apache.org/jira/browse/SPARK-31702
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR, SQL
>    Affects Versions: 2.4.5, 3.0.0
>            Reporter: Hyukjin Kwon
>            Priority: Major
>
> Old POSIXlt, POSIXct and Date become corrupt in SparkR. For example, see
> below:
> {code}
> # Non-existent timestamp in hybrid Julian and Gregorian Calendar
> showDF(createDataFrame(as.data.frame(list(list(POSIXct=as.POSIXct("1582-10-10 00:01:00"), POSIXlt=as.POSIXlt("1582-10-10 00:01:00"))))))
> {code}
> {code}
> +-------------------+-------------------+
> |            POSIXct|            POSIXlt|
> +-------------------+-------------------+
> |1582-09-30 00:33:08|1582-09-30 00:33:08|
> +-------------------+-------------------+
> {code}
> See
> https://docs.google.com/document/d/1an3Mzv6s0naO4mDwGFHJ48gLT--6EliA1GG3kbgBymo/edit?usp=sharing
> Note that the results seem to have been wrong since the very first
> implementation. The cause seems to be that the R side uses the Proleptic
> Gregorian calendar while the JVM side uses the hybrid Julian and Gregorian
> calendar.
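
The date part of the discrepancy reported above (1582-10-10 becoming 1582-09-30) is consistent with the same day being labelled in two different calendars. The following is a minimal sketch of that effect, written in Python purely for illustration. It is not SparkR or JVM code; the names `julian_ymd`, `hybrid_ymd` and `CUTOVER_ORDINAL` are hypothetical, and the cutover date of 1582-10-15 is assumed. It does not model the time-of-day shift (00:01:00 to 00:33:08) seen in the output.

```python
from datetime import date

# Assumed cutover: 1582-10-15 is the first Gregorian day of the hybrid calendar.
# Python's datetime.date is proleptic Gregorian, like the R side in the report.
CUTOVER_ORDINAL = date(1582, 10, 15).toordinal()


def julian_ymd(ordinal):
    """Convert a proleptic Gregorian ordinal to a Julian calendar (year, month, day)."""
    jdn = ordinal + 1721425  # Julian Day Number of that civil day
    # Standard integer arithmetic for Julian Day Number -> Julian calendar date.
    f = jdn + 1401
    e = 4 * f + 3
    g = (e % 1461) // 4
    h = 5 * g + 2
    day = (h % 153) // 5 + 1
    month = (h // 153 + 2) % 12 + 1
    year = e // 1461 - 4716 + (14 - month) // 12
    return year, month, day


def hybrid_ymd(d):
    """Label the day `d` as a hybrid Julian/Gregorian calendar would."""
    n = d.toordinal()
    if n >= CUTOVER_ORDINAL:
        return d.year, d.month, d.day  # on or after the cutover: Gregorian label
    return julian_ymd(n)               # before the cutover: Julian label


r_side = date(1582, 10, 10)
print("proleptic Gregorian:     %s" % r_side.isoformat())
print("hybrid Julian/Gregorian: %04d-%02d-%02d" % hybrid_ymd(r_side))
```

Under these assumptions the second line prints 1582-09-30, the same date that appears in the `showDF` output, while dates on or after 1582-10-15 are labelled identically by both calendars.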