[ 
https://issues.apache.org/jira/browse/SPARK-31702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-31702:
---------------------------------
    Description: 
Old POSIXlt, POSIXct and Date become corrupt in SparkR. For example, see below:

{code}
# Non-existent timestamp in hybrid Julian and Gregorian Calendar
showDF(createDataFrame(as.data.frame(list(list(
  POSIXct=as.POSIXct("1582-10-10 00:01:00"),
  POSIXlt=as.POSIXlt("1582-10-10 00:01:00"))))))
{code}

{code}
+-------------------+-------------------+
|            POSIXct|            POSIXlt|
+-------------------+-------------------+
|1582-09-30 00:33:08|1582-09-30 00:33:08|
+-------------------+-------------------+
{code}

See 
https://docs.google.com/document/d/1an3Mzv6s0naO4mDwGFHJ48gLT--6EliA1GG3kbgBymo/edit?usp=sharing

Note that the results appear to have been wrong since the very first 
implementation. The likely cause is that the R side uses the proleptic 
Gregorian calendar, while the JVM side uses the hybrid Julian and Gregorian 
calendar.
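
The calendar mismatch can be reproduced on the JVM alone. The following is a minimal sketch, not the SparkR code path: it assumes that java.time.LocalDate stands in for the proleptic Gregorian side and that java.util.GregorianCalendar (default cutover 1582-10-15) stands in for the hybrid side. The class and method names are hypothetical. It covers only the date part; the time-of-day shift in the output above is not modeled.

```java
import java.time.LocalDate;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper for illustration only; not part of Spark.
final class CalendarGap {
    static final long MILLIS_PER_DAY = 86_400_000L;

    // Days since 1970-01-01 for a date label read in the proleptic
    // Gregorian calendar (java.time applies Gregorian rules to all years).
    static long prolepticEpochDay(int year, int month, int day) {
        return LocalDate.of(year, month, day).toEpochDay();
    }

    // Render an epoch-day count as a date label in the hybrid calendar
    // (Julian before 1582-10-15, Gregorian from that day on).
    static String hybridLabel(long epochDay) {
        GregorianCalendar cal =
            new GregorianCalendar(TimeZone.getTimeZone("UTC"), Locale.ROOT);
        cal.setTimeInMillis(epochDay * MILLIS_PER_DAY);
        return String.format(Locale.ROOT, "%04d-%02d-%02d",
            cal.get(Calendar.YEAR),
            cal.get(Calendar.MONTH) + 1, // Calendar months are 0-based
            cal.get(Calendar.DAY_OF_MONTH));
    }

    public static void main(String[] args) {
        // The same instant carries two different labels.
        long day = prolepticEpochDay(1582, 10, 10);
        System.out.println("1582-10-10 (proleptic) -> "
            + hybridLabel(day) + " (hybrid)"); // hybrid label: 1582-09-30
    }
}
```

The day count produced for 1582-10-10 on the proleptic side falls five days before the cutover, so the hybrid side labels it with the Julian date 1582-09-30, which matches the date shown in the table above.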

  was:
Old POSIXlt, POSIXct and Date become corrupt in SparkR. For example, see below:

{code}
# Non-existent timestamp in hybrid Julian and Gregorian Calendar
showDF(createDataFrame(as.data.frame(list(list(
  POSIXct=as.POSIXct("1582-10-10 00:01:00"),
  POSIXlt=as.POSIXlt("1582-10-10 00:01:00"))))))
{code}

{code}
+-------------------+-------------------+
|            POSIXct|            POSIXlt|
+-------------------+-------------------+
|1582-09-30 00:33:08|1582-09-30 00:33:08|
+-------------------+-------------------+
{code}

See 
https://docs.google.com/document/d/1Upf6c5fNM59Q6nko-ipjLLae86x9mBejwuXshii-Azg/edit?usp=sharing

Note that the results appear to have been wrong since the very first 
implementation. The likely cause is that the R side uses the proleptic 
Gregorian calendar, while the JVM side uses the hybrid Julian and Gregorian 
calendar.


> Old POSIXlt, POSIXct and Date become corrupt due to calendar difference
> -----------------------------------------------------------------------
>
>                 Key: SPARK-31702
>                 URL: https://issues.apache.org/jira/browse/SPARK-31702
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR, SQL
>    Affects Versions: 2.4.5, 3.0.0
>            Reporter: Hyukjin Kwon
>            Priority: Major


