[ https://issues.apache.org/jira/browse/SPARK-31212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064017#comment-17064017 ]
Maxim Gekk commented on SPARK-31212:
------------------------------------

The isLeapYear() function in 2.4 assumes the Proleptic Gregorian calendar:
https://github.com/apache/spark/blob/branch-2.4/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala#L600-L602
but Spark 2.4 is actually based on the hybrid Julian+Gregorian calendar, as we can see at
https://github.com/apache/spark/blob/branch-2.4/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala#L513-L517

This means the following functions in DateTimeUtils return incorrect results for dates before the Gregorian cutover day:
# getQuarter
# splitDate
# getMonth
# getDayOfMonth
# firstDayOfMonth
# dateAddMonths
# stringToTimestamp
# stringToDate
# monthsBetween
# getLastDayOfMonth

/cc [~cloud_fan] [~hyukjin.kwon]

> Failure of casting the '1000-02-29' string to the date type
> -----------------------------------------------------------
>
>                 Key: SPARK-31212
>                 URL: https://issues.apache.org/jira/browse/SPARK-31212
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.5
>            Reporter: Maxim Gekk
>            Priority: Major
>
> '1000-02-29' is a valid date in the Julian calendar used in Spark 2.4.5 for
> dates before 1582-10-15, but casting the string to the date type fails:
> {code:scala}
> scala> val df = Seq("1000-02-29").toDF("dateS").select($"dateS".cast("date").as("date"))
> df: org.apache.spark.sql.DataFrame = [date: date]
> scala> df.show
> +----+
> |date|
> +----+
> |null|
> +----+
> {code}
> Creating a dataset from java.sql.Date with the same input string works
> correctly:
> {code:scala}
> scala> val df2 = Seq(java.sql.Date.valueOf("1000-02-29")).toDF("dateS").select($"dateS".as("date"))
> df2: org.apache.spark.sql.DataFrame = [date: date]
> scala> df2.show
> +----------+
> |      date|
> +----------+
> |1000-02-29|
> +----------+
> {code}
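
To illustrate the mismatch described in the comment, here is a minimal standalone sketch that compares the two leap-year rules for the year 1000. It does not reproduce Spark's DateTimeUtils code. The object name LeapYearCalendars and its two helper methods are hypothetical names chosen for this example. The hybrid rule is taken from java.util.GregorianCalendar with its default cutover, which is assumed here to match the calendar behind java.sql.Date.

```scala
import java.util.GregorianCalendar

// Hypothetical helper object for illustration; not part of Spark.
object LeapYearCalendars {
  // Proleptic Gregorian rule: the Gregorian leap-year arithmetic applied to every year,
  // including years before the 1582 cutover.
  def isLeapProleptic(year: Int): Boolean =
    (year % 4 == 0) && ((year % 100 != 0) || (year % 400 == 0))

  // Hybrid Julian+Gregorian rule: java.util.GregorianCalendar applies the Julian rule
  // (every 4th year) before its default cutover and the Gregorian rule after it.
  def isLeapHybrid(year: Int): Boolean =
    new GregorianCalendar().isLeapYear(year)

  def main(args: Array[String]): Unit = {
    // The two rules disagree for the year 1000, so 1000-02-29 exists in only one of them.
    println(s"proleptic Gregorian, 1000: ${isLeapProleptic(1000)}") // false
    println(s"hybrid Julian+Gregorian, 1000: ${isLeapHybrid(1000)}") // true

    // After the cutover both rules agree.
    println(s"proleptic Gregorian, 2000: ${isLeapProleptic(2000)}") // true
    println(s"hybrid Julian+Gregorian, 2000: ${isLeapHybrid(2000)}") // true
  }
}
```

Under these assumptions, a string parser that validates February 29 with the proleptic rule rejects '1000-02-29', while code that goes through the hybrid calendar accepts it. This is consistent with the null and 1000-02-29 results shown in the issue description.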