[ 
https://issues.apache.org/jira/browse/SPARK-30668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025651#comment-17025651
 ] 

Maxim Gekk commented on SPARK-30668:
------------------------------------

> This is not mentioned in the migration guide.

It is mentioned:
{code}
    - The `unix_timestamp`, `date_format`, `to_unix_timestamp`,
`from_unixtime`, `to_date`, `to_timestamp` functions. The new implementation
supports pattern formats as described here
https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html
 and performs strict checking of its input. For example, the `2015-07-22
10:00:00` timestamp cannot be parsed if the pattern is `yyyy-MM-dd` because the
parser does not consume the whole input. Another example is that the
`31/01/2015 00:00` input cannot be parsed by the `dd/MM/yyyy hh:mm` pattern
because `hh` expects hours in the range `1-12`.
{code}
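
To illustrate the two cases from the guide, here is a minimal sketch against plain java.time, not Spark's actual code. The `canParse` helper and the class name are made up for the example, and using the STRICT resolver style is an assumption about how to reproduce the strict checking outside Spark:

```java
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.Locale;

final class StrictParsingSketch {
    // Hypothetical helper: true if `input` parses completely under `pattern`.
    // ResolverStyle.STRICT is an assumption made to mimic strict checking.
    static boolean canParse(String pattern, String input) {
        DateTimeFormatter formatter = DateTimeFormatter
            .ofPattern(pattern, Locale.US)
            .withResolverStyle(ResolverStyle.STRICT);
        try {
            // Throws if text is left over or a field value is out of range.
            formatter.parse(input);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Trailing " 10:00:00" is not consumed by the pattern.
        System.out.println(canParse("yyyy-MM-dd", "2015-07-22 10:00:00"));     // false
        // `hh` is clock-hour-of-am-pm (1-12), so hour 00 is rejected.
        System.out.println(canParse("dd/MM/yyyy hh:mm", "31/01/2015 00:00"));  // false
        // `HH` is hour-of-day (0-23), so the same input is accepted.
        System.out.println(canParse("dd/MM/yyyy HH:mm", "31/01/2015 00:00"));  // true
    }
}
```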
 
> Do we have a simple way to remove such a behavior change? 

The change is related to the migration to the Proleptic Gregorian calendar. To
remove the behavior change, you would need to revert most of
https://issues.apache.org/jira/browse/SPARK-26651 and maybe more.

> For example, converting the pattern for users?

Even if it is possible to convert patterns, the results can differ for old
dates because of the calendar system.

> Can we let users choose between the SimpleDateFormat and DateTimeFormatter
> parsing mechanisms?

No, the flag for that was removed a year ago; see
https://issues.apache.org/jira/browse/SPARK-26503 and
https://github.com/apache/spark/pull/23391#discussion_r244414750
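
For reference, the two JDK parsers treat the `z` letter differently, which is enough to show the gap on the input from this ticket. SimpleDateFormat documents that `z` also accepts RFC 822 offsets such as `-0800` when parsing, while in DateTimeFormatter `z` is the time-zone name and `Z` is the offset. A sketch with plain JDK classes follows; the helper and class names are hypothetical, and this is not Spark's code path:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

final class ParserComparisonSketch {
    // Hypothetical helper: parse with the legacy java.text API.
    static boolean legacyParses(String pattern, String input) {
        try {
            new SimpleDateFormat(pattern, Locale.US).parse(input);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    // Hypothetical helper: parse with java.time using default settings.
    static boolean javaTimeParses(String pattern, String input) {
        try {
            DateTimeFormatter.ofPattern(pattern, Locale.US).parse(input);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String input = "2020-01-27T20:06:11.847-0800";
        // Legacy parser: `z` accepts the RFC 822 offset.
        System.out.println(legacyParses("yyyy-MM-dd'T'HH:mm:ss.SSSz", input));
        // java.time: `z` is a zone name, so the numeric offset is not expected to match.
        System.out.println(javaTimeParses("yyyy-MM-dd'T'HH:mm:ss.SSSz", input));
        // java.time: `Z` is the offset letter for values like -0800.
        System.out.println(javaTimeParses("yyyy-MM-dd'T'HH:mm:ss.SSSZ", input));
    }
}
```

This only shows the JDK-level difference in pattern letters. As said above, results can still differ for old dates because of the calendar system.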

> to_timestamp failed to parse 2020-01-27T20:06:11.847-0800 using pattern 
> "yyyy-MM-dd'T'HH:mm:ss.SSSz"
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-30668
>                 URL: https://issues.apache.org/jira/browse/SPARK-30668
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Xiao Li
>            Priority: Blocker
>
> {code:java}
> SELECT to_timestamp("2020-01-27T20:06:11.847-0800", 
> "yyyy-MM-dd'T'HH:mm:ss.SSSz")
> {code}
> This returns a valid value in Spark 2.4 but returns NULL on the latest
> master.


