[ 
https://issues.apache.org/jira/browse/SPARK-30668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025549#comment-17025549
 ] 

Maxim Gekk commented on SPARK-30668:
------------------------------------

Date/timestamp parsing in Spark 3.0 is based on the Java 8 DateTimeFormatter, 
which interprets some pattern letters differently (see 
[https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html]):
{code}
   V       time-zone ID                zone-id           America/Los_Angeles; Z; -08:30
   z       time-zone name              zone-name         Pacific Standard Time; PST
   O       localized zone-offset       offset-O          GMT+8; GMT+08:00; UTC-08:00;
   X       zone-offset 'Z' for zero    offset-X          Z; -08; -0830; -08:30; -083015; -08:30:15;
   x       zone-offset                 offset-x          +0000; -08; -0830; -08:30; -083015; -08:30:15;
   Z       zone-offset                 offset-Z          +0000; -0800; -08:00;
{code}
As you can see, 'z' stands for a time zone name, but you are trying to parse 
a zone offset. You can use 'x' or 'Z' in the pattern instead of 'z':
{code}
scala> spark.sql("""SELECT to_timestamp("2020-01-27T20:06:11.847-0800", "yyyy-MM-dd'T'HH:mm:ss.SSSZ")""").show(false)
+----------------------------------------------------------------------------+
|to_timestamp('2020-01-27T20:06:11.847-0800', 'yyyy-MM-dd\'T\'HH:mm:ss.SSSZ')|
+----------------------------------------------------------------------------+
|2020-01-28 07:06:11.847                                                     |
+----------------------------------------------------------------------------+
{code}
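
The difference between 'z' and 'Z' can also be checked outside Spark with 
plain java.time. The following is only a minimal sketch, not Spark's internal 
parsing code: the object name OffsetPatternCheck and the helper parseWith are 
made up for illustration, and it assumes nothing beyond the JDK's 
DateTimeFormatter.

```scala
import java.time.{Instant, OffsetDateTime}
import java.time.format.DateTimeFormatter
import scala.util.Try

object OffsetPatternCheck {
  // Input string from the issue description.
  val input = "2020-01-27T20:06:11.847-0800"

  // Hypothetical helper: try to parse `input` with `pattern`.
  // Returns None if DateTimeFormatter rejects the input.
  def parseWith(pattern: String): Option[Instant] =
    Try(OffsetDateTime.parse(input, DateTimeFormatter.ofPattern(pattern)))
      .toOption
      .map(_.toInstant)

  def main(args: Array[String]): Unit = {
    // 'z' is the time-zone name letter; the colon-less offset -0800 is not accepted.
    println(parseWith("yyyy-MM-dd'T'HH:mm:ss.SSSz"))
    // 'Z' is a zone-offset letter; -0800 matches it.
    println(parseWith("yyyy-MM-dd'T'HH:mm:ss.SSSZ"))
  }
}
```

The helper returns an Instant rather than a local timestamp, so the result does 
not depend on the session time zone the way the show() output above does.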

Parsing in Spark 2.4 is based on SimpleDateFormat (see 
https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html), 
where 'z' denotes a general time zone and, when parsing, also accepts RFC 822 
offsets such as -0800.
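
The legacy behaviour can be checked with SimpleDateFormat directly. This is a 
rough sketch rather than the code path Spark 2.4 actually runs: the object name 
LegacyPatternCheck and the helper parseLegacy are hypothetical, and Locale.US 
is chosen here only to keep the example deterministic.

```scala
import java.text.SimpleDateFormat
import java.time.Instant
import java.util.Locale

object LegacyPatternCheck {
  // Hypothetical helper: parse with the pattern from the issue, which uses 'z'.
  def parseLegacy(text: String): Instant = {
    // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
    val legacy = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSz", Locale.US)
    legacy.parse(text).toInstant
  }

  def main(args: Array[String]): Unit = {
    // The RFC 822 style offset -0800 is accepted by 'z' in SimpleDateFormat.
    println(parseLegacy("2020-01-27T20:06:11.847-0800"))
  }
}
```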

> to_timestamp failed to parse 2020-01-27T20:06:11.847-0800 using pattern 
> "yyyy-MM-dd'T'HH:mm:ss.SSSz"
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-30668
>                 URL: https://issues.apache.org/jira/browse/SPARK-30668
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Xiao Li
>            Priority: Blocker
>
> {code:java}
> SELECT to_timestamp("2020-01-27T20:06:11.847-0800", 
> "yyyy-MM-dd'T'HH:mm:ss.SSSz")
> {code}
> This returns a valid value in Spark 2.4 but returns NULL in the latest 
> master.


