[ https://issues.apache.org/jira/browse/SPARK-30668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025549#comment-17025549 ]
Maxim Gekk commented on SPARK-30668:
------------------------------------

Date/timestamp parsing in Spark 3.0 is based on Java 8's DateTimeFormatter, which has a different notion of some pattern letters (see [https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html]):
{code}
V   time-zone ID              zone-id    America/Los_Angeles; Z; -08:30
z   time-zone name            zone-name  Pacific Standard Time; PST
O   localized zone-offset     offset-O   GMT+8; GMT+08:00; UTC-08:00;
X   zone-offset 'Z' for zero  offset-X   Z; -08; -0830; -08:30; -083015; -08:30:15;
x   zone-offset               offset-x   +0000; -08; -0830; -08:30; -083015; -08:30:15;
Z   zone-offset               offset-Z   +0000; -0800; -08:00;
{code}
As you can see, 'z' is for time-zone names, but you are trying to parse a zone offset. You can use 'x' or 'Z' in the pattern instead of 'z':
{code}
scala> spark.sql("""SELECT to_timestamp("2020-01-27T20:06:11.847-0800", "yyyy-MM-dd'T'HH:mm:ss.SSSZ")""").show(false)
+----------------------------------------------------------------------------+
|to_timestamp('2020-01-27T20:06:11.847-0800', 'yyyy-MM-dd\'T\'HH:mm:ss.SSSZ')|
+----------------------------------------------------------------------------+
|2020-01-28 07:06:11.847                                                     |
+----------------------------------------------------------------------------+
{code}
Parsing in Spark 2.4 is based on SimpleDateFormat (see https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html), where 'z' has a slightly different meaning.
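The behavior difference can be reproduced outside of Spark with the two underlying Java APIs. Below is a minimal standalone sketch (class name ZonePatternDemo is illustrative, not from the Spark codebase): java.time's 'Z' accepts the numeric offset -0800, java.time's 'z' rejects it, and the legacy SimpleDateFormat also accepts RFC 822 offsets for 'z' when parsing, which is why the old pattern worked in Spark 2.4:
```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class ZonePatternDemo {
    public static void main(String[] args) throws ParseException {
        String input = "2020-01-27T20:06:11.847-0800";

        // java.time (Spark 3.0): 'Z' (offset-Z) matches numeric offsets such as -0800
        DateTimeFormatter offsetFmt =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSZ");
        OffsetDateTime parsed = OffsetDateTime.parse(input, offsetFmt);
        System.out.println(parsed.toInstant()); // 2020-01-28T04:06:11.847Z

        // java.time: 'z' (zone-name) expects textual names such as "PST",
        // so the numeric offset -0800 fails to parse
        DateTimeFormatter nameFmt =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSz");
        try {
            OffsetDateTime.parse(input, nameFmt);
            System.out.println("unexpected: 'z' parsed a numeric offset");
        } catch (DateTimeParseException e) {
            System.out.println("'z' rejects -0800, as expected");
        }

        // Legacy SimpleDateFormat (Spark 2.4): 'z' (general time zone) also
        // accepts RFC 822 offsets when parsing, per its Javadoc
        SimpleDateFormat legacy =
            new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSz");
        long legacyMillis = legacy.parse(input).getTime();
        System.out.println(legacyMillis == parsed.toInstant().toEpochMilli()); // true
    }
}
```
Both APIs that succeed agree on the same instant; only the interpretation of the pattern letter 'z' changed between them.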
> to_timestamp failed to parse 2020-01-27T20:06:11.847-0800 using pattern "yyyy-MM-dd'T'HH:mm:ss.SSSz"
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-30668
>                 URL: https://issues.apache.org/jira/browse/SPARK-30668
>             Project: Spark
>          Issue Type: Bug
>      Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Xiao Li
>            Priority: Blocker
>
> {code:java}
> SELECT to_timestamp("2020-01-27T20:06:11.847-0800", "yyyy-MM-dd'T'HH:mm:ss.SSSz")
> {code}
> This returns a valid value in Spark 2.4 but returns NULL in the latest master.

--
This message was sent by Atlassian Jira (v8.3.4#803005)