[ https://issues.apache.org/jira/browse/SPARK-20166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-20166:
---------------------------------
    Description:
We can use the {{XXX}} format instead of {{ZZ}}. {{ZZ}} seems to be specific to {{FastDateFormat}}.

Please see https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html#iso8601timezone and https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/time/FastDateFormat.html

{{ZZ}} supports "ISO 8601 extended format time zones", but it seems to be a {{FastDateFormat}}-specific option.

It seems better to replace {{ZZ}} with {{XXX}} because they appear to use the same strategy (see https://github.com/apache/commons-lang/blob/8767cd4f1a6af07093c1e6c422dae8e574be7e5e/src/main/java/org/apache/commons/lang3/time/FastDateParser.java#L930).

I also checked the code and debugged it manually to be sure. It seems both cases use the same pattern {code}(Z|(?:[+-]\\d{2}(?::)\\d{2})){code}.

Note that this is a documentation fix, not a behaviour change, because {{ZZ}} seems to be an invalid date format in {{SimpleDateFormat}}, which is the class that {{DataFrameReader}} documents custom formats as following:

{quote}
   * <li>`timestampFormat` (default `yyyy-MM-dd'T'HH:mm:ss.SSSZZ`): sets the string that
   * indicates a timestamp format. Custom date formats follow the formats at
   * `java.text.SimpleDateFormat`. This applies to timestamp type.</li>
{quote}

{code}
scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000-11:00")
res4: java.util.Date = Tue Mar 21 20:00:00 KST 2017

scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000Z")
res10: java.util.Date = Tue Mar 21 09:00:00 KST 2017

scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000-11:00")
java.text.ParseException: Unparseable date: "2017-03-21T00:00:00.000-11:00"
  at java.text.DateFormat.parse(DateFormat.java:366)
  ... 48 elided

scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000Z")
java.text.ParseException: Unparseable date: "2017-03-21T00:00:00.000Z"
  at java.text.DateFormat.parse(DateFormat.java:366)
  ... 48 elided
{code}

{code}
scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000-11:00")
res7: java.util.Date = Tue Mar 21 20:00:00 KST 2017

scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000Z")
res1: java.util.Date = Tue Mar 21 09:00:00 KST 2017

scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000-11:00")
res8: java.util.Date = Tue Mar 21 20:00:00 KST 2017

scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000Z")
res2: java.util.Date = Tue Mar 21 09:00:00 KST 2017
{code}

  was:
We can use the {{XXX}} format instead of {{ZZ}}. {{ZZ}} seems to be specific to {{FastDateFormat}}.

Please see https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html#iso8601timezone and https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/time/FastDateFormat.html

{{ZZ}} supports "ISO 8601 extended format time zones", but it seems to be a {{FastDateFormat}}-specific option.

It seems better to replace {{ZZ}} with {{XXX}} because they appear to use the same strategy (see https://github.com/apache/commons-lang/blob/8767cd4f1a6af07093c1e6c422dae8e574be7e5e/src/main/java/org/apache/commons/lang3/time/FastDateParser.java#L930).

I also checked the code and debugged it manually to be sure. It seems both cases use the same pattern {{"(Z|(?:[+-]\\d{2}(?::)\\d{2}))"}}.
Note that this is a documentation fix, not a behaviour change, because {{ZZ}} seems to be an invalid date format in {{SimpleDateFormat}}, which is the class that {{DataFrameReader}} documents custom formats as following:

{quote}
   * <li>`timestampFormat` (default `yyyy-MM-dd'T'HH:mm:ss.SSSZZ`): sets the string that
   * indicates a timestamp format. Custom date formats follow the formats at
   * `java.text.SimpleDateFormat`. This applies to timestamp type.</li>
{quote}

{code}
scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000-11:00")
res4: java.util.Date = Tue Mar 21 20:00:00 KST 2017

scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000Z")
res10: java.util.Date = Tue Mar 21 09:00:00 KST 2017

scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000-11:00")
java.text.ParseException: Unparseable date: "2017-03-21T00:00:00.000-11:00"
  at java.text.DateFormat.parse(DateFormat.java:366)
  ... 48 elided

scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000Z")
java.text.ParseException: Unparseable date: "2017-03-21T00:00:00.000Z"
  at java.text.DateFormat.parse(DateFormat.java:366)
  ... 48 elided
{code}

{code}
scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000-11:00")
res7: java.util.Date = Tue Mar 21 20:00:00 KST 2017

scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000Z")
res1: java.util.Date = Tue Mar 21 09:00:00 KST 2017

scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000-11:00")
res8: java.util.Date = Tue Mar 21 20:00:00 KST 2017

scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000Z")
res2: java.util.Date = Tue Mar 21 09:00:00 KST 2017
{code}


> Use XXX for ISO timezone instead of ZZ which is FastDateFormat specific in CSV/JSON time related options
> --------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-20166
>                 URL: https://issues.apache.org/jira/browse/SPARK-20166
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Hyukjin Kwon
>            Priority: Trivial
>
> We can use the {{XXX}} format instead of {{ZZ}}. {{ZZ}} seems to be specific to {{FastDateFormat}}.
> Please see https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html#iso8601timezone
> and https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/time/FastDateFormat.html
> {{ZZ}} supports "ISO 8601 extended format time zones", but it seems to be a {{FastDateFormat}}-specific option.
> It seems better to replace {{ZZ}} with {{XXX}} because they appear to use the same strategy
> (see https://github.com/apache/commons-lang/blob/8767cd4f1a6af07093c1e6c422dae8e574be7e5e/src/main/java/org/apache/commons/lang3/time/FastDateParser.java#L930).
>
> I also checked the code and debugged it manually to be sure.
> It seems both cases use the same pattern {code}(Z|(?:[+-]\\d{2}(?::)\\d{2})){code}.
> Note that this is a documentation fix, not a behaviour change, because {{ZZ}} seems to be an invalid
> date format in {{SimpleDateFormat}}, which is the class that {{DataFrameReader}} documents custom
> formats as following:
> {quote}
>    * <li>`timestampFormat` (default `yyyy-MM-dd'T'HH:mm:ss.SSSZZ`): sets the string that
>    * indicates a timestamp format. Custom date formats follow the formats at
>    * `java.text.SimpleDateFormat`. This applies to timestamp type.</li>
> {quote}
> {code}
> scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000-11:00")
> res4: java.util.Date = Tue Mar 21 20:00:00 KST 2017
> scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000Z")
> res10: java.util.Date = Tue Mar 21 09:00:00 KST 2017
> scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000-11:00")
> java.text.ParseException: Unparseable date: "2017-03-21T00:00:00.000-11:00"
>   at java.text.DateFormat.parse(DateFormat.java:366)
>   ... 48 elided
> scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000Z")
> java.text.ParseException: Unparseable date: "2017-03-21T00:00:00.000Z"
>   at java.text.DateFormat.parse(DateFormat.java:366)
>   ... 48 elided
> {code}
> {code}
> scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000-11:00")
> res7: java.util.Date = Tue Mar 21 20:00:00 KST 2017
> scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000Z")
> res1: java.util.Date = Tue Mar 21 09:00:00 KST 2017
> scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000-11:00")
> res8: java.util.Date = Tue Mar 21 20:00:00 KST 2017
> scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000Z")
> res2: java.util.Date = Tue Mar 21 09:00:00 KST 2017
> {code}
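
The {{SimpleDateFormat}} part of the comparison in the description can probably be reproduced with the JDK alone. The following is a minimal sketch, not code from Spark or from the issue: the object name {{IsoZonePatternCheck}} and the helpers {{parseMillis}} and {{matchesIsoZone}} are hypothetical names introduced for illustration. Only {{java.text.SimpleDateFormat}} is exercised ({{FastDateFormat}} is left out so that no extra dependency is needed), and the regex quoted in the description is applied directly with {{java.util.regex}} rather than through {{FastDateParser}}.

```scala
import java.text.SimpleDateFormat
import scala.util.Try

// Hypothetical helper object, introduced only for illustration.
object IsoZonePatternCheck {
  val XxxPattern = "yyyy-MM-dd'T'HH:mm:ss.SSSXXX"
  val ZzPattern = "yyyy-MM-dd'T'HH:mm:ss.SSSZZ"

  // Parses `text` with `pattern` using SimpleDateFormat and returns the
  // epoch milliseconds, or None when the input is rejected (ParseException).
  def parseMillis(pattern: String, text: String): Option[Long] =
    Try(new SimpleDateFormat(pattern).parse(text).getTime).toOption

  // The regex quoted in the issue description, as a Scala string literal.
  private val isoZoneRegex = "(Z|(?:[+-]\\d{2}(?::)\\d{2}))".r

  // True when the whole string is "Z" or an offset of the form +hh:mm / -hh:mm.
  def matchesIsoZone(zone: String): Boolean =
    isoZoneRegex.pattern.matcher(zone).matches()

  def main(args: Array[String]): Unit = {
    val inputs = Seq("2017-03-21T00:00:00.000-11:00", "2017-03-21T00:00:00.000Z")
    // Print the parse result for every (pattern, input) combination.
    for (pattern <- Seq(XxxPattern, ZzPattern); input <- inputs)
      println(s"$pattern  $input  -> ${parseMillis(pattern, input)}")
    // Check a few zone suffixes against the quoted regex.
    for (zone <- Seq("Z", "-11:00", "+09:00", "-1100"))
      println(s"$zone matches: ${matchesIsoZone(zone)}")
  }
}
```

Running {{main}} prints, for each pattern and input, either the parsed epoch milliseconds or {{None}} when {{SimpleDateFormat}} rejects the input. Comparing epoch milliseconds avoids depending on the JVM's default time zone, unlike the {{java.util.Date}} values shown in the REPL sessions.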