Nathan Beyer created SPARK-17545:
------------------------------------

             Summary: Spark SQL Catalyst doesn't handle ISO 8601 date without 
colon in offset
                 Key: SPARK-17545
                 URL: https://issues.apache.org/jira/browse/SPARK-17545
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.0.0
            Reporter: Nathan Beyer


When parsing a CSV file with a date/time column whose values use the ISO 8601 
variant that omits the colon in the zone offset (-0500 rather than -05:00), 
casting to Timestamp fails.

Here's a simple example of the CSV content.
{quote}
time
"2015-07-20T15:09:23.736-0500"
"2015-07-20T15:10:51.687-0500"
"2015-11-21T23:15:01.499-0600"
{quote}

Here's the stack trace that results from processing this data.
{quote}
16/09/14 15:22:59 ERROR Utils: Aborting task
java.lang.IllegalArgumentException: 2015-11-21T23:15:01.499-0600
        at org.apache.xerces.jaxp.datatype.XMLGregorianCalendarImpl$Parser.skip(Unknown Source)
        at org.apache.xerces.jaxp.datatype.XMLGregorianCalendarImpl$Parser.parse(Unknown Source)
        at org.apache.xerces.jaxp.datatype.XMLGregorianCalendarImpl.<init>(Unknown Source)
        at org.apache.xerces.jaxp.datatype.DatatypeFactoryImpl.newXMLGregorianCalendar(Unknown Source)
        at javax.xml.bind.DatatypeConverterImpl._parseDateTime(DatatypeConverterImpl.java:422)
        at javax.xml.bind.DatatypeConverterImpl.parseDateTime(DatatypeConverterImpl.java:417)
        at javax.xml.bind.DatatypeConverter.parseDateTime(DatatypeConverter.java:327)
        at org.apache.spark.sql.catalyst.util.DateTimeUtils$.stringToTime(DateTimeUtils.scala:140)
        at org.apache.spark.sql.execution.datasources.csv.CSVTypeCast$.castTo(CSVInferSchema.scala:287)
{quote}
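
The trace ends in DatatypeConverter.parseDateTime, which parses the XML Schema 
dateTime lexical form, and that form requires a colon in the zone offset. Until 
Spark accepts the colon-less variant, one possible workaround is to normalize 
the values before Spark reads them. The following is only a minimal sketch: 
add_offset_colon is a hypothetical helper, not part of Spark, and it assumes 
the values look like the sample rows above.

```python
import re

# Hypothetical helper (not a Spark API): matches a time-of-day followed by a
# trailing basic-format offset such as "-0500" and captures its two halves.
_BASIC_OFFSET = re.compile(r"(T\d{2}:\d{2}:\d{2}(?:\.\d+)?[+-]\d{2})(\d{2})$")


def add_offset_colon(value: str) -> str:
    """Rewrite a trailing +hhmm/-hhmm offset as +hh:mm/-hh:mm.

    Strings that do not end in a colon-less offset are returned unchanged.
    """
    return _BASIC_OFFSET.sub(r"\1:\2", value)


# The three sample values from the CSV content in this report.
rows = [
    "2015-07-20T15:09:23.736-0500",
    "2015-07-20T15:10:51.687-0500",
    "2015-11-21T23:15:01.499-0600",
]
normalized = [add_offset_colon(row) for row in rows]
```

Values that already carry a colon in the offset, or that have no offset at 
all, pass through unchanged, so the helper can be applied to a mixed column.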

Somewhat related: I believe Python's standard library produces this form of 
zone offset, since the %z directive of strftime writes the offset as +HHMM or 
-HHMM. The system I got the data from is written in Python.
https://docs.python.org/2/library/datetime.html#strftime-strptime-behavior
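
To illustrate how such values can arise, here is a short sketch of formatting 
a timestamp with %z. It assumes Python 3.6 or later (the linked page covers 
Python 2), and the fixed -05:00 zone is chosen only to match the first sample 
row. How the source system actually formats its timestamps is not known.

```python
from datetime import datetime, timedelta, timezone

# First sample row from the CSV, with a fixed UTC-05:00 offset (an assumption
# made for illustration).
dt = datetime(2015, 7, 20, 15, 9, 23, 736000,
              tzinfo=timezone(timedelta(hours=-5)))

# strftime's %z writes the offset without a colon; %f would give microseconds,
# so the milliseconds are formatted by hand here.
basic = (dt.strftime("%Y-%m-%dT%H:%M:%S.")
         + "%03d" % (dt.microsecond // 1000)
         + dt.strftime("%z"))

# isoformat() writes the same instant with a colon in the offset.
extended = dt.isoformat(timespec="milliseconds")

print(basic)     # 2015-07-20T15:09:23.736-0500
print(extended)  # 2015-07-20T15:09:23.736-05:00
```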


