[ https://issues.apache.org/jira/browse/SPARK-17039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15419073#comment-15419073 ]
Barry Becker commented on SPARK-17039:
--------------------------------------

There are literal ?'s in the data file. The "nullValue" option indicates that those ?'s should be read as null values. I also added the "dateFormat" option, which describes how the dates in the file should be read.

Let me try to provide more information so you can reproduce. Here is the schema that I am specifying (dfSchema above):
{code}
StructType(StructField(string normal,StringType,true),
StructField(Years,TimestampType,true),
StructField(Months,TimestampType,true),
StructField(WeekDays,TimestampType,true),
StructField(Days,TimestampType,true),
StructField(DaysWithNull,TimestampType,true),
StructField(Hours,TimestampType,true),
StructField(Minutes,TimestampType,true),
StructField(normal dates,TimestampType,true),
StructField(Wide Range Dates,TimestampType,true),
StructField(Narrow,TimestampType,true),
StructField(Far Future,TimestampType,true),
StructField(Mostly Null,TimestampType,true),
StructField(All Same Date,TimestampType,true),
StructField(Past/Future,TimestampType,true),
StructField(All nulls,TimestampType,true),
StructField(Seconds,TimestampType,true))
{code}
and here are the contents of the CSV data file (note that there are lots of nulls). This worked in Spark 1.6.2 with the Databricks spark-csv library as a dependency.
{code}
foo 2015-03-09T00:00:00 2015-03-09T00:00:00 2015-03-09T00:00:00 2015-03-09T00:00:00 2015-03-09T00:00:00 2015-03-09T00:00:00 2015-03-09T00:01:00 2007-11-09T00:00:00 1967-11-09T00:00:00 2015-03-09T12:00:00 2700-01-01T00:00:00 2015-03-09T00:00:00 2015-03-09T00:00:00 1983-03-09T00:00:00 ? 2015-03-09T12:01:00 bar 2016-03-09T00:00:00 2015-04-09T00:00:00 2015-03-10T00:00:00 2015-03-10T00:00:00 ? 2015-03-09T01:00:00 2015-03-09T00:03:00 2007-10-02T00:00:00 1987-10-02T00:00:00 2015-03-09T12:03:00 3701-01-01T00:00:00 2015-04-09T00:00:00 2015-03-09T00:00:00 1865-04-09T00:00:00 ?
2015-03-09T12:01:01 baz 2017-03-09T00:00:00 2015-05-09T00:00:00 2015-03-11T00:00:00 2015-03-11T00:00:00 2015-03-11T00:00:00 2015-03-09T02:00:00 2015-03-09T00:05:00 1999-04-04T03:00:00 1999-02-03T00:00:00 2015-03-09T12:08:00 4702-01-01T00:00:00 ? 2015-03-09T00:00:00 1777-05-09T00:00:00 ? 2015-03-09T12:01:03 but 2018-03-09T00:00:00 2015-06-09T00:00:00 2015-03-12T00:00:00 2015-03-12T00:00:00 2015-03-12T00:00:00 2015-03-09T03:00:00 2015-03-09T00:08:00 2025-10-10T00:00:00 2025-10-10T00:00:00 2015-03-09T12:10:00 4103-01-01T00:00:00 2015-06-09T00:00:00 2015-03-09T00:00:00 2089-06-09T00:00:00 ? 2015-03-09T12:01:05 fooo 2019-03-09T00:00:00 2015-07-09T00:00:00 2015-03-13T00:00:00 2015-03-13T00:00:00 2015-03-13T00:00:00 2015-03-09T04:00:00 2015-03-09T00:09:00 2004-02-23T00:00:00 2004-02-23T00:00:00 2015-03-09T12:15:00 4204-01-01T00:00:00 ? 2015-03-09T00:00:00 2125-07-09T00:00:00 ? 2015-03-09T12:01:07 bar 2020-03-09T00:00:00 2015-08-09T00:00:00 2015-03-16T00:00:00 2015-03-14T00:00:00 2015-03-14T00:00:00 2015-03-09T05:00:00 2015-03-09T00:12:00 2019-03-04T00:00:00 3019-03-04T00:00:00 2015-03-09T12:20:00 4305-01-01T00:00:00 2015-08-09T00:00:00 2015-03-09T00:00:00 2215-08-09T00:00:00 ? 2015-03-09T12:01:09 baz 2021-03-09T00:00:00 2015-09-09T00:00:00 2015-03-17T00:00:00 2015-03-15T00:00:00 2015-03-15T00:00:00 2015-03-09T06:00:00 2015-03-09T00:20:00 1999-04-04T02:34:00 ? 2015-03-09T12:25:00 4406-01-01T00:00:00 2015-09-09T00:00:00 2015-03-09T00:00:00 1754-09-09T00:00:00 ? 2015-03-09T12:01:11 but 2022-03-09T00:00:00 2015-10-09T00:00:00 2015-03-18T00:00:00 2015-03-16T00:00:00 ? 2015-03-09T07:00:00 2015-03-09T00:30:00 1999-03-01T00:00:00 1909-03-01T00:00:00 2015-03-09T12:30:00 4507-01-01T00:00:00 ? 2015-03-09T00:00:00 1958-10-09T00:00:00 ? 2015-03-09T12:01:00 bar 2023-03-09T00:00:00 2015-11-09T00:00:00 2015-03-19T00:00:00 2015-03-17T00:00:00 2015-03-17T00:00:00 2015-03-09T08:00:00 2015-03-09T00:35:00 2001-02-12T00:00:00 ? 
2015-03-09T12:35:00 4608-01-01T00:00:00 2015-11-09T00:00:00 2015-03-09T00:00:00 3000-11-09T00:00:00 ? 2015-03-09T12:01:00 here is a really really really long string value 2024-03-09T00:00:00 2015-12-09T00:00:00 2015-03-20T00:00:00 2015-03-18T00:00:00 2015-03-18T00:00:00 2015-03-09T09:00:00 2015-03-09T00:40:00 1999-04-04T17:17:00 1999-01-15T00:00:00 2015-03-09T12:40:00 4709-01-01T00:00:00 2015-12-09T00:00:00 2015-03-09T00:00:00 4015-12-09T00:00:00 ? 2015-03-09T12:01:00 foo 2025-03-09T00:00:00 2016-01-09T00:00:00 2015-03-23T00:00:00 2015-03-19T00:00:00 2015-03-19T00:00:00 2015-03-09T10:00:00 2015-03-09T00:41:00 1999-02-28T00:00:00 1999-02-28T00:00:00 2015-03-09T12:45:00 4710-01-01T00:00:00 2016-01-09T00:00:00 2015-03-09T00:00:00 2000-01-09T00:00:00 ? 2015-03-09T12:01:00 bar 2026-03-09T00:00:00 2016-02-09T00:00:00 2015-03-24T00:00:00 2015-03-20T00:00:00 2015-03-20T00:00:00 2015-03-09T11:00:00 2015-03-09T00:42:00 1999-04-04T14:14:00 2999-01-17T00:00:00 2015-03-09T12:55:00 4811-01-01T00:00:00 ? 2015-03-09T00:00:00 1999-02-09T00:00:00 ? 2015-03-09T12:01:00 bar 2027-03-09T00:00:00 2016-03-09T00:00:00 2015-03-25T00:00:00 2015-03-21T00:00:00 2015-03-21T00:00:00 2015-03-09T12:00:00 2015-03-09T00:43:00 2015-03-07T10:10:00 2999-06-04T00:00:00 2015-03-09T12:59:00 4912-01-01T00:00:00 2016-03-09T00:00:00 2015-03-09T00:00:00 1856-03-09T00:00:00 ? 2015-03-09T12:01:00 foo 2028-03-09T00:00:00 2016-04-09T00:00:00 2015-03-26T00:00:00 2015-03-22T00:00:00 2015-03-22T00:00:00 2015-03-09T13:00:00 2015-03-09T00:44:00 ? ? ? ? ? ? ? ? 2015-03-09T12:01:00 bar 2029-03-09T00:00:00 2016-05-09T00:00:00 2015-03-27T00:00:00 2015-03-23T00:00:00 2015-03-23T00:00:00 2015-03-09T14:00:00 2015-03-09T00:46:00 2007-11-08T00:00:00 1907-11-09T00:00:00 2015-03-09T12:00:00 3700-01-01T00:00:00 ? 2015-03-09T00:00:00 ? ? 
2015-03-09T12:01:00 baz 2030-03-09T00:00:00 2016-06-09T00:00:00 2015-03-30T00:00:00 2015-03-24T00:00:00 2015-03-24T00:00:00 2015-03-09T15:00:00 2015-03-09T00:47:00 2007-10-03T00:00:00 1919-10-02T00:00:00 2015-03-09T12:03:00 4701-01-01T00:00:00 ? 2015-03-09T00:00:00 ? ? 2015-03-09T12:01:00 foo 2031-03-09T00:00:00 2016-07-09T00:00:00 2015-03-31T00:00:00 2015-03-25T00:00:00 ? 2015-03-09T16:00:00 2015-03-09T00:48:00 1999-04-06T03:00:00 2000-02-03T00:00:00 2015-03-09T12:08:00 4602-01-01T00:00:00 ? 2015-03-09T00:00:00 ? ? 2015-03-09T12:01:00 foo 2032-03-09T00:00:00 2016-08-09T00:00:00 2015-04-01T00:00:00 2015-03-26T00:00:00 2015-03-26T00:00:00 2015-03-09T17:00:00 2015-03-09T00:49:00 2025-10-12T00:00:00 2025-10-10T00:00:00 2015-03-09T12:10:00 4213-01-01T00:00:00 ? 2015-03-09T00:00:00 ? ? 2015-03-09T12:01:00 but 2033-03-09T00:00:00 2016-09-09T00:00:00 2015-04-02T00:00:00 2015-03-27T00:00:00 2015-03-27T00:00:00 2015-03-09T18:00:00 2015-03-09T00:51:00 2004-02-20T00:00:00 2014-02-23T00:00:00 2015-03-09T12:15:00 4304-01-01T00:00:00 ? 2015-03-09T00:00:00 ? ? 2015-03-09T12:01:00 foo 2034-03-09T00:00:00 2016-10-09T00:00:00 2015-04-03T00:00:00 2015-03-28T00:00:00 2015-03-28T00:00:00 2015-03-09T19:00:00 2015-03-09T00:52:00 2019-03-05T00:00:00 3019-03-04T00:00:00 2015-03-09T12:20:00 4405-01-01T00:00:00 ? 2015-03-09T00:00:00 ? ? 2015-03-09T12:01:00 foo 2035-03-09T00:00:00 2016-11-09T00:00:00 2015-04-06T00:00:00 2015-03-29T00:00:00 2015-03-29T00:00:00 2015-03-09T20:00:00 2015-03-09T00:54:00 1999-04-05T02:39:00 ? 2015-03-09T12:25:00 4506-01-01T00:00:00 ? 2015-03-09T00:00:00 ? ? 2015-03-09T12:01:00 foo 2036-03-09T00:00:00 2016-12-09T00:00:00 2015-04-07T00:00:00 2015-03-30T00:00:00 2015-03-30T00:00:00 2015-03-09T21:00:00 2015-03-09T00:55:00 1999-03-03T00:00:00 1911-03-02T00:00:00 2015-03-09T12:30:00 4607-01-01T00:00:00 ? 2015-03-09T00:00:00 ? ? 
2015-03-09T12:01:00 foo 2037-03-09T00:00:00 2017-01-09T00:00:00 2015-04-08T00:00:00 2015-03-31T00:00:00 2015-03-31T00:00:00 2015-03-09T22:00:00 2015-03-09T00:57:00 2001-02-14T00:00:00 ? 2015-03-09T12:35:00 4618-01-01T00:00:00 ? 2015-03-09T00:00:00 ? ? 2015-03-09T12:01:00 foo 2038-03-09T00:00:00 2017-02-09T00:00:00 2015-04-09T00:00:00 2015-04-01T00:00:00 2015-04-01T00:00:00 2015-03-09T23:00:00 2015-03-09T00:59:00 1999-04-07T16:16:00 1999-01-14T00:00:00 2015-03-09T12:40:00 4659-01-01T00:00:00 ? 2015-03-09T00:00:00 ? ? 2015-03-09T12:01:28 foo 2039-03-09T00:00:00 2017-03-09T00:00:00 2015-04-10T00:00:00 2015-04-02T00:00:00 2015-04-02T00:00:00 2015-03-09T00:00:00 ? 1999-02-27T00:00:00 1999-02-25T00:00:00 2015-03-09T12:44:00 4612-01-01T00:00:00 ? 2015-03-09T00:00:00 ? ? ? foo 2040-03-09T00:00:00 2017-04-09T00:00:00 2015-04-13T00:00:00 2015-04-03T00:00:00 2015-04-03T00:00:00 2015-03-10T01:00:00 2015-03-09T00:54:00 1999-04-03T14:14:00 2999-01-12T00:00:00 2015-03-09T12:54:00 4821-01-01T00:00:00 ? 2015-03-09T00:00:00 ? ? ? foo 2041-03-09T00:00:00 2017-05-09T00:00:00 2015-04-14T00:00:00 2015-04-04T00:00:00 2015-04-04T00:00:00 2015-03-10T02:00:00 ? 2015-03-06T10:10:00 2999-06-03T00:00:00 2015-03-09T12:58:00 5912-01-01T00:00:00 ? 2015-03-09T00:00:00 ? ? ? bar 2042-03-09T00:00:00 2017-06-09T00:00:00 2015-04-15T00:00:00 2015-04-05T00:00:00 2015-04-05T00:00:00 2015-03-11T03:00:00 2015-03-09T00:54:00 ? ? ? ? ? ? ? ? ? bar 2043-03-09T00:00:00 2017-07-09T00:00:00 2015-04-16T00:00:00 2015-04-06T00:00:00 ? 2015-03-10T04:00:00 ? ? ? ? ? ? ? ? ? ? bar 2044-03-09T00:00:00 2017-08-09T00:00:00 2015-04-17T00:00:00 2015-04-07T00:00:00 2015-04-07T00:00:00 2015-03-11T05:00:00 ? ? ? ? ? ? ? ? ? ? bar 2045-03-09T00:00:00 2017-09-09T00:00:00 2015-04-20T00:00:00 2015-04-08T00:00:00 2015-04-08T00:00:00 2015-03-10T06:00:00 ? ? ? ? ? ? ? ? ? ? bar 2046-03-09T00:00:00 2017-10-09T00:00:00 2015-04-21T00:00:00 2015-04-09T00:00:00 2015-04-09T00:00:00 2015-03-11T07:00:00 ? ? ? ? ? ? ? ? ? ? 
bar 2047-03-09T00:00:00 2017-11-09T00:00:00 2015-04-22T00:00:00 2015-04-10T00:00:00 2015-04-10T00:00:00 2015-03-10T08:00:00 ? ? ? ? ? ? ? ? ? ? bar 2048-03-09T00:00:00 2017-12-09T00:00:00 2015-04-23T00:00:00 2015-04-11T00:00:00 2015-04-11T00:00:00 2015-03-11T09:00:00 ? ? ? ? ? ? ? ? ? ? bar 2049-03-09T00:00:00 2018-01-09T00:00:00 2015-04-24T00:00:00 2015-04-12T00:00:00 2015-04-12T00:00:00 2015-03-10T10:00:00 ? ? ? ? ? ? ? ? ? ? bar 2050-03-09T00:00:00 2018-02-09T00:00:00 2015-04-27T00:00:00 2015-04-13T00:00:00 2015-04-13T00:00:00 2015-03-11T11:00:00 ? ? ? ? ? ? ? ? ? ? bar 2051-03-09T00:00:00 2018-03-09T00:00:00 2015-04-28T00:00:00 2015-04-14T00:00:00 2015-04-14T00:00:00 2015-03-10T12:00:00 ? ? ? ? ? ? ? ? ? ? bar 2052-03-09T00:00:00 2018-04-09T00:00:00 2015-04-29T00:00:00 2015-04-15T00:00:00 2015-04-15T00:00:00 2015-03-10T13:00:00 ? ? ? ? ? ? ? ? ? ? bar 2053-03-09T00:00:00 2018-05-09T00:00:00 2015-04-30T00:00:00 2015-04-16T00:00:00 2015-04-16T00:00:00 2015-03-10T14:00:00 ? ? ? ? ? ? ? ? ? ? ? 2054-03-09T00:00:00 2018-06-09T00:00:00 2015-05-01T00:00:00 2015-04-17T00:00:00 2015-04-17T00:00:00 ? ? ? ? ? ? ? ? ? ? ? ? 2055-03-09T00:00:00 2018-07-09T00:00:00 2015-05-04T00:00:00 2015-04-18T00:00:00 2015-04-18T00:00:00 ? ? ? ? ? ? ? ? ? ? ? ? 2056-03-09T00:00:00 2018-08-09T00:00:00 2015-05-05T00:00:00 2015-04-19T00:00:00 ? ? ? ? ? ? ? ? ? ? ? ? ? 2057-03-09T00:00:00 2018-09-09T00:00:00 2015-05-06T00:00:00 2015-04-20T00:00:00 2015-04-20T00:00:00 ? ? ? ? ? ? ? ? ? ? ? ? 2058-03-09T00:00:00 2018-10-09T00:00:00 2015-05-07T00:00:00 2015-04-21T00:00:00 2015-04-21T00:00:00 ? ? ? ? ? ? ? ? ? ? ? ? 2059-03-09T00:00:00 2018-11-09T00:00:00 2015-05-08T00:00:00 2015-04-22T00:00:00 2015-04-22T00:00:00 ? ? ? ? ? ? ? ? ? ? ? ? 2060-03-09T00:00:00 2018-12-09T00:00:00 2015-05-11T00:00:00 2015-04-23T00:00:00 2015-04-23T00:00:00 ? ? ? ? ? ? ? ? ? ? ? ? 2061-03-09T00:00:00 2019-01-09T00:00:00 2015-05-12T00:00:00 2015-04-24T00:00:00 2015-04-24T00:00:00 ? ? ? ? ? ? ? ? ? ? ? ? 
2062-03-09T00:00:00 2019-02-09T00:00:00 2015-05-13T00:00:00 2015-04-25T00:00:00 2015-04-25T00:00:00 ? ? ? ? ? ? ? ? ? ? ? ? 2063-03-09T00:00:00 2019-03-09T00:00:00 2015-05-14T00:00:00 2015-04-26T00:00:00 2015-04-26T00:00:00 ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? 
{code}

> cannot read null dates from csv file
> ------------------------------------
>
> Key: SPARK-17039
> URL: https://issues.apache.org/jira/browse/SPARK-17039
> Project: Spark
> Issue Type: Bug
> Components: Input/Output
> Affects Versions: 2.0.0
> Reporter: Barry Becker
>
> I see this exact same bug as reported in this [stack overflow post|http://stackoverflow.com/questions/38265640/spark-2-0-pre-csv-parsing-error-if-missing-values-in-date-column] using Spark 2.0.0 (released version).
> In scala, I read a csv using
> sqlContext.read
> .format("csv")
> .option("header", "false")
> .option("inferSchema", "false")
> .option("nullValue", "?")
> .option("dateFormat", "yyyy-MM-dd'T'HH:mm:ss")
> .schema(dfSchema)
> .csv(dataFile)
> The data contains some null dates (represented with ?).
> The error I get is:
> {code}
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 8.0 failed 1 times, most recent failure: Lost task 0.0 in stage 8.0 (TID 10, localhost): java.text.ParseException: Unparseable date: "?"
> {code}
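
To make the expected behavior concrete, here is a minimal plain-Scala sketch (no Spark dependency) of a null-aware timestamp conversion. It is an illustration only, not Spark's actual CSV reader code: the object `NullAwareTimestamps` and the helper `parseTimestamp` are hypothetical names. The idea is that a cell equal to the "nullValue" marker should become a null before the "dateFormat" pattern is ever applied. Passing "?" straight to `SimpleDateFormat.parse` raises the same `java.text.ParseException: Unparseable date: "?"` seen in the report.

```scala
import java.sql.Timestamp
import java.text.SimpleDateFormat

object NullAwareTimestamps {
  // Hypothetical helper, not part of Spark: returns None for a missing cell
  // or for the null marker instead of handing it to the date parser.
  def parseTimestamp(raw: String, nullValue: String, pattern: String): Option[Timestamp] =
    if (raw == null || raw == nullValue) None
    else {
      // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
      val fmt = new SimpleDateFormat(pattern)
      fmt.setLenient(false)
      // Throws java.text.ParseException if raw does not match the pattern.
      Some(new Timestamp(fmt.parse(raw).getTime))
    }

  def main(args: Array[String]): Unit = {
    val pattern = "yyyy-MM-dd'T'HH:mm:ss" // same pattern as the "dateFormat" option above
    val cells = Seq("2015-03-09T12:01:00", "?", "1983-03-09T00:00:00")
    cells.foreach(c => println(s"$c -> ${parseTimestamp(c, "?", pattern)}"))
  }
}
```

With this ordering, the "?" cells in the data above would come back as `None` (a null in a nullable TimestampType column) rather than failing the task.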