Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23202#discussion_r238141702

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVInferSchema.scala ---
    @@ -98,6 +100,7 @@ class CSVInferSchema(options: CSVOptions) extends Serializable {
             compatibleType(typeSoFar, tryParseDecimal(field)).getOrElse(StringType)
           case DoubleType => tryParseDouble(field)
           case TimestampType => tryParseTimestamp(field)
    +      case DateType => tryParseDate(field)
    --- End diff --

    I mean, IIRC, if the pattern is, for instance, `yyyy-MM-dd`, then both 2010-10-10 and 2018-12-02T21:04:00.123567 are parsed as dates, because the current parsing library only checks whether the start of the string matches the pattern and ignores the rest. So if we try date first, it will work for the default pattern, but if we use some unusual patterns, it wouldn't work anymore. I was thinking we can fix it if we use `DateTimeFormatter`.
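The prefix-matching behaviour described above can be reproduced outside Spark with the two JDK parsers in question. This is a minimal illustrative sketch (the class name `DateParseDemo` is mine, not from the PR): the legacy `SimpleDateFormat.parse(String)` accepts a full timestamp string against a date-only pattern because it stops once the pattern is consumed, while `DateTimeFormatter` (via `LocalDate.parse`) requires the entire input to match and rejects it.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class DateParseDemo {
    public static void main(String[] args) throws ParseException {
        String input = "2018-12-02T21:04:00.123567";

        // SimpleDateFormat.parse(String) succeeds as long as a prefix of the
        // input matches the pattern; the trailing "T21:04:00.123567" is ignored,
        // so a timestamp string is wrongly accepted as a plain date.
        SimpleDateFormat legacy = new SimpleDateFormat("yyyy-MM-dd");
        System.out.println("legacy parse succeeded: " + legacy.parse(input));

        // DateTimeFormatter parses the whole text; unparsed trailing characters
        // raise DateTimeParseException, so the timestamp string is rejected.
        DateTimeFormatter strict = DateTimeFormatter.ofPattern("yyyy-MM-dd");
        try {
            LocalDate.parse(input, strict);
            System.out.println("strict parse succeeded");
        } catch (DateTimeParseException e) {
            System.out.println("strict parse rejected: " + e.getMessage());
        }
    }
}
```

With this difference, trying `DateType` before `TimestampType` during inference is only safe if the parser consumes the entire field, which is what switching to `DateTimeFormatter` would give.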