Jonathancui123 commented on code in PR #36871: URL: https://github.com/apache/spark/pull/36871#discussion_r906232351
########## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala: ########## @@ -197,34 +199,50 @@ class UnivocityParser( Decimal(decimalParser(datum), dt.precision, dt.scale) } - case _: TimestampType => (d: String) => + case _: DateType => (d: String) => nullSafeDatum(d, name, nullable, options) { datum => try { - timestampFormatter.parse(datum) + dateFormatter.parse(datum) } catch { case NonFatal(e) => // If fails to parse, then tries the way used in 2.0 and 1.x for backwards // compatibility. val str = DateTimeUtils.cleanLegacyTimestampStr(UTF8String.fromString(datum)) - DateTimeUtils.stringToTimestamp(str, options.zoneId).getOrElse(throw e) + DateTimeUtils.stringToDate(str).getOrElse(throw e) } } - case _: TimestampNTZType => (d: String) => - nullSafeDatum(d, name, nullable, options) { datum => - timestampNTZFormatter.parseWithoutTimeZone(datum, false) - } - - case _: DateType => (d: String) => + case _: TimestampType => (d: String) => nullSafeDatum(d, name, nullable, options) { datum => try { - dateFormatter.parse(datum) + timestampFormatter.parse(datum) } catch { case NonFatal(e) => // If fails to parse, then tries the way used in 2.0 and 1.x for backwards // compatibility. val str = DateTimeUtils.cleanLegacyTimestampStr(UTF8String.fromString(datum)) - DateTimeUtils.stringToDate(str).getOrElse(throw e) + DateTimeUtils.stringToTimestamp(str, options.zoneId).getOrElse { + // There may be date type entries in timestamp column due to schema inference + if (options.inferDate) { + daysToMicros(dateFormatter.parse(datum), options.zoneId) Review Comment: @bersprockets Thanks for the suggestion! Do you know what is the advantage of allowing Legacy Formatter? i.e. what is a date format that the legacy formatter can handle but the current formatter cannot? I'm wondering if there will be a sufficient population of users who want to infer date in schema and also use legacy date formats cc: @Yaohua628 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org