Github user javierluraschi commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22913#discussion_r230607581

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowUtils.scala ---
    @@ -71,6 +71,7 @@ object ArrowUtils {
         case d: ArrowType.Decimal => DecimalType(d.getPrecision, d.getScale)
         case date: ArrowType.Date if date.getUnit == DateUnit.DAY => DateType
         case ts: ArrowType.Timestamp if ts.getUnit == TimeUnit.MICROSECOND => TimestampType
    +    case date: ArrowType.Date if date.getUnit == DateUnit.MILLISECOND => TimestampType
    --- End diff --

    Good catch, thanks. Yes, this should map to Spark's `DateType`, not `TimestampType`. For background, an Arrow date can have a unit of either `DateUnit.DAY` or `DateUnit.MILLISECOND` (see [arrow/vector/types/DateUnit.java#L21-L22](https://github.com/apache/arrow/blob/73d379f4631cd3013371f60876a52615171e6c3b/java/vector/src/main/java/org/apache/arrow/vector/types/DateUnit.java#L21-L22)). Currently, passing a date with millisecond granularity simply fails, so this change does not affect any existing type conversion, and it is safe to map all Arrow dates to Spark dates now that both units are handled.
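
    For illustration, here is a minimal sketch of how the corrected cases could read, assuming the `fromArrowType` match shown in the diff (the catch-all case is hypothetical, added only to make the snippet self-contained; the final code in the PR may differ):

    ```scala
    import org.apache.arrow.vector.types.{DateUnit, TimeUnit}
    import org.apache.arrow.vector.types.pojo.ArrowType
    import org.apache.spark.sql.types._

    def fromArrowType(dt: ArrowType): DataType = dt match {
      case d: ArrowType.Decimal => DecimalType(d.getPrecision, d.getScale)
      // An Arrow date carries either DAY or MILLISECOND granularity; both still
      // represent calendar dates, so both convert to Spark's DateType.
      case date: ArrowType.Date
          if date.getUnit == DateUnit.DAY || date.getUnit == DateUnit.MILLISECOND =>
        DateType
      case ts: ArrowType.Timestamp if ts.getUnit == TimeUnit.MICROSECOND => TimestampType
      // Hypothetical fallback so the sketch compiles standalone.
      case _ => throw new UnsupportedOperationException(s"Unsupported Arrow type: $dt")
    }
    ```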