Github user javierluraschi commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22913#discussion_r230607581
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowUtils.scala ---
    @@ -71,6 +71,7 @@ object ArrowUtils {
         case d: ArrowType.Decimal => DecimalType(d.getPrecision, d.getScale)
         case date: ArrowType.Date if date.getUnit == DateUnit.DAY => DateType
         case ts: ArrowType.Timestamp if ts.getUnit == TimeUnit.MICROSECOND => TimestampType
    +    case date: ArrowType.Date if date.getUnit == DateUnit.MILLISECOND => TimestampType
    --- End diff --
    
    Good catch, thanks. Yes, an Arrow `Date` with `DateUnit.MILLISECOND` should be mapped to `DateType`, not `TimestampType`.
    
    To give more background, Arrow dates can have a unit of `DateUnit.DAY` or `DateUnit.MILLISECOND` (see [arrow/vector/types/DateUnit.java#L21-L22](https://github.com/apache/arrow/blob/73d379f4631cd3013371f60876a52615171e6c3b/java/vector/src/main/java/org/apache/arrow/vector/types/DateUnit.java#L21-L22)). Currently, passing a date with millisecond precision simply fails, so this change does not affect any other type conversion. Mapping all Arrow dates to Spark dates is therefore safe, since both units are now properly handled.
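    
    For illustration, here is a minimal sketch of what the corrected cases might look like. The `fromArrowType` name and the neighboring cases follow the diff above; the catch-all `UnsupportedOperationException` clause is an assumption added to make the sketch self-contained:
    
    ```scala
    import org.apache.arrow.vector.types.{DateUnit, TimeUnit}
    import org.apache.arrow.vector.types.pojo.ArrowType
    import org.apache.spark.sql.types._
    
    // Sketch of the corrected mapping: every Arrow Date, whether its unit is
    // DateUnit.DAY or DateUnit.MILLISECOND, maps to Spark's DateType.
    def fromArrowType(dt: ArrowType): DataType = dt match {
      case d: ArrowType.Decimal => DecimalType(d.getPrecision, d.getScale)
      // Both date units represent calendar dates, so both map to DateType.
      case _: ArrowType.Date => DateType
      case ts: ArrowType.Timestamp if ts.getUnit == TimeUnit.MICROSECOND => TimestampType
      // Assumed fallback for types this sketch does not cover.
      case _ => throw new UnsupportedOperationException(s"Unsupported Arrow type: $dt")
    }
    ```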

