[ https://issues.apache.org/jira/browse/SPARK-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiao Li resolved SPARK-21375.
-----------------------------
       Resolution: Fixed
         Assignee: Bryan Cutler
    Fix Version/s: 2.3.0

> Add date and timestamp support to ArrowConverters for toPandas() collection
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-21375
>                 URL: https://issues.apache.org/jira/browse/SPARK-21375
>             Project: Spark
>          Issue Type: Sub-task
>          Components: PySpark, SQL
>    Affects Versions: 2.3.0
>            Reporter: Bryan Cutler
>            Assignee: Bryan Cutler
>             Fix For: 2.3.0
>
>
> Date and timestamp types are not yet supported in DataFrame.toPandas() using
> ArrowConverters. These are common types for data analysis used in both Spark
> and Pandas and should be supported.
> There is a discrepancy in how PySpark and Arrow internally store timestamps
> that have no timezone specified. PySpark takes a UTC timestamp and adjusts it
> to local time, whereas Arrow keeps the value in UTC. Hopefully there is a
> clean way to resolve this.
> Spark internal storage spec:
> * *DateType* stored as days
> * *Timestamp* stored as microseconds
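
The storage spec and the timezone discrepancy described in the issue can be illustrated with a minimal standard-library sketch. It is not the ArrowConverters implementation. It assumes both counts are relative to the Unix epoch (1970-01-01, UTC for timestamps), which the issue text does not state. The helper names are hypothetical, and a fixed UTC offset stands in for whatever local time zone the Python process would use.

```python
from datetime import date, datetime, timedelta, timezone

# Assumption: both internal counts are measured from the Unix epoch.
EPOCH_DATE = date(1970, 1, 1)
EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)


def days_to_date(days: int) -> date:
    """Interpret a DateType-style value: whole days since the epoch."""
    return EPOCH_DATE + timedelta(days=days)


def micros_to_utc(micros: int) -> datetime:
    """Interpret a Timestamp-style value: microseconds since the epoch, in UTC."""
    return EPOCH_UTC + timedelta(microseconds=micros)


def utc_to_local_naive(ts_utc: datetime, local_tz: timezone) -> datetime:
    """Shift a UTC instant to local wall-clock time and drop the tzinfo.

    This mimics the "UTC timestamp adjusted to local time" behaviour the
    issue attributes to PySpark; local_tz is a hypothetical stand-in for
    the local time zone.
    """
    return ts_utc.astimezone(local_tz).replace(tzinfo=None)
```

With a local offset of UTC-7, the same stored microsecond value reads as 02:40 when left in UTC and as 19:40 on the previous day after the local adjustment. A converter that skips the adjustment would produce values shifted by the local offset, which is the mismatch the issue describes.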