[ 
https://issues.apache.org/jira/browse/HIVE-19723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498593#comment-16498593
 ] 

Eric Wohlstadter commented on HIVE-19723:
-----------------------------------------

[~teddy.choi]

The Arrow serializer appears to truncate down to MILLISECONDS, but the Jira 
description calls for MICROSECONDS.

This is motivated by {{org.apache.spark.sql.execution.arrow.ArrowUtils.scala}}
{code:java}
case ts: ArrowType.Timestamp if ts.getUnit == TimeUnit.MICROSECOND => 
TimestampType{code}

My understanding is that since the primary use-case for {{ArrowUtils}} is 
Python integration, some of the conversions are currently somewhat particular 
for Python. Perhaps Python/Pandas only supports MICROSECOND timestamps. 

FYI: [~hyukjin.kwon] [~bryanc]



> Arrow serde: "Unsupported data type: Timestamp(NANOSECOND, null)"
> -----------------------------------------------------------------
>
>                 Key: HIVE-19723
>                 URL: https://issues.apache.org/jira/browse/HIVE-19723
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Teddy Choi
>            Assignee: Teddy Choi
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.1.0, 4.0.0
>
>         Attachments: HIVE-19723.1.patch, HIVE-19732.2.patch
>
>
> Spark's Arrow support only provides Timestamp at MICROSECOND granularity. 
> Spark 2.3.0 won't accept NANOSECOND. Switch it back to MICROSECOND.
> The unit test org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow will just need 
> to change the assertion to test microsecond. And we'll need to add this to 
> documentation on supported datatypes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to