[ 
https://issues.apache.org/jira/browse/ARROW-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705210#comment-16705210
 ] 

David Lee commented on ARROW-3907:
----------------------------------

Passing in safe=False works, but it is pretty hacky. Another problem also pops
up with ParquetWriter.write_table(); I'll open a separate ticket for that one.
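
For reference, the workaround looks like this. A minimal sketch; the dataframe
contents here are made up, but the schema matches the one from the issue:

{code:python}
import pandas as pd
import pyarrow as pa

# Hypothetical input: pandas stores timestamps at nanosecond resolution.
df = pd.DataFrame({
    'Id': ['a1'],
    'modified': pd.to_datetime(['2018-07-19 15:06:31.753713']),
    'records': [100],
})

processed_schema = pa.schema([
    pa.field('Id', pa.string()),
    pa.field('modified', pa.timestamp('ms')),
    pa.field('records', pa.int32()),
])

# safe=False skips the lossy-cast check, so the ns -> ms truncation is allowed.
table = pa.Table.from_pandas(df, schema=processed_schema,
                             preserve_index=False, safe=False)
{code}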

The conversion from pandas nanoseconds to whatever timestamp resolution is
declared using pa.timestamp() in the schema object worked fine in 0.11.0.

Having to pass in coerce_timestamps, allow_truncated_timestamps and safe is 
pretty messy.
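
For comparison, the write-side options look like this. A sketch reusing the
table from the snippet above; the output file name is hypothetical:

{code:python}
import pyarrow.parquet as pq

# coerce_timestamps='ms' truncates ns values at write time;
# allow_truncated_timestamps=True suppresses the error that truncation raises.
pq.write_table(table, 'processed.parquet',
               coerce_timestamps='ms',
               allow_truncated_timestamps=True)
{code}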

 

> [Python] from_pandas errors when schemas are used with lower resolution 
> timestamps
> ----------------------------------------------------------------------------------
>
>                 Key: ARROW-3907
>                 URL: https://issues.apache.org/jira/browse/ARROW-3907
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.11.1
>            Reporter: David Lee
>            Priority: Major
>
> When passing a schema object to from_pandas(), a cast error occurs if the 
> schema uses a lower-resolution timestamp. Do we also need to add the 
> "coerce_timestamps" and "allow_truncated_timestamps" parameters found in 
> write_table() to from_pandas()?
> Error:
> pyarrow.lib.ArrowInvalid: ('Casting from timestamp[ns] to timestamp[ms] would 
> lose data: 1532015191753713000', 'Conversion failed for column modified with 
> type datetime64[ns]')
> Code:
>  
> {code:python}
> import pyarrow as pa
>
> processed_schema = pa.schema([
>     pa.field('Id', pa.string()),
>     pa.field('modified', pa.timestamp('ms')),
>     pa.field('records', pa.int32())
> ])
> # Raises ArrowInvalid: casting 'modified' from ns to ms would lose data.
> pa.Table.from_pandas(df, schema=processed_schema, preserve_index=False)
> {code}
>  


