[ https://issues.apache.org/jira/browse/ARROW-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705210#comment-16705210 ]
David Lee commented on ARROW-3907:
----------------------------------

Passing in safe=False works, but it is pretty hacky. Another problem also pops up with ParquetWriter.write_table(); I'll open a separate ticket for that one. The conversion from pandas nanoseconds to whatever timestamp resolution is declared with pa.timestamp() in the schema object worked fine in 0.11.0. Having to pass in coerce_timestamps, allow_truncated_timestamps, and safe is pretty messy.

> [Python] from_pandas errors when schemas are used with lower resolution timestamps
> ----------------------------------------------------------------------------------
>
>                 Key: ARROW-3907
>                 URL: https://issues.apache.org/jira/browse/ARROW-3907
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.11.1
>            Reporter: David Lee
>            Priority: Major
>
> When passing a schema object to from_pandas, a resolution error occurs if the schema uses a lower resolution timestamp. Do we need to also add the "coerce_timestamps" and "allow_truncated_timestamps" parameters found in write_table() to from_pandas()?
>
> Error:
> pyarrow.lib.ArrowInvalid: ('Casting from timestamp[ns] to timestamp[ms] would lose data: 1532015191753713000', 'Conversion failed for column modified with type datetime64[ns]')
>
> Code:
> {code:python}
> processed_schema = pa.schema([
>     pa.field('Id', pa.string()),
>     pa.field('modified', pa.timestamp('ms')),
>     pa.field('records', pa.int32())
> ])
> pa.Table.from_pandas(df, schema=processed_schema, preserve_index=False)
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
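
For reference, the safe=False workaround the comment describes can be sketched as follows. This is a minimal sketch, not the reporter's actual code: the sample DataFrame and its values are invented for illustration, and it assumes a pyarrow version where Table.from_pandas accepts the safe keyword.

{code:python}
import pandas as pd
import pyarrow as pa

# Hypothetical sample data; pandas datetime columns are nanosecond resolution.
df = pd.DataFrame({
    'Id': ['a', 'b'],
    'modified': pd.to_datetime(['2018-07-19 15:06:31.753713',
                                '2018-07-20 10:00:00.000001']),
    'records': [1, 2],
})

# Target schema with a lower (millisecond) timestamp resolution, as in the issue.
processed_schema = pa.schema([
    pa.field('Id', pa.string()),
    pa.field('modified', pa.timestamp('ms')),
    pa.field('records', pa.int32()),
])

# safe=False disables the lossy-cast check, so the ns -> ms truncation
# succeeds instead of raising ArrowInvalid.
table = pa.Table.from_pandas(df, schema=processed_schema,
                             preserve_index=False, safe=False)
{code}

On the Parquet side, pyarrow.parquet.write_table exposes the coerce_timestamps and allow_truncated_timestamps parameters mentioned above, which is the combination the comment calls messy.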