[
https://issues.apache.org/jira/browse/ARROW-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675139#comment-16675139
]
Jakub Okoński commented on ARROW-2298:
--------------------------------------
I've looked into this a bit more and realized that Arrow already supports
what I want to do with pandas nulls and overflows; I just need to set
`safe=False` when doing the conversion.
Still, I would like to add another parameter to `CastOptions`, called
`allow_float_mantissa_overflow`. When set to false, it would check for
out-of-range values and return an error.
This, however, won't be very useful unless we expand the `safe` parameter in
the Python API so that users could specify, for example:
`pyarrow.Table.from_pandas(df, schema=schema, safe=True,
safe_allow_float_mantissa_overflow=False)` - this invocation would work with
truncated floats, int overflow, and truncated timestamps, but it would reject
floating-point values outside of the allowed range (+/- 1 << 53 for float64,
+/- 1 << 24 for float32).
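
To illustrate where those two limits come from, here is a minimal sketch using only the Python standard library. The helper names (`in_mantissa_range`, `roundtrips_float64`, `roundtrips_float32`) are hypothetical and not part of pyarrow; they only show that integers beyond the limits stop round-tripping through the float type.

```python
import struct

# Limits quoted above: float64 has a 53-bit significand, float32 a 24-bit one.
FLOAT64_LIMIT = 1 << 53
FLOAT32_LIMIT = 1 << 24


def in_mantissa_range(value: int, limit: int) -> bool:
    """Hypothetical range check: is value within +/- limit?"""
    return -limit <= value <= limit


def roundtrips_float64(value: int) -> bool:
    """True if converting value to float64 and back loses nothing."""
    return int(float(value)) == value


def roundtrips_float32(value: int) -> bool:
    """True if converting value to float32 and back loses nothing."""
    # struct's "f" format stores the value as an IEEE 754 single.
    as_f32 = struct.unpack("<f", struct.pack("<f", float(value)))[0]
    return int(as_f32) == value


if __name__ == "__main__":
    for v in (FLOAT64_LIMIT, FLOAT64_LIMIT + 1):
        print(v, in_mantissa_range(v, FLOAT64_LIMIT), roundtrips_float64(v))
    for v in (FLOAT32_LIMIT, FLOAT32_LIMIT + 1):
        print(v, in_mantissa_range(v, FLOAT32_LIMIT), roundtrips_float32(v))
```

The first value past each limit is the first integer that silently changes when stored as a float, which is the case the proposed option would turn into an error.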
[~wesmckinn] What do you think of this approach? Would it expose too much to
the user?
> [Python] Add option to not consider NaN to be null when converting to an
> integer Arrow type
> -------------------------------------------------------------------------------------------
>
> Key: ARROW-2298
> URL: https://issues.apache.org/jira/browse/ARROW-2298
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Wes McKinney
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.12.0
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> Follow-on work to ARROW-2135