[ 
https://issues.apache.org/jira/browse/ARROW-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675139#comment-16675139
 ] 

Jakub Okoński commented on ARROW-2298:
--------------------------------------

I've looked into this a bit more and realized that Arrow already supports 
what I want to do with pandas nulls and overflows: I just need to set 
`safe=False` when doing the conversion.

 

Still, I would like to add another parameter to `CastOptions`, called 
`allow_float_mantissa_overflow`. When it is set to false, the cast would check 
for out-of-range values and return an error.

 

This, however, won't be very useful unless we expand the `safe` parameter in 
the Python API, so that users could specify, for example:

 

`pyarrow.Table.from_pandas(df, schema=schema, safe=True, 
safe_allow_float_mantissa_overflow=False)` - this invocation would work with 
truncated floats, int overflow, truncated timestamps, but it would reject 
floating point values outside of the allowed range (+/- 1 << 53 for float64, 
+/- 1 << 24 for float32).
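
To illustrate where those limits come from, here is a minimal sketch in plain Python (no Arrow involved). The helper names `fits_float_mantissa` and `round_trip_float32` are hypothetical and are not part of pyarrow; they only show that integers beyond the mantissa range can no longer be represented exactly.

```python
import struct

# Largest magnitudes up to which every integer is exactly representable.
FLOAT64_EXACT_INT_LIMIT = 1 << 53  # float64 has a 53-bit significand
FLOAT32_EXACT_INT_LIMIT = 1 << 24  # float32 has a 24-bit significand


def fits_float_mantissa(value: float, limit: int) -> bool:
    """Hypothetical range check: True if value lies within +/- limit."""
    return -limit <= value <= limit


def round_trip_float32(value: float) -> float:
    """Round a Python float (float64) to float32 precision and back."""
    return struct.unpack("f", struct.pack("f", value))[0]


# Inside the range, consecutive integers stay distinct.
assert float((1 << 53) - 1) != float(1 << 53)

# Just past the range, (1 << 53) + 1 rounds back to 1 << 53 in float64.
assert float((1 << 53) + 1) == float(1 << 53)

# The same happens for float32 just past 1 << 24.
assert round_trip_float32(float((1 << 24) + 1)) == float(1 << 24)

# A check like the proposed one would reject values outside the range.
assert fits_float_mantissa(float(1 << 53), FLOAT64_EXACT_INT_LIMIT)
assert not fits_float_mantissa(float(1 << 54), FLOAT64_EXACT_INT_LIMIT)
```

A real implementation would live in the C++ cast kernels and be driven by `CastOptions`; the range check above is only meant to show the kind of values the new option would reject.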

 

[~wesmckinn] What do you think of this approach? Would it expose too much to 
the user?

> [Python] Add option to not consider NaN to be null when converting to an 
> integer Arrow type
> -------------------------------------------------------------------------------------------
>
>                 Key: ARROW-2298
>                 URL: https://issues.apache.org/jira/browse/ARROW-2298
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Wes McKinney
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.12.0
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Follow-on work to ARROW-2135



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
