sanjay20m opened a new pull request, #47604:
URL: https://github.com/apache/arrow/pull/47604
This PR improves Python integer type inference in Arrow.
Previously, TypeInferrer always defaulted to int64 for Python integers,
so any value larger than 2**63 - 1 raised an OverflowError.
Changes in this PR:
- Introduced InferIntegerType helper to select the smallest fitting type
(int8, int16, int32, int64, uint8, uint16, uint32, uint64).
- Extended TypeInferrer to track min_int_ and max_int_ while processing
integers.
- Replaced the unconditional int64 fallback with InferIntegerType.
- Added safe PyObject reference handling for min/max tracking.
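
The selection logic described in the list above can be sketched in plain Python. This is only an illustration, not the C++ code from the PR: the names `IntRangeTracker` and `infer_integer_type` are hypothetical, and the candidate order (narrowest width first, signed before unsigned at the same width) is an assumption, since the description does not say how ties between signed and unsigned types are resolved.

```python
# Illustrative sketch only; names and candidate ordering are assumptions,
# not taken from the PR's C++ implementation.

# (name, lowest representable value, highest representable value)
_CANDIDATES = [
    ("int8", -(2**7), 2**7 - 1),
    ("uint8", 0, 2**8 - 1),
    ("int16", -(2**15), 2**15 - 1),
    ("uint16", 0, 2**16 - 1),
    ("int32", -(2**31), 2**31 - 1),
    ("uint32", 0, 2**32 - 1),
    ("int64", -(2**63), 2**63 - 1),
    ("uint64", 0, 2**64 - 1),
]


def infer_integer_type(min_value, max_value):
    """Return the first candidate whose range covers [min_value, max_value].

    Returns None when no single 64-bit integer type can hold the range.
    """
    for name, lo, hi in _CANDIDATES:
        if lo <= min_value and max_value <= hi:
            return name
    return None


class IntRangeTracker:
    """Tracks the smallest and largest integer seen so far."""

    def __init__(self):
        self.min_int = None
        self.max_int = None

    def visit(self, value):
        # Update the running bounds with one observed integer.
        if self.min_int is None or value < self.min_int:
            self.min_int = value
        if self.max_int is None or value > self.max_int:
            self.max_int = value

    def infer(self):
        # No integers observed: nothing to infer.
        if self.min_int is None:
            return None
        return infer_integer_type(self.min_int, self.max_int)


def infer_from_values(values):
    """Convenience wrapper: feed all values to a tracker, then infer."""
    tracker = IntRangeTracker()
    for v in values:
        tracker.visit(v)
    return tracker.infer()
```

Under these assumptions, a sequence containing 2**63 maps to uint64 instead of overflowing, while a sequence mixing a negative value with 2**63 has no fitting type and yields None in this sketch; how the PR handles that case is not stated in the description.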
Impact:
- Fixes incorrect inference for large positive integers by enabling uint64.
- Prevents overflows and improves efficiency by choosing smaller integer
types when possible.