itamarst commented on a change in pull request #7169:
URL: https://github.com/apache/arrow/pull/7169#discussion_r427336072
##########
File path: python/pyarrow/pandas_compat.py
##########
@@ -699,6 +699,17 @@ def _reconstruct_block(item, columns=None,
                        extension_columns=None):
     block_arr = item.get('block', None)
     placement = item['placement']
+
+    if (
+        (block_arr is not None) and
+        (block_arr.dtype.type == np.datetime64) and
+        (block_arr.dtype.name != "datetime64[ns]")
+    ):
+        # Non-nanosecond timestamps can express much larger values than
+        # nanosecond timestamps, and pandas checks that the values fit into
+        # nanosecond range, so this needs to be an object as dtype.
+        block_arr = block_arr.astype(np.dtype("O"))
Review comment:
If the dtype is `timestamp[ms]`, Pandas says "oh, it's a timestamp, I
should force it to nanoseconds" and then blows up for timestamps outside
the nanosecond range, undoing the point of the exercise.
Specifically, you get the following error:
```
pyarrow/pandas_compat.py:740: in _reconstruct_block
    block = _int.make_block(block_arr, placement=placement)
../pyarrow/lib/python3.7/site-packages/pandas/core/internals/blocks.py:3047: in make_block
    return klass(values, ndim=ndim, placement=placement)
../pyarrow/lib/python3.7/site-packages/pandas/core/internals/blocks.py:2170: in __init__
    values = self._maybe_coerce_values(values)
../pyarrow/lib/python3.7/site-packages/pandas/core/internals/blocks.py:2194: in _maybe_coerce_values
    values = conversion.ensure_datetime64ns(values)
pandas/_libs/tslibs/conversion.pyx:123: in pandas._libs.tslibs.conversion.ensure_datetime64ns
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>   ???
E   pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1-01-01 00:00:00
```
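To make the failure mode concrete, here is a minimal sketch (not part of the PR; variable names are illustrative) of the coercion the `astype` cast above avoids. It assumes a nanosecond-only pandas, such as the 1.x releases current at the time; pandas 2.0+ keeps non-nanosecond resolutions and will not raise here.
```python
import numpy as np
import pandas as pd

# Year 1 is far outside the range representable by datetime64[ns]
# (roughly 1677-09-21 through 2262-04-11).
ms_values = np.array(["0001-01-01T00:00:00"], dtype="datetime64[ms]")

try:
    # Nanosecond-only pandas coerces any datetime64 input to datetime64[ns]
    # (via ensure_datetime64ns), which overflows for this value.
    pd.Series(ms_values)
except pd.errors.OutOfBoundsDatetime as exc:
    print("coercion to datetime64[ns] failed:", exc)

# Casting to object dtype first yields plain datetime.datetime instances,
# which pandas stores in an object block without forcing ns precision.
obj_values = ms_values.astype(np.dtype("O"))
print(pd.Series(obj_values).dtype)  # object; these values cannot be coerced to datetime64[ns]
```
Casting to object dtype trades some performance for correctness: the values survive the round trip because pandas stores them as plain `datetime.datetime` objects instead of forcing them into `datetime64[ns]`.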