itamarst commented on a change in pull request #7169:
URL: https://github.com/apache/arrow/pull/7169#discussion_r427336072



##########
File path: python/pyarrow/pandas_compat.py
##########
@@ -699,6 +699,17 @@ def _reconstruct_block(item, columns=None, extension_columns=None):
 
     block_arr = item.get('block', None)
     placement = item['placement']
+
+    if (
+            (block_arr is not None) and
+            (block_arr.dtype.type == np.datetime64) and
+            (block_arr.dtype.name != "datetime64[ns]")
+    ):
+        # Non-nanosecond timestamps can express much larger values than
+        # nanosecond timestamps, and pandas checks that the values fit into
+        # nanosecond range, so this needs to be an object as dtype.
+        block_arr = block_arr.astype(np.dtype("O"))

Review comment:
   If the dtype stays `timestamp[ms]`, pandas says "oh, it's a timestamp, I should force it to nanoseconds" and then blows up on timestamps that are out of the nanosecond range, which defeats the point of the exercise.
   
   Specifically, you get the following error:
   
   ```
   pyarrow/pandas_compat.py:740: in _reconstruct_block
       block = _int.make_block(block_arr, placement=placement)
   ../pyarrow/lib/python3.7/site-packages/pandas/core/internals/blocks.py:3047: in make_block
       return klass(values, ndim=ndim, placement=placement)
   ../pyarrow/lib/python3.7/site-packages/pandas/core/internals/blocks.py:2170: in __init__
       values = self._maybe_coerce_values(values)
   ../pyarrow/lib/python3.7/site-packages/pandas/core/internals/blocks.py:2194: in _maybe_coerce_values
       values = conversion.ensure_datetime64ns(values)
   pandas/_libs/tslibs/conversion.pyx:123: in pandas._libs.tslibs.conversion.ensure_datetime64ns
       ???
   _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

   >   ???
   E   pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1-01-01 00:00:00
   ```
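
For context on why `1-01-01 00:00:00` is out of bounds: a nanosecond timestamp is a signed 64-bit count of nanoseconds since the Unix epoch, which only reaches from roughly 1677 to 2262. The sketch below redoes that arithmetic with the standard library only. The helper names `ns_since_epoch` and `fits_in_int64_ns` are hypothetical, made up for illustration, and are not part of pyarrow, pandas, or NumPy.

```python
import datetime

# Largest signed 64-bit value. The smallest int64 value is assumed to be
# reserved as the NaT sentinel, so the usable range is symmetric.
INT64_MAX = 2**63 - 1
EPOCH = datetime.datetime(1970, 1, 1)


def ns_since_epoch(dt):
    """Exact integer nanoseconds between the Unix epoch and a naive datetime."""
    delta = dt - EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 10**9 + delta.microseconds * 1_000


def fits_in_int64_ns(dt):
    """True if dt can be stored as a signed 64-bit nanosecond count."""
    return -INT64_MAX <= ns_since_epoch(dt) <= INT64_MAX


print(fits_in_int64_ns(datetime.datetime(2020, 5, 19)))  # True
print(fits_in_int64_ns(datetime.datetime(1, 1, 1)))      # False
```

The year-1 value from the traceback is about 6.2e19 nanoseconds before the epoch, well past the int64 limit of about 9.2e18, so it cannot be coerced to nanosecond resolution without overflow.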





