jorisvandenbossche commented on code in PR #39609:
URL: https://github.com/apache/arrow/pull/39609#discussion_r1453093512
##########
python/pyarrow/pandas_compat.py:
##########
@@ -950,7 +950,7 @@ def _reconstruct_index(table, index_descriptors,
all_columns, types_mapper=None)
         index = index_arrays[0]
         if not isinstance(index, pd.Index):
             # Box anything that wasn't boxed above
-            index = pd.Index(index, name=index_names[0])
+            index = pd.Index(index.infer_objects(), name=index_names[0])
Review Comment:
OK, so putting a breakpoint at the above location and then running that
test, I see:
```
(Pdb) index
0 (0, 1]
1 (1, 2]
2 (2, 3]
dtype: object
(Pdb) index.values
array([Interval(0, 1, closed='right'), Interval(1, 2, closed='right'),
Interval(2, 3, closed='right')], dtype=object)
(Pdb) pd.Index(index, name=index_names[0])
IntervalIndex([(0, 1], (1, 2], (2, 3]], dtype='interval[int64, right]')
```
So `index` is an object-dtype Series holding scalar Interval objects.
Converting it to an Index on the line above currently infers that the values
are intervals and converts them to interval dtype, whereas in the future it
will preserve the object dtype.
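For reference, a minimal standalone sketch of the same situation outside the test. The Series is built by hand here rather than taken from the pyarrow conversion, and the name `"idx"` is just a stand-in for `index_names[0]`. It assumes a pandas version in which `infer_objects()` recognizes Interval scalars:

```python
import pandas as pd

# Object-dtype Series holding scalar Interval objects, mirroring the
# `index` value seen in the debugger session (built by hand here).
intervals = [pd.Interval(0, 1), pd.Interval(1, 2), pd.Interval(2, 3)]
index = pd.Series(intervals, dtype=object)
print(index.dtype)

# Infer the dtype explicitly before boxing, so the result does not rely
# on pd.Index doing its own inference on a pandas object.
inferred = index.infer_objects()
boxed = pd.Index(inferred, name="idx")  # "idx" stands in for index_names[0]
print(inferred.dtype)
print(type(boxed).__name__)
```

Calling `infer_objects()` first makes the inference explicit, which is what the changed line in the diff relies on.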
Now, I do wonder why we have an object-dtype Series of Interval objects in
the first place, though. We created that data ourselves (converted from a
column in the pyarrow table), so I would expect us to create a proper
interval dtype to start with.