agkphysics commented on issue #39914:
URL: https://github.com/apache/arrow/issues/39914#issuecomment-2809153621

   I think this still fails when specifying a subset of columns to load that doesn't include the list column:
   ```python
   import pandas as pd
   import pyarrow as pa
   
   a = pd.Series(pa.array([[1, 2, 3]]), dtype=pd.ArrowDtype(pa.list_(pa.int64())))
   b = pd.Series(pa.array([1]), dtype=pd.ArrowDtype(pa.int64()))
   df = pd.DataFrame({"a": a, "b": b})
   df.to_parquet("test.parquet", index=False)
   pd.read_parquet("test.parquet", dtype_backend="pyarrow")  # Works
   pd.read_parquet("test.parquet", dtype_backend="pyarrow", columns=["a"])  # Works
   pd.read_parquet("test.parquet", dtype_backend="pyarrow", columns=["b"])  # Fails
   ```
   
   Fails with
   ```
   Traceback (most recent call last):
     File "/.../test.py", line 10, in <module>
       pd.read_parquet("test.parquet", dtype_backend="pyarrow", columns=["b"])  # Fails
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/.../.venv/lib/python3.12/site-packages/pandas/io/parquet.py", line 667, in read_parquet
       return impl.read(
              ^^^^^^^^^^
     File "/.../.venv/lib/python3.12/site-packages/pandas/io/parquet.py", line 281, in read
       result = pa_table.to_pandas(**to_pandas_kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "pyarrow/array.pxi", line 889, in pyarrow.lib._PandasConvertible.to_pandas
     File "pyarrow/table.pxi", line 5132, in pyarrow.lib.Table._to_pandas
     File "/.../.venv/lib/python3.12/site-packages/pyarrow/pandas_compat.py", line 796, in table_to_dataframe
       ext_columns_dtypes = _get_extension_dtypes(
                            ^^^^^^^^^^^^^^^^^^^^^^
     File "/.../.venv/lib/python3.12/site-packages/pyarrow/pandas_compat.py", line 899, in _get_extension_dtypes
       pandas_dtype = _pandas_api.pandas_dtype(dtype)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "pyarrow/pandas-shim.pxi", line 150, in pyarrow.lib._PandasAPIShim.pandas_dtype
     File "pyarrow/pandas-shim.pxi", line 153, in pyarrow.lib._PandasAPIShim.pandas_dtype
     File "/.../.venv/lib/python3.12/site-packages/pandas/core/dtypes/common.py", line 1645, in pandas_dtype
       npdtype = np.dtype(dtype)
                 ^^^^^^^^^^^^^^^
   TypeError: data type 'list<item: int64>[pyarrow]' not understood
   ```

