Joost Hoozemans created ARROW-17991:
---------------------------------------

             Summary: [Python] pyarrow.dataset IPC format does not support 
compression
                 Key: ARROW-17991
                 URL: https://issues.apache.org/jira/browse/ARROW-17991
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
            Reporter: Joost Hoozemans
            Assignee: Joost Hoozemans


When writing an IPC dataset with pyarrow.dataset, it is not possible to pass 
a compression argument.

Passing a pyarrow.ipc.IpcWriteOptions object as file_options fails:

>>> ds.write_dataset(f, "./thing.arrow", format=ds.IpcFileFormat(), 
...     file_options=ipc.IpcWriteOptions(compression='lz4'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/joost/.cache/pypoetry/virtualenvs/datalogistik-rL_l_suP-py3.8/lib/python3.8/site-packages/pyarrow/dataset.py", line 940, in write_dataset
    if format != file_options.format:
AttributeError: 'pyarrow.lib.IpcWriteOptions' object has no attribute 'format'

 

Alternatively, pyarrow.dataset.IpcFileFormat().make_write_options() does not 
accept a compression parameter either.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
