[I] [Python] Support serialization of Arrow files on disk without the identifier "Feather" [arrow]

via GitHub Mon, 30 Oct 2023 09:30:38 -0700


jason-s opened a new issue, #38515:
URL: https://github.com/apache/arrow/issues/38515


   ### Describe the enhancement requested
   
   The documentation for [Arrow Columnar 
Format](https://arrow.apache.org/docs/format/Columnar.html#ipc-file-format) 
suggests that the separate Feather project has been subsumed into Arrow, and 
that it (Feather) is really just the canonical serialization format for Arrow 
tables:
   
   > We recommend the “.arrow” extension for files created with this format. 
Note that files created with this format are sometimes called “Feather V2” or 
with the “.feather” extension, the name and the extension derived from “Feather 
(V1)”, which was a proof of concept early in the Arrow project for 
language-agnostic fast data frame storage for Python (pandas) and R.
   
   The Python support for Arrow serialization still uses the identifier 
`feather` (see [the 
Cookbook](https://arrow.apache.org/cookbook/py/io.html#write-a-feather-file)):
   
   > Once we have a table, it can be written to a Feather File using the 
functions provided by the `pyarrow.feather` module
   >
   > ```
   > import pyarrow.feather as ft
   > 
   > ft.write_feather(table, 'example.feather')
   > ```
   
   This functionality should be kept as is for backwards compatibility, but I 
wonder whether `pyarrow` should also offer a top-level `write()` function, so 
that users do not need to import the `pyarrow.feather` module or use the term 
`feather` at all. This would reduce confusion about file extensions and the 
relationship between "Arrow" and "Feather".
   
   ### Component(s)
   
   Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

