Got you, thanks Neal. It does seem like it'll be tricky to implement in
JavaScript without native compression.

We've managed to trick our server into serving .feather.gz as .feather with
compression headers, so we save on storage and on runtime server compression,
and rely on the browser's native HTTP decompression.
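
The server-side trick above can be sketched with Python's standard library.
This is an illustration only, not the setup actually used here: the helper
pregzipped_response, the handler class, the root directory, and the host and
port are all hypothetical, and it assumes every client accepts gzip (it does
not inspect Accept-Encoding).

```python
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def pregzipped_response(root, url_path):
    """Map a request for /name.feather onto root/name.feather.gz.

    Returns (status, headers, body); the body is the gzipped bytes as stored.
    """
    name = os.path.basename(url_path)  # ignore any directory components
    gz_path = os.path.join(root, name + ".gz")
    if not name.endswith(".feather") or not os.path.isfile(gz_path):
        return 404, {"Content-Length": "0"}, b""
    with open(gz_path, "rb") as f:
        body = f.read()  # already compressed on disk, so no runtime gzip work
    headers = {
        "Content-Type": "application/octet-stream",
        # Asks the browser to gunzip the body before scripts see it.
        "Content-Encoding": "gzip",
        "Content-Length": str(len(body)),
    }
    return 200, headers, body


class PreGzippedHandler(BaseHTTPRequestHandler):
    root = "."  # hypothetical directory holding the *.feather.gz files

    def do_GET(self):
        status, headers, body = pregzipped_response(self.root, self.path)
        self.send_response(status)
        for key, value in headers.items():
            self.send_header(key, value)
        self.end_headers()
        self.wfile.write(body)


def serve(host="127.0.0.1", port=8000):
    """Run the sketch locally; host and port are assumed values."""
    ThreadingHTTPServer((host, port), PreGzippedHandler).serve_forever()
```

With a header like this the browser gunzips the response itself, so the
JavaScript side only ever sees plain, uncompressed Feather bytes.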



On Fri, 18 Dec 2020, 17:49 Neal Richardson, <neal.p.richard...@gmail.com>
wrote:

> A few clarifications: Feather, in its version 2, _is_ the Arrow IPC file
> format. We've kept the Feather name as a way of referring to Arrow files.
> The original Feather file format, which had differences from the Arrow IPC
> format, did not support compression. The Arrow IPC format may include
> compression (https://issues.apache.org/jira/browse/ARROW-300), but as Micah
> brought up on the user mailing list thread, only the C++ implementation and
> the libraries using it have implemented it so far, and the feature is not
> well documented yet.
>
> So all Arrow libraries support Feather v2 (as it is the IPC file format),
> but currently only C++ (thus Python, R, and glib/Ruby) supports Feather/IPC
> files with compression.
>
> Neal
>
> On Fri, Dec 18, 2020 at 8:18 AM Brian Hulette <bhule...@apache.org> wrote:
>
> > Hi Andrew,
> > I'm glad you got this working! The JavaScript library only implements the
> > Arrow IPC spec; it doesn't have any special handling for Feather and its
> > compression support. It's good to know that you can read uncompressed
> > Feather files, but I'd only expect it to read an IPC stream or file. This
> > is what I did for the Intro to Arrow JS notebook [1]; see scrabble.py
> > here [2]. Note that the Python script was written many versions of Arrow
> > ago; I'm sure there's less boilerplate required for this in pyarrow 2.0.
> >
> > Support for Feather and compression would certainly be a welcome
> > contribution.
> >
> > [1] https://observablehq.com/@theneuralbit/introduction-to-apache-arrow
> > [2] https://gist.github.com/TheNeuralBit/64d8cc13050c9b5743281dcf66059de5
> >
> > On Thu, Dec 17, 2020 at 10:10 AM Andrew Clancy <n...@achren.org> wrote:
> >
> > > So, I figured out the issue here - I had to turn off compression in the
> > > pyarrow call: feather.write_feather(compression='uncompressed'). Is there
> > > any way to read a compressed Feather file in Arrow JS?
> > > See the comment under the first answer here:
> > >
> > > https://stackoverflow.com/questions/64629670/how-to-write-a-pandas-dataframe-to-arrow-file/64648955#64648955
> > >
> > > I couldn't find anything in the Arrow docs or notebooks on this - I'm
> > > assuming that's related to JavaScript compression libraries being so
> > > limited.
> > >
> > > On Mon, 14 Dec 2020 at 19:02, Andrew Clancy <n...@achren.org> wrote:
> > >
> > > > Hi,
> > > >
> > > > I have a simple Feather file created via a pandas to_feather with a
> > > > datetime64[ns] column, and cannot read the timestamps in JavaScript
> > > > with apache-arrow@2.0.0.
> > > >
> > > > See this notebook:
> > > > https://observablehq.com/@nite/apache-arrow-timestamp-investigation
> > > >
> > > > I'm guessing I'm missing something. Has anyone got any suggestions, or
> > > > decent examples of reading a file created in pandas? I've seen examples
> > > > with apache-arrow@0.3.1 where dates are stored as an array of 2 ints.
> > > >
> > > > File was created with:
> > > >
> > > > import pandas as pd
> > > > df = pd.read_parquet('sample.parquet')
> > > > df.to_feather('sample-seconds.feather')
> > > >
> > > > Final Q: I'm assuming this is the best place for this question? Happy to
> > > > post elsewhere if there are any other forums, or if this should be a JIRA
> > > > ticket?
> > > >
> > > > Thanks!
> > > > Andy
> > > >
> > >
> >
>
