> The former is about facilities (like extension points) for implementing
> custom data sources in Arrow whereas the latter is about facilities for
> integrating in PyArrow (existing or future) data sources written/wrapped in
> Python
Any C++ extension point could become a pyarrow extension poi
I think the biggest benefit of RLE is not on-the-wire compression, since that
can be done with more general-purpose compression schemes, as Antoine
mentions.
The biggest benefit of RLE is that it allows operating directly and very
efficiently on the "encoded" form -- for example, you can apply filters
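
To make the point about filtering on the encoded form concrete, here is a minimal plain-Python sketch. It is not Arrow's API or memory layout; the helper names (`rle_encode`, `rle_decode`, `rle_filter`) and the (values, lengths) representation are assumptions made up for illustration. The predicate is evaluated once per run rather than once per element.

```python
from itertools import groupby


def rle_encode(values):
    """Encode a sequence as parallel lists (run_values, run_lengths)."""
    run_values, run_lengths = [], []
    for value, group in groupby(values):
        run_values.append(value)
        run_lengths.append(sum(1 for _ in group))
    return run_values, run_lengths


def rle_decode(run_values, run_lengths):
    """Expand (run_values, run_lengths) back into a flat list."""
    out = []
    for value, length in zip(run_values, run_lengths):
        out.extend([value] * length)
    return out


def rle_filter(run_values, run_lengths, predicate):
    """Filter directly on the encoded form.

    The predicate is called once per run, not once per element.
    Adjacent surviving runs with equal values are not re-merged here.
    """
    kept = [(v, n) for v, n in zip(run_values, run_lengths) if predicate(v)]
    return [v for v, _ in kept], [n for _, n in kept]


data = [1, 1, 1, 2, 2, 3, 3, 3, 3, 1]
values, lengths = rle_encode(data)

# Four predicate calls (one per run) instead of ten (one per element).
f_values, f_lengths = rle_filter(values, lengths, lambda v: v != 2)
```

Decoding the filtered runs gives the same result as filtering the flat data, so the decode step can be deferred until a consumer actually needs flat values.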
Thanks for the detailed overview, Weston. I agree with David that this would be
very useful to have in a public doc.
Weston and David's discussion is a good one; however, I see it as separate from
the discussion I brought up. The former is about facilities (like extension
points) for implementing cu