I think the biggest benefit of RLE is not on-the-wire compression, since
that can be done via more general-purpose compression schemes, as Antoine
mentions.

The biggest benefit of RLE is that it allows operating directly and very
efficiently on the "encoded" form. For example, you can apply filters
directly to RLE-encoded data, as well as update aggregations. The benefit
can be especially large when operating on data stored in formats such as
Parquet, which already use RLE, as the size of the decoded values prior to
filtering can be much smaller.
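
To make that concrete, here is a minimal sketch in plain Python of filtering
and aggregating without expanding the runs. It is not Arrow's API or layout:
the (value, run_length) pair representation and the helper names rle_encode,
rle_filter, rle_sum and rle_count are assumptions made up for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical representation: one (value, run_length) pair per run.
Run = Tuple[int, int]


def rle_encode(values: List[int]) -> List[Run]:
    """Collapse consecutive equal values into (value, run_length) pairs."""
    runs: List[Run] = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs


def rle_filter(runs: List[Run], predicate: Callable[[int], bool]) -> List[Run]:
    """Evaluate the predicate once per run, not once per row."""
    out: List[Run] = []
    for value, length in runs:
        if not predicate(value):
            continue
        # Merge with the previous kept run if dropping runs made them adjacent.
        if out and out[-1][0] == value:
            out[-1] = (value, out[-1][1] + length)
        else:
            out.append((value, length))
    return out


def rle_sum(runs: List[Run]) -> int:
    """Sum of the decoded values: one multiply-add per run."""
    return sum(value * length for value, length in runs)


def rle_count(runs: List[Run]) -> int:
    """Number of decoded rows: just the sum of the run lengths."""
    return sum(length for _, length in runs)


runs = rle_encode([5, 5, 5, 2, 2, 9, 9, 9, 9, 5])
kept = rle_filter(runs, lambda v: v >= 5)
total = rle_sum(kept)
rows = rle_count(kept)
```

The work here is proportional to the number of runs (4) rather than the
number of rows (10), which is where the savings would come from.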

Andrew

On Fri, Jun 3, 2022 at 12:52 PM Tobias Zagorni <tob...@zagorni.eu.invalid>
wrote:

> On Friday, 03.06.2022 at 09:32 -0700, Micah Kornfield wrote:
> > >
> > > Thinking about compatibility with existing software, RLE could
> > > possibly even be made an Extension Type that follows the layout of
> > > a struct of int32 and the encoded value type. I'm wondering whether
> > > this would be better for compatibility.
> >
> >
> > I might be misunderstanding this proposal, but I don't think this
> > works. Wouldn't the structs with RLE have different row lengths than
> > any Array in the same record batch and table?  I think this means
> > that validation would fail on them.
>
> I think you understood it correctly. I'm not really familiar with
> validation in Arrow and didn't think of that problem.
>
> Best,
> Tobias
> >
>
>
