Thanks Antoine! I'll go respond to your comments now!
On Mon, Jan 9, 2023 at 11:01 AM Antoine Pitrou wrote:
>
> I've commented on the PR. I'm +1 on the principle and on the proposed
> format / layout additions.
>
> Regards
>
> Antoine.
>
>
> On 14/12/2022 at 17:27, Matt Topol wrote:
> >
I've commented on the PR. I'm +1 on the principle and on the proposed
format / layout additions.
Regards
Antoine.
On 14/12/2022 at 17:27, Matt Topol wrote:
Hello,
I'd like to propose adding the RLE type based on earlier discussions[1][2]
to the Arrow format:
- Columnar Format
Huzzah!
That brings us to 3 +1 (binding) votes, and 1 +1 (non-binding) vote!
The vote passes! I've updated the PR for the format changes (on their own)
here: https://github.com/apache/arrow/pull/14176 and will follow it up with
updating the other PRs as I can. If anyone could comment / approve
@Matt Topol: Yes, a change of the name to "run-end encoding" changes
my (non-binding) vote to a +1.
On Mon, Dec 19, 2022 at 3:32 PM Matthew Topol wrote:
>
> Okay, slight edit to my previous email: It was brought to my attention that
> we need at least 3 +1 binding votes, so this vote is still
+1
Thanks a lot for all this. Really exciting!!
On Mon, 19 Dec 2022, 17:56 Matt Topol wrote:
> That leaves us with a total vote of +1.5 so the vote carries with the
> caveat of changing the name to be Run End Encoded rather than Run Length
> Encoded (unless this means I need to do a new vote
Okay, slight edit to my previous email: It was brought to my attention that
we need at least 3 +1 binding votes, so this vote is still open for the
moment.
@IanCook: With the change of the name to RunEndEncoding, is that sufficient
to change your vote to a +1?
On Mon, Dec 19, 2022 at 12:57 PM
That leaves us with a total vote of +1.5 so the vote carries with the
caveat of changing the name to be Run End Encoded rather than Run Length
Encoded (unless this means I need to do a new vote with the changed name?
This is my first time doing one of these so please correct me if I need to
do a
+1
I agree that run-end encoding makes more sense but also don't see it
as a deal breaker.
The most compelling counter-argument I've seen for new types is to
avoid a schism where some implementations do not support the newer
types. However, for the type proposed here I think the risk is low
+1 on the proposal as written
I think it makes sense and offers exciting opportunities for faster
computation (especially for cases where Parquet files can be decoded
directly into such an array, avoiding unpacking; RLE-encoded dictionaries
are quite compelling).
I would prefer to use the term
I'm not at all opposed to renaming it as `Run-End-Encoding` if that would
be preferable. Hopefully others will chime in with their feedback.
--Matt
On Wed, Dec 14, 2022 at 12:09 PM Ian Cook wrote:
> Thank you Matt, Tobias, and others for the great work on this.
>
> I am -0.5 on this proposal
Thank you Matt, Tobias, and others for the great work on this.
I am -0.5 on this proposal in its current form because (pardon the
pedantry) what we have implemented here is not run-length encoding; it
is run-end encoding. Based on community input, the choice was made to
store run ends instead of
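
To illustrate the distinction drawn above, here is a minimal Python sketch of the two encodings. It is not the Arrow implementation or the layout from the PR; the function names (`run_length_encode`, `run_end_encode`, `value_at`) are hypothetical and plain Python lists stand in for Arrow buffers. Run-length encoding stores how long each run is, while run-end encoding stores the cumulative (exclusive) logical index at which each run stops.

```python
from bisect import bisect_right
from itertools import groupby


def run_length_encode(values):
    """Return (run_lengths, run_values) for a sequence."""
    lengths, run_values = [], []
    for value, group in groupby(values):
        run_values.append(value)
        lengths.append(sum(1 for _ in group))
    return lengths, run_values


def run_end_encode(values):
    """Return (run_ends, run_values).

    Each run end is the exclusive logical index at which that run
    stops, i.e. the running total of the run lengths.
    """
    lengths, run_values = run_length_encode(values)
    run_ends, total = [], 0
    for length in lengths:
        total += length
        run_ends.append(total)
    return run_ends, run_values


def value_at(run_ends, run_values, index):
    """Look up a logical index with a binary search over the run ends."""
    logical_length = run_ends[-1] if run_ends else 0
    if not 0 <= index < logical_length:
        raise IndexError(index)
    # The first run whose end is strictly greater than `index` holds it.
    return run_values[bisect_right(run_ends, index)]
```

Because the run ends are sorted, a logical index can be located with a binary search, whereas run lengths would first have to be summed from the start of the array.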
Hello,
I'd like to propose adding the RLE type based on earlier discussions[1][2]
to the Arrow format:
- Columnar Format description:
https://github.com/apache/arrow/pull/1/files#diff-8b68cf6859e881f2357f5df64bb073135d7ff6eeb51f116418660b3856564c60
- Flatbuffers changes: