Hello all,
Does anybody else want to give an opinion on this?
Thank you
Antoine.
On Tue, 17 Nov 2020 12:28:06 +0100
Antoine Pitrou wrote:
> Hello,
>
> The format spec and the C++ implementation disagree on one point:
>
> * The spec says that dense union offsets should be increasing:
> """
The last time this was discussed [1], I think we determined that the
specification was written as intended, and Wes mentioned there that he was
also weakly supportive of removing the constraint.
From a previous discussion [2], it sounds like users of the JS library were
explicitly using the "dictionary" feature
I think the Java implementation does not align with the spec either.
IMO, option 2 provides more opportunities for performance optimization.
However, it may lead to unexpected behavior. For example, when we
change the value of one slot, the values of several other slots may
change as well.
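
To make the concern concrete, here is a minimal plain-Python sketch. It
assumes that option 2 means allowing offsets that are not increasing, so
that several slots can point at the same child value. The names type_ids,
offsets, children and read_slot are hypothetical and only model the layout;
no Arrow library is used.

```python
# Plain-Python model of a dense union (illustrative only, not an Arrow API).
type_ids = [0, 0, 0]      # every logical slot selects child 0
offsets = [0, 0, 0]       # assumption: all three slots reuse child value 0
children = {0: [1.5]}     # child 0 stores a single physical value


def read_slot(i):
    """Resolve logical slot i through its type id and its offset."""
    return children[type_ids[i]][offsets[i]]


before = [read_slot(i) for i in range(len(offsets))]

# "Changing slot 0" means overwriting the child value that slot 0 points to.
children[type_ids[0]][offsets[0]] = 9.0

# Slots 1 and 2 share that child value, so they now read the new value too.
after = [read_slot(i) for i in range(len(offsets))]
```

With increasing offsets that never repeat within a child, each slot would
own its child value and the write would stay local to slot 0.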
In principle I'm in favor of #2 -- the only question is what kinds of
problems it might pose for forward compatibility.
Note:
* This is completely backward compatible (any data conforming to the
spec to the letter will continue to be conforming)
* It is also forward compatible at a protocol level,
Hello,
The format spec and the C++ implementation disagree on one point:
* The spec says that dense union offsets should be increasing:
"""The respective offsets for each child value array must be in order /
increasing."""
(from https://arrow.apache.org/docs/format/Columnar.html#dense-union)
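
For reference, the constraint quoted above could be checked roughly as
follows. This is only a sketch on plain Python lists, not code from any
Arrow implementation; the function name and the strict flag are
hypothetical. The flag exists because "in order / increasing" can be read
either as strictly increasing or as merely non-decreasing.

```python
def offsets_increasing_per_child(type_ids, offsets, strict=False):
    """Check that, for each child, offsets appear in increasing order.

    type_ids[i] selects the child of logical slot i and offsets[i] is the
    position inside that child. With strict=True a repeated offset within
    the same child is rejected as well.
    """
    last_seen = {}  # child type id -> last offset seen for that child
    for type_id, offset in zip(type_ids, offsets):
        prev = last_seen.get(type_id)
        if prev is not None:
            if offset < prev or (strict and offset == prev):
                return False
        last_seen[type_id] = offset
    return True


# Two children interleaved, each with offsets 0 then 1: conforming.
ok = offsets_increasing_per_child([0, 1, 0, 1], [0, 0, 1, 1])

# Child 0 is read at offset 1 and then at offset 0: not conforming.
out_of_order = offsets_increasing_per_child([0, 0], [1, 0])

# A repeated offset passes the loose reading and fails the strict one.
repeated_loose = offsets_increasing_per_child([0, 0], [0, 0])
repeated_strict = offsets_increasing_per_child([0, 0], [0, 0], strict=True)
```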