Re: [Discuss] Should dense union offsets be always increasing?

2020-11-24 Thread Antoine Pitrou
Hello all,

Does anybody else want to give an opinion on this?

Thank you

Antoine.

On Tue, 17 Nov 2020 12:28:06 +0100 Antoine Pitrou wrote:
> Hello,
>
> The format spec and the C++ implementation disagree on one point:
>
> * The spec says that dense union offsets should be increasing:
> """

Re: [Discuss] Should dense union offsets be always increasing?

2020-11-19 Thread Micah Kornfield
Last time this was discussed [1], I think we determined the specification was written as intended, and Wes mentioned there that he was also weakly supportive of removing the constraint. From a previous discussion [2], it sounds like users of the JS library were explicitly using the "dictionary" feature

Re: [Discuss] Should dense union offsets be always increasing?

2020-11-19 Thread Fan Liya
I think the Java implementation does not align with the spec either. IMO, option 2 provides more opportunities for performance optimization. However, it may lead to some unexpected behavior. For example, when we change the value of one slot, the values of several other slots may change as well.
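To illustrate the side effect described above, here is a minimal pure-Python sketch. It assumes that option 2 means allowing offsets that are not increasing, so that several slots may point at the same child value; the options themselves are cut off in this listing. The helpers `read_slot` and `write_slot` and the sample data are hypothetical and are not part of any Arrow library.

```python
# A dense union modeled with plain lists (illustrative only, not the Arrow API).
# type_ids[i] selects the child array; offsets[i] selects the value inside it.
type_ids = [0, 1, 0, 0]
offsets = [0, 0, 0, 0]      # slots 0, 2 and 3 all reuse child 0, position 0
children = [[10], ["a"]]    # child 0 holds ints, child 1 holds strings


def read_slot(i):
    """Return the logical value stored in union slot i."""
    return children[type_ids[i]][offsets[i]]


def write_slot(i, value):
    """Overwrite the child value that union slot i points at."""
    children[type_ids[i]][offsets[i]] = value


before = [read_slot(i) for i in range(len(type_ids))]
write_slot(0, 99)           # intended to change slot 0 only
after = [read_slot(i) for i in range(len(type_ids))]
```

Because slots 0, 2 and 3 share one offset, the single write is visible through all three of them, while slot 1 is untouched.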

Re: [Discuss] Should dense union offsets be always increasing?

2020-11-17 Thread Wes McKinney
In principle I'm in favor of #2 -- the only question is what kinds of problems it might pose for forward compatibility. Note

* This is completely backward compatible (any data conforming to the spec to the letter will continue to be conforming)
* It is also forward compatible at a protocol level,

[Discuss] Should dense union offsets be always increasing?

2020-11-17 Thread Antoine Pitrou
Hello,

The format spec and the C++ implementation disagree on one point:

* The spec says that dense union offsets should be increasing:
  """The respective offsets for each child value array must be in order / increasing."""
  (from https://arrow.apache.org/docs/format/Columnar.html#dense-union)
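As a rough illustration of the constraint quoted from the spec, the following pure-Python sketch checks whether the offsets belonging to each child appear in increasing order. The function name `offsets_increasing_per_child` and the `strict` flag are hypothetical; whether "in order / increasing" excludes repeated offsets is an assumption here, which is why the flag exists. This is not how any Arrow implementation validates unions.

```python
def offsets_increasing_per_child(type_ids, offsets, strict=True):
    """Check that, for each child, the offsets appear in increasing order.

    type_ids[i] names the child used by slot i and offsets[i] is the
    position inside that child. With strict=True a repeated offset for
    the same child counts as a violation (an assumption about the spec).
    """
    last_seen = {}  # child id -> most recent offset seen for that child
    for type_id, offset in zip(type_ids, offsets):
        prev = last_seen.get(type_id)
        if prev is not None:
            if offset < prev or (strict and offset == prev):
                return False
        last_seen[type_id] = offset
    return True


# Two children, each consumed front to back: conforms to the quoted wording.
ok = offsets_increasing_per_child([0, 1, 0, 1], [0, 0, 1, 1])

# Child 0 is read at offset 1 and then offset 0: violates the quoted wording.
bad = offsets_increasing_per_child([0, 0, 1], [1, 0, 0])
```

A layout that fails this check can still be read slot by slot, since each slot carries its own type id and offset; the check only concerns the ordering rule in the spec text.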