>
> Ok, then perhaps you might have some thoughts on the original question: is
> the JavaScript implementation currently incorrect?


I think whether it is a bug or not depends on the contract of the builder.
If the contract is that users must ensure all the children have equal
lengths, then it is probably correct as is.  If it is more consistent with
the code for the struct builder to manage appending a placeholder value to
its children, then it is a reasonable change.

I seem to recall that, at least in C++, we actually changed the behavior of
builders in this regard at some point, but I might be misremembering (the
change might have been to append a placeholder value instead of appending
nulls, to lower the chances of needing validity buffers on children arrays
if all values in the struct are null).

Whatever the implementation is, the post-condition for the resulting struct
array is that its length is equal to the length of all of its children
arrays.
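
To make the two possible contracts concrete, here is a minimal sketch in
Python. It is not the Arrow JS or C++ API: `ChildBuilder`, `StructBuilder`,
`pad_children` and `check_lengths` are hypothetical names made up for
illustration, and the children are plain lists rather than real buffers.

```python
from typing import Any, Dict, List


class ChildBuilder:
    """Toy child column (hypothetical): one slot per appended value."""

    def __init__(self) -> None:
        self.values: List[Any] = []
        self.validity: List[bool] = []

    def append(self, value: Any) -> None:
        # None stands in for a null slot; the slot still occupies a position.
        self.values.append(value)
        self.validity.append(value is not None)

    def __len__(self) -> int:
        return len(self.values)


class StructBuilder:
    """Toy struct builder (hypothetical) showing both contracts.

    pad_children=True:  the builder appends a placeholder slot to every
                        child whenever a null struct is appended.
    pad_children=False: the caller is responsible for keeping the
                        children the same length as the struct.
    """

    def __init__(self, field_names: List[str], pad_children: bool = True) -> None:
        self.children: Dict[str, ChildBuilder] = {
            name: ChildBuilder() for name in field_names
        }
        self.validity: List[bool] = []
        self.pad_children = pad_children

    def append(self, row: Dict[str, Any]) -> None:
        # A valid struct slot: every child gets exactly one new slot.
        for name, child in self.children.items():
            child.append(row.get(name))
        self.validity.append(True)

    def append_null(self) -> None:
        # A null struct slot: optionally pad the children to keep lengths equal.
        if self.pad_children:
            for child in self.children.values():
                child.append(None)
        self.validity.append(False)

    def check_lengths(self) -> bool:
        # Post-condition: struct length equals the length of every child.
        n = len(self.validity)
        return all(len(child) == n for child in self.children.values())
```

With `pad_children=True`, `check_lengths()` holds after any sequence of
`append` and `append_null` calls. With `pad_children=False`, it only holds
if the caller appends to the children itself, which is the "users ensure
equal lengths" contract.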

Cheers,
Micah



On Fri, Feb 18, 2022 at 1:12 PM Phillip Cloud <cpcl...@gmail.com> wrote:

> On Fri, Feb 18, 2022 at 3:44 PM Antoine Pitrou <anto...@python.org> wrote:
>
> >
> > On 18/02/2022 at 21:32, Phillip Cloud wrote:
> > >
> > > I am really struggling to see how anything I've said is inconsistent
> > > with the spec or what you are saying here.
> > >
> > > To recap what I've said:
> > >
> > > 1. Appending a null sentinel to the values buffer isn't _required_
> > > unless the type requires it.
> > > Ex: "joemark" in the spec example. No sentinels were appended for the two
> > > null values in the parent struct array.
> >
> > There is no notion of sentinel in the Arrow format, so I don't
> > understand what you're saying.
> >
>
> The word "sentinel" is a linguistic placeholder for "some set of bytes".
> Hopefully that's clear from the context.
>
>
> >
> > (a sentinel is a physical value having a specific meaning, for example a
> > data format that has no separate validity bitmap could use the integer
> > value 42 to indicate null values in an integer array; the Arrow format
> > has a separate validity bitmap and therefore doesn't make use of
> > sentinel values)
>
>
> > > 2. Appending a null value sentinel is _allowed_ to be there if the type
> > > does not require it.
> > > Ex: "joefoofoomark" extending the spec example, assuming the other
> > > associated buffers (validity, offsets) are correctly constructed.
> > >
> > > Is either of those statements incorrect?
> >
> > To me, they simply don't make sense given that sentinels don't exist in
> > Arrow.
>
>
> Do they make sense after substituting in "a null entry in a string array
> with a non-zero number of bytes"?
>
>
> >
> > That said, a null entry in a string array can be backed by a non-zero
> > number of bytes in the values buffer. That is unrelated to the question
> > about struct arrays. For example, "joefoofoomark" can very well be the
> > values buffer for a string array with the logical values ["joe", null,
> > "mark"]. In this case, the offsets will be [0, 3, 9, 13].
> >
>
> Ok, then perhaps you might have some thoughts on the original question: is
> the JavaScript implementation currently incorrect?
>
>
> >
> > Regards
> >
> > Antoine.
> >
>
