A change in the length of an array is equivalent to a change in at least one of its buffers (i.e. length is always physical).
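
As a rough illustration of "length is always physical", here is a minimal sketch in plain Python. Byte buffers stand in for Arrow buffers; the helper names `primitive_length` and `variable_length` are made up for this example and are not part of any Arrow implementation.

```python
# Sketch only: plain Python byte buffers stand in for Arrow buffers.
I32_SIZE = 4  # bytes per i32 value, and per i32 offset


def primitive_length(values: bytearray, type_size: int) -> int:
    """Length of a primitive array: values buffer size / size of the type."""
    return len(values) // type_size


def variable_length(offsets: bytearray, offset_size: int) -> int:
    """Length of a binary/list/utf8 array: number of offsets minus one."""
    return len(offsets) // offset_size - 1


values = bytearray(8)    # two i32 values
offsets = bytearray(12)  # three i32 offsets
assert primitive_length(values, I32_SIZE) == 2
assert variable_length(offsets, I32_SIZE) == 2

# Growing either array by one slot necessarily grows one of its buffers.
values += bytes(I32_SIZE)       # append the value 0
offsets += offsets[-I32_SIZE:]  # repeat the last offset: an empty item
assert primitive_length(values, I32_SIZE) == 3
assert variable_length(offsets, I32_SIZE) == 3
```
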
* Primitive arrays (i32, i64, etc.): the array's length is equal to the length of the values buffer divided by the size of the type (e.g. buffer.len() = 8 and i32 <=> length = 2).
* Variable length (binary, list, utf8): the array's length is equal to the length of the offsets buffer divided by the size of the offset type, minus one (e.g. buffer.len() = 12 and i32 <=> length = 2).
* StructArray: the array's length is equal to the length of any of its fields.
* ...

When appending a slot to a StructArray (null or not), we need to append one item to each of its fields:

* In a primitive array field, the values buffer is increased by the size of the backing type (and, if it exists, its validity is increased by 1 bit).
* In variable length arrays, the offsets buffer is increased by the size of the offset type (and, if it exists, its validity is increased by 1 bit).
* ...

What we append on each of its fields is underdetermined. Most implementations append a null item, but anything is OK. For example, if the field is a primitive array and has no validity, it may make more sense to append a slot with value 0 to avoid allocating a validity. But if the field itself is deeply nested, a null may be cheaper (fewer pushes on its children).

Best,
Jorge

On Fri, Feb 18, 2022 at 8:02 PM Phillip Cloud <cpcl...@gmail.com> wrote:

> I think I'm confused by where this appended value lives. Is it only a
> logical value or does the value show up in memory?
>
> For example, appending another null to the name field is only going to
> change the validity map, offsets array and length and there will not be any
> changes to the values buffer.
>
> The value is logically there, but there's no additional values-buffer
> memory.
>
> Is that correct?
>
> On Fri, Feb 18, 2022 at 1:44 PM Micah Kornfield <emkornfi...@gmail.com>
> wrote:
>
> > > It is definitely required according to my understanding, and to how the
> > > C++ implementation works. The validation functions in the C++
> > > implementation also check for this (if a child buffer is too small for
> > > the number of values advertised by the parent, it is an error).
> >
> > +1.
> >
> > I think the wording is confusing. "While a struct does not have physical
> > storage for each of its semantic slots" refers to the fact that all fields
> > in the struct are stored in separate child arrays and not as buffers on the
> > Struct array itself. The actual value used in the child Array isn't
> > important if the struct is null but it must be appended so the length of the
> > struct is equal to the length of all of its children.
> >
> > -Micah
> >
> > On Fri, Feb 18, 2022 at 10:39 AM Antoine Pitrou <anto...@python.org>
> > wrote:
> >
> > > On 18/02/2022 at 19:29, Phillip Cloud wrote:
> > > >
> > > > The description underneath the example says:
> > > >
> > > >> While a struct does not have physical storage for each of its semantic
> > > >> slots (i.e. each scalar C-like struct), an entire struct slot can be
> > > >> set to null via the validity bitmap.
> > > >
> > > > To me this suggests that appending a sentinel value to the values buffer
> > > > for a field is allowed, but not required.
> > > >
> > > > Am I understanding this correctly?
> > >
> > > It is definitely required according to my understanding, and to how the
> > > C++ implementation works. The validation functions in the C++
> > > implementation also check for this (if a child buffer is too small for
> > > the number of values advertised by the parent, it is an error).
> > >
> > > Regards
> > >
> > > Antoine.
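
The requirement discussed in this thread, that every child must hold at least as many slots as its parent struct even when the struct slot is null, can be sketched roughly as follows. This is plain Python with hypothetical classes (`Int32Child`, `Utf8Child`, `StructSketch`), not the API or the validation code of the C++ implementation or any other. It assumes children without validity bitmaps, appends a default value (0 or an empty string) instead of a null, and models the struct's validity as a list of booleans rather than a bitmap.

```python
# Sketch only: hypothetical classes, not the API of any Arrow implementation.
I32_SIZE = 4  # bytes per i32 value, and per i32 offset


class Int32Child:
    """Primitive i32 child without a validity bitmap."""

    def __init__(self):
        self.values = bytearray()

    def __len__(self):
        return len(self.values) // I32_SIZE

    def push_default(self):
        self.values += bytes(I32_SIZE)  # append the value 0


class Utf8Child:
    """utf8 child with i32 offsets and no validity bitmap."""

    def __init__(self):
        self.offsets = bytearray(I32_SIZE)  # a single offset, 0
        self.values = bytearray()

    def __len__(self):
        return len(self.offsets) // I32_SIZE - 1

    def push_default(self):
        # Repeat the last offset: an empty string, no new values bytes.
        self.offsets += self.offsets[-I32_SIZE:]


class StructSketch:
    def __init__(self, children):
        self.children = children
        self.validity = []  # one bool per slot; a real bitmap uses 1 bit

    def __len__(self):
        return len(self.validity)

    def push_null(self):
        # A null struct slot still appends one item to every child.
        for child in self.children:
            child.push_default()
        self.validity.append(False)

    def is_valid(self):
        """No child may be shorter than the length the parent advertises."""
        return all(len(child) >= len(self) for child in self.children)


s = StructSketch([Int32Child(), Utf8Child()])
s.push_null()
assert len(s) == 1 and s.is_valid()
assert len(s.children[1].values) == 0  # the utf8 values buffer did not grow

s.validity.append(False)  # parent grows, children do not
assert not s.is_valid()
```
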