adriangb commented on code in PR #37877: URL: https://github.com/apache/arrow/pull/37877#discussion_r1990208311
##########
docs/source/format/Columnar.rst:
##########
@@ -487,6 +499,103 @@ will be represented as follows: ::
     |-------------------------------|-----------------------|
     | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 | unspecified (padding) |
+ListView Layout
+~~~~~~~~~~~~~~~
+
+The ListView layout is defined by three buffers: a validity bitmap, an offsets
+buffer, and an additional sizes buffer. Sizes and offsets have the identical bit
+width and both 32-bit and 64-bit signed integer options are supported.
+
+As in the List layout, the offsets encode the start position of each slot in the
+child array. In contrast to the List layout, list lengths are stored explicitly
+in the sizes buffer instead of inferred. This allows offsets to be out of order.
+Elements of the child array do not have to be stored in the same order they
+logically appear in the list elements of the parent array.
+
+Every list-view value, including null values, has to guarantee the following
+invariants: ::
+
+    0 <= offsets[i] <= length of the child array
+    0 <= offsets[i] + size[i] <= length of the child array
+
+A list-view type is specified like ``ListView<T>``, where ``T`` is any type
+(primitive or nested). In these examples we use 32-bit offsets and sizes where
+the 64-bit version would be denoted by ``LargeListView<T>``.
+
+**Example Layout: ``ListView<Int8>`` Array**
+
+We illustrate an example of ``ListView<Int8>`` with length 4 having values::
+
+    [[12, -7, 25], null, [0, -127, 127, 50], []]

Review Comment:
Makes total sense, thanks!

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
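
For anyone reading this thread later, here is a minimal pure-Python sketch of the two invariants quoted in the hunk and of how offsets plus sizes resolve to list values. It is not the Arrow implementation: the helper names `check_list_view` and `decode_list_view` are hypothetical, the buffers are plain Python lists, and the out-of-order child layout is one possible encoding of the example values that I picked for illustration. The non-negative size check is also my assumption; the quoted text does not state it explicitly.

```python
from typing import List, Optional, Sequence


def check_list_view(offsets: Sequence[int], sizes: Sequence[int],
                    child_length: int) -> None:
    """Raise ValueError if any slot breaks the invariants quoted in the hunk."""
    if len(offsets) != len(sizes):
        raise ValueError("offsets and sizes must have the same length")
    for i, (offset, size) in enumerate(zip(offsets, sizes)):
        # 0 <= offsets[i] <= length of the child array
        if not 0 <= offset <= child_length:
            raise ValueError(f"slot {i}: offset {offset} is out of range")
        # Assumption (not in the quoted text): sizes are never negative.
        if size < 0:
            raise ValueError(f"slot {i}: size {size} is negative")
        # 0 <= offsets[i] + size[i] <= length of the child array
        if offset + size > child_length:
            raise ValueError(f"slot {i}: offset + size exceeds the child array")


def decode_list_view(validity: Sequence[bool], offsets: Sequence[int],
                     sizes: Sequence[int],
                     child: Sequence[int]) -> List[Optional[List[int]]]:
    """Materialize each slot; null slots are checked too, then become None."""
    check_list_view(offsets, sizes, len(child))
    return [
        list(child[offset:offset + size]) if valid else None
        for valid, offset, size in zip(validity, offsets, sizes)
    ]


# One possible child layout, deliberately stored out of logical order.
child = [0, -127, 127, 50, 12, -7, 25]
validity = [True, False, True, True]
offsets = [4, 7, 0, 0]
sizes = [3, 0, 4, 0]

decoded = decode_list_view(validity, offsets, sizes, child)
print(decoded)
```

The null slot still carries an offset and size that satisfy both checks, which is what "including null values" in the quoted text requires. Because each length is stored rather than inferred from the next offset, the first list can live after the third in the child array.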