[GitHub] [arrow] lidavidm commented on a change in pull request #11646: ARROW-14634: [Flatbuffers] introduction of ColumnBag

GitBox Tue, 09 Nov 2021 07:39:19 -0800


lidavidm commented on a change in pull request #11646:
URL: https://github.com/apache/arrow/pull/11646#discussion_r745740314




##########
File path: format/Message.fbs
##########
@@ -117,6 +117,40 @@ table DictionaryBatch {
   isDelta: bool = false;
 }
 
+/// A range of field nodes, identified by their offset in the schema.
+/// The offsets are zero-indexed.
+struct FieldNodeRange {
+  /// The starting offset (inclusive)
+  start: long;
+
+  /// The ending offset (exclusive)
+  end: long;
+}
+
+/// A data header describing the shared memory layout of a "bag" of "columns".
+/// It is similar to a RecordBatch but not every top level FieldNode is required
+/// to be included in the wire payload.
+table ColumnBag {
+  /// If not provided, all field nodes are included and this payload is
+  /// identical to a RecordBatch. Otherwise the reader needs to skip
+  /// top level FieldNodes that were not included.
+  includedNodes: [FieldNodeRange];

Review comment:
       Sorry, I guess what I mean is: can we make it explicit that only 
top-level arrays can be skipped, and that a top-level array must be skipped 
as a whole? That is, we can't "patch" a child of a nested array. 
(Implementations must also reject "degenerate" messages that skip, say, only 
one child of a struct. That makes no sense in the first place, but it should 
still be validated.)
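
To make the validation concrete, here is a rough sketch of the check a reader could run, written in Python for brevity. It assumes that `FieldNodeRange` offsets index the depth-first, pre-order flattening of the schema's field nodes (as field nodes are laid out in a RecordBatch); the diff does not spell this out. `Field`, `top_level_spans` and `validate_included_nodes` are hypothetical names for illustration, not Arrow APIs.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Field:
    """Hypothetical stand-in for a schema field; not an Arrow API."""
    name: str
    children: List["Field"] = field(default_factory=list)


def node_count(f: Field) -> int:
    """Field nodes contributed by a field: itself plus all descendants."""
    return 1 + sum(node_count(c) for c in f.children)


def top_level_spans(schema: List[Field]) -> List[Tuple[int, int]]:
    """[start, end) node offsets of each top-level field.

    Assumption: nodes are numbered in depth-first pre-order, so each
    top-level field owns one contiguous run of offsets.
    """
    spans = []
    offset = 0
    for f in schema:
        n = node_count(f)
        spans.append((offset, offset + n))
        offset += n
    return spans


def validate_included_nodes(schema: List[Field],
                            ranges: List[Tuple[int, int]]) -> None:
    """Raise ValueError unless every range covers whole top-level fields."""
    spans = top_level_spans(schema)
    starts = {s for s, _ in spans}
    ends = {e for _, e in spans}
    previous_end = 0
    for start, end in ranges:
        if start >= end:
            raise ValueError(f"empty or inverted range [{start}, {end})")
        if start < previous_end:
            raise ValueError(f"range [{start}, {end}) is unsorted or overlaps")
        # A range that begins or ends inside a nested field would "patch"
        # a child, which is the degenerate case to reject.
        if start not in starts or end not in ends:
            raise ValueError(
                f"range [{start}, {end}) splits a top-level field")
        previous_end = end


# Example schema: a, s: struct<x, y>, b
schema = [
    Field("a"),
    Field("s", [Field("x"), Field("y")]),
    Field("b"),
]

# Including "a" and "b" while skipping the whole struct "s" is accepted.
validate_included_nodes(schema, [(0, 1), (4, 5)])
```

With this schema, a range such as `(2, 3)` selects only the child `x` of the struct and is rejected with a `ValueError`. The sketch also requires ranges to be sorted and non-overlapping, which is an extra assumption on my part and not something the proposed schema states.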




