mapleFU commented on issue #35697: URL: https://github.com/apache/arrow/issues/35697#issuecomment-1557051817
```c++
TEST(ArrowReadWrite, NestedNonFixedSizeList3) {
  using ::arrow::field;
  using ::arrow::list;
  using ::arrow::struct_;

  auto type = list(list(::arrow::int16()));

  const char* json = R"([
    [[1, 2], [3, 4]],
    null,
    [[5, 6], null],
    [null, [7, 8]]])";
  auto array = ::arrow::ArrayFromJSON(type, json);
  auto table = ::arrow::Table::Make(::arrow::schema({field("root", type)}), {array});
  auto props_store_schema = ArrowWriterProperties::Builder().store_schema()->build();
  CheckSimpleRoundtrip(table, 2, props_store_schema);
}
```

By the way, this case passes the test. I went through the code, and I think I've found the reason.

The test expects Arrow to write batches of size 2.

Batch 1:

```
[[1, 2], [3, 4]], null
```

Batch 2:

```
[[5, 6], null], [null, [7, 8]]
```

For `List` (not fixed-size list), null entries take no space in the child array, so the underlying data in the array is:

```
1 2 3 4 5 6 7 8
```

So when `WritePath` in `src/parquet/arrow/path_internal.cc` is called, the underlying data is contiguous, and `RecordPostListVisit` merges the visited ranges into a single range:

```
// Incorporates |range| into visited elements. If the |range| is contiguous
// with the last range, extend the last range, otherwise add |range| separately
// to the list.
void RecordPostListVisit(const ElementRange& range) {
  if (!visited_elements.empty() && range.start == visited_elements.back().end) {
    visited_elements.back().end = range.end;
    return;
  }
  visited_elements.push_back(range);
}
```

However, for `FixedSizeList`, null entries still occupy slots in the child array, so the underlying data is:

```
1 2 ? ? ? ? 3 4 5 6 ? ? ? ? 7 8
```

So the underlying data is **not** contiguous, more than one range is recorded, and `WritePath` hits this check:

```
size_t visited_component_size = result.post_list_visited_elements.size();
DCHECK_GT(visited_component_size, 0);
if (visited_component_size != 1) {
  return Status::NotImplemented(
      "Lists with non-zero length null components are not supported");
}
```
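The range-merging behaviour described above can be reproduced in isolation. The following is only a minimal sketch: `ElementRange` here is a simplified stand-in for the struct in `path_internal.cc`, and `CountVisitedRanges` is a hypothetical helper introduced for illustration, not part of Parquet or Arrow. It applies the same merging rule as the quoted `RecordPostListVisit` and reports how many ranges remain.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in (assumption) for the ElementRange in path_internal.cc:
// a half-open interval [start, end) of child-array indices.
struct ElementRange {
  int64_t start;
  int64_t end;
};

// Same merging rule as the quoted RecordPostListVisit, written as a free
// function over an explicit vector instead of a member variable.
void RecordPostListVisit(std::vector<ElementRange>& visited,
                         const ElementRange& range) {
  assert(range.start <= range.end);
  if (!visited.empty() && range.start == visited.back().end) {
    // Contiguous with the previous range: extend it.
    visited.back().end = range.end;
    return;
  }
  // A gap exists: record a separate range.
  visited.push_back(range);
}

// Hypothetical helper: visit the ranges in order and return how many
// separate ranges are left after merging.
std::size_t CountVisitedRanges(const std::vector<ElementRange>& ranges) {
  std::vector<ElementRange> visited;
  for (const auto& r : ranges) {
    RecordPostListVisit(visited, r);
  }
  return visited.size();
}
```

With back-to-back ranges such as `{0, 2}` and `{2, 4}` (the `List` situation) the helper returns 1. With a gap between them, such as `{0, 2}` and `{6, 8}` (slots taken up by a null `FixedSizeList` entry), it returns 2, which is the `visited_component_size != 1` condition that makes `WritePath` return `NotImplemented`.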