jorisvandenbossche commented on a change in pull request #11385:
URL: https://github.com/apache/arrow/pull/11385#discussion_r729128250

##########
File path: cpp/src/parquet/schema.cc
##########
@@ -73,6 +73,29 @@ std::shared_ptr<ColumnPath> ColumnPath::FromNode(const Node& node) {
   return std::make_shared<ColumnPath>(std::move(path));
 }

+std::shared_ptr<ColumnPath> ColumnPath::ShortFromNode(const Node& node) {
+  // Build the path in reverse order as we traverse the nodes to the top
+  std::vector<std::string> rpath_;
+  const Node* cursor = &node;
+  while (cursor->parent()) {
+    if (cursor->is_group()) {
+      auto group = dynamic_cast<const GroupNode*>(cursor);
+      // If we have a parent list node, remove the names of the two direct
+      // child nodes (list.element)
+      if (group->logical_type()->is_list()) {
+        rpath_.pop_back();
+        rpath_.pop_back();

Review comment:
   Hmm, that's difficult to test if we can't write such files. Or do we have some test files for that? How do we test those special cases in our reading code?

   Looking at the parquet-testing repo, there is a special case (https://github.com/apache/parquet-testing/blob/master/data/repeated_no_annotation.parquet), but that's with a struct inside the list inside a list, with no logical type annotation.
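
   To make the testing question concrete, here is a minimal sketch of the traversal on a toy tree built in memory, which might be one way to cover such layouts without needing a file. Everything in it is an assumption for illustration: `ToyNode`, `ShortPath` and `RunExamples` are hypothetical names and not part of the parquet-cpp API, and the "special case" shown (a list-annotated group with a single repeated child) is only my guess at the kind of layout in question. The size guard is also an addition of the sketch; the diff above calls `pop_back()` twice unconditionally.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a schema node; NOT parquet::schema::Node.
struct ToyNode {
  std::string name;
  const ToyNode* parent = nullptr;
  bool is_list = false;  // plays the role of a LIST logical type annotation
};

using Path = std::vector<std::string>;

// Walk from `leaf` up to (but excluding) the root, collecting names in
// reverse order. At a list-annotated node, drop up to two names already
// collected below it (the "list.element" levels of the standard layout).
Path ShortPath(const ToyNode& leaf) {
  Path rpath;
  for (const ToyNode* cursor = &leaf; cursor->parent != nullptr;
       cursor = cursor->parent) {
    if (cursor->is_list) {
      // Guard (assumption of this sketch): a non-standard layout may have
      // fewer than two names below the list node, and calling pop_back()
      // on an empty vector is undefined behavior.
      const std::size_t n = std::min<std::size_t>(2, rpath.size());
      rpath.resize(rpath.size() - n);
    }
    rpath.push_back(cursor->name);
  }
  std::reverse(rpath.begin(), rpath.end());
  return rpath;
}

void RunExamples() {
  ToyNode root{"schema"};

  // Standard three-level list: a (LIST) -> list -> element
  ToyNode a{"a", &root, true};
  ToyNode list{"list", &a};
  ToyNode element{"element", &list};
  assert((ShortPath(element) == Path{"a"}));

  // Struct inside a list: a (LIST) -> list -> element -> x
  ToyNode x{"x", &element};
  assert((ShortPath(x) == Path{"a", "x"}));

  // Two-level layout: b (LIST) -> array, only one name below the list node
  ToyNode b{"b", &root, true};
  ToyNode array{"array", &b};
  assert((ShortPath(array) == Path{"b"}));
}
```

   Whether `b` is the short path we actually want for the two-level layout is a separate question; the sketch only shows that such a case can be exercised on a schema tree constructed directly in the test.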