neilechao commented on PR #45375: URL: https://github.com/apache/arrow/pull/45375#issuecomment-2737552566
Here's what I'm seeing as differences between the [Geometry](https://github.com/apache/arrow/pull/45459) and Variant PRs so far:

- Geometry has a reader_test which calls test_utils for creating sample objects. For Variant, this doesn't make sense until we put in the encoding / decoding piece.
- There are lots of metadata and statistics for Geometry. For Variant, since we're starting with unshredded, we don't have statistics so far.
- Geometry has a stricter thrift definition, whereas unshredded variant is a Group of two binaries, metadata and value. The metadata and value don't make sense until encoding is done, so logic using thrift defs doesn't make sense until encoding is in.

Here is a very loose set of steps to get full Variant C++ support, which I'm sure is missing some/many pieces. Please fill in the missing steps and capabilities.

1. Get a logical type skeleton merged. Reading / writing binary with no interpretation of what it means.
   a. Ideally we could get this PR to just do 1 and get it merged before moving on to the next steps. For this, I'm not sure what is remaining.
2. Parquet-java checks in sample parquet files with variant into parquet-testing.
3. Add decoding in some variant_util class(es) using those parquet-testing files.
4. Add encoding.
   a. Reading/writing tests are unblocked.
5. Move on to shredded variants. This is a very large work item that will definitely get expanded into multiple sub items.
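To make the "Group of two binaries" point concrete, here is a rough sketch of what step 1 versus step 3 could look like. It is not code from this PR: `UnshreddedVariant`, `MetadataHeader`, and `ParseMetadataHeader` are hypothetical names, and the bit layout of the first metadata byte (version in the low 4 bits, a sorted-strings flag in bit 4, offset size minus one in the top 2 bits) is my reading of the Parquet Variant encoding spec, so please double-check it against the spec before relying on it.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical holder for an unshredded Variant. For step 1 (logical type
// skeleton) both fields are passed through as opaque bytes.
struct UnshreddedVariant {
  std::string metadata;  // opaque until decoding exists
  std::string value;     // opaque until decoding exists
};

// Hypothetical decoded view of the first metadata byte (step 3 territory).
struct MetadataHeader {
  uint8_t version;       // low 4 bits (assumed layout)
  bool sorted_strings;   // bit 4 (assumed layout)
  uint8_t offset_size;   // bytes per dictionary offset, 1..4 (assumed layout)
};

// Reads only the header byte; returns nullopt for an empty buffer.
// It does not validate the dictionary that follows the header.
inline std::optional<MetadataHeader> ParseMetadataHeader(
    const std::string& metadata) {
  if (metadata.empty()) return std::nullopt;
  const uint8_t b = static_cast<uint8_t>(metadata[0]);
  MetadataHeader h;
  h.version = static_cast<uint8_t>(b & 0x0F);
  h.sorted_strings = ((b >> 4) & 0x01) != 0;
  h.offset_size = static_cast<uint8_t>(((b >> 6) & 0x03) + 1);
  return h;
}
```

The idea is that the skeleton PR only needs the two-field struct, and anything like `ParseMetadataHeader` would live in the later variant_util work once the parquet-testing files are available to test against.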