jagill commented on PR #6643: URL: https://github.com/apache/arrow-rs/pull/6643#issuecomment-2564932382
@alamb: With this update I've 1. included the hoisting of the conditional above the loop, which I missed last time due to atrophied `git add` muscles, 2. renamed and extracted StructParseMode to a top level StructMode in arrow_json, since I've... 3. enabled writing structs to JSON lists in the writer, 4. included some more examples in the `with_struct_mode` methods of ReaderBuilder and WriterBuilder, along with links to the scanty Presto/Trino documentation. One thing that makes me sad is that the Writer has an `explicit_nulls` field which is not orthogonal to StructMode. In particular, writing to a JSON List _must_ include nulls, so `explicit_nulls == false` and `StructMode::ListOnly` is not a consistent state. I saw three main approaches here: 1. Split StructMode::ObjectOnly to ObjectOnlyWithImplictNulls and ObjectOnlyWithExplicitNulls (but with better names). This means only valid states will exist, and the Reader can choose between accepting implicit nulls or not. But this is a breaking change. 2. Error/panic in WriterBuilder if `explicit_nulls == false` and `struct_mode == ListOnly`. This prevents users from using a non-consistent config, but adds failure paths. 3. Only skip nulls if `explicit_nulls == false` and `struct_mode == ObjectOnly`, and add that to the documentation for `with_explicit_nulls` and `with_struct_mode`. I chose 3 but in theory I'm open to any of these (or other suggestions). In practice, I think that 1 would involve too much yak shaving, and we can always do the breaking change combining the two options later. I'm also happy to add more documentation for rational or examples, with the following caveats: 1. I wanted to add a doctest but constructing an expected StructArray was so verbose it obfuscated the purpose of the doctest. Maybe there's a better way than I know? 2. Getting the doc update PR accepted/merged into Presto and Trino will take an unknown amount of time. Maybe a week, maybe 3 months. I'm reticent to block this update for that. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
