jagill commented on PR #6643:
URL: https://github.com/apache/arrow-rs/pull/6643#issuecomment-2564932382

   @alamb: With this update I've
   1. included the hoisting of the conditional above the loop, which I missed 
last time due to atrophied `git add` muscles,
   2. renamed and extracted StructParseMode to a top level StructMode in 
arrow_json, since I've...
   3. enabled writing structs to JSON lists in the writer,
   4. included some more examples in the `with_struct_mode` methods of 
ReaderBuilder and WriterBuilder, along with links to the scanty Presto/Trino 
documentation.
   
   One thing that makes me sad is that the Writer has an `explicit_nulls` field 
which is not orthogonal to StructMode. In particular, writing to a JSON List 
_must_ include nulls, so `explicit_nulls == false` and `StructMode::ListOnly` 
is not a consistent state.  I saw three main approaches here:
   1. Split StructMode::ObjectOnly to ObjectOnlyWithImplictNulls and 
ObjectOnlyWithExplicitNulls (but with better names).  This means only valid 
states will exist, and the Reader can choose between accepting implicit nulls 
or not.  But this is a breaking change.
   2. Error/panic in WriterBuilder if `explicit_nulls == false` and 
`struct_mode == ListOnly`.  This prevents users from using a non-consistent 
config, but adds failure paths.
   3. Only skip nulls if `explicit_nulls == false` and `struct_mode == 
ObjectOnly`, and add that to the documentation for `with_explicit_nulls` and 
`with_struct_mode`.
   
   I chose 3 but in theory I'm open to any of these (or other suggestions).  In 
practice, I think that 1 would involve too much yak shaving, and we can always 
do the breaking change combining the two options later.
   
   I'm also happy to add more documentation for rational or examples, with the 
following caveats:
   1. I wanted to add a doctest but constructing an expected StructArray was so 
verbose it obfuscated the purpose of the doctest.  Maybe there's a better way 
than I know?
   2. Getting the doc update PR accepted/merged into Presto and Trino will take 
an unknown amount of time. Maybe a week, maybe 3 months.  I'm reticent to block 
this update for that.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to