etseidl commented on issue #7909:
URL: https://github.com/apache/arrow-rs/issues/7909#issuecomment-3073814710

   > What behavior would we want from a thrift parser? To avoid this problem, 
each generated enum would need to have some `UNKNOWN` / `UNSUPPORTED` variant.
   
   As I understand it, Thrift is meant to be forward compatible, and 
`ColumnOrder` is definitely intended to be so. In C++, `union`s are modeled as 
a struct of structs along with a bitfield indicating which, if any, of the 
members are valid. If an unknown value is encountered, the parser simply 
returns an empty struct with every bit of the bitfield set to 0.
   
   For `enum`, the Rust Thrift implementation was changed from a Rust `enum` 
to a `struct` wrapping a single `i32` discriminant, with constants defined for 
each known value. An unknown value `i` is returned as `EnumName(i)`, which 
removes the need for an `UNDEFINED` special value.
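
   To make the pattern concrete, here is a minimal sketch of such an "enum 
struct". The type name `Codec` and its constants are hypothetical and chosen 
only for illustration; this is not the code the Thrift compiler generates.

```rust
/// Open-ended stand-in for a Thrift `enum`: the raw `i32` is kept as-is,
/// so a value this build has never heard of is still representable.
/// `Codec` and its constants are hypothetical names for illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Codec(pub i32);

impl Codec {
    // Known values are associated constants rather than enum variants.
    pub const UNCOMPRESSED: Codec = Codec(0);
    pub const SNAPPY: Codec = Codec(1);

    /// Returns true if this build recognizes the discriminant.
    pub fn is_known(self) -> bool {
        self == Self::UNCOMPRESSED || self == Self::SNAPPY
    }
}

impl From<i32> for Codec {
    /// Conversion cannot fail: unknown values are simply wrapped.
    fn from(i: i32) -> Self {
        Codec(i)
    }
}
```

   With this shape, `Codec::from(42)` yields `Codec(42)` instead of an error, 
and callers that care can check `is_known()`.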
   
   We could take a similar approach for `union`: a struct with an "enum 
struct" discriminant, along with an `Option<struct>` field for each variant. An 
unknown value `i` would then have `i` for the discriminant and `None` for all 
members. Getters for the fields could test the discriminant and return `None` 
if it does not match the field.
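
   A rough sketch of what that could look like for a one-variant union such as 
`ColumnOrder`. All names here (`ColumnOrderKind`, `from_type_order`, 
`unknown`, the getters) are hypothetical, and the field id `1` for the 
type-defined order is an assumption; none of this is existing arrow-rs API.

```rust
/// "Enum struct" discriminant for the union (hypothetical name).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ColumnOrderKind(pub i32);

impl ColumnOrderKind {
    /// Assumed field id of the only variant this build knows.
    pub const TYPE_ORDER: ColumnOrderKind = ColumnOrderKind(1);
}

/// Empty payload struct for the known variant.
#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct TypeDefinedOrder {}

/// Union modeled as a discriminant plus one `Option` per known variant.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ColumnOrder {
    kind: ColumnOrderKind,
    type_order: Option<TypeDefinedOrder>,
}

impl ColumnOrder {
    /// Builds the known variant.
    pub fn from_type_order(order: TypeDefinedOrder) -> Self {
        Self { kind: ColumnOrderKind::TYPE_ORDER, type_order: Some(order) }
    }

    /// Builds a value for a field id this build does not recognize:
    /// the id is preserved and every member is `None`.
    pub fn unknown(field_id: i32) -> Self {
        Self { kind: ColumnOrderKind(field_id), type_order: None }
    }

    /// Raw discriminant, known or not.
    pub fn kind(&self) -> ColumnOrderKind {
        self.kind
    }

    /// Getter that checks the discriminant before exposing the member.
    pub fn type_order(&self) -> Option<&TypeDefinedOrder> {
        if self.kind == ColumnOrderKind::TYPE_ORDER {
            self.type_order.as_ref()
        } else {
            None
        }
    }
}
```

   A reader that hits an unrecognized field id would call something like 
`ColumnOrder::unknown(id)` rather than failing, and downstream code sees `None` 
from every getter.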
   
   > For `ColumnOrder` we could just ignore the statistics in that case, but 
I'm not sure whether in general this would just move the error to another place 
in the code.
   
   In the specific `ColumnOrder` case, the intended behavior is that when an 
unknown order is encountered, statistics for that column are ignored. This is 
not currently enforced in Rust, but @adriangb and I attempted to address it in 
https://github.com/apache/datafusion/pull/15821. arrow-rs should likely also be 
changed to not populate statistics for columns with undefined orders.
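
   As a sketch of the "ignore statistics" rule, assuming a simplified order 
type and a simplified min/max pair (`ParsedOrder`, `MinMax`, and 
`usable_statistics` are hypothetical names, not arrow-rs or DataFusion API):

```rust
/// Simplified view of a column's order after parsing (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ParsedOrder {
    /// The order this build understands.
    TypeDefined,
    /// An order written by a newer writer; the raw id is kept.
    Unknown(i32),
}

/// Simplified min/max statistics for one column chunk (hypothetical type).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct MinMax {
    pub min: Vec<u8>,
    pub max: Vec<u8>,
}

/// Drops statistics whose sort order cannot be interpreted, so callers
/// never prune on min/max values they might compare incorrectly.
pub fn usable_statistics(order: ParsedOrder, stats: Option<MinMax>) -> Option<MinMax> {
    match order {
        ParsedOrder::TypeDefined => stats,
        ParsedOrder::Unknown(_) => None,
    }
}
```

   The point of the design is that an unknown order degrades to "no 
statistics" rather than to a read error.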

