Resurrecting a thread from earlier in the month regarding inconsistent
use of the file_offset field [1][2]. It seems like the preferred path
forward is to deprecate this (AFAICT) unused field to prevent further
confusion. If there are no violent objections, I'll submit a PR to do so
in a few days.

One question I have, though, is how to handle the requiredness of the
file_offset (currently required) and meta_data (currently optional)
fields in ColumnChunk. I'd prefer to swap them, making file_offset
optional and meta_data required, but I'm not clear on how that would
affect existing parsers. I believe most (if not all) implementations
ignore file_offset anyway and expect meta_data to be present, so maybe
this is a non-issue.
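
For concreteness, here is roughly what the swap would look like. This is
only a sketch and not the text of the eventual PR: the field IDs and
types are as I read them in parquet.thrift today, and the remaining
ColumnChunk fields are omitted.

```thrift
struct ColumnChunk {
  1: optional string file_path
  2: optional i64 file_offset            // currently: required
  3: required ColumnMetaData meta_data   // currently: optional
  // remaining fields unchanged, omitted from this sketch
}
```

As far as I understand Thrift, generated readers typically reject a
struct when a field they consider required is absent, so older readers
would only be affected if writers actually stopped emitting file_offset.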

Thanks,
Ed

[1] https://lists.apache.org/thread/q5r43ks61q4wcbvwsk1jyw4h30fvg68t
[2] https://github.com/apache/parquet-java/pull/1369