Resurrecting a thread from earlier in the month regarding inconsistent use of the file_offset field [1][2]. It seems like the preferred path forward is to deprecate this (AFAICT) unused field to prevent further confusion. If there are no violent objections, I'll submit a PR to do so in a few days.

One question I have, though, is how to handle the requiredness of the file_offset (currently required) and meta_data (currently optional) fields in ColumnChunk. I'd prefer to swap them, making file_offset optional and meta_data required, but I'm not clear on how that would affect existing parsers. I believe most, if not all, implementations ignore file_offset anyway and expect meta_data to be present, so this may be a non-issue.
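
To make the question concrete, here is a rough sketch of the reader behavior I'm assuming. It is hypothetical Python, not taken from any actual Parquet implementation: the classes are simplified stand-ins for the Thrift structs, and locate_column_data is a made-up helper. It treats file_offset as ignorable and fails only when meta_data is absent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnMetaData:
    # Simplified stand-in: the real struct carries many more fields.
    data_page_offset: int


@dataclass
class ColumnChunk:
    # Both fields are modeled as possibly absent so the reader below
    # can show which one it actually depends on.
    file_offset: Optional[int] = None
    meta_data: Optional[ColumnMetaData] = None


def locate_column_data(chunk: ColumnChunk) -> int:
    """Hypothetical helper: find the start of the column data.

    file_offset is never consulted; only meta_data is needed.
    """
    if chunk.meta_data is None:
        raise ValueError("ColumnChunk.meta_data is missing")
    return chunk.meta_data.data_page_offset


# A chunk written without file_offset is still readable.
chunk = ColumnChunk(meta_data=ColumnMetaData(data_page_offset=4))
print(locate_column_data(chunk))
```

A reader shaped like this would be unaffected by the swap. A parser that enforces the presence of file_offset while decoding would be the case to worry about.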

Thanks,
Ed

[1] https://lists.apache.org/thread/q5r43ks61q4wcbvwsk1jyw4h30fvg68t
[2] https://github.com/apache/parquet-java/pull/1369
