> > If the ColumnMetaData in the footer is de facto required, then I think we should at a minimum change the thrift to make it so.

If changing the thrift results in parquet files that cannot be read by old readers, I disagree with this tradeoff. I think it is better for the spec to be less than ideal but for the files to keep working.

> Similarly, the reference implementation (parquet-java) currently does not write the required metadata, and sets file_offset to an invalid (but valid seeming) value. If we don't change the requiredness of file_offset, then either parquet-java needs to start writing the metadata inline with the chunk data and set file_offset correctly, or, as I've proposed elsewhere[1], simply write 0 for the required field,

I agree that changing parquet-java to have more sensible behavior makes sense.