jhorstmann commented on code in PR #6117:
URL: https://github.com/apache/arrow-rs/pull/6117#discussion_r1693936255
##########
parquet/src/file/writer.rs:
##########
@@ -649,13 +649,10 @@ impl<'a, W: Write + Send> SerializedRowGroupWriter<'a, W>
{
));
}
- let file_offset = self.buf.bytes_written() as i64;
-
let map_offset = |x| x - src_offset + write_offset as i64;
let mut builder =
ColumnChunkMetaData::builder(metadata.column_descr_ptr())
.set_compression(metadata.compression())
.set_encodings(metadata.encodings().clone())
- .set_file_offset(file_offset)
Review Comment:
> the write below is correct as this is writing the copy of the
> ColumnMetaData in the footer
The code is hard to follow, but my understanding is that this method copies
an existing `ColumnChunk` from a `ChunkReader`. The `buf` seems to be the same
one that page data is written to, and I assume that `append_column` could be
called multiple times. The metadata also seems to get collected inside the
`on_close` closure and will in the end be written by
`SerializedFileWriter::write_metadata`.
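
To make the offset arithmetic concrete, here is a minimal standalone sketch of what I think the `map_offset` closure does when `append_column` is called more than once against the same buffer. It does not use the parquet crate: `ChunkOffsets`, `remap_offset` and `append_chunk` are hypothetical names invented for illustration, and taking `src_offset` as the start of the source chunk (dictionary page if present, otherwise the first data page) is my assumption rather than something confirmed by the diff.

```rust
/// Hypothetical stand-in for the page offsets stored in a column chunk's metadata.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ChunkOffsets {
    dictionary_page_offset: Option<i64>,
    data_page_offset: i64,
}

/// Shift an offset from the source file into the destination file,
/// mirroring the `map_offset` closure shown in the diff.
fn remap_offset(x: i64, src_offset: i64, write_offset: u64) -> i64 {
    x - src_offset + write_offset as i64
}

/// Simulate copying one chunk of `len` bytes into a shared output buffer.
/// `bytes_written` plays the role of `self.buf.bytes_written()`.
fn append_chunk(bytes_written: &mut u64, src: ChunkOffsets, len: u64) -> ChunkOffsets {
    // Assumption: the source chunk starts at its dictionary page if it has one,
    // otherwise at its first data page.
    let src_offset = src.dictionary_page_offset.unwrap_or(src.data_page_offset);
    // Where this copy starts in the destination buffer.
    let write_offset = *bytes_written;
    *bytes_written += len;

    ChunkOffsets {
        dictionary_page_offset: src
            .dictionary_page_offset
            .map(|x| remap_offset(x, src_offset, write_offset)),
        data_page_offset: remap_offset(src.data_page_offset, src_offset, write_offset),
    }
}

fn main() {
    // Pretend 4 bytes (for example a magic header) are already in the buffer.
    let mut bytes_written: u64 = 4;
    let src = ChunkOffsets {
        dictionary_page_offset: Some(1000),
        data_page_offset: 1040,
    };

    // Two calls against the same buffer yield different destination offsets.
    let first = append_chunk(&mut bytes_written, src, 200);
    let second = append_chunk(&mut bytes_written, src, 200);
    println!("first:  {first:?}");
    println!("second: {second:?}");
}
```

In this sketch every remapped offset depends on how much has already been written to the shared buffer at the time of the call, which is why I would expect each `append_column` call to produce its own set of offsets.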