PiotrSrebrny opened a new issue, #8534: URL: https://github.com/apache/arrow-rs/issues/8534
When working with `ArrowWriter`, I would like to flush buffered rows to disk. However, calling `ArrowWriter<W>::flush()` flushes only part of the data. The reason is that `parquet::file::writer::TrackedWrite`, which `ArrowWriter` uses, inserts a `BufWriter` on top of the user-supplied writer `W`, and this `BufWriter` is not flushed when `ArrowWriter<W>::flush()` is called.

The best solution would be to remove the `BufWriter` from `TrackedWrite` and use the user-supplied writer directly. The `BufWriter` is supposed to buffer small writes, but that is not needed when writing to memory, and most operating systems already provide this kind of buffering, so it is redundant. A `BufWriter` might be beneficial on a bare-metal system, but there a user could wrap their own writer in a `BufWriter` and pass it to `ArrowWriter`. In any case, I guess DataFusion is not often run on bare metal.
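
To illustrate the mechanism described above, here is a minimal sketch using only the standard library. It does not use the `parquet` crate or reproduce `TrackedWrite`; it only shows that bytes written through a `std::io::BufWriter` do not reach the underlying writer until the `BufWriter` itself is flushed. The function name `bytes_before_and_after_flush` and the 8 KiB capacity are assumptions made for this example.

```rust
use std::io::{BufWriter, Write};

/// Writes `payload` through a `BufWriter` and reports how many bytes the
/// underlying sink holds before and after an explicit `flush()`.
/// Hypothetical helper for illustration; `payload` is assumed to be
/// smaller than the buffer capacity so that it stays buffered.
fn bytes_before_and_after_flush(payload: &[u8]) -> std::io::Result<(usize, usize)> {
    // An in-memory sink stands in for the user-supplied writer `W`.
    let mut sink: Vec<u8> = Vec::new();
    let before;
    {
        // Explicit capacity so the example does not depend on the default.
        let mut buffered = BufWriter::with_capacity(8 * 1024, &mut sink);
        buffered.write_all(payload)?;

        // The payload is still held in the BufWriter's internal buffer,
        // so the sink has not received it yet.
        before = buffered.get_ref().len();

        // Only flushing the BufWriter pushes the bytes into the sink.
        buffered.flush()?;
    }
    Ok((before, sink.len()))
}

fn main() -> std::io::Result<()> {
    let (before, after) = bytes_before_and_after_flush(b"hello")?;
    println!("sink holds {before} bytes before flush, {after} bytes after flush");
    Ok(())
}
```

If a wrapper owns such a `BufWriter` and its own `flush()` does not forward to it, the caller sees the "before" state: the data has been accepted but has not reached the sink.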
