PiotrSrebrny opened a new issue, #8534:
URL: https://github.com/apache/arrow-rs/issues/8534

   When working with `ArrowWriter`, I would like to flush buffered rows to 
disk. However, calling `ArrowWriter<W>::flush()` flushes only part of the 
data. The reason is that `parquet::file::writer::TrackedWrite`, which 
`ArrowWriter` uses internally, wraps the user-supplied writer `W` in a 
`BufWriter`. This `BufWriter` is not flushed when `ArrowWriter<W>::flush()` 
is called. 
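
   The underlying effect can be reproduced with the standard library alone. 
The sketch below is not the `parquet` code: a `Vec<u8>` stands in for the 
user-supplied writer `W`, and the function name is hypothetical. It only 
shows that a small write stays inside a `BufWriter` until that `BufWriter` 
itself is flushed.

```rust
use std::io::{BufWriter, Write};

/// Returns how many bytes have reached the underlying sink before and
/// after an explicit flush of the `BufWriter`.
/// Hypothetical helper for illustration; not part of `parquet`.
fn bytes_before_and_after_flush() -> std::io::Result<(usize, usize)> {
    // A Vec<u8> stands in for the user-supplied writer `W`.
    let mut buffered = BufWriter::with_capacity(1024, Vec::new());

    // A write smaller than the buffer capacity is held in the buffer.
    buffered.write_all(b"hello")?;
    let before = buffered.get_ref().len();

    // Only flushing the BufWriter pushes the bytes to the inner writer.
    buffered.flush()?;
    let after = buffered.get_ref().len();

    Ok((before, after))
}
```

If a wrapper that owns such a `BufWriter` never calls its `flush()`, the 
buffered bytes stay invisible to the inner writer, which matches the 
behaviour described above.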
   
   The best solution to this problem would be to remove `BufWriter` from 
`TrackedWrite` and use the user-supplied writer `W` directly. The `BufWriter` 
is supposed to buffer small writes, but that is not needed when writing to 
memory, and most operating systems already provide this kind of buffering, 
so it is redundant. A `BufWriter` might be beneficial on a bare-metal system, 
but in that case the user could wrap their writer in a `BufWriter` and pass 
it to `ArrowWriter`. In any case, I guess that DataFusion is not often run on 
bare metal. 
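
   As a rough illustration of the proposed split of responsibilities, here 
is a std-only sketch. `write_rows` is a hypothetical stand-in for a writer 
such as `ArrowWriter` that adds no buffering of its own, and 
`write_rows_buffered` is a hypothetical caller that opts in to buffering by 
wrapping its sink in a `BufWriter`. Neither name exists in `parquet`.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical stand-in for a writer that uses the caller's sink as-is:
/// it writes each row, flushes the sink, and hands the sink back.
fn write_rows<W: Write>(mut sink: W, rows: &[&str]) -> io::Result<W> {
    for row in rows {
        sink.write_all(row.as_bytes())?;
    }
    // Flushing here reaches whatever buffering the caller chose to add.
    sink.flush()?;
    Ok(sink)
}

/// Hypothetical caller that wants small writes coalesced and therefore
/// supplies its own `BufWriter`.
fn write_rows_buffered(rows: &[&str]) -> io::Result<Vec<u8>> {
    let sink = write_rows(BufWriter::new(Vec::new()), rows)?;
    // Unwrap the BufWriter to get the bytes that reached the inner Vec.
    sink.into_inner().map_err(|e| e.into_error())
}
```

With this shape, a caller writing to memory passes a plain `Vec<u8>`, and a 
caller that wants buffering controls it, so a single `flush()` reaches 
every layer.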
   
   
   
   

