Jane Lewis created AVRO-4063:
--------------------------------

             Summary: `apache_avro::Writer::flush` does not call `std::io::Write::flush` on the inner writer
                 Key: AVRO-4063
                 URL: https://issues.apache.org/jira/browse/AVRO-4063
             Project: Apache Avro
          Issue Type: Bug
          Components: rust
            Reporter: Jane Lewis

The Rust documentation for {{apache_avro::Writer::flush}} describes the function as follows:
{quote}Flush the content appended to a {{Writer}}. Call this function to make sure all the content has been written before releasing the {{Writer}}.
{quote}
However, this function does not actually guarantee that all the content will be written out after the {{flush()}} call, because it does not call {{std::io::Write::flush}} on the inner {{writer}}. This can be a problem when the inner writer uses its own buffer.

Here's an example of how this can lead to misleading behavior:
{code:java}
fn main() {
    let buffered_writer =
        std::io::BufWriter::new(std::fs::File::create("test.avro").unwrap());

    let schema = apache_avro::Schema::parse_str(
        r#"
        {
            "type": "record",
            "name": "example_schema",
            "fields": [
                {"name": "example_field", "type": "string"}
            ]
        }
        "#,
    )
    .unwrap();

    let mut writer = apache_avro::Writer::new(&schema, buffered_writer);

    let mut record = apache_avro::types::Record::new(writer.schema()).unwrap();
    record.put("example_field", "value");

    writer.append(record).unwrap();
    writer.flush().unwrap();

    let test_file_contents = std::fs::read("test.avro").unwrap();

    assert_ne!(test_file_contents.len(), 0); // this will fail
}
{code}
In this example, the internal {{BufWriter}} had not yet flushed its internal buffer after {{writer.flush().unwrap()}} was called. In fact, the buffer is only written out once {{writer}} is dropped.

To fix this issue, I propose that {{.flush()}} should be called on the inner writer at the end of {{apache_avro::Writer::flush}}.
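The proposed change can be illustrated without depending on the crate. The following is only a minimal sketch using the standard library: {{SketchWriter}}, {{SharedSink}}, {{flush_without_inner}} and {{visible_bytes_before_and_after_inner_flush}} are hypothetical names invented for this illustration and are not part of {{apache_avro}}, and the real {{Writer}} internals may well look different. It compares a flush that only hands its bytes to the inner writer with one that additionally calls {{std::io::Write::flush}} on it.

```rust
use std::cell::RefCell;
use std::io::{self, BufWriter, Write};
use std::rc::Rc;

/// Hypothetical in-memory sink standing in for the file, so the bytes that
/// have actually reached it can be inspected while the writer is still alive.
#[derive(Clone, Default)]
struct SharedSink(Rc<RefCell<Vec<u8>>>);

impl Write for SharedSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.0.borrow_mut().extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Hypothetical stand-in for a writer that wraps another writer.
/// This is NOT the apache_avro implementation, only an assumption-laden sketch.
struct SketchWriter<W: Write> {
    inner: W,
    pending: Vec<u8>,
}

impl<W: Write> SketchWriter<W> {
    fn new(inner: W) -> Self {
        Self { inner, pending: Vec::new() }
    }

    /// Collect bytes in the wrapper's own buffer.
    fn append(&mut self, bytes: &[u8]) {
        self.pending.extend_from_slice(bytes);
    }

    /// Behavior described in the report: pending bytes are handed to the
    /// inner writer, but the inner writer itself is never flushed.
    fn flush_without_inner(&mut self) -> io::Result<()> {
        self.inner.write_all(&self.pending)?;
        self.pending.clear();
        Ok(())
    }

    /// Proposed behavior: do the same, then flush the inner writer as well.
    fn flush(&mut self) -> io::Result<()> {
        self.flush_without_inner()?;
        self.inner.flush()
    }
}

/// Returns how many bytes reached the sink after each kind of flush.
fn visible_bytes_before_and_after_inner_flush() -> io::Result<(usize, usize)> {
    let sink = SharedSink::default();
    let mut writer = SketchWriter::new(BufWriter::new(sink.clone()));

    writer.append(b"value");

    // The bytes are still sitting in the BufWriter's buffer at this point.
    writer.flush_without_inner()?;
    let before = sink.0.borrow().len();

    // Flushing the inner BufWriter pushes them through to the sink.
    writer.flush()?;
    let after = sink.0.borrow().len();

    Ok((before, after))
}
```

In this sketch the sink stays empty after {{flush_without_inner}} and only receives the data once the inner {{BufWriter}} is flushed, which mirrors the failing assertion in the example above.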