Jane Lewis created AVRO-4063:
--------------------------------

             Summary: `apache_avro::Writer::flush` does not call 
`std::io::Write::flush` on the inner writer
                 Key: AVRO-4063
                 URL: https://issues.apache.org/jira/browse/AVRO-4063
             Project: Apache Avro
          Issue Type: Bug
          Components: rust
            Reporter: Jane Lewis


The Rust documentation for {{apache_avro::Writer::flush}} describes the 
function as follows:
{quote}Flush the content appended to a {{Writer}}. Call this function to 
make sure all the content has been written before releasing the {{Writer}}.
{quote}
However, this function does not actually guarantee that all the content has 
been written out when {{flush()}} returns, because it does not call 
{{std::io::Write::flush}} on the inner {{writer}}.

This can be a problem when the inner writer uses its own buffer. Here's an 
example of how this can lead to misleading behavior:

{code:java}
fn main() {
    let buffered_writer =
        std::io::BufWriter::new(std::fs::File::create("test.avro").unwrap());
    let schema = apache_avro::Schema::parse_str(
        r#"
    {
        "type": "record",
        "name": "example_schema",
        "fields": [
            {"name": "example_field", "type": "string"}
        ]
    }
"#,
    )
    .unwrap();
    let mut writer = apache_avro::Writer::new(&schema, buffered_writer);
    let mut record = apache_avro::types::Record::new(writer.schema()).unwrap();
    record.put("example_field", "value");
    writer.append(record).unwrap();
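    // Hands the Avro data to the inner BufWriter, but does not flush the BufWriter itself.
    writer.flush().unwrap();
    // Read the file back while `writer` (and its BufWriter) is still alive.
    let test_file_contents = std::fs::read("test.avro").unwrap();
    assert_ne!(test_file_contents.len(), 0); // this will fail: the file is still empty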
}
{code}
In this example, the inner {{BufWriter}} has not yet written out its own 
buffer when {{writer.flush().unwrap()}} returns. In fact, the buffer is 
only written out once {{writer}} is dropped.
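
The buffering behavior can be reproduced with the standard library alone. The sketch below is only an illustration and does not use {{apache_avro}}: the helper name {{sink_len_before_and_after_flush}} is made up for this example, and a {{Vec<u8>}} stands in for the file. It assumes the payload is smaller than the default {{BufWriter}} capacity, so the bytes stay in the buffer until {{flush()}} is called.

```rust
use std::io::{BufWriter, Write};

/// Writes `payload` through a `BufWriter` and reports how many bytes have
/// reached the underlying sink before and after an explicit `flush()`.
/// (Hypothetical helper, for illustration only.)
fn sink_len_before_and_after_flush(payload: &[u8]) -> (usize, usize) {
    // A Vec<u8> plays the role of the file so the sink can be inspected.
    let mut buffered = BufWriter::new(Vec::new());
    buffered.write_all(payload).unwrap();

    // The small payload is still held in the BufWriter's internal buffer.
    let before = buffered.get_ref().len();

    // Only an explicit flush pushes the buffered bytes into the sink.
    buffered.flush().unwrap();
    let after = buffered.get_ref().len();

    (before, after)
}
```

Calling {{sink_len_before_and_after_flush(b"value")}} shows the sink empty before the flush and holding all five bytes afterwards, which mirrors what happens to {{test.avro}} in the report.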

To fix this issue, I propose that {{.flush()}} should be called on the inner 
writer at the end of {{apache_avro::Writer::flush}}.
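
A rough sketch of the shape of this change is below. It is not the {{apache_avro}} implementation: {{BlockWriter}}, its fields, and {{flushed_len}} are hypothetical stand-ins that only model a writer which stages bytes in its own block and owns an inner {{std::io::Write}}. The one line that matters is the call to {{self.inner.flush()}} at the end of {{flush}}.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical stand-in for a block-oriented writer that owns an inner
/// `std::io::Write`. It is not the real `apache_avro::Writer`.
struct BlockWriter<W: Write> {
    inner: W,
    block: Vec<u8>,
}

impl<W: Write> BlockWriter<W> {
    fn new(inner: W) -> Self {
        Self { inner, block: Vec::new() }
    }

    /// Stages bytes in the writer's own block buffer.
    fn append(&mut self, bytes: &[u8]) {
        self.block.extend_from_slice(bytes);
    }

    /// Writes the staged block to the inner writer and then flushes the
    /// inner writer as well. Returns the number of bytes written.
    fn flush(&mut self) -> io::Result<usize> {
        let written = self.block.len();
        self.inner.write_all(&self.block)?;
        self.block.clear();
        // Proposed extra step: propagate the flush to the inner writer.
        self.inner.flush()?;
        Ok(written)
    }

    fn get_ref(&self) -> &W {
        &self.inner
    }
}

/// Returns how many bytes have reached the sink behind a `BufWriter`
/// once `BlockWriter::flush` has returned.
fn flushed_len(payload: &[u8]) -> usize {
    let mut writer = BlockWriter::new(BufWriter::new(Vec::new()));
    writer.append(payload);
    writer.flush().unwrap();
    // First get_ref: the BufWriter; second get_ref: the Vec<u8> sink.
    writer.get_ref().get_ref().len()
}
```

With the inner flush in place, {{flushed_len(b"value")}} reports all five bytes in the sink as soon as {{flush}} returns. Without that line, the bytes would stay in the {{BufWriter}} and the sink would still be empty.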



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
