[ https://issues.apache.org/jira/browse/AVRO-295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831568#action_12831568 ]

Doug Cutting commented on AVRO-295:
-----------------------------------

I am not convinced that we should automatically flush after each object is 
serialized, as this may adversely affect performance when writing many small 
objects.  Mightn't it be better to add a flush() method to DatumWriter and 
leave control of when things are flushed to the application?
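
For illustration, a minimal sketch of that pattern using the existing 
JsonEncoder.flush() (records below is a placeholder collection; schm and out 
are as in the example code from the issue): the application batches many 
small writes and flushes once at the end, rather than paying a flush per 
object.

{code}
ReflectDatumWriter writer = new ReflectDatumWriter(schm);
JsonEncoder json = new JsonEncoder(schm, out);
for (A record : records) {
  writer.write(record, json);  // many small objects, no per-object flush
}
json.flush();  // the application, not the writer, decides when to flush
{code}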

> JsonEncoder is not flushed after writing using ReflectDatumWriter
> -----------------------------------------------------------------
>
>                 Key: AVRO-295
>                 URL: https://issues.apache.org/jira/browse/AVRO-295
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.3.0
>            Reporter: Jonathan Hsieh
>            Assignee: Thiruvalluvan M. G.
>         Attachments: AVRO-295-test.patch, AVRO-295.patch
>
>
> JsonEncoder needs to be flushed, otherwise data may be left in its buffers.  
> Ideally the behavior should be the same regardless of what kind of Encoder is 
> passed in. Here is some example code: 
> {code}
> import java.io.ByteArrayOutputStream;
> import java.io.IOException;
>
> import org.apache.avro.Schema;
> import org.apache.avro.io.BinaryEncoder;
> import org.apache.avro.io.Encoder;
> import org.apache.avro.io.JsonEncoder;
> import org.apache.avro.reflect.ReflectData;
> import org.apache.avro.reflect.ReflectDatumWriter;
>
> class A {
>   long timestamp;
> }
>   public void testEventSchemaSerializeBinary() throws IOException {
>     A e = new A();
>     e.timestamp = 1234;
>     ReflectData reflectData = ReflectData.get();
>     Schema schm = reflectData.getSchema(A.class);
>     System.out.println(schm);
>     ReflectDatumWriter writer = new ReflectDatumWriter(schm);
>     ByteArrayOutputStream out = new ByteArrayOutputStream();
>     Encoder binary = new BinaryEncoder(out);
>     writer.write(e, binary); // only one call
>     byte[] bs = out.toByteArray();
>     int len = bs.length; // length is 2, which is reasonable.
>     System.out.println("output size: " + len);
>   }
> public void testSerializeJson() throws IOException {
>     A a = new A();
>     a.timestamp = 1234;
>     ReflectData reflectData = ReflectData.get();
>     Schema schm = reflectData.getSchema(A.class);
>     ReflectDatumWriter writer = new ReflectDatumWriter(schm);
>     ByteArrayOutputStream out = new ByteArrayOutputStream();
>     JsonEncoder json = new JsonEncoder(schm, out);
>     writer.write(a, json); // only one call
>     // did not flush
>     byte[] bs = out.toByteArray();
>     int len = bs.length; // len == 0;  this is unexpected!
>     System.out.println("output size: " + len); 
>  
>     // flushed this time. this is a bit unwieldy
>     json.flush(); 
>     bs = out.toByteArray();
>     len = bs.length; // len == 18; this is better!
>     System.out.println("output size: " + len);
> }
> {code}
> One way to deal with this is to give all Encoders a flush() method, so that 
> a DatumWriter can always flush whatever Encoder it is given, and potentially 
> to add a flush() method to DatumWriter as well. 
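>
> As a rough sketch of that shape (hypothetical, not the current API): if 
> flush() were declared on the Encoder base class, every concrete encoder 
> would have to provide it, and callers could flush without knowing the 
> concrete type.
> {code}
> public abstract class Encoder {
>   // hypothetical: declaring flush() on the base class would let callers
>   // flush any Encoder, binary or JSON, without knowing the concrete type
>   public abstract void flush() throws IOException;
>   // ... existing writeBoolean(), writeLong(), etc. ...
> }
> {code}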
