[ 
https://issues.apache.org/jira/browse/AVRO-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sachin Goyal updated AVRO-1554:
-------------------------------

    Attachment: AVRO-1554_2.patch

{quote}
That works if the values are not null, but if they're null it fails.

The bug is with AllowNull, since it changes the schema to be a union, which has 
a different encoding. Custom encodings are associated with the field's type and 
know nothing of the union that AllowNull has inserted. So allowNull() should 
perhaps override getFieldAccessor() and wrap the value of 
super.getFieldAccessor() with an implementation that handles unions with null.
{quote}
Great catch! I have fixed this in the new patch.
However, I could not override getFieldAccessor() because the corresponding 
*Field* object is not available at that point. Instead, I added the methods 
ReflectDatumReader#readFieldWithAccessor() and 
ReflectDatumWriter#writeFieldWithAccessor(). Please let me know what you think 
of this approach.
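
To make the union issue concrete, here is a rough sketch in plain Java of what wrapping a custom encoding for a nullable field involves. It is not the code in the patch and does not use the Avro API: ValueEncoder, NullableEncoder, AvroBytes and UuidAsString are hypothetical names made up for illustration, and the sketch assumes the union is ["null", T] with null as branch 0. It relies only on the Avro binary encoding rules: a union is written as the branch index (a zig-zag varint) followed by the value, and null has no body.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical stand-in for a custom encoding; not an Avro class.
interface ValueEncoder<T> {
    void write(T value, ByteArrayOutputStream out);
}

// Wraps an encoder so that the union branch index is written before the value.
// Assumption: the union is ["null", T], so null is branch 0 and T is branch 1.
final class NullableEncoder<T> implements ValueEncoder<T> {
    private final ValueEncoder<T> delegate;

    NullableEncoder(ValueEncoder<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void write(T value, ByteArrayOutputStream out) {
        if (value == null) {
            AvroBytes.writeLong(0, out);   // branch 0: null, which has no body
        } else {
            AvroBytes.writeLong(1, out);   // branch 1: the wrapped type
            delegate.write(value, out);
        }
    }
}

// Minimal helpers for the two Avro primitives used in this sketch.
final class AvroBytes {
    private AvroBytes() {}

    // Variable-length zig-zag encoding, as used for Avro int/long values.
    static void writeLong(long n, ByteArrayOutputStream out) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    // Avro string: byte length as a long, followed by the UTF-8 bytes.
    static void writeString(String s, ByteArrayOutputStream out) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(utf8.length, out);
        out.write(utf8, 0, utf8.length);
    }
}

// Example custom encoding: a UUID written as its string form.
final class UuidAsString implements ValueEncoder<UUID> {
    @Override
    public void write(UUID value, ByteArrayOutputStream out) {
        AvroBytes.writeString(value.toString(), out);
    }
}
```

With this layout, new NullableEncoder<>(new UuidAsString()) writes a single 0x00 byte for a null field, and 0x02 followed by the string encoding for a non-null one. A reader would need the mirror-image wrapper, which is why the fix touches both the reader and the writer.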

\\
\\
{quote}
Also, if the DatumWriter is passed an AllowNull, shouldn't the DatumReader be 
passed one too?
And, yes, I don't think we should add AvroConfiguration in this patch.
{quote}
AvroConfiguration has been removed.

\\
\\
I also parameterized the test so that it runs with both 
ReflectData.AllowNull and plain ReflectData.

> Avro should have support for common constructs like UUID and Date
> -----------------------------------------------------------------
>
>                 Key: AVRO-1554
>                 URL: https://issues.apache.org/jira/browse/AVRO-1554
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.7.6
>            Reporter: Sachin Goyal
>         Attachments: AVRO-1554.patch, AVRO-1554_2.patch, 
> CustomEncodingUnionBug.zip
>
>
> Consider the following code:
> {code}
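> import java.io.ByteArrayOutputStream;
> import java.math.BigInteger;
> import java.util.Date;
> import java.util.UUID;
> import org.apache.avro.Schema;
> import org.apache.avro.file.DataFileReader;
> import org.apache.avro.file.DataFileWriter;
> import org.apache.avro.file.SeekableByteArrayInput;
> import org.apache.avro.generic.GenericDatumReader;
> import org.apache.avro.generic.GenericRecord;
> import org.apache.avro.reflect.ReflectData;
> import org.apache.avro.reflect.ReflectDatumWriter;
>
> public class AvroExample
> {
>     public static void main (String [] args) throws Exception
>     {
>         // Derive the schema by reflection, making every field nullable.
>         ReflectData rdata = ReflectData.AllowNull.get();
>         Schema schema = rdata.getSchema(Temp.class);
>
>         // Write one Temp instance to an in-memory Avro data file.
>         ReflectDatumWriter<Temp> datumWriter = 
>                new ReflectDatumWriter<Temp> (Temp.class, rdata);
>         DataFileWriter<Temp> fileWriter = 
>                new DataFileWriter<Temp> (datumWriter);
>         ByteArrayOutputStream baos = new ByteArrayOutputStream();
>         fileWriter.create(schema, baos);
>         fileWriter.append(new Temp());
>         fileWriter.close();
>         byte[] bytes = baos.toByteArray();
>
>         // Read the record back generically and print it.
>         GenericDatumReader<GenericRecord> datumReader = 
>                 new GenericDatumReader<GenericRecord> ();
>         SeekableByteArrayInput avroInputStream = 
>                 new SeekableByteArrayInput(bytes);
>         DataFileReader<GenericRecord> fileReader = 
>                 new DataFileReader<GenericRecord>(avroInputStream, datumReader);
>         schema = fileReader.getSchema();
>         GenericRecord record = null;
>         record = fileReader.next(record);
>         System.out.println (record);
>         System.out.println (record.get("id"));
>     }
> }
> class Temp
> {
>     UUID id = UUID.randomUUID();
>     Date date = new Date();
>     BigInteger bi = BigInteger.TEN;
> }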
> {code}
> Output from this code is:
> {code:javascript}
> {"id": {}, "date": {}, "bi": "10"}
> {code}
> UUID and Date fields are very common in Java and appear frequently in 
> third-party code as well (where adding annotations may be difficult).
> So Avro should include default serialization/deserialization support for 
> such fields.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
