[ 
https://issues.apache.org/jira/browse/AVRO-1456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jamie Olson updated AVRO-1456:
------------------------------

    Description: 
org.apache.avro.mapred.AvroAsTextInputFormat relies on the toString() method 
rather than using org.apache.avro.generic.GenericDatumWriter.write() with 
org.apache.avro.io.JsonEncoder, as org.apache.avro.tool.DataFileReadTool does.  
As a result, the data element is serialized without the fully qualified type 
name that the JSON Encoding section of the Avro Specification requires: 
http://avro.apache.org/docs/1.7.6/spec.html#json_encoding


The specification indicates that for a union type: \["null","string","Foo"\], 
data should be serialized with:

* null as null;
* the string "a" as {"string": "a"}; and
* a Foo instance as {"Foo": {...}}, where {...} indicates the JSON encoding of 
a Foo instance.

Instead, AvroAsTextInputFormat is serializing these values as

* null as null;
* the string "a" as "a"; and
* a Foo instance as {...}, where {...} indicates the JSON encoding of a Foo 
instance.


  was:
org.apache.avro.mapred.AvroAsTextInputFormat relies on the toString() method 
rather than using org.apache.avro.generic.GenericDatumWriter.write() and 
org.apache.avro.io.JsonEncoder as in org.apache.avro.tool.DataFileReadTool.  
This results in a serialization of the data element, without the fully 
qualified name as specified in the Avro Specifications JSON Encoding section: 
http://avro.apache.org/docs/1.7.6/spec.html#json_encoding


The specification indicates that for a union type: ["null","string","Foo"], 
data should be serialized with:
* null as null;
* the string "a" as {"string": "a"}; and
* a Foo instance as {"Foo": {...}}, where {...} indicates the JSON encoding of 
a Foo instance.

Instead, AvroAsTextInputFormat is serializing these values as
* null as null;
* the string "a" as "a"; and
* a Foo instance as {...}, where {...} indicates the JSON encoding of a Foo 
instance.



> AvroAsTextInputFormat is inconsistent with the Avro JSON Encoding described 
> in the Avro Specification
> -----------------------------------------------------------------------------------------------------
>
>                 Key: AVRO-1456
>                 URL: https://issues.apache.org/jira/browse/AVRO-1456
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.7.6
>            Reporter: Jamie Olson
>
> org.apache.avro.mapred.AvroAsTextInputFormat relies on the toString() method 
> rather than using org.apache.avro.generic.GenericDatumWriter.write() with 
> org.apache.avro.io.JsonEncoder, as org.apache.avro.tool.DataFileReadTool does.  
> As a result, the data element is serialized without the fully qualified type 
> name that the JSON Encoding section of the Avro Specification requires: 
> http://avro.apache.org/docs/1.7.6/spec.html#json_encoding
> The specification indicates that for a union type: \["null","string","Foo"\], 
> data should be serialized with:
> * null as null;
> * the string "a" as {"string": "a"}; and
> * a Foo instance as {"Foo": {...}}, where {...} indicates the JSON encoding 
> of a Foo instance.
> Instead, AvroAsTextInputFormat is serializing these values as
> * null as null;
> * the string "a" as "a"; and
> * a Foo instance as {...}, where {...} indicates the JSON encoding of a Foo 
> instance.
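
The difference between the two encodings described in the issue can be 
illustrated with a small hand-written sketch. It does not call the Avro 
library; specEncoding and reportedEncoding are hypothetical helper names, and 
the Foo JSON is a made-up stand-in for the {...} that the issue leaves elided.

```java
public class UnionJsonEncodingSketch {

    // Encoding the issue quotes from the specification: null stays null,
    // every other branch is wrapped as {"<type name>": <value JSON>}.
    static String specEncoding(String typeName, String valueJson) {
        if ("null".equals(typeName)) {
            return "null";
        }
        return "{\"" + typeName + "\": " + valueJson + "}";
    }

    // Encoding the issue reports from AvroAsTextInputFormat: the value's own
    // JSON with no type-name wrapper.
    static String reportedEncoding(String typeName, String valueJson) {
        return "null".equals(typeName) ? "null" : valueJson;
    }

    public static void main(String[] args) {
        // Hypothetical JSON for a Foo record; the issue elides it as {...}.
        String fooJson = "{\"id\": 1}";
        String[][] branches = {
            {"null", "null"},
            {"string", "\"a\""},
            {"Foo", fooJson},
        };
        // Print both forms side by side for each branch of the union.
        for (String[] b : branches) {
            System.out.println(b[0] + ": spec=" + specEncoding(b[0], b[1])
                    + " reported=" + reportedEncoding(b[0], b[1]));
        }
    }
}
```

Only the null branch is identical in both forms, so a consumer expecting the 
specification's wrapped form would presumably fail on the string and Foo 
branches of the reported output.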



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
