Xiangrui Meng created AVRO-1357:
-----------------------------------

             Summary: Allow forcing generic records when reading input data and map output data
                 Key: AVRO-1357
                 URL: https://issues.apache.org/jira/browse/AVRO-1357
             Project: Avro
          Issue Type: New Feature
          Components: java
    Affects Versions: 1.7.4
            Reporter: Xiangrui Meng


In AvroJob/AvroInputFormat/AvroRecordReader, we can choose either 
SpecificDatumReader or ReflectDatumReader to read input data and map output 
data, but not GenericDatumReader. We may want to force reading generic records 
for some jobs.

For example, assume that the input records contain a field called "category" 
and we want to count the records in each category. If we can force reading 
generic records, we can get the category by calling get("category") on every 
record. Otherwise, the input record might be loaded as either a GenericRecord 
instance or a SpecificRecord instance, and SpecificRecord does not extend 
GenericRecord, so get("category") is not available on it.
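
A minimal sketch of the counting logic, assuming every record arrives as a 
GenericRecord. The "Event" schema, the class name and the helper methods are 
hypothetical and exist only for illustration; only the "category" field comes 
from the example above.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

class CategoryCountSketch {

    // Hypothetical one-field schema; a real job would use the input file's schema.
    static final Schema SCHEMA = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Event\",\"fields\":"
            + "[{\"name\":\"category\",\"type\":\"string\"}]}");

    // Counts records per category using only the GenericRecord interface.
    static Map<String, Long> countByCategory(Iterable<? extends GenericRecord> records) {
        Map<String, Long> counts = new TreeMap<>();
        for (GenericRecord record : records) {
            // get() returns Object; strings decoded from a file are usually Utf8,
            // so normalize to java.lang.String before using the value as a key.
            String category = String.valueOf(record.get("category"));
            counts.merge(category, 1L, Long::sum);
        }
        return counts;
    }

    // Builds an in-memory record so the sketch runs without an input file.
    static GenericRecord event(String category) {
        GenericRecord record = new GenericData.Record(SCHEMA);
        record.put("category", category);
        return record;
    }

    public static void main(String[] args) {
        System.out.println(countByCategory(
                Arrays.asList(event("books"), event("music"), event("books"))));
    }
}
```

The same loop body would not compile against a SpecificRecord, which only 
offers positional get(int) access, and that is the motivation for forcing the 
generic reader.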

To add this feature, we can change the booleans 
IS_REFLECT/MAP_OUTPUT_IS_REFLECT into enums called 
INPUT_AVRO_DESERIALIZATION_TYPE/MAP_OUTPUT_AVRO_DESERIALIZATION_TYPE, and 
return the corresponding DatumReader based on the type.
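
One way this could look, as a sketch only. The enum name, its constants and 
the key strings are assumptions, and a plain Map stands in for the job 
configuration so that the example runs without Hadoop. The reader is 
represented by its class name rather than by an instance.

```java
import java.util.HashMap;
import java.util.Map;

class DeserializationTypeSketch {

    /** Hypothetical enum replacing the reflect booleans. */
    enum DeserializationType { GENERIC, SPECIFIC, REFLECT }

    // Placeholder key strings, not the actual AvroJob constants.
    static final String INPUT_TYPE_KEY = "example.input.deserialization.type";
    static final String LEGACY_IS_REFLECT_KEY = "example.input.is.reflect";

    // Prefers the new enum setting and falls back to the legacy boolean,
    // so jobs configured the old way keep their current behavior.
    static DeserializationType resolve(Map<String, String> conf) {
        String explicit = conf.get(INPUT_TYPE_KEY);
        if (explicit != null) {
            return DeserializationType.valueOf(explicit);
        }
        boolean reflect = Boolean.parseBoolean(conf.get(LEGACY_IS_REFLECT_KEY));
        return reflect ? DeserializationType.REFLECT : DeserializationType.SPECIFIC;
    }

    // Maps each type to the DatumReader implementation it would select.
    static String readerClassName(DeserializationType type) {
        switch (type) {
            case GENERIC:
                return "org.apache.avro.generic.GenericDatumReader";
            case REFLECT:
                return "org.apache.avro.reflect.ReflectDatumReader";
            case SPECIFIC:
                return "org.apache.avro.specific.SpecificDatumReader";
            default:
                throw new IllegalArgumentException("Unknown type: " + type);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(INPUT_TYPE_KEY, "GENERIC");
        System.out.println(readerClassName(resolve(conf)));
    }
}
```

Falling back to the legacy boolean when the new setting is absent is a design 
choice of this sketch, not something the proposal specifies.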

We can add setDeserializationType, setInputDeserializationType and 
setMapOutputDeserializationType to AvroJob while deprecating setReflect, 
setInputReflect and setMapOutputReflect.
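
A rough sketch of how the new setters and the deprecated ones could relate. 
The method names follow the proposal, but the signatures, the enum and the use 
of a plain Map in place of the job configuration are assumptions made to keep 
the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;

class AvroJobSettersSketch {

    /** Hypothetical enum; the proposal does not fix its name or constants. */
    enum DeserializationType { GENERIC, SPECIFIC, REFLECT }

    // Setting names from the proposal, used here as plain map keys.
    static final String INPUT_KEY = "INPUT_AVRO_DESERIALIZATION_TYPE";
    static final String MAP_OUTPUT_KEY = "MAP_OUTPUT_AVRO_DESERIALIZATION_TYPE";

    static void setInputDeserializationType(Map<String, String> job, DeserializationType type) {
        job.put(INPUT_KEY, type.name());
    }

    static void setMapOutputDeserializationType(Map<String, String> job, DeserializationType type) {
        job.put(MAP_OUTPUT_KEY, type.name());
    }

    // Convenience setter covering both input data and map output data.
    static void setDeserializationType(Map<String, String> job, DeserializationType type) {
        setInputDeserializationType(job, type);
        setMapOutputDeserializationType(job, type);
    }

    // The old entry points stay available and delegate to the new ones.
    @Deprecated
    static void setInputReflect(Map<String, String> job) {
        setInputDeserializationType(job, DeserializationType.REFLECT);
    }

    @Deprecated
    static void setMapOutputReflect(Map<String, String> job) {
        setMapOutputDeserializationType(job, DeserializationType.REFLECT);
    }

    @Deprecated
    static void setReflect(Map<String, String> job) {
        setDeserializationType(job, DeserializationType.REFLECT);
    }

    public static void main(String[] args) {
        Map<String, String> job = new HashMap<>();
        setDeserializationType(job, DeserializationType.GENERIC);
        System.out.println(job);
    }
}
```

Keeping the deprecated methods as thin wrappers would let existing jobs 
compile unchanged while new jobs opt in to generic records.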

