[ 
https://issues.apache.org/jira/browse/MRUNIT-197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Dalsass updated MRUNIT-197:
-----------------------------------
    Attachment: fix_avro_serialization.patch

Hi,

There's currently another bug when using Avro objects with MRUnit, which 
prevents them from being used at all. The problem is in the Serialization 
class: Avro internally places a proxy (an encoder) in front of the 
outputBuffer, and only writes to the buffer when the encoder is closed.

This patch closes the serializer before the outputBuffer is read again, 
which fixes the problem for Avro without, I think, affecting other 
serializations. (I checked with the classic Hadoop serialization.)
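
To illustrate the buffering behaviour described above, here is a minimal 
sketch that uses only JDK classes. It is not the Avro or MRUnit code: 
BufferedOutputStream stands in for Avro's encoder as an assumption for 
illustration, and the class and method names (BufferedCopyDemo, 
visibleBeforeClose, visibleAfterClose) are hypothetical.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class BufferedCopyDemo {

    // Writes the payload through a buffering wrapper and reports how many
    // bytes have reached the backing buffer while the wrapper is still open.
    public static int visibleBeforeClose(byte[] payload) throws IOException {
        ByteArrayOutputStream outputBuffer = new ByteArrayOutputStream();
        // Stand-in for the encoder: it holds bytes until flushed or closed.
        OutputStream encoder = new BufferedOutputStream(outputBuffer, 8192);
        encoder.write(payload);
        return outputBuffer.size();
    }

    // Same write, but the wrapper is closed before the backing buffer is read.
    public static int visibleAfterClose(byte[] payload) throws IOException {
        ByteArrayOutputStream outputBuffer = new ByteArrayOutputStream();
        OutputStream encoder = new BufferedOutputStream(outputBuffer, 8192);
        encoder.write(payload);
        encoder.close(); // closing flushes the pending bytes to outputBuffer
        return outputBuffer.size();
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = {1, 2, 3, 4};
        System.out.println("before close: " + visibleBeforeClose(payload));
        System.out.println("after close:  " + visibleAfterClose(payload));
    }
}
```

In this sketch, reading the backing buffer before the wrapper is closed 
sees none of the written bytes, which is the same shape of problem the 
patch addresses by closing the serializer first.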

> Problems using Avro with MRUnit
> -------------------------------
>
>                 Key: MRUNIT-197
>                 URL: https://issues.apache.org/jira/browse/MRUNIT-197
>             Project: MRUnit
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: Matthew Hayes
>         Attachments: MemberEventCountUnitTest.java, 
> fix_avro_serialization.patch
>
>
> I'm not able to use MRUnit with Avro in a particular use case.  See the 
> exception below.  I've attached a sample test that demonstrates the problem.
> When the input is just a plain integer it works fine.  However if the input 
> is a record that contains an integer it doesn't work.  I stepped through the 
> code with a debugger to try understanding what is going on.  In the 
> Serialization class's copy method, the serializer it gets on this line is 
> wrong:
> serializer = (Serializer<Object>) serializationFactory
>           .getSerializer(clazz);
> When I look at the schema within this object it is "int" instead of the 
> record's schema.
> {noformat}
> java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record cannot be cast to java.lang.Number
>       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:78)
>       at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
>       at org.apache.avro.hadoop.io.AvroSerializer.serialize(AvroSerializer.java:104)
>       at org.apache.avro.hadoop.io.AvroSerializer.serialize(AvroSerializer.java:46)
>       at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:74)
>       at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:91)
>       at org.apache.hadoop.mrunit.internal.io.Serialization.copyWithConf(Serialization.java:104)
>       at org.apache.hadoop.mrunit.TestDriver.copy(TestDriver.java:608)
>       at org.apache.hadoop.mrunit.TestDriver.copyPair(TestDriver.java:612)
>       at org.apache.hadoop.mrunit.MapDriverBase.addInput(MapDriverBase.java:118)
>       at org.apache.hadoop.mrunit.MapDriverBase.withInput(MapDriverBase.java:207)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
