Andrew,

Something looks odd in this stack trace:

> Caused by: java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.avro.generic.IndexedRecord
>         at org.apache.avro.generic.GenericData.getField(GenericData.java:525)
>         at org.apache.avro.generic.GenericData.getField(GenericData.java:540)
>         at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:103)
>         at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:65)
>         at org.apache.pig.piggybank.storage.avro.PigAvroDatumWriter.write(PigAvroDatumWriter.java:99)

PigAvroDatumWriter overrides 'GenericDatumWriter.writeRecord' in order
to extract values from a tuple.  I would therefore expect the third
frame in the trace to be PigAvroDatumWriter.writeRecord rather than
GenericDatumWriter.writeRecord.  Perhaps someone else has more insight
into why the override is not being invoked.  In the meantime, please
confirm that both PigAvroDatumWriter and GenericDatumWriter are loaded
from the right jar files. (You can do this by temporarily changing the
pig script to invoke the JVM with 'java -verbose:class' and then
running 'grep' on the output for these two classes.)
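
If changing the JVM flags is awkward, the same check can be sketched in
a few lines of Java.  This is only an illustration, not part of
piggybank or Avro: the class name WhichJar is made up, and it has to be
run with the same classpath that the failing task sees.  It prints the
jar each named class was loaded from and any 'writeRecord' methods the
class itself declares, so the two signatures can be compared by eye.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

// Hypothetical helper (not from piggybank or Avro): reports where classes come from.
class WhichJar {

    /** Returns the location a class was loaded from, or a note if none is recorded. */
    static String locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Classes loaded by the bootstrap loader typically have no code source.
            return "(bootstrap or unknown location)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Each argument is a fully qualified class name.
        for (String name : args) {
            Class<?> cls = Class.forName(name);
            System.out.println(name + " <- " + locationOf(cls));
            // List writeRecord methods declared directly on this class.
            for (Method m : cls.getDeclaredMethods()) {
                if (m.getName().equals("writeRecord")) {
                    System.out.println("  declares " + m);
                }
            }
        }
    }
}
```

For example, with the registered jars on the classpath:
java WhichJar org.apache.pig.piggybank.storage.avro.PigAvroDatumWriter org.apache.avro.generic.GenericDatumWriter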

Best,

stan

On Tue, Jan 10, 2012 at 8:03 AM, Andrew Kenworthy
<adwkenwor...@yahoo.com> wrote:
> Hi Stan,
>
> here's the full stacktrace:
>
> org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.avro.generic.IndexedRecord
>         at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:261)
>         at org.apache.pig.piggybank.storage.avro.PigAvroRecordWriter.write(PigAvroRecordWriter.java:49)
>         at org.apache.pig.piggybank.storage.avro.AvroStorage.putNext(AvroStorage.java:580)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
>         at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:530)
>         at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:238)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:646)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>         at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.avro.generic.IndexedRecord
>         at org.apache.avro.generic.GenericData.getField(GenericData.java:525)
>         at org.apache.avro.generic.GenericData.getField(GenericData.java:540)
>         at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:103)
>         at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:65)
>         at org.apache.pig.piggybank.storage.avro.PigAvroDatumWriter.write(PigAvroDatumWriter.java:99)
>         at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:57)
>         at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:255)
>         ... 18 more
>
>
> Andrew
>
>
>
>>________________________________
>> From: Stan Rosenberg <srosenb...@proclivitysystems.com>
>>To: user@pig.apache.org; Andrew Kenworthy <adwkenwor...@yahoo.com>
>>Sent: Monday, January 9, 2012 5:30 PM
>>Subject: Re: Simple AvroStorage LOAD and STORE with Avro 1.6.0
>>
>>Andrew,
>>
>>The source of the problem may be AvroStorage in piggybank.  Could you
>>please include the entire stack trace?
>>
>>stan
>>
>>On Mon, Jan 9, 2012 at 4:15 AM, Andrew Kenworthy <adwkenwor...@yahoo.com> 
>>wrote:
>>> Hallo,
>>>
>>> When I run a simple pig script to LOAD and STORE avro data, I get:-
>>>
>>> java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.avro.generic.IndexedRecord
>>>
>>>
>>> Script:
>>>
>>> REGISTER /tmp/avro-1.6.0.jar;
>>> --REGISTER /tmp/avro-1.5.4.jar
>>> --REGISTER /tmp/avro-1.4.1.jar;
>>>
>>> REGISTER /tmp/piggybank-0.9.1.jar;
>>> REGISTER /tmp/json-simple-1.1.jar;
>>> REGISTER /tmp/jackson-core-asl-1.8.4.jar;
>>> REGISTER /tmp/jackson-mapper-asl-1.8.4.jar;
>>>
>>> avroData=LOAD '$DATA_INPUTDIR' USING 
>>> org.apache.pig.piggybank.storage.avro.AvroStorage();
>>>
>>> dataSubset = FOREACH avroData GENERATE myField1, myField2;
>>> describe dataSubset;
>>> -----------------------------------------------
>>> -- shows:
>>> -- dataSubset : {myField1: int,myField2: int}
>>> -----------------------------------------------
>>> STORE dataSubset INTO '$OUTPUTDIR' USING 
>>> org.apache.pig.piggybank.storage.avro.AvroStorage();
>>>
>>> If I use the 1.5.4 jar I get the same error, but the script works with the 
>>> 1.4.1 version. If I just write one field, then it works with 1.6.0.
>>>
>>> I see there's been a related issue fixed here:
>>>
>>> https://issues.apache.org/jira/browse/PIG-2202
>>> https://issues.apache.org/jira/browse/PIG-2195
>>>
>>> Can anyone confirm that this or similar works with avro 1.6.0, and/or point
>>> me in the right direction concerning where the problem may lie?
>>>
>>> Many thanks,
>>>
>>> Andrew
>>
>>
>>
