[ 
https://issues.apache.org/jira/browse/PIG-3297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13654700#comment-13654700
 ] 

Michael Moss commented on PIG-3297:
-----------------------------------

Niels, I've run into this also (and a similar issue with Hive), and it seems 
that it might be caused not by the code you patched, but by the 
avro-1.x.y.jar files themselves.

We are serializing strings as avro.java.string and everything was working fine 
on our HDP1.2 (Hortonworks) cluster, but when I upgraded the avro jar that pig 
uses from avro-1.5.3 to avro-1.7.4, I started getting this exception.

I'm also having this issue on the latest version of CDH4.2 (with Impala1.0) in 
both pig and hive, and the culprit there seems to be the avro-1.7.x.jar that 
they use.

I'm just starting to dig into finding out why, but was hoping you or someone 
here might have some insight.

Thanks.
                
> Avro files with stringType set to String cannot be read by the AvroStorage 
> LoadFunc
> -----------------------------------------------------------------------------------
>
>                 Key: PIG-3297
>                 URL: https://issues.apache.org/jira/browse/PIG-3297
>             Project: Pig
>          Issue Type: Bug
>          Components: piggybank
>    Affects Versions: 0.11.1
>            Reporter: Niels Basjes
>         Attachments: PIG-3297-1.patch, test_record.avro
>
>
> When an Avro file is created there exists the option to set the "String Type" 
> to a different class than the default Utf8.
> A very common situation is that the "String Type" is set to the standard 
> java.lang.String class.
> When trying to read such an Avro file in Pig using the AvroStorage LoadFunc 
> from the included piggybank this gives the following Exception:
> Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
> org.apache.avro.util.Utf8
>         at 
> org.apache.pig.piggybank.storage.avro.PigAvroDatumReader.readString(PigAvroDatumReader.java:154)
>         at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:150)
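
To illustrate the failure mode in the stack trace above, here is a minimal sketch of reading a string datum without assuming its concrete class. It is not the code from PIG-3297-1.patch and not the actual PigAvroDatumReader; the class and method names are hypothetical. It relies only on the fact that both java.lang.String and org.apache.avro.util.Utf8 implement CharSequence, and it uses a StringBuilder as a stand-in for Utf8 so that it runs without the Avro jar.

```java
/**
 * Hypothetical helper, not part of Pig or Avro.
 *
 * A schema written with the string type set to String carries the property
 *   {"type": "string", "avro.java.string": "String"}
 * and a reader that honors it hands back java.lang.String instead of Utf8.
 */
final class StringDatumSketch {
    private StringDatumSketch() {}

    /**
     * Converts a decoded string datum to java.lang.String.
     * A hard cast such as (Utf8) datum throws ClassCastException when the
     * datum is already a java.lang.String; going through CharSequence
     * accepts both representations.
     */
    static String asJavaString(Object datum) {
        if (datum == null) {
            return null;
        }
        if (datum instanceof CharSequence) {
            return ((CharSequence) datum).toString();
        }
        throw new IllegalArgumentException(
                "Not a string datum: " + datum.getClass().getName());
    }

    public static void main(String[] args) {
        Object asString = "hello";                    // stringType = String
        Object asOther = new StringBuilder("hello");  // stand-in for Utf8 (assumption)
        System.out.println(asJavaString(asString));
        System.out.println(asJavaString(asOther));
    }
}
```

Whether a fix like this belongs in the piggybank reader or in how the Avro jar decodes strings is exactly the open question in this thread; the sketch only shows why a direct cast to Utf8 is fragile.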

