[ 
https://issues.apache.org/jira/browse/PIG-2493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255669#comment-13255669
 ] 

Alex Rovner commented on PIG-2493:
----------------------------------

I am facing a similar issue, and I have a feeling this bug is not fully 
resolved. I have a fairly complicated script that I cannot easily share. The 
main point, though, is the following: 

I can successfully store relations A and B.
I cannot store C, which is A UNION B.

Getting the following exception:

java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:432)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:330)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POUnion.getNext(POUnion.java:165)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:271)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:266)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)


Schema:
a: {timestamp: chararray,date_time: chararray,conversion_type: chararray,
channel_type: chararray,campaign_id: int,adgroup_id: int,order_id: chararray,
order_sales: double,delta: long,uuid: chararray,ctid: long,
advertiser_id: int,client_tid: int,conversion_provenance: chararray}

b: {timestamp: chararray,date_time: chararray,
iponweb_conversions::conversion_type: chararray,channel_type: chararray,
iponweb_conversions::campaign_id: int,iponweb_conversions::adgroup_id: int,
order_id: chararray,order_sales: double,delta: long,
iponweb_conversions::uuid: chararray,ctid: long,
iponweb_conversions::advertiser_id: int,client_tid: long,
conversion_provenance: chararray}

Resulting union schema:

c: {timestamp: chararray,date_time: chararray,conversion_type: chararray,
channel_type: chararray,campaign_id: int,adgroup_id: int,order_id: chararray,
order_sales: double,delta: long,uuid: chararray,ctid: long,
advertiser_id: int,client_tid: long,conversion_provenance: chararray}


I tried to reproduce this with a simple script that mimics the two schemas, 
but could not. (This makes me believe the plan is at fault?)
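
The only type difference between the two schemas above is client_tid (int in 
a, long in b), so the cast that UNION adds is presumably on that field. A 
possible way to narrow this down, sketched under that assumption and not yet 
verified against the real script, is to widen client_tid by hand before the 
union. The alias a_cast is made up for illustration:

{code}
-- Hypothetical diagnostic: do the int -> long widening explicitly so that
-- UNION sees identical types on both sides.
a_cast = FOREACH a GENERATE
    timestamp, date_time, conversion_type, channel_type,
    campaign_id, adgroup_id, order_id, order_sales, delta,
    uuid, ctid, advertiser_id,
    (long) client_tid AS client_tid,
    conversion_provenance;
c = UNION a_cast, b;
DESCRIBE c;
{code}

If the same ClassCastException then shows up in this FOREACH instead of under 
POUnion, client_tid in a probably holds a String at runtime despite being 
declared int, and the union only exposes the problem rather than causing it.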

                
> UNION causes casting issues
> ---------------------------
>
>                 Key: PIG-2493
>                 URL: https://issues.apache.org/jira/browse/PIG-2493
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.1, 0.10.0
>            Reporter: Anitha Raju
>            Assignee: Vivek Padmanabhan
>             Fix For: 0.10.0, 0.9.3, 0.11
>
>         Attachments: PIG-2493-3.patch, PIG-2493.patch, PIG-2493_2.patch
>
>
> Hi,
> For the below script,
> {code}
> A = load '/user/anithar/ip' as (a);
> B = load '/user/anithar/ip1' as (a);
> C = union  A , B ;
> D = foreach C generate (chararray)a;
> dump D;
> {code}
> it gives a casting error at runtime
> {code}
> org.apache.pig.backend.executionengine.ExecException: ERROR 1075: Received a bytearray from the UDF. Cannot determine how to convert the bytearray to string.
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:660)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:322)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:267)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:262)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1082)
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> {code}
> It looks like in POCast.java "funcSpec" is never assigned a value (it stays 
> null when a UNION is involved), which causes "caster" to be null and thus 
> the exception.
> The same works in 0.8 without any issue.
> Regards,
> Anitha
