[ https://issues.apache.org/jira/browse/PIG-2493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255669#comment-13255669 ]
Alex Rovner commented on PIG-2493:
----------------------------------
I am facing a similar issue and have a feeling this bug is not fully resolved.
I have a fairly complicated script that I cannot easily share, but the gist is
the following:
I can successfully store relations A and B.
I cannot store C, which is A UNION B.
I get the following exception:
java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:432)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:330)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POUnion.getNext(POUnion.java:165)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:271)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:266)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)
Schema:
a: {timestamp: chararray,date_time: chararray,conversion_type: chararray,channel_type: chararray,campaign_id: int,adgroup_id: int,order_id: chararray,order_sales: double,delta: long,uuid: chararray,ctid: long,advertiser_id: int,client_tid: int,conversion_provenance: chararray}
b: {timestamp: chararray,date_time: chararray,iponweb_conversions::conversion_type: chararray,channel_type: chararray,iponweb_conversions::campaign_id: int,iponweb_conversions::adgroup_id: int,order_id: chararray,order_sales: double,delta: long,iponweb_conversions::uuid: chararray,ctid: long,iponweb_conversions::advertiser_id: int,client_tid: long,conversion_provenance: chararray}
Resulting union schema:
c: {timestamp: chararray,date_time: chararray,conversion_type: chararray,channel_type: chararray,campaign_id: int,adgroup_id: int,order_id: chararray,order_sales: double,delta: long,uuid: chararray,ctid: long,advertiser_id: int,client_tid: long,conversion_provenance: chararray}
I tried replicating this issue with a simple script that mimics the two schemas
and could not reproduce it. (This makes me suspect the generated plan is at
fault.)
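Comparing the schemas above, the only declared type difference is client_tid
(int in a, long in b, long in c), so the union presumably has to widen that
field on the a side. One way to narrow the problem down, offered as an untested
sketch and not as a confirmed fix, is to make that cast explicit before the
union. The alias a_norm and the output path 'output/c' are placeholders that do
not come from the original script; a and b are assumed to already exist with
the schemas listed above.

```pig
-- Assumes relations a and b are already defined with the schemas shown above.
-- a_norm and 'output/c' are hypothetical names used only for illustration.
a_norm = FOREACH a GENERATE
    timestamp, date_time, conversion_type, channel_type,
    campaign_id, adgroup_id, order_id, order_sales, delta,
    uuid, ctid, advertiser_id,
    (long)client_tid AS client_tid,   -- widen int to long explicitly
    conversion_provenance;

-- Both inputs now declare client_tid as long, so the union itself should
-- not need to insert a cast for that field.
c = UNION a_norm, b;
DESCRIBE c;
STORE c INTO 'output/c';
```

If the ClassCastException still occurs, but now while computing a_norm, that
would suggest the value reaching the cast is a String despite the declared int
type, which points upstream of the UNION. If storing c succeeds, the implicit
cast added for the UNION becomes the more likely suspect.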
> UNION causes casting issues
> ---------------------------
>
> Key: PIG-2493
> URL: https://issues.apache.org/jira/browse/PIG-2493
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.9.1, 0.10.0
> Reporter: Anitha Raju
> Assignee: Vivek Padmanabhan
> Fix For: 0.10.0, 0.9.3, 0.11
>
> Attachments: PIG-2493-3.patch, PIG-2493.patch, PIG-2493_2.patch
>
>
> Hi,
> For the below script,
> {code}
> A = load '/user/anithar/ip' as (a);
> B = load '/user/anithar/ip1' as (a);
> C = union A , B ;
> D = foreach C generate (chararray)a;
> dump D;
> {code}
> it gives a casting error at runtime:
> {code}
> org.apache.pig.backend.executionengine.ExecException: ERROR 1075: Received a bytearray from the UDF. Cannot determine how to convert the bytearray to string.
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:660)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:322)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:267)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:262)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1082)
>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
> {code}
> It looks like "funcSpec" in POCast.java is never assigned (it stays null
> when a UNION is involved), which leaves "caster" null and thus causes the
> exception.
> The same script works in 0.8 without any issue.
> Regards,
> Anitha