I am facing an issue similar to the one described in
https://issues.apache.org/jira/browse/PIG-2493

I am running the latest trunk, where the fix for PIG-2493 was committed, yet
I still get the following exception when attempting to union two
relations.

Schemas:

a: {timestamp: chararray,date_time: chararray,conversion_type:
chararray,channel_type: chararray,campaign_id: int,adgroup_id:
int,order_id: chararray,order_sales: double,delta: long,uuid:
chararray,ctid: long,advertiser_id: int,client_tid:
int,conversion_provenance: chararray}

b: {timestamp: chararray,date_time:
chararray,iponweb_conversions::conversion_type: chararray,channel_type:
chararray,iponweb_conversions::campaign_id:
int,iponweb_conversions::adgroup_id: int,order_id: chararray,order_sales:
double,delta: long,iponweb_conversions::uuid: chararray,ctid:
long,iponweb_conversions::advertiser_id: int,client_tid:
long,conversion_provenance: chararray}

Resulting union schema:

c: {timestamp: chararray,date_time: chararray,conversion_type:
chararray,channel_type: chararray,campaign_id: int,adgroup_id:
int,order_id: chararray,order_sales: double,delta: long,uuid:
chararray,ctid: long,advertiser_id: int,client_tid:
long,conversion_provenance: chararray}

Relevant script portion:


describe a;
describe b;
c = UNION a, b;

describe c;

-------------------------

The job works without any issues if I store "a" and "b" without performing
the union. What's even more interesting is that if I save "a" and "b" to a
file using PigStorage, then read them back in another script and union the
data, it works as expected. To me this suggests that the plan generated for
the union is at fault rather than the data.
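
Comparing the schemas above, the only type difference I can see is client_tid (int in "a", long in "b", long in "c"), and the trace below fails in a POCast inside a POForEach under POUnion. A minimal sketch of a possible workaround, which I have not verified and which assumes client_tid is the field involved, would be to widen the field explicitly before the union. The alias a_fixed is just a name I made up for illustration:

```pig
-- Assumption: client_tid is the only field whose type differs (int in a, long in b).
-- Cast it to long explicitly in "a" so both inputs to UNION already agree on types.
a_fixed = FOREACH a GENERATE
    timestamp, date_time, conversion_type, channel_type,
    campaign_id, adgroup_id, order_id, order_sales, delta,
    uuid, ctid, advertiser_id,
    (long) client_tid AS client_tid,
    conversion_provenance;

c = UNION a_fixed, b;
describe c;
```

If the exception persists with matching types on both sides, that would point further toward the plan rather than the schema merge.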

Exception:

java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:432)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:330)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POUnion.getNext(POUnion.java:165)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:271)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:266)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)

I cannot attach the script since it contains proprietary UDFs, etc.

I might be able to share it with an individual developer.

Thanks
Alex Rovner
