I am trying to cast a byte array to a long value inside a FOREACH. I
understand that for a byte array to be cast to long, some sort of
LoadCaster needs to be available. I assumed that a standard UDF like
CONCAT would have one available. Is this expected to work or fail?
I would appreciate any help you can provide.

Here is my script.

*$ cat cast_bytearray_udf_cast.pig*

A = load 'cast_simple.txt' using PigStorage(',') as (id:int, name:chararray, count1:bytearray, count2:bytearray);

G = GROUP A BY name;

B = foreach G {
        L = FOREACH A GENERATE CONCAT(count1, count2) as concat_count;
        M = FOREACH L GENERATE (long)concat_count as casted_concat_count;
        N = FOREACH M GENERATE casted_concat_count - 1, casted_concat_count;
        GENERATE N;
};

dump B;

*$ cat cast_simple.txt*

1,cat,1234,134
2,cat,1342,213
3,dog,1343,331


I am getting the exception below:


java.lang.Exception: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: N: New For Each(false,f.
        at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: N: New For Each(false,false)[bag].
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:314)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:257)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNextDataBag(PhysicalOperator.java:411)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:566)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.PORelationToExprProject.getNextDataBag(PORelation)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:335)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:405)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:322)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:465)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapRedu)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:413)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:262)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
        at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1075: Received a bytearray from the UDF or Union from two different L.
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNextLong(POCast.java:640)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:349)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:405)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:322)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:305)
        ... 20 more

*$ pig -version*

Apache Pig version 0.16.0.2.6.2.0-205 (rUnversioned directory)

compiled Aug 26 2017, 09:34:39
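
For comparison, here is a minimal sketch of a variant that I would expect to avoid casting a UDF-produced bytearray. It assumes the failure comes from CONCAT returning a bytearray that has no load function behind it, so it declares count1 and count2 as chararray at load time and lets CONCAT return a chararray, which can then be cast to long. It is a sketch under that assumption rather than a confirmed fix.

```pig
-- Assumption: reading the counts as chararray keeps CONCAT's output typed,
-- so the later (long) cast does not need a LoadCaster.
A = load 'cast_simple.txt' using PigStorage(',') as (id:int, name:chararray, count1:chararray, count2:chararray);

G = GROUP A BY name;

B = foreach G {
        L = FOREACH A GENERATE CONCAT(count1, count2) as concat_count;
        M = FOREACH L GENERATE (long)concat_count as casted_concat_count;
        N = FOREACH M GENERATE casted_concat_count - 1, casted_concat_count;
        GENERATE N;
};

dump B;
```

The only change from my script is the declared type of count1 and count2 in the load schema; the nested FOREACH is the same.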



Thanks

Manoj Narayanan
