I am trying to cast a byte array to a long value inside a FOREACH. I understand that in-order for byte array to be casted to long, there needs to be some sort LoadCaster available. I assumed that a standard UDF like CONCAT would have that available. Is this expected to work or fail? Appreciate any help you guys can provide.
Here is my script. *$ cat cast_bytearray_udf_cast.pig* A = load 'cast_simple.txt' using PigStorage(',') as (id:int, name:chararray, count1:bytearray, count2:bytearray); G = GROUP A BY name; B = foreach G { L = FOREACH A GENERATE CONCAT(count1, count2) as concat_count; M = FOREACH L GENERATE (long)concat_count as casted_concat_count; N = FOREACH M GENERATE casted_concat_count - 1, casted_concat_count ; GENERATE N; }; dump B; *$ cat cast_simple.txt * 1,cat,1234,134 2,cat,1342,213 3,dog,1343,331 I am getting below exception java.lang.Exception: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: N: New For Each(false,f. at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529) Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: N: New For Each(false,false)[bag]. at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:314) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:257) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNextDataBag(PhysicalOperator.java:411) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:566) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.PORelationToExprProject.getNextDataBag(PORelation) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:335) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:405) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:322) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:465) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapRedu) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:413) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:262) at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171) at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389) at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1075: Received a bytearray from the UDF or Union from two different L. at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNextLong(POCast.java:640) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:349) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:405) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:322) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:305) ... 20 more *$ pig -version* Apache Pig version 0.16.0.2.6.2.0-205 (rUnversioned directory) compiled Aug 26 2017, 09:34:39 Thanks Manoj Narayanan