The way to debug this is to dump the relations named by the aliases leading up to the failure. Try to dump rec_count; try to dump group_data; and so on. You will find one that works, and that identifies the first alias definition that causes the failure.
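Concretely, walking backwards through the aliases of the script below might look like this (a sketch; the alias names are the ones from the original script):

```pig
-- Dump each intermediate alias in turn; the first one that fails to
-- dump points at the definition that introduced the problem.
dump content;      -- works: the load itself is fine
dump data;         -- works: the projection is fine
dump group_data;   -- works: the GROUP BY is fine
dump rec_count;    -- fails: so the problem is in FOREACH ... COUNT($0)
```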
I guess that the problem is in your definition of rec_count. You are asking for COUNT($0), but $0 in that case refers to the 'group' field, which is a scalar, not a bag. Try COUNT($1). Does that work?

William F Dowling
Senior Technologist
Thomson Reuters
0 +1 215 823 3853

-----Original Message-----
From: rakesh sharma [mailto:rakeshsharm...@hotmail.com]
Sent: Tuesday, September 01, 2015 2:39 AM
To: user@pig.apache.org
Subject: Getting this exception in pig -- read airline data

content = load '/home/19659/testData.csv' using PigStorage(',');
data = foreach content generate $1 as id, $2 as type;
group_data = group data by type;
rec_count = foreach group_data generate COUNT($0);
limit_rec_count = limit rec_count 1;
dump limit_rec_count;

The above is my Pig script. I only get the error below when I try to dump the data; otherwise there is no error.

java.lang.Exception: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [POCast (Name: Cast[bag:{}] - scope-12 Operator Key: scope-12) children: [[POProject (Name: Project[bytearray][0] - scope-11 Operator Key: scope-11) children: null at []]] at []]: java.lang.NullPointerException
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [POCast (Name: Cast[bag:{}] - scope-12 Operator Key: scope-12) children: [[POProject (Name: Project[bytearray][0] - scope-11 Operator Key: scope-11) children: null at []]] at []]: java.lang.NullPointerException
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:366)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.processInput(POUserFunc.java:216)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:270)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextLong(POUserFunc.java:407)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:351)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:383)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:303)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:474)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:442)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:422)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:269)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.pig.builtin.Utf8StorageConverter.consumeBag(Utf8StorageConverter.java:80)
    at org.apache.pig.builtin.Utf8StorageConverter.bytesToBag(Utf8StorageConverter.java:335)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNextDataBag(POCast.java:1861)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:337)
    ... 19 more

2015-09-01 12:01:45,920 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local1740385966_0002
2015-09-01 12:01:45,920 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases rec_count
2015-09-01 12:01:45,920 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M:  C:  R: rec_count[5,12]
2015-09-01 12:01:45,922 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2015-09-01 12:01:45,923 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local1740385966_0002 has failed! Stop running all dependent jobs
2015-09-01 12:01:45,923 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2015-09-01 12:01:45,923 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2015-09-01 12:01:45,924 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2015-09-01 12:01:45,924 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2015-09-01 12:01:45,925 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:

HadoopVersion    PigVersion    UserId    StartedAt    FinishedAt    Features
2.6.0.2.2.4.2-2    0.14.0.2.2.4.2-2    19659    2015-09-01 12:01:43    2015-09-01 12:01:45    GROUP_BY,LIMIT

Some jobs have failed! Stop running all dependent jobs

Job Stats (time in seconds):
JobId    Maps    Reduces    MaxMapTime    MinMapTime    AvgMapTime    MedianMapTime    MaxReduceTime    MinReduceTime    AvgReduceTime    MedianReducetime    Alias    Feature    Outputs
job_local95835876_0001    1    1    n/a    n/a    n/a    n/a    n/a    n/a    n/a    n/a    content,data,group_data,limit_rec_count    GROUP_BY

Failed Jobs:
JobId    Alias    Feature    Message    Outputs
job_local1740385966_0002    rec_count    Message: Job failed!    file:/tmp/temp1464312773/tmp-2093473355,

Input(s):
Successfully read 46556 records from: "/home/19659/testData.csv"

Output(s):
Failed to produce result in "file:/tmp/temp1464312773/tmp-2093473355"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_local95835876_0001 -> job_local1740385966_0002,
job_local1740385966_0002

Can somebody help with this? I am running Pig locally.
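For reference, here is a minimal corrected version of the script applying the COUNT($1) suggestion above (a sketch; the field positions assume the same CSV layout as the original, with $1 = id and $2 = type):

```pig
-- Same pipeline, but counting the bag of grouped records, not the
-- scalar group key. After GROUP, each tuple of group_data has the
-- shape (group, {bag of data tuples}): $0 is the key, $1 is the bag.
content = load '/home/19659/testData.csv' using PigStorage(',');
data = foreach content generate $1 as id, $2 as type;
group_data = group data by type;
-- COUNT needs the bag: $1, or equivalently the alias 'data',
-- since Pig names the bag after the grouped relation.
rec_count = foreach group_data generate group, COUNT(data);
limit_rec_count = limit rec_count 1;
dump limit_rec_count;
```

Using the alias form COUNT(data) instead of COUNT($1) makes the intent explicit and avoids exactly this kind of positional mix-up.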