The way to debug this is to dump the relations named by the aliases leading up to the failure. Try to dump rec_count; try to dump group_data; and so on. You will find one that works, and that identifies the first alias definition that causes the failure.
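Concretely, walking backwards through the aliases of the script below might look like this (a sketch; the alias names are the ones from the original script):

```pig
-- Dump each intermediate alias in turn; the first one that fails to
-- dump points at the definition that introduced the problem.
dump content;      -- works: the load itself is fine
dump data;         -- works: the projection is fine
dump group_data;   -- works: the GROUP BY is fine
dump rec_count;    -- fails: so the problem is in FOREACH ... COUNT($0)
```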
I guess that the problem is in your definition of rec_count. You are asking for COUNT($0), but $0 in that case refers to the 'group' field, which is a scalar, not a bag. Try COUNT($1). Does that work?

William F Dowling
Senior Technologist
Thomson Reuters
0 +1 215 823 3853

-----Original Message-----
From: rakesh sharma [mailto:rakeshsharm...@hotmail.com]
Sent: Tuesday, September 01, 2015 2:39 AM
To: user@pig.apache.org
Subject: Getting this exception in pig -- read airline data

content = load '/home/19659/testData.csv' using PigStorage(',');
data = foreach content generate $1 as id, $2 as type;
group_data = group data by type;
rec_count = foreach group_data generate COUNT($0);
limit_rec_count = limit rec_count 1;
dump limit_rec_count;

The above is my Pig script. I only get the error below when I try to dump the data; otherwise there is no error.

java.lang.Exception: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [POCast (Name: Cast[bag:{}] - scope-12 Operator Key: scope-12) children: [[POProject (Name: Project[bytearray][0] - scope-11 Operator Key: scope-11) children: null at []]] at []]: java.lang.NullPointerException
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [POCast (Name: Cast[bag:{}] - scope-12 Operator Key: scope-12) children: [[POProject (Name: Project[bytearray][0] - scope-11 Operator Key: scope-11) children: null at []]] at []]: java.lang.NullPointerException
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:366)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.processInput(POUserFunc.java:216)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:270)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextLong(POUserFunc.java:407)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:351)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:383)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:303)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:474)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:442)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:422)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:269)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.pig.builtin.Utf8StorageConverter.consumeBag(Utf8StorageConverter.java:80)
    at org.apache.pig.builtin.Utf8StorageConverter.bytesToBag(Utf8StorageConverter.java:335)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNextDataBag(POCast.java:1861)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:337)
    ... 19 more

2015-09-01 12:01:45,920 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local1740385966_0002
2015-09-01 12:01:45,920 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases rec_count
2015-09-01 12:01:45,920 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M:  C:  R: rec_count[5,12]
2015-09-01 12:01:45,922 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2015-09-01 12:01:45,923 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local1740385966_0002 has failed! Stop running all dependent jobs
2015-09-01 12:01:45,923 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2015-09-01 12:01:45,923 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2015-09-01 12:01:45,924 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2015-09-01 12:01:45,924 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2015-09-01 12:01:45,925 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:

HadoopVersion    PigVersion    UserId    StartedAt    FinishedAt    Features
2.6.0.2.2.4.2-2    0.14.0.2.2.4.2-2    19659    2015-09-01 12:01:43    2015-09-01 12:01:45    GROUP_BY,LIMIT

Some jobs have failed! Stop running all dependent jobs

Job Stats (time in seconds):
JobId    Maps    Reduces    MaxMapTime    MinMapTime    AvgMapTime    MedianMapTime    MaxReduceTime    MinReduceTime    AvgReduceTime    MedianReducetime    Alias    Feature    Outputs
job_local95835876_0001    1    1    n/a    n/a    n/a    n/a    n/a    n/a    n/a    n/a    content,data,group_data,limit_rec_count    GROUP_BY

Failed Jobs:
JobId    Alias    Feature    Message    Outputs
job_local1740385966_0002    rec_count    Message: Job failed!    file:/tmp/temp1464312773/tmp-2093473355,

Input(s):
Successfully read 46556 records from: "/home/19659/testData.csv"

Output(s):
Failed to produce result in "file:/tmp/temp1464312773/tmp-2093473355"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_local95835876_0001 -> job_local1740385966_0002,
job_local1740385966_0002

Can somebody help with this? I am running Pig locally.
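For reference, here is a minimal corrected version of the script applying the COUNT($1) suggestion above (a sketch; the field positions assume the same CSV layout as the original, with $1 = id and $2 = type):

```pig
-- Same pipeline, but counting the bag of grouped records, not the
-- scalar group key. After GROUP, each tuple of group_data has the
-- shape (group, {bag of data tuples}): $0 is the key, $1 is the bag.
content = load '/home/19659/testData.csv' using PigStorage(',');
data = foreach content generate $1 as id, $2 as type;
group_data = group data by type;
-- COUNT needs the bag: $1, or equivalently the alias 'data',
-- since Pig names the bag after the grouped relation.
rec_count = foreach group_data generate group, COUNT(data);
limit_rec_count = limit rec_count 1;
dump limit_rec_count;
```

Using the alias form COUNT(data) instead of COUNT($1) makes the intent explicit and avoids exactly this kind of positional mix-up.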