Hi,

I'm running Pig 0.10.0 in local mode on some small text files; there is
no intention to run it on Hadoop at all. We have a job that runs every 5
minutes, and about 3% of the time it fails with the error below. The
failure happens at random places within the Pig script.

2012-10-19 14:15:37,719 [Thread-15] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0004
java.lang.NullPointerException
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:286)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:158)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:360)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:330)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:95)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:233)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:256)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POUnion.getNext(POUnion.java:165)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:271)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:266)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)

 

In the Pig log, I get:

ERROR 2244: Job failed, hadoop does not return any error message

org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job failed, hadoop does not return any error message
        at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:193)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
        at org.apache.pig.Main.run(Main.java:555)
        at org.apache.pig.Main.main(Main.java:111)
================================================================================

 

The Pig script is attached.

Any help gratefully received.

Thanks,

Malc

-- Load data from input file
indata = LOAD '$input' USING PigStorage(',') AS (utc_ts:chararray, local_ts:chararray, timezone:chararray, region:chararray, hostname:chararray,
                                                 stat_type:chararray, stat_key:chararray, stat_value:long);


/**********************************************************************************
* Output:       ats_stat                                                          *
* Description:  Generate output file of ATS data to load into the ats_stat_tbl    *
**********************************************************************************/
ats_total_errors = FILTER indata BY stat_key == 'IntegraStatistics/TotalErrors';
ats_total_txns = FILTER indata BY stat_key == 'IntegraStatistics/TotalTransactions';
ats_resp_time = FILTER indata BY stat_key == 'IntegraStatistics/UWMAResponseTime';

ats_join_data = JOIN ats_total_errors BY (utc_ts,local_ts,timezone,region,hostname),
                     ats_total_txns BY (utc_ts,local_ts,timezone,region,hostname),
                     ats_resp_time BY (utc_ts,local_ts,timezone,region,hostname);

ats_out_data = FOREACH ats_join_data GENERATE $0,$1,$2,$3,$4,$7,$15,$23;
STORE ats_out_data INTO '$outdir/ats_stat.dat.$uniq_id' USING PigStorage(',');



/**********************************************************************************
* Output:       ldap_stat                                                         *
* Description:  Generate output file of LDAP data to load into the ldap_stat_tbl  *
**********************************************************************************/
ldap_total_errors = FILTER indata BY stat_key == 'LDAPStatistics/FailedRequests';
ldap_total_txns = FILTER indata BY stat_key == 'LDAPStatistics/TotalRequests';
ldap_resp_time = FILTER indata BY stat_key == 'LDAPStatistics/UWMAResponseTime';

ldap_join_data = JOIN ldap_total_errors BY (utc_ts,local_ts,timezone,region,hostname),
                      ldap_total_txns BY (utc_ts,local_ts,timezone,region,hostname),
                      ldap_resp_time BY (utc_ts,local_ts,timezone,region,hostname);

ldap_out_data = FOREACH ldap_join_data GENERATE $0,$1,$2,$3,$4,$7,$15,$23;
STORE ldap_out_data INTO '$outdir/ldap_stat.dat.$uniq_id' USING PigStorage(',');



/**********************************************************************************
* Output:       pcrf_stat                                                         *
* Description:  Generate output file of PCRF data to load into the pcrf_stat_tbl  *
**********************************************************************************/
pcrf_total_errors = FILTER indata BY stat_key == 'PcrfStatistics/TotalErrors';
pcrf_total_txns = FILTER indata BY stat_key == 'PcrfStatistics/TotalRequestsSent';
pcrf_resp_time = FILTER indata BY stat_key == 'PcrfStatistics/UWMAResponseTime';

pcrf_join_data = JOIN pcrf_total_errors BY (utc_ts,local_ts,timezone,region,hostname),
                      pcrf_total_txns BY (utc_ts,local_ts,timezone,region,hostname),
                      pcrf_resp_time BY (utc_ts,local_ts,timezone,region,hostname);

pcrf_out_data = FOREACH pcrf_join_data GENERATE $0,$1,$2,$3,$4,$7,$15,$23;
STORE pcrf_out_data INTO '$outdir/pcrf_stat.dat.$uniq_id' USING PigStorage(',');



/**********************************************************************************
* Output:       sess_stat                                                         *
* Description:  Generate output file of Session Counts data to load into the      *
*               sess_stat_tbl                                                     *
**********************************************************************************/
sess_active = FILTER indata BY stat_key == 'SessionStatistics/ActiveSessions';
sess_total = FILTER indata BY stat_key == 'SessionStatistics/TotalSessions';
sess_duration = FILTER indata BY stat_key == 'SessionStatistics/UWMASessionLength';

sess_join_data = JOIN sess_active BY (utc_ts,local_ts,timezone,region,hostname),
                      sess_total BY (utc_ts,local_ts,timezone,region,hostname),
                      sess_duration BY (utc_ts,local_ts,timezone,region,hostname);
                     
sess_out_data = FOREACH sess_join_data GENERATE $0,$1,$2,$3,$4,$7,$15,$23;
STORE sess_out_data INTO '$outdir/sess_stat.dat.$uniq_id' USING PigStorage(',');



/**********************************************************************************
* Output:       radius_tps                                                        *
* Description:  Generate output file of Radius TPS data to load into the          *
*               radius_tps_tbl                                                    *
**********************************************************************************/
radius_tps_total_interims = FILTER indata BY stat_key == 'RadiusStatistics/RadiusInterims';
radius_tps_total_starts = FILTER indata BY stat_key == 'RadiusStatistics/RadiusStarts';
radius_tps_total_stops = FILTER indata BY stat_key == 'RadiusStatistics/RadiusStops';

radius_tps_join_data = JOIN radius_tps_total_interims BY (utc_ts,local_ts,timezone,region,hostname),
                            radius_tps_total_starts BY (utc_ts,local_ts,timezone,region,hostname),
                            radius_tps_total_stops BY (utc_ts,local_ts,timezone,region,hostname);

radius_tps_out_data = FOREACH radius_tps_join_data GENERATE $0,$1,$2,$3,$4,$7,$15,$23;
STORE radius_tps_out_data INTO '$outdir/radius_tps.dat.$uniq_id' USING PigStorage(',');

/**********************************************************************************
* Output:       radius_bcast                                                      *
* Description:  Generate output file of Radius Broadcast data to load into        *
*               the radius_bcast_tbl                                              *
**********************************************************************************/
radius_bcast_total_errors = FILTER indata BY stat_key == 'RadiusBroadcast/TotalErrors';
radius_bcast_total_txns = FILTER indata BY stat_key == 'RadiusBroadcast/TotalTransactions';
radius_bcast_resp_time = FILTER indata BY stat_key == 'RadiusBroadcast/ResponseTime';

radius_bcast_join_data = JOIN radius_bcast_total_errors BY (utc_ts,local_ts,timezone,region,hostname),
                              radius_bcast_total_txns BY (utc_ts,local_ts,timezone,region,hostname),
                              radius_bcast_resp_time BY (utc_ts,local_ts,timezone,region,hostname);

radius_bcast_out_data = FOREACH radius_bcast_join_data GENERATE $0,$1,$2,$3,$4,$7,$15,$23;
STORE radius_bcast_out_data INTO '$outdir/radius_bcast.dat.$uniq_id' USING PigStorage(',');
