I'm getting an exception from a Pig script and haven't been able to nail down
the cause. I'm fairly new to Pig and have searched on various topics based on
the exception, but haven't found anything meaningful.
From the Grunt shell and the log, I've searched for variations of these messages:
unable to read pigs manifest file
java.lang.NegativeArraySizeException: -1
ERROR 1066: Unable to open iterator for alias F. Backend error : -1
I'm using Hadoop version 2.0.0-cdh4.6.0 & Pig version 0.11.0, running from the
Grunt shell.
My Pig script reads a file, manipulates the data (including a call to a Java
UDF), joins the result to an HBase table, then DUMPs the output. Pretty
simple. I can DUMP the intermediate result (alias B) and the data looks fine.
I've tested the Java function from Pig using the same input file and it returns
the values I'd expect; I've also tested the function locally outside the Pig
script. The function takes a number of days since 01-01-1900 and uses
joda-time v2.7 to return a DateTime. Initially, the UDF accepted a tuple as
input. I've tried changing the UDF input type to Byte and, most recently,
String, casting the result to a datetime in Pig upon returning, but I still get
the same error.
When I change the Pig script merely to not call the UDF, it works fine. The
NegativeArraySizeException sounds like the data is out of whack for the DUMP,
possibly from some kind of format issue, but I don't see how.
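For reference, here is a minimal sketch of the conversion the UDF is described
as doing. It is not the actual UDF source: it uses java.time (Java 8+) rather
than joda-time so that it runs with only the JDK, the class and method names
are hypothetical, and the id layout (day count in characters 9 to 19, after an
8-character grid id and one separator character) is an assumption taken from
the SUBSTRING calls in the Pig script.

```java
import java.time.LocalDate;

final class GridIdDates {
    // Day 0 is 01-01-1900, per the description of the UDF.
    private static final LocalDate EPOCH = LocalDate.of(1900, 1, 1);

    // Assumed layout, mirroring SUBSTRING(id,9,20) in the Pig script:
    // the day count starts at index 9 and ends before index 20.
    private static final int DAYS_START = 9;
    private static final int DAYS_END = 20;

    /** Converts a day count since 01-01-1900 into a calendar date. */
    public static LocalDate fromDays(long days) {
        return EPOCH.plusDays(days);
    }

    /** Extracts the day count from an id; returns null if it cannot be parsed. */
    public static LocalDate fromId(String id) {
        if (id == null || id.length() <= DAYS_START) {
            return null;
        }
        String days = id.substring(DAYS_START, Math.min(id.length(), DAYS_END));
        try {
            return fromDays(Long.parseLong(days.trim()));
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(fromDays(0)); // prints 1900-01-01
        // "ABCDEFGH_42168" is a made-up id used only for illustration.
        System.out.println(fromId("ABCDEFGH_42168"));
    }
}
```

Returning null on unparseable input, rather than throwing, is a design choice
in this sketch so that a single bad record would not fail the whole job.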
****************************************************************************
Pig script
A = LOAD 'tst2_SplitGroupMax.txt' using PigStorage(',')
    as (id:bytearray, year:int, doy:int, month:int, dayOfMonth:int,
        awh_minTemp:double, awh_maxTemp:double,
        nws_minTemp:double, nws_maxTemp:double,
        wxs_minTemp:double, wxs_maxTemp:double,
        tcc_minTemp:double, tcc_maxTemp:double
    ) ;

register /import/pool2/home/NA1000APP-TPSDM/ejbles/Test-0.0.1-SNAPSHOT-jar-with-dependencies.jar;

B = FOREACH A GENERATE id as msmtid, SUBSTRING(id,0,8) as gridid,
    SUBSTRING(id,9,20) as msmt_days,
    year, doy, month, dayOfMonth,
    CONCAT(CONCAT(CONCAT((chararray)year,'-'),CONCAT((chararray)month,'-')),(chararray)dayOfMonth)
        as msmt_dt,
    ToDate(monutil.geoloc.GridIDtoDatetime(id)) as func_msmt_dt,
    awh_minTemp, awh_maxTemp,
    nws_minTemp, nws_maxTemp,
    wxs_minTemp, wxs_maxTemp,
    tcc_minTemp, tcc_maxTemp
    ;

E = LOAD 'hbase://wxgrid_detail' using
    org.apache.pig.backend.hadoop.hbase.HBaseStorage
    ('loc:country, loc:fips, loc:l1 ,loc:l2, loc:latitude, loc:longitude',
     '-loadKey=true -caster=HBaseBinaryConverter')
    as (wxgrid:bytearray, country:chararray, fips:chararray, l1:chararray,
        l2:chararray,
        latitude:double, longitude:double);

F = join B by gridid, E by wxgrid;
DUMP F; --- This is where I get the exception
*********************************************************************
Here's an excerpt from what's returned in the Grunt shell -
2015-06-15 12:23:24,204 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2015-06-15 12:23:24,205 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_201502081759_916870 has failed! Stop running all dependent jobs
2015-06-15 12:23:24,205 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2015-06-15 12:23:24,221 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR: -1
2015-06-15 12:23:24,221 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2015-06-15 12:23:24,223 [main] WARN org.apache.pig.tools.pigstats.ScriptState - unable to read pigs manifest file
2015-06-15 12:23:24,224 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:
HadoopVersion   PigVersion  UserId           StartedAt            FinishedAt           Features
2.0.0-cdh4.6.0              na1000app-tpsdm  2015-06-15 12:22:39  2015-06-15 12:23:24  HASH_JOIN
Failed!
Failed Jobs:
JobId                    Alias    Feature    Message               Outputs
job_201502081759_916870  A,B,E,F  HASH_JOIN  Message: Job failed!  hdfs://nameservice1/tmp/temp-238648079/tmp-1338617620,
Input(s):
Failed to read data from "hbase://wxgrid_detail"
Failed to read data from "hdfs://nameservice1/user/na1000app-tpsdm/tst2_SplitGroupMax.txt"
Output(s):
Failed to produce result in "hdfs://nameservice1/tmp/temp-238648079/tmp-1338617620"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_201502081759_916870
2015-06-15 12:23:24,224 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2015-06-15 12:23:24,234 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias F. Backend error : -1
Details at logfile: /import/pool2/home/NA1000APP-TPSDM/ejbles/pig_1434388844905.log
***********************************************************************
And here's the log -
Backend error message
java.lang.NegativeArraySizeException: -1
    at org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:148)
    at org.apache.hadoop.hbase.mapreduce.TableSplit.readFields(TableSplit.java:133)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:73)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:44)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readFields(PigSplit.java:233)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:73)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:44)
    at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:356)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:640)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
    at org.apache.hadoop.mapred.Child$4.run(Ch
Pig Stack Trace
ERROR 1066: Unable to open iterator for alias F. Backend error : -1
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias F. Backend error : -1
    at org.apache.pig.PigServer.openIterator(PigServer.java:828)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:696)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:320)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
    at org.apache.pig.Main.run(Main.java:538)
    at org.apache.pig.Main.main(Main.java:157)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.lang.NegativeArraySizeException: -1
    at org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:148)
    at org.apache.hadoop.hbase.mapreduce.TableSplit.readFields(TableSplit.java:133)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:73)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:44)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readFields(PigSplit.java:233)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:73)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:44)
    at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:356)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:640)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)