I'm getting an exception from a Pig script and haven't been able to nail down the cause. I'm fairly new to Pig and have searched on various topics based on the exceptions I'm getting, but haven't been able to find anything meaningful. From the Grunt shell and the log, I've looked for different variations of these -
  unable to read pigs manifest file
  java.lang.NegativeArraySizeException: -1
  ERROR 1066: Unable to open iterator for alias F. Backend error : -1

I'm using Hadoop 2.0.0-cdh4.6.0 and Pig 0.11.0, running from the Grunt shell.

My Pig script reads a file, does some manipulation on the data (including 
calling a Java UDF), joins to an HBase table, then DUMPs the output. Pretty 
simple. I can DUMP the intermediate result (alias B) and the data looks fine.

I've tested the Java function from Pig using the same input file and seen it return the values I'd expect, and I've also tested the function locally, outside the Pig script. The function is given a number of days since 01-01-1900 and uses joda-time v2.7 to return a Datetime. Initially the UDF accepted a tuple as input; I've since tried changing the input type to Byte and, most recently, String (casting to Datetime in Pig on return), but I'm still getting the same error.
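
In case it helps, here's a simplified sketch of roughly what the UDF looks like. This is not my exact code; the substring offsets and the output format are assumptions that mirror the SUBSTRING calls in the script below:

    package monutil.geoloc;

    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;
    import org.joda.time.DateTime;
    import org.joda.time.DateTimeZone;

    // Sketch only: the id carries a day count (days since 1900-01-01)
    // after the 8-character grid id; joda-time does the date arithmetic.
    public class GridIDtoDatetime extends EvalFunc<String> {
        private static final DateTime BASE =
            new DateTime(1900, 1, 1, 0, 0, DateTimeZone.UTC);

        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            String id = input.get(0).toString();
            // Assumed layout: grid id in chars 0-7, day count from char 9 on
            int daysSince1900 = Integer.parseInt(id.substring(9).trim());
            // Return an ISO-style string so Pig's ToDate(chararray) can parse it
            return BASE.plusDays(daysSince1900).toString("yyyy-MM-dd");
        }
    }

The script below wraps the returned chararray in ToDate() to get a Pig datetime.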

When I change my Pig script so that it simply doesn't call the UDF, it works fine. The NegativeArraySizeException makes it sound like the data is out of whack for the DUMP, possibly from some kind of format issue, but I don't see how.

****************************************************************************
Pig script

    A = LOAD 'tst2_SplitGroupMax.txt' using PigStorage(',')
    as (id:bytearray, year:int, doy:int, month:int, dayOfMonth:int,
     awh_minTemp:double, awh_maxTemp:double,
     nws_minTemp:double, nws_maxTemp:double,
     wxs_minTemp:double, wxs_maxTemp:double,
     tcc_minTemp:double, tcc_maxTemp:double
     ) ;

    register /import/pool2/home/NA1000APP-TPSDM/ejbles/Test-0.0.1-SNAPSHOT-jar-with-dependencies.jar;

    B = FOREACH A GENERATE id as msmtid,
     SUBSTRING(id,0,8) as gridid,
     SUBSTRING(id,9,20) as msmt_days,
     year, doy, month, dayOfMonth,
     CONCAT(CONCAT(CONCAT((chararray)year,'-'),CONCAT((chararray)month,'-')),(chararray)dayOfMonth) as msmt_dt,
     ToDate(monutil.geoloc.GridIDtoDatetime(id)) as func_msmt_dt,
     awh_minTemp, awh_maxTemp,
     nws_minTemp, nws_maxTemp,
     wxs_minTemp, wxs_maxTemp,
     tcc_minTemp, tcc_maxTemp
     ;

    E = LOAD 'hbase://wxgrid_detail'
     using org.apache.pig.backend.hadoop.hbase.HBaseStorage(
      'loc:country, loc:fips, loc:l1, loc:l2, loc:latitude, loc:longitude',
      '-loadKey=true -caster=HBaseBinaryConverter')
     as (wxgrid:bytearray, country:chararray, fips:chararray, l1:chararray, l2:chararray,
       latitude:double, longitude:double);

    F = join B by gridid, E by wxgrid;

    DUMP F;  -- This is where I get the exception

*********************************************************************
Here's an excerpt from what's returned in the Grunt shell -

2015-06-15 12:23:24,204 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2015-06-15 12:23:24,205 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_201502081759_916870 has failed! Stop running all dependent jobs
2015-06-15 12:23:24,205 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2015-06-15 12:23:24,221 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR: -1
2015-06-15 12:23:24,221 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2015-06-15 12:23:24,223 [main] WARN  org.apache.pig.tools.pigstats.ScriptState - unable to read pigs manifest file
2015-06-15 12:23:24,224 [main] INFO  org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:

HadoopVersion    PigVersion    UserId             StartedAt              FinishedAt             Features
2.0.0-cdh4.6.0                 na1000app-tpsdm    2015-06-15 12:22:39    2015-06-15 12:23:24    HASH_JOIN

Failed!

Failed Jobs:
JobId   Alias   Feature Message Outputs
job_201502081759_916870 A,B,E,F HASH_JOIN       Message: Job failed!    hdfs://nameservice1/tmp/temp-238648079/tmp-1338617620,

Input(s):
Failed to read data from "hbase://wxgrid_detail"
Failed to read data from "hdfs://nameservice1/user/na1000app-tpsdm/tst2_SplitGroupMax.txt"

Output(s):
Failed to produce result in "hdfs://nameservice1/tmp/temp-238648079/tmp-1338617620"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_201502081759_916870


2015-06-15 12:23:24,224 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2015-06-15 12:23:24,234 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias F. Backend error : -1
Details at logfile: /import/pool2/home/NA1000APP-TPSDM/ejbles/pig_1434388844905.log

***********************************************************************
And here's the log -
Backend error message
java.lang.NegativeArraySizeException: -1
                at org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:148)
                at org.apache.hadoop.hbase.mapreduce.TableSplit.readFields(TableSplit.java:133)
                at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:73)
                at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:44)
                at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readFields(PigSplit.java:233)
                at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:73)
                at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:44)
                at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:356)
                at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:640)
                at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
                at org.apache.hadoop.mapred.Child$4.run(Ch

Pig Stack Trace
ERROR 1066: Unable to open iterator for alias F. Backend error : -1

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias F. Backend error : -1
                at org.apache.pig.PigServer.openIterator(PigServer.java:828)
                at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:696)
                at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:320)
                at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
                at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
                at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
                at org.apache.pig.Main.run(Main.java:538)
                at org.apache.pig.Main.main(Main.java:157)
                at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
                at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
                at java.lang.reflect.Method.invoke(Method.java:597)
                at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.lang.NegativeArraySizeException: -1
                at org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:148)
                at org.apache.hadoop.hbase.mapreduce.TableSplit.readFields(TableSplit.java:133)
                at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:73)
                at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:44)
                at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readFields(PigSplit.java:233)
                at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:73)
                at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:44)
                at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:356)
                at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:640)
                at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
