Thank you for the help. I'm attempting to analyze the job history files on our Hadoop cluster, and this approach was mentioned in the Pig documentation [1]. If it does not work, are there other ways to get this done?
[1] http://pig.apache.org/docs/r0.12.1/test.html#hadoop-job-history-loader

-- Ramesh

> On Jun 17, 2014, at 6:48 PM, Cheolsoo Park <[email protected]> wrote:
>
> As far as I know, the JobHistory file format is incompatible among different
> Hadoop versions.
>
> There were a couple of JIRAs for similar issues. For example, this JIRA fixed
> an issue with Hadoop 1.2:
> https://issues.apache.org/jira/browse/PIG-3553
>
> Unfortunately, you might have to change the parsing logic again for Hadoop
> 1.3.2.
>
>
> On Tue, Jun 17, 2014 at 2:15 PM, Ramesh Venkitaswaran <
> [email protected]> wrote:
>
>> I'm running Hadoop 1.3.2 and Pig 0.11.1.1.3.2.0-110 and I'm getting this
>> exception. I've tried running Pig 0.12.1, which I downloaded directly from
>> Apache, and I'm getting the same error.
>>
>> Any help on this would be appreciated.
>>
>> Backend error message
>> ---------------------
>> java.lang.ArrayIndexOutOfBoundsException: 2
>>     at org.apache.pig.piggybank.storage.HadoopJobHistoryLoader$HadoopJobHistoryReader.nextKeyValue(HadoopJobHistoryLoader.java:184)
>>     at org.apache.pig.piggybank.storage.HadoopJobHistoryLoader.getNext(HadoopJobHistoryLoader.java:81)
>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
>>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:530)
>>     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:363)
>>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:396)
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>
>> The script is below:
>>
>> REGISTER /usr/lib/pig/piggybank.jar;
>> a = LOAD '/mapred/history/done'
>>     USING org.apache.pig.piggybank.storage.HadoopJobHistoryLoader()
>>     AS (j:map[], m:map[], r:map[]);
>> b = GROUP a BY j#'JOBNAME' PARALLEL 5;
>> STORE b INTO '/user/maprd/processed';
>>
>> --
>> Thanks
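[Editor's note] As a fallback while the loader's parsing logic lags behind the Hadoop version, the history files themselves can be read directly. Hadoop 1.x job history files are line-oriented records of the form `RECTYPE KEY="value" KEY="value" .`, which is what HadoopJobHistoryLoader parses under the hood. A minimal sketch of parsing that format, assuming the 1.x record layout; `parse_history_line` is a hypothetical helper for illustration, not part of any library:

```python
import re

# Matches KEY="value" pairs, allowing backslash-escaped quotes inside values.
KV_PATTERN = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

def parse_history_line(line):
    """Parse one Hadoop 1.x job history record line.

    Returns (record_type, {key: value}), or None for a blank line.
    Assumes the 1.x format: 'RECTYPE KEY="value" KEY="value" .'
    """
    line = line.strip().rstrip(" .")
    if not line:
        return None
    rec_type, _, rest = line.partition(" ")
    fields = {k: v.replace('\\"', '"') for k, v in KV_PATTERN.findall(rest)}
    return rec_type, fields

# Example record (field values are made up for illustration):
sample = 'Job JOBID="job_201406170001_0001" JOBNAME="PigLatin:script.pig" .'
rec_type, fields = parse_history_line(sample)
print(rec_type, fields["JOBNAME"])
```

Grouping the parsed `Job` records by `JOBNAME` in plain Python (or a simple streaming job) would then reproduce what the Pig script above does, without depending on the piggybank loader's version-specific parsing.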
