I am a new Pig user and have run into “Internal error 2999”:
2011-04-05 15:59:57,445 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. null
Details at logfile: /proj/CitationSystem/backend/hadoop/testbed-hold/pig_1302033581143.log
That log file shows:
Pig Stack Trace
---------------
ERROR 2999: Unexpected internal error. null
java.lang.NullPointerException
at org.apache.pig.impl.logicalLayer.validators.TypeCheckingVisitor.getLoadFuncSpec(TypeCheckingVisitor.java:3116)
at org.apache.pig.impl.logicalLayer.validators.TypeCheckingVisitor.visit(TypeCheckingVisitor.java:1793)
at org.apache.pig.impl.logicalLayer.LOCast.visit(LOCast.java:67)
at org.apache.pig.impl.logicalLayer.LOCast.visit(LOCast.java:32)
at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:70)
at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
at org.apache.pig.impl.logicalLayer.validators.TypeCheckingVisitor.checkInnerPlan(TypeCheckingVisitor.java:2869)
at org.apache.pig.impl.logicalLayer.validators.TypeCheckingVisitor.visit(TypeCheckingVisitor.java:2430)
at org.apache.pig.impl.logicalLayer.LOCogroup.visit(LOCogroup.java:378)
at org.apache.pig.impl.logicalLayer.LOCogroup.visit(LOCogroup.java:45)
at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:70)
at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
at org.apache.pig.impl.plan.PlanValidator.validateSkipCollectException(PlanValidator.java:102)
at org.apache.pig.impl.logicalLayer.validators.TypeCheckingValidator.validate(TypeCheckingValidator.java:40)
at org.apache.pig.impl.logicalLayer.validators.TypeCheckingValidator.validate(TypeCheckingValidator.java:30)
at org.apache.pig.impl.logicalLayer.validators.LogicalPlanValidationExecutor.validate(LogicalPlanValidationExecutor.java:89)
at org.apache.pig.impl.logicalLayer.UnionOnSchemaSetter.visit(UnionOnSchemaSetter.java:70)
at org.apache.pig.impl.logicalLayer.LOUnion.visit(LOUnion.java:177)
at org.apache.pig.impl.logicalLayer.LOUnion.visit(LOUnion.java:38)
at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:70)
at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
at org.apache.pig.PigServer.compileLp(PigServer.java:1317)
at org.apache.pig.PigServer.compileLp(PigServer.java:1306)
at org.apache.pig.PigServer.compileLp(PigServer.java:1241)
at org.apache.pig.PigServer.compileLp(PigServer.java:1221)
at org.apache.pig.PigServer.execute(PigServer.java:1178)
at org.apache.pig.PigServer.access$100(PigServer.java:128)
at org.apache.pig.PigServer$Graph.execute(PigServer.java:1517)
at org.apache.pig.PigServer.executeBatchEx(PigServer.java:362)
at org.apache.pig.PigServer.executeBatch(PigServer.java:329)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:169)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
at org.apache.pig.Main.run(Main.java:510)
at org.apache.pig.Main.main(Main.java:107)
Most likely I am doing something wrong, so any advice would be appreciated.
Here is my setup. I have a Pig script like this:
[… statements define SrcFuid and NewCitationRel …]
TCRaw = join SrcFuid by citingdocid, NewCitationRel by citeddocid;
describe TCRaw;
dump TCRaw;
TCGroupedByFuid = group TCRaw by CONCAT((chararray)SrcFuid.citingdocid,
SrcFuid.col,
(chararray)SrcFuid.seq);
store TCGroupedByFuid into 'foo';
The log shows the output of the describe and dump commands (I’ve reformatted it
for readability):
TCRaw: {SrcFuid::citingdocid: int,
SrcFuid::col: bytearray,
SrcFuid::seq: int,
NewCitationRel::citeddocid: int,
NewCitationRel::citingdocid: int,
NewCitationRel::col: bytearray,
NewCitationRel::seq: int,
NewCitationRel::year: int,
NewCitationRel::eds: bytearray}
(14159274,BCI,6,14159274,14159163,BCI,5,1999,BCI.BCI)
(14159274,BCI,6,14159274,14159163,WOS,11,1999,WOS.SCI)
(14159274,WOS,16,14159274,14159163,BCI,5,1999,BCI.BCI)
(14159274,WOS,16,14159274,14159163,WOS,11,1999,WOS.SCI)
What I was hoping for was something like
(‘14159274BCI6’,
{(14159274,BCI,6,14159274,14159163,BCI,5,1999,BCI.BCI),
(14159274,BCI,6,14159274,14159163,WOS,11,1999,WOS.SCI)})
(‘14159274WOS16’,
{(14159274,WOS,16,14159274,14159163,BCI,5,1999,BCI.BCI),
(14159274,WOS,16,14159274,14159163,WOS,11,1999,WOS.SCI)})
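To make the intended grouping concrete, here is a minimal Python sketch of the semantics I have in mind. It is not Pig and says nothing about how Pig should be told to do this. The helper names fuid_key and group_by_fuid are made up for illustration, and the rows are just the four tuples from the dump output above.

```python
from collections import defaultdict

# The four tuples printed by `dump TCRaw` above.
tc_raw = [
    (14159274, "BCI", 6, 14159274, 14159163, "BCI", 5, 1999, "BCI.BCI"),
    (14159274, "BCI", 6, 14159274, 14159163, "WOS", 11, 1999, "WOS.SCI"),
    (14159274, "WOS", 16, 14159274, 14159163, "BCI", 5, 1999, "BCI.BCI"),
    (14159274, "WOS", 16, 14159274, 14159163, "WOS", 11, 1999, "WOS.SCI"),
]


def fuid_key(row):
    """Hypothetical helper: build the group key by concatenating the first
    three fields (SrcFuid::citingdocid, SrcFuid::col, SrcFuid::seq) as strings."""
    citingdocid, col, seq = row[:3]
    return f"{citingdocid}{col}{seq}"


def group_by_fuid(rows):
    """Hypothetical helper: collect the full rows under their concatenated key."""
    groups = defaultdict(list)
    for row in rows:
        groups[fuid_key(row)].append(row)
    return dict(groups)


grouped = group_by_fuid(tc_raw)
for key, bag in grouped.items():
    print(key, bag)
```

Each key maps to the bag of complete TCRaw rows that share the same citingdocid, col and seq from the SrcFuid side.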
If anyone could give me a hint about how to get that, I’d appreciate it very much.
Thanks!
Will
William F Dowling
Sr Technical Specialist, Software Engineering
Thomson Reuters
0 +1 215 823 3853