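Hmmm, I think this might be a bug that is only exposed when one of the mappers
gets zero rows of input. That mapper presumably emits a partial result with
n = 0, which then clashes with the n = 1 from the other mappers at merge time.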
If you have a Hive build, can you try adding this before line 238 of
GenericUDAFnGrams.java?

    if (n == 0) {
      return;
    }

It should go just before this line:

    if(myagg.n > 0 && n > 0 && myagg.n != n) {

If that fixes it, please create a new JIRA issue so we can get a fix committed.
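
To make the intent of the guard concrete, here is a minimal standalone sketch. It is not the actual Hive source: the class NGramMergeSketch, the AggBuffer fields, and the use of IllegalStateException (Hive throws HiveException) are all made up for illustration, and what the real merge() does after the check is assumed. Only the two if statements come from this thread.

```java
// Simplified, hypothetical stand-in for the merge step of an n-gram
// aggregator. This is NOT the Hive implementation.
class NGramMergeSketch {

    /** Hypothetical buffer: n stays 0 until a non-empty partial is merged. */
    static class AggBuffer {
        int n = 0;               // n-gram size seen so far
        int mergedPartials = 0;  // count of partials actually merged
    }

    /** Merge one partial result whose n-gram size is n. */
    static void merge(AggBuffer myagg, int n) {
        // Proposed guard: a partial with n == 0 (assumed to come from a
        // mapper that saw no input rows) carries nothing to merge.
        if (n == 0) {
            return;
        }
        // Consistency check as quoted in this thread.
        if (myagg.n > 0 && n > 0 && myagg.n != n) {
            throw new IllegalStateException(
                "mismatch in value for 'n': found '" + myagg.n + "' and '" + n + "'");
        }
        // Assumption: the real code records n and folds in the partial here.
        myagg.n = n;
        myagg.mergedPartials++;
    }

    public static void main(String[] args) {
        AggBuffer buf = new AggBuffer();
        merge(buf, 1); // normal partial
        merge(buf, 0); // empty partial, skipped by the guard
        merge(buf, 1); // normal partial
        System.out.println("n=" + buf.n + ", merged=" + buf.mergedPartials);
    }
}
```

In this sketch the empty partial is simply ignored, so the buffer keeps n = 1 and only the two real partials are counted, while a genuinely different n would still raise the mismatch error.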
JVS
On Jun 20, 2011, at 8:06 AM, Matthew Rathbone wrote:
> Hoping someone with more expertise could help on this:
>
> I have no idea what's causing this to happen, but here is the exception:
>
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException:
> Hive Runtime Error while processing row (tag=0)
> {"key":{},"value":{"_col0":["0","0","0","0"]},"alias":0}
> at
> org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:268)
> at
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:467)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:415)
> at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime
> Error while processing row (tag=0)
> {"key":{},"value":{"_col0":["0","0","0","0"]},"alias":0}
> at
> org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:256)
> ... 3 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException:
> GenericUDAFnGramEvaluator: mismatch in value for 'n', which usually is caused
> by a non-constant expression. Found '0' and '1'.
> at
> org.apache.hadoop.hive.ql.udf.generic.GenericUDAFnGrams$GenericUDAFnGramEvaluator.merge(GenericUDAFnGrams.java:239)
> at
> org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:142)
> at
> org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:592)
> at
> org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:816)
> at
> org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:716)
> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:470)
> at
> org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:247)
>
>
> What does 'mismatch in value for n' mean?
> My query is super simple:
> select ngrams(sentences(text), 1, 50) from messages
>
>
> --
> Matthew Rathbone
> Foursquare | Software Engineer | Server Engineering Team
> [email protected] | @rathboma | 4sq