[jira] [Commented] (HIVE-8313) Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator

Mithun Radhakrishnan (JIRA) Tue, 30 Sep 2014 13:58:57 -0700

    [ https://issues.apache.org/jira/browse/HIVE-8313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153741#comment-14153741 ]


Mithun Radhakrishnan commented on HIVE-8313:
--------------------------------------------

This appears to be related to the changes introduced in HIVE-4209, which added 
caching for the evaluation of deterministic sub-expressions.

In this particular case, the problem occurs in 
{{ExprNodeGenericFuncEvaluator::_evaluate()}}:

{code:title=ExprNodeGenericFuncEvaluator.java|borderStyle=solid}
  @Override
  protected Object _evaluate(Object row, int version) throws HiveException {
    rowObject = row;
    if (ObjectInspectorUtils.isConstantObjectInspector(outputOI) &&
        isDeterministic()) {
      // The output of this UDF is constant, so don't even bother evaluating.
      return ((ConstantObjectInspector)outputOI).getWritableConstantValue();
    }
    for (int i = 0; i < deferredChildren.length; i++) {
      deferredChildren[i].prepare(version);
    }
    return genericUDF.evaluate(deferredChildren);
  }
{code}

In Hive 0.10, the {{deferredChildren[i].evaluate()}} call would be skipped in 
its entirety for "non-eager" evaluation. In Hive 0.12, that condition is checked 
inside {{prepare()}}, which is invoked for every child on *each record*. With 
several thousand children and several million rows, that per-record overhead 
dominates the query's run time.

A lot of this cost can be saved by skipping {{prepare()}} for 
{{ExprNodeEvaluator}}s which yield the same value regardless of the row, e.g. 
{{ExprNodeConstantEvaluator}} and {{ExprNodeNullEvaluator}}. I'll post a patch 
for this shortly.
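
To illustrate the idea (this is only a sketch, not the actual patch): the parent evaluator can work out once, at construction time, which children are row-dependent, and then call {{prepare()}} only on those for each record. Every class and method name below ({{SketchEvaluator}}, {{isRowInvariant()}}, {{SketchInEvaluator}}, and so on) is a hypothetical, simplified stand-in and not part of Hive's API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for Hive's evaluator classes.
abstract class SketchEvaluator {
    int prepareCalls = 0; // counts prepare() invocations, for illustration only

    /** True when the evaluator yields the same value regardless of the row. */
    boolean isRowInvariant() { return false; }

    /** Per-record bookkeeping; this is the cost we want to avoid for constants. */
    void prepare(int version) { prepareCalls++; }

    abstract Object evaluate(Object row);
}

// Stand-in for a constant: same value for every row.
final class SketchConstantEvaluator extends SketchEvaluator {
    private final Object value;
    SketchConstantEvaluator(Object value) { this.value = value; }
    @Override boolean isRowInvariant() { return true; }
    @Override Object evaluate(Object row) { return value; }
}

// Stand-in for a column reference: the "row" is just the column value here.
final class SketchColumnEvaluator extends SketchEvaluator {
    @Override Object evaluate(Object row) { return row; }
}

// Stand-in for an IN(...) evaluator: children[0] is the probe expression,
// the remaining children are the candidate values.
final class SketchInEvaluator {
    private final SketchEvaluator[] children;
    private final SketchEvaluator[] childrenNeedingPrepare;

    SketchInEvaluator(SketchEvaluator... children) {
        this.children = children;
        // Decide once, up front, which children actually depend on the row.
        List<SketchEvaluator> rowDependent = new ArrayList<>();
        for (SketchEvaluator child : children) {
            if (!child.isRowInvariant()) {
                rowDependent.add(child);
            }
        }
        this.childrenNeedingPrepare = rowDependent.toArray(new SketchEvaluator[0]);
    }

    boolean evaluate(Object row, int version) {
        // Only row-dependent children pay the per-record prepare() cost.
        for (SketchEvaluator child : childrenNeedingPrepare) {
            child.prepare(version);
        }
        Object probe = children[0].evaluate(row);
        for (int i = 1; i < children.length; i++) {
            if (probe != null && probe.equals(children[i].evaluate(row))) {
                return true;
            }
        }
        return false;
    }
}
```

In this sketch, an IN list with several thousand constants costs one {{prepare()}} call per record (for the column) instead of one per element per record.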

> Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-8313
>                 URL: https://issues.apache.org/jira/browse/HIVE-8313
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.12.0, 0.13.0, 0.14.0
>            Reporter: Mithun Radhakrishnan
>            Assignee: Mithun Radhakrishnan
>
> Consider the following query:
> {code}
> SELECT foo, bar, goo, id
> FROM myTable
> WHERE id IN ( 'A', 'B', 'C', 'D', ... , 'ZZZZZZ' );
> {code}
> One finds that when the IN clause has several thousand elements (and the 
> table has several million rows), the query above takes orders of magnitude 
> longer to run on Hive 0.12 than on, say, Hive 0.10.
> I have a possibly incomplete fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
