[ 
https://issues.apache.org/jira/browse/DRILL-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14132379#comment-14132379
 ] 

Jinfeng Ni edited comment on DRILL-1383 at 9/13/14 12:32 AM:
-------------------------------------------------------------

This feature has no impact on Drill's external functionality. An expression 
evaluated by either the compile & execute model or the interpreter model 
should return identical results. 

1. Motivation 
    Given an expression, there are two models for evaluating it:
  1) Compile & execute. This is the model Drill currently uses. At execution 
time, the expression is first materialized. A run-time code generator then 
produces an evaluation class for the expression, and that class is compiled. 
Given a list of incoming RecordBatches with an identical schema, Drill uses the 
compiled class to evaluate the expression.
  
  2) Interpreter. The interpreter model could be used at either planning time 
or execution time. 
     At planning time, the interpreter could evaluate the constant parts of an 
expression and replace each constant subexpression with its result. Another use 
is partition pruning, where the planner could use the interpreter to evaluate 
whether the filter is satisfied for a particular partition, in order to 
pre-determine a subset of candidate partitions.

     At execution time, a Drill operator may switch from the compile & execute 
model to the interpreter when the input is small and generating and compiling 
the run-time code would be time-consuming relative to the evaluation itself.
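
To make the planning-time use above more concrete, here is a minimal sketch of constant folding over a toy expression tree. All types in it (Expr, Const, Col, Add) and the fold and show helpers are hypothetical stand-ins invented for illustration; they are not Drill's LogicalExpression classes, and the real planner integration would likely look different.

```java
public class ConstantFoldingSketch {

  // Hypothetical toy expression nodes, not Drill's LogicalExpression hierarchy.
  interface Expr {}

  static final class Const implements Expr {
    final long value;
    Const(long value) { this.value = value; }
  }

  static final class Col implements Expr {
    final String name;
    Col(String name) { this.name = name; }
  }

  static final class Add implements Expr {
    final Expr left;
    final Expr right;
    Add(Expr left, Expr right) { this.left = left; this.right = right; }
  }

  // Folds bottom-up: an Add whose operands are both constants is replaced
  // by a single Const holding the interpreted result.
  static Expr fold(Expr e) {
    if (!(e instanceof Add)) {
      return e; // constants and column references are left unchanged
    }
    Add add = (Add) e;
    Expr l = fold(add.left);
    Expr r = fold(add.right);
    if (l instanceof Const && r instanceof Const) {
      return new Const(((Const) l).value + ((Const) r).value);
    }
    return new Add(l, r);
  }

  // Renders an expression as text so the effect of folding is visible.
  static String show(Expr e) {
    if (e instanceof Const) {
      return Long.toString(((Const) e).value);
    }
    if (e instanceof Col) {
      return ((Col) e).name;
    }
    Add add = (Add) e;
    return "(" + show(add.left) + " + " + show(add.right) + ")";
  }

  public static void main(String[] args) {
    // a + (1 + 2): only the constant subtree can be evaluated at planning time.
    Expr expr = new Add(new Col("a"), new Add(new Const(1), new Const(2)));
    System.out.println(show(fold(expr)));
  }
}
```

In this sketch only the constant subtree is interpreted; the column reference stays in the tree for evaluation at execution time.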

2. Interface
   The critical part of supporting the interpreter model is to statically 
generate an interpreter class for each Drill function template, whether 
built-in or UDF, during the package build process. This differs from the 
compile & execute model, where code generation happens at run time. 

As a first step, we only consider using the interpreter model for expressions 
made of DrillSimpleFunc.

{code}
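public interface DrillSimpleFuncInterpreter extends DrillFuncInterpreter {

  // One-time setup for the function, given its arguments and the incoming batch.
  public void doSetup(ValueHolder[] args, RecordBatch incoming);

  // Evaluates the function on the given arguments and returns the result holder.
  public ValueHolder doEval(ValueHolder[] args);

}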
{code}

An InterpreterBuilder class would scan all DrillSimpleFunc function templates 
in the Drill package and generate an interpreter class for each of them. Like 
run-time code generation, the static generation would use JCodeModel.

An InterpreterEvaluator class will be responsible for evaluating an expression 
against an incoming RecordBatch and putting the result into an outgoing 
ValueVector:

{code}
  public static void evaluate(RecordBatch incoming, ValueVector outVV, LogicalExpression expr)
{code}

The InterpreterEvaluator will use a visitor that extends AbstractExprVisitor. 
For each row, it walks the expression tree and performs the evaluation. When a 
node of the expression tree is a DrillSimpleFunc, it calls the statically 
generated interpreter class for that DrillSimpleFunc and obtains the result as 
a ValueHolder.
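
The row-by-row tree walk described above could be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the proposed implementation: FuncInterpreter, Expr, ExprVisitor, RowEvaluator and the name-keyed registry are hypothetical stand-ins, plain long values replace ValueHolder, and long arrays replace RecordBatch and ValueVector.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class InterpreterSketch {

  // Hypothetical stand-in for a statically generated per-function interpreter.
  interface FuncInterpreter {
    long doEval(long[] args);
  }

  // Hypothetical expression tree; Drill's real tree uses LogicalExpression.
  interface Expr {
    <T> T accept(ExprVisitor<T> visitor);
  }

  interface ExprVisitor<T> {
    T visitConstant(Constant c);
    T visitColumn(Column c);
    T visitCall(Call c);
  }

  static final class Constant implements Expr {
    final long value;
    Constant(long value) { this.value = value; }
    public <T> T accept(ExprVisitor<T> v) { return v.visitConstant(this); }
  }

  static final class Column implements Expr {
    final int index;
    Column(int index) { this.index = index; }
    public <T> T accept(ExprVisitor<T> v) { return v.visitColumn(this); }
  }

  static final class Call implements Expr {
    final String name;
    final Expr[] args;
    Call(String name, Expr... args) { this.name = name; this.args = args; }
    public <T> T accept(ExprVisitor<T> v) { return v.visitCall(this); }
  }

  // Visitor that evaluates the expression tree for a single row.
  static final class RowEvaluator implements ExprVisitor<Long> {
    private final Map<String, FuncInterpreter> registry;
    private final long[] row;

    RowEvaluator(Map<String, FuncInterpreter> registry, long[] row) {
      this.registry = registry;
      this.row = row;
    }

    public Long visitConstant(Constant c) { return c.value; }

    public Long visitColumn(Column c) { return row[c.index]; }

    public Long visitCall(Call c) {
      // Evaluate the arguments first, then hand them to the function's interpreter.
      long[] argValues = new long[c.args.length];
      for (int i = 0; i < c.args.length; i++) {
        argValues[i] = c.args[i].accept(this);
      }
      FuncInterpreter f = registry.get(c.name);
      if (f == null) {
        throw new IllegalArgumentException("no interpreter for " + c.name);
      }
      return f.doEval(argValues);
    }
  }

  // Evaluates expr once per incoming row and collects the results.
  static long[] evaluate(long[][] incoming, Expr expr, Map<String, FuncInterpreter> registry) {
    long[] out = new long[incoming.length];
    for (int r = 0; r < incoming.length; r++) {
      out[r] = expr.accept(new RowEvaluator(registry, incoming[r]));
    }
    return out;
  }

  // Hand-written interpreters standing in for the statically generated classes.
  static Map<String, FuncInterpreter> defaultRegistry() {
    Map<String, FuncInterpreter> m = new HashMap<>();
    m.put("add", a -> a[0] + a[1]);
    m.put("multiply", a -> a[0] * a[1]);
    return m;
  }

  public static void main(String[] args) {
    // col0 + (col1 * 2), evaluated over a two-row batch
    Expr expr = new Call("add", new Column(0),
        new Call("multiply", new Column(1), new Constant(2)));
    long[][] batch = { {1, 2}, {10, 20} };
    System.out.println(Arrays.toString(evaluate(batch, expr, defaultRegistry())));
  }
}
```

The sketch looks up interpreters by function name for brevity; how the real evaluator would locate the generated class for a given DrillSimpleFunc is not specified in this comment.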


   
 



> Allow interpreted tree materialization
> --------------------------------------
>
>                 Key: DRILL-1383
>                 URL: https://issues.apache.org/jira/browse/DRILL-1383
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Codegen
>            Reporter: Jacques Nadeau
>            Assignee: Jinfeng Ni
>
> The current code generation paradigm requires an expression tree to be 
> compiled before it can be evaluated.  This can be time intensive and complex 
> when we need to evaluate an expression only a small number of times.  We 
> should provide a new interface that avoids code generation for evaluation of 
> a particular expression.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
