[
https://issues.apache.org/jira/browse/DRILL-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14132379#comment-14132379
]
Jinfeng Ni commented on DRILL-1383:
-----------------------------------
This feature would have no impact on Drill's external functionality. An
expression should return identical results whether it is evaluated by the
compile & execution model or by the interpreter model.
1. Motivation
Given an expression, there are two models for evaluating it:
1) Compile & execute. This is the model Drill currently uses. At
execution time, the expression is first materialized. A run-time code
generator then produces an evaluation class for the expression, and that
class is compiled. Given a list of incoming RecordBatches with an
identical schema, Drill uses the compiled class to evaluate the expression.
2) Interpreter. The interpreter model could be used at either planning time
or execution time.
At planning time, the interpreter model could be used to evaluate the
constant parts of an expression and replace each constant subexpression with
its result. Another use is partition pruning, where the planner could use the
interpreter to evaluate whether the filter is satisfied for a particular
partition, in order to determine a subset of candidate partitions in
advance.
At execution time, a Drill operator may switch from the compile & execution
model to the interpreter if the input is small enough that generating and
compiling the run-time code would dominate the evaluation time.
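The execution-time switch described above can be sketched as a simple cost comparison. This is only an illustration: the class name, the method, and all cost figures are hypothetical and do not come from Drill.

```java
// Hypothetical cost model; none of these names or numbers come from Drill.
final class EvalModelChooser {
    enum Model { COMPILE_AND_EXECUTE, INTERPRETER }

    /**
     * Picks the cheaper model for a given row count.
     * compileCostNanos is the one-time cost of code generation plus compilation;
     * the two per-row figures are the assumed evaluation costs under each model.
     */
    static Model choose(long rowCount, long compileCostNanos,
                        long compiledPerRowNanos, long interpretedPerRowNanos) {
        long compiledTotal = compileCostNanos + rowCount * compiledPerRowNanos;
        long interpretedTotal = rowCount * interpretedPerRowNanos;
        return interpretedTotal < compiledTotal
                ? Model.INTERPRETER
                : Model.COMPILE_AND_EXECUTE;
    }

    public static void main(String[] args) {
        // Assumed figures: 50 ms to generate and compile, 10 ns vs 500 ns per row.
        System.out.println(choose(1_000, 50_000_000L, 10, 500));      // small input
        System.out.println(choose(10_000_000, 50_000_000L, 10, 500)); // large input
    }
}
```

With these assumed figures, the small input favors the interpreter and the large input favors compilation. A real operator would need measured costs rather than constants.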
2. Interface
The critical part of supporting the interpreter model is to statically
generate an interpreter class for each Drill function template, whether
built-in or UDF, during the package build process. This differs from the
compile & execution model, where code generation happens at run time.
As a first step, we only consider using the interpreter model for expressions
made up of DrillSimpleFunc functions.
    public interface DrillSimpleFuncInterpreter extends DrillFuncInterpreter {
      public void doSetup(ValueHolder[] args, RecordBatch incoming);
      public ValueHolder doEval(ValueHolder[] args);
    }
An InterpreterBuilder class would scan all the DrillSimpleFunc function
templates in the Drill package and generate an interpreter class for each of
them. Like the run-time code generation, the static generation would use
JCodeModel.
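For illustration, a generated interpreter class for a two-argument integer addition function might look roughly like the hand-written sketch below. ValueHolder, IntHolder, RecordBatch and DrillFuncInterpreter are reduced to minimal stand-ins so the example compiles on its own. AddIntInterpreter is a hypothetical name, and the actual output of the builder may look quite different.

```java
// Minimal stand-ins so the sketch is self-contained; not Drill's real classes.
interface ValueHolder {}
interface RecordBatch {}
interface DrillFuncInterpreter {}

final class IntHolder implements ValueHolder {
    int value;
    IntHolder(int value) { this.value = value; }
}

// The interface as proposed in this comment.
interface DrillSimpleFuncInterpreter extends DrillFuncInterpreter {
    void doSetup(ValueHolder[] args, RecordBatch incoming);
    ValueHolder doEval(ValueHolder[] args);
}

// Hypothetical interpreter class for an "add two ints" function template.
final class AddIntInterpreter implements DrillSimpleFuncInterpreter {
    @Override
    public void doSetup(ValueHolder[] args, RecordBatch incoming) {
        // Addition keeps no state, so there is nothing to set up.
    }

    @Override
    public ValueHolder doEval(ValueHolder[] args) {
        IntHolder left = (IntHolder) args[0];
        IntHolder right = (IntHolder) args[1];
        return new IntHolder(left.value + right.value);
    }
}

final class AddIntInterpreterDemo {
    public static void main(String[] args) {
        DrillSimpleFuncInterpreter add = new AddIntInterpreter();
        add.doSetup(new ValueHolder[0], null);
        ValueHolder result = add.doEval(
                new ValueHolder[] { new IntHolder(2), new IntHolder(3) });
        System.out.println(((IntHolder) result).value);
    }
}
```

The point of the sketch is that the function body runs directly on holder objects, with no run-time code generation or compilation step.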
An InterpreterEvaluator class will be responsible for evaluating an
expression against an incoming RecordBatch and putting the result into an
outgoing ValueVector:
    public static void evaluate(RecordBatch incoming, ValueVector outVV,
                                LogicalExpression expr)
The InterpreterEvaluator will use a visitor that extends
AbstractExprVisitor. For each row, it walks the expression tree and evaluates
each node. If a node of the expression tree is a DrillSimpleFunc, it calls
the statically generated interpreter class for that DrillSimpleFunc and gets
the result as a ValueHolder.
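The row-by-row tree walk can be sketched with a small visitor. Everything here is a simplified, hypothetical stand-in: the incoming batch is an array of int columns, the output vector is an int array, and a lookup table of lambdas plays the role of the statically generated interpreter classes. Names such as Expr, ExprVisitor and RowEvaluator are not Drill's.

```java
import java.util.Map;
import java.util.function.IntBinaryOperator;

// Hypothetical expression tree with a visitor, loosely mirroring the design above.
interface ExprVisitor<R> {
    R visitColumn(ColumnRef e);
    R visitConstant(IntConstant e);
    R visitCall(FuncCall e);
}

interface Expr {
    <R> R accept(ExprVisitor<R> v);
}

final class ColumnRef implements Expr {
    final int index;
    ColumnRef(int index) { this.index = index; }
    public <R> R accept(ExprVisitor<R> v) { return v.visitColumn(this); }
}

final class IntConstant implements Expr {
    final int value;
    IntConstant(int value) { this.value = value; }
    public <R> R accept(ExprVisitor<R> v) { return v.visitConstant(this); }
}

final class FuncCall implements Expr {
    final String name;
    final Expr left, right;
    FuncCall(String name, Expr left, Expr right) {
        this.name = name; this.left = left; this.right = right;
    }
    public <R> R accept(ExprVisitor<R> v) { return v.visitCall(this); }
}

final class RowEvaluator implements ExprVisitor<Integer> {
    // Stands in for the statically generated interpreter classes.
    private static final Map<String, IntBinaryOperator> INTERPRETERS = Map.of(
            "add", (a, b) -> a + b,
            "multiply", (a, b) -> a * b);

    private final int[][] columns; // columns[c][r]: value of column c at row r
    private int row;

    private RowEvaluator(int[][] columns) { this.columns = columns; }

    public Integer visitColumn(ColumnRef e) { return columns[e.index][row]; }
    public Integer visitConstant(IntConstant e) { return e.value; }

    public Integer visitCall(FuncCall e) {
        IntBinaryOperator f = INTERPRETERS.get(e.name);
        if (f == null) {
            throw new IllegalArgumentException("no interpreter for " + e.name);
        }
        // Evaluate the children first, then apply the function to their results.
        return f.applyAsInt(e.left.accept(this), e.right.accept(this));
    }

    // For each row, walk the whole tree and store the result in the output.
    static void evaluate(int[][] incoming, int[] out, Expr expr) {
        RowEvaluator visitor = new RowEvaluator(incoming);
        for (int r = 0; r < out.length; r++) {
            visitor.row = r;
            out[r] = expr.accept(visitor);
        }
    }
}
```

Walking the tree once per row is what makes this model slower per row than compiled code. In exchange, it has no up-front code generation or compilation cost.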
> Allow interpreted tree materialization
> --------------------------------------
>
> Key: DRILL-1383
> URL: https://issues.apache.org/jira/browse/DRILL-1383
> Project: Apache Drill
> Issue Type: Improvement
> Components: Execution - Codegen
> Reporter: Jacques Nadeau
> Assignee: Jinfeng Ni
>
> The current code generation paradigm requires an expression tree to be
> compiled before it can be evaluated. This can be time intensive and complex
> when we need to evaluate an expression only a small number of times. We
> should provide a new interface that avoids code generation for evaluation of
> a particular expression.