Feng Zhu created CALCITE-3224:
---------------------------------

             Summary: New RexNode-to-Expression CodeGen Implementation
                 Key: CALCITE-3224
                 URL: https://issues.apache.org/jira/browse/CALCITE-3224
             Project: Calcite
          Issue Type: Bug
          Components: core
    Affects Versions: 1.20.0
            Reporter: Feng Zhu
            Assignee: Feng Zhu
         Attachments: codegen.png

h3. *Background*

The current RexNode-to-Expression implementation relies on BlockBuilder's incorrect “optimizations” to inline unsafe operations. As illustrated in CALCITE-3173, when this cooperation breaks down in some special cases, it causes exceptions such as the NPEs reported in CALCITE-3142, CALCITE-3143 and CALCITE-3150.

Though we can fix these problems within the current implementation framework with some effort, as in the PR for CALCITE-3142, the logic will become more and more complex. To pursue a thorough and elegant solution, we implement a new one. Moreover, it also ensures correctness for non-optimized code.

h3. *Major Features*

* *Visitor Pattern*: Each RexNode is visited only once, in a bottom-up way, rather than being visited recursively many times with different NullAs settings.
* *Conditional Semantic*: Correctness is guaranteed naturally, even without BlockBuilder’s “optimizing” operation. Each line of code generated for a RexNode is null safe.
* *Interface Compatibility*: The implementation only updates _RexToLixTranslator_ and _RexImpTable_. Interfaces such as CallImplementor remain unchanged.

h3. *Implementation*

For each RexNode, the visitor generally generates two declaration statements, one for the value and one for its nullability. The code snippet looks like:

{code:java}
{valueVariable} = {valueExpression}
{isNullVariable} = {isNullExpression}
{code}

The visitor’s result is the variable pair (*_isNullVariable_*, *_valueVariable_*).

h3. *Example Demonstration*

Take a simple test case as an example, in which the "commission" column is nullable.

{code:java}
@Test public void testNPE() {
  CalciteAssert.hr()
      .query("select \"commission\" + 10 as s\n"
          + "from \"hr\".\"emps\"")
      .returns("S=1010\nS=510\nS=null\nS=260\n");
}
{code}

The codegen process and the non-optimized code are demonstrated in the figure below.

!codegen.png!

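As a rough textual companion to the figure, the non-optimized code for {{"commission" + 10}} might look something like the hand-written Java below. This is only a sketch under stated assumptions: the class and method names ({{NullSafeAddSketch}}, {{commissionPlusTen}}) are hypothetical, the choice of boxed versus primitive types and the {{0}} placeholder for a null value are illustrative, and the real output of _RexToLixTranslator_ will differ in detail. Only the (isNull, value) variable pairs are taken from this issue.

```java
// Hypothetical stand-in for generated code; not actual Calcite output.
class NullSafeAddSketch {

  /** Evaluates "commission" + 10 using one (isNull, value) pair per RexNode. */
  static Integer commissionPlusTen(Integer commission) {
    // RexInputRef(commission): read the field, then derive the pair.
    final Integer input = commission;
    final boolean input_isNull = input == null;
    // Assumption: 0 is used as a dummy value when the input is null,
    // so the unboxing on the right-hand branch never sees a null.
    final int input_value = input_isNull ? 0 : input;

    // RexLiteral(10): a literal is never null here.
    final int literal_value = 10;
    final boolean literal_isNull = false;

    // RexCall(+): null if either operand is null; add only when both are present.
    final boolean binary_call_isNull = input_isNull || literal_isNull;
    final int binary_call_value =
        binary_call_isNull ? 0 : input_value + literal_value;

    // Final result expression built from the last pair.
    return binary_call_isNull ? null : Integer.valueOf(binary_call_value);
  }

  public static void main(String[] args) {
    // Commission values implied by the expected result of the test case.
    Integer[] commissions = {1000, 500, null, 250};
    for (Integer c : commissions) {
      System.out.println("S=" + commissionPlusTen(c));
    }
  }
}
```

Every statement is safe to execute on its own, because each value expression is guarded by its isNull flag; no later inlining step is needed to avoid an NPE. Running {{main}} prints S=1010, S=510, S=null and S=260, matching the result expected by the test.
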
# When visiting *RexInputRef (commission)*, the visitor generates three lines of code. The result is a pair of ParameterExpressions (*_input_isNull_*, *_input_value_*).
# Then the visitor visits *RexLiteral (10)* and generates two lines of code. The result is (*_literal_isNull_*, *_literal_value_*).
# After that, when visiting *RexCall(Add)*, (*_input_isNull_*, *_input_value_*) and (*_literal_isNull_*, *_literal_value_*) can be used to implement the logic. The visitor again generates two lines of code and returns the variable pair.

In the end, the result Expression is constructed based on (*_binary_call_isNull_*, *_binary_call_value_*).

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)