[ 
https://issues.apache.org/jira/browse/CALCITE-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-4300:
-------------------------------
    Description: 
The {{EnumerableBatchNestedLoopJoin#implement}} method defines a variable named 
{{corrList}} in the dynamic code (it stores the correlating variables of the 
EnumerableBatchNestedLoopJoin, or EBNLJ, operator). Under certain circumstances 
(virtually impossible to reproduce on Calcite core, but feasible on downstream 
projects with further optimizations like IndexScan), this fixed variable name 
can lead to issues if two EBNLJ operators are nested:
{code}
/*   5 */   final com.onwbp.org.apache.calcite.linq4j.Enumerable 
_inputEnumerable = 
com.onwbp.org.apache.calcite.linq4j.EnumerableDefaults.correlateBatchJoin(..., 
..., new com.onwbp.org.apache.calcite.linq4j.function.Function1() {
/*   6 */     public com.onwbp.org.apache.calcite.linq4j.AbstractEnumerable 
apply(final java.util.List corrList) { // corrList1
/*   7 */       {
...
/*  11 */         final com.onwbp.org.apache.calcite.linq4j.Enumerable 
_inputEnumerable = 
com.onwbp.org.apache.calcite.linq4j.EnumerableDefaults.correlateBatchJoin(..., 
..., new com.onwbp.org.apache.calcite.linq4j.function.Function1() {
/*  12 */           public com.onwbp.org.apache.calcite.linq4j.Enumerable 
apply(final java.util.List corrList) { // corrList2
/*  13 */             {
...
/*  16 */                 myContext.putCorrelatingValue("$cor10.0", ((Object[]) 
corrList.get(0))[0]); // meant to read corrList1, but resolves to corrList2: problem!
/*  17 */                 myContext.putCorrelatingValue("$cor11.0", ((Object[]) 
corrList.get(1))[0]); // meant to read corrList1, but resolves to corrList2: problem!
/*  18 */                 myContext.putCorrelatingValue("$cor34.0", (String) 
corrList.get(0)); // meant to read corrList2, and does so, but only by chance
/*  19 */                 myContext.putCorrelatingValue("$cor35.0", (String) 
corrList.get(1)); // meant to read corrList2, and does so, but only by chance
.
{code}

Notice how the dynamic code declares two {{corrList}} parameters (lines 6 and 
12); however, when they are referenced (lines 16-19), the second one is always 
used, because both share the same name and the inner declaration shadows the 
outer one.
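
The shadowing can be reproduced outside Calcite with two nested anonymous 
classes. The following is only a minimal sketch: {{ShadowingDemo}} and 
{{innerSees}} are hypothetical names, and {{java.util.function.Function}} 
stands in for linq4j's {{Function1}}.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class ShadowingDemo {
  /** Returns the value that the innermost function reads through the name "corrList". */
  static String innerSees() {
    Function<List<String>, String> outer = new Function<List<String>, String>() {
      @Override public String apply(final List<String> corrList) { // plays the role of corrList1
        Function<List<String>, String> inner = new Function<List<String>, String>() {
          @Override public String apply(final List<String> corrList) { // plays the role of corrList2
            // The inner parameter shadows the outer one, so the outer list
            // cannot be reached by name from here.
            return corrList.get(0);
          }
        };
        return inner.apply(Arrays.asList("inner"));
      }
    };
    return outer.apply(Arrays.asList("outer"));
  }

  public static void main(String[] args) {
    System.out.println(innerSees()); // prints "inner"
  }
}
```

Java accepts this because a parameter declared inside a nested class body may 
reuse a name from the enclosing method, which is why the generated code 
compiles and only misbehaves at runtime.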
The fix is simple: each {{EnumerableBatchNestedLoopJoin}} must guarantee a 
unique name for its {{corrList}} in the dynamic code.
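
One possible way to hand out unique names is sketched below. This is an 
illustration only, not the actual patch: {{UniqueNames}} and its {{next}} 
method are hypothetical and are not part of Calcite's API.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: hands out variable names that are unique within one generated program. */
class UniqueNames {
  private final Set<String> used = new HashSet<>();

  /** Returns base if it is still free, otherwise base plus the first free numeric suffix. */
  String next(String base) {
    String candidate = base;
    // Set.add returns false when the name is already taken, so keep trying suffixes.
    for (int i = 1; !used.add(candidate); i++) {
      candidate = base + i;
    }
    return candidate;
  }
}

class UniqueNamesDemo {
  public static void main(String[] args) {
    UniqueNames names = new UniqueNames();
    String outer = names.next("corrList"); // name for the outer EBNLJ
    String inner = names.next("corrList"); // name for the nested EBNLJ
    System.out.println(outer + " " + inner); // prints "corrList corrList1"
  }
}
```

With distinct names, the references on lines 16-17 of the dynamic code above 
could target the outer list and the references on lines 18-19 the inner one.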





> EnumerableBatchNestedLoopJoin dynamic code generation can lead to variable 
> name issues if two EBNLJ are nested
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-4300
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4300
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Ruben Q L
>            Assignee: Ruben Q L
>            Priority: Major
>             Fix For: 1.26.0
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
