[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14988958#comment-14988958 ] ASF GitHub Bot commented on DRILL-3912: --- Github user asfgit closed the pull request at: https://github.com/apache/drill/pull/189 > Common subexpression elimination in code generation > --- > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Improvement >Reporter: Steven Phillips >Assignee: Jinfeng Ni > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14986667#comment-14986667 ] ASF GitHub Bot commented on DRILL-3912: --- Github user jinfengni commented on the pull request: https://github.com/apache/drill/pull/189#issuecomment-153236066 The revised code(adds MappingSet to ExpressionHolder) looks good to me. > Common subexpression elimination in code generation > --- > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Improvement >Reporter: Steven Phillips >Assignee: Jinfeng Ni > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14986653#comment-14986653 ] ASF GitHub Bot commented on DRILL-3912: --- Github user StevenMPhillips commented on a diff in the pull request: https://github.com/apache/drill/pull/189#discussion_r43715208 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/ClassGenerator.java --- @@ -218,11 +220,16 @@ public HoldingContainer addExpr(LogicalExpression ex) { } public HoldingContainer addExpr(LogicalExpression ex, boolean rotate) { +return addExpr(ex, rotate, false); --- End diff -- I actually think I will just remove the boolean flag altogether. Originally I wanted to limit the scope, but I think it makes sense to always have it enabled. > Common subexpression elimination in code generation > --- > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Improvement >Reporter: Steven Phillips >Assignee: Jinfeng Ni > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14986654#comment-14986654 ] ASF GitHub Bot commented on DRILL-3912: --- Github user StevenMPhillips commented on a diff in the pull request: https://github.com/apache/drill/pull/189#discussion_r43715223 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/EvaluationVisitor.java --- @@ -106,19 +177,30 @@ public HoldingContainer visitFunctionCall(FunctionCall call, ClassGenerator g @Override public HoldingContainer visitBooleanOperator(BooleanOperator op, ClassGenerator generator) throws RuntimeException { + HoldingContainer hc = getPrevious(op, generator.getMappingSet()); --- End diff -- Okay, I think I can make that change. > Common subexpression elimination in code generation > --- > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Improvement >Reporter: Steven Phillips >Assignee: Jinfeng Ni > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14986640#comment-14986640 ] ASF GitHub Bot commented on DRILL-3912: --- Github user jinfengni commented on a diff in the pull request: https://github.com/apache/drill/pull/189#discussion_r43714802 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/EvaluationVisitor.java --- @@ -106,19 +177,30 @@ public HoldingContainer visitFunctionCall(FunctionCall call, ClassGenerator g @Override public HoldingContainer visitBooleanOperator(BooleanOperator op, ClassGenerator generator) throws RuntimeException { + HoldingContainer hc = getPrevious(op, generator.getMappingSet()); --- End diff -- You put the check whether it's common subexpression in EvalVisitor, when handling with different expression. I feel the code would be much cleaner, if we add a CSEFilter in a similar to ConstantFilter. The logic will go through ConstantFilter -> CSEFilter -> EvalVisitor. The check of common subexpression and adding to the Hashmap would be done solely in CSEFilter:visitiUnknow(). In AbstractExrVisitor, other methods will by default come to visitUnknow(). In this way, there is no need to repeat the checking of common subexpression for different expression in EvalVisitor; actually no change has to be done to EvalVisitor. > Common subexpression elimination in code generation > --- > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Improvement >Reporter: Steven Phillips >Assignee: Jinfeng Ni > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14986634#comment-14986634 ] ASF GitHub Bot commented on DRILL-3912: --- Github user jinfengni commented on a diff in the pull request: https://github.com/apache/drill/pull/189#discussion_r43714617 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/ClassGenerator.java --- @@ -218,11 +220,16 @@ public HoldingContainer addExpr(LogicalExpression ex) { } public HoldingContainer addExpr(LogicalExpression ex, boolean rotate) { +return addExpr(ex, rotate, false); --- End diff -- I think this had better to be addExpr(ex, rotate, true). That is, be default, we turn on CSE. Otherwise, seems the patch only turns on CSE for Project, Filter, HashAgg, HashJoin, but not for other operators. Is there any reason that we do not by default turn on CSE here? > Common subexpression elimination in code generation > --- > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Improvement >Reporter: Steven Phillips >Assignee: Jinfeng Ni > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14986538#comment-14986538 ] ASF GitHub Bot commented on DRILL-3912: --- Github user jacques-n commented on the pull request: https://github.com/apache/drill/pull/189#issuecomment-153219893 This is a net gain. Agree with @jinfengni to track the SR replacement separately. +1 > Common subexpression elimination in code generation > --- > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Improvement >Reporter: Steven Phillips >Assignee: Jinfeng Ni > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14979747#comment-14979747 ] ASF GitHub Bot commented on DRILL-3912: --- Github user StevenMPhillips commented on the pull request: https://github.com/apache/drill/pull/189#issuecomment-152066327 @jinfengni I updated the PR. Could you take a look? > Common subexpression elimination in code generation > --- > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Improvement >Reporter: Steven Phillips >Assignee: Jinfeng Ni > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14965811#comment-14965811 ] ASF GitHub Bot commented on DRILL-3912: --- Github user jinfengni commented on a diff in the pull request: https://github.com/apache/drill/pull/189#discussion_r42557389 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/EqualityVisitor.java --- @@ -0,0 +1,322 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.expr; + +import com.google.common.collect.Lists; +import org.apache.drill.common.expression.BooleanOperator; +import org.apache.drill.common.expression.CastExpression; +import org.apache.drill.common.expression.ConvertExpression; +import org.apache.drill.common.expression.FunctionCall; +import org.apache.drill.common.expression.FunctionHolderExpression; +import org.apache.drill.common.expression.IfExpression; +import org.apache.drill.common.expression.LogicalExpression; +import org.apache.drill.common.expression.NullExpression; +import org.apache.drill.common.expression.SchemaPath; +import org.apache.drill.common.expression.TypedNullConstant; +import org.apache.drill.common.expression.ValueExpressions.BooleanExpression; +import org.apache.drill.common.expression.ValueExpressions.DateExpression; +import org.apache.drill.common.expression.ValueExpressions.Decimal18Expression; +import org.apache.drill.common.expression.ValueExpressions.Decimal28Expression; +import org.apache.drill.common.expression.ValueExpressions.Decimal38Expression; +import org.apache.drill.common.expression.ValueExpressions.Decimal9Expression; +import org.apache.drill.common.expression.ValueExpressions.DoubleExpression; +import org.apache.drill.common.expression.ValueExpressions.FloatExpression; +import org.apache.drill.common.expression.ValueExpressions.IntExpression; +import org.apache.drill.common.expression.ValueExpressions.IntervalDayExpression; +import org.apache.drill.common.expression.ValueExpressions.IntervalYearExpression; +import org.apache.drill.common.expression.ValueExpressions.LongExpression; +import org.apache.drill.common.expression.ValueExpressions.QuotedString; +import org.apache.drill.common.expression.ValueExpressions.TimeExpression; +import org.apache.drill.common.expression.ValueExpressions.TimeStampExpression; +import org.apache.drill.common.expression.visitors.AbstractExprVisitor; + +import java.util.List; + +public class EqualityVisitor extends AbstractExprVisitor { + @Override + public Boolean visitFunctionCall(FunctionCall call, LogicalExpression value) throws RuntimeException { +if (!(value instanceof FunctionCall)) { + return false; +} +if (!checkType(call, value)) { + return false; +} +if (!call.getName().equals(((FunctionCall) value).getName())) { + return false; +} +return checkChildren(call, value); + } + + @Override + public Boolean visitFunctionHolderExpression(FunctionHolderExpression holder, LogicalExpression value) throws RuntimeException { +if (!(value instanceof FunctionHolderExpression)) { + return false; +} +if (!checkType(holder, value)) { + return false; +} +if (!holder.getName().equals(((FunctionHolderExpression) value).getName())) { + return false; +} +return checkChildren(holder, value); + } + + @Override + public Boolean visitIfExpression(IfExpression ifExpr, LogicalExpression value) throws RuntimeException { +if (!(value instanceof IfExpression)) { + return false; +} +return checkChildren(ifExpr, value); + } + + @Override + public Boolean visitBooleanOperator(BooleanOperator call, LogicalExpression value) throws RuntimeException { +if (!(value instanceof BooleanOperator)) {
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14965539#comment-14965539 ] ASF GitHub Bot commented on DRILL-3912: --- Github user StevenMPhillips commented on a diff in the pull request: https://github.com/apache/drill/pull/189#discussion_r42537221 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/EqualityVisitor.java --- @@ -0,0 +1,322 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.expr; + +import com.google.common.collect.Lists; +import org.apache.drill.common.expression.BooleanOperator; +import org.apache.drill.common.expression.CastExpression; +import org.apache.drill.common.expression.ConvertExpression; +import org.apache.drill.common.expression.FunctionCall; +import org.apache.drill.common.expression.FunctionHolderExpression; +import org.apache.drill.common.expression.IfExpression; +import org.apache.drill.common.expression.LogicalExpression; +import org.apache.drill.common.expression.NullExpression; +import org.apache.drill.common.expression.SchemaPath; +import org.apache.drill.common.expression.TypedNullConstant; +import org.apache.drill.common.expression.ValueExpressions.BooleanExpression; +import org.apache.drill.common.expression.ValueExpressions.DateExpression; +import org.apache.drill.common.expression.ValueExpressions.Decimal18Expression; +import org.apache.drill.common.expression.ValueExpressions.Decimal28Expression; +import org.apache.drill.common.expression.ValueExpressions.Decimal38Expression; +import org.apache.drill.common.expression.ValueExpressions.Decimal9Expression; +import org.apache.drill.common.expression.ValueExpressions.DoubleExpression; +import org.apache.drill.common.expression.ValueExpressions.FloatExpression; +import org.apache.drill.common.expression.ValueExpressions.IntExpression; +import org.apache.drill.common.expression.ValueExpressions.IntervalDayExpression; +import org.apache.drill.common.expression.ValueExpressions.IntervalYearExpression; +import org.apache.drill.common.expression.ValueExpressions.LongExpression; +import org.apache.drill.common.expression.ValueExpressions.QuotedString; +import org.apache.drill.common.expression.ValueExpressions.TimeExpression; +import org.apache.drill.common.expression.ValueExpressions.TimeStampExpression; +import org.apache.drill.common.expression.visitors.AbstractExprVisitor; + +import java.util.List; + +public class EqualityVisitor extends AbstractExprVisitor { + @Override + public Boolean visitFunctionCall(FunctionCall call, LogicalExpression value) throws RuntimeException { +if (!(value instanceof FunctionCall)) { + return false; +} +if (!checkType(call, value)) { + return false; +} +if (!call.getName().equals(((FunctionCall) value).getName())) { + return false; +} +return checkChildren(call, value); + } + + @Override + public Boolean visitFunctionHolderExpression(FunctionHolderExpression holder, LogicalExpression value) throws RuntimeException { +if (!(value instanceof FunctionHolderExpression)) { + return false; +} +if (!checkType(holder, value)) { + return false; +} +if (!holder.getName().equals(((FunctionHolderExpression) value).getName())) { + return false; +} +return checkChildren(holder, value); + } + + @Override + public Boolean visitIfExpression(IfExpression ifExpr, LogicalExpression value) throws RuntimeException { +if (!(value instanceof IfExpression)) { + return false; +} +return checkChildren(ifExpr, value); + } + + @Override + public Boolean visitBooleanOperator(BooleanOperator call, LogicalExpression value) throws RuntimeException { +if (!(value instanceof BooleanOperator)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14965520#comment-14965520 ] ASF GitHub Bot commented on DRILL-3912: --- Github user jinfengni commented on a diff in the pull request: https://github.com/apache/drill/pull/189#discussion_r42534375 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/EqualityVisitor.java --- @@ -0,0 +1,322 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.expr; + +import com.google.common.collect.Lists; +import org.apache.drill.common.expression.BooleanOperator; +import org.apache.drill.common.expression.CastExpression; +import org.apache.drill.common.expression.ConvertExpression; +import org.apache.drill.common.expression.FunctionCall; +import org.apache.drill.common.expression.FunctionHolderExpression; +import org.apache.drill.common.expression.IfExpression; +import org.apache.drill.common.expression.LogicalExpression; +import org.apache.drill.common.expression.NullExpression; +import org.apache.drill.common.expression.SchemaPath; +import org.apache.drill.common.expression.TypedNullConstant; +import org.apache.drill.common.expression.ValueExpressions.BooleanExpression; +import org.apache.drill.common.expression.ValueExpressions.DateExpression; +import org.apache.drill.common.expression.ValueExpressions.Decimal18Expression; +import org.apache.drill.common.expression.ValueExpressions.Decimal28Expression; +import org.apache.drill.common.expression.ValueExpressions.Decimal38Expression; +import org.apache.drill.common.expression.ValueExpressions.Decimal9Expression; +import org.apache.drill.common.expression.ValueExpressions.DoubleExpression; +import org.apache.drill.common.expression.ValueExpressions.FloatExpression; +import org.apache.drill.common.expression.ValueExpressions.IntExpression; +import org.apache.drill.common.expression.ValueExpressions.IntervalDayExpression; +import org.apache.drill.common.expression.ValueExpressions.IntervalYearExpression; +import org.apache.drill.common.expression.ValueExpressions.LongExpression; +import org.apache.drill.common.expression.ValueExpressions.QuotedString; +import org.apache.drill.common.expression.ValueExpressions.TimeExpression; +import org.apache.drill.common.expression.ValueExpressions.TimeStampExpression; +import org.apache.drill.common.expression.visitors.AbstractExprVisitor; + +import java.util.List; + +public class EqualityVisitor extends AbstractExprVisitor { + @Override + public Boolean visitFunctionCall(FunctionCall call, LogicalExpression value) throws RuntimeException { +if (!(value instanceof FunctionCall)) { + return false; +} +if (!checkType(call, value)) { + return false; +} +if (!call.getName().equals(((FunctionCall) value).getName())) { + return false; +} +return checkChildren(call, value); + } + + @Override + public Boolean visitFunctionHolderExpression(FunctionHolderExpression holder, LogicalExpression value) throws RuntimeException { +if (!(value instanceof FunctionHolderExpression)) { + return false; +} +if (!checkType(holder, value)) { + return false; +} +if (!holder.getName().equals(((FunctionHolderExpression) value).getName())) { + return false; +} +return checkChildren(holder, value); + } + + @Override + public Boolean visitIfExpression(IfExpression ifExpr, LogicalExpression value) throws RuntimeException { +if (!(value instanceof IfExpression)) { + return false; +} +return checkChildren(ifExpr, value); + } + + @Override + public Boolean visitBooleanOperator(BooleanOperator call, LogicalExpression value) throws RuntimeException { +if (!(value instanceof BooleanOperator)) {
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14964355#comment-14964355 ] ASF GitHub Bot commented on DRILL-3912: --- Github user StevenMPhillips commented on the pull request: https://github.com/apache/drill/pull/189#issuecomment-149391542 Yeah, I think that makes sense. I will go ahead and do that. > Common subexpression elimination in code generation > --- > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Bug >Reporter: Steven Phillips >Assignee: Jinfeng Ni > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14964349#comment-14964349 ] ASF GitHub Bot commented on DRILL-3912: --- Github user jinfengni commented on the pull request: https://github.com/apache/drill/pull/189#issuecomment-149390932 If the only bug you saw is in Scalar Replacement after turning on CSE for other operators, would it make sense to open a different JIRA to track the SR bug, and fix the SR bug there? From what I tried, the run-time generated code seems to be valid, and would produce the correct query result, if SR is disabled for that query. > Common subexpression elimination in code generation > --- > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Bug >Reporter: Steven Phillips >Assignee: Jinfeng Ni > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14964343#comment-14964343 ] ASF GitHub Bot commented on DRILL-3912: --- Github user StevenMPhillips commented on a diff in the pull request: https://github.com/apache/drill/pull/189#discussion_r42445825 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/EvaluationVisitor.java --- @@ -106,19 +163,32 @@ public HoldingContainer visitFunctionCall(FunctionCall call, ClassGenerator g @Override public HoldingContainer visitBooleanOperator(BooleanOperator op, ClassGenerator generator) throws RuntimeException { + HoldingContainer hc = getPrevious(op); + if (hc != null) { +return hc; + } + newScope(); --- End diff -- I think this should be possible, and probably less error prone. > Common subexpression elimination in code generation > --- > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Bug >Reporter: Steven Phillips >Assignee: Jinfeng Ni > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14964342#comment-14964342 ] ASF GitHub Bot commented on DRILL-3912: --- Github user StevenMPhillips commented on a diff in the pull request: https://github.com/apache/drill/pull/189#discussion_r42445808 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/EqualityVisitor.java --- @@ -0,0 +1,322 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.expr; + +import com.google.common.collect.Lists; +import org.apache.drill.common.expression.BooleanOperator; +import org.apache.drill.common.expression.CastExpression; +import org.apache.drill.common.expression.ConvertExpression; +import org.apache.drill.common.expression.FunctionCall; +import org.apache.drill.common.expression.FunctionHolderExpression; +import org.apache.drill.common.expression.IfExpression; +import org.apache.drill.common.expression.LogicalExpression; +import org.apache.drill.common.expression.NullExpression; +import org.apache.drill.common.expression.SchemaPath; +import org.apache.drill.common.expression.TypedNullConstant; +import org.apache.drill.common.expression.ValueExpressions.BooleanExpression; +import org.apache.drill.common.expression.ValueExpressions.DateExpression; +import org.apache.drill.common.expression.ValueExpressions.Decimal18Expression; +import org.apache.drill.common.expression.ValueExpressions.Decimal28Expression; +import org.apache.drill.common.expression.ValueExpressions.Decimal38Expression; +import org.apache.drill.common.expression.ValueExpressions.Decimal9Expression; +import org.apache.drill.common.expression.ValueExpressions.DoubleExpression; +import org.apache.drill.common.expression.ValueExpressions.FloatExpression; +import org.apache.drill.common.expression.ValueExpressions.IntExpression; +import org.apache.drill.common.expression.ValueExpressions.IntervalDayExpression; +import org.apache.drill.common.expression.ValueExpressions.IntervalYearExpression; +import org.apache.drill.common.expression.ValueExpressions.LongExpression; +import org.apache.drill.common.expression.ValueExpressions.QuotedString; +import org.apache.drill.common.expression.ValueExpressions.TimeExpression; +import org.apache.drill.common.expression.ValueExpressions.TimeStampExpression; +import org.apache.drill.common.expression.visitors.AbstractExprVisitor; + +import java.util.List; + +public class EqualityVisitor extends AbstractExprVisitor { + @Override + public Boolean visitFunctionCall(FunctionCall call, LogicalExpression value) throws RuntimeException { +if (!(value instanceof FunctionCall)) { + return false; +} +if (!checkType(call, value)) { + return false; +} +if (!call.getName().equals(((FunctionCall) value).getName())) { + return false; +} +return checkChildren(call, value); + } + + @Override + public Boolean visitFunctionHolderExpression(FunctionHolderExpression holder, LogicalExpression value) throws RuntimeException { +if (!(value instanceof FunctionHolderExpression)) { + return false; +} +if (!checkType(holder, value)) { + return false; +} +if (!holder.getName().equals(((FunctionHolderExpression) value).getName())) { + return false; +} +return checkChildren(holder, value); + } + + @Override + public Boolean visitIfExpression(IfExpression ifExpr, LogicalExpression value) throws RuntimeException { +if (!(value instanceof IfExpression)) { + return false; +} +return checkChildren(ifExpr, value); + } + + @Override + public Boolean visitBooleanOperator(BooleanOperator call, LogicalExpression value) throws RuntimeException { +if (!(value instanceof BooleanOperator)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14964340#comment-14964340 ] ASF GitHub Bot commented on DRILL-3912: --- Github user StevenMPhillips commented on the pull request: https://github.com/apache/drill/pull/189#issuecomment-149390304 I actually have done some additional work on this branch to extend it to work with other operators. But I ran into the Scalar Replacement bug that was mentioned in the JIRA. So I didn't add the changes to the pull request. The tests fail because the current PR assumes there is only a single MappingSet, and thus the generated code is all going into the same methods. My solution was to clear the evaluated expression cache whenever the setMappingSet() method is called. > Common subexpression elimination in code generation > --- > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Bug >Reporter: Steven Phillips >Assignee: Jinfeng Ni > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14964316#comment-14964316 ] ASF GitHub Bot commented on DRILL-3912: --- Github user jinfengni commented on a diff in the pull request: https://github.com/apache/drill/pull/189#discussion_r42444043 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/EvaluationVisitor.java --- @@ -106,19 +163,32 @@ public HoldingContainer visitFunctionCall(FunctionCall call, ClassGenerator g @Override public HoldingContainer visitBooleanOperator(BooleanOperator op, ClassGenerator generator) throws RuntimeException { + HoldingContainer hc = getPrevious(op); + if (hc != null) { +return hc; + } + newScope(); --- End diff -- Is it possible to move the call of newScope() and leaveScope() to ClassGenerator.nestEvalBlock(), unnestEvalBlock()? Those two methods would essentially create / remove a new nested scope in the code, and is actually called for BooleanOperator (and/or) and IfExpression. Otherwise, we have to remember to call newScope() / leaveScope() everywhere the codes creates/delete a new scope, which seems to be error-prone. > Common subexpression elimination in code generation > --- > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Bug >Reporter: Steven Phillips >Assignee: Jinfeng Ni > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14964307#comment-14964307 ] ASF GitHub Bot commented on DRILL-3912: --- Github user jinfengni commented on a diff in the pull request: https://github.com/apache/drill/pull/189#discussion_r42443753 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/EqualityVisitor.java --- @@ -0,0 +1,322 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.expr; + +import com.google.common.collect.Lists; +import org.apache.drill.common.expression.BooleanOperator; +import org.apache.drill.common.expression.CastExpression; +import org.apache.drill.common.expression.ConvertExpression; +import org.apache.drill.common.expression.FunctionCall; +import org.apache.drill.common.expression.FunctionHolderExpression; +import org.apache.drill.common.expression.IfExpression; +import org.apache.drill.common.expression.LogicalExpression; +import org.apache.drill.common.expression.NullExpression; +import org.apache.drill.common.expression.SchemaPath; +import org.apache.drill.common.expression.TypedNullConstant; +import org.apache.drill.common.expression.ValueExpressions.BooleanExpression; +import org.apache.drill.common.expression.ValueExpressions.DateExpression; +import org.apache.drill.common.expression.ValueExpressions.Decimal18Expression; +import org.apache.drill.common.expression.ValueExpressions.Decimal28Expression; +import org.apache.drill.common.expression.ValueExpressions.Decimal38Expression; +import org.apache.drill.common.expression.ValueExpressions.Decimal9Expression; +import org.apache.drill.common.expression.ValueExpressions.DoubleExpression; +import org.apache.drill.common.expression.ValueExpressions.FloatExpression; +import org.apache.drill.common.expression.ValueExpressions.IntExpression; +import org.apache.drill.common.expression.ValueExpressions.IntervalDayExpression; +import org.apache.drill.common.expression.ValueExpressions.IntervalYearExpression; +import org.apache.drill.common.expression.ValueExpressions.LongExpression; +import org.apache.drill.common.expression.ValueExpressions.QuotedString; +import org.apache.drill.common.expression.ValueExpressions.TimeExpression; +import org.apache.drill.common.expression.ValueExpressions.TimeStampExpression; +import org.apache.drill.common.expression.visitors.AbstractExprVisitor; + +import java.util.List; + +public class EqualityVisitor extends AbstractExprVisitor { + @Override + public Boolean visitFunctionCall(FunctionCall call, LogicalExpression value) throws RuntimeException { +if (!(value instanceof FunctionCall)) { + return false; +} +if (!checkType(call, value)) { + return false; +} +if (!call.getName().equals(((FunctionCall) value).getName())) { + return false; +} +return checkChildren(call, value); + } + + @Override + public Boolean visitFunctionHolderExpression(FunctionHolderExpression holder, LogicalExpression value) throws RuntimeException { +if (!(value instanceof FunctionHolderExpression)) { + return false; +} +if (!checkType(holder, value)) { + return false; +} +if (!holder.getName().equals(((FunctionHolderExpression) value).getName())) { + return false; +} +return checkChildren(holder, value); + } + + @Override + public Boolean visitIfExpression(IfExpression ifExpr, LogicalExpression value) throws RuntimeException { +if (!(value instanceof IfExpression)) { + return false; +} +return checkChildren(ifExpr, value); + } + + @Override + public Boolean visitBooleanOperator(BooleanOperator call, LogicalExpression value) throws RuntimeException { +if (!(value instanceof BooleanOperator)) {
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14964301#comment-14964301 ] ASF GitHub Bot commented on DRILL-3912: --- Github user jinfengni commented on the pull request: https://github.com/apache/drill/pull/189#issuecomment-149385017 Seems to me this pull request would remove the common subexpression elimination(CSE) for Project and Filter only; it would not do CSE for other operators, like Join, Aggregation, Union, etc. I understand that the Union-type work would highly require CSE in Project/Filter. But will Union-type work would also require CSE in other operators? If this JIRA only targets for Project/Filter, I think it would make sense to explicitly specify this scope in the JIRA. Otherwise, people would expect CSE would happen to all Drill's relation operator, which is not true. Another question I have is : how easy is it to extend this patch to include CSE support for other operators? I tried to enable CSE for all operators. Turns out that bunch of unit testcase would fail. > Common subexpression elimination in code generation > --- > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Bug >Reporter: Steven Phillips >Assignee: Jinfeng Ni > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947906#comment-14947906 ] Steven Phillips commented on DRILL-3912: 1) I had not enabled CSE in hash join, so it didn't have that problem. Now that I have enabled in hash join, I am seeing the same SR error. 2) In this case, it looks like the ConstantFilter is causing the '1 + 2' and '1 + 3' parts of the expressions to be resolved first, and then 'a + 1' is no longer common. Duplicate vectors reads are removed, though. I think this behavior is probably fine. 3) I am not targeting this for 1.2. Probably for 1.3. My main motivation here was to solve a problem I was running into in my Union-type work. Function resolution when there is Union type for the input involves case statements that check the current type of the input, and then executes a branch based on that type. In this case, both the condition expression as well as both branches will reference the input. For example, 1 + a would become something like {code} case when typeOf(a) = int then 1 + cast(a as int) when typeOf(a) = varchar then 1 + cast(cast(a as varchar) as int) end {code} So you can see that a single reference to 'a' becomes 3 references. And 'a' might not just be a ValueVectorReadExpression, it could be the output from some other expression tree. And if an input has more than 2 types, or if a function has multiple Union-type inputs, the complexity of the expression increases dramatically, and the amount of generated code gets to be quite large. I needed to find some way to fix this. > Common subexpression elimination in code generation > --- > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Bug >Reporter: Steven Phillips >Assignee: Jinfeng Ni > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947837#comment-14947837 ] Jinfeng Ni commented on DRILL-3912: --- I have not tried your patch. Could you pls check two cases: 1) is t2.department_id using only one holder and one vector access in the generated code? {code} select t1.full_name from cp.`employee.json` t1, cp.`department.json` t2 where t1.department_id = t2.department_id and t1.position_id = t2.department_id"); {code} Last time I tried with my patch, after the HashJoin run-time code remove the redundant holder, it hit scalar replacement error. I'm curious why your patch did not hit SR error, if the redundant holder is removed. 2) common expression in project's multiple expressions: select a+1, a + 1 +2, a + 1 + 3 Also, are you targeting this patch for 1.2 or next release? > Common subexpression elimination in code generation > --- > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Bug >Reporter: Steven Phillips >Assignee: Jinfeng Ni > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947810#comment-14947810 ] Steven Phillips commented on DRILL-3912: It looks like your patch does a subset of my patch. It will eliminate common vector read expressions in the same JBlock. My patch will eliminate any redundant expression as long as the previously evaluated expression is in scope. For example, with filter: ( a + b > 0 and ( a + b = c or a + b = d)) the expression (a + b) would currently have to be computed 3 times, and each reference to a and b would require accessing the corresponding vectors. With my patch, (a + b) would only be calculated once. > Common subexpression elimination in code generation > --- > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Bug >Reporter: Steven Phillips >Assignee: Jinfeng Ni > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947799#comment-14947799 ] Jinfeng Ni commented on DRILL-3912: --- I actually had one patch to remove the redundancy in run-time generated code. https://issues.apache.org/jira/browse/DRILL-3754 The only failure with the patch is for it hit failure in scalar replacement logic for case where same column appear twice in a HashJoin. The run-time code seems valid, but the scalar replacement fails. {code} select t1.full_name from cp.`employee.json` t1, cp.`department.json` t2 where t1.department_id = t2.department_id and t1.position_id = t2.department_id"); {code} I just rebase on master branch. Can you check if the two patches address same problem? See: https://github.com/jinfengni/incubator-drill/tree/DRILL-3754 > Common subexpression elimination in code generation > --- > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Bug >Reporter: Steven Phillips >Assignee: Jinfeng Ni > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947797#comment-14947797 ] ASF GitHub Bot commented on DRILL-3912: --- Github user jaltekruse commented on the pull request: https://github.com/apache/drill/pull/189#issuecomment-146363523 Do you want to go ahead and add these comments as javadocs on the classes? This is the kind of information that would help a future dev understand and extend these classes in the future. > Common subexpression elimination in code generation > --- > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Bug >Reporter: Steven Phillips >Assignee: Jinfeng Ni > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947733#comment-14947733 ] Julian Hyde commented on DRILL-3912: The search space for common relational expressions is so much larger (and so different) that it needs a different approach. In Calcite we are planning to introduce a Spool operator to create temporary tables and then re-use them throughout the query as if they were materialized views. Furthermore the temporary tables might be virtual (i.e. never fully materialized, but have two or subscribers to the stream of records). CALCITE-481 is the placeholder for that work. > Common subexpression elimination in code generation > --- > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Bug >Reporter: Steven Phillips >Assignee: Steven Phillips > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947731#comment-14947731 ] Steven Phillips commented on DRILL-3912: Yes, Drill physical plans are currently trees only. What you are suggesting require a more general DAG execution. This patch only deals with common expressions within operators, and does its work right at code-generation time. > Common subexpression elimination > > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Bug >Reporter: Steven Phillips >Assignee: Steven Phillips > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947721#comment-14947721 ] Ted Dunning commented on DRILL-3912: It sounds like this only deals with common sub-expressions in expressions. A far more significant optimization would be to deal with common sub-expressions at a larger scale. A classic case is multiple re-use of a single expression in a common table expression. For instance, {code} with x as (select dir0, id from dfs.tdunning.zoom where id < 12), y as (select id, count(*) cnt from x group by id), z as (select count(distinct id) id_count from x) select dir0, x.id, y.cnt from x , y, z where x.id = y.id and y.cnt / z.id_count > 3 {code} Without good sub-expression elimination, table zoom will be scanned three times. Last I heard, DRILL doesn't optimize this away. > Common subexpression elimination > > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Bug >Reporter: Steven Phillips >Assignee: Steven Phillips > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947665#comment-14947665 ] ASF GitHub Bot commented on DRILL-3912: --- GitHub user StevenMPhillips opened a pull request: https://github.com/apache/drill/pull/189 DRILL-3912: Common subexpression elimination In EvaluationVisitor, map each logical expression subtree to the HoldingContainer produced when the subtree is evaluated, and store it in a HashMap. Tow new classes, HashVisitor and EqualityVisitor are used to determine the hashcode and equality for the map. Two logical expression trees are equal if all child nodes are equal, and the root nodes are the same type and have the same fields. Only expressions that are in scope are available in the map. When entering a new scope, the current map is copied into a new map. The current map is pushed onto a stack, and the new map becomes the current map. When that scope is left, the previous map is popped off the stack and becomes the current scope. Thus previously evaluated expression in the outer scope are available while in the inner scope, but any new expressions will not be available when in the outer scope. A new scope is entered when evaluating a boolean expression, or when evaluating either branch of an If Expression. You can merge this pull request into a Git repository by running: $ git pull https://github.com/StevenMPhillips/drill drill-3912 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/189.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #189 commit f13b4eab7ea1cb61efcda0e137006bc983c9a339 Author: Steven Phillips Date: 2015-10-07T10:44:10Z DRILL-3912: Common subexpression elimination > Common subexpression elimination > > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Bug >Reporter: Steven Phillips >Assignee: Steven Phillips > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947663#comment-14947663 ] Julian Hyde commented on DRILL-3912: Take a look at Calcite's RexProgram. It internally removes sub-expressions, topologically sorting them so that simplest expressions occur first. Also, if you have combined project and filter into one operation, it evaluates expressions used in the filter first. > Common subexpression elimination > > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Bug >Reporter: Steven Phillips >Assignee: Steven Phillips > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)