[GitHub] [spark] viirya commented on pull request #30565: [WIP][SPARK-33625][SQL] Subexpression elimination for whole-stage codegen in Filter

2021-07-20 Thread GitBox


viirya commented on pull request #30565:
URL: https://github.com/apache/spark/pull/30565#issuecomment-883045221


   I think it is much easier to solve this at query optimization time (i.e. in 
the optimizer) than at codegen. It also looks like a query optimization 
problem rather than a codegen problem.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] viirya commented on pull request #30565: [WIP][SPARK-33625][SQL] Subexpression elimination for whole-stage codegen in Filter

2021-07-06 Thread GitBox


viirya commented on pull request #30565:
URL: https://github.com/apache/spark/pull/30565#issuecomment-874534597


   Oh, it seems there is still a problem: the local inputs for the 
subexpressions of each predicate.





[GitHub] [spark] viirya commented on pull request #30565: [WIP][SPARK-33625][SQL] Subexpression elimination for whole-stage codegen in Filter

2021-07-04 Thread GitBox


viirya commented on pull request #30565:
URL: https://github.com/apache/spark/pull/30565#issuecomment-873532291


   Okay, now each subexpression is pulled up just before the individual 
predicate that uses it.





[GitHub] [spark] viirya commented on pull request #30565: [WIP][SPARK-33625][SQL] Subexpression elimination for whole-stage codegen in Filter

2021-05-17 Thread GitBox


viirya commented on pull request #30565:
URL: https://github.com/apache/spark/pull/30565#issuecomment-842762256


   Thanks @maropu! Forgot to remove the stale label.





[GitHub] [spark] viirya commented on pull request #30565: [WIP][SPARK-33625][SQL] Subexpression elimination for whole-stage codegen in Filter

2021-05-17 Thread GitBox


viirya commented on pull request #30565:
URL: https://github.com/apache/spark/pull/30565#issuecomment-842096440


   I'll find some time to continue on this.





[GitHub] [spark] viirya commented on pull request #30565: [WIP][SPARK-33625][SQL] Subexpression elimination for whole-stage codegen in Filter

2020-12-03 Thread GitBox


viirya commented on pull request #30565:
URL: https://github.com/apache/spark/pull/30565#issuecomment-738192798


   > That's also unstable from a user's perspective because it could break by 
non-local changes, e.g. filters that get pushed down. It's not a great user 
experience if adding a small predicate causes a large performance cliff.
   
   That's a good point, although it is not new. For example, Spark falls back 
between codegen and interpreted evaluation, and between the vectorized and 
non-vectorized readers. Any of these fallbacks can cause a performance cliff.
   
   Anyway, as I said above, the ideal case is to pull each subexpr in front of 
the conjunct that uses it. As this is still WIP, I will work in this 
direction.
   






[GitHub] [spark] viirya commented on pull request #30565: [WIP][SPARK-33625][SQL] Subexpression elimination for whole-stage codegen in Filter

2020-12-02 Thread GitBox


viirya commented on pull request #30565:
URL: https://github.com/apache/spark/pull/30565#issuecomment-737378127


   @bart-samwel Thanks for the comment.
   
   Yeah, I also noticed this point after creating the PR. It would be ideal 
if we could pull up each subexpr in front of the conjunct that uses it. An 
easier but sub-optimal approach is to pull up only the subexprs that are 
shared between all predicates in the Filter.
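
   A minimal sketch of the trade-off discussed here, written in plain Python 
rather than Spark's generated Java. It is not code from the PR: `expensive`, 
the three `filter_*` functions and `count_calls` are hypothetical names made 
up for illustration. It assumes a Filter with three conjuncts, where only 
the last two share a costly subexpression, and counts how often that 
subexpression is evaluated when it is hoisted in front of the whole Filter 
versus pulled in front of the first conjunct that uses it.

```python
# Hypothetical illustration only; none of these names are Spark APIs.
calls = {"expensive": 0}


def expensive(x):
    """Stands in for a costly subexpression shared by several conjuncts."""
    calls["expensive"] += 1
    return x * x


def filter_no_cse(row):
    # No elimination: conjuncts short-circuit in order, and every conjunct
    # that is reached recomputes the shared subexpression.
    return row["a"] > 0 and expensive(row["b"]) > 10 and expensive(row["b"]) < 100


def filter_hoisted(row):
    # Subexpression hoisted in front of the whole Filter: it is computed
    # even when the first conjunct alone would reject the row.
    sub = expensive(row["b"])
    return row["a"] > 0 and sub > 10 and sub < 100


def filter_per_conjunct(row):
    # Subexpression pulled in front of the first conjunct that uses it:
    # rows rejected earlier never pay for it, later conjuncts reuse it.
    if not row["a"] > 0:
        return False
    sub = expensive(row["b"])
    return sub > 10 and sub < 100


def count_calls(pred, rows):
    """Run pred over rows; return (kept rows, evaluations of expensive)."""
    calls["expensive"] = 0
    kept = [r for r in rows if pred(r)]
    return kept, calls["expensive"]


rows = [
    {"a": -1, "b": 5},
    {"a": 1, "b": 5},
    {"a": 1, "b": 2},
    {"a": -3, "b": 20},
]

for pred in (filter_no_cse, filter_hoisted, filter_per_conjunct):
    kept, n = count_calls(pred, rows)
    print(pred.__name__, "kept", len(kept), "rows with", n, "evaluations")
```

   On this input all three variants keep the same row, but the hoisted 
variant evaluates the subexpression more often than doing no elimination at 
all, because two rows are rejected by the first conjunct. Under the easier 
rule of hoisting only subexprs shared between all predicates, this subexpr 
would not be hoisted, since the first conjunct does not use it.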






[GitHub] [spark] viirya commented on pull request #30565: [WIP][SPARK-33625][SQL] Subexpression elimination for whole-stage codegen in Filter

2020-12-01 Thread GitBox


viirya commented on pull request #30565:
URL: https://github.com/apache/spark/pull/30565#issuecomment-737029373


   The codegen change is ready for review. I still need to write some 
benchmark code too.


