[GitHub] [spark] viirya commented on pull request #30565: [WIP][SPARK-33625][SQL] Subexpression elimination for whole-stage codegen in Filter
viirya commented on pull request #30565: URL: https://github.com/apache/spark/pull/30565#issuecomment-883045221 I think it is much easier to solve this during query optimization (i.e. in the optimizer) than at codegen. It also looks like a query optimization problem rather than a codegen one. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] viirya commented on pull request #30565: [WIP][SPARK-33625][SQL] Subexpression elimination for whole-stage codegen in Filter
viirya commented on pull request #30565: URL: https://github.com/apache/spark/pull/30565#issuecomment-874534597 Oh, it seems there is still a problem: the local inputs for the subexpressions of each predicate.
[GitHub] [spark] viirya commented on pull request #30565: [WIP][SPARK-33625][SQL] Subexpression elimination for whole-stage codegen in Filter
viirya commented on pull request #30565: URL: https://github.com/apache/spark/pull/30565#issuecomment-873532291 Okay, now each subexpression is pulled up just before the individual predicate that uses it.
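To illustrate why the placement matters, here is a minimal Python sketch. It is not the PR's generated code: the row layout, the `expensive` function, and the three predicates are all hypothetical. It compares evaluating a shared subexpression at the top of the filter with evaluating it just before the first predicate that uses it, assuming predicates are checked in order and a row is dropped as soon as one fails.

```python
from typing import Dict, List

Row = Dict[str, int]


class Counter:
    """Counts how many times the shared subexpression is evaluated."""

    def __init__(self) -> None:
        self.calls = 0


def expensive(row: Row, counter: Counter) -> int:
    # Hypothetical stand-in for a costly subexpression shared by two predicates.
    counter.calls += 1
    return row["a"] * row["b"]


def filter_hoisted(rows: List[Row], counter: Counter) -> List[Row]:
    # Subexpression evaluated up front for every row, even when the
    # first predicate (a > 0) would already reject the row.
    kept = []
    for row in rows:
        shared = expensive(row, counter)
        if row["a"] > 0 and shared > 10 and shared < 100:
            kept.append(row)
    return kept


def filter_before_use(rows: List[Row], counter: Counter) -> List[Row]:
    # Subexpression evaluated just before the first predicate that uses it,
    # so rows rejected by a > 0 never pay for it.
    kept = []
    for row in rows:
        if not row["a"] > 0:
            continue
        shared = expensive(row, counter)
        if shared > 10 and shared < 100:
            kept.append(row)
    return kept


rows = [
    {"a": -1, "b": 5},
    {"a": 3, "b": 5},
    {"a": 0, "b": 7},
    {"a": 20, "b": 10},
]

hoisted_counter = Counter()
before_use_counter = Counter()
hoisted_result = filter_hoisted(rows, hoisted_counter)
before_use_result = filter_before_use(rows, before_use_counter)
```

Both variants keep the same rows, but in this toy input the hoisted variant evaluates the subexpression for all four rows while the second variant evaluates it only for the two rows that pass `a > 0`.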
[GitHub] [spark] viirya commented on pull request #30565: [WIP][SPARK-33625][SQL] Subexpression elimination for whole-stage codegen in Filter
viirya commented on pull request #30565: URL: https://github.com/apache/spark/pull/30565#issuecomment-842762256 Thanks @maropu! Forgot to remove the stale label.
[GitHub] [spark] viirya commented on pull request #30565: [WIP][SPARK-33625][SQL] Subexpression elimination for whole-stage codegen in Filter
viirya commented on pull request #30565: URL: https://github.com/apache/spark/pull/30565#issuecomment-842096440 I'll find some time to continue on this.
[GitHub] [spark] viirya commented on pull request #30565: [WIP][SPARK-33625][SQL] Subexpression elimination for whole-stage codegen in Filter
viirya commented on pull request #30565: URL: https://github.com/apache/spark/pull/30565#issuecomment-738192798
> That's also unstable from a user's perspective because it could break by non-local changes, e.g. filters that get pushed down. It's not a great user experience if adding a small predicate causes a large performance cliff.

That's a good point, although it is not new. For example, Spark already falls back between codegen and interpreted execution, and between vectorized and non-vectorized readers. All of these fallbacks can cause a performance cliff. Anyway, as I said above, the ideal case is to pull each subexpr in front of the conjunct that uses it. As this is WIP, I will work in this direction.
[GitHub] [spark] viirya commented on pull request #30565: [WIP][SPARK-33625][SQL] Subexpression elimination for whole-stage codegen in Filter
viirya commented on pull request #30565: URL: https://github.com/apache/spark/pull/30565#issuecomment-737378127 @bart-samwel Thanks for the comment. Yeah, I also noticed this point after creating the PR. It would be ideal if we could pull up each subexpr in front of the conjunct that uses it. An easier but sub-optimal approach is to pull up only the subexprs that are shared by all predicates in the Filter.
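The two approaches mentioned above can be sketched in Python as follows. This is only an illustration under stated assumptions, not code from the PR: each conjunct is modelled as the set of subexpression names it contains, and `shared_by_all` and `first_use` are hypothetical helper names.

```python
from typing import Dict, List, Set


def shared_by_all(conjuncts: List[Set[str]]) -> Set[str]:
    # Easier, sub-optimal approach: only subexpressions that appear in
    # every conjunct are pulled up to the front of the Filter.
    if not conjuncts:
        return set()
    return set.intersection(*conjuncts)


def first_use(conjuncts: List[Set[str]]) -> Dict[str, int]:
    # Ideal approach: every subexpression used by two or more conjuncts is
    # placed right before the first conjunct that uses it.
    counts: Dict[str, int] = {}
    first: Dict[str, int] = {}
    for index, subexprs in enumerate(conjuncts):
        for name in subexprs:
            counts[name] = counts.get(name, 0) + 1
            first.setdefault(name, index)
    return {name: first[name] for name in counts if counts[name] >= 2}


# Hypothetical filter with three conjuncts.
conjuncts = [
    {"f(a)"},          # conjunct 0
    {"f(a)", "g(b)"},  # conjunct 1
    {"f(a)", "g(b)"},  # conjunct 2
]

pulled_to_front = shared_by_all(conjuncts)
placement = first_use(conjuncts)
```

In this example the easier approach only eliminates `f(a)`, because `g(b)` is missing from conjunct 0, while the per-conjunct placement also covers `g(b)` by evaluating it just before conjunct 1.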
[GitHub] [spark] viirya commented on pull request #30565: [WIP][SPARK-33625][SQL] Subexpression elimination for whole-stage codegen in Filter
viirya commented on pull request #30565: URL: https://github.com/apache/spark/pull/30565#issuecomment-737029373 The codegen change is ready for review. I need to write some benchmark code too.