[GitHub] [spark] viirya commented on pull request #33142: [SPARK-35940][SQL] Refactor EquivalentExpressions to make it more efficient

2021-07-03 Thread GitBox


viirya commented on pull request #33142:
URL: https://github.com/apache/spark/pull/33142#issuecomment-873424131


   Thanks! Merging to master/branch-3.2.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] viirya commented on pull request #33142: [SPARK-35940][SQL] Refactor EquivalentExpressions to make it more efficient

2021-07-02 Thread GitBox


viirya commented on pull request #33142:
URL: https://github.com/apache/spark/pull/33142#issuecomment-873349727


   @maropu Any more comments? Otherwise I will merge this tomorrow. Thanks.





[GitHub] [spark] viirya commented on pull request #33142: [SPARK-35940][SQL] Refactor EquivalentExpressions to make it more efficient

2021-06-30 Thread GitBox


viirya commented on pull request #33142:
URL: https://github.com/apache/spark/pull/33142#issuecomment-871634042


   > Can you briefly introduce your idea? Sorting by height is stable and fast
   > now.
   
   I haven't looked at the details yet. Is sorting by height guaranteed to 
order expressions child before parent? I said the current sorting is not 
reliable because it might miss some cases: two expressions with no 
child-parent relation have no clear comparison order. So sorting is somewhat 
unreliable for expressions. Does sorting by height solve that?
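
For illustration only, here is a minimal Python sketch of the property being asked about. It is a toy model, not Spark code: `Expr`, `height`, `contains` and `children_first` are hypothetical names invented here. The sketch relies on one fact: a subexpression's height is always strictly smaller than the height of any expression that contains it. Sorting by that integer key therefore places every child before its parents, and unrelated expressions just land in some arbitrary but consistent position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    # Hypothetical toy expression node; not Spark's Expression class.
    op: str
    children: tuple = ()


def height(e):
    # A leaf has height 1; a parent is one higher than its tallest child.
    return 1 + max((height(c) for c in e.children), default=0)


def contains(parent, child):
    # True if `child` is a strict descendant of `parent`.
    return any(c == child or contains(c, child) for c in parent.children)


def children_first(exprs):
    # height is a total integer key, so the comparison is well defined
    # even for expressions with no child-parent relation.
    return sorted(exprs, key=height)


a, b, c = Expr("a"), Expr("b"), Expr("c")
add = Expr("add", (a, b))
mul = Expr("mul", (add, c))
neg = Expr("neg", (c,))  # unrelated to add and mul

ordered = children_first([mul, neg, add])

# No expression appears before one of its own descendants.
for i, earlier in enumerate(ordered):
    for later in ordered[i + 1:]:
        assert not contains(earlier, later)
```

Whether the height tracking in this PR matches this model is an assumption; the sketch only shows why an integer height key avoids the partial-order problem described above.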
   





[GitHub] [spark] viirya commented on pull request #33142: [SPARK-35940][SQL] Refactor EquivalentExpressions to make it more efficient

2021-06-29 Thread GitBox


viirya commented on pull request #33142:
URL: https://github.com/apache/spark/pull/33142#issuecomment-871104576


   > Can you briefly introduce your idea? Sorting by height is stable and fast
   > now.
   
   Basically, the steps are:
   
   1. Populate the `SubExprEliminationState` map for all subexprs (they do not 
need to be sorted). Only create the value and isNull variables; don't do 
codegen yet.
   2. Iterate over all subexprs to do codegen. Because expression codegen looks 
at the map to replace subexprs, any subexpr among the children will be replaced 
and chained. So we don't need to sort subexprs in advance.
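
The two steps above can be sketched with a toy model. This is a hedged Python illustration, not Spark's codegen: `Expr`, `SubExprState`, `gen_code` and `eliminate` are hypothetical names, and "code" here is just a string. The first pass only allocates variable names for every subexpression. The second pass generates each subexpression's code and replaces any child found in the map with its variable name, so the input order does not matter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    # Hypothetical toy expression node; not Spark's Expression class.
    op: str
    children: tuple = ()


@dataclass
class SubExprState:
    # Toy stand-in for the value/isNull variable names held per subexpr.
    value: str
    is_null: str


def gen_code(expr, states, top_level=True):
    # Below the top level, a subexpr found in the map is replaced by its
    # pre-allocated variable instead of being generated again.
    if not top_level and expr in states:
        return states[expr].value
    if not expr.children:
        return expr.op
    args = ", ".join(gen_code(c, states, top_level=False) for c in expr.children)
    return f"{expr.op}({args})"


def eliminate(subexprs):
    # Step 1: allocate variable names for every subexpr, in any order.
    states = {
        e: SubExprState(f"subExprValue_{i}", f"subExprIsNull_{i}")
        for i, e in enumerate(subexprs)
    }
    # Step 2: generate code; children already in the map are chained by name.
    return {states[e].value: gen_code(e, states) for e in subexprs}


a, b = Expr("a"), Expr("b")
add = Expr("add", (a, b))
mul = Expr("mul", (add, add))

# The parent is listed before its child on purpose: no sorting is needed.
for name, code in eliminate([mul, add]).items():
    print(f"{name} = {code}")
# subExprValue_0 = mul(subExprValue_1, subExprValue_1)
# subExprValue_1 = add(a, b)
```

The sketch leaves out how the generated definitions are later ordered or evaluated, and how isNull is computed; those details are outside what the comment describes.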





[GitHub] [spark] viirya commented on pull request #33142: [SPARK-35940][SQL] Refactor EquivalentExpressions to make it more efficient

2021-06-29 Thread GitBox


viirya commented on pull request #33142:
URL: https://github.com/apache/spark/pull/33142#issuecomment-870971121


   > track the "height" of common subexpressions, to quickly do child-parent
   > sort.
   
   About this: I think the sorting is not reliable, since it is hard to do a 
child-parent sort. As I mentioned before, I have another proposal that gets 
rid of the sort.

