leanken commented on pull request #29104:
URL: https://github.com/apache/spark/pull/29104#issuecomment-662286709


   > @leanken ... IMHO, this is getting close and really you have done an 
amazing job not just trying out all the different ideas and approaches but also 
paying attention to testing etc.
   > 
   > As for my vote on the question of different node vs hacking at the BHJ: I 
think I am finally convinced that having a separate node is the way to go here 
since there is just too much difference wrt to BHJ. Your answer on 
`isUniqueKeyCodePath` was the clincher for me. It's indeed better to have a 
different node than making another core node harder to follow. It would also 
restrict the performance impact of this PR to avoid making any inadvertent 
regressions into core BHJ.
   > 
   > My last remaining request is wrt to improving the unit testing coverage. I 
was wondering if we can copy paste some of the not-in query tests from 
SubQuerySuite and in addition to testing the results, also assert that the 
optimization kicks in or not. Ideally it would be great to reuse those tests 
without duplicating them, but its better to duplicate than not test them with 
this special optimization !.
   > 
   > Thanks.
   
   add case in SubquerySuite. check on both result and whether the newExecNode 
hit.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to