leanken commented on pull request #29104:
URL: https://github.com/apache/spark/pull/29104#issuecomment-661672887


   > In ExtractEquiJoinKeys (in patterns.scala) there is code like this:
   > 
   > ```
   >         case EqualTo(l, r) if canEvaluate(l, left) && canEvaluate(r, right) => Some((l, r))
   >         case EqualTo(l, r) if canEvaluate(l, right) && canEvaluate(r, left) => Some((r, l))
   > ```
   > 
   > So I am wondering whether a LeftAnti join can actually have the left side 
as the build side and the right side as the streaming side? Is it possible that 
we won't optimize that case? Do you think it makes sense to ensure that BNLJ is 
not present for any not-in query (i.e., this optimization should always kick in)?
   > 
   > I am still a bit surprised that you didn't have to modify any .sql.out 
files because the plan would have changed from BNLJ to BHJ.
   
   I think when both canBroadcastBySize(left) and canBroadcastBySize(right) are 
false, the ExtractSingleColumnNullAwareAntiJoin pattern will not match, so the 
plan still falls back to the original BNLJ, FYI.
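
To make the fallback concrete, here is a minimal toy sketch of the size check described above. It is not Spark's planner code: `Side`, `NotInPlanning`, `chooseStrategy`, and the threshold parameter are hypothetical names used only for illustration, and the real pattern also checks the join type and condition shape, which this model ignores.

```scala
// Toy model only: these types and names are hypothetical, not Spark's planner API.
sealed trait JoinStrategy
case object BroadcastHashJoin extends JoinStrategy       // BHJ
case object BroadcastNestedLoopJoin extends JoinStrategy // BNLJ

// A join input reduced to its estimated size (an assumption for this sketch).
final case class Side(sizeInBytes: Long)

object NotInPlanning {
  // Stand-in for canBroadcastBySize: a side qualifies if it fits under the threshold.
  def canBroadcastBySize(side: Side, threshold: Long): Boolean =
    side.sizeInBytes >= 0 && side.sizeInBytes <= threshold

  // If neither side is small enough, the null-aware anti-join pattern is treated
  // as not matching, and the plan stays on BNLJ.
  def chooseStrategy(left: Side, right: Side, threshold: Long): JoinStrategy =
    if (canBroadcastBySize(left, threshold) || canBroadcastBySize(right, threshold))
      BroadcastHashJoin
    else
      BroadcastNestedLoopJoin
}
```

For example, with a hypothetical 10 MiB threshold, `NotInPlanning.chooseStrategy(Side(1L << 30), Side(1L << 30), 10L * 1024 * 1024)` stays on `BroadcastNestedLoopJoin`, while shrinking either side to `Side(1024)` selects `BroadcastHashJoin` in this model.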

