boryaferz commented on PR #34602:
URL: https://github.com/apache/spark/pull/34602#issuecomment-4320569721

   @Liulietong Hello, I would like to clarify one sentence in the description:
   > "In the cases where skewedJoin is not last stage, OptimizeSkewedJoin may 
not work because the number of collected shuffleStages is more than 2."
   
   I think the core issue is that the total number of collected shuffle stages 
is greater than 2, not whether the skewed join is the last stage.
   
   For example, if the skewed join is the last stage but 3 earlier shuffle 
stages from other joins are also collected, the total count is still greater 
than 2, so the old rule would fail in that case as well.
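
To make the point concrete, here is a toy model of the check. It is only a sketch: `Node`, `collect_shuffle_stages`, and `old_rule_applies` are hypothetical names, not Spark classes, and it assumes the old rule fires only when exactly two shuffle stages are collected, as the quoted description implies.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a physical plan node (not a Spark class)."""
    name: str
    is_shuffle_stage: bool = False
    children: List["Node"] = field(default_factory=list)


def collect_shuffle_stages(plan: Node) -> List[Node]:
    """Collect shuffle stages, stopping the descent at each one found."""
    if plan.is_shuffle_stage:
        return [plan]
    return [s for child in plan.children for s in collect_shuffle_stages(child)]


def old_rule_applies(plan: Node) -> bool:
    """Assumed behaviour of the old rule: optimize only with exactly 2 stages."""
    return len(collect_shuffle_stages(plan)) == 2


# A single join over two shuffle stages: the assumed check passes.
two_stage_plan = Node("SkewedJoin", children=[
    Node("stage0", is_shuffle_stage=True),
    Node("stage1", is_shuffle_stage=True),
])

# The skewed join is the last (top-most) operator, but a second join in the
# same plan contributes an extra shuffle stage, so 3 stages are collected.
three_stage_plan = Node("SkewedJoin", children=[
    Node("OtherJoin", children=[
        Node("stage0", is_shuffle_stage=True),
        Node("stage1", is_shuffle_stage=True),
    ]),
    Node("stage2", is_shuffle_stage=True),
])
```

In this model `old_rule_applies(three_stage_plan)` is false even though the skewed join is the last operator, which is the case I am describing.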
   
   Am I missing something?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
