boryaferz commented on PR #34602: URL: https://github.com/apache/spark/pull/34602#issuecomment-4320569721
@Liulietong Hello, I have a small clarification about this sentence from the description:

> "In the cases where skewedJoin is not last stage, OptimizeSkewedJoin may not work because the number of collected shuffleStages is more than 2."

I think the core issue is the total number of collected shuffle stages being greater than 2, not whether the skewed join is the last stage. For example, if the skewed join is the last stage but there are 3 earlier shuffle stages from other joins, the total count is still greater than 2, and the old rule would fail in that case as well. Am I missing something?
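To make the point concrete, here is a minimal Python sketch. It is a toy model, not Spark code: `PlanNode`, `collect_shuffle_stages`, and `old_rule_applies` are hypothetical names, and the "exactly two collected stages" check is my assumption about the old rule, inferred from the sentence quoted from the description. In both plans the skewed join is the root (the last stage), yet only the two-table plan passes the check.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """Hypothetical stand-in for a physical plan node."""
    name: str
    children: List["PlanNode"] = field(default_factory=list)
    is_shuffle_stage: bool = False


def collect_shuffle_stages(plan: PlanNode) -> List[PlanNode]:
    """Collect every shuffle stage reachable from `plan`, stopping at each stage."""
    if plan.is_shuffle_stage:
        return [plan]
    return [s for child in plan.children for s in collect_shuffle_stages(child)]


def old_rule_applies(plan: PlanNode) -> bool:
    """Assumed old behaviour: optimize only when exactly two stages are collected."""
    return len(collect_shuffle_stages(plan)) == 2


def shuffle(name: str) -> PlanNode:
    """Build a leaf node that represents a shuffle stage."""
    return PlanNode(name, is_shuffle_stage=True)


# Plain two-table join: two shuffle stages feed the skewed join.
two_table = PlanNode("skewed_join", [shuffle("a"), shuffle("b")])

# The skewed join is still the root, but another join below it adds a third stage.
multi_join = PlanNode(
    "skewed_join",
    [PlanNode("other_join", [shuffle("a"), shuffle("b")]), shuffle("c")],
)

print(old_rule_applies(two_table))   # the count is 2
print(old_rule_applies(multi_join))  # the count is 3
```

Under these assumptions, the position of the skewed join does not change the outcome; only the number of collected stages does.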
