ChenMichael edited a comment on pull request #34684:
URL: https://github.com/apache/spark/pull/34684#issuecomment-976849576


   For this problem to manifest, join planning has to happen between the 
time an InMemoryRelation is converted to an RDD and the time the job 
executing that RDD completes. Because AQE replans repeatedly, this can happen 
when the InMemoryRelations are on different levels. With AQE off, join 
planning happens only once, so there's no scenario where part of the query 
materializes the InMemoryRelation and join planning then runs on another part 
of the query with inaccurate stats. I guess this could still happen with AQE 
off if concurrent jobs share the same InMemoryRelation, though.
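
   To make the timing window concrete, here is a rough toy model in plain 
Python. It is not Spark code: ToyCachedRelation, plan_join and 
BROADCAST_THRESHOLD are hypothetical names. It assumes the relation reports 
its plan-based estimate until its RDD is built, and after that reports only 
the bytes accumulated so far by the caching job.

```python
from dataclasses import dataclass

# Hypothetical broadcast cutoff in bytes; not read from any Spark config.
BROADCAST_THRESHOLD = 10 * 1024 * 1024


@dataclass
class ToyCachedRelation:
    """Hypothetical stand-in for a cached relation (not Spark's InMemoryRelation)."""
    plan_estimate: int          # size estimate derived from the underlying plan
    true_size: int              # actual size once the cache is fully populated
    rdd_built: bool = False     # True once the relation has been turned into an RDD
    bytes_accumulated: int = 0  # bytes counted so far by the caching job

    def build_rdd(self) -> None:
        # Assumption: from here on, stats come from the accumulator, which is
        # still empty because no job has run yet.
        self.rdd_built = True

    def run_job(self, fraction: float = 1.0) -> None:
        # The job executing the RDD fills the accumulator as it progresses.
        self.bytes_accumulated = int(self.true_size * fraction)

    def size_in_bytes(self) -> int:
        return self.bytes_accumulated if self.rdd_built else self.plan_estimate


def plan_join(rel: ToyCachedRelation) -> str:
    """Pick a join strategy from whatever size the relation reports right now."""
    return "broadcast" if rel.size_in_bytes() <= BROADCAST_THRESHOLD else "shuffle"


size = 500 * 1024 * 1024
rel = ToyCachedRelation(plan_estimate=size, true_size=size)

before = plan_join(rel)  # planned before the RDD exists: plan estimate is used
rel.build_rdd()
during = plan_join(rel)  # planned inside the window: accumulator is still 0
rel.run_job()
after = plan_join(rel)   # planned after the job completes: stats are accurate

print(before, during, after)
```

   Only the middle call, made between build_rdd() and run_job(), sees the 
understated size. In this model that corresponds to a replanning step (or a 
concurrent job) that lands inside the window.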


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


