Github user JoshRosen commented on the issue:

https://github.com/apache/spark/pull/21342

I'm also in favor of delaying for a couple of days for a more detailed review, because historically these types of changes have been high-risk. The risk calculus might be different if this were fixing a critical "wrong result" correctness bug, but this seems to be a longstanding annoyance that causes either poor performance or visible job failures, so I don't see extreme urgency to get this in immediately. Let's take a bit of time to ponder and sleep on it, to be sure we've thought through the corner cases (to be clear, I do think this patch is generally headed in a good direction).

Some specifics:

1. The old code had some places which purposely caught `OutOfMemoryError` thrown from layers of the spilling code. I don't know whether the expected sources of OOMs were only the throw sites modified here, or whether the intent was also to catch OOMs from allocating arrays that are too big, etc. The latter would have been a dodgy pattern and a bad idea in the first place, but I wanted to note it as a potential risk of unintended / implicit behavior changes. If we want to be super conservative about that, we could update the throw sites but keep the catch sites, extending them to catch _both_ OOM cases.

2. Should we maybe throw an `OutOfMemoryError` subclass and then pattern-match on our subclass in a couple of specific places? That might help reduce the change surface.
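The suggestion in point 2 could look roughly like the sketch below (the class name `SparkOutOfMemoryError` and the `acquireMemory` helper are hypothetical illustrations for this discussion, not code from the patch): the throw sites raise a dedicated subclass, so catch sites can handle the managed-memory case without also swallowing genuine JVM heap exhaustion.

```java
// Hypothetical sketch: a dedicated OOM subclass lets catch sites distinguish
// a managed "cannot acquire execution memory" condition from a real JVM OOM.
class SparkOutOfMemoryError extends OutOfMemoryError {
    SparkOutOfMemoryError(String msg) {
        super(msg);
    }
}

public class OomSubclassSketch {
    // Throw site: signals a managed-memory shortfall, not actual heap exhaustion.
    static void acquireMemory(long bytes) {
        throw new SparkOutOfMemoryError("Unable to acquire " + bytes + " bytes");
    }

    public static void main(String[] args) {
        try {
            acquireMemory(64L * 1024 * 1024);
        } catch (SparkOutOfMemoryError e) {
            // Catch site: handles only the managed case (e.g. by spilling).
            // A genuine java.lang.OutOfMemoryError thrown elsewhere would NOT
            // match this clause, so real heap OOMs still propagate.
            System.out.println("caught managed OOM: " + e.getMessage());
        }
    }
}
```

Because `SparkOutOfMemoryError` is still an `OutOfMemoryError`, any legacy catch sites that match the parent type keep working unchanged, which is what limits the change surface.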