changgyoopark-db commented on code in PR #47955: URL: https://github.com/apache/spark/pull/47955#discussion_r1746001339
########## sql/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectExecutePlanHandler.scala: ########## @@ -31,6 +32,18 @@ class SparkConnectExecutePlanHandler(responseObserver: StreamObserver[proto.Exec try { executeHolder.eventsManager.postStarted() executeHolder.start() + } catch { + // Errors raised before the execution holder has finished spawning a thread are considered + // plan execution failure, and the client should not try reattaching it afterwards. + case s: SparkThrowable => + SparkConnectService.executionManager.removeExecuteHolder(executeHolder.key) + throw s + case t: Throwable => + SparkConnectService.executionManager.removeExecuteHolder(executeHolder.key) + throw SparkException.internalError(t.getMessage(), t) Review Comment: Hm.. good point, yeah, OOM is definitely not a spark error. On the other hand, OOM is definitely not NonFatal, so not sure if wrapping it in a NonFatal error type is a good idea. Unfortunately, not wrapping OOM is also problematic because (EDGE) the client will retry (OOM is translated into UNKNOWN) and will eventually get OPERATION_NOT_FOUND <- this isn't good either. I'll firstly remove the conversion, and address the client error issue (EDGE) separately. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org