Juliusz Sompolski created SPARK-44833:
-----------------------------------------

             Summary: Spark Connect reattach when initial ExecutePlan didn't reach server doing too eager Reattach
                 Key: SPARK-44833
                 URL: https://issues.apache.org/jira/browse/SPARK-44833
             Project: Spark
          Issue Type: Improvement
          Components: Connect
    Affects Versions: 3.5.0
            Reporter: Juliusz Sompolski


In
{code:java}
        case ex: StatusRuntimeException
            if Option(StatusProto.fromThrowable(ex))
              .exists(_.getMessage.contains("INVALID_HANDLE.OPERATION_NOT_FOUND")) =>
          if (lastReturnedResponseId.isDefined) {
            throw new IllegalStateException(
              "OPERATION_NOT_FOUND on the server but responses were already received from it.",
              ex)
          }
          // Try a new ExecutePlan, and throw upstream for retry.
->        iter = rawBlockingStub.executePlan(initialRequest)
->        throw new GrpcRetryHandler.RetryException
{code}
we call executePlan and then throw RetryException so that the exception is handled upstream.

Then it goes to
{code:java}
    retry {
      if (firstTry) {
        // on first try, we use the existing iter.
        firstTry = false
      } else {
        // on retry, the iter is borked, so we need a new one
->      iter = rawBlockingStub.reattachExecute(createReattachExecuteRequest())
      }
{code}
and because firstTry is already false at this point, it immediately calls reattachExecute, replacing the iter that executePlan just created.

This causes no failure: the reattach succeeds and attaches to the query, and the original executePlan gets detached. But the reattach is too eager and could be avoided.

The same issue is also present in the Python client's reattach.py.
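
One possible shape for the improvement, sketched as a self-contained model and not as the actual client code: remember that iter was just recreated by executePlan and skip the reattach on the next retry. Everything below is a stand-in invented for illustration (FakeStub, Client, Unavailable, OperationNotFound, the retry helper, and the iterIsFresh and skipReattachAfterNewExecute flags); only the executePlan / reattachExecute / RetryException flow mirrors the snippets above. The model assumes the initial ExecutePlan is lost before reaching the server.

```scala
import scala.collection.mutable.ArrayBuffer

object EagerReattachSketch {
  // Stand-ins for GrpcRetryHandler.RetryException and the gRPC status errors.
  final class RetryException extends RuntimeException("retry")
  final class OperationNotFound extends RuntimeException("INVALID_HANDLE.OPERATION_NOT_FOUND")
  final class Unavailable extends RuntimeException("UNAVAILABLE")

  // Fake server stub that records every RPC issued, in order.
  final class FakeStub {
    val calls = ArrayBuffer.empty[String]
    private var operationKnown = false
    private var executeCalls = 0

    def executePlan(): Iterator[String] = {
      calls += "executePlan"
      executeCalls += 1
      // Assumption: the very first ExecutePlan never reaches the server.
      if (executeCalls == 1) failing(new Unavailable)
      else { operationKnown = true; Iterator("response-1") }
    }

    def reattachExecute(): Iterator[String] = {
      calls += "reattachExecute"
      if (operationKnown) Iterator("response-1") else failing(new OperationNotFound)
    }

    // Like a lazy gRPC stream: the error only surfaces when the iterator is used.
    private def failing(e: RuntimeException): Iterator[String] = new Iterator[String] {
      def hasNext: Boolean = throw e
      def next(): String = throw e
    }
  }

  // Minimal retry loop: re-run the body while it throws RetryException.
  def retry[T](maxAttempts: Int)(body: => T): T = {
    var attempts = 0
    var result: Option[T] = None
    while (result.isEmpty) {
      attempts += 1
      try {
        result = Some(body)
      } catch {
        case _: RetryException if attempts < maxAttempts => ()
      }
    }
    result.get
  }

  final class Client(stub: FakeStub, skipReattachAfterNewExecute: Boolean) {
    private var iter = stub.executePlan()
    private var firstTry = true
    private var iterIsFresh = false // hypothetical flag: iter came from a new executePlan

    def next(): String = retry(maxAttempts = 5) {
      if (firstTry) {
        firstTry = false
      } else if (skipReattachAfterNewExecute && iterIsFresh) {
        // iter was just recreated by executePlan; use it as is.
      } else {
        iter = stub.reattachExecute()
      }
      iterIsFresh = false
      try iter.next()
      catch {
        case _: Unavailable => throw new RetryException
        case _: OperationNotFound =>
          // Try a new ExecutePlan, and throw upstream for retry.
          iter = stub.executePlan()
          iterIsFresh = true
          throw new RetryException
      }
    }
  }

  // Returns the first response and the ordered list of RPCs that were issued.
  def run(skipReattachAfterNewExecute: Boolean): (String, List[String]) = {
    val stub = new FakeStub
    val response = new Client(stub, skipReattachAfterNewExecute).next()
    (response, stub.calls.toList)
  }

  def main(args: Array[String]): Unit = {
    println(run(skipReattachAfterNewExecute = false))
    println(run(skipReattachAfterNewExecute = true))
  }
}
```

In this model the current behaviour issues executePlan, reattachExecute, executePlan, reattachExecute, while the variant with the flag stops after the second executePlan and reads from that stream directly. Both return the same first response, which matches the observation that the eager reattach is wasteful but not a failure.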