Hi, I've been reviewing YarnAllocator.updateResourceRequests and think the other branch is...too verbose (and may be deceiving that it's more complex than it really is). I hope to get corrected.
The source code of the method is https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala#L294. The method is about requesting new or cancelling outstanding YARN container requests for executors. Right? It starts by checking the executor container requests using getPendingAllocate [1] and branches by missing (the other branch uses numPendingAllocate also which is going to be crucial in my understanding of the method). So, in the other branch, when the number of outstanding YARN containers is too much [2], the method calls [3] amClient.getMatchingRequests(RM_REQUEST_PRIORITY, ANY_HOST, resource) which is exactly getPendingAllocate [4] (!) Is that correct? If my understanding is correct, the code does not need to call getPendingAllocate in the branch since it had already requested it in [1] (at the very beginning) and since we're inside the branch the code does not have to check "matchingRequests.isEmpty" either. My understanding is that the code should be as simple as the following: } else if (numPendingAllocate > 0 && missing < 0) { val numToCancel = math.min(numPendingAllocate, -missing) logInfo(s"Canceling requests for $numToCancel executor container(s) to have a new desired " + s"total $targetNumExecutors executors.") pendingAllocate.take(numToCancel).foreach(amClient.removeContainerRequest) } i.e. just a single pendingAllocate.take... Is that correct? THANKS a lot for reading thus far!!! I greatly appreciate your time and effort to help me understand Spark. p.s. I'm ready with a PR. [1] https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala#L295 [2] https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala#L355 [3] https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala#L360 [4] https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala#L200 Pozdrawiam, Jacek Laskowski ---- https://medium.com/@jaceklaskowski/ Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark Follow me at https://twitter.com/jaceklaskowski --------------------------------------------------------------------- To unsubscribe e-mail: dev-unsubscr...@spark.apache.org