Github user redsanket commented on a diff in the pull request: https://github.com/apache/spark/pull/22173#discussion_r216069578 --- Diff: common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java --- @@ -281,4 +282,31 @@ public Properties cryptoConf() { public long maxChunksBeingTransferred() { return conf.getLong("spark.shuffle.maxChunksBeingTransferred", Long.MAX_VALUE); } + + /** + * Percentage of io.serverThreads used by netty to process ChunkFetchRequest. + * Shuffle server will use a separate EventLoopGroup to process ChunkFetchRequest messages. + * Although when calling the async writeAndFlush on the underlying channel to send + * response back to client, the I/O on the channel is still being handled by + * {@link org.apache.spark.network.server.TransportServer}'s default EventLoopGroup + * that's registered with the Channel, by waiting inside the ChunkFetchRequest handler + * threads for the completion of sending back responses, we are able to put a limit on + * the max number of threads from TransportServer's default EventLoopGroup that are + * going to be consumed by writing response to ChunkFetchRequest, which are I/O intensive + * and could take long time to process due to disk contentions. By configuring a slightly + * higher number of shuffler server threads, we are able to reserve some threads for + * handling other RPC messages, thus making the Client less likely to experience timeout + * when sending RPC messages to the shuffle server. Default to 0, which is 2*#cores + * or io.serverThreads. 10 would mean 10% of 2*#cores or 10% of io.serverThreads + * which equals 0.1 * 2*#cores or 0.1 * io.serverThreads. + */ + public int chunkFetchHandlerThreads() { + if(!this.getModuleName().equalsIgnoreCase("shuffle")) { + return 0; + } + int chunkFetchHandlerThreadsPercent = + conf.getInt("spark.shuffle.server.chunkFetchHandlerThreadsPercent", 0); + return this.serverThreads() > 0? (this.serverThreads() * chunkFetchHandlerThreadsPercent)/100: --- End diff -- I think it is a good idea to document both as this is an important config. Let me know your thoughts
--- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org