== Short version == A recent commit replaces Spark's networking subsystem with one based on Netty rather than raw sockets. Users running off of master can disable this change by setting "spark.shuffle.blockTransferService=nio". We will be testing the new implementation during the QA period for Spark 1.2. It is designed to increase stability and decrease GC pressure during shuffles.
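For anyone who wants to exercise the fallback during QA, one way to set the flag is at submit time; this is a sketch only, and the application class and jar names below are placeholders:

```shell
# Revert to the old NIO-based transfer service using the flag named above.
# com.example.MyApp and my-app.jar are placeholder names.
spark-submit \
  --conf spark.shuffle.blockTransferService=nio \
  --class com.example.MyApp \
  my-app.jar
```

The same property can also be set programmatically on a SparkConf or in spark-defaults.conf.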
== Long version == For those who haven't been following the associated PRs and JIRAs: we recently merged PR #2753, which creates a "network" package that does not depend on Spark core. It introduces a Netty-based BlockTransferService to replace the NIO-based ConnectionManager used for transferring shuffle and RDD cache blocks between executors (in other words, the transport layer of the BlockManager).

The new BlockTransferService is intended to provide increased stability, a decreased maintenance burden, and less garbage collection. By relying on Netty to take care of the low-level networking, the actual transfer code is simpler and easier to verify. By making use of ByteBuf pooling, we can lower both memory usage and memory churn by reusing buffers. This was actually a critical component of the petasort benchmark, where the code originated.

While building this component, we realized it was a good opportunity to extract the core transport functionality from Spark so we could reuse it for SPARK-3796, which calls for an external service that can serve Spark shuffle files. Thus, we created the "network/common" package, containing the functionality for setting up a simple control plane and an efficient data plane over a network. This part is functionally independent of Spark and is in fact written in Java to further minimize dependencies.

PR #3001 finishes the work of creating an external shuffle service by creating a "network/shuffle" package, which deals with serving Spark shuffle files from outside of an executor. The intention is that this server can be run anywhere -- including inside the Spark Standalone Worker or the YARN NodeManager, or as a separate service inside Mesos -- and provide the ability to scale executors up and down without losing shuffle data.

Thanks to Aaron, Reynold, and the others who have worked on these improvements over the last month.
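To illustrate why buffer pooling lowers memory churn, here is a deliberately simplified toy pool in plain Java. This is not Netty's API (the real mechanism is Netty's pooled ByteBuf allocator); the class and method names here are invented for illustration:

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Toy sketch of the pooling idea: returned buffers are kept on a free list
// and handed back out, so steady-state transfers allocate (and garbage-
// collect) far fewer buffers. Netty's real pooled allocator is much more
// sophisticated (arenas, size classes, thread-local caches).
public class ToyBufferPool {
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();
    private final int bufferSize;

    public ToyBufferPool(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    // Reuse a previously released buffer when one is available,
    // allocating a fresh one only when the pool is empty.
    public ByteBuffer acquire() {
        ByteBuffer buf = free.pollFirst();
        return (buf != null) ? buf : ByteBuffer.allocate(bufferSize);
    }

    // Reset the buffer and put it back on the free list for reuse.
    public void release(ByteBuffer buf) {
        buf.clear();
        free.addFirst(buf);
    }

    // Number of idle buffers currently held by the pool.
    public int pooledCount() {
        return free.size();
    }
}
```

In a long-running shuffle, the same few buffers cycle through acquire/release instead of being allocated per transfer, which is the source of the reduced GC pressure described above.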
- Patrick