Is this what you are looking for?
1. Build Spark with the YARN profile <http://spark.apache.org/docs/1.2.0/building-spark.html>. Skip this step if you are using a pre-packaged distribution.
2. Locate the spark-<version>-yarn-shuffle.jar. This should be under $SPARK_HOME/network/yarn/target/scala-<version> if you are building Spark yourself, and under lib if you are using a distribution.
3. Add this jar to the classpath of all NodeManagers in your cluster.
4. In the yarn-site.xml on each node, add spark_shuffle to yarn.nodemanager.aux-services, then set yarn.nodemanager.aux-services.spark_shuffle.class to org.apache.spark.network.yarn.YarnShuffleService. Additionally, set all relevant spark.shuffle.service.* configurations <http://spark.apache.org/docs/1.2.0/configuration.html>.
5. Restart all NodeManagers in your cluster.

On Wed, Jan 28, 2015 at 1:30 AM, Corey Nolet <cjno...@gmail.com> wrote:

> I've read that this is supposed to be a rather significant optimization to
> the shuffle system in 1.1.0 but I'm not seeing much documentation on
> enabling this in Yarn. I see github classes for it in 1.2.0 and a property
> "spark.shuffle.service.enabled" in the spark-defaults.conf.
>
> The code mentions that this is supposed to be run inside the NodeManager
> so I'm assuming it needs to be wired up in the yarn-site.xml under the
> "yarn.nodemanager.aux-services" property?

--
*Arush Kharbanda* || Technical Teamlead
ar...@sigmoidanalytics.com || www.sigmoidanalytics.com
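P.S. The yarn-site.xml changes in step 4 would look roughly like the sketch below. The mapreduce_shuffle entry is an assumption based on a typical Hadoop setup; the point is to append spark_shuffle to whatever aux-services are already configured, not to replace them:

```xml
<!-- yarn-site.xml: register Spark's external shuffle service as a
     NodeManager auxiliary service. Keep any existing entries
     (e.g. mapreduce_shuffle) in the comma-separated list. -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
```

Then, on the Spark side, the property Corey mentioned goes in spark-defaults.conf so executors register their shuffle files with the NodeManager-hosted service:

```
spark.shuffle.service.enabled true
```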