Is this what you are looking for?
1. Build Spark with the YARN profile <http://spark.apache.org/docs/1.2.0/building-spark.html>. Skip this step if you are using a pre-packaged distribution.
2. Locate the spark-<version>-yarn-shuffle.jar. This should be under $SPARK_HOME/network/yarn/target/scala-<version> if you are building Spark yourself, and under lib if you are using a distribution.
3. Add this jar to the classpath of all NodeManagers in your cluster.
4. In the yarn-site.xml on each node, add spark_shuffle to yarn.nodemanager.aux-services, then set yarn.nodemanager.aux-services.spark_shuffle.class to org.apache.spark.network.yarn.YarnShuffleService. Additionally, set all relevant spark.shuffle.service.* configurations <http://spark.apache.org/docs/1.2.0/configuration.html>.
5. Restart all NodeManagers in your cluster.

On Wed, Jan 28, 2015 at 1:30 AM, Corey Nolet <cjno...@gmail.com> wrote:

> I've read that this is supposed to be a rather significant optimization to
> the shuffle system in 1.1.0 but I'm not seeing much documentation on
> enabling this in Yarn. I see github classes for it in 1.2.0 and a property
> "spark.shuffle.service.enabled" in the spark-defaults.conf.
>
> The code mentions that this is supposed to be run inside the NodeManager
> so I'm assuming it needs to be wired up in the yarn-site.xml under the
> "yarn.nodemanager.aux-services" property?

--
*Arush Kharbanda* || Technical Teamlead
ar...@sigmoidanalytics.com || www.sigmoidanalytics.com
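P.S. The yarn-site.xml changes in step 4 would look roughly like the sketch below. The mapreduce_shuffle entry is an assumption based on a typical Hadoop setup; the point is to append spark_shuffle to whatever aux-services are already configured, not to replace them:

```xml
<!-- yarn-site.xml: register Spark's external shuffle service as a
     NodeManager auxiliary service. Keep any existing entries
     (e.g. mapreduce_shuffle) in the comma-separated list. -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
```

Then, on the Spark side, the property Corey mentioned goes in spark-defaults.conf so executors register their shuffle files with the NodeManager-hosted service:

```
spark.shuffle.service.enabled true
```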