I was actually able to get this to work.  I was NOT setting the classpath
properly originally.

Simply running
java -cp /etc/hadoop/conf/:<yarn, hadoop jars> com.domain.JobClass

and setting yarn-client as the Spark master worked for me.  Originally I
had not put the configuration directory on the classpath. Also, I now use
$SPARK_HOME/bin/compute_classpath.sh to get all of the relevant
jars.  The job properly connects to the AM at the correct port.
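
For anyone debugging the same symptom, here is a minimal sketch (not Spark or
Hadoop code) of why the configuration directory matters. It looks for
yarn-site.xml on the classpath and reads yarn.resourcemanager.address from it,
falling back to port 8032 when the file is missing, which is roughly what a
client sees when /etc/hadoop/conf/ is left off the classpath. The class name
RmAddressCheck, the helper names, and the 0.0.0.0 fallback host are assumptions
made for illustration; the real lookup is done by Hadoop's configuration
classes and handles more cases (variable substitution, multiple config files).

```java
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical diagnostic class; not part of Spark or Hadoop.
final class RmAddressCheck {
    // Assumed fallback: stock Apache YARN uses port 8032 for the ResourceManager.
    static final String DEFAULT_RM_ADDRESS = "0.0.0.0:8032";

    /** Returns the value of property `name` in a Hadoop-style XML config, or null if absent. */
    static String lookup(InputStream xml, String name) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            NodeList names = p.getElementsByTagName("name");
            NodeList values = p.getElementsByTagName("value");
            if (names.getLength() > 0 && values.getLength() > 0
                    && name.equals(names.item(0).getTextContent().trim())) {
                return values.item(0).getTextContent().trim();
            }
        }
        return null;
    }

    /** Reads the RM address from yarn-site.xml on the classpath, else returns the default. */
    static String resolveRmAddress(ClassLoader loader) throws Exception {
        try (InputStream in = loader.getResourceAsStream("yarn-site.xml")) {
            if (in == null) {
                return DEFAULT_RM_ADDRESS; // conf dir is not on the classpath
            }
            String value = lookup(in, "yarn.resourcemanager.address");
            return value != null ? value : DEFAULT_RM_ADDRESS;
        }
    }

    public static void main(String[] args) throws Exception {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        System.out.println("ResourceManager address: " + resolveRmAddress(loader));
    }
}
```

Compiled into the current directory, it could be run as
java -cp /etc/hadoop/conf/:. RmAddressCheck
On the HDP cluster described below it should report port 8050 with the conf
directory on the classpath, and 8032 without it.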

Is there any intuition on how Spark executors map to YARN workers, or how the
different memory settings interplay (SPARK_MEM vs. YARN_WORKER_MEM)?


On Tue, May 20, 2014 at 2:25 PM, Andrew Or <and...@databricks.com> wrote:

> Hi Gaurav and Arun,
> Your settings seem reasonable; as long as YARN_CONF_DIR or HADOOP_CONF_DIR
> is properly set, the application should be able to find the correct RM
> port. Have you tried running the examples in yarn-client mode, and your
> custom application in yarn-standalone (now yarn-cluster) mode?
> 2014-05-20 5:17 GMT-07:00 gaurav.dasgupta <gaurav.d...@gmail.com>:
>> Here are a few more details (sorry, I should have provided them
>> with the previous post):
>>  *- Spark Version = 0.9.1 (using pre-built spark-0.9.1-bin-hadoop2)
>>  - Hadoop Version = 2.4.0 (Hortonworks)
>>  - I am trying to execute a Spark Streaming program*
>> Because I am using Hortonworks Hadoop (HDP), YARN is configured with
>> port numbers that differ from the Apache defaults. For example,
>> *resourcemanager.address* is <IP>:8050 in HDP, whereas it defaults
>> to <IP>:8032.
>> When I run the Spark examples using bin/run-example, I can see in the
>> console logs that it is connecting to the right port configured by HDP,
>> i.e., 8050. Please refer to the console log below:
>> */[root@host spark-0.9.1-bin-hadoop2]# SPARK_YARN_MODE=true
>> SPARK_JAR=assembly/target/scala-2.10/spark-assembly_2.10-0.9.1-hadoop2.2.0.jar
>> SPARK_YARN_APP_JAR=examples/target/scala-2.10/spark-examples_2.10-assembly-0.9.1.jar
>> bin/run-example org.apache.spark.examples.HdfsTest yarn-client
>> /user/root/test
>> SLF4J: Class path contains multiple SLF4J bindings.
>> SLF4J: Found binding in
>> [jar:file:/usr/local/spark-0.9.1-bin-hadoop2/examples/target/scala-2.10/spark-examples_2.10-assembly-0.9.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: Found binding in
>> [jar:file:/usr/local/spark-0.9.1-bin-hadoop2/assembly/target/scala-2.10/spark-assembly_2.10-0.9.1-hadoop2.2.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
>> explanation.
>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>> 14/05/20 06:55:29 INFO slf4j.Slf4jLogger: Slf4jLogger started
>> 14/05/20 06:55:29 INFO Remoting: Starting remoting
>> 14/05/20 06:55:29 INFO Remoting: Remoting started; listening on addresses
>> :[akka.tcp://spark@<IP>:60988]
>> 14/05/20 06:55:29 INFO Remoting: Remoting now listens on addresses:
>> [akka.tcp://spark@<IP>:60988]
>> 14/05/20 06:55:29 INFO spark.SparkEnv: Registering BlockManagerMaster
>> 14/05/20 06:55:29 INFO storage.DiskBlockManager: Created local directory
>> at
>> /tmp/spark-local-20140520065529-924f
>> 14/05/20 06:55:29 INFO storage.MemoryStore: MemoryStore started with
>> capacity 4.2 GB.
>> 14/05/20 06:55:29 INFO network.ConnectionManager: Bound socket to port
>> 35359
>> with id = ConnectionManagerId(<IP>,35359)
>> 14/05/20 06:55:29 INFO storage.BlockManagerMaster: Trying to register
>> BlockManager
>> 14/05/20 06:55:29 INFO storage.BlockManagerMasterActor$BlockManagerInfo:
>> Registering block manager <IP>:35359 with 4.2 GB RAM
>> 14/05/20 06:55:29 INFO storage.BlockManagerMaster: Registered BlockManager
>> 14/05/20 06:55:29 INFO spark.HttpServer: Starting HTTP Server
>> 14/05/20 06:55:29 INFO server.Server: jetty-7.x.y-SNAPSHOT
>> 14/05/20 06:55:29 INFO server.AbstractConnector: Started
>> SocketConnector@
>> 14/05/20 06:55:29 INFO broadcast.HttpBroadcast: Broadcast server started
>> at
>> http://<IP>:59418
>> 14/05/20 06:55:29 INFO spark.SparkEnv: Registering MapOutputTracker
>> 14/05/20 06:55:29 INFO spark.HttpFileServer: HTTP File server directory is
>> /tmp/spark-fc34fdc8-d940-420b-b184-fc7a8a65501a
>> 14/05/20 06:55:29 INFO spark.HttpServer: Starting HTTP Server
>> 14/05/20 06:55:29 INFO server.Server: jetty-7.x.y-SNAPSHOT
>> 14/05/20 06:55:29 INFO server.AbstractConnector: Started
>> SocketConnector@
>> 14/05/20 06:55:29 INFO server.Server: jetty-7.x.y-SNAPSHOT
>> 14/05/20 06:55:29 INFO handler.ContextHandler: started
>> o.e.j.s.h.ContextHandler{/storage/rdd,null}
>> 14/05/20 06:55:29 INFO handler.ContextHandler: started
>> o.e.j.s.h.ContextHandler{/storage,null}
>> 14/05/20 06:55:29 INFO handler.ContextHandler: started
>> o.e.j.s.h.ContextHandler{/stages/stage,null}
>> 14/05/20 06:55:29 INFO handler.ContextHandler: started
>> o.e.j.s.h.ContextHandler{/stages/pool,null}
>> 14/05/20 06:55:29 INFO handler.ContextHandler: started
>> o.e.j.s.h.ContextHandler{/stages,null}
>> 14/05/20 06:55:29 INFO handler.ContextHandler: started
>> o.e.j.s.h.ContextHandler{/environment,null}
>> 14/05/20 06:55:29 INFO handler.ContextHandler: started
>> o.e.j.s.h.ContextHandler{/executors,null}
>> 14/05/20 06:55:29 INFO handler.ContextHandler: started
>> o.e.j.s.h.ContextHandler{/metrics/json,null}
>> 14/05/20 06:55:29 INFO handler.ContextHandler: started
>> o.e.j.s.h.ContextHandler{/static,null}
>> 14/05/20 06:55:29 INFO handler.ContextHandler: started
>> o.e.j.s.h.ContextHandler{/,null}
>> 14/05/20 06:55:29 INFO server.AbstractConnector: Started
>> SelectChannelConnector@
>> 14/05/20 06:55:29 INFO ui.SparkUI: Started Spark Web UI at http://
>> <IP>:4040
>> 14/05/20 06:55:29 WARN util.NativeCodeLoader: Unable to load native-hadoop
>> library for your platform... using builtin-java classes where applicable
>> 14/05/20 06:55:29 INFO spark.SparkContext: Added JAR
>> /usr/local/spark-0.9.1-bin-hadoop2/examples/target/scala-2.10/spark-examples_2.10-assembly-0.9.1.jar
>> at http://<IP>:53425/jars/spark-examples_2.10-assembly-0.9.1.jar with
>> timestamp 1400586929921
>> 14/05/20 06:55:30 INFO client.RMProxy: Connecting to ResourceManager at
>> <IP>:8050
>> 14/05/20 06:55:30 INFO yarn.Client: Got Cluster metric info from
>> ApplicationsManager (ASM), number of NodeManagers: 9
>> 14/05/20 06:55:30 INFO yarn.Client: Queue info ... queueName: default,
>> queueCurrentCapacity: 0.0, queueMaxCapacity: 1.0,/*
>> But when I run my own custom Spark Streaming code, it tries to
>> connect to port 8032 instead and hence is unable to connect. Refer to the
>> log below:
>> */[root@host spark-0.9.1-bin-hadoop2]# SPARK_YARN_MODE=true
>> SPARK_JAR=assembly/target/scala-2.10/spark-assembly_2.10-0.9.1-hadoop2.2.0.jar
>> SPARK_YARN_APP_JAR=/home/gaurav/SparkStreamExample.jar java -cp
>> /home/gaurav/SparkStreamExample.jar:assembly/target/scala-2.10/spark-assembly_2.10-0.9.1-hadoop2.2.0.jar
>> SparkStreamExample yarn-client <IP> 9999
>> log4j:WARN No appenders could be found for logger
>> (akka.event.slf4j.Slf4jLogger).
>> log4j:WARN Please initialize the log4j system properly.
>> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
>> more info.
>> 14/05/20 07:04:38 INFO SparkEnv: Using Spark's default log4j profile:
>> org/apache/spark/log4j-defaults.properties
>> 14/05/20 07:04:38 INFO SparkEnv: Registering BlockManagerMaster
>> 14/05/20 07:04:38 INFO DiskBlockManager: Created local directory at
>> /tmp/spark-local-20140520070438-5eae
>> 14/05/20 07:04:38 INFO MemoryStore: MemoryStore started with capacity 4.2
>> GB.
>> 14/05/20 07:04:38 INFO ConnectionManager: Bound socket to port 49869 with
>> id
>> = ConnectionManagerId(<IP>,49869)
>> 14/05/20 07:04:38 INFO BlockManagerMaster: Trying to register BlockManager
>> 14/05/20 07:04:38 INFO BlockManagerMasterActor$BlockManagerInfo:
>> Registering
>> block manager <IP>:49869 with 4.2 GB RAM
>> 14/05/20 07:04:38 INFO BlockManagerMaster: Registered BlockManager
>> 14/05/20 07:04:38 INFO HttpServer: Starting HTTP Server
>> 14/05/20 07:04:38 INFO HttpBroadcast: Broadcast server started at
>> http://<IP>:36946
>> 14/05/20 07:04:38 INFO SparkEnv: Registering MapOutputTracker
>> 14/05/20 07:04:38 INFO HttpFileServer: HTTP File server directory is
>> /tmp/spark-414ba274-adc0-4a0e-b1a4-9c1f048cbf37
>> 14/05/20 07:04:38 INFO HttpServer: Starting HTTP Server
>> 14/05/20 07:04:38 INFO SparkUI: Started Spark Web UI at http://<IP>:4040
>> 14/05/20 07:04:38 WARN NativeCodeLoader: Unable to load native-hadoop
>> library for your platform... using builtin-java classes where applicable
>> 14/05/20 07:04:38 INFO SparkContext: Added JAR
>> /home/gaurav/SparkStreamExample.jar at
>> http://<IP>:40053/jars/SparkStreamExample.jar with timestamp
>> 1400587478500
>> 14/05/20 07:04:38 INFO RMProxy: Connecting to ResourceManager at
>> /
>> 14/05/20 07:04:39 INFO Client: Retrying connect to server:
>> Already tried 0 time(s); retry policy is
>> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
>> 14/05/20 07:04:40 INFO Client: Retrying connect to server:
>> Already tried 1 time(s); retry policy is
>> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
>> 14/05/20 07:04:41 INFO Client: Retrying connect to server:
>> Already tried 2 time(s); retry policy is
>> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
>> 14/05/20 07:04:42 INFO Client: Retrying connect to server:
>> Already tried 3 time(s); retry policy is
>> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)/*
>> Do I need to specify the YARN ports configured by HDP to Spark somehow?
>> How can the example jobs detect the correct YARN ports?
>> Thanks in advance.
>> -- Gaurav
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Yarn-configuration-file-doesn-t-work-when-run-with-yarn-client-mode-tp1418p6097.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
