Hi All,

I'm having trouble getting a job to use the Spark history server. We have a
cluster configured with Ambari. If I run the job from one of the nodes
within that cluster, everything works fine and the job appears in the Spark
history server.

If I run the same job from a client configured external to the cluster, the
history server is not used.
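One way to compare what each client picks up from its Hadoop configuration
is to print the YARN timeline-service settings from the driver. This is a
diagnostic sketch only: it goes through PySpark's internal _jsc handle, and
the keys are standard yarn-site.xml timeline-service properties.

# Diagnostic sketch: print the YARN timeline-service settings the driver
# picked up from yarn-site.xml ('sc' is an active SparkContext; _jsc is
# PySpark's internal Java context handle).
hconf = sc._jsc.hadoopConfiguration()
for key in ('yarn.timeline-service.enabled',
            'yarn.timeline-service.hostname',
            'yarn.timeline-service.webapp.address'):
    print('%s = %s' % (key, hconf.get(key)))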

When the job completes successfully on a node inside the cluster, I see
these lines appear in the log:


16/02/05 11:57:22 INFO history.YarnHistoryService: Starting
YarnHistoryService for application application_1453893909110_0108 attempt
Some(appattempt_1453893909110_0108_000001); state=1; endpoint=
http://somehost:8188/ws/v1/timeline/; bonded to ATS=false; listening=false;
batchSize=10; flush count=0; total number queued=0, processed=0; attempted
entity posts=0 successful entity posts=0 failed entity posts=0; events
dropped=0; app start event received=false; app end event received=false;
16/02/05 11:57:22 INFO history.YarnHistoryService: Spark events will be
published to the Timeline service at http://somehost:8188/ws/v1/timeline/


On the external client, these lines do not appear in the logs. I have
printed out the Spark context settings and compared the working job with
the failing job; everything appears to match.
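For reference, this is roughly how I dump the settings on each side for
comparison ('sc' is the active SparkContext):

# Print every setting the running context actually sees, sorted by key,
# so the output from the cluster node and the external client can be
# diffed directly.
for key, value in sorted(sc.getConf().getAll()):
    print('%s=%s' % (key, value))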

These are the job settings:

from pyspark import SparkConf

conf = SparkConf()

# Execution settings
conf.set('spark.speculation', 'true')
conf.set('spark.dynamicAllocation.enabled', 'false')
conf.set('spark.shuffle.service.enabled', 'false')
conf.set('spark.executor.instances', '4')
conf.set('spark.akka.threads', '4')
conf.set('spark.dynamicAllocation.initialExecutors', '4')

# History server / YARN timeline service wiring
conf.set('spark.history.provider', 'org.apache.spark.deploy.yarn.history.YarnHistoryProvider')
conf.set('spark.yarn.services', 'org.apache.spark.deploy.yarn.history.YarnHistoryService')
conf.set('spark.history.ui.port', '18080')

# YARN submission settings (HDP 2.3.4 cluster)
conf.set('spark.driver.extraJavaOptions', '-Dhdp.version=2.3.4.0-3485')
conf.set('spark.yarn.containerLauncherMaxThreads', '25')
conf.set('spark.yarn.driver.memoryOverhead', '384')
conf.set('spark.yarn.executor.memoryOverhead', '384')
conf.set('spark.yarn.historyServer.address', 'somehost:18080')
conf.set('spark.yarn.max.executor.failures', '3')
conf.set('spark.yarn.preserve.staging.files', 'false')
conf.set('spark.yarn.queue', 'default')
conf.set('spark.yarn.scheduler.heartbeat.interval-ms', '5000')
conf.set('spark.yarn.submit.file.replication', '3')
conf.set('spark.yarn.am.extraJavaOptions', '-Dhdp.version=2.3.4.0-3485')

# Fixed network ports
conf.set('spark.blockManager.port', '9096')
conf.set('spark.driver.port', '9095')
conf.set('spark.fileserver.port', '9097')
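The context itself is then created from this conf. A minimal sketch,
assuming the master (e.g. yarn-client) is supplied via spark-submit or an
extra conf.setMaster() call:

from pyspark import SparkContext

# Minimal sketch: build the context from the conf above; nothing beyond
# the conf is specific to the history server.
sc = SparkContext(conf=conf)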

I am using the following tar.gz file to install Spark on the node external
to the cluster:

http://www.apache.org/dyn/closer.lua/spark/spark-1.5.2/spark-1.5.2-bin-hadoop2.6.tgz

Will this version of Spark have everything required to talk correctly to
YARN and the Spark history service?
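One way to check, assuming an active SparkContext 'sc': ask the driver JVM
whether the history service class is actually on its classpath. This goes
through PySpark's internal _jvm gateway, so treat it as a diagnostic sketch
only.

# Diagnostic sketch: Class.forName raises an error (a Py4J exception on
# the Python side) if the class is not on the driver classpath.
try:
    sc._jvm.java.lang.Class.forName(
        'org.apache.spark.deploy.yarn.history.YarnHistoryService')
    print('YarnHistoryService found on the driver classpath')
except Exception:
    print('YarnHistoryService NOT on the driver classpath')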

So it comes down to this: the Spark context settings appear to be exactly
the same, there are no errors in the logs suggesting the job cannot connect
to anything, and none of the ports are blocked. Why is this not working
when run external to the cluster?

There is no Kerberos security configured on the cluster.

Thanks!
