Port 7077 is for "client" mode connections to the master. In "cluster" mode 
the submission goes to the REST port, 6066, and the "driver" then runs inside 
the Spark cluster on a node that Spark chooses. The command I use to deploy my 
Spark app (including the driver) is below:


spark-submit --deploy-mode cluster \
  --master spark://tiplxapp-spk01:6066,tiplxapp-spk02:6066,tiplxapp-spk03:6066 \
  /app/tmx/ngxspark/lib/EX1AppSpark-1.0.13.jar
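
For contrast, a minimal sketch of the same submission in "client" mode 
(reusing the jar path above purely as an illustration, not a command taken 
from this cluster); the driver would then run on the machine where 
spark-submit is invoked, and the masters are contacted on 7077 rather than 
the REST port 6066:

# sketch only: client mode keeps the driver on the submitting host
spark-submit --deploy-mode client \
  --master spark://tiplxapp-spk01:7077,tiplxapp-spk02:7077,tiplxapp-spk03:7077 \
  /app/tmx/ngxspark/lib/EX1AppSpark-1.0.13.jar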



Yes, you're right: I believe when the master dies, ZooKeeper detects that and 
elects a new master node, and spark-submit should carry on. I'm just not sure 
how this leads to the UI believing the app is in a "waiting" state?


Also, I noticed that when these failovers happen the "worker" web GUI goes a 
bit strange and starts reporting over-allocated resources. Look at the cores 
and memory used:


Spark Worker at 142.201.185.134:7078 (Spark 1.6.0) <http://142.201.185.134:18081/>

  *   ID: worker-20160622152457-142.201.185.134-7078
  *   Master URL: spark://142.201.185.132:7077
  *   Cores: 4 (5 Used)
  *   Memory: 2.7 GB (3.0 GB Used)






________________________________
From: Mich Talebzadeh <mich.talebza...@gmail.com>
Sent: September 7, 2016 2:52 PM
To: arlindo santos
Cc: user @spark
Subject: Re: spark 1.6.0 web console shows running application in a "waiting" 
status, but it's actually running

This is my take.

When you issue spark-submit on any node it starts the application web UI on 
port 4040 by default. Otherwise you can specify the port yourself with 
--conf "spark.ui.port=<port>"
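
For example, a minimal sketch of overriding the UI port (the application jar 
name here is only a placeholder):

# sketch: run the driver's web UI on 4041 instead of the default 4040
spark-submit --conf "spark.ui.port=4041" --master spark://<host>:7077 my-app.jar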

As I understand it, in standalone mode the executors run on the workers.

$SPARK_HOME/sbin/start-slave.sh spark://<host>:7077

That port 7077 is the master port. If the master dies, those workers lose 
their connection to port 7077, so I believe they go stale. The spark-submit 
job then carries on using the remaining executors on the other workers.
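
One related note: workers (and spark-submit) can be given the full list of 
masters, so they can re-register with whichever master ZooKeeper elects. A 
minimal sketch, with placeholder host names:

# sketch: pass all masters so the worker can fail over to the newly elected leader
$SPARK_HOME/sbin/start-slave.sh spark://master1:7077,master2:7077,master3:7077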

So, in summary, one expects the job to keep running. You start your UI on 
<HOST>:<port>.

One test you can do is to exit the UI and start the UI, on the same port, on 
the host that ZooKeeper elects as the new master. That should work.
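
A quick way to see which master is currently the leader, assuming the master 
web UI port used in this thread, is to query each master's UI status page; the 
/json endpoint below is offered only as a sketch, so verify it on your build:

# sketch: the standalone Master UI serves a JSON status page; the current
# leader reports "status" : "ALIVE", standbys report "STANDBY"
curl http://<master1-host>:18080/json
curl http://<master2-host>:18080/json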

HTH







Dr Mich Talebzadeh



LinkedIn  
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
damage or destruction of data or any other property which may arise from 
relying on this email's technical content is explicitly disclaimed. The author 
will in no case be liable for any monetary damages arising from such loss, 
damage or destruction.



On 7 September 2016 at 15:27, arlindo santos <sarli...@hotmail.com> wrote:
Yes, I refreshed a few times. Running in cluster mode.

FYI, I can duplicate this easily now. Our setup consists of 3 nodes running 
standalone Spark, with a master and a worker on each, and ZooKeeper doing 
master leader election. If I kill the master on any node, the master shifts to 
another node, and that is when the app state changes to "waiting" on the GUI 
and never changes back to "running", but really it is still running.
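
For reference, the kind of standalone HA configuration this setup implies 
would look roughly like the following in spark-env.sh on each master node, as 
a minimal sketch (the ZooKeeper host names are placeholders, not taken from 
this thread):

# sketch: enable ZooKeeper-based master recovery
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181,zk3:2181 -Dspark.deploy.zookeeper.dir=/spark"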

From: Mich Talebzadeh
Sent: Wednesday, September 7, 2016 9:50 AM
To: sarlindo
Cc: user @spark
Subject: Re: spark 1.6.0 web console shows running application in a "waiting" 
status, but it's actually running


Have you refreshed the Spark UI page?

What Mode are you running your Spark app?

HTH


Dr Mich Talebzadeh






On 6 September 2016 at 16:15, sarlindo <sarli...@hotmail.com> wrote:
I have 2 questions/issues.

1. We had the spark-master shut down (reason unknown). We looked at the 
spark-master logs and they simply show the lines below; is there some other 
log I should be looking at to find out why the master went down?

16/09/05 21:10:00 INFO ClientCnxn: Opening socket connection to server
tiplxapp-spk02.prd.tse.com/142.201.219.76:2181. Will not attempt to
authenticate using SASL (unknown error)
16/09/05 21:10:00 ERROR Master: Leadership has been revoked -- master
shutting down.
16/09/05 21:10:00 INFO ClientCnxn: Socket connection established, initiating
session, client: /142.201.219.75:56361, server:
tiplxapp-spk02.prd.tse.com/142.201.219.76:2181


2. Spark 1.6.0 web console shows a running application in a "waiting"
status, but it's actually running. Is this an existing bug?

<http://apache-spark-user-list.1001560.n3.nabble.com/file/n27665/33.png>








