Thanks Yana,
My current experience is that, after running some small spark-submit-based
tests, the Master once again stopped being reachable. There was no change in
the test setup. I restarted the Master/Worker and it is still not reachable.
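For context, restarting the standalone daemons looks roughly like the sequence below. This is a sketch: it assumes a stock checkout with `$SPARK_HOME` set, and the log filename pattern is Spark's default naming for the standalone master log.

```shell
# Restart the standalone cluster (master plus the workers listed in conf/slaves).
$SPARK_HOME/sbin/stop-all.sh
$SPARK_HOME/sbin/start-all.sh

# Then inspect the master log to see whether it bound its port and stayed up:
tail -n 50 $SPARK_HOME/logs/spark-*-org.apache.spark.deploy.master.Master-*.out
```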
What might be the variables here that affect association with the master?
Here is an example after git-cloning the latest 1.4.0-SNAPSHOT. The first three
runs (FINISHED) were successful and connected quickly. The fourth run (ALIVE)
is failing on connection/association.
URL: spark://mellyrn.local:7077
REST URL: spark://mellyrn.local:6066 (cluster mode)
Workers: 1
Cores: 8 Total,
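For reference, a run against that master would be submitted roughly like this. Only the master URL comes from the output above; the class name and jar path are placeholders, not taken from the thread.

```shell
# Sketch of a spark-submit run against the standalone master shown above.
# --class and the jar path are placeholders; substitute your own application.
$SPARK_HOME/bin/spark-submit \
  --master spark://mellyrn.local:7077 \
  --class com.example.MyApp \
  target/my-app.jar
```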
What does your master log say? Normally the master should NEVER shut
down... you should be able to spark-submit to infinity with no issues. So
the question about high variance on upstart is one issue, but the other
thing that's puzzling to me is why your master is ever down to begin with.
Check whether the name can be resolved in the /etc/hosts file (or DNS) of the
worker.
(The same, by the way, applies to the node where you run the driver app: all
other nodes must be able to resolve its name.)
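The resolution check above can be scripted. A small sketch, assuming `getent` is available on the nodes (it consults /etc/hosts and then DNS, per nsswitch.conf); `mellyrn.local` is the master name from this thread, so substitute your own master, worker, and driver hostnames:

```shell
# Verify that this node can resolve each of the other nodes' hostnames.
resolves() {
    # getent checks /etc/hosts first, then DNS, per /etc/nsswitch.conf
    getent hosts "$1" > /dev/null 2>&1
}

for host in mellyrn.local; do
    if resolves "$host"; then
        echo "$host: resolvable"
    else
        echo "$host: NOT resolvable -- add it to /etc/hosts or fix DNS"
    fi
done
```

Run this on every worker and on the driver node, since association failures can come from any direction.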
From: Stephen Boesch [mailto:java...@gmail.com]
Sent: Wednesday, May 20, 2015 10:07
But if I'm reading his email correctly, he's saying that:
1. The master and slave are on the same box (so network hiccups are an
unlikely culprit).
2. The failures are intermittent -- i.e. the program works for a while, then
the worker gets disassociated...
Is it possible that the master restarted? We used to