[ https://issues.apache.org/jira/browse/SPARK-9711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guangyang Li updated SPARK-9711:
--------------------------------
    Description: 
With Spark 1.4.1 in YARN client mode, my application works the first time 
the cluster is built. However, if I stop and start the cluster using 
spark-ec2, the same command fails. The end of the Spark logs shows that it 
keeps trying to connect to the master node repeatedly:

INFO Client: Retrying connect to server: 
ec2-54-174-232-129.compute-1.amazonaws.com/172.31.36.29:8032. Already tried 0 
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
sleepTime=1000 MILLISECONDS)

I restarted YARN and dfs manually after restarting the cluster. However, I was 
unable to restart Tachyon, and ./bin/tachyon runTests fails, which might be 
the cause.
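
To narrow this down, one option is to check whether the address in the retry message still accepts TCP connections after the restart. The following is a minimal sketch using only the Python standard library. The function names parse_retry_target and can_connect are hypothetical helpers, not part of Spark or Hadoop, and the sample text is the log message quoted above.

```python
import re
import socket

# Matches the "hostname/ip:port" target in a Hadoop IPC retry message.
# \s* after the colon tolerates the line wrap seen in the pasted log.
SERVER_RE = re.compile(
    r"Retrying connect to server:\s*"
    r"(?P<host>[^/\s]+)/(?P<ip>\d+(?:\.\d+){3}):(?P<port>\d+)"
)


def parse_retry_target(log_text):
    """Return (hostname, ip, port) from a retry message, or None if absent."""
    match = SERVER_RE.search(log_text)
    if match is None:
        return None
    return match.group("host"), match.group("ip"), int(match.group("port"))


def can_connect(address, port, timeout=3.0):
    """Return True if a TCP connection to address:port succeeds in time."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    sample = (
        "INFO Client: Retrying connect to server: \n"
        "ec2-54-174-232-129.compute-1.amazonaws.com/172.31.36.29:8032. "
        "Already tried 0 time(s)"
    )
    target = parse_retry_target(sample)
    if target is not None:
        host, ip, port = target
        # Try both the hostname and the literal IP recorded in the log.
        for address in (host, ip):
            print(address, port, "reachable:", can_connect(address, port))
```

If neither address is reachable from the node that submits the job, the problem likely sits below Spark (a service that is not listening, or an address that is no longer valid), though this check alone cannot tell which.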

  was:
With Spark 1.4.1 in YARN client mode, my application works the first time 
the cluster is built. However, if I stop and start the cluster using 
spark-ec2, the same command fails. The end of the Spark logs shows that it 
keeps trying to connect to the master node repeatedly:

INFO Client: Retrying connect to server: 
ec2-54-174-232-129.compute-1.amazonaws.com/172.31.36.29:8032. Already tried 0 
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
sleepTime=1000 MILLISECONDS)


> Unable to run spark after restarting cluster with spark-ec2
> -----------------------------------------------------------
>
>                 Key: SPARK-9711
>                 URL: https://issues.apache.org/jira/browse/SPARK-9711
>             Project: Spark
>          Issue Type: Bug
>          Components: EC2
>    Affects Versions: 1.4.1
>            Reporter: Guangyang Li
>
> With Spark 1.4.1 in YARN client mode, my application works the first time 
> the cluster is built. However, if I stop and start the cluster using 
> spark-ec2, the same command fails. The end of the Spark logs shows that it 
> keeps trying to connect to the master node repeatedly:
> INFO Client: Retrying connect to server: 
> ec2-54-174-232-129.compute-1.amazonaws.com/172.31.36.29:8032. Already tried 0 
> time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
> sleepTime=1000 MILLISECONDS)
> I restarted YARN and dfs manually after restarting the cluster. However, I 
> was unable to restart Tachyon, and ./bin/tachyon runTests fails, which 
> might be the cause.



