[ https://issues.apache.org/jira/browse/SPARK-13317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15146689#comment-15146689 ]
Christopher Bourez edited comment on SPARK-13317 at 2/14/16 6:58 PM:
---------------------------------------------------------------------

I launch a cluster:

```
./ec2/spark-ec2 -k sparkclusterkey -i ~/sparkclusterkey.pem --region=eu-west-1 --copy-aws-credentials --instance-type=m1.large -s 4 --hadoop-major-version=2 launch spark-cluster
```

which gives me a master at ec2-54-229-16-73.eu-west-1.compute.amazonaws.com and slaves at ec2-54-194-99-236.eu-west-1.compute.amazonaws.com, etc.

If I launch a job in client mode from another network, for example in a Zeppelin notebook on my MacBook, with a configuration equivalent to:

```
spark-shell --master=spark://ec2-54-229-16-73.eu-west-1.compute.amazonaws.com:7077
```

I see in the logs:

```
16/02/14 19:55:04 INFO AppClient$ClientEndpoint: Executor added: app-20160214185504-0000/0 on worker-20160214185030-172.31.4.179-34425 (172.31.4.179:34425) with 2 cores
16/02/14 19:55:04 INFO SparkDeploySchedulerBackend: Granted executor ID app-20160214185504-0000/0 on hostPort 172.31.4.179:34425 with 2 cores, 1024.0 MB RAM
16/02/14 19:55:04 INFO AppClient$ClientEndpoint: Executor added: app-20160214185504-0000/1 on worker-20160214185030-172.31.4.176-47657 (172.31.4.176:47657) with 2 cores
16/02/14 19:55:04 INFO SparkDeploySchedulerBackend: Granted executor ID app-20160214185504-0000/1 on hostPort 172.31.4.176:47657 with 2 cores, 1024.0 MB RAM
16/02/14 19:55:04 INFO AppClient$ClientEndpoint: Executor added: app-20160214185504-0000/2 on worker-20160214185031-172.31.4.177-41379 (172.31.4.177:41379) with 2 cores
16/02/14 19:55:04 INFO SparkDeploySchedulerBackend: Granted executor ID app-20160214185504-0000/2 on hostPort 172.31.4.177:41379 with 2 cores, 1024.0 MB RAM
16/02/14 19:55:04 INFO AppClient$ClientEndpoint: Executor added: app-20160214185504-0000/3 on worker-20160214185032-172.31.4.178-34353 (172.31.4.178:34353) with 2 cores
16/02/14 19:55:04 INFO SparkDeploySchedulerBackend: Granted executor ID app-20160214185504-0000/3 on hostPort 172.31.4.178:34353 with 2 cores, 1024.0 MB RAM
16/02/14 19:55:04 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.1.11:64058 with 511.5 MB RAM, BlockManagerId(driver, 192.168.1.11, 64058)
16/02/14 19:55:04 INFO BlockManagerMaster: Registered BlockManager
```

These are private IPs that my MacBook cannot access, and when launching a job, an error follows:

```
16/02/14 19:57:19 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
```

I tried to connect to a slave, set SPARK_LOCAL_IP in the slave's spark-env.sh, and stop and restart all slaves from the master; the Spark master still returns the private IP.

Thanks,

> SPARK_LOCAL_IP does not bind on Slaves
> --------------------------------------
>
>                 Key: SPARK-13317
>                 URL: https://issues.apache.org/jira/browse/SPARK-13317
>             Project: Spark
>          Issue Type: Bug
>         Environment: Linux EC2, different VPC
>            Reporter: Christopher Bourez
>
> SPARK_LOCAL_IP does not bind to the provided IP on slaves.
> When launching a job or a spark-shell from a second network, the returned IP for the slave is still the first IP of the slave.
> So the job fails with the message:
> Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
> It is not a question of resources but of the driver, which cannot connect to the slave given the wrong IP.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
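For readers reproducing the reporter's attempted workaround: the comment describes setting SPARK_LOCAL_IP in each slave's spark-env.sh and restarting the workers. A minimal sketch of that configuration follows; SPARK_LOCAL_IP and SPARK_PUBLIC_DNS are documented Spark standalone-mode environment variables, but the specific addresses and the 0.0.0.0 bind choice below are placeholders, not values from this report:

```shell
# Sketch of $SPARK_HOME/conf/spark-env.sh on a slave (addresses are placeholders).
# SPARK_LOCAL_IP sets the address the daemon binds to on this host;
# SPARK_PUBLIC_DNS sets the hostname it advertises to other machines.
export SPARK_LOCAL_IP=0.0.0.0
export SPARK_PUBLIC_DNS=ec2-XX-XX-XX-XX.eu-west-1.compute.amazonaws.com

# Then, from the master, restart all workers so the settings take effect:
#   $SPARK_HOME/sbin/stop-slaves.sh
#   $SPARK_HOME/sbin/start-slaves.sh
```

As the report states, restarting the slaves after setting SPARK_LOCAL_IP did not change the address the master returned, which is the bug being filed here.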