[ https://issues.apache.org/jira/browse/SPARK-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen updated SPARK-6188:
-----------------------------
    Shepherd:   (was: Josh Rosen)
    Assignee: Theodore Vasiloudis

> Instance types can be mislabeled when re-starting cluster with default arguments
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-6188
>                 URL: https://issues.apache.org/jira/browse/SPARK-6188
>             Project: Spark
>          Issue Type: Bug
>          Components: EC2
>    Affects Versions: 1.0.2, 1.1.0, 1.1.1, 1.2.0, 1.2.1
>            Reporter: Theodore Vasiloudis
>            Assignee: Theodore Vasiloudis
>            Priority: Minor
>             Fix For: 1.4.0
>
>
> This was discovered when investigating
> https://issues.apache.org/jira/browse/SPARK-5838.
> In short, when restarting a cluster that you launched with a non-default
> instance type, you have to provide the instance type(s) again in the
> "/spark-ec2 -i <key-file> --region=<ec2-region> start <cluster-name>"
> command. Otherwise the instance type is set to the default, m1.large.
> This then affects the setup of the machines.
> I'll submit a pull request that takes care of this, without the user needing
> to provide the instance type(s) again.
>
> EDIT:
> Example case where this becomes a problem:
> 1. The user launches a cluster of instances with 1 disk, e.g. m3.large.
> 2. The user stops the cluster.
> 3. When the user restarts the cluster with the start command without
>    providing the instance type, the setup is performed using the default
>    instance type, m1.large, which assumes 2 disks are present in the machine.
> 4. SPARK_LOCAL_DIRS is then set to "mnt/spark,mnt2/spark". /mnt2
>    corresponds to the snapshot partition in an m3.large instance, which is only
>    8GB in size. When the user runs jobs that shuffle data, this partition fills
>    up quickly, resulting in failed jobs due to "No space left on device" errors.
>
> Apart from this example, one could come up with other cases where the setup
> of the machines is wrong because they are assumed to be of type m1.large.
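
The failure mode described in the issue can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the eventual pull request: the function names, the `DISKS_BY_TYPE` table, and the leading-slash path format are all hypothetical, and the only disk counts used are the two given in the report (m1.large has 2 disks, m3.large has 1). The idea is that on restart the instance type is taken from the already-existing instances rather than from the command-line default.

```python
# Hypothetical sketch; none of these names come from the spark-ec2 script.
DEFAULT_INSTANCE_TYPE = "m1.large"

# Disk counts exactly as stated in the issue report; other types are omitted.
DISKS_BY_TYPE = {"m1.large": 2, "m3.large": 1}


def resolve_instance_type(requested, existing_types):
    """Choose the instance type to use when starting a stopped cluster.

    requested: type passed on the command line, or None if it was omitted.
    existing_types: types reported for the cluster's existing instances.
    """
    if requested:
        return requested
    unique = set(existing_types)
    if not unique:
        # Nothing to inspect, so fall back to the documented default.
        return DEFAULT_INSTANCE_TYPE
    if len(unique) > 1:
        raise ValueError("mixed instance types: %s" % sorted(unique))
    return unique.pop()


def local_dirs(instance_type):
    """Build a SPARK_LOCAL_DIRS value with one directory per disk."""
    num_disks = DISKS_BY_TYPE[instance_type]
    dirs = ["/mnt/spark"]
    dirs += ["/mnt%d/spark" % i for i in range(2, num_disks + 1)]
    return ",".join(dirs)


if __name__ == "__main__":
    # Restart without an explicit type: use what the instances report.
    itype = resolve_instance_type(None, ["m3.large", "m3.large"])
    print(itype, local_dirs(itype))
```

With the type resolved from the running instances, an m3.large cluster gets a single local directory, so shuffle data is never written to the small /mnt2 partition. In practice the master and slave types would each be resolved this way, since the report mentions that several instance types may be involved.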