Github user vidaha commented on the pull request:

    https://github.com/apache/spark/pull/1899#issuecomment-51949860
  
    Hi Josh,
    
    IMHO, it's best not to require a Spark cluster name and the security group 
to be the same.  While you can reuse an existing security group to launch 
another cluster, you can't launch more than one cluster with the same security 
group.  Perhaps a company wants to have an internal-applications or dev 
security group and reuse it to launch multiple Spark clusters.  In addition, 
AWS has a strict limit of 100 on the number of security groups on a VPC, and 
since two security groups are required (one for the masters and one for the 
workers), this means that only 50 Spark clusters can be launched on a VPC.  
While that might seem like a reasonable limit, I can easily see companies 
having a use case to exceed that.
    
    Would you mind illustrating the name-conflict problem?  If I understand 
you correctly, you are describing this scenario:
    
    % ./spark-ec2 … --security-group my-security-group launch my-cluster-name
    
    And then later, you also run:
    
    % ./spark-ec2 … launch my-security-group
    
    This works fine - I tested it - there will be two clusters with the same 
security group, but different names.  These are some error cases that I thought 
might occur; I tested them manually and they worked out fine:
    
    - Can you then run ./spark-ec2 … --delete-groups destroy 
my-security-group and delete the security group when another cluster is using 
that security group?
    
    I tried this, and Amazon has the correct controls to prevent deleting a 
security group that is still in use by another cluster.
    
    - Can you forget to include the security group override on a launched 
cluster and create problems?
    
    I tried this as well, and since the get_existing_cluster code was modified 
to use the name rather than the security group to identify the instances, this 
works correctly.  You can’t run:
    
    % ./spark-ec2 … launch my-cluster-name
    
    if there is already a cluster with my-cluster-name launched.
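
    A minimal sketch of the name-based lookup described above, assuming a 
simplified in-memory model.  The "cluster" field and both helper names are 
hypothetical and are not taken from spark_ec2.py, which queries EC2 rather 
than a list of dicts:

```python
# Hypothetical model: each instance records the cluster it belongs to
# separately from the security group it was launched in.
def get_cluster_instances(instances, cluster_name):
    """Return the instances whose cluster name matches, ignoring the group."""
    return [inst for inst in instances if inst["cluster"] == cluster_name]


def can_launch(instances, cluster_name):
    """A launch is allowed only if no instance already uses this cluster name."""
    return not get_cluster_instances(instances, cluster_name)


# Two clusters sharing one security group, as in the scenario above.
instances = [
    {"id": "i-001", "cluster": "my-cluster-name", "group": "my-security-group"},
    {"id": "i-002", "cluster": "my-security-group", "group": "my-security-group"},
]
```

    With this model, can_launch(instances, "my-cluster-name") is False even 
though both clusters share one security group, because the lookup keys on 
the cluster name only.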
    
    
    Are there any other possible conflicts that you can think of?  If you can 
write out the commands that illustrate the use cases you have in mind, I can 
run them and see what happens.
    
    -Vida


