GitHub user CodingCat opened a pull request:
https://github.com/apache/spark/pull/59
SPARK-1166: clean vpc_id if the group was just now created
Reported in https://spark-project.atlassian.net/browse/SPARK-1166
In some unusual situations (when the newly created master_group and slave_group carry a valid vpc_id), the user will receive the following error when running the spark-ec2 script:
```
Setting up security groups...
ERROR:boto:400 Bad Request
ERROR:boto:<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>InvalidParameterValue</Code><Message>Invalid value 'null' for protocol. VPC security group rules must specify protocols explicitly.</Message></Error></Errors><RequestID>fc56f0ba-915a-45b6-8555-05d4dd0f14ee</RequestID></Response>
Traceback (most recent call last):
  File "./spark_ec2.py", line 813, in <module>
    main()
  File "./spark_ec2.py", line 806, in main
    real_main()
  File "./spark_ec2.py", line 689, in real_main
    conn, opts, cluster_name)
  File "./spark_ec2.py", line 244, in launch_cluster
    slave_group.authorize(src_group=master_group)
  File "/Users/nanzhu/code/spark/ec2/third_party/boto-2.4.1.zip/boto-2.4.1/boto/ec2/securitygroup.py", line 184, in authorize
  File "/Users/nanzhu/code/spark/ec2/third_party/boto-2.4.1.zip/boto-2.4.1/boto/ec2/connection.py", line 2181, in authorize_security_group
  File "/Users/nanzhu/code/spark/ec2/third_party/boto-2.4.1.zip/boto-2.4.1/boto/connection.py", line 944, in get_status
boto.exception.EC2ResponseError: EC2ResponseError: 400 Bad Request
<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>InvalidParameterValue</Code><Message>Invalid value 'null' for protocol. VPC security group rules must specify protocols explicitly.</Message></Error></Errors><RequestID>fc56f0ba-915a-45b6-8555-05d4dd0f14ee</RequestID></Response>
```
The related code in boto is as follows:
```
group_name = None
if not self.vpc_id:
    group_name = self.name
group_id = None
if self.vpc_id:
    group_id = self.id
src_group_name = None
src_group_owner_id = None
src_group_group_id = None
if src_group:
    cidr_ip = None
    src_group_owner_id = src_group.owner_id
    if not self.vpc_id:
        src_group_name = src_group.name
    else:
        if hasattr(src_group, 'group_id'):
            src_group_group_id = src_group.group_id
        else:
            src_group_group_id = src_group.id
status = self.connection.authorize_security_group(group_name,
                                                   src_group_name,
                                                   src_group_owner_id,
                                                   ip_protocol,
                                                   from_port,
                                                   to_port,
                                                   cidr_ip,
                                                   group_id,
                                                   src_group_group_id)
```
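To make the failure concrete, here is a small sketch (group names and region are illustrative assumptions; boto 2.x API) of the call spark_ec2.py makes and why it trips the VPC branch above:
```
# Illustrative sketch, not part of spark_ec2.py.
import boto.ec2
from boto.exception import EC2ResponseError

conn = boto.ec2.connect_to_region("us-east-1")  # assumes AWS credentials
# Fetch two groups that, as in the report, come back with a valid vpc_id.
groups = dict((g.name, g) for g in conn.get_all_security_groups())
master_group = groups["test-cluster-master"]   # illustrative names
slave_group = groups["test-cluster-slaves"]

# spark_ec2.py passes only src_group, so ip_protocol, from_port, to_port and
# cidr_ip all stay None.  Because slave_group.vpc_id is set, boto takes the
# VPC branch above and forwards ip_protocol=None to EC2.
try:
    slave_group.authorize(src_group=master_group)
except EC2ResponseError as e:
    print(e)  # 400 InvalidParameterValue: "Invalid value 'null' for protocol..."

# For contrast (not the approach taken in this PR), EC2 accepts the VPC rule
# once the protocol is spelled out explicitly:
slave_group.authorize(ip_protocol='tcp', from_port=0, to_port=65535,
                      src_group=master_group)
```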
Because the newly created groups carry a vpc_id, boto takes the VPC branch above and forwards ip_protocol=None, which EC2 rejects, while the name-based (non-VPC) branch does not hit this check. So if we have just created a new cluster, we should clear the vpc_id for the user.
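A minimal sketch of that idea, modeled on spark_ec2.py's get_or_make_group helper (the body below is an assumption for illustration, not the PR diff):
```
def get_or_make_group(conn, name):
    """Return the EC2 security group named `name`, creating it if necessary."""
    groups = [g for g in conn.get_all_security_groups() if g.name == name]
    if groups:
        return groups[0]
    print("Creating security group " + name)
    group = conn.create_security_group(name, "Spark EC2 group")
    # The group was just now created: clear any vpc_id it carries so that
    # later group.authorize(src_group=...) calls take the name-based branch
    # instead of the VPC branch shown above.
    group.vpc_id = None
    return group
```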
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/CodingCat/spark SPARK-1166
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/59.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #59
----
commit ecf485c3cd31a475951b427eafbd1eaa1dfb71d5
Author: CodingCat <[email protected]>
Date: 2014-03-03T02:56:27Z
clean vpc_id if the group was just now created
----