Hello Whirr Developers,

My name is Bruce Basil Mathews. I am the Western Regional Solutions Architect for HP Public Cloud Services (the largest OpenStack deployment in the U.S.). Recently, I had an opportunity to test the operation and effectiveness of Whirr for deploying the Cloudera CDH4 stack to our cloud. I was very impressed with the overall capabilities and end results, but I found a few things that the development team may want to address to optimize Whirr for the task.

1. Security Group Entries: In addition to opening port 22 (ssh) in the initial phases, you may also wish to create an entry for ICMP from -1 to -1 (all ICMP types and codes) with a CIDR of 0.0.0.0/0. This will allow the Compute Instances to use ping and other verification methods between addresses. Also, in my case, the needed DataNode and TaskTracker ports never seemed to be inserted into the Security Group, so I added them manually. I think this second issue may be related to the Name Resolution issue described in item 4.

2. IP Assignment: It appears that you are creating each Compute Instance and then assigning a Floating IP address, which is then used for inter-process communication between the nodes of the cluster. It might be better to use the Private IP addresses for node communication behind our Firewall, and to expose only the Public IP addresses of the NameNode and JobTracker nodes for proxying to and from the Whirr server for Hadoop command execution.

3. Security Group Rule Masking: It seems as though you tried to use the Floating IP address scheme to mask the rules created. This sets up some rather complex scenarios until we move to Neutron in Grizzly. In the interim, you may consider using the Private IP address scheme for masking, or simply use 0.0.0.0/0.

4. Name Resolution: If you use the Private IP addresses, then the DNS services behind our Firewall can resolve the host names of all of the Compute Instances involved, but it may be a good idea to update the hosts file on every node to include the Private IP addresses of all the involved Compute Instances, just to be safe. This is related to the issues brought up in items #1, #2 and #3 above.

I hope you don't mind these suggestions! Aside from these four items, and the need to 'reboot' the cluster after manual repairs, the deployment went very well from my perspective! I have a document outlining the procedure I used and the results achieved.
Please send me a direct email and I will be happy to send it to whoever is interested. It was too large to attach to the email.

Please do let me know if this information is helpful and useful for subsequent action, or if I should not be using this group as a forum for such things. K?

The Very Best!

Bruce Basil Mathews
HP Cloud Services, DBaaS Architect
+1 760 961 7699 / Tel
+1 760 553 3197 / Mobile
HP Public Cloud Site: http://www.hpcloud.com<http://www.hp.com/go/proliantgen8>

'All the world is a stage, and all the men and women in it, merely players'
Jaques, As You Like It!
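
P.S. To make the hosts file update suggested in item 4 a little more concrete, here is a minimal Python sketch of how the entries might be generated. It is only an illustration under stated assumptions: the function name `render_hosts_entries`, the node names and the 10.0.0.x addresses are all hypothetical, and nothing here reflects how Whirr itself handles name resolution. The idea is simply to build one hosts-file line per Compute Instance from its Private IP address and to reject any address that is not in a private range.

```python
import ipaddress


def render_hosts_entries(nodes):
    """Return hosts-file lines for a mapping of hostname -> private IP.

    Raises ValueError if an address is not in a private range, since the
    point of the exercise is to keep node traffic on the Private IPs.
    """
    lines = []
    for hostname, ip in sorted(nodes.items()):
        addr = ipaddress.ip_address(ip)  # also validates the address syntax
        if not addr.is_private:
            raise ValueError(f"{hostname}: {ip} is not a private address")
        lines.append(f"{addr}\t{hostname}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical cluster layout, for illustration only.
    cluster = {
        "namenode": "10.0.0.11",
        "datanode-1": "10.0.0.12",
        "datanode-2": "10.0.0.13",
    }
    print(render_hosts_entries(cluster), end="")
```

The generated text would then be appended to the hosts file on every node by whatever mechanism the deployment already uses for configuration scripts.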
