We have been running the code from trunk on our production system since this morning. I have made a few reservations and captured an image. I have not encountered any problems.

-Andy
On Thu, Mar 19, 2015 at 4:31 PM, Andy Kurth wrote:
> I believe the problems described in VCL-839 are fixed in trunk. I did not
> make any changes to vcl-install.sh, but updated the backend code to not
> rely on the management node's private IP address. Please test it out. The
> changes only affect Linux images. When testing, be sure to verify the
> firewall is correct:
>
> -after an image loads, before post_load is executed (22 should be open to
> all IPs)
> -after post_load is executed (22 should be closed, all of management
> node's IPs should be allowed to connect to any port)
> -after a user clicks Connect but before connecting (22 should be open to
> any IP, management node access shouldn't change)
> -after a user connects (22 should be locked down to user's IP, management
> node access shouldn't change)
> -after a user clicks Connect and the reservation times out due to no
> initial connection, after sanitize is executed (22 should be closed to all,
> management node should still be allowed from any of its IPs to any port)
> -after pre_capture is executed (22 should be open to all)
> -after a user clicks Connect from a different remote IP address (22 should
> be allowed from user's original and new remote IP)
>
> It is also beneficial to test the outcome if the management node is only
> allowed to connect on port 22. Manually change iptables and check the
> various stages. Under no condition should the management node be locked
> out.
>
> We are essentially running 2.4 in production right now. I'll update all
> of our management nodes tomorrow morning to trunk and we will watch things
> closely. If no problems are identified, I think a release candidate could
> be created late in the day tomorrow.
>
> Thanks,
> Andy
>
> On Wed, Mar 18, 2015 at 4:09 PM, Andy Kurth wrote:
>
>>
>> On Wed, Mar 18, 2015 at 11:08 AM, Aaron Coburn wrote:
>>
>>> I'm in favor of whatever would be least confusing to users. And that
>>> probably means waiting until a 2.4.1 release before announcing it on the
>>> a.o mailing list.
>>>
>>
>> Agree.
>>
>> Regarding 2.4.1, the problem discovered yesterday has been fixed in
>> trunk. I tested a few 15-VM reservations using the code in trunk and
>> cluster_info was correct.
>>
>> However, I found another problem described in
>> https://issues.apache.org/jira/browse/VCL-839. Using a slightly
>> modified vcl-install.sh, I installed a new CentOS 6.5 VM with VCL 2.4 and
>> then updated it with the code in trunk. I was able to create a CentOS 6.5
>> base image and make reservations without any problems. When I attempted to
>> capture one of the reservations, it failed because the management node had
>> locked itself out after the first user connection was detected. This is
>> described ad nauseam in the Jira issue.
>>
>> The problem is partially due to vcl-install.sh using localhost by default
>> as the management node name. We could change the script to use something
>> else. Regardless, the management node name must resolve to the private IP
>> address or problems will occur. The script should add an entry to
>> /etc/hosts so the MN's hostname in vcld.conf and the management node table
>> resolves to the MN's private IP address. Josh primarily developed the
>> script but is travelling this week. I can try to address the issues with
>> the script tomorrow.
>>
>> This will fix the install script but there are still problems with the
>> code. A management node should never lock itself out. These problems can
>> be pushed off in my opinion but we need to add to the install documentation
>> a step to make sure the MN's hostname resolves to its private IP address.
>>
>> Thoughts?
>>
>> Regards,
>> Andy
>>
>
>
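
For anyone who wants to script the "MN's hostname resolves to its private IP address" check from the 18 Mar message, here is a minimal Python sketch. It is not part of VCL or vcl-install.sh. The function name, the injectable resolver parameter, and the hostname and addresses in the usage lines are all hypothetical and only there for illustration.

```python
import ipaddress
import socket


def resolves_to_private_ip(hostname, expected_ip, resolver=socket.gethostbyname_ex):
    """Return True if `hostname` resolves to `expected_ip`.

    `resolver` defaults to the system resolver (which consults /etc/hosts on a
    typical Linux setup); it is a parameter only so the check can be exercised
    without touching real DNS. This helper is hypothetical, not a VCL API.
    """
    try:
        _name, _aliases, addresses = resolver(hostname)
    except OSError:
        # Covers socket.gaierror and socket.herror: the name did not resolve.
        return False
    expected = ipaddress.ip_address(expected_ip)
    return any(ipaddress.ip_address(addr) == expected for addr in addresses)


if __name__ == "__main__":
    # Hypothetical values: substitute the MN name from vcld.conf and its
    # private address.
    print(resolves_to_private_ip("localhost", "10.10.0.1"))
```

With the default install described above (management node name left as localhost), a check like this would be expected to fail, since localhost normally resolves to a loopback address rather than the private one.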

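
The firewall checklist in the 19 Mar message could also be written down as a small table of expected states, which may make manual testing less error prone. The sketch below is only a model under stated assumptions: the stage labels are made up for this example (they are not VCL identifiers), and the rules are a simplified list of (source, port) ACCEPT entries that you would have to fill in yourself from the image's iptables configuration. It does not read iptables.

```python
import ipaddress

ANY = "0.0.0.0/0"

# Expected port-22 exposure per stage, transcribed from the checklist.
# Stage names are hypothetical labels chosen for this sketch.
EXPECTED_SSH = {
    "after_load_before_post_load": "all",
    "after_post_load": "closed",
    "after_connect_click": "all",
    "after_user_connects": "user_only",
    "after_timeout_sanitize": "closed",
    "after_pre_capture": "all",
    "after_connect_from_new_ip": "users_allowed",
}


def allowed(rules, src_ip, port):
    """rules: list of (source CIDR or IP, port or None for any port)."""
    addr = ipaddress.ip_address(src_ip)
    return any(
        addr in ipaddress.ip_network(cidr) and rule_port in (None, port)
        for cidr, rule_port in rules
    )


def check_stage(stage, rules, mn_ips, user_ips, outsider_ip):
    """Return a list of problems for `stage` (empty list if none found)."""
    problems = []
    # Invariant from the message: the management node is never locked out.
    for ip in mn_ips:
        if not allowed(rules, ip, 22):
            problems.append(f"management node {ip} cannot reach port 22")
    expected = EXPECTED_SSH[stage]
    outsider_ok = allowed(rules, outsider_ip, 22)
    if expected == "all" and not outsider_ok:
        problems.append("port 22 should be open to all IPs")
    if expected in ("closed", "user_only") and outsider_ok:
        problems.append("port 22 should not be open to arbitrary IPs")
    if expected in ("user_only", "users_allowed"):
        for ip in user_ips:
            if not allowed(rules, ip, 22):
                problems.append(f"user {ip} cannot reach port 22")
    return problems


if __name__ == "__main__":
    # Hypothetical addresses from the documentation ranges.
    mn = ["10.10.0.1", "192.0.2.1"]
    locked_out = [("198.51.100.7", 22)]  # only the user's IP is allowed
    for problem in check_stage("after_user_connects", locked_out, mn,
                               ["198.51.100.7"], "203.0.113.9"):
        print(problem)
```

The example at the bottom models the failure described in VCL-839: once port 22 is locked down to the user's IP with no rule for the management node, the check reports each management node address as locked out. Only port 22 is checked for the management node here, which matches the "only allowed to connect on port 22" test case; the "any port" expectation after post_load is not modelled.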