We have been running the code from trunk on our production system since
this morning.  I have made a few reservations and captured an image.  I
have not encountered any problems.
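
For anyone repeating the firewall checks from the quoted message below, the
expected port 22 behaviour at each stage can be written down as a small
lookup. This is only a sketch: the stage names, the function name and the
sample addresses are made up for illustration and are not part of the VCL
code. It assumes the management node must always get through, per the
"under no condition should the management node be locked out" rule.

```python
from ipaddress import ip_address

# Who may reach port 22 at each stage of the checklist:
# "any" (open to all), "none" (closed), or "user" (only the user's IPs).
# Stage names are hypothetical labels, not VCL identifiers.
SSH_POLICY = {
    "image_loaded": "any",         # after load, before post_load
    "post_load": "none",
    "connect_clicked": "any",      # user clicked Connect, not yet connected
    "user_connected": "user",
    "timeout_sanitized": "none",   # no initial connection, sanitize has run
    "pre_capture": "any",
    "connect_new_ip": "user",      # original and new remote IP both allowed
}


def ssh_should_succeed(stage, source, user_ips=(), mn_ips=()):
    """Return True if a port 22 connection from `source` is expected to work."""
    src = ip_address(source)
    # The management node must never be locked out, whatever the stage.
    if src in {ip_address(ip) for ip in mn_ips}:
        return True
    policy = SSH_POLICY[stage]
    if policy == "any":
        return True
    if policy == "user":
        return src in {ip_address(ip) for ip in user_ips}
    return False


if __name__ == "__main__":
    mn = ["10.0.0.1", "192.0.2.1"]   # example management node addresses
    user = ["203.0.113.5"]           # example user address
    for stage in SSH_POLICY:
        print(stage,
              "mn:", ssh_should_succeed(stage, mn[0], user, mn),
              "user:", ssh_should_succeed(stage, user[0], user, mn),
              "other:", ssh_should_succeed(stage, "198.51.100.7", user, mn))
```

Comparing this table against what iptables actually shows at each stage
is still a manual step.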
-Andy
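
On the hostname requirement from the oldest quoted message (the MN's
hostname must resolve to its private IP address), a rough way to check an
/etc/hosts file is sketched here. The function names, the sample hostname
and the sample addresses are hypothetical, and the sketch assumes the first
matching line in the file is the one that counts. It only reads
hosts-style text and does not consult DNS.

```python
def hosts_lookup(hosts_text, name):
    """Return every address that `name` maps to in /etc/hosts-style text."""
    wanted = name.lower()
    addresses = []
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], [n.lower() for n in fields[1:]]
        if wanted in names:
            addresses.append(address)
    return addresses


def resolves_to_private_ip(hosts_text, mn_name, private_ip):
    """True if the first entry for `mn_name` is the expected private IP."""
    addresses = hosts_lookup(hosts_text, mn_name)
    return bool(addresses) and addresses[0] == private_ip


if __name__ == "__main__":
    # Example content only; on a real management node read /etc/hosts.
    sample = (
        "127.0.0.1   localhost localhost.localdomain\n"
        "10.0.0.1    mn1.example.org mn1   # private interface\n"
    )
    print(resolves_to_private_ip(sample, "mn1", "10.0.0.1"))
    print(resolves_to_private_ip(sample, "localhost", "10.0.0.1"))
```

With localhost as the management node name, as vcl-install.sh uses by
default, a check like this fails, which matches the problem described in
VCL-839.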

On Thu, Mar 19, 2015 at 4:31 PM, Andy Kurth <[email protected]> wrote:

> I believe the problems described in VCL-839 are fixed in trunk.  I did not
> make any changes to vcl-install.sh, but updated the backend code to not
> rely on the management node's private IP address.  Please test it out.  The
> changes only affect Linux images.  When testing, be sure to verify the
> firewall is correct:
>
> -after an image loads, before post_load is executed (22 should be open to
> all IPs)
> -after post_load is executed (22 should be closed, all of management
> node's IPs should be allowed to connect to any port)
> -after a user clicks Connect but before connecting (22 should be open to
> any IP, management node access shouldn't change)
> -after a user connects (22 should be locked down to user's IP, management
> node access shouldn't change)
> -after a user clicks Connect and the reservation times out due to no
> initial connection, after sanitize is executed (22 should be closed to all,
> management node should still be allowed from any of its IPs to any port)
> -after pre_capture is executed (22 should be open to all)
> -after a user clicks Connect from a different remote IP address (22 should
> be allowed from user's original and new remote IP)
>
> It is also beneficial to test the outcome if the management node is only
> allowed to connect on port 22.  Manually change iptables and check the
> various stages.  Under no condition should the management node be locked
> out.
>
> We are essentially running 2.4 in production right now.  I'll update all
> of our management nodes tomorrow morning to trunk and we will watch things
> closely.  If no problems are identified, I think a release candidate could
> be created late in the day tomorrow.
>
> Thanks,
> Andy
>
> On Wed, Mar 18, 2015 at 4:09 PM, Andy Kurth <[email protected]> wrote:
>
>>
>> On Wed, Mar 18, 2015 at 11:08 AM, Aaron Coburn <[email protected]>
>> wrote:
>>
>>> I'm in favor of whatever would be least confusing to users. And that
>>> probably means waiting until a 2.4.1 release before announcing it on the
>>> a.o mailing list.
>>>
>>
>> Agree.
>>
>> Regarding 2.4.1, the problem discovered yesterday has been fixed in
>> trunk.  I tested a few 15-VM reservations using the code in trunk and
>> cluster_info was correct.
>>
>> However, I found another problem described in
>> https://issues.apache.org/jira/browse/VCL-839.  Using a slightly
>> modified vcl-install.sh, I installed a new CentOS 6.5 VM with VCL 2.4 and
>> then updated it with the code in trunk.  I was able to create a CentOS 6.5
>> base image and make reservations without any problems.  When I attempted to
>> capture one of the reservations, it failed because the management node had
>> locked itself out after the first user connection was detected.  This is
>> described ad nauseam in the Jira issue.
>>
>> The problem is partially due to vcl-install.sh using localhost by default
>> as the management node name.  We could change the script to use something
>> else.  Regardless, the management node name must resolve to the private IP
>> address or problems will occur.  The script should add an entry to
>> /etc/hosts so the MN's hostname in vcld.conf and the management node table
>> resolves to the MN's private IP address.  Josh primarily developed the
>> script but is travelling this week.  I can try to address the issues with
>> the script tomorrow.
>>
>> This will fix the install script but there are still problems with the
>> code.  A management node should never lock itself out.  In my opinion
>> these problems can be deferred, but we need to add a step to the install
>> documentation to make sure the MN's hostname resolves to its private IP
>> address.
>>
>> Thoughts?
>>
>> Regards,
>> Andy
>>
>
>
