Problems with update_cluster in OS.pm

Andy Kurth Wed, 04 Feb 2015 15:08:07 -0800

I was working on some issues related to cluster requests and discovered
there are some serious issues with the update_cluster subroutine in OS.pm.
This code was reworked after the previous VCL 2.3.2 release.  These
problems need to be corrected before 2.4 may be released.


Problem 1:
Due to OS.pm::update_cluster, the processing of the reserved state for each
computer is taking an extremely long time.  A fairly large cluster of 80
computers is taking over 20 minutes to complete, not including the time it
spends waiting for the user to click Connect and to actually connect to the
computer.  This is longer than it took to actually load the bare-metal
computers via xCAT.

Problem 2:
Every reservation that belongs to a cluster request first creates its own
cluster_info file in /tmp.  The file that each creates is identical.  Every
reservation then proceeds to copy the file it created to every other
computer in the cluster request via SCP.  So, for an 80 node cluster,
cluster_info files are needlessly copied back and forth 6,400 times.

Problem 3:
Each node does not copy the cluster_info file it created in /tmp to the
proper location on itself.  Instead, it sends it to itself via SCP.
There's really no need for this because the file will end up in the proper
location when the other 79 nodes SCP it over.

Problem 4:
The subroutine is using the following to determine where to send the file
to via SCP:
my $node_name = $request_data->{reservation}{$resid}{computer}{SHORTNAME};

If you're not familiar with the backend code, the remote computer's short
hostname is being used.  For example, if the FQDN is vm9.example.com, the
code is trying to send the file to "vm9".  This will only work if "vm9"
resolves to an address the management node can get to.  This will not work
for distributed management nodes residing in different locations.
Furthermore, with the move away from relying on /etc/hosts, whenever a
hostname is passed to run_scp_command, the computer's private IP address
will be used.  Examine the determine_remote_connection_target for more
information.  If absolutely necessary to SCP files back and forth, the
public IP address must be used.

Problem 5:
The update_cluster subroutine is attempting to open the firewall on each
computer for incoming traffic from the other computers in the cluster.
This is good, but the way it is implemented is very slow an inefficient.
It is also not reliable due to the way the enable_firewall_port subroutines
are handling things.  Errors are being generated because the command being
generated is too large:

|10435|2626116|2726142|reserved|Linux.pm:enable_firewall_port|3828| ----
> WARNING ----
> |10435|2626116|2726142|reserved|Linux.pm:enable_firewall_port|3828|
> 2015-02-04
> 16:28:26|10435|2626116|2726142|reserved|Linux.pm:enable_firewall_port|3828|failed
> to enable firewall port on blade1f5-5, protocol: tcp, port: any, scope:
> <omitted>, exit status: 1, command:
> |10435|2626116|2726142|reserved|Linux.pm:enable_firewall_port|3828|
> iptables -D INPUT 9 && iptables -D INPUT 8 && iptables -D INPUT 7 &&
> iptables -D INPUT 6 && iptables -D INPUT 5 && iptables -D INPUT 4 &&
> iptables -D INPUT 3 && iptables -D INPUT 2 && iptables -D INPUT 10 &&
> iptables -D INPUT 1 && /sbin/iptables -v -I INPUT 1 -p tcp -j ACCEPT -s
> 10.25.0.241 -m state --state NEW,RELATED,ESTABLISHED &&

...much more omitted...
> |10435|2626116|2726142|reserved|Linux.pm:enable_firewall_port|3828| output:
> |10435|2626116|2726142|reserved|Linux.pm:enable_firewall_port|3828|
> iptables: Index of deletion too big.
> |10435|2626116|2726142|reserved|Linux.pm:enable_firewall_port|3828| ( 0)
> Linux.pm, enable_firewall_port (line: 3828)
> |10435|2626116|2726142|reserved|Linux.pm:enable_firewall_port|3828| (-1)
> OS.pm, update_cluster (line: 3909)
> |10435|2626116|2726142|reserved|Linux.pm:enable_firewall_port|3828| (-2)
> reserved.pm, process (line: 157)
> |10435|2626116|2726142|reserved|Linux.pm:enable_firewall_port|3828| (-3)
> vcld, make_new_child (line: 587)
> |10435|2626116|2726142|reserved|Linux.pm:enable_firewall_port|3828| (-4)
> vcld, main (line: 348)



Problem 6:
The subroutine is calling $self->data->get_request_data() and then
accessing the hash data directly.  Example:
$request_data->{reservation}{$rid}{computer}{IPaddress}

The past few years have been spent trying to get rid of code which directly
accesses the hash constructed by get_request_info.  There are functions
built into DataStructure.pm which may be accessed via  $self->data to
retrieve this information, including information about other reservations
in the cluster.  The get_request_data function should only be used when
absolutely necessary -- such as from DataStructure.pm or if the entire hash
is needed to create another module object.

Problem 7:
The reserved.pm::process subroutine calculates the exact time it should
timeout a reservation due to no initial connection by taking the time the
user clicked connect and adding the initialconnecttimeout variable value to
it.  The update_cluster subroutine gets run in the meantime.  As it is
currently written, it can take longer than the entire initialconnecttimeout
value.  When control is passed back to reserved.pm::process, the code
realizes the initial connection expire time is in the past so connection
checking isn't being done at all.

Problem 8:
The code includes the following:
if ($image_OS_type =~ /linux/i) {
  ...
}
elsif ($image_OS_type =~ /windows/i) {
  ...
}

The backend is architected so that these if/else statements should be
avoided whenever possible.  Sure, an if/else such as this is quick and easy
but the more elegant way to handle it would be to add a commonly named
subroutine to the Linux and Windows modules that handles the specifics of
this statement.  Again, work has been done over the past few years to
rework code such as this.  I see no reason to keep adding it back.

Problem 9:
"$image_OS_type" - There are over 9,000 variables defined in the backend
code.  Variables intended to act as constants for an entire module are all
uppercase.  Nearly all other variables are entirely lowercase, except for a
few in some very old code.  I see no reason to mix case.

Regards,
Andy

Problems with update_cluster in OS.pm

Reply via email to