>From this thread: http://vcl.markmail.org/thread/2jyeef4dt3zfz5iq
Set the output of the SSH call to be verbose, and am getting "connection timed out during banner exchange"[1] [2]. Dmitri has changed his to 10 seconds with VCL 2.3 [3], but we haven't made any changes to the code and this suddenly started happening. 7804|5:5|new| ---- WARNING ---- |7804|5:5|new| 2014-05-29 10:28:33|7804|5:5|new|utils.pm:run_ssh_command(5010)|attempt 1/3: failed to execute SSH command on vcl02: 'echo "testing ssh on vcl02"', exit status: 255, SSH exits with the exit status of the remote command or with 255 if an error occurred, output: |7804|5:5|new| ssh output (echo "test...): Connection timed out during banner exchange |7804|5:5|new| ( 0) utils.pm, run_ssh_command (line: 5010) |7804|5:5|new| (-1) OS.pm, is_ssh_responding (line: 668) |7804|5:5|new| (-2) VMware.pm, initialize (line: 303) |7804|5:5|new| (-3) Module.pm, new (line: 207) |7804|5:5|new| (-4) Module.pm, create_provisioning_object (line: 522) |7804|5:5|new| (-5) State.pm, initialize (line: 136) [1] http://vcl.markmail.org/thread/kixneixigivlgqjh [2] http://vcl.markmail.org/thread/bulmesaak4bqe44a [3] http://vcl.markmail.org/thread/qzxovcgfklmsfce5 Is network congestion the most likely issue? vcl-private is on its own, isolated network between two servers -- there's not even a switch in between... Thanks, Mike On Thu, May 29, 2014 at 1:14 PM, Mike Haudenschild <[email protected]> wrote: > FWIW, seeing this across multiple management nodes and multiple VM hosts. > > > On Thu, May 29, 2014 at 1:07 PM, Mike Haudenschild <[email protected]> > wrote: > >> Yes, I'm able to connect manually via SSH using >> >> ssh -i /etc/vcl/vcl.key <vmhostname> >> >> By IP and by name with and without DNS suffix. >> >> Thanks, >> Mike >> >> >> On Thu, May 29, 2014 at 1:04 PM, Aaron Coburn <[email protected]> >> wrote: >> >>> Mike, >>> >>> For vmware hosts, vcld will always try to connect over the vSphere API >>> first. If that fails, it will try ssh. It appears the problem you're >>> encountering relates to the VM hosts not responding to ssh. Are you able to >>> manually connect to the host via ssh? >>> >>> Aaron Coburn >>> >>> >>> On May 29, 2014, at 11:49 AM, Mike Haudenschild <[email protected]> >>> wrote: >>> >>> > Good morning -- >>> > >>> > We're unable to create reservations, or even migrate VMs from VM >>> hosts, via the VCL Web UI. Seeing the following in the logs, which seem to >>> suggest that the management node is trying to use the vSphere SDK to >>> control the ESXi host, which is not desired. The username/password fields >>> in the VM host profile assigned to these VM hosts is EMPTY, which I believe >>> is the criterion on which VCL decides what method to use to control ESXi. >>> I confirmed that these fields are NULL in the database for this host >>> profile, as well. >>> > >>> > I would create a new/clean host profile and move the VM hosts there, >>> but since they have VMs attached to them (and I can't migrate them off), >>> VCL won't allow me to change the individual VM hosts' profiles to a new one. >>> > >>> > Thanks very much, >>> > Mike >>> > >>> > >>> > >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|Module.pm:create_mn_os_object(381)|VCL::Module::OS::Linux::ManagementNode >>> OS object created, address: 97e1ae0 >>> > 2014-05-29 08:38:57|7330|109:109|new|Module.pm:new(192)|VCL::new >>> object created for state new, address: 8f50650 >>> > 2014-05-29 08:38:57|7330|109:109|new|State.pm:initialize(88)|obtained >>> a database handle for this state process, stored as $ENV{dbh} >>> > 2014-05-29 08:38:57|7330|109:109|new|State.pm:check_image_os(771)|no >>> corrections need to be made, not an imaging request, returning 1 >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|DataStructure.pm:is_parent_reservation(1004)|returning >>> true: parent reservation ID for this request: 109 >>> > 2014-05-29 >>> > 08:38:57|7330|109:109|new|utils.pm:rename_vcld_process(6752)|reservation >>> count: 1 >>> > 2014-05-29 >>> > 08:38:57|7330|109:109|new|utils.pm:rename_vcld_process(6771)|PARENTIMAGE: >>> 1 >>> > 2014-05-29 >>> > 08:38:57|7330|109:109|new|utils.pm:rename_vcld_process(6772)|SUBIMAGE: >>> 0 >>> > 2014-05-29 >>> > 08:38:57|7330|109:109|new|utils.pm:rename_vcld_process(6790)|renamed >>> process to 'vcld 109:109 new labvm13>vcl01.whittier.edu >>> vmwarewin7-Windows7Base3-v2 admin' >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|DataStructure.pm:is_parent_reservation(1004)|returning >>> true: parent reservation ID for this request: 109 >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|DataStructure.pm:is_parent_reservation(1004)|returning >>> true: parent reservation ID for this request: 109 >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|Module.pm:create_os_object(308)|VCL::Module::OS::Windows::Version_6::7 >>> module loaded >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|Module.pm:new(196)|VCL::Module::OS::Windows::Version_6::7 >>> object created for image vmwarewin7-Windows7Base3-v2, address: 97ea9a0 >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|Windows.pm:initialize(214)|beginning Windows >>> module initialization >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|Windows.pm:initialize(224)|Windows module >>> initialization complete >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|Module.pm:create_os_object(314)|VCL::Module::OS::Windows::Version_6::7 >>> OS object created, address: 97ea9a0 >>> > 2014-05-29 08:38:57|7330|109:109|new|State.pm:initialize(125)|computer >>> is a VM, attempting to create VM host OS object >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(619)|computer ID >>> argument was specified, retrieving data for computer ID: 1 >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(643)|image ID >>> argument was specified: 1, DataStructure object will contain image >>> information for the production imagerevision of this image >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(674)|retrieved data >>> for imagerevision ID: 1 >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(690)|retrieved data >>> for image ID: 1 >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|Module.pm:create_vmhost_os_object(454)|attempting >>> to load VM host OS module: VCL::Module::OS::Linux::UnixLab (image: 1) >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|Module.pm:create_vmhost_os_object(460)|VM host OS >>> module loaded: VCL::Module::OS::Linux::UnixLab >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|Module.pm:new(196)|VCL::Module::OS::Linux::UnixLab >>> object created for image noimage, address: 9ead8e0 >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|Module.pm:create_vmhost_os_object(465)|VCL::Module::OS::Linux::UnixLab >>> OS object created, address: 9ead8e0 >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|Module.pm:create_provisioning_object(519)|VCL::Module::Provisioning::VMware::VMware >>> module loaded >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|Module.pm:new(200)|VCL::Module::Provisioning::VMware::VMware >>> object created for computer labvm13, address: 91447d0 >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|VMware.pm:initialize(267)|initializing >>> VCL::Module::Provisioning::VMware::VMware object >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(619)|computer ID >>> argument was specified, retrieving data for computer ID: 1 >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(643)|image ID >>> argument was specified: 1, DataStructure object will contain image >>> information for the production imagerevision of this image >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(674)|retrieved data >>> for imagerevision ID: 1 >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(690)|retrieved data >>> for image ID: 1 >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|VMware.pm:get_vmhost_datastructure(1397)|created >>> DataStructure object for VM host: vcl01 >>> > 2014-05-29 08:38:57|7330|109:109|new|VMware.pm:initialize(287)|VM >>> profile assigned to vcl01: WhittierLocal - profile >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(619)|computer ID >>> argument was specified, retrieving data for computer ID: 1 >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(643)|image ID >>> argument was specified: 1, DataStructure object will contain image >>> information for the production imagerevision of this image >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(674)|retrieved data >>> for imagerevision ID: 1 >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|DataStructure.pm:_initialize(690)|retrieved data >>> for image ID: 1 >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|VMware.pm:get_vmhost_datastructure(1397)|created >>> DataStructure object for VM host: vcl01 >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|VMware.pm:get_vmhost_api_object(1500)|attempting >>> to load VMware control module: >>> VCL::Module::Provisioning::VMware::vSphere_SDK >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|VMware.pm:get_vmhost_api_object(1506)|loaded >>> VMware control module: VCL::Module::Provisioning::VMware::vSphere_SDK >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|Module.pm:new(200)|VCL::Module::Provisioning::VMware::vSphere_SDK >>> object created for computer labvm13, address: a305240 >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|vSphere_SDK.pm:initialize(98)|vSphere SDK for >>> Perl does not appear to be installed on this managment node, unable to load >>> VMware vSphere SDK Perl modules, error: >>> > |7330|109:109|new| Can't locate VMware/VIRuntime.pm in @INC (@INC >>> contains: /usr/local/vcl/bin/../../../.. /usr/local/vcl/bin/../.. >>> /usr/local/vcl/bin/../../.. /usr/local/vcl/bin/../../../../.. >>> /usr/local/vcl/bin/.. /usr/local/vcl/bin/../lib >>> /usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi >>> /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl >>> /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi >>> /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl >>> /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/5.8.8 .) at >>> (eval 231) line 1. >>> > |7330|109:109|new| BEGIN failed--compilation aborted at (eval 231) >>> line 1. >>> > 2014-05-29 >>> 08:38:57|7330|109:109|new|VMware.pm:get_vmhost_api_object(1520)|API object >>> could not be created: VCL::Module::Provisioning::VMware::vSphere_SDK >>> > 2014-05-29 08:39:01|7166|vcld:main(167)|lastcheckin time updated for >>> management node 2: 2014-05-29 08:39:01 >>> > 2014-05-29 08:39:06|7166|vcld:main(167)|lastcheckin time updated for >>> management node 2: 2014-05-29 08:39:06 >>> > 2014-05-29 08:39:11|7166|vcld:main(167)|lastcheckin time updated for >>> management node 2: 2014-05-29 08:39:11 >>> > 2014-05-29 >>> 08:39:12|7330|109:109|new|OS.pm:is_ssh_responding(682)|vcl01 is NOT >>> responding to SSH, SSH command failed, port 22: open, port 24: closed >>> > 2014-05-29 08:39:12|7330|109:109|new|VMware.pm:initialize(307)|unable >>> to control OS of VM host vcl01 using VCL::Module::OS::Linux::UnixLab OS >>> object because VM host is not responding to SSH >>> > >>> > |7330|109:109|new| ---- WARNING ---- >>> > |7330|109:109|new| 2014-05-29 >>> 08:39:12|7330|109:109|new|Module.pm:create_provisioning_object(530)|provisioning >>> object could not be created, returning 0 >>> > |7330|109:109|new| ( 0) Module.pm, create_provisioning_object (line: >>> 530) >>> > |7330|109:109|new| (-1) State.pm, initialize (line: 136) >>> > |7330|109:109|new| (-2) Module.pm, new (line: 207) >>> > |7330|109:109|new| (-3) vcld, make_new_child (line: 567) >>> > |7330|109:109|new| (-4) vcld, main (line: 350) >>> > >>> > >>> > |7330|109:109|new| ---- WARNING ---- >>> > |7330|109:109|new| 2014-05-29 >>> 08:39:12|7330|109:109|new|State.pm:initialize(145)|failed to create >>> provisioning object >>> > |7330|109:109|new| ( 0) State.pm, initialize (line: 145) >>> > |7330|109:109|new| (-1) Module.pm, new (line: 207) >>> > |7330|109:109|new| (-2) vcld, make_new_child (line: 567) >>> > |7330|109:109|new| (-3) vcld, main (line: 350) >>> > >>> > 2014-05-29 >>> > 08:39:12|7330|109:109|new|utils.pm:delete_computerloadlog_reservation(6396)|removing >>> computerloadlog entries matching loadstate = begin >>> > 2014-05-29 >>> > 08:39:12|7330|109:109|new|utils.pm:delete_computerloadlog_reservation(6443)|deleted >>> rows from computerloadlog for reservation id=109 >>> > 2014-05-29 08:39:12|7330|109:109|new|State.pm:DESTROY(929)|VCL::new >>> process duration: 15 seconds >>> > 2014-05-29 >>> 08:39:12|7330|109:109|new|DataStructure.pm:get_computer_private_ip_address(1641)|attempting >>> to retrieve private IP address for computer: labvm13 >>> > 2014-05-29 >>> 08:39:12|7330|109:109|new|OS.pm:get_file_contents(1897)|retrieved 77 lines >>> from file on vclman02: '/etc/hosts' >>> > 2014-05-29 >>> 08:39:12|7330|109:109|new|DataStructure.pm:get_computer_private_ip_address(1697)|returning >>> IP address from /etc/hosts file: 192.168.1.13 >>> > 2014-05-29 08:39:12|7330|109:109|new|utils.pm:getnewdbh(2474)|unable >>> to use database handle stored in $ENV{dbh} >>> > 2014-05-29 08:39:12|7330|109:109|new|utils.pm:getnewdbh(2531)|database >>> handle stored in $ENV{dbh} >>> > 2014-05-29 08:39:12|7330|109:109|new|utils.pm:is_inblockrequest(5760)|zero >>> rows were returned from database select >>> > 2014-05-29 >>> 08:39:12|7330|109:109|new|DataStructure.pm:get_image_affiliation_name(2121)|image >>> owner id: 1 >>> > 2014-05-29 >>> 08:39:12|7330|109:109|new|DataStructure.pm:retrieve_user_data(1412)|attempting >>> to retrieve and store data for user: user.id = '1' >>> > 2014-05-29 >>> 08:39:12|7330|109:109|new|DataStructure.pm:retrieve_user_data(1475)|data >>> has been retrieved for user: admin (id: 1) >>> > 2014-05-29 08:39:16|7166|vcld:main(167)|lastcheckin time updated for >>> management node 2: 2014-05-29 08:39:16 >>> > 2014-05-29 08:39:22|7166|vcld:main(167)|lastcheckin time updated for >>> management node 2: 2014-05-29 08:39:21 >>> > 2014-05-29 08:39:27|7166|vcld:main(167)|lastcheckin time updated for >>> management node 2: 2014-05-29 08:39:27 >>> > 2014-05-29 08:39:32|7166|vcld:main(167)|lastcheckin time updated for >>> management node 2: 2014-05-29 08:39:32 >>> > 2014-05-29 08:39:37|7166|vcld:main(167)|lastcheckin time updated for >>> management node 2: 2014-05-29 08:39:37 >>> > 2014-05-29 08:39:42|7166|vcld:main(167)|lastcheckin time updated for >>> management node 2: 2014-05-29 08:39:42 >>> > 2014-05-29 08:39:43|7330|109:109|new|utils.pm:mail(1253)|SUCCESS -- >>> Sending mail To: [email protected], PROBLEM -- >>> vclman02|109:109|new|vcld|labvm13>vcl01.whittier.edu >>> |vmwarewin7-Windows7Base3-v2|admin >>> > >>> > |7330|109:109|new| ---- CRITICAL ---- >>> > |7330|109:109|new| 2014-05-29 >>> 08:39:12|7330|109:109|new|vcld:make_new_child(574)|VCL::new object could >>> not be created and initialized >>> > |7330|109:109|new| ( 0) vcld, make_new_child (line: 574) >>> > |7330|109:109|new| (-1) vcld, main (line: 350) >>> >>> >> >
