Good afternoon, devs --

I've been seeing slower-than-expected reservation provisioning times on a VCL infrastructure that uses a SAN for all storage (all ESXi, on blades). I first noticed it when I'd click the "Connect!" button on a reservation and the RDP connection wouldn't open the first time. When I restarted the RDP client 15-30 seconds later, the connection would succeed.

Watching vcld.log, I found that connecting to the Cygwin shell from the management node was taking 6-10 seconds (whereas the same connections on servers using local-disk storage take 1-2 seconds). I can replicate the behavior by running ssh -i /etc/vcl/vcl.key <target machine> from the management node.

It really hit home when I started a bash shell LOCALLY (with bash --login -i -x) on a target Windows VM and watched how long it took just to get to a bash prompt. Each of the startup scripts took a long time. (I'm not running bash-completion, a common complaint about slow Cygwin shell startups.)

I *think* -- requesting confirmation of this -- that each time the management node wants to issue a command to a remote computer, it initiates a new SSH connection, then closes that connection when the command finishes processing. Is that accurate? If so, those 6-10 seconds would be compounded several times over while the management node prepares the remote computer for my reservation.

I'm currently investigating moving Cygwin into a RAMdisk on the VM images, but that only makes sense if the above assumption about multiple SSH sessions is accurate.

The latency on the SAN connection is very low, and ESXi reports that latencies on the virtual disks are slow. I have /etc/hosts set up, DNS resolves fine, and pings between the management node and VMs are fine.

Has anyone else run into any similar behavior with Cygwin?

Many thanks,
Mike

--
*Mike Haudenschild*
Education Systems Manager
Longsight Group
(740) 599-5005 x809
[email protected]
www.longsight.com
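
P.S. For anyone who wants to compare numbers, here is a minimal sketch of how the per-connection cost could be measured from the management node. It opens a fresh SSH connection for each run, which mirrors my unconfirmed assumption about how the management node behaves. The helper names (time_command, ssh_command) and the remote command "true" are illustrative choices of mine, not anything taken from VCL. Note that a non-interactive remote command may skip some of the login startup scripts, so the numbers may come out lower than what bash --login shows.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Run cmd `runs` times, each as a fresh process; return wall-clock seconds per run."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        durations.append(time.perf_counter() - start)
    return durations


def ssh_command(host, key="/etc/vcl/vcl.key", remote_cmd="true"):
    """Build the ssh argument list (hypothetical helper, not part of VCL).

    BatchMode=yes makes ssh fail instead of prompting for a password,
    so a key problem shows up as a fast failure rather than a hang.
    """
    return ["ssh", "-i", key, "-o", "BatchMode=yes", host, remote_cmd]


if __name__ == "__main__" and len(sys.argv) > 1:
    host = sys.argv[1]  # target machine, passed on the command line
    times = time_command(ssh_command(host))
    print("per-connection seconds:", [round(t, 2) for t in times])
    print("median:", round(statistics.median(times), 2))
    print("total for", len(times), "connections:", round(sum(times), 2))
```

Saved under any file name and run with the target machine as its only argument, this prints the time for each connection plus the median and total. If the total grows roughly linearly with the number of runs, that would support the idea that the startup delay is paid once per command.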
