>
> Probably cleaner to inject a smf service before vmadmd that on shutdown
> will loop all your kvm zones and set autoboot to false.
>
> Then insert a 2nd one after zones service is up that starts all your kvm
> zones one by one with a delay between them.
>
>

KVM VMs are always set with autoboot=false in their zone configuration.
Instead, a zonecfg attr, vm-autoboot, determines whether or not to bring
them up when the node restarts. The zones service therefore does not boot
them but leaves them to vmadmd. When vmadmd starts, it runs this:

https://github.com/joyent/smartos-live/blob/release-20151112/src/vm/sbin/vmadmd.js#L2206

which loops through all VMs. At line 2258 you'll see that for KVM VMs we
call the loadVM function. That function then checks the autoboot property
(a virtual property that comes from the zone's autoboot for non-KVM VMs
and from vm-autoboot for KVM VMs) and boots the zone if it should be
running.
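For illustration only, that virtual-property lookup might look something
like this (the object shape and function name here are hypothetical, not
vmadmd's actual internals):

```javascript
// Hypothetical sketch of the virtual autoboot property described above:
// for KVM zones the effective value comes from the vm-autoboot zonecfg
// attr; for everything else it's the zone's own autoboot flag.
// The field names are illustrative, not vmadmd's real structure.
function effectiveAutoboot(zone) {
    if (zone.brand === 'kvm') {
        return zone.attrs['vm-autoboot'] === 'true';
    }
    return zone.autoboot === true;
}

// A KVM zone always has autoboot=false in zonecfg, but can still come
// up at node boot via vm-autoboot:
console.log(effectiveAutoboot({
    brand: 'kvm',
    autoboot: false,
    attrs: { 'vm-autoboot': 'true' }
}));
```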

What you'll notice here is that the looping line (linked above) uses
async.forEachSeries. This means we already *are* booting these zones
sequentially, but we are not waiting for each VM to be fully booted before
moving on to the next one: no callback is being called from loadVM, so
currently we're just firing all of these off to complete asynchronously.
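To make that concrete, here's a stripped-down sketch of the pattern
(forEachSeries is reimplemented minimally so the example is
self-contained; the real loop uses the async module's version). Because
the iterator calls its callback before the boot completes, the "series"
loop finishes while every boot is still in flight:

```javascript
// Minimal stand-in for async.forEachSeries.
function forEachSeries(items, iterator, done) {
    var i = 0;
    function next() {
        if (i >= items.length) {
            done();
            return;
        }
        iterator(items[i++], next);
    }
    next();
}

var log = [];

// Simulated loadVM: kicks off an async "boot" but calls cb immediately,
// mirroring vmadmd's current behavior where nothing waits on VM.start.
function loadVM(vm, cb) {
    log.push('start ' + vm);
    setImmediate(function () {      // the boot completes later
        log.push('booted ' + vm);
    });
    cb();                           // "done" before the boot finished
}

forEachSeries(['vm0', 'vm1', 'vm2'], loadVM, function () {
    log.push('loop done');
});
// At this point every VM has been *started* and the loop is already
// "done", but none of the boots have completed yet.
```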

So if, as is being claimed, all of the KVM VMs booting simultaneously is
causing problems, here's how I'd probably fix it if it were me doing the
work:

 * first, add a callback to loadVM that gets called when we're finished
with a zone (in the case under discussion, when VM.start is complete)
 * resolve https://github.com/davepacheco/node-vasync/issues/27
 * update the vasync in /usr/vm/node_modules in smartos-live to include
this change
 * change async.forEachSeries to vasync.forEachParallel in vmadmd with an
optional configured limit on concurrency (probably in the node's config)
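As a sketch of where that plan ends up (my illustrative code, not the
actual vasync change; a limit-taking parallel forEach is what issue 27
asks for):

```javascript
// Hypothetical concurrency-limited parallel forEach; names illustrative.
function forEachParallelLimited(items, limit, worker, done) {
    var next = 0;
    var pending = 0;
    var finished = 0;

    function dispatch() {
        while (pending < limit && next < items.length) {
            pending++;
            worker(items[next++], function () {
                pending--;
                finished++;
                if (finished === items.length) {
                    done();
                } else {
                    dispatch();     // a slot opened up; start another
                }
            });
        }
    }
    dispatch();
}

var order = [];

// loadVM with the proposed callback: cb fires only once the (simulated)
// VM.start has completed. The simulated boot is synchronous here so the
// trace is deterministic; with a real async VM.start, up to `limit`
// boots would overlap.
function loadVM(vm, cb) {
    order.push('start ' + vm);
    order.push('booted ' + vm);     // stand-in for the async VM.start
    cb();
}

forEachParallelLimited(['vm0', 'vm1', 'vm2', 'vm3'], 2, loadVM, function () {
    order.push('all done');
});
```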

If you did this, you may find that it's still a problem, since VM.start()
only waits for the virtual hardware to be ready and can't know when a
guest's kernel has completed its startup. So you may also need a
configurable delay on top of this: call the loadVM callback some number of
(milli)seconds after VM.start() calls its own callback. This would give
each guest that much time to start before the next VM is started.
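That settle time could be a thin wrapper, something like the following
(again hypothetical names; startVM here stands in for the real VM.start):

```javascript
// Hypothetical wrapper adding a configurable settle delay: the loadVM
// callback fires only settleMs milliseconds after VM.start's own
// callback, giving the guest kernel a head start before the next VM.
function loadVMWithDelay(vm, settleMs, cb) {
    startVM(vm, function (err) {    // startVM stands in for VM.start
        if (err) {
            cb(err);
            return;
        }
        setTimeout(cb, settleMs);   // wait before reporting "done"
    });
}

var events = [];

// Toy startVM: the "hardware" is ready immediately.
function startVM(vm, cb) {
    events.push('started ' + vm);
    cb(null);
}

loadVMWithDelay('vm0', 50, function () {
    events.push('callback fired');
});
// Immediately after the call, the VM has started but the callback is
// still pending behind the 50 ms timer.
```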

Hope that helps.

Josh



-------------------------------------------
smartos-discuss
Archives: https://www.listbox.com/member/archive/184463/=now
RSS Feed: https://www.listbox.com/member/archive/rss/184463/25769125-55cfbc00