Excerpts from Jay Pipes's message of 2014-08-22 11:16:05 -0700:
> On 08/22/2014 01:48 PM, Clint Byrum wrote:
> > It has been brought to my attention that Ironic uses the biggest hammer
> > in the IPMI toolbox to control chassis power:
> >
> > https://git.openstack.org/cgit/openstack/ironic/tree/ironic/drivers/modules/ipminative.py#n142
> >
> > Which is
> >
> >     ret = ipmicmd.set_power('off', wait)
> >
> > This is the most abrupt form, where the system power should be flipped
> > off at a hardware level. The "short press" on the power button would be
> > 'shutdown' instead of 'off'.
> >
> > I also understand that this has been brought up before, and that the
> > answer given was "SSH in and shut it down yourself." I can respect that
> > position, but I have run into a bit of a pickle using it. Observe:
> >
> > - ssh box.ip "poweroff"
> > - poll ironic until power state is off.
> >   - This is a race. Ironic is asserting the power. As soon as it sees
> >     that the power is off, it will turn it back on.
> >
> > - ssh box.ip "halt"
> >   - NO way to know that this has worked. Once SSH is off and the network
> >     stack is gone, I cannot actually verify that the disks were
> >     unmounted properly, which is the primary area of concern that I
> >     have.
> >
> > This is particularly important if I'm issuing a rebuild + preserve
> > ephemeral, as it is likely I will have lots of I/O going on, and I want
> > to make sure that it is all quiesced before I reboot to replace the
> > software and reboot.
> >
> > Perhaps I missed something. If so, please do educate me on how I can
> > achieve this without hacking around it. Currently my workaround is to
> > manually unmount the state partition, which is something system shutdown
> > is supposed to do and may become problematic if system processes are
> > holding it open.
> >
> > It seems to me that Ironic should at least try to use the graceful
> > shutdown. There can be a timeout, but it would need to be something a user
> > can disable so if graceful never works we never just dump the power on the
> > box. Even a journaled filesystem will take quite a bit to do a full fsck.
> >
> > The inability to gracefully shutdown in a reasonable amount of time
> > is an error state really, and I need to go to the box and inspect it,
> > which is precisely the reason we have ERROR states.
>
> What about placing a runlevel script in /etc/init.d/ and symlinking it
> to run on shutdown -- i.e. /etc/rc0.d/? You could run fsync or unmount
> the state partition in that script which would ensure disk state was
> quiesced, no?

That's already what OSes do in their rc0.d. My point is that I have no
way to know that process happened unless the box turns itself off after
it succeeds.
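
To make the proposal quoted above concrete, here is a rough sketch of the
behaviour being asked for: request a soft power off, poll until the chassis
reports off, and on timeout either raise an error or (only if explicitly
allowed) fall back to the hard off. This is not Ironic's code. The function
name, the exception, and the three callables are hypothetical; with the
call quoted above, the soft and hard requests would presumably map to
set_power('shutdown') and set_power('off') respectively.

```python
import time
from typing import Callable


class SoftPowerOffTimeout(Exception):
    """The node did not power itself off within the timeout (hypothetical)."""


def graceful_power_off(
    request_soft_off: Callable[[], None],   # hypothetical: e.g. the 'shutdown' request
    request_hard_off: Callable[[], None],   # hypothetical: e.g. the 'off' request
    get_power_state: Callable[[], str],     # hypothetical: returns "on" or "off"
    timeout: float = 300.0,
    poll_interval: float = 5.0,
    hard_off_on_timeout: bool = False,      # user-controllable; off by default
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Try a graceful shutdown first; return "soft" or "hard".

    Raises SoftPowerOffTimeout if the node is still on after `timeout`
    seconds and the hard fallback is disabled, so the caller can put the
    node into an error state instead of cutting power mid-write.
    """
    request_soft_off()
    deadline = clock() + timeout
    while True:
        if get_power_state() == "off":
            # The OS finished its shutdown sequence and turned itself off.
            return "soft"
        if clock() >= deadline:
            break
        sleep(poll_interval)

    if hard_off_on_timeout:
        request_hard_off()
        return "hard"
    raise SoftPowerOffTimeout(
        "node still powered on after %.0f seconds" % timeout
    )
```

Keeping hard_off_on_timeout false by default reflects the point made in the
thread: a shutdown that never completes is treated as an error to inspect,
not as a reason to drop power on a box with I/O in flight.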