Hello Michael,

On Wed, Jul 27, 2011 at 11:07:13AM -0500, Michael Roth wrote:
> One thing worth mentioning is that the current host-side interface to
> the guest agent is not what we're hoping to build libvirt interfaces
> around. It's a standalone, out-of-band tool for now, but when QMP is
> converted to QAPI the guest agent interfaces will be exposed to the host
> transparently to the host as normal QMP commands. libvirt should be able
> to tell the difference from a guest-agent induced fsfreeze or a guest
> kernel induced fsfreeze (except perhaps to identify extended
> capabilities in a particular case):
>
> http://wiki.qemu.org/Features/QAPI/GuestAgent

Sounds good.

> Another thing to note is that snapshotting is not necessarily something
> that should be completely transparent to the guest. One of the planned
> future features for the guest agent (mentioned in the snapshot wiki, and
> a common use case that I've seen come up elsewhere as well in the
> context of database applications), is a way for userspace applications
> to register callbacks to be made in the event of a freeze (dumping
> application-managed caches to disk and things along that line). The

I'm not sure whether the scripts are really needed, or whether they
would just connect to a brand new fsfreeze-specific unix domain socket
(created by the database) to tell the database to freeze.

If the latter is the case, then rather than changing the database to
open a unix domain socket that the script connects to when invoked (or
adding some new function to the protocol of an already open unix domain
socket), it'd be better to change the database to open a
/dev/virtio-fsfreeze device, created through udev by the
virtio-fsfreeze.ko virtio driver. The database would poll it, read the
request to freeze, and write into it once it has finished freezing.
When all openers of the device have frozen, virtio-fsfreeze.ko would go
ahead and freeze all the filesystems, and then tell qemu when it has
finished. Then qemu can finally block all the I/O and tell libvirt to
go ahead with the snapshot.

If the script hangs (userland agent in the guest approach), or if the
database hangs while keeping the /dev/virtio-fsfreeze device open
(virtio-fsfreeze.ko approach), that would hang the whole fsfreeze
operation in the virtio-fsfreeze.ko driver. Otherwise a timeout would
be required. But the general idea is that the more stuff has to be
frozen (especially when userland is involved and not just guest kernel
code as with virtio-fsfreeze.ko), the higher the risk of a hang (or
alternatively of a false positive timeout, if there is a timeout).

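To make the bookkeeping above concrete, here is a minimal sketch in Python of the state a driver like virtio-fsfreeze.ko would have to track. It is only a userland model under assumptions: no such driver exists, and the class name, method names, and state strings are all hypothetical. It shows the two properties discussed above: an opener that closes the device (for example because it was killed) stops blocking the freeze, and an opener that hangs can only be handled with a timeout.

```python
import time


class FreezeCoordinator:
    """Toy model of a hypothetical virtio-fsfreeze driver's bookkeeping."""

    def __init__(self, timeout=5.0, clock=time.monotonic):
        self.timeout = timeout    # seconds to wait for all openers
        self.clock = clock        # injectable so the model can be tested
        self.openers = set()      # apps holding the device open
        self.pending = set()      # openers that have not acknowledged yet
        self.deadline = None      # None means no freeze was requested

    def open(self, name):
        self.openers.add(name)

    def close(self, name):
        # A killed application closes its fd, so it no longer blocks us.
        self.openers.discard(name)
        self.pending.discard(name)

    def request_freeze(self):
        # Every current opener must acknowledge before filesystems freeze.
        self.pending = set(self.openers)
        self.deadline = self.clock() + self.timeout

    def ack(self, name):
        # The application wrote "done freezing" into the device.
        self.pending.discard(name)

    def state(self):
        if self.deadline is None:
            return "idle"
        if not self.pending:
            return "ready"        # safe to freeze the filesystems now
        if self.clock() >= self.deadline:
            return "timeout"      # possibly a false positive
        return "waiting"
```

With no openers at all, request_freeze() goes straight to "ready", which matches the case of a guest where no application cares about the freeze.
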
If scripts are needed, then the agent that starts the scripts with
execve could also open /dev/virtio-fsfreeze, instead of being invoked
through the communication with libvirt over QMP/QAPI etc. The advantage,
at least, is that if the database is killed, closing the file will not
lead to a hang or a failure of the fsfreeze. If the agent is killed,
things would go bad instead (either a hang or a timeout).

Maybe it's more a matter of taste, and maybe my taste makes me prefer a
virtio-fsfreeze.ko that can later register a /dev/virtio-fsfreeze device
that any app can open. The permissions on the device will also define
which apps may cause a false positive timeout of the snapshot, or a
hang.

> implementation of this would likely be a directory where application can
> place scripts in that get called in the event of a freeze, something
> that would require a user-space daemon anyway.
>
> Also, in terms of supporting older guests, the proposed guest tools ISO
> (akin to virtualbox/vmware guest tools):
>
> http://lists.gnu.org/archive/html/qemu-devel/2011-06/msg02239.html
>
> would give us a distribution channel that doesn't require any
> involvement from distro maintainers. A distro-package to boot strap the
> agent would be still be preferable, but the ISO approach seems to work
> well in practice. And for managed environments getting custom packages
> installed generally isn't as much of a problem as requiring reboots or
> kernel changes.

Nice to see it works for more hypervisors.

I think it boils down to whether an agent is needed for fsfreeze or not.
I think it's not, but I also tend to agree it can work with the agent.
As a developer I have little doubt that a virtio driver with no userland
change would be much simpler for me to use, but I may be biased. I just
don't see many cons to the kernel solution, except perhaps that changing
the fsfreeze code requires respinning a kernel update.
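
For comparison, the script-directory approach from the quoted text could look roughly like the sketch below. It is not the guest agent's implementation: the function name, the "freeze" argument passed to each script, and the result strings are assumptions made for illustration. It runs every executable in a directory with a per-script timeout, which is where the false positive timeout risk mentioned earlier comes from.

```python
import os
import subprocess


def run_freeze_hooks(hook_dir, timeout=10.0):
    """Run each executable in hook_dir, reporting one status per script."""
    results = {}
    for name in sorted(os.listdir(hook_dir)):
        path = os.path.join(hook_dir, name)
        # Skip anything that is not an executable regular file.
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            continue
        try:
            # "freeze" is a hypothetical argument telling the hook what to do.
            proc = subprocess.run([path, "freeze"], timeout=timeout)
            results[name] = "ok" if proc.returncode == 0 else "failed"
        except subprocess.TimeoutExpired:
            # The hook was killed. It may have been hung, or merely slow.
            results[name] = "timeout"
        except OSError:
            results[name] = "failed"
    return results
```

A caller would then have to decide whether a "timeout" or "failed" entry aborts the snapshot or is only logged, which is exactly the policy question a userland hook mechanism introduces.
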