On Thu, Jul 28, 2011 at 11:53:50AM +0900, Fernando Luis Vázquez Cao wrote:
> On Wed, 2011-07-27 at 17:24 +0200, Andrea Arcangeli wrote:
> > making
> > sure no lib is calling any I/O function to be able to defreeze the
> > filesystems later, making sure the oom killer or a wrong kill -9
> > $RANDOM isn't killing the agent by mistake while the I/O is blocked
> > and the copy is going.
> 
> Yes with the current API if the agent is killed while the filesystems
> are frozen we are screwed.
> 
> I have just submitted patches that implement a new API that should make
> the virtualization use case more reliable. Basically, I am adding a new
> ioctl, FIGETFREEZEFD, which freezes the indicated filesystem and returns
> a file descriptor; as long as that file descriptor is held open, the
> filesystem remains frozen. If the freeze file descriptor is closed (be it
> through an explicit call to close(2) or as part of process exit
> housekeeping) the associated filesystem is automatically thawed.
> 
> - fsfreeze: add ioctl to create a fd for freeze control
>   http://marc.info/?l=linux-fsdevel&m=131175212512290&w=2
> - fsfreeze: add freeze fd ioctls
>   http://marc.info/?l=linux-fsdevel&m=131175220612341&w=2

This is probably how the API should have been implemented originally
instead of FIFREEZE/FITHAW.

It looks a bit overkill though. I would think it'd be enough to force the
fsfreeze at FIGETFREEZEFD time and make closing the file the only way to
thaw, without requiring any of FS_FREEZE_FD/FS_THAW_FD/FS_ISFROZEN_FD.
But I guess you have use cases for those if you implemented them, maybe
to check whether root is stepping on its own toes by testing if the fs
is already frozen before freezing it and returning failure if it is;
running an ioctl instead of opening and closing the file isn't
necessarily better. At the very least the get_user(should_freeze, argp)
doesn't seem necessary; it just complicates the ioctl API a bit without
much gain. I think it'd be cleaner if FS_FREEZE_FD were the only way to
freeze then.

It's certainly a nice reliability improvement and safer API.

Now, if you add a file descriptor that userland can open and talk to via
epoll/poll, to learn when an fsfreeze has been requested on a certain
fs, a userland fsfreeze agent (not necessarily virt related) could open
it and run its scripts when that filesystem is about to be frozen,
before the kernel calls freeze_super().
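The userland half of that agent could look like the sketch below. The freeze-notification fd and the one-byte event protocol are purely hypothetical, made up here for illustration; no such interface exists:

```c
/*
 * Sketch of a userland fsfreeze agent for the notification idea above.
 * The notification fd and one-byte event protocol are hypothetical
 * assumptions; the agent would obtain notify_fd from wherever the
 * kernel exposed it, block in poll(), run its pre-freeze scripts,
 * and only then let the freeze proceed.
 */
#include <poll.h>
#include <unistd.h>

/* Wait up to timeout_ms for one freeze request; 0 = got one, -1 = error/timeout. */
int wait_freeze_request(int notify_fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = notify_fd, .events = POLLIN };
	char event;

	if (poll(&pfd, 1, timeout_ms) <= 0)
		return -1;
	if (read(notify_fd, &event, 1) != 1)
		return -1;
	/*
	 * ...run the pre-freeze scripts here, then acknowledge so the
	 * kernel goes ahead and calls freeze_super()...
	 */
	return 0;
}
```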

Then a PARAVIRT_FSFREEZE=y/m driver could just invoke the fsfreeze
without any dependency on a virt-specific guest agent.

Maybe Christoph is right that filesystems in userland may make things
more complicated (I'm not sure how the storage is related; as far as I
can see it's all about filesystems and apps, and it's all blkdev
agnostic), but those usually have a kernel backend too (like fuse). I
may not see the full picture of filesystems in userland or of how the
storage agent in guest userland relates to this.

If you believe that having libvirt talk QMP/QAPI over a virtio-serial
vmchannel to some virt-specific guest userland agent, bypassing qemu
entirely, is better, that's ok with me, but there should be a strong
reason for it: the paravirt_fsfreeze.ko approach, with a small qemu
backend and a qemu monitor command that starts paravirt-fsfreeze in the
guest before going ahead and blocking all I/O (to provide backwards
compatibility and reliable snapshots for guest OSes that don't have the
paravirt fsfreeze), looks more reliable, more compact and simpler to use
to me. I'll surely be ok either way though.

Thanks,
Andrea
