On Mon, Aug 13, 2018 at 08:00:19PM +0100, Dr. David Alan Gilbert wrote:
> cc'ing in Mike*2
> * Denis Plotnikov (dplotni...@virtuozzo.com) wrote:
> >
> >
> > On 26.07.2018 12:23, Peter Xu wrote:
> > > On Thu, Jul 26, 2018 at 10:51:33AM +0200, Paolo Bonzini wrote:
> > > > On 25/07/2018 22:04, Andrea Arcangeli wrote:
> > > > >
> > > > > It may look like the uffd-wp model is wish-feature similar to an
> > > > > optimization, but without the uffd-wp model when the WP fault is
> > > > > triggered by kernel code, the sigsegv model falls apart and requires
> > > > > all kind of ad-hoc changes just for this single feature. Plus uffd-wp
> > > > > has other benefits: it makes it all reliable in terms of not
> > > > > increasing the number of vmas in use during the snapshot. Finally it
> > > > > makes it faster too with no mmap_sem for reading and no sigsegv
> > > > > signals.
> > > > >
> > > > > The non cooperative features got merged first because there was much
> > > > > activity on the kernel side on that front, but this is just an ideal
> > > > > time to nail down the remaining issues in uffd-wp I think. That I
> > > > > believe is time better spent than trying to emulate it with sigsegv
> > > > > and changing all drivers to send new events down to qemu specific to
> > > > > the sigsegv handling. We considered this before doing uffd for
> > > > > postcopy too but overall it's unreliable and more work (no single
> > > > > change was then needed to KVM code with uffd to handle postcopy and
> > > > > here it should be the same).
> > > >
> > > > I totally agree. The hard part in userfaultfd was the changes to the
> > > > kernel get_user_pages API, but the payback was huge because _all_ kernel
> > > > uses (KVM, vhost-net, syscalls, etc.) just work with userfaultfd. Going
> > > > back to mprotect would be a huge mistake.
> > >
> > > Thanks for explaining the bits.
> > > I'd say I wasn't aware of the
> > > difference before I started the investigation (and only until now I
> > > noticed that major difference between mprotect and userfaultfd). I'm
> > > really glad that it's much clear (at least for me) on which way we
> > > should choose.
> > >
> > > Now I'm thinking whether we can move the userfault write protect work
> > > forward. The latest discussion I saw so far is in 2016, when someone
> > > from Huawei tried to use the write protect feature for that old
> > > version of live snapshot but reported issue:
> > >
> > > https://lists.gnu.org/archive/html/qemu-devel/2016-12/msg01127.html
> > >
> > > Is that the latest status for userfaultfd wr-protect?
> > >
> > > If so, I'm thinking whether I can try to re-verify the work (I tried
> > > his QEMU repository but I failed to compile somehow, so I plan to
> > > write some even simpler code to try) to see whether I can get the same
> > > KVM error he encountered.
> > >
> > > Thoughts?
> >
> > Just to sum up all being said before.
> >
> > Using mprotect is a bad idea because VM's memory can be accessed from the
> > number of places (KVM, vhost, ...) which need their own special care
> > of tracking memory accesses and notifying QEMU which makes the mprotect
> > using unacceptable.
> >
> > Protected memory accesses tracking can be done via userfaultfd's WP mode
> > which isn't available right now.
> >
> > So, the reasonable conclusion is to wait until the WP mode is available and
> > build the background snapshot on top of userfaultfd-wp.
> > But, works on adding the WP-mode is pending for a quite a long time already.
> >
> > Is there any way to estimate when it could be available?
>
> I think a question is whether anyone is actively working on it; I
> suspect really it's on a TODO list rather than moving at the moment.
I thought Andrea was working on it :)
> What I don't really understand is what stage the last version got upto.
>
> Dave
>
> > >
> > > Regards,
> > >
> >
> > --
> > Best,
> > Denis
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
>

--
Sincerely yours,
Mike.