cc'ing in Mike*2

* Denis Plotnikov (dplotni...@virtuozzo.com) wrote:
>
>
> On 26.07.2018 12:23, Peter Xu wrote:
> > On Thu, Jul 26, 2018 at 10:51:33AM +0200, Paolo Bonzini wrote:
> > > On 25/07/2018 22:04, Andrea Arcangeli wrote:
> > > >
> > > > It may look like the uffd-wp model is wish-feature similar to an
> > > > optimization, but without the uffd-wp model when the WP fault is
> > > > triggered by kernel code, the sigsegv model falls apart and requires
> > > > all kind of ad-hoc changes just for this single feature. Plus uffd-wp
> > > > has other benefits: it makes it all reliable in terms of not
> > > > increasing the number of vmas in use during the snapshot. Finally it
> > > > makes it faster too with no mmap_sem for reading and no sigsegv
> > > > signals.
> > > >
> > > > The non cooperative features got merged first because there was much
> > > > activity on the kernel side on that front, but this is just an ideal
> > > > time to nail down the remaining issues in uffd-wp I think. That I
> > > > believe is time better spent than trying to emulate it with sigsegv
> > > > and changing all drivers to send new events down to qemu specific to
> > > > the sigsegv handling. We considered this before doing uffd for
> > > > postcopy too but overall it's unreliable and more work (no single
> > > > change was then needed to KVM code with uffd to handle postcopy and
> > > > here it should be the same).
> > >
> > > I totally agree. The hard part in userfaultfd was the changes to the
> > > kernel get_user_pages API, but the payback was huge because _all_ kernel
> > > uses (KVM, vhost-net, syscalls, etc.) just work with userfaultfd. Going
> > > back to mprotect would be a huge mistake.
> >
> > Thanks for explaining the bits. I'd say I wasn't aware of the
> > difference before I started the investigation (and only until now I
> > noticed that major difference between mprotect and userfaultfd). I'm
> > really glad that it's much clear (at least for me) on which way we
> > should choose.
> >
> > Now I'm thinking whether we can move the userfault write protect work
> > forward. The latest discussion I saw so far is in 2016, when someone
> > from Huawei tried to use the write protect feature for that old
> > version of live snapshot but reported issue:
> >
> > https://lists.gnu.org/archive/html/qemu-devel/2016-12/msg01127.html
> >
> > Is that the latest status for userfaultfd wr-protect?
> >
> > If so, I'm thinking whether I can try to re-verify the work (I tried
> > his QEMU repository but I failed to compile somehow, so I plan to
> > write some even simpler code to try) to see whether I can get the same
> > KVM error he encountered.
> >
> > Thoughts?
>
> Just to sum up all being said before.
>
> Using mprotect is a bad idea because VM's memory can be accessed from the
> number of places (KVM, vhost, ...) which need their own special care
> of tracking memory accesses and notifying QEMU which makes the mprotect
> using unacceptable.
>
> Protected memory accesses tracking can be done via userfaultfd's WP mode
> which isn't available right now.
>
> So, the reasonable conclusion is to wait until the WP mode is available and
> build the background snapshot on top of userfaultfd-wp.
> But, works on adding the WP-mode is pending for a quite a long time already.
>
> Is there any way to estimate when it could be available?
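
To make the mprotect objection quoted above concrete, here is a minimal sketch of what happens when the kernel, rather than user code, writes into a write-protected page. It is not code from this thread or from QEMU. It assumes Linux, uses Python's ctypes only to keep the example short (the same libc calls apply from C), and `kernel_write_to_protected_page` is a hypothetical name chosen for illustration. The point is that read(2) into the protected page is expected to fail with EFAULT rather than raise SIGSEGV, so a SIGSEGV-based tracker would never be told about the write.

```python
import ctypes
import errno
import mmap
import os

# Load libc symbols from the running process (assumption: Linux, 64-bit).
libc = ctypes.CDLL(None, use_errno=True)
libc.mmap.restype = ctypes.c_void_p
libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                      ctypes.c_int, ctypes.c_int, ctypes.c_long]
libc.mprotect.restype = ctypes.c_int
libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
libc.munmap.restype = ctypes.c_int
libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
libc.read.restype = ctypes.c_ssize_t
libc.read.argtypes = [ctypes.c_int, ctypes.c_void_p, ctypes.c_size_t]

MAP_FAILED = ctypes.c_void_p(-1).value
RW = mmap.PROT_READ | mmap.PROT_WRITE


def kernel_write_to_protected_page():
    """Hypothetical helper: let the kernel write into a read-only page.

    Returns (protected_result, protected_errno, unprotected_result):
    the read(2) result and errno while the page is write-protected,
    and the read(2) result after the protection is lifted.
    """
    size = mmap.PAGESIZE
    addr = libc.mmap(None, size, RW,
                     mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS, -1, 0)
    if addr is None or addr == MAP_FAILED:
        raise OSError(ctypes.get_errno(), "mmap failed")
    rfd, wfd = os.pipe()
    try:
        # Stand-in for snapshot write protection: make the page read-only.
        if libc.mprotect(addr, size, mmap.PROT_READ) != 0:
            raise OSError(ctypes.get_errno(), "mprotect failed")
        os.write(wfd, b"A" * 16)
        # The kernel copies pipe data into the page on our behalf.
        # No user-space SIGSEGV handler is involved in this path.
        ctypes.set_errno(0)
        protected = libc.read(rfd, addr, 16)
        protected_errno = ctypes.get_errno()
        # Lift the protection and let the kernel write again.
        if libc.mprotect(addr, size, RW) != 0:
            raise OSError(ctypes.get_errno(), "mprotect failed")
        os.write(wfd, b"B" * 16)
        unprotected = libc.read(rfd, addr, 16)
        return protected, protected_errno, unprotected
    finally:
        os.close(rfd)
        os.close(wfd)
        libc.munmap(addr, size)


if __name__ == "__main__":
    res, err, ok = kernel_write_to_protected_page()
    print("protected read:", res, errno.errorcode.get(err, err))
    print("unprotected read:", ok)
```

This only covers a plain syscall. Other in-kernel writers (KVM, vhost) have their own failure paths, which is why each would need separate handling under the mprotect approach.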

I think a question is whether anyone is actively working on it; I suspect
it's really on a TODO list rather than moving at the moment.

What I don't really understand is what stage the last version got up to.

Dave

> > Regards,
> >
> > --
> Best,
> Denis
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK