http://linux.derkeiler.com/Mailing-Lists/Kernel/2008-01/msg12802.html
Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges
On Wed, Jan 30, 2008 at 10:11:24AM -0600, Robin Holt wrote:

Robin, if you don't mind, could you please post or upload somewhere

Yes, my last patch was SMP safe, stable and feature complete for KVM. I tested it for one week on my SMP workstation with a real desktop load and everything loaded, with a 3G non-Linux guest running on 2G of RAM.

Now I adapted the KVM side to Christoph's V2/V3 and, for whatever reason, it hangs the moment it hits swap. However, in the meantime I changed test hardware, upgraded the host to 2.6.24-hg, and upgraded the kvm kernel and userland. All patches applied cleanly (with a minor nit in a .h include in V2 on top of current git). Swapping of regular tasks on the test system is 100% solid, or I wouldn't even be wasting time mentioning this. By code inspection I didn't expect a stability regression, or I wouldn't have changed all variables at the same time (taking the opportunity to move everything to bleeding edge while moving to V2 turned out to be a bad idea).

I already audited the mmu notifiers a few times; in fact I already went back to calling invalidate_page and age_page inside ptep_clear_flush/young, in case the page-pin wasn't enough to prevent the page from changing under the sptes, as I thought yesterday.

Christoph's V3 notably still misses the needed range flushes in mremap, for example, but that's not my problem. (Jack instead will certainly crash the kernel due to the missing invalidate_page after ptep_clear_flush in mremap; that invalidate_page wasn't missing with my last patch.)

I'm now going to run the same binaries that are still stable on my workstation on the test system too, to rule out timing and hardware differences.

implement Christoph's and my changes in a safe fashion.

Andrea, I agree

I think for KVM basic swapping both V2 and V3 should be safe. V2 had race conditions that would later break KSM, yes; I fixed it and V3 should already be OK, and I'm not testing KSM.
This is all thanks to the pin of the page in get_user_page that KVM does for every page mapped in any spte.

The three issues we need to simultaneously solve is revoking the remote

Agreed.

Could we consider doing a range-based recall and lock callout before

invalidate_page/age_page can return inside ptep_clear_flush/young, and Jack will need that too. In fact Jack will also need an invalidate_page inside ptep_get_and_clear. And the range callout will always be done in a sleeping context, and it'll rely on the page-pin to be safe (when details->i_mmap_lock != NULL, invalidate_range shouldn't be called inside zap_page_range but before returning from unmap_mapping_range_vma, before cond_resched). This will make everything a bit simpler and less prone to breakage IMHO, plus it'll have a chance to work for Jack w/o page-pin without additional cluttering of mm/*.c.
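
To illustrate why putting invalidate_page inside ptep_clear_flush is less prone to breakage, here is a minimal userspace sketch. It is a toy model only, not kernel code: struct toy_mm, toy_invalidate_page, toy_ptep_clear_flush and toy_move_pte are hypothetical names made up for this example, and the "page tables" are plain arrays. The point it tries to show is that when the notifier call lives in the clear-and-flush helper, an mremap-like path cannot forget it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model; none of these types or functions are the kernel's. */
#define TOY_NPAGES 4

struct toy_mm {
    unsigned long pte[TOY_NPAGES];   /* primary page table, 0 = not present */
    unsigned long spte[TOY_NPAGES];  /* secondary mappings (e.g. shadow ptes) */
    int invalidate_calls;            /* how many times the notifier ran */
};

/* Hypothetical notifier callback: drop the secondary mapping of one page. */
static void toy_invalidate_page(struct toy_mm *mm, size_t idx)
{
    assert(idx < TOY_NPAGES);
    mm->spte[idx] = 0;
    mm->invalidate_calls++;
}

/*
 * Clear the primary pte and notify in the same helper, so that no caller
 * can clear a pte and leave a stale secondary mapping behind.
 */
static unsigned long toy_ptep_clear_flush(struct toy_mm *mm, size_t idx)
{
    unsigned long old;

    assert(idx < TOY_NPAGES);
    old = mm->pte[idx];
    mm->pte[idx] = 0;
    /* a real implementation would flush the TLB for this address here */
    toy_invalidate_page(mm, idx);
    return old;
}

/* mremap-like move of one pte; it gets the notifier call for free. */
static void toy_move_pte(struct toy_mm *mm, size_t from, size_t to)
{
    unsigned long old = toy_ptep_clear_flush(mm, from);

    assert(to < TOY_NPAGES);
    mm->pte[to] = old;
}
```

Moving the pte at index 0 to index 2 with toy_move_pte leaves spte[0] cleared and the unrelated spte[1] untouched, with exactly one notifier call, even though toy_move_pte itself never mentions the notifier.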