Re: [RFC PATCH V2 5/5] vhost: access vq metadata through kernel virtual address

2019-03-08 Thread Andrea Arcangeli
Hello Jason,

On Fri, Mar 08, 2019 at 04:50:36PM +0800, Jason Wang wrote:
> Just to make sure I understand here. For boosting through huge TLB, do 
> you mean we can do that in the future (e.g by mapping more userspace 
> pages to kernel) or it can be done by this series (only about three 4K 
> pages were vmapped per virtqueue)?

When I answered about the advantages of mmu notifier and I mentioned
guaranteed 2m/gigapages where available, I overlooked the detail you
were using vmap instead of kmap. So with vmap you're actually doing
the opposite: it slows down the access because it will always use 4k
TLB entries even if QEMU runs on THP or gigapages hugetlbfs.

If there's just one page (or a few pages) in each vmap there's no need
for vmap; the linearity vmap provides doesn't pay off in that
case.

So likely there's further room for improvement here that you can
achieve in the current series by just dropping vmap/vunmap.

You can just use kmap (or kmap_atomic if you're in a non-preemptible
section; it should work from bh/irq).

In short, the mmu notifier invalidate only sets a "struct page
*userringpage" pointer to NULL, without calling vunmap.

In all cases you can call put_page immediately after gup_fast returns
(which explains why I'd like an option to drop FOLL_GET from gup_fast
to speed it up).

Then you can check the sequence counter and the inc/dec counter
updated by _start/_end. That will tell you whether the page you got
(and immediately unpinned with put_page, possibly even freeing it) is
guaranteed not to go away under you until the invalidate is called.

If the sequence counter and the inc/dec counter tell you that gup_fast
raced with any mmu notifier invalidate, you just repeat gup_fast.
Otherwise you're done: the page cannot go away under you, and the host
virtual to host physical mapping cannot change either. The page is not
pinned either. So you can just set "struct page *userringpage = page",
where "page" is the one returned by gup_fast.

When the invalidate later runs, you can just call set_page_dirty (if
gup_fast was called with "write = 1") and then clear the pointer:
"userringpage = NULL".

When you need to read/write to the memory
kmap/kmap_atomic(userringpage) should work.
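
As a rough sketch of that flow (the fields userringpage, invalidate_seq
and invalidate_count are only illustrative names, serialization against
the notifier, e.g. via the vq mutex, is left out, and the ~v5.0-era
get_user_pages_fast(start, nr_pages, write, pages) signature is
assumed):

static int ring_map_page(struct vhost_virtqueue *vq, unsigned long uaddr)
{
        struct page *page;
        unsigned int seq;

again:
        seq = READ_ONCE(vq->invalidate_seq);
        smp_rmb();

        if (get_user_pages_fast(uaddr, 1, 1, &page) != 1)
                return -EFAULT;
        put_page(page);         /* unpin right away; the counters keep this safe */

        smp_rmb();
        if (READ_ONCE(vq->invalidate_count) ||
            READ_ONCE(vq->invalidate_seq) != seq) {
                cond_resched();         /* raced with _start/_end: retry */
                goto again;
        }

        /* from here until the invalidate runs, the page cannot go away */
        WRITE_ONCE(vq->userringpage, page);
        return 0;
}

/* called from the mmu notifier invalidate */
static void ring_unmap_page(struct vhost_virtqueue *vq, bool write)
{
        struct page *page = READ_ONCE(vq->userringpage);

        if (page) {
                if (write)
                        set_page_dirty(page);   /* gup_fast was done with write = 1 */
                WRITE_ONCE(vq->userringpage, NULL);
        }
}

Reads and writes then go through kmap()/kmap_atomic() on
vq->userringpage, as noted above.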

In short, because there's no hardware involvement here, the established
mapping is just the pointer to the page; there is no need to set up
any pagetables or do any TLB flushes (except on 32bit archs if
the page is above the direct mapping, but that never happens on 64bit
archs).

Thanks,
Andrea


Re: [RFC PATCH V2 5/5] vhost: access vq metadata through kernel virtual address

2019-03-08 Thread Andrea Arcangeli
On Fri, Mar 08, 2019 at 04:58:44PM +0800, Jason Wang wrote:
> Can I simply call set_page_dirty() before vunmap() in the mmu notifier
> callback, or is there any reason that it must be called within vunmap()?

I also don't see any problem in doing it before vunmap. As far as the
mmu notifier and set_page_dirty are concerned, vunmap is just
put_page. It's just slower and potentially unnecessary.


Re: [RFC PATCH V2 5/5] vhost: access vq metadata through kernel virtual address

2019-03-08 Thread Andrea Arcangeli
On Fri, Mar 08, 2019 at 05:13:26PM +0800, Jason Wang wrote:
> Actually it's not wrapping around; the pages for the used ring are marked as
> dirty after a round of virtqueue processing, when we're sure vhost wrote
> something there.

Thanks for the clarification. So we need to convert it to
set_page_dirty and move it to the mmu notifier invalidate, but only in
those cases where gup_fast was called with write=1 (1 out of 3).

If using ->invalidate_range, the page pin also must be removed
immediately after get_user_pages returns (it is not ok to hold the pin
in vmap until ->invalidate_range is called), to avoid false positive
gup pin checks in things like KSM; or the pin must be released in
invalidate_range_start (which is called before the pin checks).

Here's why:

        /*
         * Check that no O_DIRECT or similar I/O is in progress on the
         * page
         */
        if (page_mapcount(page) + 1 + swapped != page_count(page)) {
                set_pte_at(mm, pvmw.address, pvmw.pte, entry);
                goto out_unlock;
        }
[..]
        set_pte_at_notify(mm, pvmw.address, pvmw.pte, entry);
          ^^^ too late to release the pin here, the
          check above already failed

->invalidate_range cannot be used with a mutex anyway, so you need to
go back to invalidate_range_start/end; in that case the pin must be
released in _start at the latest.

My preference is generally to call gup_fast() followed by an immediate
put_page(), because I want to eventually drop FOLL_GET from gup_fast
altogether, to avoid 2 useless atomic ops per gup_fast.

I'll write more about vmap in answer to the other email.

Thanks,
Andrea


Re: [RFC PATCH V2 0/5] vhost: accelerate metadata access through vmap()

2019-03-08 Thread Christoph Hellwig
On Wed, Mar 06, 2019 at 02:18:07AM -0500, Jason Wang wrote:
> This series tries to access virtqueue metadata through kernel virtual
> address instead of copy_user() friends, since they have too much
> overhead like checks, spec barriers or even hardware feature
> toggling. This is done by setting up a kernel address through vmap() and
> registering an MMU notifier for invalidation.
> 
> Test shows about 24% improvement on TX PPS. TCP_STREAM doesn't see
> obvious improvement.

How is this going to work for CPUs with virtually tagged caches?


Re: [RFC PATCH V2 5/5] vhost: access vq metadata through kernel virtual address

2019-03-08 Thread Michael S. Tsirkin
On Fri, Mar 08, 2019 at 04:58:44PM +0800, Jason Wang wrote:
> 
> > On 2019/3/8 3:17 AM, Jerome Glisse wrote:
> > On Thu, Mar 07, 2019 at 12:56:45PM -0500, Michael S. Tsirkin wrote:
> > > On Thu, Mar 07, 2019 at 10:47:22AM -0500, Michael S. Tsirkin wrote:
> > > > On Wed, Mar 06, 2019 at 02:18:12AM -0500, Jason Wang wrote:
> > > > > +static const struct mmu_notifier_ops vhost_mmu_notifier_ops = {
> > > > > + .invalidate_range = vhost_invalidate_range,
> > > > > +};
> > > > > +
> > > > >   void vhost_dev_init(struct vhost_dev *dev,
> > > > >   struct vhost_virtqueue **vqs, int nvqs, int iov_limit)
> > > > >   {
> > > > I also wonder here: when page is write protected then
> > > > it does not look like .invalidate_range is invoked.
> > > > 
> > > > E.g. mm/ksm.c calls
> > > > 
> > > > mmu_notifier_invalidate_range_start and
> > > > mmu_notifier_invalidate_range_end but not mmu_notifier_invalidate_range.
> > > > 
> > > > Similarly, rmap in page_mkclean_one will not call
> > > > mmu_notifier_invalidate_range.
> > > > 
> > > > If I'm right vhost won't get notified when page is write-protected 
> > > > since you
> > > > didn't install start/end notifiers. Note that end notifier can be called
> > > > with page locked, so it's not as straight-forward as just adding a call.
> > > > Writing into a write-protected page isn't a good idea.
> > > > 
> > > > Note that documentation says:
> > > > it is fine to delay the mmu_notifier_invalidate_range
> > > > call to mmu_notifier_invalidate_range_end() outside the page 
> > > > table lock.
> > > > implying it's called just later.
> > > OK I missed the fact that _end actually calls
> > > mmu_notifier_invalidate_range internally. So that part is fine but the
> > > fact that you are trying to take page lock under VQ mutex and take same
> > > mutex within notifier probably means it's broken for ksm and rmap at
> > > least since these call invalidate with lock taken.
> > > 
> > > And generally, Andrea told me offline one can not take mutex under
> > > the notifier callback. I CC'd Andrea for why.
> > Correct, you _can not_ take a mutex or any sleeping lock from within the
> > invalidate_range callback, as those callbacks happen under the page table
> > spinlock. You can however do so from the invalidate_range_start call-
> > back, but only if it is allowed to block (there is a flag passed down
> > with the invalidate_range_start callback; if you are not allowed to block,
> > return EBUSY and the invalidation will be aborted).
> > 
> > 
> > > That's a separate issue from set_page_dirty when memory is file backed.
> > If you can access file-backed pages then I suggest using set_page_dirty
> > from within a special version of vunmap(), so that when you vunmap you
> > set the page dirty without taking the page lock. It is always safe to do
> > so from within an mmu notifier callback if you had the page mapped
> > with write permission, which means that the page had write permission
> > in the userspace pte too, and thus it having a dirty pte is expected
> > and calling set_page_dirty on the page is allowed without any lock.
> > Locking will happen once the userspace ptes are torn down through the
> > page table lock.
> 
> 
> Can I simply call set_page_dirty() before vunmap() in the mmu notifier
> callback, or is there any reason that it must be called within vunmap()?
> 
> Thanks


I think this is what Jerome is saying, yes.
Maybe add a patch to mmu notifier doc file, documenting this?


> 
> > 
> > > It's because of all these issues that I preferred just accessing
> > > userspace memory and handling faults. Unfortunately there does not
> > > appear to exist an API that whitelists a specific driver along the lines
> > > of "I checked this code for speculative info leaks, don't add barriers
> > > on data path please".
> > Maybe it would be better to explore adding such a helper than remapping
> > the page into kernel address space?
> > 
> > Cheers,
> > Jérôme

Re: [RFC PATCH V2 5/5] vhost: access vq metadata through kernel virtual address

2019-03-08 Thread Jason Wang


On 2019/3/8 11:45 AM, Jerome Glisse wrote:

On Thu, Mar 07, 2019 at 10:43:12PM -0500, Michael S. Tsirkin wrote:

On Thu, Mar 07, 2019 at 10:40:53PM -0500, Jerome Glisse wrote:

On Thu, Mar 07, 2019 at 10:16:00PM -0500, Michael S. Tsirkin wrote:

On Thu, Mar 07, 2019 at 09:55:39PM -0500, Jerome Glisse wrote:

On Thu, Mar 07, 2019 at 09:21:03PM -0500, Michael S. Tsirkin wrote:

On Thu, Mar 07, 2019 at 02:17:20PM -0500, Jerome Glisse wrote:

It's because of all these issues that I preferred just accessing
userspace memory and handling faults. Unfortunately there does not
appear to exist an API that whitelists a specific driver along the lines
of "I checked this code for speculative info leaks, don't add barriers
on data path please".

Maybe it would be better to explore adding such a helper than remapping
the page into kernel address space?

I explored it a bit (see e.g. the thread around "__get_user slower than
get_user") and I can tell you it's not trivial, given the issue is around
security.  So in practice it does not seem fair to keep a significant
optimization out of the kernel because *maybe* we can do it differently,
even better :)

Maybe a slightly different approach between this patchset and other
copy user APIs would work here. What you really want is something like
a temporary mlock on a range of memory, so that it is safe for the
kernel to access that range of userspace virtual addresses, i.e. the
pages are present and have the proper permissions, hence there can be
no page fault while you are accessing things from kernel context.

So you can have something like a range structure and an mmu notifier.
When you lock the range you block the mmu notifier, to allow your code
to work on the userspace VA safely. Once you are done you unlock and
let the mmu notifier go on. It is pretty much exactly this patchset,
except that you remove all the kernel vmap code. A nice thing about
that is that you do not need to worry about calling set_page_dirty, it
will already be handled by the userspace VA pte. It also uses less
memory than when you have the kernel vmap.
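
A rough sketch of what such a range structure could look like (all
structure and field names here are only illustrative):

struct vhost_va_range {
        unsigned long start, end;
        spinlock_t lock;
        int inflight_users;             /* datapath currently inside the range */
        bool invalidated;               /* set by the mmu notifier */
        wait_queue_head_t wq;
};

/* datapath: enter the range before touching the userspace VA */
static bool range_lock(struct vhost_va_range *r)
{
        bool ok;

        spin_lock(&r->lock);
        ok = !r->invalidated;
        if (ok)
                r->inflight_users++;
        spin_unlock(&r->lock);
        return ok;      /* false: fall back to copy_from_user or re-fault */
}

static void range_unlock(struct vhost_va_range *r)
{
        spin_lock(&r->lock);
        if (!--r->inflight_users)
                wake_up(&r->wq);
        spin_unlock(&r->lock);
}

/* invalidate_range_start (blockable): mark invalid, wait for the datapath */
static void range_block(struct vhost_va_range *r)
{
        spin_lock(&r->lock);
        r->invalidated = true;
        spin_unlock(&r->lock);
        wait_event(r->wq, READ_ONCE(r->inflight_users) == 0);
}

/* invalidate_range_end: let the datapath use the range again */
static void range_unblock(struct vhost_va_range *r)
{
        spin_lock(&r->lock);
        r->invalidated = false;
        spin_unlock(&r->lock);
}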

This idea might be defeated by security features where the kernel is
running in its own address space without the userspace address
space present.

Like smap?

Yes, like smap, but also other newer changes with a similar effect since
the spectre drama.

Cheers,
Jérôme

Sorry do you mean meltdown and kpti?

Yes, all that and similar things. I do not have the full list in my head.

Cheers,
Jérôme



Yes, the kernel having its own address space is the main motivation for
using vmap here.


Thanks


Re: [RFC PATCH V2 5/5] vhost: access vq metadata through kernel virtual address

2019-03-08 Thread Jason Wang


On 2019/3/8 5:27 AM, Andrea Arcangeli wrote:

Hello Jerome,

On Thu, Mar 07, 2019 at 03:17:22PM -0500, Jerome Glisse wrote:

So for the above the easiest thing is to call set_page_dirty() from
the mmu notifier callback. It is always safe to use the non-locking
variant from such a callback. Well, it is safe only if the page was
mapped with write permission prior to the callback, so here I assume
nothing stupid is going on and that you only vmap pages with write
if they have a CPU pte with write, and if not then you force a write
page fault.

So if the GUP doesn't set FOLL_WRITE, set_page_dirty simply shouldn't
be called in that case. It only ever makes sense if the pte is
writable.

On a side note: the reason the write bit enabled on the pte avoids the
need for the _lock suffix is the stable page writeback
guarantees?


Basically, from the mmu notifier callback you have the same rights as
zap pte has.

Good point.

Related to this, I already was wondering why the set_page_dirty is not
done in the invalidate. Reading the patch it looks like the page is
marked dirty when the ring wraps around, not in the invalidate; Jason
can tell if I misread something there.



Actually it's not wrapping around; the pages for the used ring are
marked as dirty after a round of virtqueue processing, when we're sure
vhost wrote something there.


Thanks




For transient data passing through the ring, nobody should care if
it's lost. It's not user-journaled anyway, so it could hit the disk in
any order. The only reason to flush it to disk is if there's memory
pressure (to page it out like a swapout), and in such a case it's
enough to mark it dirty only in the mmu notifier invalidate, like you
pointed out (and only if GUP was called with FOLL_WRITE).


O_DIRECT can suffer from the same issue but the race window for that
is small enough that it is unlikely it ever happened. But for device

Ok that clarifies things.


drivers that GUP pages for hours/days/weeks/months ... obviously the
race window is big enough here. It affects many fs (ext4, xfs, ...)
in different ways. I think ext4 is the most obvious because of the
kernel log trace it leaves behind.

Bottom line is for set_page_dirty to be safe you need the following:
 lock_page()
 page_mkwrite()
 set_pte_with_write()
 unlock_page()

I also wondered why ext4 writepage doesn't recreate the bh if they got
dropped by the VM and page->private is 0. I mean, page->index and
page->mapping are still there, that's enough info for writepage itself
to take a slow path and call page_mkwrite to find where to write the
page on disk.


Now when losing the write permission on the pte you will first get
an mmu notifier callback, so anyone that abides by mmu notifiers is fine
as long as they only write to the page if they found a pte with
write, as it means the above sequence did happen and the page is
writable until the mmu notifier callback happens.

When you look up a page in the page cache you still need to call
page_mkwrite() before installing a writable pte.

Here for this vmap thing, all you need is that the original user
pte had the write flag. If you only allow write in the vmap when
the original pte had write, and you abide by mmu notifiers, then it
is ok to call set_page_dirty from the mmu notifier (but not after).

Hence why my suggestion is a special vunmap that calls set_page_dirty
on the page from the mmu notifier.
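
A minimal sketch of such a dirtying vunmap (the function name and the
vhost_vmap layout are only loosely based on the RFC; it is safe only
from an mmu notifier callback, and only for pages that were GUPed and
mapped with write permission):

static void vhost_vunmap_dirty(struct vhost_vmap *map, bool was_writable)
{
        int i;

        vunmap(map->addr);
        map->addr = NULL;

        for (i = 0; i < map->npages; i++) {
                if (was_writable)
                        set_page_dirty(map->pages[i]); /* non-locking variant is ok here */
                put_page(map->pages[i]);
        }
}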

Agreed, that will solve all issues in vhost context with regard to
set_page_dirty, including the case where the memory is backed by
VM_SHARED ext4.

Thanks!
Andrea


Re: [RFC PATCH V2 5/5] vhost: access vq metadata through kernel virtual address

2019-03-08 Thread Jason Wang


On 2019/3/8 3:17 AM, Jerome Glisse wrote:

On Thu, Mar 07, 2019 at 12:56:45PM -0500, Michael S. Tsirkin wrote:

On Thu, Mar 07, 2019 at 10:47:22AM -0500, Michael S. Tsirkin wrote:

On Wed, Mar 06, 2019 at 02:18:12AM -0500, Jason Wang wrote:

+static const struct mmu_notifier_ops vhost_mmu_notifier_ops = {
+   .invalidate_range = vhost_invalidate_range,
+};
+
  void vhost_dev_init(struct vhost_dev *dev,
struct vhost_virtqueue **vqs, int nvqs, int iov_limit)
  {

I also wonder here: when page is write protected then
it does not look like .invalidate_range is invoked.

E.g. mm/ksm.c calls

mmu_notifier_invalidate_range_start and
mmu_notifier_invalidate_range_end but not mmu_notifier_invalidate_range.

Similarly, rmap in page_mkclean_one will not call
mmu_notifier_invalidate_range.

If I'm right vhost won't get notified when page is write-protected since you
didn't install start/end notifiers. Note that end notifier can be called
with page locked, so it's not as straight-forward as just adding a call.
Writing into a write-protected page isn't a good idea.

Note that documentation says:
it is fine to delay the mmu_notifier_invalidate_range
call to mmu_notifier_invalidate_range_end() outside the page table lock.
implying it's called just later.

OK I missed the fact that _end actually calls
mmu_notifier_invalidate_range internally. So that part is fine but the
fact that you are trying to take page lock under VQ mutex and take same
mutex within notifier probably means it's broken for ksm and rmap at
least since these call invalidate with lock taken.

And generally, Andrea told me offline one can not take mutex under
the notifier callback. I CC'd Andrea for why.

Correct, you _can not_ take a mutex or any sleeping lock from within the
invalidate_range callback, as those callbacks happen under the page table
spinlock. You can however do so from the invalidate_range_start call-
back, but only if it is allowed to block (there is a flag passed down
with the invalidate_range_start callback; if you are not allowed to block,
return EBUSY and the invalidation will be aborted).



That's a separate issue from set_page_dirty when memory is file backed.

If you can access file-backed pages then I suggest using set_page_dirty
from within a special version of vunmap(), so that when you vunmap you
set the page dirty without taking the page lock. It is always safe to do
so from within an mmu notifier callback if you had the page mapped
with write permission, which means that the page had write permission
in the userspace pte too, and thus it having a dirty pte is expected
and calling set_page_dirty on the page is allowed without any lock.
Locking will happen once the userspace ptes are torn down through the
page table lock.



Can I simply call set_page_dirty() before vunmap() in the mmu notifier
callback, or is there any reason that it must be called within vunmap()?


Thanks





It's because of all these issues that I preferred just accessing
userspace memory and handling faults. Unfortunately there does not
appear to exist an API that whitelists a specific driver along the lines
of "I checked this code for speculative info leaks, don't add barriers
on data path please".

Maybe it would be better to explore adding such a helper than remapping
the page into kernel address space?

Cheers,
Jérôme


Re: [RFC PATCH V2 5/5] vhost: access vq metadata through kernel virtual address

2019-03-08 Thread Jason Wang


On 2019/3/8 3:16 AM, Andrea Arcangeli wrote:

On Thu, Mar 07, 2019 at 12:56:45PM -0500, Michael S. Tsirkin wrote:

On Thu, Mar 07, 2019 at 10:47:22AM -0500, Michael S. Tsirkin wrote:

On Wed, Mar 06, 2019 at 02:18:12AM -0500, Jason Wang wrote:

+static const struct mmu_notifier_ops vhost_mmu_notifier_ops = {
+   .invalidate_range = vhost_invalidate_range,
+};
+
  void vhost_dev_init(struct vhost_dev *dev,
struct vhost_virtqueue **vqs, int nvqs, int iov_limit)
  {

I also wonder here: when page is write protected then
it does not look like .invalidate_range is invoked.

E.g. mm/ksm.c calls

mmu_notifier_invalidate_range_start and
mmu_notifier_invalidate_range_end but not mmu_notifier_invalidate_range.

Similarly, rmap in page_mkclean_one will not call
mmu_notifier_invalidate_range.

If I'm right vhost won't get notified when page is write-protected since you
didn't install start/end notifiers. Note that end notifier can be called
with page locked, so it's not as straight-forward as just adding a call.
Writing into a write-protected page isn't a good idea.

Note that documentation says:
it is fine to delay the mmu_notifier_invalidate_range
call to mmu_notifier_invalidate_range_end() outside the page table lock.
implying it's called just later.

OK I missed the fact that _end actually calls
mmu_notifier_invalidate_range internally. So that part is fine but the
fact that you are trying to take page lock under VQ mutex and take same
mutex within notifier probably means it's broken for ksm and rmap at
least since these call invalidate with lock taken.

Yes, this lock inversion needs more thought.


And generally, Andrea told me offline one can not take mutex under
the notifier callback. I CC'd Andrea for why.

Yes, the problem then is that ->invalidate_range is called under the PT
lock, so it cannot take a mutex; you also cannot take the page lock, it
can at most take a spinlock or trylock_page.

So it must switch back to the _start/_end methods unless you rewrite
the locking.

The difference with _start/_end is that ->invalidate_range basically
avoids the _start callback, but to avoid the _start callback safely, it
has to be called in between the ptep_clear_flush and the set_pte_at
whenever the pfn changes, like during a COW. So it cannot be coalesced
into a single TLB flush that invalidates all sptes in a range, like we
prefer for performance reasons for example in KVM. It also cannot
sleep.

In short ->invalidate_range must be really fast (it shouldn't require
sending an IPI to all other CPUs like KVM may require during an
invalidate_range_start) and it must not sleep, in order to prefer it
to _start/_end.

I.e. the invalidate of the secondary MMU that walks the linux
pagetables in hardware (in vhost case with GUP in software) has to
happen while the linux pagetable is zero, otherwise a concurrent
hardware pagetable lookup could re-instantiate a mapping to the old
page in between the set_pte_at and the invalidate_range_end (which
internally calls ->invalidate_range). Jerome documented it nicely in
Documentation/vm/mmu_notifier.rst .
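
As an order-only sketch of that requirement (arguments elided; see
Documentation/vm/mmu_notifier.rst for the authoritative version):

/*
 * pfn-changing update (e.g. COW):
 *
 *   mmu_notifier_invalidate_range_start()
 *   old_pte = ptep_clear_flush()          // linux pte is now clear
 *   mmu_notifier_invalidate_range()       // secondary MMUs drop the old page;
 *                                         // vhost would NULL its struct page here
 *   set_pte_at()                          // install the pte for the new page
 *   mmu_notifier_invalidate_range_end()   // calls ->invalidate_range again internally
 */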



Right, I've actually gone through this several times, but obviously I
missed some details.





Now you don't really walk the pagetable in hardware in vhost, but if
you use gup_fast after use_mm() it's similar.

For vhost the invalidate would be really fast, there are no IPI to
deliver at all, the problem is just the mutex.



Yes. A possible solution is to introduce a valid flag for the VA. Vhost
may only try to access the kernel VA when it is valid.
Invalidate_range_start() will clear this flag under the protection of
the vq mutex when it can block; invalidate_range_end() can then set it
again. An issue is that blockable is always false for range_end().
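
A rough sketch of that flag (map_valid and meta_kva are invented names;
assumes the ~v5.0-era notifier API where the range carries a blockable
flag, and that the mmu notifier is embedded in vhost_dev):

static int vhost_invalidate_range_start(struct mmu_notifier *mn,
                                        const struct mmu_notifier_range *range)
{
        struct vhost_dev *dev = container_of(mn, struct vhost_dev, mmu_notifier);
        int i;

        if (!range->blockable)
                return -EBUSY;  /* per Jerome: the invalidation is then aborted */

        for (i = 0; i < dev->nvqs; i++) {
                struct vhost_virtqueue *vq = dev->vqs[i];

                mutex_lock(&vq->mutex);         /* allowed: this callback may block */
                vq->map_valid = false;          /* stop using the kernel VA */
                mutex_unlock(&vq->mutex);
        }
        return 0;
}

/* invalidate_range_end() would set map_valid back to true (modulo the
 * locking issue noted above); the datapath, already under the vq mutex,
 * checks it before dereferencing the vmap: */
static void *vhost_meta_ptr(struct vhost_virtqueue *vq)
{
        return vq->map_valid ? vq->meta_kva : NULL;     /* NULL: fall back to copy_user */
}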






That's a separate issue from set_page_dirty when memory is file backed.

Yes. I don't yet know why the ext4 internal __writepage cannot
re-create the bh if they've been freed by the VM, and why such a race
where the bh are freed for a pinned VM_SHARED ext4 page doesn't even
exist for transient pins like O_DIRECT (does it work by luck?), but
with mmu notifiers there are no long term pins anyway, so this works
normally and it's as if the memory isn't pinned. In any case I think
that's a kernel bug in either __writepage or try_to_free_buffers, so I
would ignore it, considering qemu will only use anon memory or tmpfs or
hugetlbfs as backing store for the virtio ring. It wouldn't make sense
for qemu to risk triggering I/O on a VM_SHARED ext4, so we shouldn't
even be exposed to what seems to be an orthogonal kernel bug.

I suppose whatever solution fixes the set_page_dirty_lock on
VM_SHARED ext4 for the other places that don't or can't use mmu
notifiers will then work for vhost too, which uses mmu notifiers and
will be less affected from the start, if anything.

Reading the lwn link about the discussion about the long term GUP pin
from Jan vs set_page_dirty_lock: I can only agree with the last 

Re: [RFC PATCH V2 5/5] vhost: access vq metadata through kernel virtual address

2019-03-08 Thread Jason Wang


On 2019/3/7 11:34 PM, Michael S. Tsirkin wrote:

On Thu, Mar 07, 2019 at 10:45:57AM +0800, Jason Wang wrote:

On 2019/3/7 12:31 AM, Michael S. Tsirkin wrote:

+static void vhost_set_vmap_dirty(struct vhost_vmap *used)
+{
+        int i;
+
+        for (i = 0; i < used->npages; i++)
+                set_page_dirty_lock(used->pages[i]);

This seems to rely on the page lock to mark the page dirty.

Could it happen that page writeback checks the page, finds it clean,
and then you mark it dirty, and then the invalidate callback is
called?



Yes. But does this break anything?
The page is still there, we just remove a
kernel mapping to it.

Thanks

Yes, it's the same problem as e.g. RDMA:
we've just marked the page as dirty without having buffers.
Eventually writeback will find it and the filesystem will complain...
So if the pages are backed by a non-RAM-based filesystem, it's all just
broken.



Yes, we can't depend on the pages that might have been invalidated. As 
suggested, the only suitable place is the MMU notifier callbacks.


Thanks



one can hope that the RDMA guys will fix it in some way eventually.
For now, maybe add a flag in e.g. the VMA that says that there's no
writeback, so it's safe to mark the page dirty at any point?
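
As a sketch (illustrative only), one way to express that check without
a new flag is to whitelist the backings a virtio ring realistically
uses, which never do writeback:

static bool can_dirty_anytime(struct vm_area_struct *vma)
{
        return vma_is_anonymous(vma) ||         /* plain anon memory */
               vma_is_shmem(vma) ||             /* tmpfs/shmem */
               is_vm_hugetlb_page(vma);         /* hugetlbfs */
}

A new VM_* flag as suggested above would instead let a filesystem opt
in explicitly rather than hardcoding known-safe backings.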





