Re: [PATCH 4/4] mm, notifier: Catch sleeping/blocking for !blockable

2019-08-22 Thread Daniel Vetter
On Thu, Aug 22, 2019 at 4:24 PM Jason Gunthorpe  wrote:
>
> On Thu, Aug 22, 2019 at 10:42:39AM +0200, Daniel Vetter wrote:
>
> > > RDMA has a mutex:
> > >
> > > ib_umem_notifier_invalidate_range_end
> > >   rbt_ib_umem_for_each_in_range
> > >     invalidate_range_end_trampoline
> > >       ib_umem_notifier_end_account
> > >         mutex_lock(&umem_odp->umem_mutex);
> > >
> > > I'm working to delete this path though!
> > >
> > > Nonblocking or not follows the start; the same flag gets placed into
> > > the mmu_notifier_range struct passed to end.
> >
> > Ok, makes sense.
> >
> > I guess that also means the might_sleep (I started on that) in
> > invalidate_range_end also needs to be conditional? Or should I not
> > bother with a might_sleep in invalidate_range_end, since you're working
> > on removing the last sleep in there?
>
> I might suggest the same pattern as used for locked: the might_sleep
> unconditionally on the start, and a second might_sleep after the 'if' in
> __mmu_notifier_invalidate_range_end().
>
> Note that, by audit, all the callers already have the same locking
> context for start/end.

My question was more about enforcing that going forward, since you're
working to remove all the sleeps from invalidate_range_end. I don't
want to add debug annotations that are stricter than what the other
side actually expects. But since there are currently still sleeping
locks in invalidate_range_end, I think I'll just stick them in both
places. You can then (re)move them when the cleanup lands.
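
Concretely, something like this in both paths (just a sketch of the
annotation, not the respin itself):

	/* in both __mmu_notifier_invalidate_range_start() and
	 * __mmu_notifier_invalidate_range_end():
	 */
	if (mmu_notifier_range_blockable(range))
		might_sleep();

i.e. only assert "may sleep" when the caller actually allows blocking;
the !blockable case is covered by the non_block_start()/non_block_end()
pair this series adds.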
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [PATCH 4/4] mm, notifier: Catch sleeping/blocking for !blockable

2019-08-22 Thread Daniel Vetter
On Thu, Aug 22, 2019 at 10:16 AM Jason Gunthorpe  wrote:
>
> On Wed, Aug 21, 2019 at 05:41:51PM +0200, Daniel Vetter wrote:
>
> > > Hm, I thought the page table locks we're holding there already prevent any
> > > sleeping, so it would be redundant? But reading through the code I think
> > > that's not guaranteed, so yeah, it makes sense to add it for
> > > invalidate_range_end too. I'll respin once I have the ack/nack from
> > > scheduler people.
> >
> > So I started to look into this, and I'm a bit confused. There's no
> > _nonblock version of this, so does this mean blocking is never allowed,
> > or always allowed?
>
> RDMA has a mutex:
>
> ib_umem_notifier_invalidate_range_end
>   rbt_ib_umem_for_each_in_range
>     invalidate_range_end_trampoline
>       ib_umem_notifier_end_account
>         mutex_lock(&umem_odp->umem_mutex);
>
> I'm working to delete this path though!
>
> Nonblocking or not follows the start; the same flag gets placed into
> the mmu_notifier_range struct passed to end.

Ok, makes sense.

I guess that also means the might_sleep (I started on that) in
invalidate_range_end also needs to be conditional? Or should I not
bother with a might_sleep in invalidate_range_end, since you're working
on removing the last sleep in there?

> > From a quick look through the implementations I've only seen spinlocks,
> > and one up_read. So I guess I should wrap this callback in some
> > unconditional non_block_start/end, but I'm not sure.
>
> For now, we should keep it the same as start, conditionally blocking.
>
> Hopefully before LPC I can send an RFC series that eliminates most
> invalidate_range_end users in favor of common locking..

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [PATCH 4/4] mm, notifier: Catch sleeping/blocking for !blockable

2019-08-21 Thread Jason Gunthorpe
On Wed, Aug 21, 2019 at 05:41:51PM +0200, Daniel Vetter wrote:

> > Hm, I thought the page table locks we're holding there already prevent any
> > sleeping, so it would be redundant? But reading through the code I think
> > that's not guaranteed, so yeah, it makes sense to add it for
> > invalidate_range_end too. I'll respin once I have the ack/nack from
> > scheduler people.
> 
> So I started to look into this, and I'm a bit confused. There's no
> _nonblock version of this, so does this mean blocking is never allowed,
> or always allowed?

RDMA has a mutex:

ib_umem_notifier_invalidate_range_end
  rbt_ib_umem_for_each_in_range
    invalidate_range_end_trampoline
      ib_umem_notifier_end_account
        mutex_lock(&umem_odp->umem_mutex);

I'm working to delete this path though!

Nonblocking or not follows the start; the same flag gets placed into
the mmu_notifier_range struct passed to end.

> From a quick look through the implementations I've only seen spinlocks,
> and one up_read. So I guess I should wrap this callback in some
> unconditional non_block_start/end, but I'm not sure.

For now, we should keep it the same as start, conditionally blocking.

Hopefully before LPC I can send an RFC series that eliminates most
invalidate_range_end users in favor of common locking..
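
For reference, "the same as start" means a driver's end callback keys
off the same flag; roughly (a sketch with a hypothetical driver context
and hypothetical my_teardown* helpers, not the RDMA code above):

static void my_invalidate_range_end(struct mmu_notifier *mn,
				    const struct mmu_notifier_range *range)
{
	/* 'my_ctx' and its locks are made up for illustration */
	struct my_ctx *ctx = container_of(mn, struct my_ctx, mn);

	if (mmu_notifier_range_blockable(range)) {
		/* blockable: a sleeping lock is fine, same rule as start */
		mutex_lock(&ctx->lock);
		my_teardown(ctx, range->start, range->end);
		mutex_unlock(&ctx->lock);
	} else {
		/* !blockable (OOM reaper etc.): no sleeping allowed, and
		 * unlike start, end cannot bail out with -EAGAIN, so the
		 * driver needs a non-sleeping path here */
		spin_lock(&ctx->fast_lock);
		my_teardown_atomic(ctx, range->start, range->end);
		spin_unlock(&ctx->fast_lock);
	}
}

The lack of an error return in end is exactly why sleeping locks there
are awkward, and why eliminating those users is attractive.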

Jason


Re: [PATCH 4/4] mm, notifier: Catch sleeping/blocking for !blockable

2019-08-20 Thread Jason Gunthorpe
On Tue, Aug 20, 2019 at 10:19:02AM +0200, Daniel Vetter wrote:
> We need to make sure implementations don't cheat and don't have a
> possible schedule/blocking point deeply buried where review can't
> catch it.
> 
> I'm not sure whether this is the best way to make sure all the
> might_sleep() callsites trigger, and it's a bit ugly in the code flow.
> But it gets the job done.
> 
> Inspired by an i915 patch series which did exactly that, because the
> rules haven't been entirely clear to us.
> 
> v2: Use the shiny new non_block_start/end annotations instead of
> abusing preempt_disable/enable.
> 
> v3: Rebase on top of Glisse's arg rework.
> 
> v4: Rebase on top of more Glisse rework.
> 
> Cc: Jason Gunthorpe 
> Cc: Andrew Morton 
> Cc: Michal Hocko 
> Cc: David Rientjes 
> Cc: "Christian König" 
> Cc: Daniel Vetter 
> Cc: "Jérôme Glisse" 
> Cc: linux...@kvack.org
> Reviewed-by: Christian König 
> Reviewed-by: Jérôme Glisse 
> Signed-off-by: Daniel Vetter 
>  mm/mmu_notifier.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
> index 538d3bb87f9b..856636d06ee0 100644
> +++ b/mm/mmu_notifier.c
> @@ -181,7 +181,13 @@ int __mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range)
>   id = srcu_read_lock(&srcu);
>   hlist_for_each_entry_rcu(mn, &range->mm->mmu_notifier_mm->list, hlist) {
>   if (mn->ops->invalidate_range_start) {
> - int _ret = mn->ops->invalidate_range_start(mn, range);
> + int _ret;
> +
> + if (!mmu_notifier_range_blockable(range))
> + non_block_start();
> + _ret = mn->ops->invalidate_range_start(mn, range);
> + if (!mmu_notifier_range_blockable(range))
> + non_block_end();

If someone Acks all the sched changes then I can pick this up for
hmm.git, but I still think the existing preemption debugging is fine
for this use case.

Also, same comment as for the lockdep map: this needs to apply to the
non-blocking range_end as well.
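
Concretely, that would mean the same wrapping in
__mmu_notifier_invalidate_range_end(), roughly (a sketch mirroring the
start-side hunk, not part of this patch):

	hlist_for_each_entry_rcu(mn, &range->mm->mmu_notifier_mm->list, hlist) {
		if (mn->ops->invalidate_range_end) {
			if (!mmu_notifier_range_blockable(range))
				non_block_start();
			mn->ops->invalidate_range_end(mn, range);
			if (!mmu_notifier_range_blockable(range))
				non_block_end();
		}
	}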

Anyhow, since this series has conflicts with hmm.git it would be best
to flow the whole thing through that tree. If there are no
remarks on the first two patches I'll grab them in a few days.

Regards,
Jason