On Tue, Feb 28, 2023 at 07:09:57PM -0500, Stefan Hajnoczi wrote:
> On Sat, Feb 25, 2023 at 11:31:37AM -0500, Peter Xu wrote:
> > [not for merging, but for discussion; this is something I found when
> >  looking at another issue on Chuang's optimization for migration downtime]
> > 
> > Summary: we tried to access memory_listeners, address_spaces, etc. in RCU
> > way.  However we didn't implement them with RCU-safety. This patchset is
> > trying to do that; at least making it closer.
> > 
> > NOTE!  It's doing it wrongly for now, so please feel free to see this as a
> > thread to start discussing this problem, as in subject.
> > 
> > The core problem here is how to make sure memory listeners will be freed in
> > RCU ways, per when unlinking them from the global memory_listeners list.
> > 
> > The current patchset (in patch 1) did it with drain_call_rcu(), but of
> > course it's wrong, because of at least two things:
> > 
> >   (1) drain_call_rcu() will release BQL; currently there's no way to me to
> >       guarantee that releasing BQL is safe here.
> > 
> >   (2) memory_listener_unregister() can be called within a RCU read lock
> >       itself (we're so happy to take rcu read lock in many places but we
> >       don't think much on how long it'll be taken; at least not as strict
> >       as the kernel variance, so we're just less care about that fact yet).
> >       It means, drain_call_rcu() should deadlock there waiting for itself.
> >       For an example, see Appendix A.
> > 
> > Side question to Stefan / Maxim: why do we need drain_call_rcu() and what's
> > its difference from synchronize_rcu() in API level besides releasing and
> > retaking BQL when taken?
> 
> Hi,
> I haven't taken a look at the patches or thought about the larger
> problem you're tackling here, but I wanted to reply to this specific
> question.
> 
> It's been a long time since Maxim, Paolo, and I discussed this, but
> drain_call_rcu() and synchronize_rcu() do different things:
> - drain_call_rcu() is about waiting until the current thread's
>   call_rcu() callbacks have completed.
> - synchronize_rcu() is about waiting until there are no more readers in
>   the last grace period.
> 
> Calling synchronize_rcu() doesn't guarantee that call_rcu_thread() has
> completed pending call_rcu() callbacks. Therefore it's not appropriate
> for the existing drain_call_rcu() callers because they rely on previous
> call_rcu() callbacks to have finished.

Ah I missed that detail.

I was quickly thinking whether such a requirement can also be done with a
customized rcu callback that will simply kick a signal after the real
"free" is done, while the call_rcu() context can wait for the signal.  It's
just that assuming RCU callbacks will be executed in order is slightly
tricky.  But I guess it's also hard if the call_rcu() is deep in the stack
so drain_call_rcu() should avoid fiddling on the details.

Thanks Stefan!

-- 
Peter Xu


Reply via email to