On Tue, Jan 10, 2023 at 12:09:41AM -0800, Chuang Xu wrote:
> Hi, Peter and Paolo,

Hi, Chuang, Paolo,

> I'm sorry I didn't reply to your email in time. I was infected with
> COVID-19 two weeks ago, so I couldn't think about the problems discussed
> in your email for a long time.😷
> 
> On 2023/1/4 上午1:43, Peter Xu wrote:
> > Hi, Paolo,
> >
> > On Wed, Dec 28, 2022 at 09:27:50AM +0100, Paolo Bonzini wrote:
> >> Il ven 23 dic 2022, 16:54 Peter Xu ha scritto:
> >>
> >>>> This is not valid because the transaction could happen in *another*
> >>> thread.
> >>>> In that case memory_region_transaction_depth() will be > 0, but RCU is
> >>>> needed.
> >>> Do you mean the code is wrong, or the comment? Note that the code has
> >>> checked rcu_read_locked() where introduced in patch 1, but maybe
> something
> >>> else was missed?
> >>>
> >> The assertion is wrong. It will succeed even if RCU is unlocked in this
> >> thread but a transaction is in progress in another thread.
> > IIUC this is the case where the context:
> >
> > (1) doesn't have RCU read lock held, and,
> > (2) doesn't have BQL held.
> >
> > Is it safe at all to reference any flatview in such a context? The thing
> > is I think the flatview pointer can be freed anytime if both locks are
> not
> > taken.
> >
> >> Perhaps you can check (memory_region_transaction_depth() > 0 &&
> >> !qemu_mutex_iothread_locked()) || rcu_read_locked() instead?
> > What if one thread calls address_space_to_flatview() with BQL held but
> not
> > RCU read lock held? I assume it's a legal operation, but it seems to be
> > able to trigger the assert already?
> >
> > Thanks,
> >
> I'm not sure whether I understand the content of your discussion correctly,
> so here I want to explain my understanding firstly.
> 
> From my perspective, Paolo thinks that when thread 1 is in a transaction,
> thread 2 will trigger the assertion when accessing the flatview without
> holding RCU read lock, although sometimes the thread 2's access to flatview
> is legal. So Paolo suggests checking (memory_region_transaction_depth() > 0
> && !qemu_mutex_iothread_locked()) || rcu_read_locked() instead.
> 
> And Peter thinks that as long as no thread holds the BQL or RCU read lock,
> the old flatview can be released (actually executed by the rcu thread with
> BQL held). When thread 1 is in a transaction, if thread 2 access the
> flatview
> with BQL held but not RCU read lock held, it's a legal operation. In this
> legal case, it seems that both my code and Paolo's code will trigger
> assertion.

IIUC your original patch is fine in this case (BQL held, RCU not held), as
long as depth==0.  IMHO what we want to trap here is when BQL held (while
RCU is not) and depth>0 which can cause unpredictable side effect of using
obsolete flatview.

To summarize, the original check didn't consider BQL, and if to consider
BQL I think it should be something like:

  /* Guarantees valid access to the flatview, either lock works */
  assert(BQL_HELD() || RCU_HELD());

  /*
   * Guarantees any BQL holder is not reading obsolete flatview (e.g. when
   * during vm load)
   */
  if (BQL_HELD())
      assert(depth==0);

IIUC it can be merged into:

  assert((BQL_HELD() && depth==0) || RCU_HELD());

> 
> I'm not sure if I have a good understanding of your emails? I think
> checking(memory_region_transaction_get_depth() == 0 || rcu_read_locked() ||
> qemu_mutex_iothread_locked()) should cover the case you discussed.

This seems still problematic too?  Since the assert can pass even if
neither BQL nor RCU is held (as long as depth==0).

Thanks,

-- 
Peter Xu


Reply via email to