On Tue, Jan 10, 2023 at 12:09:41AM -0800, Chuang Xu wrote:
> Hi, Peter and Paolo,

Hi, Chuang, Paolo,

> I'm sorry I didn't reply to your email in time. I was infected with
> COVID-19 two weeks ago, so I couldn't think about the problems discussed
> in your email for a long time.😷
>
> On 2023/1/4 1:43 PM, Peter Xu wrote:
> > Hi, Paolo,
> >
> > On Wed, Dec 28, 2022 at 09:27:50AM +0100, Paolo Bonzini wrote:
> >> On Fri, Dec 23, 2022, 16:54 Peter Xu wrote:
> >>
> >>>> This is not valid because the transaction could happen in *another*
> >>>> thread.
> >>>> In that case memory_region_transaction_depth() will be > 0, but RCU is
> >>>> needed.
> >>> Do you mean the code is wrong, or the comment? Note that the code has
> >>> checked rcu_read_locked() where introduced in patch 1, but maybe something
> >>> else was missed?
> >>>
> >> The assertion is wrong. It will succeed even if RCU is unlocked in this
> >> thread but a transaction is in progress in another thread.
> > IIUC this is the case where the context:
> >
> > (1) doesn't have RCU read lock held, and,
> > (2) doesn't have BQL held.
> >
> > Is it safe at all to reference any flatview in such a context? The thing
> > is I think the flatview pointer can be freed anytime if both locks are not
> > taken.
> >
> >> Perhaps you can check (memory_region_transaction_depth() > 0 &&
> >> !qemu_mutex_iothread_locked()) || rcu_read_locked() instead?
> > What if one thread calls address_space_to_flatview() with BQL held but not
> > RCU read lock held? I assume it's a legal operation, but it seems to be
> > able to trigger the assert already?
> >
> > Thanks,
> >
> I'm not sure whether I understand the content of your discussion correctly,
> so here I want to explain my understanding firstly.
>
> From my perspective, Paolo thinks that when thread 1 is in a transaction,
> thread 2 will trigger the assertion when accessing the flatview without
> holding RCU read lock, although sometimes the thread 2's access to flatview
> is legal.
> So Paolo suggests checking (memory_region_transaction_depth() > 0
> && !qemu_mutex_iothread_locked()) || rcu_read_locked() instead.
>
> And Peter thinks that as long as no thread holds the BQL or RCU read lock,
> the old flatview can be released (actually executed by the rcu thread with
> BQL held). When thread 1 is in a transaction, if thread 2 access the
> flatview with BQL held but not RCU read lock held, it's a legal operation.
> In this legal case, it seems that both my code and Paolo's code will
> trigger assertion.

IIUC your original patch is fine in this case (BQL held, RCU not held), as
long as depth==0.

IMHO what we want to trap here is when BQL held (while RCU is not) and
depth>0 which can cause unpredictable side effect of using obsolete
flatview.

To summarize, the original check didn't consider BQL, and if to consider
BQL I think it should be something like:

  /* Guarantees valid access to the flatview, either lock works */
  assert(BQL_HELD() || RCU_HELD());

  /*
   * Guarantees any BQL holder is not reading obsolete flatview (e.g. when
   * during vm load)
   */
  if (BQL_HELD())
      assert(depth==0);

IIUC it can be merged into:

  assert((BQL_HELD() && depth==0) || RCU_HELD());

> I'm not sure if I have a good understanding of your emails? I think
> checking (memory_region_transaction_get_depth() == 0 || rcu_read_locked() ||
> qemu_mutex_iothread_locked()) should cover the case you discussed.

This seems still problematic too?  Since the assert can pass even if
neither BQL nor RCU is held (as long as depth==0).

Thanks,

-- 
Peter Xu
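
For anyone who wants to compare the candidate conditions discussed above state by state, here is a minimal sketch in plain C. It is only a model, not QEMU code: `struct ctx`, `two_step_check()`, `merged_check()`, `proposed_check()` and `count_disagreements()` are hypothetical names invented for illustration, and the three fields merely stand in for qemu_mutex_iothread_locked(), rcu_read_locked() and the memory region transaction depth.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical snapshot of what a caller of address_space_to_flatview()
 * holds.  These fields are stand-ins for the real lock queries and the
 * transaction depth; they are not QEMU APIs.
 */
struct ctx {
    bool bql_held;
    bool rcu_held;
    unsigned depth;
};

/* Two-step form: either lock is required, and a BQL holder needs depth==0 */
static bool two_step_check(struct ctx c)
{
    if (!(c.bql_held || c.rcu_held)) {
        return false;
    }
    if (c.bql_held && c.depth != 0) {
        return false;
    }
    return true;
}

/* Merged form: (BQL_HELD() && depth==0) || RCU_HELD() */
static bool merged_check(struct ctx c)
{
    return (c.bql_held && c.depth == 0) || c.rcu_held;
}

/* Check proposed in the quoted mail: depth==0 || RCU held || BQL held */
static bool proposed_check(struct ctx c)
{
    return c.depth == 0 || c.rcu_held || c.bql_held;
}

/*
 * Count the modelled states (BQL yes/no, RCU yes/no, depth 0 or 1) in
 * which two predicates give different answers.
 */
static int count_disagreements(bool (*a)(struct ctx), bool (*b)(struct ctx))
{
    int n = 0;

    for (int bql = 0; bql < 2; bql++) {
        for (int rcu = 0; rcu < 2; rcu++) {
            for (unsigned depth = 0; depth < 2; depth++) {
                struct ctx c = { bql != 0, rcu != 0, depth };

                if (a(c) != b(c)) {
                    n++;
                }
            }
        }
    }
    return n;
}
```

In this model, proposed_check() passes with neither lock held and depth==0, and also with BQL held, RCU not held and depth>0, while merged_check() rejects both states. The two-step form and the merged form agree in every modelled state except one: both locks held with depth>0, which the two-step form rejects and the merged form accepts.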