On Thu Mar 28, 2024 at 8:15 PM AEST, Nicholas Piggin wrote:
> On Thu Mar 28, 2024 at 6:12 PM AEST, Nicholas Piggin wrote:
> > On Thu Mar 28, 2024 at 3:31 PM AEST, Nicholas Piggin wrote:
> > > ppc broadcast tlb flushes should be synchronised with other vCPUs,
> > > like all other architectures that support such operations seem to
> > > be doing.
> > >
> > > Fixing ppc removes the last caller of the non-synced TLB flush
> > > variants, we can remove some dead code. I'd like to merge patch 1
> > > for 9.0, and hold patches 2 and 3 until 9.1 to avoid churn (unless
> > > someone prefers to remove the dead code asap).
> >
> > Hmm, turns out to not be so simple, this in parts reverts
> > the fix in commit 4ddc104689b. Do other architectures
> > that use the _synced TLB flush variants have that same problem
> > with the TLB flush not actually flushing until the TB ends,
> > I wonder?
>
> Huh, I can reproduce that original problem with a little test
> case (which I will upstream into kvm-unit-tests).
>
> async_run_on_cpu(this_cpu) seems to flush before the next TB, but
> async_safe_run_on_cpu(this_cpu) does not? How does it execute it
> without exiting from the TB?

Duh, it's because the non-_synced tlb flush variants don't use that
for running on this CPU, they just call it directly. Okay that all
makes sense now. I think this series plus the below are good then.

Also it's possible some other archs that use _all_cpus_synced()
(arm, riscv, s390x) _may_ be racy. I had a quick look at sfence.vma
and ipte, and AFAIKS they're supposed to take immediate effect after
they execute.

Thanks,
Nick
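
For illustration only: this is a toy model, not QEMU code and not the
fixup patch mentioned above. It sketches the difference described in
this mail, assuming the non-synced variant calls the flush directly on
the source CPU while the synced variant queues it as deferred work on
every CPU, the source included. ToyCPU, queue_work, run_queued_work and
both flush_all_cpus* helpers are hypothetical names invented for the
sketch.

```python
from collections import deque


class ToyCPU:
    """Hypothetical stand-in for a vCPU; not QEMU's CPUState."""

    def __init__(self, name):
        self.name = name
        self.tlb_flushed = False   # becomes True once the flush has run
        self.work = deque()        # deferred work, drained only between TBs

    def queue_work(self, fn):
        # Defer fn; nothing runs until run_queued_work() is called.
        self.work.append(fn)

    def run_queued_work(self):
        # Models the point where the vCPU has left the current TB.
        while self.work:
            self.work.popleft()(self)


def flush(cpu):
    cpu.tlb_flushed = True


def flush_all_cpus(src, cpus):
    """Non-synced shape: queue on the other CPUs, call directly on src."""
    for cpu in cpus:
        if cpu is not src:
            cpu.queue_work(flush)
    flush(src)


def flush_all_cpus_synced(src, cpus):
    """Synced shape (as assumed here): queue on every CPU, src included."""
    for cpu in cpus:
        cpu.queue_work(flush)


def demo():
    a, b = ToyCPU("cpu0"), ToyCPU("cpu1")
    flush_all_cpus(a, [a, b])
    direct = a.tlb_flushed          # source already flushed, still "in the TB"

    c, d = ToyCPU("cpu0"), ToyCPU("cpu1")
    flush_all_cpus_synced(c, [c, d])
    before_exit = c.tlb_flushed     # flush still pending on the source
    c.run_queued_work()             # "TB ends", queued work is drained
    after_exit = c.tlb_flushed
    return direct, before_exit, after_exit
```

In this model the source CPU of the synced variant keeps running with a
stale TLB until it drains its queue, which is the window the original
problem sits in if the guest expects the flush to take effect
immediately after the instruction.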