Hi Will,

On Tue, Jan 22, 2019 at 05:44:02AM +0000, Will Deacon wrote:
> On Mon, Jan 21, 2019 at 02:21:28PM +0000, Catalin Marinas wrote:
> > On Sat, Jan 19, 2019 at 11:58:27PM +0000, Will Deacon wrote:
> > > On Thu, Jan 17, 2019 at 07:42:44AM +0000, chenwandun wrote:
> > > > Recently, I do some tests on linux-4.19 and hit a softlockup issue.
> > > >
> > > > I find some CPUs get the spinlock in the __split_huge_pmd function and
> > > > then send IPI to other CPUs, waiting the response, while several CPUs
> > > > enter the __split_huge_pmd function, want to get the spinlock, but
> > > > always in queued_spin_lock_slowpath,
> > > >
> > > > Because long time no response to the IPI, that results in a softlockup.
> > > >
> > > > As to sending IPI, it was in the patch
> > > > 3b8c9f1cdfc506e94e992ae66b68bbe416f89610. The patch is mean to send IPI
> > > > to each CPU after invalidating the I-cache for kernel mappings. In this
> > > > case, after modify pmd, it sends IPI to other CPUS to sync memory
> > > > mappings.
> > > >
> > > > No stable test case to repeat the result, it is hard to repeat the test
> > > > procedure.
> > > >
> > > > The environment is arm64, 64 CPUs. Except for idle CPU, there are 6 kind
> > > > of callstacks in total.
> > > >
> > > This looks like another lockup that would be solved if we deferred our
> > > I-cache invalidation when mapping user-executable pages, and instead
> > > performed the invalidation off the back of a UXN permission fault, where
> > > we could avoid holding any locks.
> >
> > Looking back at commit 3b8c9f1cdfc5 ("arm64: IPI each CPU after
> > invalidating the I-cache for kernel mappings"), the text implies that it
> > should only do this for kernel mappings. I don't think we need this for
> > user mappings. We have a few scenarios where we invoke set_pte_at() with
> > exec permission:
>
> Yes, I think you're right. I got confused because in this case we are
> invalidating lines written by the kernel, but actually it's not about who
> writes the data, but about whether or not the page table is being changed.

IIUC we may have a userspace problem analogous to the kernel modules
problem, if userspace uses dlopen/dlclose to dynamically load/unload
shared objects.

If userspace unloads an object, then loads another, the new object might
get placed at the same VA. A PE could have started speculating
instructions from the old object, and IIUC the TLB invalidation and
I-cache maintenance don't cause those instructions to be re-fetched from
the I-cache unless there's a context synchronization event.

Do we require the use of membarrier when loading or unloading objects? If
so, when does that happen relative to the unmap or map?

Thanks,
Mark.
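
P.S. To make the membarrier question concrete, here is a minimal userspace
sketch of the kind of call sequence it refers to. It assumes a kernel that
implements MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE (Linux 4.16 or later,
on an architecture that supports it). The helper names membarrier_call(),
sync_core_supported(), sync_core_register() and sync_core_all_threads() are
hypothetical and exist only for illustration. This is not a claim that
dlopen/dlclose do this today, nor that it is sufficient here; whether and
when it is needed is exactly the open question.

```c
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper around membarrier(2), invoked through syscall(2). */
static int membarrier_call(int cmd, unsigned int flags)
{
	return (int)syscall(__NR_membarrier, cmd, flags);
}

/*
 * Hypothetical helper: returns 1 if the running kernel reports support for
 * the private expedited SYNC_CORE commands, 0 otherwise (including when
 * the syscall itself is unavailable).
 */
static int sync_core_supported(void)
{
	int mask = membarrier_call(MEMBARRIER_CMD_QUERY, 0);
	int need = MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE |
		   MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE;

	return mask >= 0 && (mask & need) == need;
}

/*
 * Hypothetical helper: a process must register its intent once before it
 * may issue MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE.
 * Returns 0 on success, -1 on failure with errno set.
 */
static int sync_core_register(void)
{
	return membarrier_call(
		MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE, 0);
}

/*
 * Hypothetical helper: ask the kernel to make every running thread of this
 * process execute a core serializing instruction before it next returns
 * to userspace. Returns 0 on success, -1 on failure with errno set.
 */
static int sync_core_all_threads(void)
{
	return membarrier_call(MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, 0);
}
```

Under these assumptions, a loader would call sync_core_register() once at
startup and sync_core_all_threads() at some point around the munmap()/mmap()
of an object's text. Where exactly that call would have to sit relative to
the unmap or map is what I'm asking about above.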