On 7/19/19 3:45 PM, Luis Henriques wrote:
> Waiman Long <long...@redhat.com> writes:
>
>> On 7/19/19 2:45 PM, Luis Henriques wrote:
>>> On Mon, May 20, 2019 at 04:59:12PM -0400, Waiman Long wrote:
>>>> The rwsem->owner contains not just the task structure pointer, it also
>>>> holds some flags for storing the current state of the rwsem. Some of
>>>> the flags may have to be atomically updated. To reflect the new reality,
>>>> the owner is now changed to an atomic_long_t type.
>>>>
>>>> New helper functions are added to properly separate out the task
>>>> structure pointer and the embedded flags.
>>> I started seeing KASAN use-after-free with current master, and a bisect
>>> showed me that this commit 94a9717b3c40 ("locking/rwsem: Make
>>> rwsem->owner an atomic_long_t") was the problem. Does it ring any
>>> bells? I can easily reproduce it with xfstests (generic/464).
>>>
>>> Cheers,
>>> --
>>> Luís
>> This patch shouldn't change the behavior of the rwsem code. The code
>> only access data within the rw_semaphore structures. I don't know why it
>> will cause a KASAN error. I will have to reproduce it and figure out
>> exactly which statement is doing the invalid access.
> Yeah, screwing the bisection is something I've done in the past so I may
> have got the wrong commit. Another detail is that I was running
> xfstests against CephFS, I didn't tried with any other filesystem. I
> can try to reproduce with btrfs or xfs next week.
>
> Cheers,

Oh, I don't have a CephFS setup. Could you use scripts/decode_stacktrace.sh
to find the line number of the offending statement? That will help in
figuring out what has gone wrong.

Anyway, it seems like a structure that includes a rwsem is freed while
another CPU is still waiting to acquire the lock. It is probably a hidden
bug somewhere in the filesystem code that the recent changes in rwsem
behavior have made easier to trigger.

Cheers,
Longman