On 7/19/19 3:45 PM, Luis Henriques wrote:
> Waiman Long <long...@redhat.com> writes:
>
>> On 7/19/19 2:45 PM, Luis Henriques wrote:
>>> On Mon, May 20, 2019 at 04:59:12PM -0400, Waiman Long wrote:
>>>> The rwsem->owner contains not just the task structure pointer, it also
>>>> holds some flags for storing the current state of the rwsem. Some of
>>>> the flags may have to be atomically updated. To reflect the new reality,
>>>> the owner is now changed to an atomic_long_t type.
>>>>
>>>> New helper functions are added to properly separate out the task
>>>> structure pointer and the embedded flags.
>>> I started seeing KASAN use-after-free with current master, and a bisect
>>> showed me that this commit 94a9717b3c40 ("locking/rwsem: Make
>>> rwsem->owner an atomic_long_t") was the problem.  Does it ring any
>>> bells?  I can easily reproduce it with xfstests (generic/464).
>>>
>>> Cheers,
>>> --
>>> Luís
>> This patch shouldn't change the behavior of the rwsem code. The code
>> only accesses data within the rw_semaphore structures. I don't know why
>> it would cause a KASAN error. I will have to reproduce it and figure out
>> exactly which statement is doing the invalid access.
> Yeah, screwing up the bisection is something I've done in the past, so I
> may have got the wrong commit.  Another detail is that I was running
> xfstests against CephFS; I haven't tried with any other filesystem.  I
> can try to reproduce with btrfs or xfs next week.
>
> Cheers,

Oh, I don't have a CephFS setup. Could you run the report through
scripts/decode_stacktrace.sh to find the line number of the offending
statement? That will help in figuring out what has gone wrong.

Anyway, it seems like a structure that includes a rwsem is freed while
another CPU is still waiting to acquire the lock. It is probably a
hidden bug somewhere in the filesystem code that the recent changes in
rwsem behavior make easier to trigger.
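
To illustrate the kind of lifetime rule I have in mind, here is a rough
userspace sketch, not the actual filesystem or rwsem code. All names
(struct obj, obj_get(), obj_put(), obj_lock()) are hypothetical, and an
atomic_flag spinlock stands in for the embedded rwsem. The point is only
that a waiter has to hold its own reference for as long as it may touch
the lock, so the last obj_put() cannot happen underneath it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical object embedding a lock; stands in for a filesystem
 * structure that contains a rw_semaphore. */
struct obj {
	atomic_int refcount;	/* users that may still touch the object */
	atomic_flag lock;	/* stand-in for the embedded rwsem */
	bool *freed;		/* set on free so a caller can observe it */
};

/* Allocate an object with one reference held by the caller. */
static struct obj *obj_alloc(bool *freed)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		return NULL;
	atomic_init(&o->refcount, 1);
	atomic_flag_clear(&o->lock);
	o->freed = freed;
	*freed = false;
	return o;
}

/* Take an extra reference; the caller must already hold one. */
static void obj_get(struct obj *o)
{
	int old = atomic_fetch_add(&o->refcount, 1);

	assert(old > 0);
}

/* Drop a reference; returns true if this call freed the object. */
static bool obj_put(struct obj *o)
{
	if (atomic_fetch_sub(&o->refcount, 1) != 1)
		return false;
	*o->freed = true;
	free(o);
	return true;
}

/* Spin until the lock is acquired; only safe while a reference is held. */
static void obj_lock(struct obj *o)
{
	while (atomic_flag_test_and_set(&o->lock))
		;
}

static void obj_unlock(struct obj *o)
{
	atomic_flag_clear(&o->lock);
}
```

A waiter would call obj_get() before obj_lock() and obj_put() only after
obj_unlock(). If the owner drops its reference in between, the object
stays allocated until the waiter is done with the lock.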

Cheers,
Longman
