On Wed, Jun 18, 2014 at 7:53 PM, Chris Mason <c...@fb.com> wrote:
> On 06/18/2014 07:30 PM, Waiman Long wrote:
>> On 06/18/2014 07:27 PM, Chris Mason wrote:
>>> On 06/18/2014 07:19 PM, Waiman Long wrote:
>>>> On 06/18/2014 07:10 PM, Josef Bacik wrote:
>>>>>
>>>>> On 06/18/2014 03:47 PM, Waiman Long wrote:
>>>>>> On 06/18/2014 06:27 PM, Josef Bacik wrote:
>>>>>>>
>>>>>>> On 06/18/2014 03:17 PM, Waiman Long wrote:
>>>>>>>> On 06/18/2014 04:57 PM, Marc Dionne wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I've been seeing very reproducible soft lockups with 3.16-rc1,
>>>>>>>>> similar to what is reported here:
>>>>>>>>> http://marc.info/?l=linux-btrfs&m=140290088532203&w=2
>>>>>>>>> , along with the occasional hard lockup, making it impossible to
>>>>>>>>> complete a parallel build on a btrfs filesystem for the package I
>>>>>>>>> work on.  This was working fine just a few days before rc1.
>>>>>>>>>
>>>>>>>>> Bisecting brought me to the following commit:
>>>>>>>>>
>>>>>>>>>     commit bd01ec1a13f9a327950c8e3080096446c7804753
>>>>>>>>>     Author: Waiman Long<waiman.l...@hp.com>
>>>>>>>>>     Date:   Mon Feb 3 13:18:57 2014 +0100
>>>>>>>>>
>>>>>>>>>         x86, locking/rwlocks: Enable qrwlocks on x86
>>>>>>>>>
>>>>>>>>> And sure enough if I revert that commit on top of current mainline,
>>>>>>>>> I'm unable to reproduce the soft lockups and hangs.
>>>>>>>>>
>>>>>>>>> Marc
>>>>>>>> The queue rwlock is fair. As a result, recursive read_lock is not
>>>>>>>> allowed unless the task is in an interrupt context. Doing a recursive
>>>>>>>> read_lock will hang the process when a write_lock happens somewhere
>>>>>>>> in between. Is recursive read_lock being done in the btrfs code?
>>>>>>>>
>>>>>>> We walk down a tree and read lock each node as we walk down, is that
>>>>>>> what you mean?  Or do you mean read_lock multiple times on the same
>>>>>>> lock in the same process, cause we definitely don't do that.  Thanks,
>>>>>>>
>>>>>>> Josef
>>>>>> I meant recursively read_lock the same lock in a process.
>>>>> I take it back, we do actually do this in some cases.  Thanks,
>>>>>
>>>>> Josef
>>>> This is what I thought when I looked at the locking code in btrfs. The
>>>> unlock code doesn't clear the lock_owner pid, which may cause
>>>> lock_nested to be set incorrectly.
>>>>
>>>> Anyway, are you going to do something about it?
>>> Thanks for reporting this, we shouldn't actually be taking the lock
>>> recursively.  Could you please try with lockdep enabled?  If the problem
>>> goes away with lockdep on, I think I know what's causing it.  Otherwise,
>>> lockdep should clue us in.
>>>
>>> -chris
>>
>> I am not sure if lockdep will report recursive read_lock, as this was
>> possible in the past. If not, we certainly need to add that capability
>> to it.
>>
>> One more thing: I saw a comment in the btrfs tree locking code about
>> taking a read lock after taking a write (partial?) lock. That is not
>> possible even with the old rwlock code.
>
> With lockdep on, the clear_path_blocking function you're hitting
> softlockups in is different.  Fujitsu hit a similar problem during
> quota rescans, and it goes away with lockdep on.  I'm trying to nail
> down where we went wrong, but please try lockdep on.
>
> -chris

With lockdep on I'm unable to reproduce the lockups, and there are no
lockdep warnings.

Marc
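
For readers following the thread, the hang described above (a recursive
read_lock on a fair rwlock with a write_lock arriving in between) can be
sketched with a toy, single-threaded model. This is only an illustration
under stated assumptions: ToyRWLock, request() and
recursive_read_deadlocks() are hypothetical names, not kernel or btrfs
code, and the interrupt-context exception is not modelled. The "fair"
mode makes new readers queue behind any waiter, while the "unfair" mode
lets readers in whenever no writer holds the lock.

```python
from collections import deque


class ToyRWLock:
    """Toy model of an rwlock; it only tracks who holds and who waits."""

    def __init__(self, fair):
        self.fair = fair
        self.readers = []        # tasks holding the read lock (repeats = recursion)
        self.writer = None       # task holding the write lock, if any
        self.waiters = deque()   # (task, mode) in arrival order

    def request(self, task, mode):
        """Return True if granted at once, False if the task has to wait."""
        if mode == "read":
            # A fair lock makes new readers queue behind anyone already waiting.
            blocked = self.writer is not None or (self.fair and bool(self.waiters))
            if not blocked:
                self.readers.append(task)
                return True
        else:
            if self.writer is None and not self.readers and not self.waiters:
                self.writer = task
                return True
        self.waiters.append((task, mode))
        return False


def recursive_read_deadlocks(fair):
    """Task A read-locks, task B asks for the write lock, A read-locks again."""
    lock = ToyRWLock(fair)
    assert lock.request("A", "read")        # A holds the read lock
    assert not lock.request("B", "write")   # B must wait for A to unlock
    granted = lock.request("A", "read")     # A's recursive read_lock
    # If A is not granted, A waits behind B while B waits for A: a deadlock.
    return not granted


print("fair lock deadlocks:  ", recursive_read_deadlocks(fair=True))
print("unfair lock deadlocks:", recursive_read_deadlocks(fair=False))
```

In this model the fair lock reports a deadlock and the unfair one does
not, which would be consistent with the lockups going away when the
qrwlock commit is reverted, assuming btrfs does take the same read lock
recursively in some paths.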