On 06/18/2014 07:30 PM, Waiman Long wrote:
> On 06/18/2014 07:27 PM, Chris Mason wrote:
>> On 06/18/2014 07:19 PM, Waiman Long wrote:
>>> On 06/18/2014 07:10 PM, Josef Bacik wrote:
>>>>
>>>> On 06/18/2014 03:47 PM, Waiman Long wrote:
>>>>> On 06/18/2014 06:27 PM, Josef Bacik wrote:
>>>>>>
>>>>>> On 06/18/2014 03:17 PM, Waiman Long wrote:
>>>>>>> On 06/18/2014 04:57 PM, Marc Dionne wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I've been seeing very reproducible soft lockups with 3.16-rc1
>>>>>>>> similar to what is reported here:
>>>>>>>> http://marc.info/?l=linux-btrfs&m=140290088532203&w=2
>>>>>>>> along with the occasional hard lockup, making it impossible to
>>>>>>>> complete a parallel build on a btrfs filesystem for the package I
>>>>>>>> work on.  This was working fine just a few days before rc1.
>>>>>>>>
>>>>>>>> Bisecting brought me to the following commit:
>>>>>>>>
>>>>>>>>     commit bd01ec1a13f9a327950c8e3080096446c7804753
>>>>>>>>     Author: Waiman Long<waiman.l...@hp.com>
>>>>>>>>     Date:   Mon Feb 3 13:18:57 2014 +0100
>>>>>>>>
>>>>>>>>         x86, locking/rwlocks: Enable qrwlocks on x86
>>>>>>>>
>>>>>>>> And sure enough if I revert that commit on top of current mainline,
>>>>>>>> I'm unable to reproduce the soft lockups and hangs.
>>>>>>>>
>>>>>>>> Marc
>>>>>>> The queue rwlock is fair. As a result, recursive read_lock is not
>>>>>>> allowed unless the task is in an interrupt context. Doing a recursive
>>>>>>> read_lock will hang the process when a write_lock happens somewhere
>>>>>>> in between. Is recursive read_lock being done in the btrfs code?
>>>>>>>
>>>>>> We walk down a tree and read lock each node as we walk down, is that
>>>>>> what you mean?  Or do you mean read_lock multiple times on the same
>>>>>> lock in the same process, cause we definitely don't do that.  Thanks,
>>>>>>
>>>>>> Josef
>>>>> I meant recursively read_lock the same lock in a process.
>>>> I take it back, we do actually do this in some cases.  Thanks,
>>>>
>>>> Josef
>>> This is what I thought when I looked at the locking code in btrfs. The
>>> unlock code doesn't clear the lock_owner pid, which may cause
>>> lock_nested to be set incorrectly.
>>>
>>> Anyway, are you going to do something about it?
>> Thanks for reporting this, we shouldn't be actually taking the lock
>> recursively.  Could you please try with lockdep enabled?  If the problem
>> goes away with lockdep on, I think I know what's causing it.  Otherwise,
>> lockdep should clue us in.
>>
>> -chris
> 
> I am not sure if lockdep will report a recursive read_lock, as this was
> allowed in the past. If not, we certainly need to add that capability
> to it.
> 
> One more thing, I saw a comment in the btrfs tree locking code about
> taking a read lock after taking a write (partial?) lock. That is not
> possible even with the old rwlock code.

With lockdep on, the clear_path_blocking function you're hitting
softlockups in is different.  Fujitsu hit a similar problem during
quota rescans, and it goes away with lockdep on.  I'm trying to nail
down where we went wrong, but please try lockdep on.

-chris
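
To make the hang Waiman describes concrete, here is a minimal user-space
sketch in C. It is a toy model, not the kernel's qrwlock and not btrfs
code: the names toy_rwlock, toy_read_trylock, toy_write_trylock and
recursive_read_deadlocks are all made up for illustration, and a "try"
call returning false stands in for the point where a real lock would
spin. The only thing it models is the rule from the thread: a fair lock
makes new readers wait behind a queued writer, while the old
reader-preferring rwlock lets them in. The interrupt-context exception
is not modeled.

```c
#include <stdbool.h>

/* Toy model only: hypothetical names, not the kernel's qrwlock. */
enum rw_policy { RW_UNFAIR, RW_FAIR };

struct toy_rwlock {
    enum rw_policy policy;
    int readers;          /* tasks currently holding the lock for read */
    bool writer_held;     /* a writer currently owns the lock */
    int writers_waiting;  /* writers queued behind the current holders */
};

/* Returns false at the point where a real read_lock would spin. */
static bool toy_read_trylock(struct toy_rwlock *l)
{
    if (l->writer_held)
        return false;
    /* Fair policy: a new reader queues behind any waiting writer. */
    if (l->policy == RW_FAIR && l->writers_waiting > 0)
        return false;
    l->readers++;
    return true;
}

/* Returns false at the point where a real write_lock would spin;
 * a failed attempt leaves the writer queued. */
static bool toy_write_trylock(struct toy_rwlock *l)
{
    if (l->writer_held || l->readers > 0) {
        l->writers_waiting++;
        return false;
    }
    l->writer_held = true;
    return true;
}

/*
 * Replay the sequence from the thread:
 *   task A: read_lock          (succeeds)
 *   task B: write_lock         (has to wait for A)
 *   task A: read_lock again on the same lock
 * Returns true when the last step would block. Task A still holds the
 * read lock that B is waiting on, so if A blocks behind B neither task
 * can make progress.
 */
bool recursive_read_deadlocks(enum rw_policy policy)
{
    struct toy_rwlock l = { .policy = policy };

    bool a_first  = toy_read_trylock(&l);
    bool b_write  = toy_write_trylock(&l);
    bool a_second = toy_read_trylock(&l);

    return a_first && !b_write && !a_second;
}
```

In this model recursive_read_deadlocks(RW_FAIR) returns true and
recursive_read_deadlocks(RW_UNFAIR) returns false, which is consistent
with the hangs appearing only once qrwlocks were enabled.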

