On Tue, Jul 1, 2014 at 6:05 PM, Cody P Schafer <d...@codyps.com> wrote:
> On Tue, Jul 1, 2014 at 4:04 PM, Chris Mason <c...@fb.com> wrote:
>> On 06/30/2014 07:42 PM, Cody P Schafer wrote:
>>> On Mon, Jun 30, 2014 at 1:30 PM, Chris Mason <c...@fb.com> wrote:
>>>> On 06/30/2014 02:11 PM, Chris Mason wrote:
>>>>> On 06/29/2014 04:02 PM, Cody P Schafer wrote:
>>>>>> On Fri, Jun 27, 2014 at 7:22 PM, Chris Samuel <ch...@csamuel.org> wrote:
>>>>>>> On Fri, 27 Jun 2014 05:20:41 PM Duncan wrote:
>>>>>>>
>>>>>>>> If I'm not mistaken the fix for the 3.16 series bug was:
>>>>>>>>
>>>>>>>> ea4ebde02e08558b020c4b61bb9a4c0fcf63028e
>>>>>>>>
>>>>>>>> Btrfs: fix deadlocks with trylock on tree nodes.
>>>>>>>
>>>>>>> That patch applies cleanly to 3.15.2 so if it is indeed the fix it should
>>>>>>> probably go to -stable for the next 3.15 release.
>>>>>>>
>>>>>>> Unfortunately my test system died a while ago (hardware problem) and I've not
>>>>>>> been able to resurrect it yet.
>>>>>>
>>>>>> I'm also seeing stuck tasks on btrfs (3.14.4, 3.15.1, 3.15.2).
>>>>>> I've also tried 3.15.2 with ea4ebde02e08558b020c4b61bb9a4c applied on
>>>>>> top with similar results.
>>>>>> I've been triggering the hang with 'rsync -hPaHAXx --del /mnt/home/a/
>>>>>> /home/a/' where /mnt/home and /home are 2 separate btrfs filesystems
>>>>>> on 2 separate disks.
>>>>>>
>>>>>> dmesg with w-trigger: 
>>>>>> http://bpaste.net/show/419555
>>>>>> --
>>>>>
>>>>> These traces show us waiting for IO, but it doesn't show anyone doing
>>>>> the IO.  Either we're failing to kick off our work queues or they are
>>>>> stuck on something else.
>>>>>
>>>>> Could you please send a sysrq-t and sysrq-l while you're stuck?  That
>>>>> will show us all the procs and all the CPUs.
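
For anyone else collecting the same dumps without a keyboard attached, a minimal sketch of triggering them from a root shell follows. It assumes the standard /proc/sysrq-trigger interface; the helper name trigger_sysrq and its trigger_path parameter are made up for illustration and are not from this thread. Each key goes out as its own write, since the kernel has traditionally acted only on the first character of each write.

```python
def trigger_sysrq(keys, trigger_path="/proc/sysrq-trigger"):
    """Send each sysrq key in `keys` as a separate write.

    Hypothetical helper: needs root when pointed at the real
    /proc/sysrq-trigger. 'w' dumps blocked tasks, 't' dumps all
    tasks, 'l' dumps backtraces of all active CPUs.
    """
    # buffering=0 makes every write() below its own write(2) call,
    # so the keys are not merged into a single buffer.
    with open(trigger_path, "wb", buffering=0) as trigger:
        for key in keys:
            trigger.write(key.encode("ascii"))
```

Calling trigger_sysrq("wtl") while the rsync is stuck should leave the dumps in the kernel log, where dmesg can pick them up.
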
>>>>
>>>> Also, do you have any nodatacow files in here?  Please say yes.
>>>>
>>>
>>> kernel log from 3.15.2 + ea4ebde02 showing the blocked tasks,
>>> sysrq-{w,t,l} included
>>> http://bpaste.net/show/423296/
>>>
>>> I haven't explicitly created any nodatacow files, is there a quick
>>> way to tell if there are any? Right now I'm doing
>>> `lsattr -R /mnt/home/a/ 2>/dev/null | grep -- '^-*C-* '` to try and check.
>>>
>>> (2>/dev/null is hiding lots of "Operation not supported While reading
>>> flags on" warnings)
>>>
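
One caveat on the grep pattern quoted above: '^-*C-* ' only matches when C is the sole attribute set, so a file that also carries another flag (c for compression, say) would slip through. A rough sketch of a looser check is below. It assumes lsattr -R prints one "flags path" pair per line with the flags in the first space-separated field; the names nocow_paths and find_nocow are invented for this example.

```python
import re
import subprocess

# Flags field as printed by lsattr: letters and dashes only.
FLAGS_RE = re.compile(r"^[-A-Za-z]+$")

def nocow_paths(lsattr_output):
    """Return paths whose flags field contains 'C' (No_COW)."""
    hits = []
    for line in lsattr_output.splitlines():
        flags, sep, path = line.partition(" ")
        # Skip blank lines and the "dir:" headers that -R emits.
        # Sketch only: a header containing a space could still fool this.
        if not sep or not FLAGS_RE.match(flags):
            continue
        if "C" in flags:
            hits.append(path)
    return hits

def find_nocow(root):
    """Run lsattr -R on root, dropping its stderr warnings."""
    result = subprocess.run(
        ["lsattr", "-R", root],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
        errors="replace",
    )
    return nocow_paths(result.stdout)
```

find_nocow("/mnt/home/a/") would then return the list of paths flagged C, or an empty list if there are none.
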
>>
>> If you haven't turned nodatacow on intentionally, you don't have any
>> nodatacow files ;)  I have been trying to reproduce this with rsync and
>> other code that hammers on the ordered writeback, but no luck yet.
>>
>> Before we spend too much time triggering it again, I'd like you to
>> please try a patch from Filipe that is in current mainline.  I've
>> cherry-picked it on top of 3.15.3 in a branch called v3.15.y:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git v3.15.y
>
> Will do. The rsync I'm running is processing a lot of chromium cache
> files when it hangs (just for a reference), and ends up triggering a
> bunch of deletes as well.

Still a problem with your v3.15.y (eb97581). Here's the log with
sysrq-t and sysrq-l:
http://bpaste.net/show/428234/

Also, correction, it's a firefox cache dir rsync that seems to trigger
it (stalls pretty early on and very consistently):

[... snip ...]
.cache/mozilla/firefox/kqtl1tlc.test/Cache/7/1F/F43F9d01
          5.23M 100%   17.82MB/s    0:00:00 (xfr#452, ir-chk=1201/6659)
.cache/mozilla/firefox/kqtl1tlc.test/Cache/7/20/
.cache/mozilla/firefox/kqtl1tlc.test/Cache/7/20/23A66d01
        116.82K 100%  376.50kB/s    0:00:00 (xfr#453, ir-chk=1200/6659)
.cache/mozilla/firefox/kqtl1tlc.test/Cache/7/21/
.cache/mozilla/firefox/kqtl1tlc.test/Cache/7/23/
.cache/mozilla/firefox/kqtl1tlc.test/Cache/7/24/
.cache/mozilla/firefox/kqtl1tlc.test/Cache/7/25/
.cache/mozilla/firefox/kqtl1tlc.test/Cache/7/25/7C836d01
[... stall here ...]