On Fri, Jan 18, 2019 at 10:58 AM Krishna Mannem <kman...@pivotal.io> wrote:
>
> Hi,
>
> I work on Concourse-CI (https://concourse-ci.org/). It's a container
> based CI system where we create volumes using Btrfs. Due to the nature
> of Concourse, btrfs subvolumes are short-lived (sometimes a few
> seconds if it's a small automation task, sometimes hours if it's a long
> build and test suite). We ran into a failure and we're not really sure
> what to make of it. Looks like a lock failed to get acquired before
> scheduling? We need another eye on this.
>
> >uname -a
> Linux a06d0ae3-b242-4518-85ba-977426a3f214 4.15.0-43-generic
> #46~16.04.1-Ubuntu SMP Fri Dec 7 13:31:08 UTC 2018 x86_64 x86_64
> x86_64 GNU/Linux
>
> > btrfs df
> btrfs filesystem df /var/vcap/data/baggageclaim/volumes
> Data, single: total=263.01GiB, used=262.59GiB
> System, DUP: total=8.00MiB, used=48.00KiB
> Metadata, DUP: total=9.00GiB, used=6.60GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> Calltrace:
> https://gist.github.com/kcmannem/3baf848845bc986f5ac7d23d3df51e56
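
As a side note, the "btrfs filesystem df" figures quoted above are easier to
compare as used/total percentages. Below is a minimal sketch in Python; the
function name usage_percent and the regular expression are only an
illustration (not part of btrfs-progs) and assume exactly the line format
shown in the report.

```python
import re

# Output copied verbatim from the report above.
BTRFS_DF = """\
Data, single: total=263.01GiB, used=262.59GiB
System, DUP: total=8.00MiB, used=48.00KiB
Metadata, DUP: total=9.00GiB, used=6.60GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
"""

# Binary unit suffixes as printed by btrfs-progs in the sample above.
UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}

# Assumed line shape: "<Type>, <profile>: total=<n><unit>, used=<n><unit>"
LINE_RE = re.compile(
    r"^(?P<kind>\w+), (?P<profile>\w+): "
    r"total=(?P<total>[\d.]+)(?P<tunit>(?:[KMGT]i)?B), "
    r"used=(?P<used>[\d.]+)(?P<uunit>(?:[KMGT]i)?B)$"
)


def usage_percent(text):
    """Return {block group type: used/total in percent} for each parsed line."""
    result = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip anything that does not look like a df line
        total = float(m["total"]) * UNITS[m["tunit"]]
        used = float(m["used"]) * UNITS[m["uunit"]]
        result[m["kind"]] = 100.0 * used / total if total else 0.0
    return result


if __name__ == "__main__":
    for kind, pct in usage_percent(BTRFS_DF).items():
        print(f"{kind}: {pct:.2f}% of allocated space used")
```

Note that "total" here is the space already allocated to that block group
type, not the size of the device, so a high percentage does not by itself
mean the filesystem is full.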

Can you please run "echo w > /proc/sysrq-trigger" and post the resulting
kernel log (dmesg), so that developers can get a good understanding of
what's going on?

From the above info,
1. mm->mmap_sem is held,
2. a transaction is being committed and others are waiting for it,
3. subvolumes are deleted within a short time, which may keep the
   btrfs-cleaner thread very busy.

thanks,
liubo