On Sat, Aug 2, 2014 at 8:28 PM, Mitch Harder
<mitch.har...@sabayonlinux.org> wrote:
> On Sat, Aug 2, 2014 at 6:35 PM, Peter Waller <pe...@scraperwiki.com> wrote:
>> Hi All,
>>
>> My TL;DR questions are at the bottom, before the stack trace.
>>
>> I'm running Ubuntu 14.04. I wonder if this problem is related to the
>> thread titled "Machine lockup due to btrfs-transaction on AWS EC2
>> Ubuntu 14.04" which I started on the 29th of July:
>>
>>> http://thread.gmane.org/gmane.comp.file-systems.btrfs/37224
>>
>> Kernel: 3.15.7-031507-generic
>>
>> I'm on a single block device system, i.e., no RAID.
>>
>> I was observing ENOSPC from `mkdir` and `rename` on this system, with
>> a good amount of free disk space (df -h reports 62 GB remain). I added
>> enospc_debug (full umount/mount, not just mount -o remount), but this
>> had no apparent effect when receiving ENOSPC from userland.
>>
>> $ sudo btrfs fi df /path/to/volume
>> Data, single: total=489.97GiB, used=427.75GiB
>> System, DUP: total=8.00MiB, used=60.00KiB
>> System, single: total=4.00MiB, used=0.00
>> Metadata, DUP: total=5.00GiB, used=4.50GiB
>> Metadata, single: total=8.00MiB, used=0.00
>> unknown, single: total=512.00MiB, used=820.00KiB
>>
>> After a thorough search of the internet for ENOSPC BTRFS I found
>> various resources and came to understand a little bit more. One thing
>> which broke my intuition severely is that I expected that, with a
>> large number of GiB free, things would continue to work.
>>
>> In this case, for example, metadata has 0.5GiB free ("sounds like
>> plenty for metadata for one mkdir to me"). Data has 62GiB free. Why
>> would I get ENOSPC for a file rename?
>>
>> I expected that if metadata needed more space, it would just eat it
>> from the 'data'. Now I believe this not to be the case and that it
>> wanted to allocate > 0.5GiB, and this is why I was getting ENOSPC.
>>
>> I tried a rebalance with btrfs balance start -dusage=10 and tried
>> increasing the value until I saw reallocations in dmesg.
>>
>> This spat out a large number of messages in dmesg, of this form:
>>
>>> [376096.546353] BTRFS info (device dm-0): relocating block group
>>> 530457821184 flags 1
>>> [376010.736879] BTRFS info (device dm-0): 40 enospc errors during balance
>>
>> (and a full stack trace at the end of this message).
>>
>> The rebalance printed:
>>
>>> ERROR: error during balancing '/path/to/volume' - No space left on device
>>> There may be more info in syslog - try dmesg | tail
>>
>> Eventually, not knowing what else to do I had to take my escape hatch
>> and enlarge the volume. When I did this, metadata grew by 1GiB:
>>
>>> Data, single: total=490.97GiB, used=427.75GiB
>>> System, DUP: total=8.00MiB, used=60.00KiB
>>> System, single: total=4.00MiB, used=0.00
>>> Metadata, DUP: total=5.50GiB, used=4.50GiB
>>> Metadata, single: total=8.00MiB, used=0.00
>>> unknown, single: total=512.00MiB, used=0.00
>>
>> A few questions:
>>
>> * Why didn't the metadata grow before enlarging the disk?
>> * Why didn't the rebalance enable the metadata to grow?
>> * Why is it necessary to rebalance? Can't it automatically take some
>> free space from 'data'?
>> * Are my machine lockups related to the fact I was low on space?
>> * Can we improve the documentation/FAQ for this? I was scratching my
>> head in particular because my notion of free space definitely does not
>> match up with BTRFS', and I didn't find the FAQ very helpful for
>> getting out of this mess.
>> * It isn't documented on the wiki what enospc_debug is supposed to do,
>> so I couldn't tell whether I should have expected it to tell me
>> anything in my circumstances.
>> * What is the best course of action to take (other than enlarging the
>> disk or deleting files) if I encounter this situation again?
>>
>
> Looking at this line:
>
>> Data, single: total=489.97GiB, used=427.75GiB
>
> I see that btrfs has allocated almost the entire disk to Data, and it
> appears you are starved for Metadata room.
>
> Once btrfs allocates space for either Data or Metadata, there are
> currently no built-in kernel mechanisms to reallocate that space. We
> have to use the userland balance tools.
>
> I agree that this behavior can become a "gotcha". Btrfs has the
> capability to run in a mode where Data and Metadata are combined, but
> there is a speed penalty running in Mixed Data/Metadata mode.
>
> The btrfs balance tools have the ability to use filters to run a
> quicker pass on just the mostly-empty block groups, skipping a full
> balance.
>
> https://btrfs.wiki.kernel.org/index.php/Balance_Filters
>
> I would suggest this as the next step.
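
For anyone following along, a filtered pass like the one suggested above
could be scripted roughly as follows. This is only a sketch: the helper
names and the step values (5, 10, 25, 50) are my own assumptions, the
only btrfs command used is `btrfs balance start -dusage=N <path>` as
quoted earlier in the thread, and it needs root to do anything real.

```python
import subprocess


def balance_command(mountpoint, usage_percent):
    """Build the argv for a data-only balance limited to block groups
    that are at most `usage_percent` percent full."""
    return ["btrfs", "balance", "start", f"-dusage={usage_percent}", mountpoint]


def stepped_balance(mountpoint, steps=(5, 10, 25, 50), run=subprocess.run):
    """Run balance with increasing usage thresholds (hypothetical helper).

    Stops at the first non-zero exit status, for example on ENOSPC.
    Returns a list of (usage_percent, returncode) pairs.
    `run` is injectable so the loop can be exercised without btrfs.
    """
    results = []
    for pct in steps:
        proc = run(balance_command(mountpoint, pct))
        results.append((pct, proc.returncode))
        if proc.returncode != 0:
            break
    return results
```

Starting with a low threshold keeps each pass short, since only the
emptiest data block groups are relocated first.
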
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
Mitch,
I have run into this error too, and it seems to be a rather big issue, as
ext4 never seems to run out of metadata room, at least in my testing. I feel
strongly that this part of btrfs needs to be improved, with metadata
rebalancing moved into a function or set of functions in the kernel itself.
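
To make the numbers concrete, here is a minimal Python sketch (my own
illustration, not part of btrfs-progs; the function names are made up)
that parses the `btrfs fi df` output Peter posted and works out how much
allocated-but-unused space each chunk type has.

```python
import re

# `btrfs fi df` output as quoted earlier in this thread.
FI_DF = """\
Data, single: total=489.97GiB, used=427.75GiB
System, DUP: total=8.00MiB, used=60.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=5.00GiB, used=4.50GiB
Metadata, single: total=8.00MiB, used=0.00
unknown, single: total=512.00MiB, used=820.00KiB
"""

UNITS = {"": 1, "B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}
LINE = re.compile(r"^(\w+), (\w+): total=([\d.]+)(\w*), used=([\d.]+)(\w*)$")


def parse_fi_df(text):
    """Return a list of (kind, profile, total_bytes, used_bytes)."""
    rows = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue  # skip anything that is not a chunk-type line
        kind, profile, total, total_unit, used, used_unit = m.groups()
        rows.append((kind, profile,
                     float(total) * UNITS[total_unit],
                     float(used) * UNITS[used_unit]))
    return rows


def slack_gib(rows, kind, profile):
    """Allocated-but-unused space for one chunk type, in GiB."""
    for k, p, total, used in rows:
        if k == kind and p == profile:
            return (total - used) / 2**30
    raise KeyError((kind, profile))


if __name__ == "__main__":
    rows = parse_fi_df(FI_DF)
    for kind, profile, total, used in rows:
        print(f"{kind:8} {profile:6} slack={(total - used) / 2**30:8.2f} GiB")
```

With Peter's figures the 62 GiB of "free" space sits inside chunks already
allocated to Data, while Metadata has only 0.50 GiB of slack in its own
chunks, which matches what Mitch pointed out.
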
Regards Nick