Hi list,

On 05/31/2016 03:36 AM, Qu Wenruo wrote:
> Hans van Kranenburg wrote on 2016/05/06 23:28 +0200:
>> Hi,
>>
>> I've got a mostly inactive btrfs filesystem inside a virtual machine
>> somewhere that shows interesting behaviour: while no interesting disk
>> activity is going on, btrfs keeps allocating new chunks, a GiB at a time.
>>
>> A picture, telling more than 1000 words:
>> https://syrinx.knorrie.org/~knorrie/btrfs/keep/btrfs_usage_ichiban.png
>> (when the amount of allocated/unused goes down, I did a btrfs balance)
>
> Nice picture.
> Really better than 1000 words.
>
> AFAIK, the problem may be caused by fragmentation.
>
> I even saw some early prototypes in the code to allow btrfs to do
> allocations smaller than requested.
> (E.g. the caller needs a 2M extent, but btrfs returns two 1M extents.)
>
> But it's still a prototype and it seems no one is really working on it
> now.
>
> So when btrfs is writing new data, for example about 16M of it, it will
> need to allocate a 16M contiguous extent, and if it can't find large
> enough free space to allocate from, it creates a new data chunk.
>
> Despite the already awesome chunk level usage picture, I hope there is
> info about extent level allocation to confirm my assumption.
>
> You could dump it by calling "btrfs-debug-tree -t 2 <device>".
> It's normally recommended to do that unmounted, but it's still possible
> on a mounted filesystem, although the result won't be 100% consistent.
> (Then I'd better find a good way to draw a picture of
> allocated/unallocated space and how fragmented the chunks are)

So, I finally found some spare time to continue investigating. In the meantime, the filesystem has happily kept allocating new chunks every few days, filling them with data to well below 10% before starting a new one.

The chunk allocation seems to happen primarily during cron.daily. But manually executing all the cronjobs in there, even multiple times, does not result in newly allocated chunks. Yay. :(

After the previous post, I put a little script in between every two jobs in /etc/cron.daily that prints the output of btrfs fi df to syslog and then sleeps for 10 minutes, so I can easily find out afterwards during which job it happened.
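For reference, a minimal sketch of such a marker script (the exact script doesn't matter; this assumes Python 3 and the filesystem mounted at /):

#!/usr/bin/env python3
# Log the current chunk allocation to syslog, then sleep, so that a jump
# in "total" can be attributed to the cron.daily job that ran before it.
import subprocess
import syslog
import time

output = subprocess.check_output(['btrfs', 'filesystem', 'df', '/'])
for line in output.decode().splitlines():
    syslog.syslog(line)
time.sleep(600)  # 10 minutes of quiet time between two jobs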

Bingo! It's the "apt" cron.daily job, which refreshes package lists and triggers unattended-upgrades.

Jun 7 04:01:46 ichiban root: Data, single: total=12.00GiB, used=5.65GiB
[...]
2016-06-07 04:01:56,552 INFO Starting unattended upgrades script
[...]
Jun 7 04:12:10 ichiban root: Data, single: total=13.00GiB, used=5.64GiB

And this thing is clever enough to only do its work once a day, even if you execute it multiple times... (Hehehe...)

Ok, let's try doing some apt-get update then.

Today, the latest added chunks look like this:

# ./show_usage.py /
[...]
chunk vaddr 63495471104 type 1 stripe 0 devid 1 offset 9164554240 length 1073741824 used 115499008 used_pct 10
chunk vaddr 64569212928 type 1 stripe 0 devid 1 offset 12079595520 length 1073741824 used 36585472 used_pct 3
chunk vaddr 65642954752 type 1 stripe 0 devid 1 offset 14227079168 length 1073741824 used 17510400 used_pct 1
chunk vaddr 66716696576 type 4 stripe 0 devid 1 offset 3275751424 length 268435456 used 72663040 used_pct 27
chunk vaddr 66985132032 type 1 stripe 0 devid 1 offset 15300820992 length 1073741824 used 86986752 used_pct 8
chunk vaddr 68058873856 type 1 stripe 0 devid 1 offset 16374562816 length 1073741824 used 21188608 used_pct 1
chunk vaddr 69132615680 type 1 stripe 0 devid 1 offset 17448304640 length 1073741824 used 64032768 used_pct 5
chunk vaddr 70206357504 type 1 stripe 0 devid 1 offset 18522046464 length 1073741824 used 71712768 used_pct 6
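As an aside: the "used" number per chunk (type 1 is data, type 4 is metadata) comes straight from the BLOCK_GROUP_ITEM for that chunk in the extent tree. This is not the actual show_usage.py, just a minimal sketch of fetching that one item with the TREE_SEARCH ioctl, assuming Python 3, the filesystem mounted at /, and the struct layouts from ctree.h/ioctl.h:

#!/usr/bin/env python3
# Sketch: fetch the "used" counter for one chunk. It lives in the
# BLOCK_GROUP_ITEM (key type 192) in the extent tree, keyed
# (chunk vaddr, 192, chunk length).
import fcntl
import os
import struct
import sys

BTRFS_IOC_TREE_SEARCH = 0xD0009411  # _IOWR(0x94, 17, 4096-byte args)
EXTENT_TREE = 2
BLOCK_GROUP_ITEM_KEY = 192

def block_group_used(fd, vaddr):
    args = bytearray(4096)
    # btrfs_ioctl_search_key: tree, min/max objectid, min/max offset,
    # min/max transid, min/max type, nr_items, 36 unused bytes
    struct.pack_into('=7Q4L4Q', args, 0,
                     EXTENT_TREE, vaddr, vaddr, 0, 2**64 - 1, 0, 2**64 - 1,
                     BLOCK_GROUP_ITEM_KEY, BLOCK_GROUP_ITEM_KEY,
                     1, 0, 0, 0, 0, 0)
    fcntl.ioctl(fd, BTRFS_IOC_TREE_SEARCH, args)
    nr_items = struct.unpack_from('=L', args, 64)[0]
    if nr_items == 0:
        return None
    # skip the 104-byte key and the 32-byte result header, then
    # btrfs_block_group_item: __le64 used, chunk_objectid, flags
    used, _chunk_objectid, _flags = struct.unpack_from('<3Q', args, 136)
    return used

fd = os.open('/', os.O_RDONLY)
print(block_group_used(fd, int(sys.argv[1])))
os.close(fd)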

Now I apt-get update...

before: Data, single: total=13.00GiB, used=5.64GiB
during: Data, single: total=13.00GiB, used=5.59GiB
after : Data, single: total=14.00GiB, used=5.64GiB

# ./show_usage.py /
[...]
chunk vaddr 63495471104 type 1 stripe 0 devid 1 offset 9164554240 length 1073741824 used 119279616 used_pct 11
chunk vaddr 64569212928 type 1 stripe 0 devid 1 offset 12079595520 length 1073741824 used 36585472 used_pct 3
chunk vaddr 65642954752 type 1 stripe 0 devid 1 offset 14227079168 length 1073741824 used 17510400 used_pct 1
chunk vaddr 66716696576 type 4 stripe 0 devid 1 offset 3275751424 length 268435456 used 73170944 used_pct 27
chunk vaddr 66985132032 type 1 stripe 0 devid 1 offset 15300820992 length 1073741824 used 82251776 used_pct 7
chunk vaddr 68058873856 type 1 stripe 0 devid 1 offset 16374562816 length 1073741824 used 21188608 used_pct 1
chunk vaddr 69132615680 type 1 stripe 0 devid 1 offset 17448304640 length 1073741824 used 6041600 used_pct 0
chunk vaddr 70206357504 type 1 stripe 0 devid 1 offset 18522046464 length 1073741824 used 46178304 used_pct 4
chunk vaddr 71280099328 type 1 stripe 0 devid 1 offset 19595788288 length 1073741824 used 84770816 used_pct 7

Interesting. There's a new one at 71280099328, 7% filled, and the usage of three of the four previous ones went down a bit.

Now I want to know what the distribution of data inside these chunks looks like, to find out how fragmented they might be. So I spent some time this evening playing a bit more with the search ioctl, listing all extents and free space inside a chunk:

https://github.com/knorrie/btrfs-heatmap/blob/master/chunk-contents.py
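For the archives, a stripped-down sketch of the idea behind that script (the real thing is in the repo linked above): walk the extent tree with the same TREE_SEARCH ioctl as in the earlier sketch, and everything between two consecutive EXTENT_ITEMs is free space. Again assuming Python 3, the filesystem at /, and the layouts from ctree.h/ioctl.h; output is like below, minus the percentages:

#!/usr/bin/env python3
# List all EXTENT_ITEMs in the extent tree that fall inside one chunk;
# the gaps between them are free space.
import fcntl
import os
import struct
import sys

BTRFS_IOC_TREE_SEARCH = 0xD0009411  # _IOWR(0x94, 17, 4096-byte args)
EXTENT_TREE = 2                     # the tree btrfs-debug-tree -t 2 dumps
EXTENT_ITEM_KEY = 168               # key (vaddr, 168, length) = one extent

def search_extents(fd, vaddr, length):
    """Yield (start, length) for every EXTENT_ITEM inside the chunk."""
    min_objectid, min_offset = vaddr, 0
    max_objectid = vaddr + length - 1
    while True:
        args = bytearray(4096)
        struct.pack_into('=7Q4L4Q', args, 0,
                         EXTENT_TREE,
                         min_objectid, max_objectid,  # extent start vaddr
                         min_offset, 2**64 - 1,       # key offset = length
                         0, 2**64 - 1,                # any transid
                         EXTENT_ITEM_KEY, EXTENT_ITEM_KEY,
                         512, 0, 0, 0, 0, 0)          # up to 512 items
        fcntl.ioctl(fd, BTRFS_IOC_TREE_SEARCH, args)
        nr_items = struct.unpack_from('=L', args, 64)[0]
        if nr_items == 0:
            return
        pos = 104  # result headers start right after the search key
        for _ in range(nr_items):
            # btrfs_ioctl_search_header: transid, objectid, offset, type, len
            _, objectid, offset, _, item_len = \
                struct.unpack_from('=3Q2L', args, pos)
            pos += 32 + item_len
            yield objectid, offset
            min_objectid, min_offset = objectid, offset + 1  # resume here

vaddr, length = int(sys.argv[1]), int(sys.argv[2])
fd = os.open('/', os.O_RDONLY)
prev_end = vaddr
for start, elen in search_extents(fd, vaddr, length):
    if start > prev_end:
        print('0x%x 0x%x %d' % (prev_end, start - 1, start - prev_end))
    print('0x%x 0x%x %d extent' % (start, start + elen - 1, elen))
    prev_end = start + elen
if prev_end < vaddr + length:
    print('0x%x 0x%x %d' % (prev_end, vaddr + length - 1,
                            vaddr + length - prev_end))
os.close(fd)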

Currently the output looks like this:

# ./chunk-contents.py 70206357504 .
chunk vaddr 70206357504 length 1073741824
0x1058a00000 0x105a0cafff  23900160 2.23%
0x105a0cb000 0x105a0cbfff      4096 0.00% extent
0x105a0cc000 0x105a12ffff    409600 0.04%
0x105a130000 0x105a130fff      4096 0.00% extent
0x105a131000 0x105a21dfff    970752 0.09%
0x105a21e000 0x105a220fff     12288 0.00% extent
0x105a221000 0x105a222fff      8192 0.00% extent
0x105a223000 0x105a224fff      8192 0.00% extent
0x105a225000 0x105a225fff      4096 0.00% extent
0x105a226000 0x105a226fff      4096 0.00% extent
0x105a227000 0x105a227fff      4096 0.00% extent
0x105a228000 0x105a2c3fff    638976 0.06%
0x105a2c4000 0x105a2c5fff      8192 0.00% extent
0x105a2c6000 0x105a317fff    335872 0.03%
0x105a318000 0x105a31efff     28672 0.00% extent
0x105a31f000 0x105a3affff    593920 0.06%
0x105a3b0000 0x105a3b2fff     12288 0.00% extent
0x105a3b3000 0x105a3b6fff     16384 0.00%
0x105a3b7000 0x105a3bbfff     20480 0.00% extent
0x105a3bc000 0x105a3e2fff    159744 0.01%
0x105a3e3000 0x105a3e3fff      4096 0.00% extent
0x105a3e4000 0x105a3e4fff      4096 0.00% extent
0x105a3e5000 0x105a468fff    540672 0.05%
0x105a469000 0x105a46cfff     16384 0.00% extent
0x105a46d000 0x105a493fff    159744 0.01%
0x105a494000 0x105a495fff      8192 0.00% extent
0x105a496000 0x105a49afff     20480 0.00%
[...]

After running apt-get update a few extra times, only the last (new) chunk keeps changing a bit, and stabilizes around 10% usage:

chunk vaddr 71280099328 type 1 stripe 0 devid 1 offset 19595788288 length 1073741824 used 112271360 used_pct 10

chunk vaddr 71280099328 length 1073741824
0x1098a00000 0x109e00dfff  90234880 8.40%
0x109e00e000 0x109e00efff      4096 0.00% extent
0x109e00f000 0x109e00ffff      4096 0.00% extent
0x109e010000 0x109e010fff      4096 0.00%
0x109e011000 0x109e011fff      4096 0.00% extent
0x109e012000 0x109e342fff   3346432 0.31%
0x109e343000 0x109e344fff      8192 0.00% extent
0x109e345000 0x109e47cfff   1277952 0.12%
0x109e47d000 0x109e47efff      8192 0.00% extent
0x109e47f000 0x109e480fff      8192 0.00%
0x109e481000 0x109e482fff      8192 0.00% extent
0x109e483000 0x109e484fff      8192 0.00% extent
0x109e485000 0x109e48afff     24576 0.00% extent
0x109e48b000 0x109e48cfff      8192 0.00%
0x109e48d000 0x109e48efff      8192 0.00% extent
0x109e48f000 0x109e490fff      8192 0.00%
0x109e491000 0x109e492fff      8192 0.00% extent
0x109e493000 0x109e493fff      4096 0.00% extent
0x109e494000 0x109eb00fff   6737920 0.63%
0x109eb01000 0x109eb10fff     65536 0.01% extent
0x109eb11000 0x109ebc0fff    720896 0.07%
0x109ebc1000 0x109ec00fff    262144 0.02% extent
0x109ec01000 0x109ecc4fff    802816 0.07%

Full output at https://syrinx.knorrie.org/~knorrie/btrfs/keep/2016-06-08-extents.txt

Free space is extremely fragmented. The last chunk, which just got filled a bit by apt-get update, looks better, with a few free space blocks of up to 25%, but the previous ones are a mess.

So instead of being the cause, apt-get update triggering a new chunk allocation might just as well be the result of the existing chunks already being filled up with too many small fragments.

The next question is which files these extents belong to. To find out, I need to open up the extent items I get back and follow a backreference to an inode object. Might do that tomorrow, fun.
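In case someone wants to play along, a rough sketch of what that parsing could look like, assuming I'm reading ctree.h correctly: the EXTENT_ITEM data starts with a btrfs_extent_item header and is followed by inline backreferences, which for data extents are normally EXTENT_DATA_REF entries pointing at (tree root, inode number, file offset):

# Crack open the item data of one EXTENT_ITEM, as returned by the
# TREE_SEARCH sketch above.
import struct

EXTENT_DATA_REF_KEY = 178
SHARED_DATA_REF_KEY = 184

def inline_data_refs(item_data):
    """Yield (root, inode objectid, file offset) for inline data refs."""
    # btrfs_extent_item: __le64 refs, generation, flags
    refs, generation, flags = struct.unpack_from('<3Q', item_data, 0)
    pos = 24  # sizeof(struct btrfs_extent_item)
    while pos < len(item_data):
        ref_type = item_data[pos]
        pos += 1
        if ref_type == EXTENT_DATA_REF_KEY:
            # btrfs_extent_data_ref: root, objectid, offset, count
            root, objectid, offset, _count = \
                struct.unpack_from('<3QL', item_data, pos)
            pos += 28
            yield root, objectid, offset
        elif ref_type == SHARED_DATA_REF_KEY:
            # parent tree block + btrfs_shared_data_ref.count; resolving
            # this one needs another hop through that tree block
            _parent, _count = struct.unpack_from('<QL', item_data, pos)
            pos += 12
        else:
            break  # tree block refs; not expected in a data chunk

From there, the root plus inode number should be resolvable to a path with something like the INO_PATHS ioctl (what btrfs inspect-internal inode-resolve uses), but that's for tomorrow.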

To be honest, I suspect /var/log and/or mailman's file storage to be the cause of the fragmentation, since there's logging from postfix, mailman and nginx going on all day long at a slow but steady tempo. While we use btrfs for a number of use cases at work now, we normally don't use it for the root filesystem, and the cases where it is used as the root filesystem don't do much logging or mail.

And no, autodefrag is not in the mount options currently. Would that be helpful in this case?

--
Hans van Kranenburg - System / Network Engineer
T +31 (0)10 2760434 | hans.van.kranenb...@mendix.com | www.mendix.com