Btrfs: cut down on loops through the allocator - a5e681d9bd641c4f0677e87d3a0c92a8f4f16293

Alex Lyakas Thu, 07 Mar 2019 01:35:11 -0800

Hi Josef,

This commit added the following code in find_free_extent:
                        ret = do_chunk_alloc(trans, root, flags,
                                             CHUNK_ALLOC_FORCE);
+
+                       /*
+                        * If we can't allocate a new chunk we've already looped
+                        * through at least once, move on to the NO_EMPTY_SIZE
+                        * case.
+                        */
+                       if (ret == -ENOSPC)
+                               loop = LOOP_NO_EMPTY_SIZE;
+


With this change, I am hitting an early ENOSPC in the following scenario:
- assume the file system is almost full, with 5GB of unallocated space
left on the device
- multiple threads (say 6) call find_free_extent() in parallel with
empty_size=0
- none of them finds a block group to allocate from, so they all call
do_chunk_alloc()
- 5x1GB chunks are allocated, but the additional do_chunk_alloc() call
returns -ENOSPC
- as a result, this thread moves to LOOP_NO_EMPTY_SIZE, but since
empty_size is already zero, it returns -ENOSPC to the caller. But in
fact we have 5GB of free space to allocate from. We just need to do an
extra loop.
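
The interleaving above can be sketched with a small single-threaded toy model in userspace C. This is not kernel code: the toy_* names, the 1MB extent size and the rescan_on_enospc switch are assumptions made only for illustration, and the "threads" are simply replayed in order.

```c
#include <assert.h>
#include <errno.h>

#define TOY_CHUNK_MB  1024  /* one 1GB chunk, as in the scenario */
#define TOY_EXTENT_MB 1     /* assumed size of each requested extent */

/* Hypothetical stand-in for the file system state. */
struct toy_fs {
	long unallocated_mb;  /* device space not yet turned into chunks */
	long chunk_free_mb;   /* free space inside existing chunks */
};

/* Stand-in for do_chunk_alloc(): carve one chunk out of the device. */
static int toy_chunk_alloc(struct toy_fs *fs)
{
	if (fs->unallocated_mb < TOY_CHUNK_MB)
		return -ENOSPC;
	fs->unallocated_mb -= TOY_CHUNK_MB;
	fs->chunk_free_mb += TOY_CHUNK_MB;
	return 0;
}

/* Stand-in for the block group search: take one extent from a chunk. */
static int toy_search(struct toy_fs *fs)
{
	if (fs->chunk_free_mb < TOY_EXTENT_MB)
		return -ENOSPC;
	fs->chunk_free_mb -= TOY_EXTENT_MB;
	return 0;
}

/* Replay the scenario and return how many threads end with -ENOSPC. */
static int toy_run(struct toy_fs *fs, int nthreads, int rescan_on_enospc)
{
	int missed = 0, failed = 0;

	assert(nthreads >= 0);

	/* Step 1: every thread searches before any new chunk exists. */
	for (int i = 0; i < nthreads; i++)
		if (toy_search(fs) == -ENOSPC)
			missed++;

	/* Step 2: each thread that missed tries to allocate a chunk. */
	for (int i = 0; i < missed; i++) {
		int ret = toy_chunk_alloc(fs);

		/* Behaviour under discussion: give up without a rescan. */
		if (ret == -ENOSPC && !rescan_on_enospc) {
			failed++;
			continue;
		}
		/* Otherwise search again; other threads may have added chunks. */
		if (toy_search(fs) == -ENOSPC)
			failed++;
	}
	return failed;
}
```

Starting from { 5 * 1024, 0 } with 6 threads, toy_run(&fs, 6, 0) reports one failed thread even though 5115MB are still free inside the new chunks. The same start with rescan_on_enospc=1 reports no failures.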

Basically, do_chunk_alloc() returning -ENOSPC does not mean that there
is no space. It can also mean that parallel chunk allocations have just
exhausted the unallocated space on the device, while the newly
allocated chunks still have plenty of free space.

Furthermore, this thread now sets space_info->max_extent_size. And
from now on, any allocation that needs more than max_extent_size will
immediately fail. But if we unmount and mount again, we will have
plenty of space.
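
The effect of the stale hint can be sketched the same way. Again this is a userspace toy under assumptions, not the kernel's implementation: struct toy_space_info and the toy_* helpers are hypothetical, the value stored in the hint is passed in by the caller, and a remount is modelled simply as clearing the hint.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-space_info state. */
struct toy_space_info {
	uint64_t free_bytes;       /* space that is really available */
	uint64_t max_extent_size;  /* 0 means no hint has been recorded */
};

/* Record the hint, as the thread that failed early would. */
static void toy_record_failure(struct toy_space_info *si, uint64_t largest_seen)
{
	si->max_extent_size = largest_seen;
}

/* Reserve num_bytes, trusting the hint before doing any real search. */
static int toy_reserve(struct toy_space_info *si, uint64_t num_bytes)
{
	assert(num_bytes > 0);

	/* Fast path: requests larger than the hint fail immediately. */
	if (si->max_extent_size && num_bytes > si->max_extent_size)
		return -ENOSPC;
	if (num_bytes > si->free_bytes)
		return -ENOSPC;
	si->free_bytes -= num_bytes;
	return 0;
}

/* Assumption: a fresh mount starts without a recorded hint. */
static void toy_remount(struct toy_space_info *si)
{
	si->max_extent_size = 0;
}
```

With 5GB free and a hint of 4096 bytes recorded, a 1MB toy_reserve() fails with -ENOSPC; after toy_remount() the same request succeeds.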

This happens to me on 4.14, but I think the mainline still has the same logic.

Thanks,
Alex.
