[Qemu-block] RFC: some problems with bdrv_get_block_status

Denis V. Lunev Fri, 28 Apr 2017 09:32:25 -0700

Hello, All!

Recently we have run into a very slow bdrv_get_block_status()
call, which is invoked heavily, for example from mirror_run().


The problem was narrowed down to a slow lseek64() system
call, which can take 1 to 3 seconds.

[root@s158 ~]# strace -f -p 77048 -T  -e lseek
Process 77048 attached with 14 threads
[pid 77053] lseek(23, 6359613440, SEEK_DATA) = 6360276992 <1.250323>
[pid 77053] lseek(23, 6360334336, SEEK_DATA) = 6360334336 <0.000021>
[pid 77053] lseek(23, 6360334336, SEEK_HOLE) = 6360412160 <0.007850>
[pid 77053] lseek(23, 6360465408, SEEK_DATA) = 6360596480 <0.244460>
[pid 77053] lseek(23, 6360596480, SEEK_DATA) = 6360596480 <0.000012>
[pid 77053] lseek(23, 6360596480, SEEK_HOLE) = 6361440256 <0.009102>
[pid 77053] lseek(23, 6361448448, SEEK_DATA) = 6362243072 <1.482338>
[pid 77053] lseek(23, 6361710592, SEEK_DATA) = 6362243072 <0.987192>
[pid 77053] lseek(23, 6362300416, SEEK_DATA) = 6362300416 <0.000012>
[pid 77053] lseek(23, 6362300416, SEEK_HOLE) = 6363058176 <0.008983>
[pid 77053] lseek(23, 6363086848, SEEK_DATA) = 6364467200 <2.554859>
[pid 77053] lseek(23, 6363807744, SEEK_DATA) = 6364467200 <1.220386>
[pid 77053] lseek(23, 6364528640, SEEK_DATA) = 6364528640 <0.000012>
[pid 77053] lseek(23, 6364528640, SEEK_HOLE) = 6365327360 <0.007906>
[pid 77053] lseek(23, 6365380608, SEEK_DATA) = 6366162944 <1.442824>
[pid 77053] lseek(23, 6365904896, SEEK_DATA) = 6366162944 <0.478188>
[pid 77053] lseek(23, 6366167040, SEEK_DATA) = 6366167040 <0.000013>
[pid 77053] lseek(23, 6366167040, SEEK_HOLE) = 6367002624 <0.009230>
[pid 77053] lseek(23, 6367019008, SEEK_DATA^CProcess 77048 detached
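
To check whether a given image file shows the same latency outside of QEMU, a small probe along the following lines might help. This is only a sketch under assumptions: timed_lseek() and probe_seek_latency() are names made up for this illustration and are not QEMU functions, and the code assumes Linux with glibc, where SEEK_DATA and SEEK_HOLE are exposed once _GNU_SOURCE is defined. It walks the file extent by extent, much like the pattern in the trace above, and reports every lseek() call that takes longer than a threshold.

```c
#define _GNU_SOURCE             /* exposes SEEK_DATA / SEEK_HOLE on glibc */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* lseek() wrapper that also reports how long the call took, in seconds.
 * The errno value set by lseek() is preserved across clock_gettime(). */
static off_t timed_lseek(int fd, off_t off, int whence, double *sec)
{
    struct timespec t0, t1;
    off_t r;
    int saved_errno;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    r = lseek(fd, off, whence);
    saved_errno = errno;
    clock_gettime(CLOCK_MONOTONIC, &t1);
    errno = saved_errno;

    *sec = (double)(t1.tv_sec - t0.tv_sec) +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    return r;
}

/* Hypothetical helper (not part of QEMU): alternate SEEK_DATA and
 * SEEK_HOLE from offset 0 to the end of the file and count the calls
 * slower than threshold_sec.  Slow calls are printed to 'out' if it is
 * not NULL.  Returns the number of slow calls, or -1 on an unexpected
 * error. */
long probe_seek_latency(int fd, double threshold_sec, FILE *out)
{
    off_t pos = 0;
    long slow = 0;

    assert(fd >= 0);
    for (;;) {
        double sec;
        off_t data, hole;
        int err;

        data = timed_lseek(fd, pos, SEEK_DATA, &sec);
        err = errno;
        if (sec > threshold_sec) {
            slow++;
            if (out) {
                fprintf(out, "SEEK_DATA from %lld took %.6f s\n",
                        (long long)pos, sec);
            }
        }
        if (data < 0) {
            /* ENXIO: no data at or after pos, so the walk is finished */
            return err == ENXIO ? slow : -1;
        }

        hole = timed_lseek(fd, data, SEEK_HOLE, &sec);
        if (sec > threshold_sec) {
            slow++;
            if (out) {
                fprintf(out, "SEEK_HOLE from %lld took %.6f s\n",
                        (long long)data, sec);
            }
        }
        if (hole <= data) {
            return -1;          /* error, or no forward progress */
        }
        pos = hole;
    }
}
```

A caller would open the image read-only and pass the descriptor, for example probe_seek_latency(fd, 0.1, stderr) to list every call slower than 100 ms. Note that this walks the whole file, while QEMU only queries the ranges it is about to copy, so the absolute numbers are not directly comparable.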


The process spends 100% of its CPU time in system time.

The problem comes from this branch of the code

bdrv_co_get_block_status
    .......
    if (bs->file &&
        (ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO) &&
        (ret & BDRV_BLOCK_OFFSET_VALID)) {
        int file_pnum;

        ret2 = bdrv_co_get_block_status(bs->file, ret >> BDRV_SECTOR_BITS,
                                        *pnum, &file_pnum);
        if (ret2 >= 0) {
            /* Ignore errors.  This is just providing extra information, it
             * is useful but not necessary.
             */
            if (!file_pnum) {
                /* !file_pnum indicates an offset at or beyond the EOF; it is
                 * perfectly valid for the format block driver to point to such
                 * offsets, so catch it and mark everything as zero */
                ret |= BDRV_BLOCK_ZERO;
            } else {
                /* Limit request to the range reported by the protocol driver */
                *pnum = file_pnum;
                ret |= (ret2 & BDRV_BLOCK_ZERO);
            }
        }
    }

which was added in the commit

commit 5daa74a6ebce7543aaad178c4061dc087bb4c705
Author: Paolo Bonzini <pbonz...@redhat.com>
Date:   Wed Sep 4 19:00:38 2013 +0200

    block: look for zero blocks in bs->file
   
    Reviewed-by: Eric Blake <ebl...@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
    Signed-off-by: Stefan Hajnoczi <stefa...@redhat.com>

without much detail.
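
To make the cost of the branch quoted above easier to see, here is a standalone model of its flag logic. It is a sketch, not QEMU code: the MODEL_* constants, needs_file_query() and merge_file_status() are hypothetical names, and the bit values are assumptions chosen for the example (the real BDRV_BLOCK_* definitions live in the QEMU headers). The first function shows when the lower layer gets queried, which is where the lseek() calls come from; the second shows what the result of that query can change.

```c
#include <stdint.h>

/* Assumed flag bits, for illustration only; not taken from QEMU headers. */
#define MODEL_BLOCK_DATA          0x01
#define MODEL_BLOCK_ZERO          0x02
#define MODEL_BLOCK_OFFSET_VALID  0x04

/* Returns nonzero when the branch would recurse into bs->file: there is
 * a lower layer, and the format driver reported allocated, non-zero data
 * at a known offset.  For a mostly allocated image this holds for nearly
 * every query. */
int needs_file_query(int64_t ret, int has_file)
{
    return has_file &&
           (ret & MODEL_BLOCK_DATA) && !(ret & MODEL_BLOCK_ZERO) &&
           (ret & MODEL_BLOCK_OFFSET_VALID);
}

/* Combine the format driver status 'ret' with the lower layer status
 * 'ret2' in the same way as the quoted branch.  '*pnum' is shrunk to
 * the range the lower layer reported. */
int64_t merge_file_status(int64_t ret, int64_t ret2, int file_pnum,
                          int *pnum)
{
    if (ret2 < 0) {
        return ret;             /* lower layer errors are ignored */
    }
    if (!file_pnum) {
        /* offset at or beyond EOF: everything reads as zero */
        return ret | MODEL_BLOCK_ZERO;
    }
    *pnum = file_pnum;
    return ret | (ret2 & MODEL_BLOCK_ZERO);
}
```

In this model the only information the extra query can add is the ZERO bit, plus a possibly shorter *pnum, and a shorter *pnum means the caller has to come back for the remainder with another query.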

Frankly speaking, this optimization should not gain much.
If the upper-layer format (QCOW2) says that we have data
here, then nowadays in 99.9% of cases we do have the data.
Meanwhile, this branch can cause problems. The block would
have to be entirely zero for us to get any benefit in most
cases in mirror and backup operations.

In my opinion, it is worth dropping this branch altogether.

Guys, do you have any opinion?

Den

P.S. The kernel is based on the Red Hat 3.10.0-514 kernel. The same
      problem was also observed with 3.10.0-327.
