Thanks for your comment, Kern.

> Hello,
>
> This is a problem I have been considering for some time.  The problem is
> that current architecture of Bacula just does not properly handle
> multi-treading the FD.  Yes, you can do it by running two Jobs, but if
> you are not 100% on top of the  design limitations of Bacula, it is
> unlikely that restores will work.  Perhaps you have some clever way
> around this, so at some point in the future, I would like to look at
> this with you.  In between time, perhaps you can describe more in detail
> what you propose.

Well, I actually think this is a "read problem" only .. getting data off
the devices. The need for parallel restore streams is a very different
use case to attack. The tape devices can stream an endless sequence of very
small files very quickly, and if you have a RAID with a battery-backed
controller, the disk system can also absorb the writes without much parallelism.

It is on the read side that the challenge appears .. this test was done on
a local spinning disk with XFS (~8 ms seek time). Using CephFS (which
we do in production), file access latency can be far higher on hard-disk
storage.

pseudocode:

for f in dir:
    posix_fadvise(f, POSIX_FADV_WILLNEED)

for f in dir:
    read_file(f)
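The attached test.c is not shown inline, so here is a minimal self-contained sketch of the same two-pass idea (the directory walk, function names, and buffer sizes are my own illustration, not the attachment's code): pass one opens each file just long enough to queue readahead with POSIX_FADV_WILLNEED, pass two reads the files, ideally straight from the page cache.

```c
#define _XOPEN_SOURCE 600        /* for posix_fadvise() */
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Pass 1: ask the kernel to start readahead for the whole file.
 * The prefetched pages stay in the page cache after close(). */
static void prefetch(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return;
    posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);
    close(fd);
}

/* Pass 2: read the file sequentially; returns bytes read or -1. */
static long read_file(const char *path)
{
    char buf[65536];
    long total = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;
    close(fd);
    return total;
}

/* Walk `dir` twice: first issuing fadvise hints for every entry,
 * then reading them. Returns the number of files read, or -1. */
int prefetch_then_read(const char *dir)
{
    char path[4096];
    struct dirent *e;
    int files = 0;
    DIR *d = opendir(dir);

    if (!d)
        return -1;
    while ((e = readdir(d)) != NULL) {
        if (e->d_name[0] == '.')
            continue;                 /* skip "." and ".." */
        snprintf(path, sizeof path, "%s/%s", dir, e->d_name);
        prefetch(path);
    }
    rewinddir(d);                     /* second pass, same directory */
    while ((e = readdir(d)) != NULL) {
        if (e->d_name[0] == '.')
            continue;
        snprintf(path, sizeof path, "%s/%s", dir, e->d_name);
        if (read_file(path) >= 0)
            files++;
    }
    closedir(d);
    return files;
}
```

The point is that no application-level threading is needed: the kernel's readahead machinery overlaps the seeks once it has been told what is coming.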


Test C code attached .. not pretty, but it demonstrates the approach of
getting the OS to do the parallelism.

Test script:

#!/bin/bash

for bs in 4096 8192 16384 32768 65536 131072; do
        for i in $(seq 1 10000); do
                dd if=/dev/zero of=$i.file bs=$bs count=1 status=none
        done
        echo 3 > /proc/sys/vm/drop_caches
        echo "With fadvise blocksize $bs"
        time ~jk/test f > /dev/null
        echo 3 > /proc/sys/vm/drop_caches
        echo "Without fadvise blocksize $bs"
        time ~jk/test > /dev/null
done


Results (times in seconds, real/user/sys):

blocksize   with fadvise            without fadvise
4096         0.419 / 0.062 / 0.250   0.554 / 0.034 / 0.184
8192         0.443 / 0.068 / 0.269   0.613 / 0.044 / 0.204
16384        0.394 / 0.056 / 0.328   0.727 / 0.035 / 0.268
32768        0.596 / 0.080 / 0.395   1.210 / 0.060 / 0.398
65536        0.897 / 0.129 / 0.491   1.788 / 0.119 / 0.653
131072      19.406 / 0.206 / 0.738  35.755 / 0.269 / 1.058


Thus, depending on file size, even this naive approach gives roughly a 2x
read-side improvement.


I'm not a C coder (as the attached program demonstrates :-) - but
limiting the problem to "read-side/single job/single stream" parallelism
simplifies both the solution and the changes to the codebase. It is only
the callback stuff that I don't understand. I was thinking about simply
overloading breaddir to keep an internal buffer of the next X files
(say, 1000 files < 512KB) on which it would issue the fadvise calls in
advance, but that would also advise files that an incremental backup
would skip.
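The look-ahead-buffer idea could be sketched as a sliding window over the file list: keep the fadvise calls a fixed number of entries ahead of the reader. This is hypothetical illustration only — it does not use Bacula's actual breaddir interface, and consume() stands in for whatever per-file work the FD does.

```c
#define _XOPEN_SOURCE 600        /* for posix_fadvise() */
#include <fcntl.h>
#include <unistd.h>

#define WINDOW 8   /* how many files to keep "advised" ahead of the reader */

/* Queue readahead for one file; the hint survives the close(). */
static void advise(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return;
    posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);
    close(fd);
}

/* paths[0..n-1]: the file list about to be read, in order.
 * Each file has its readahead queued up to WINDOW steps before
 * consume() is called on it. consume may be NULL (prefetch only).
 * Returns the number of files consumed. */
int window_read(const char *paths[], int n, void (*consume)(const char *))
{
    int advised = 0, next = 0;

    while (next < n) {
        /* top up the look-ahead window before reading the next file */
        while (advised < n && advised < next + WINDOW)
            advise(paths[advised++]);
        if (consume)
            consume(paths[next]);
        next++;
    }
    return next;
}
```

Driving the window from the same list the backup loop consumes would also sidestep the incremental-backup problem above, since only files actually selected for backup would ever be advised.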

I hope the above explains both the background and the concept, and demonstrates the benefits.

So the "clever way" for restores would be to ignore them, as parallel
restores really aren't relevant to this problem. That's at least what I
see in real-world production backup and restore scenarios.

-- 
Jesper

Attachment: test.c
Description: Binary data

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users