When you say "CopyJobs start queueing", do you mean that the bconsole status
command shows them as waiting for something?  If so, can you post the output?

Are you using attribute spooling (the log would show "Sending spooled attrs to
the Director..." if you are)?

__Martin


>>>>> On Fri, 16 Jan 2026 13:03:19 -0300, Leandro Saldivar via Bacula-users 
>>>>> said:
> 
> Hi all,
> 
> I’m running Bacula Community Edition 15.0.2 and I’m trying to improve Copy
> and Restore job throughput between two on-prem Storage Daemons (Hetzner
> bare metal servers) connected via 10 Gbps links.
> 
> I’ve been investigating this for a while but haven’t been able to pinpoint
> what’s limiting throughput, so I’m sharing the data here to see if anyone
> has run into something similar.
> 
> Setup summary:
> 
>    - Director and main Storage Daemon on the same host
>    - Remote Storage Daemon on a separate host
>    - Both servers running Bacula CE 15.0.2
>    - Mostly default configuration
>    - Maximum Concurrent Jobs = 100
>    - ~35 parallel CopyJobs during the tests
>    - Maximum Network Buffer Size: default
>    - Minimum block size: default
> 
> Hardware / network:
> 
>    - Both servers are Hetzner bare metal
>    - 10 Gbps NICs on both sides
>    - Servers are in different datacenters (routed)
>    - No NIC errors, no packet drops, no CPU saturation observed
> 
> Observed behavior:
> 
> During large CopyJobs and RestoreJobs (main SD → remote SD), aggregate
> throughput consistently plateaus around 128–150 MiB/s (~1.1–1.3 Gbps). Once
> this level is reached, CopyJobs start queueing and throughput does not
> scale further, even with ~35 concurrent jobs. CPU, disk I/O, and network
> all appear to have plenty of headroom.
> 
> Network validation:
> 
> To rule out network limitations, I ran aggressive iperf3 tests. From three
> different servers, I ran:
> 
> iperf3 -c remote-sd-server.io -t 300 -P 32
> 
> On the remote Storage Daemon, I ran multiple listeners on different ports.
> Aggregate inbound traffic on the remote SD easily exceeds Bacula traffic,
> sustaining multi-Gbps throughput. iperf traffic scales well beyond 1 Gbps
> and coexists without issues alongside Bacula jobs, which remain flat around
> ~1 Gbps.
> 
> Disk I/O validation (remote Storage Daemon):
> 
> The remote SD uses 14 × 20 TB 7200 RPM SATA disks with an LVM volume
> mounted at /data. fio direct-I/O tests show:
> 
>    - Single job: ~246 MiB/s sequential write
>    - 8 parallel jobs: ~915 MiB/s sustained write
>    - 16 parallel jobs: ~1014 MiB/s sustained write
> 
> This puts disk throughput well above the ~150 MiB/s observed with Bacula.
> Full fio output is available here:
> 
> https://gist.github.com/LeandroSaldivarmrf/6b1a354f845f4afb26a2fa39183e269b
> 
> For additional context, I also posted the same findings along with a couple
> of network and CPU usage graphs here:
> 
> https://community.spiceworks.com/t/bacula-ce-15-0-2-copyjobs-capped-at-1-gbps-despite-10gbps-network/1248466
> 
> At this point, network, disk, and CPU do not appear to be the limiting
> factors, which makes me suspect a Bacula CopyJob-level limitation or
> missing tuning.
> 
> Questions:
> 
>    - Has anyone consistently achieved multi-Gbps (5–10 Gbps) CopyJob
>      throughput with Bacula CE?
>    - Are there Director or Storage Daemon tuning parameters that
>      significantly affect CopyJob performance?
>    - Would running multiple Directors help here, or should a single Director
>      be able to drive higher throughput?
> 
> If there’s any specific configuration you’d like me to share (Director,
> Storage resources, JobDefs, etc.), I can post it.
> 
> Thanks in advance.
> 
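
For reference, the plateau quoted above can be put into link terms with a
little arithmetic. The sketch below is only illustrative: it assumes decimal
gigabits (10^9 bits), a nominal 10 Gbps link, and that all ~35 CopyJobs are
moving data at the same time, which may not hold if some of them are queued.
The function name is made up for this example.

```python
MIB = 1024 ** 2  # bytes in one MiB


def mib_per_s_to_gbps(mib_per_s: float) -> float:
    """Convert a rate in MiB/s to decimal gigabits per second."""
    return mib_per_s * MIB * 8 / 1e9


# Figures taken from the quoted post.
plateau_mib_s = (128, 150)   # observed aggregate plateau, MiB/s
concurrent_jobs = 35         # approximate number of parallel CopyJobs
link_gbps = 10               # nominal NIC speed (assumed usable end to end)

for rate in plateau_mib_s:
    gbps = mib_per_s_to_gbps(rate)
    share = gbps / link_gbps          # fraction of the nominal link
    per_job = rate / concurrent_jobs  # average only, assumes all jobs active
    print(f"{rate} MiB/s = {gbps:.2f} Gbps, "
          f"{share:.1%} of the link, "
          f"about {per_job:.1f} MiB/s per job")
```

With those assumptions the plateau works out to roughly 1.07 to 1.26 Gbps,
and the per-job average is only a few MiB/s, so it may be worth checking
whether every job is actually transferring or whether most are waiting.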


