Re: [Bacula-users] Issue with concurrent jobs in disk based auto changer

2020-04-04 Thread Radosław Korzeniewski
Hello,

On Sat, 4 Apr 2020 at 23:27, Shaligram Bhagat, Yateen (Nokia - IN/Bangalore) <
yateen.shaligram_bha...@nokia.com> wrote:

> Issue:
>
> With the above mentioned configs, When I start 200 virtual full jobs I
> expect all these jobs to run concurrently.

Every Virtual Full job requires at least two devices to operate: one to
read the existing backups and one to write the new full backup. You need
more devices or fewer jobs.
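
As a rough illustration of the arithmetic, here is a minimal Python sketch.
It assumes each device runs one job at a time (Maximum Concurrent Jobs = 1,
as in the posted configuration), that a Virtual Full job takes exactly two
devices, and that both come from the same autochanger. The function name is
hypothetical and the result is only an upper bound, not a statement of how
Bacula schedules jobs.

```python
def max_concurrent_jobs(storages: int, devices_per_storage: int,
                        devices_per_job: int) -> int:
    """Upper bound on jobs that can hold devices at the same time.

    Assumes one job per device and that a job takes all of its
    devices from a single storage (autochanger).
    """
    jobs_per_storage = devices_per_storage // devices_per_job
    return storages * jobs_per_storage


# Figures from the original post: 10 storages with 20 devices each.
incremental_bound = max_concurrent_jobs(10, 20, devices_per_job=1)
virtual_full_bound = max_concurrent_jobs(10, 20, devices_per_job=2)

print(incremental_bound)   # 200
print(virtual_full_bound)  # 100
```

Under these assumptions, 200 devices can serve at most 100 Virtual Full jobs
at once, so the remaining jobs have to wait.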

best regards
-- 
Radosław Korzeniewski
rados...@korzeniewski.net
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


[Bacula-users] Issue with concurrent jobs in disk based auto changer

2020-04-04 Thread Shaligram Bhagat, Yateen (Nokia - IN/Bangalore)
Hello,

I am trying out Bacula Community version 9.4.4 on CentOS 6.4 with PostgreSQL.
There are 200 Bacula clients, each with about 30 GB of data on average that
needs to be backed up.

There will be an incremental backup every weekday night and a virtual full on
the weekend.
The storage is disk based: there are 10 NFS-mounted disks on the Bacula server
(ZFS exports from a remote filer host).

Each disk corresponds to one storage, namely StorageA, StorageB, ... StorageJ.
Each Storage has its own media type defined.
Each Storage has one Autochanger associated with it, and each Autochanger has
20 devices:

StorageA -> AutochangerA -> DeviceA1, DeviceA2 ... DeviceA20
StorageB -> AutochangerB -> DeviceB1, DeviceB2 ... DeviceB20
..
StorageJ -> AutochangerJ -> DeviceJ1, DeviceJ2 ... DeviceJ20

Each Device has Maximum Concurrent Jobs = 1.

Hence, as per my understanding, the maximum number of concurrent jobs that
this configuration can handle is 200 (10 storages x 20 devices per storage x
1 concurrent job per device).

I have defined Maximum Concurrent Jobs in other places as follows:
1. for each storage definition, namely StorageA, StorageB, ... StorageJ:
   Maximum Concurrent Jobs = 100
2. for the Bacula storage daemon: Maximum Concurrent Jobs = 500
3. for the Bacula daemon: Maximum Concurrent Jobs = 500
4. in the PostgreSQL database: max_connections set to 500

Issue:
With the above configuration, when I start 200 virtual full jobs, I expect
all of them to run concurrently.

However, I find that although a few jobs run concurrently, many jobs still
show the state "created not yet running".
The bconsole status command DOES NOT show a single job in the state "waiting
to reserve a device". Also, many devices are still shown as "not open". Hence
I assume that there are enough free devices available to handle all 200
concurrent jobs.

All these jobs, including the ones that are initially shown as "created not
yet running", eventually complete successfully, but only after a long time
(~36 hours), so the very purpose of concurrency is defeated.

So my question is: how can I find the reason why a job gets into the
"created not yet running" state?
(Note: all jobs have an equal priority of 10.)

Will setting the debug level dynamically (console command setdebug) help in
getting more info about the jobs that are already in the state "created not
yet running"? I tried that, but it does not yield any extra info in the job
log.

Thanks,

Yateen
