Hi,

Arno Lehmann wrote:
> Hi,
>
> 05.07.2007 12:07, Alfredo Marchini wrote:
>   
>> Hi Arno,
>> I don't know if you know the Bacula source code,
>>     
>
> A bit, but I usually look for problems in the configuration as, in my 
> experience, the source code is quite stable. Of course, there are 
> bugs, but these should be reproducible in other installations, too. 
> Unless I find a setup that looks unique to me, I'm assuming the source 
> is ok and the problem lies in the configuration or general system.
>
>   
>> so I am posting some parameters and information from my configuration 
>> that I think might cause this problem, or that may be badly configured 
>> because I don't fully understand the manual:
>>
>> Arno Lehmann wrote:
>>     
>>> Hi,
>>>
>>> 04.07.2007 17:40, Alfredo Marchini wrote:
>>>   
>>>       
>>>> Hi,
>>>> The system and DB logs don't tell me anything about this problem; it looks 
>>>> as if all the director processes or threads are locked at the same time.
>>>> If I restart only bacula-dir, without restarting bacula-sd and the 16 
>>>> bacula-fd daemons, the system starts working fine again.
>>>>     
>>>>         
>>> It might be possible that the DIR is busy working on the catalog (like 
>>> pruning data) and just needs more time. You can check this using 
>>> 'mysqladmin processlist', for example.
>>>   
>>>       
>>     OK, when it happens again I'll run this test too. But if Bacula prunes 
>> jobs and files only when all volumes are used and no appendable volumes 
>> are left, then pruning is not my problem: I have used only 10 volumes of 
>> 50 GB and have another 8 volumes available that have not been created 
>> yet.
>>     
>
> Are you saying that there are always volumes available and thus no 
> pruning happens?
>
>   
When the error occurred I had 10 volumes used and 8 volumes available. 
After about 3 weeks all 18 volumes are used, and Bacula then recycles 
the oldest volume whose jobs are older than 14 days (the maximum file, 
job and volume retention period).

Here's my sd config:

Storage {
  Name = mystorage
  SDPort = 9103
  SDAddress = binding ip
  WorkingDirectory = "/var/bacula/storage-wk"
  PidDirectory = "/var/run"
  Maximum Concurrent Jobs = 60
  Heartbeat Interval = 10
  Client Connect Wait = 60
}

Here's my sd device config:

Device {
  Name = mydevice
  Media Type = File
  Archive Device = "/mnt/storage/volumes"
  LabelMedia = yes
  Random Access = yes
  AutomaticMount = yes
  RemovableMedia = no
  AlwaysOpen = yes
}

Here's my dir config:

Director {
  Name = mydirector
  DIRAddress = binding ip
  DIRport = 9101
  QueryFile = "/etc/bacula/query.sql"
  WorkingDirectory = "/var/bacula/director-wk"
  PidDirectory = "/var/run"
  Maximum Concurrent Jobs = 30
  Password = "password"
  Messages = "mymessages-daemon"
}

Here's my dir sd config:

Storage {
  Name = mystorage
  Address = ip
  SDPort = 9103
  Password = "password"
  Device = mydevice
  Media Type = File
  Maximum Concurrent Jobs = 60
}

Here's my dir pool config:

Pool {
  Name = mypool
  Pool Type = Backup
  Storage = mystorage
  Recycle = yes
  AutoPrune = yes
  Maximum Volumes = 18
  Maximum Volume Bytes = 50000000000
  Volume Retention = 14 days
  Label Format = "Volume-"
}


> That would indeed rule out the catalog as a bottleneck.
>
>   
>>>   
>>>       
>>>> I have already restarted bacula-dir and everything works fine now (I back up 16 
>>>> servers, so I cannot take it offline or someone will kill me this 
>>>> evening), so I won't be able to reproduce the error for about 10-15 days.
>>>> The last time I had this problem I used top and didn't find 
>>>> anything strange.
>>>>     
>>>>         
>>> Ok, so let's assume the hardware, OS and relevant applications are 
>>> running ok.
>>>
>>>   
>>>       
>>     Yes, I think that is the right approach.
>>     
>>>> But the test with the time command will be the first thing I do when it happens again.
>>>> I don't think the problem is with the database: when I connect to the 
>>>> bacula database with the mysql command line, it works fine and quickly.
>>>>     
>>>>         
>>> Bacula uses its own, internal locking, so you won't necessarily notice 
>>> anything from outside of Bacula.
>>>
>>>   
>>>       
>> Ok
>>     
> ...
>   
>>>> Another thing is the maximum concurrent jobs :
>>>> On director = 30
>>>> On storage side director configuration file = 60
>>>> On storage = 60
>>>>     
>>>>         
>>> Quite a lot, I think. Running up to 30 jobs in parallel might load 
>>> your backup server beyond its reasonable working maximum, but that 
>>> depends on your hardware, software, and requirements.
>>>
>>>   
>>>       
>> I set these values because:
>> director = 30, because I have 16 FDs that could connect concurrently (in 
>> practice they don't), plus
>> one job per FD to ask for the status (16 x 2 = 32, rounded to 30).
>>     
>
> I don't understand why you reserve job slots for the FDs... the FDs 
> don't connect to the DIR to ask for a status as far as I know. Or do 
> you refer to some sort of tray monitor? 
>   

Sorry, yes, I have also configured one monitor for all the FDs.
>   
>> storage = 60, because when 16 FDs connect concurrently to the storage 
>> I also have 16 connections from the director to the storage (when the jobs 
>> start).
>>     
>
> The limit for the SD refers to running jobs, not to connections as far 
> as I know.
>
> For example, I run four jobs concurrently, and even if these jobs are 
> all running, the DIR can connect for status display and the monitoring 
> application can ask for the SD status, too.
>   
Ah, OK, so if I have 16 concurrent jobs to the SD, I can set Maximum 
Concurrent Jobs to 16 on the SD (and also on the director side). Is that 
correct?
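
To be concrete, this is a sketch of what I would change, assuming that 16 running jobs (one per FD) really is the upper bound. Only the changed directive is shown in each resource, everything else stays as posted above, and the file names in the comments are the usual defaults rather than something I have checked on my system:

```
# bacula-dir.conf, Director resource (other directives unchanged)
Director {
  Name = mydirector
  Maximum Concurrent Jobs = 16
}

# bacula-dir.conf, Storage resource (other directives unchanged)
Storage {
  Name = mystorage
  Maximum Concurrent Jobs = 16
}

# bacula-sd.conf, Storage resource (other directives unchanged)
Storage {
  Name = mystorage
  Maximum Concurrent Jobs = 16
}
```

If status requests from the DIR and the monitor do not count as jobs, as you describe, these three values should be enough.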

>   
>> I thought that the not-responding problem was caused by these parameters, so 
>> I set high values, because I don't know how Bacula handles TCP 
>> connections at the code level (I thought the problem was caused by 
>> not having enough concurrent threads).
>>     
>
> I don't think so... the limits you set do not control how many threads 
> can be created, or how many network connections can exist 
> simultaneously. At least my impression is different.
>
>   
>> Another thing:
>> For all FDs I have set the messages to point to the director's messages.
>> Example:
>> on the director named bacula-dir I created messages named 
>> bacula-dir-messages
>> on all FDs I set messages named bacula-dir-message that point to the 
>> director bacula-dir
>>     
>
> I don't think this is relevant here, unless you have reason to believe 
> that messages are not sent to the DIR.
>
>   
No, messages are correctly sent to the director.
>> One last thing, and then I have nothing more:
>>
>> If I go to the working directory of bacula-dir while it is not responding, I 
>> find mail files that should have been sent to the operators and are 
>> 2-3 days old, as if bacula-dir is blocked and cannot send the e-mail 
>> (when it is working fine, the mails are sent correctly to all the operators).
>>     
>
> Obviously, when the DIR is blocked, it will not finish jobs and thus 
> not send mail.
>
> Does your above statement imply that your DIR is stuck for some days, 
> when it happens? That would probably rule out catalog performance 
> issues as even an underpowered database server should finish the 
> queries after a few days...
>
>   
Yes, I found the DIR locked yesterday, but the last log entry, mail, and backup 
are from the night of 30-06-2007.
After that the director was locked, and I have no more information about it.
For the DB I use MySQL 5.0 from a standard RPM installation, and the log 
shows no problems.
The biggest table is File with 832432 records; FileName has 424158, Path 
has 31414 and Log has 38362.

If you think I need to give MySQL more resources, that is not a problem.
Could the problem be caused by this setting in my messages?

catalog = all, !skipped, !saved, !terminate

This writes a lot of data to the DB; if the DB is the problem, I can remove 
this rule.
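
As a sketch of the test I have in mind: I am assuming here that the line sits in the Messages resource my Director points to (it may equally be in the one used by the jobs), and that commenting it out only stops messages from being written to the catalog Log table. The other destinations are left out of the sketch and would stay as they are:

```
Messages {
  Name = mymessages-daemon
  # other destinations (mail, append, ...) unchanged
  # Disabled as a test, to see whether catalog writes are involved:
  # catalog = all, !skipped, !saved, !terminate
}
```

I would then reload the director and watch whether the lock-up still appears after the usual 10-15 days.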
>> I use a Postfix SMTP server configured for local and bsmtp to send e-mail 
>> to an SMTP server
>> installed in my LAN on another Linux server.
>>     
>
> That's not important here, either.
>
> Arno
>
>   
Alfredo

