Hello,

Kern Sibbald wrote:
> On Monday 26 June 2006 19:14, Alex Dioso wrote:
>> On Jun 19, 2006, at 8:15 AM, Lars Uhlemann wrote:
>>> Hi,
>>>
>>> we have been running Bacula for one year now. The currently running
>>> version is 1.38.8. We back up around 4 TB of data and more than
>>> 15 million files.
>>>
>>> OK, I started to restore a job with 1.9 TB and 12 million files. I
>>> selected the JobIds and waited for the restore command line. After
>>> 3.5 hours bacula-dir crashed with:
>>>
>>> *19-Jun 16:49 backup-dir: ABORTING due to ERROR in smartall.c:132
>>> *Out of memory
>>> *19-Jun 16:49 backup-dir: Fatal Error because: Bacula interrupted by
>>> *signal 11: Segmentation violation
>> I have seen something similar on our backup system when trying to
>> restore a single job that has 2 TB and ~15 million files.
> 
> I'm not too surprised. Perhaps in the near future I can calculate the memory 
> that is required per file; that would allow users to calculate how much 
> page space they need to avoid running out of memory.

That sounds interesting!
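A per-file memory figure would make this a simple back-of-envelope calculation. As a sketch of the idea only: the per-node overhead and average path length below are assumed values for illustration, not measured Bacula numbers.

```python
# Rough estimate of director memory needed to build the restore tree.
# NODE_OVERHEAD and avg_path_len are assumptions, not measured from Bacula.
NODE_OVERHEAD = 96  # assumed bytes per tree node (pointers, flags, malloc slack)

def estimate_tree_memory(num_files, avg_path_len=64):
    """Approximate bytes for an in-memory tree of num_files entries."""
    return num_files * (avg_path_len + NODE_OVERHEAD)

# Under these assumptions, the 12-million-file restore above needs
# about 1.9 GB, which would exhaust a 2 GB machine.
print(estimate_tree_memory(12_000_000) / 1e9)  # → 1.92
```

With such a formula in hand, a user could compare the estimate against RAM plus swap before starting the restore.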

> 
>>> The select from Postgres was done after 55 min (the Postgres bacula
>>> process had idle status), and a btraceback showed me that bacula-dir
>>> was building the dir tree. But it crashed.
>> In our case MySQL used up all the memory and the system started to
>> kill off processes.
> 
> In the above case, it was Bacula that actually ran out of memory, but of 
> course, it could be that MySQL used it all up.
> 
We use Postgres, and Bacula runs out of memory.


>>> The backup machine has 2 GB of RAM and 4 GB of swap. While building
>>> the dir tree, the free physical RAM is only 50 MB and the swap is at
>>> only 50% of its capacity. With "sar -r" I followed the rising memory
>>> allocation of bacula-dir, but in the last 50 minutes it stayed at 50%
>>> and then crashed.
>> Our setup is RedHat Enterprise 4 AS on a dual Opteron with 2 GB RAM
>> and 2 GB swap.
> 
> I've read that RH recommends a swap file (or swap files) twice the physical 
> memory size.  I don't know.
> 
>>> Thanks for your ideas and support!
>> What worked for me was splitting the one backup job of ~15 million
>> files into 2 smaller jobs of ~8 million files and ~7 million files.
>> Now it takes ~20 minutes to build the tree for either job.
> 
> Perhaps in the near future I will need to move the tree code into either a 
> paged file or a memory-mapped file. That would allow much larger numbers of 
> files. Unfortunately, it would also slow down building the tree. Perhaps 
> also, one could add an option that allows the user to select only one or 
> more branches of the tree to be loaded ...
> 
The last sentence (building only one selected branch of the tree) sounds 
very interesting. I think this feature is one of the ones I most want in 
future versions of Bacula!
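A branch-selective build could work roughly like this: filter the file list by a path prefix before inserting anything into the in-memory tree, so only the chosen branch is ever materialized. This is an illustrative sketch only; Bacula's real tree code is C, and `branch` here is just a simple prefix filter.

```python
def build_tree(paths, branch=None):
    """Build a nested-dict directory tree, optionally keeping one branch.

    Illustrative sketch, not Bacula's actual implementation: files whose
    path does not start with `branch` are skipped before insertion, so
    memory use scales with the selected branch, not the whole job.
    """
    tree = {}
    for path in paths:
        if branch is not None and not path.startswith(branch):
            continue  # file lies outside the selected branch: never loaded
        node = tree
        for part in path.strip("/").split("/"):
            node = node.setdefault(part, {})
    return tree

# Only /home is loaded; /var never enters memory.
files = ["/home/lars/a.txt", "/home/lars/b.txt", "/var/log/messages"]
print(build_tree(files, branch="/home"))
```

For a 12-million-file job where the user only needs one directory, this kind of filtering would shrink the tree (and the build time) to the size of that branch.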

Thanks for your support

Lars

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
