Found it :)

I ran a test with a few other servers, and two other servers with 630k
files took only 35 seconds to build the tree, so the problem was
specific to our "erp" server.

We found that the problem lies with the large number of sequentially
named files (XXX0000 - XXX9999) that this server holds:
[erp]jvc> find . -name 'JVCREC.5.00000*' | wc -l
  23517
and this server holds more series of files like this one.

I ran a test on my own laptop on my /usr/ directory (about 100,000 files)
and building the tree took only 3.8 sec.
After adding a directory /usr/test/ with the files JVC.5.000000000 -
JVC.5.000099999, building the tree took 5 min 56 sec.
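
For anyone who wants to reproduce this, here is a minimal sketch of how such a test directory could be filled. It is not the script I used; the function names are made up for illustration, and it assumes a C++17 compiler with std::filesystem. The files are empty, since only the names matter for building the tree.

```cpp
#include <cassert>
#include <cstdio>
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Build a name such as "JVC.5.000000042": the prefix plus a
// 9-digit, zero-padded index (same shape as the test files above).
std::string sequential_name(int index) {
    assert(index >= 0);
    char buf[32];
    std::snprintf(buf, sizeof buf, "JVC.5.%09d", index);
    return buf;
}

// Create `count` empty files with sequential names inside `dir`.
// Returns the number of files that could be opened for writing.
int make_sequential_files(const fs::path& dir, int count) {
    fs::create_directories(dir);
    int created = 0;
    for (int i = 0; i < count; ++i) {
        std::ofstream out(dir / sequential_name(i));
        if (out) ++created;
    }
    return created;
}
```

Calling make_sequential_files("/usr/test", 100000) from a small main() would give the JVC.5.000000000 - JVC.5.000099999 range; the directory then has to be backed up before the restore tree can be timed.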

We first tried sorting the output from the database (a small patch to
src/dird/sql_cmds.c), and that sped up the tree building for erp from
50 min to 5 min 21 sec.
The problem was that the tree building for the other servers went from 35
sec to 8.5 min, so that wasn't the best solution.

Here are some comments from Stephan Leemburg about the problem:
---
In bacula/src/lib there are 2 sources: tree.c and dlist.c.

Within tree.c the dlist class is used. As dlist implements a doubly
linked list, inserting a large quantity of unordered, almost identical
filenames leads to an enormous degradation of performance. The
optimization that looks at the tail and the head of the list does not work
if the filenames are presented unordered, so the list has to be
searched. The simulation of a binary search on the dlist does not
improve the performance much in the case where a lot of filenames only
differ in the last couple of characters.

It would be better to use a balanced binary search tree, like an AVL
tree. Maybe using the code from libavl (http://www.stanford.edu/~blp/avl/)
can help here.
---
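
To illustrate Stephan's point, here is a rough sketch that counts string comparisons when the same unordered names go into a sorted doubly linked list versus a balanced tree. This is not Bacula's dlist code: the list side uses a plain scan from the head (no head/tail shortcut, no simulated binary search), and the tree side uses std::set, which is typically a red-black tree rather than an AVL tree. All names in it are made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <list>
#include <random>
#include <set>
#include <string>
#include <vector>

// Comparator that counts every string comparison it performs.
struct CountingLess {
    std::size_t* counter;
    bool operator()(const std::string& a, const std::string& b) const {
        ++*counter;
        return a < b;
    }
};

// Hypothetical test input: `count` near-identical names in shuffled order.
std::vector<std::string> scrambled_names(int count) {
    assert(count > 0);
    std::vector<std::string> names;
    names.reserve(static_cast<std::size_t>(count));
    for (int i = 0; i < count; ++i) {
        char buf[32];
        std::snprintf(buf, sizeof buf, "JVC.5.%09d", i);
        names.emplace_back(buf);
    }
    std::mt19937 rng(42);  // fixed seed so runs are repeatable
    std::shuffle(names.begin(), names.end(), rng);
    return names;
}

// Keep a doubly linked list sorted by scanning from the head for each
// insert position. Returns the total number of comparisons.
std::size_t list_insert_comparisons(const std::vector<std::string>& names) {
    std::size_t comparisons = 0;
    CountingLess less{&comparisons};
    std::list<std::string> sorted;
    for (const auto& name : names) {
        auto pos = sorted.begin();
        while (pos != sorted.end() && less(*pos, name)) ++pos;
        sorted.insert(pos, name);
    }
    return comparisons;
}

// Insert the same names into a balanced search tree (std::set).
// Returns the total number of comparisons.
std::size_t tree_insert_comparisons(const std::vector<std::string>& names) {
    std::size_t comparisons = 0;
    std::set<std::string, CountingLess> tree(CountingLess{&comparisons});
    for (const auto& name : names) tree.insert(name);
    return comparisons;
}
```

Printing both counts for scrambled_names(100000) from a small main() should show the gap: the list cost grows roughly with the square of the number of names, the tree cost roughly with n log n. Because the names share a long common prefix, each single comparison is also relatively expensive.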

Hope this all helps solve the problem.

Reinier


Kern Sibbald wrote:
> Hello,
> 
> I just did a "restore" on a 400MHz PII with 250MB main memory that was in the 
> process of completing my nightly backups.  That is to say, it was pretty busy 
> already. 
> 
> From the restore command until the tree prompt it took 1 minute 43 seconds.
> This was for something like 475,000 files in the tree.  I am using MySQL
> 4.12 on the same machine. The SD and Catalog reside on the same machine as
> the Director (the machine noted above). I'm running on FC4 fully updated
> with g++ 4.0.1.
> 
> I suspect that you have some serious problem somewhere, but I don't know 
> where. I leave it to you to do the research and let us know.
> 
> 
> 
> On Monday 08 August 2005 13:26, Reinier Haasjes wrote:
> 
>>Hi all,
>>
>>Kern Sibbald wrote:
>>
>>>To answer your question, you will need to do some timing.  See below.
>>>
>>>On Monday 08 August 2005 09:37, Reinier Haasjes wrote:
>>>
>>>>Hi, sorry for the late reply, weekend.
>>
>>>>I did the test again like you did it (with the time command) and the
>>>>result is as follows:
>>>>
>>>>-bash-2.05b$  time echo "restore jobid=1553,1561,1576,1598,1607,1617" |bconsole
>>>>
>>>>Connecting to Director tapeserver:9101
>>>>1000 OK: tapeserver-dir Version: 1.37.30 (14 July 2005)
>>>>Enter a period to cancel a command.
>>>>restore jobid=1553,1561,1576,1598,1607,1617
>>>>Using default Catalog name=MyCatalog DB=bacula
>>>>You have selected the following JobIds: 1553,1561,1576,1598,1607,1617
>>>
>>>  The time between printing the above line and printing the following
>>>line from your email is
>>>  pure database time.  After that, it is a bit of both, with probably 10%
>>>  database and 90% putting the records in memory.
>>>
>>>  So a good cut of DB vs Bacula memory time would be measured here.
>>
>>Here are some timings (I hope this is what you meant). I restarted bacula and
>>mysql, so all the CPU time shown was used for building the bacula directory tree.
>>Before:
>>PID, TIME, COMMAND
>>20457 00:00:00 /bin/sh /usr/bin/mysqld_safe
>>20493 00:00:00 /usr/sbin/mysqld
>>20579 00:00:00 /opt/bacula-1.37.30/sbin/bacula-sd
>>20583 00:00:00 /opt/bacula-1.37.30/sbin/bacula-fd
>>20588 00:00:00 /opt/bacula-1.37.30/sbin/bacula-dir
>>
>>After:
>>20457 00:00:00 /bin/sh /usr/bin/mysqld_safe
>>20493 00:00:08 /usr/sbin/mysqld
>>20579 00:00:00 /opt/bacula-1.37.30/sbin/bacula-sd
>>20583 00:00:00 /opt/bacula-1.37.30/sbin/bacula-fd
>>20588 00:42:48 /opt/bacula-1.37.30/sbin/bacula-dir
>>
>>As you can see almost all the cputime is taken by bacula and the DB only
>>takes 8 seconds of cpu-time. So I think the problem lies in putting the
>>records into memory.
>>
>>
>>>  You might check that you *really* have the indexes that are defined in
>>>the 1.37 src/cats/create_xx_databases files.  Perhaps you are missing an
>>>index or two because you upgraded from an older version, or if you
>>>fiddled with your database, the old indexes could have been dropped.
>>>
>>>   See your vendor's manual for how to see which indexes exist.
>>
>>checked all the indexes and they *all* exist.
>>
>>This part of the restore takes (almost) no time. The next line, "Building
>>directory tree for JobId 1553", takes the most time (about 50 minutes);
>>'writing' all the + signs is what takes this long.
>>
>>
>>>>Building directory tree for JobId 1553 ...
>>>>+++++++++++++++++++++++++++++++++++++++++++++++++
>>
>>the next 5 lines take about 5 minutes in total.
>>
>>
>>Thanks, Reinier
>>
>>
>>>>Building directory tree for JobId 1561 ...
>>>>Building directory tree for JobId 1576 ...
>>>>Building directory tree for JobId 1598 ...  +
>>>>Building directory tree for JobId 1607 ...
>>>>Building directory tree for JobId 1617 ...
>>>>6 Jobs, 429,836 files inserted into the tree.
>>>>
>>>>You are now entering file selection mode where you add (mark) and
>>>>remove (unmark) files to be restored. No files are initially added,
>>>>unless you used the "all" keyword on the command line.
>>>>Enter "done" to leave this mode.
>>>>
>>>>cwd is: /
>>>>$
>>>>real    54m48.701s
>>>>user    0m0.031s
>>>>sys     0m0.030s
>>>>
>>>>
>>>>54 minutes on the bacula server (Pentium III 800 MHz, 512 MB RAM) with a
>>>>local database (on a RAID0 vinum (FreeBSD) disk)
>>>>
>>>>the same test on my laptop (Pentium 4, 1.8 GHz, 1 GB RAM):
>>>>[EMAIL PROTECTED]:/opt/bacula/etc # time echo "restore
>>>>jobid=1553,1561,1576,1598,1607,1617" |bconsole
>>>>Connecting to Director penta:9101
>>>>1000 OK: penta-dir Version: 1.37.30 (14 July 2005)
>>>>Enter a period to cancel a command.
>>>>restore jobid=1553,1561,1576,1598,1607,1617
>>>>Using default Catalog name=MyCatalog DB=bacula
>>>>You have selected the following JobIds: 1553,1561,1576,1598,1607,1617
>>>>
>>>>Building directory tree for JobId 1553 ...
>>>>+++++++++++++++++++++++++++++++++++++++++++++++++
>>>>Building directory tree for JobId 1561 ...
>>>>Building directory tree for JobId 1576 ...
>>>>Building directory tree for JobId 1598 ...
>>>>Building directory tree for JobId 1607 ...
>>>>Building directory tree for JobId 1617 ...
>>>>6 Jobs, 424,612 files inserted into the tree.
>>>>
>>>>You are now entering file selection mode where you add (mark) and
>>>>remove (unmark) files to be restored. No files are initially added,
>>>>unless you used the "all" keyword on the command line.
>>>>Enter "done" to leave this mode.
>>>>
>>>>cwd is: /
>>>>$
>>>>real    43m32.431s
>>>>user    0m0.006s
>>>>sys     0m0.010s
>>>>
>>>>So yes, it's a little bit faster, but not as fast as yours (a few minutes).
>>>>
>>>>My question is: which is the bigger problem, the 'slow' database or the
>>>>slow processor/memory combination?
>>>>
>>>>Thanks,
>>>>
>>>>Reinier
>>>>
>>>>Thomas Simmons wrote:
>>>>
>>>>>That seems pretty slow to me. I just did a test and it took 10 seconds
>>>>>to build the tree for ~400,000 files. Like you, I too have an Opteron
>>>>>system, a dual 246 with 1 GB RAM; however, I keep the database on a set of
>>>>>mirrored SATA disks on the local server. Have you tried installing the
>>>>>database on the same server?
>>>>>
>>>>>sioux:~# time echo "restore jobid=1,2,3,4" |bconsole
>>>>>Connecting to Director sioux:9101
>>>>>1000 OK: sioux-dir Version: 1.37.30 (14 July 2005)
>>>>>Enter a period to cancel a command.
>>>>>restore jobid=1,2,3,4
>>>>>Using default Catalog name=MyCatalog DB=bacula
>>>>>You have selected the following JobIds: 1,2,3,4
>>>>>
>>>>>Building directory tree for JobId 1 ...
>>>>>+++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>Building directory tree for JobId 2 ...  ++++++++
>>>>>Building directory tree for JobId 3 ...  ++
>>>>>Building directory tree for JobId 4 ...  ++
>>>>>4 Jobs, 397,356 files inserted into the tree.
>>>>>
>>>>>You are now entering file selection mode where you add (mark) and
>>>>>remove (unmark) files to be restored. No files are initially added,
>>>>>unless you used the "all" keyword on the command line.
>>>>>Enter "done" to leave this mode.
>>>>>
>>>>>cwd is: /
>>>>>$
>>>>>real    0m10.371s
>>>>>user    0m0.001s
>>>>>sys     0m0.006s
>>>>>
>>>>>Thanks,
>>>>>Thomas
>>>>>
>>>>>[EMAIL PROTECTED] wrote:
>>>>>
>>>>>>I was just thinking. With my setup. The new bacula
>>>>>>server is an Opteron 246 server with 4GB of memory and
>>>>>>the database is running on an Athlon 2400 with only
>>>>>>256 MB of memory and it takes (1 to 3) minutes to get
>>>>>>the file list for around 10,000 files with version
>>>>>>1.36.3 and a postgresql database. The hard drive light
>>>>>>on the machine with the database is solid for the whole
>>>>>>time. If I ran 400,000 files, which is 40 times as many
>>>>>>files, it could easily take an hour. I'm thinking it's
>>>>>>time to update my database server...
>>>>>>
>>>>>>John
>>>>>>
>>>>>>--- Reinier Haasjes <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>>>[EMAIL PROTECTED] wrote:
>>>>>>>
>>>>>>>>>Hi,
>>>>>>>>>
>>>>>>>>>I've been using bacula 1.37.30 for a few days now and I
>>>>>>>>>decided to test a full server recovery.
>>>>>>>>>I discovered that building the directory tree
>>>>>>>>>takes a very long time (almost an hour) for
>>>>>>>>>429,836 files.
>>>>>>>>>I started the building of the tree at 11:20 and at
>>>>>>>>>12:15 I got a prompt.
>>>>>>>>>The machine is a dedicated bacula machine and was
>>>>>>>>>doing nothing other than building the tree; it's a
>>>>>>>>>Pentium III at 800 MHz with 512 MB memory.
>>>>>>>>>
>>>>>>>>>My question is whether this is normal for building the
>>>>>>>>>tree, because if one of the servers dies we want to
>>>>>>>>>recover as soon as possible.
>>>>>>>>>
>>>>>>>>>Thanks,
>>>>>>>>>
>>>>>>>>>Reinier
>>>>>>>>>
>>>>>>>>>output bconsole:
>>>>>>>>>----
>>>>>>
>>>>>>>>>+-------+-------+----------+----------------+---------------------+------------+-----------+
>>>>>>>>>| JobId | Level | JobFiles | JobBytes       | StartTime           | VolumeName | StartFile |
>>>>>>>>>+-------+-------+----------+----------------+---------------------+------------+-----------+
>>>>>>>>>| 1,553 | F     |  426,357 | 13,464,836,018 | 2005-07-30 15:40:00 | 000017L1   |       100 |
>>>>>>>>>| 1,561 | I     |       41 |     41,277,841 | 2005-07-31 01:32:52 | 000034L1   |       106 |
>>>>>>>>>| 1,576 | I     |      137 |    101,733,681 | 2005-08-01 01:31:02 | 000034L1   |       116 |
>>>>>>>>>| 1,598 | I     |    2,315 |    101,235,058 | 2005-08-03 02:09:03 | 000034L1   |       146 |
>>>>>>>>>| 1,607 | I     |    1,641 |    220,426,382 | 2005-08-04 02:03:39 | 000034L1   |       191 |
>>>>>>>>>| 1,617 | I     |    1,345 |    100,591,549 | 2005-08-05 01:51:58 | 000035L1   |        28 |
>>>>>>>>>+-------+-------+----------+----------------+---------------------+------------+-----------+
>>>>>>>>>You have selected the following JobIds: 1553,1561,1576,1598,1607,1617
>>>>>>>>>
>>>>>>>>>Building directory tree for JobId 1553 ...
>>>>>>>>>+++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>Building directory tree for JobId 1561 ...
>>>>>>>>>Building directory tree for JobId 1576 ...
>>>>>>>>>Building directory tree for JobId 1598 ...  +
>>>>>>>>>Building directory tree for JobId 1607 ...
>>>>>>>>>Building directory tree for JobId 1617 ...
>>>>>>>>>6 Jobs, 429,836 files inserted into the tree.
>>>>>>>>>----
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>Output top (during building tree):
>>>>>>>>>----
>>>>>>>>>32 processes:  2 running, 30 sleeping
>>>>>>>>>CPU states: 98.1% user,  0.0% nice,  0.4% system, 1.6% interrupt,  0.0% idle
>>>>>>>>>Mem: 131M Active, 36M Inact, 89M Wired, 60M Buf, 242M Free
>>>>>>>>>Swap: 1020M Total, 120K Used, 1020M Free
>>>>>>>>>
>>>>>>>>>  PID USERNAME PRI NICE  SIZE    RES STATE    TIME   WCPU    CPU COMMAND
>>>>>>>>>91877 root      64   0 88948K 87612K RUN    269:36 98.39% 98.39% bacula-dir
>>>>>>>>>93618 root      29   0  1908K  1072K RUN      0:00  2.96%  0.54% top
>>>>>>>>>----
>>>>>>
>>>>>>>>>-------------------------------------------------------
>>>>>>>>>SF.Net email is Sponsored by the Better Software
>>>>>>>>>Conference & EXPO
>>>>>>>>>September 19-22, 2005 * San Francisco, CA *
>>>>>>>>>Development Lifecycle Practices
>>>>>>>>>Agile & Plan-Driven Development * Managing Projects
>>>>>>>>>& Teams * Testing & QA
>>>>>>>>>Security * Process Improvement & Measurement *
>>>>>>>>>http://www.sqe.com/bsce5sf
>>>>>>>>>_______________________________________________
>>>>>>>>>Bacula-users mailing list
>>>>>>>>>Bacula-users@lists.sourceforge.net
>>>>>>>>>https://lists.sourceforge.net/lists/listinfo/bacula-users
>>>>>>
>>>>>>
>>>>>>>>You are restoring 14GB of data, right? For most tape
>>>>>>>>drives this is not a long time. For me a 40GB DLT-IV
>>>>>>>>(native) tape takes 3 to 4 hours to restore. The drive
>>>>>>>>has a 3MB/s data rate, which is about 11GB / hour
>>>>>>>>(native). And that is as fast as it will go. If you
>>>>>>>>manage to get a fileset that is highly compressible
>>>>>>>>you can get better times. But with most of my data I
>>>>>>>>get nowhere near a compression rate of 2.0.
>>>>>>>>John
>>>>>>>
>>>>>>>I'm not talking about the actual recovery itself
>>>>>>>(data transfer) but
>>>>>>>about building the directory tree (before the 'mark
>>>>>>>*' command).
>>>>>>>
>>>>>>>The actual restore takes 'only' 2:35 hours.
>>>>>>>
>>>>>>>Reinier
>>>>>>
>>>>
> 
> 


