Re: [Bacula-users] Restoring large directory does not work
Thanks Martin,

You have put a good closure on the quest for knowledge. If I upgrade Bacula, will I have to upgrade the database? That is, do I have to run the update table scripts? I am on postgresql version 8.29.

Yudhvir

> OK, this shows why it is slow. The algorithm in add_findex is only efficient when called with consecutive index values (the third number printed). The code for restore all in 2.4.4 doesn't do that, so it can take a very long time to complete.
>
> This was fixed in a later version, so I think the best solution is to upgrade Bacula.
>
> __Martin

-- ___ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users
Re: [Bacula-users] Restoring large directory does not work
On Fri, 26 Jun 2009 11:53:26 -0700, mehma sarja said:

> Thanks Martin, You have put a good closure on the quest for knowledge. If I upgrade Bacula, will I have to upgrade the database? Meaning do I have to run those update table scripts. I am on postgresql version 8.29.

Sorry, I don't know. Check the version table in the catalog. The latest Bacula uses version 11, so if your version table shows the same number, there should be no need to run the update scripts.

__Martin
Re: [Bacula-users] Restoring large directory does not work
On Wed, 24 Jun 2009 13:59:26 -0700, mehma sarja said:

> Thanks for all your help you guys. I am impressed with the level of expertise here!
>
> > > Error accessing memory address 0x7fbff000: Bad address.
> > > #0 0x0040c043 in add_findex ()
> >
> > The function add_findex is interesting, but I think like your bacula-dir was [...]
> > Try the following gdb commands (I assume you are running 64-bit FreeBSD):
> >
> >     break *add_findex
> >     commands
> >     printf "arguments: %x %x %x\n", $rdi, $rsi, $rdx
> >     end
> >     continue
> >
> > When it stops, enter the continue command again and time how long it takes before it stops again. Do this a few times and post the results (including the "arguments:" output).
>
> Yes, it is FreeBSD 64 bit. The continue command comes right back with these arguments:
>
>     Breakpoint 1, 0x0040bfc0 in add_findex ()
>     arguments: 1b17068 a0 5fe00b
>     arguments: 1b17068 a0 5fe00b
>     arguments: 1b17068 a0 5fe00b
>     arguments: 1b17068 a0 5fe00b
>     (gdb) continue
>     Continuing.
>
>     Breakpoint 1, 0x0040bfc0 in add_findex ()
>     arguments: 1b17068 a0 5fe039
>     arguments: 1b17068 a0 5fe039
>     arguments: 1b17068 a0 5fe039
>     arguments: 1b17068 a0 5fe039
>     (gdb) continue
>     Continuing.
>
>     Breakpoint 1, 0x0040bfc0 in add_findex ()
>     arguments: 1b17068 a0 5fe055
>     arguments: 1b17068 a0 5fe055
>     arguments: 1b17068 a0 5fe055
>     arguments: 1b17068 a0 5fe055
>     (gdb) continue
>     Continuing.
>
>     Breakpoint 1, 0x0040bfc0 in add_findex ()
>     arguments: 1b17068 a0 5fe060
>     arguments: 1b17068 a0 5fe060
>     arguments: 1b17068 a0 5fe060
>     arguments: 1b17068 a0 5fe060
>     (gdb) continue
>     Continuing.
>
>     Breakpoint 1, 0x0040bfc0 in add_findex ()
>     arguments: 1b17068 a0 5fe071
>     arguments: 1b17068 a0 5fe071
>     arguments: 1b17068 a0 5fe071
>     arguments: 1b17068 a0 5fe071
>     (gdb) continue
>     Continuing.
>
>     Breakpoint 1, 0x0040bfc0 in add_findex ()
>     arguments: 1b17068 a0 5fe079
>     arguments: 1b17068 a0 5fe079
>     arguments: 1b17068 a0 5fe079
>     arguments: 1b17068 a0 5fe079
>     (gdb) continue
>     Continuing.
>
>     Breakpoint 1, 0x0040bfc0 in add_findex ()
>     arguments: 1b17068 a0 5fe0ac
>     arguments: 1b17068 a0 5fe0ac
>     arguments: 1b17068 a0 5fe0ac
>     arguments: 1b17068 a0 5fe0ac

OK, this shows why it is slow. The algorithm in add_findex is only efficient when called with consecutive index values (the third number printed). The code for restore all in 2.4.4 doesn't do that, so it can take a very long time to complete.

This was fixed in a later version, so I think the best solution is to upgrade Bacula.

__Martin
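To illustrate the point about consecutive index values, here is a minimal Python sketch. It is not the Bacula source: it assumes the selected file indexes are kept as a list of [first, last] ranges that is scanned from the start on every call, and the names add_findex_sketch and total_steps are hypothetical. Under that assumption, consecutive values keep extending a single range, while values with gaps add a new range each time and force every later call to scan all of them.

```python
def add_findex_sketch(ranges, findex):
    """Insert findex into a list of [first, last] ranges.

    Hypothetical illustration only, not Bacula's add_findex.
    Returns the number of list entries visited or created.
    """
    steps = 0
    for r in ranges:
        steps += 1
        if r[0] <= findex <= r[1]:
            return steps              # already covered by this range
        if findex == r[1] + 1:
            r[1] = findex             # consecutive value: extend the range
            return steps
    ranges.append([findex, findex])   # gap: start a new range at the end
    return steps + 1


def total_steps(indexes):
    """Total work for inserting every index in order."""
    ranges = []
    return sum(add_findex_sketch(ranges, i) for i in indexes)


if __name__ == "__main__":
    n = 1000
    print("consecutive:", total_steps(range(1, n + 1)))        # grows like n
    print("with gaps:  ", total_steps(range(1, 2 * n + 1, 2)))  # grows like n*n
```

With 1,000 indexes the gap case already does several hundred times more work than the consecutive case in this sketch; with millions of files the difference would be far larger, which is consistent with a restore that pegs one CPU for hours without touching the disk.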
Re: [Bacula-users] Restoring large directory does not work
On Tue, 23 Jun 2009 16:09:57 -0700, mehma sarja said:

> > Did you wait till the cpu went back to low cpu usage?
>
> No, it stays high overnight and my patience runs out before cpu pegging does.

I suggest attaching gdb to the bacula-dir process to see what it is doing, e.g.

    thread apply all bt

Then detach gdb, let it run some more, and do the above again to see how it differs. You might need to build a debugging version of Bacula to get useful backtraces.

__Martin
Re: [Bacula-users] Restoring large directory does not work
I got into gdb but know very little about how to move around in there. I tried:

    [r...@lucifer ~]# gdb /usr/local/sbin/bacula-dir 27410
    GNU gdb 6.1.1 [FreeBSD]
    This GDB was configured as amd64-marcel-freebsd...(no debugging symbols found)...
    Attaching to program: /usr/local/sbin/bacula-dir, process 27410
    Reading symbols from /usr/local/lib/libpq.so.5...(no debugging symbols found)...done.
    Loaded symbols for /usr/local/lib/libpq.so.5
    Reading symbols from /lib/libcrypt.so.4...(no debugging symbols found)...done.
    Loaded symbols for /lib/libcrypt.so.4
    Reading symbols from /lib/libthr.so.3...(no debugging symbols found)...done.
    [New Thread 0x801a10300 (LWP 100334)]
    [New Thread 0x801902600 (LWP 100329)]
    [New Thread 0x801902480 (LWP 100192)]
    [New Thread 0x801902180 (LWP 100350)]
    Loaded symbols for /lib/libthr.so.3
    Reading symbols from /usr/local/lib/libintl.so.8...(no debugging symbols found)...done.
    Loaded symbols for /usr/local/lib/libintl.so.8
    Reading symbols from /usr/lib/libwrap.so.5...(no debugging symbols found)...done.
    Loaded symbols for /usr/lib/libwrap.so.5
    Reading symbols from /usr/local/lib/libiconv.so.3...(no debugging symbols found)...done.
    Loaded symbols for /usr/local/lib/libiconv.so.3
    Reading symbols from /usr/lib/libssl.so.5...(no debugging symbols found)...done.
    Loaded symbols for /usr/lib/libssl.so.5
    Reading symbols from /lib/libcrypto.so.5...(no debugging symbols found)...done.
    Loaded symbols for /lib/libcrypto.so.5
    Reading symbols from /usr/lib/libstdc++.so.6...(no debugging symbols found)...done.
    Loaded symbols for /usr/lib/libstdc++.so.6
    Reading symbols from /lib/libm.so.5...(no debugging symbols found)...done.
    Loaded symbols for /lib/libm.so.5
    Reading symbols from /lib/libgcc_s.so.1...(no debugging symbols found)...done.
    Loaded symbols for /lib/libgcc_s.so.1
    Reading symbols from /lib/libc.so.7...(no debugging symbols found)...done.
    Loaded symbols for /lib/libc.so.7
    Reading symbols from /libexec/ld-elf.so.1...(no debugging symbols found)...done.
    Loaded symbols for /libexec/ld-elf.so.1
    [Switching to Thread 0x801a10300 (LWP 100334)]
    0x0040c043 in add_findex ()
Re: [Bacula-users] Restoring large directory does not work
    (gdb) thread apply all bt

    Thread 4 (Thread 0x801902180 (LWP 100350)):
    #0 0x0008016f98cc in nanosleep () from /lib/libc.so.7
    #1 0x0008009078c5 in nanosleep () from /lib/libthr.so.3
    #2 0x0044e21e in bmicrosleep ()
    #3 0x0042408d in wait_for_next_job ()
    #4 0x00408a3c in main ()

    Thread 3 (Thread 0x801902480 (LWP 100192)):
    #0 0x000801715afc in select () from /lib/libc.so.7
    #1 0x0008009074d4 in select () from /lib/libthr.so.3
    #2 0x0044f9b2 in bnet_thread_server ()
    #3 0x00438ba8 in connect_thread ()
    #4 0x000800908a27 in pthread_getprio () from /lib/libthr.so.3
    #5 0x in ?? ()
    Error accessing memory address 0x7fbff000: Bad address.
    #0 0x0040c043 in add_findex ()
Re: [Bacula-users] Restoring large directory does not work
On Wed, 24 Jun 2009 10:41:03 -0700, mehma sarja said:

> (gdb) thread apply all bt
>
> Thread 4 (Thread 0x801902180 (LWP 100350)):
> #0 0x0008016f98cc in nanosleep () from /lib/libc.so.7
> #1 0x0008009078c5 in nanosleep () from /lib/libthr.so.3
> #2 0x0044e21e in bmicrosleep ()
> #3 0x0042408d in wait_for_next_job ()
> #4 0x00408a3c in main ()
>
> Thread 3 (Thread 0x801902480 (LWP 100192)):
> #0 0x000801715afc in select () from /lib/libc.so.7
> #1 0x0008009074d4 in select () from /lib/libthr.so.3
> #2 0x0044f9b2 in bnet_thread_server ()
> #3 0x00438ba8 in connect_thread ()
> #4 0x000800908a27 in pthread_getprio () from /lib/libthr.so.3
> #5 0x in ?? ()
> Error accessing memory address 0x7fbff000: Bad address.
> #0 0x0040c043 in add_findex ()

The function add_findex is interesting, but it looks like your bacula-dir was compiled without debugging info, so it is difficult to see what is happening.

Try the following gdb commands (I assume you are running 64-bit FreeBSD):

    break *add_findex
    commands
    printf "arguments: %x %x %x\n", $rdi, $rsi, $rdx
    end
    continue

This sets a breakpoint at the start of the function add_findex that prints the arguments, and then starts Bacula running again. It should stop in gdb when it reaches the beginning of add_findex again (with a message like "Breakpoint 1...in add_findex...").

When it stops, enter the continue command again and time how long it takes before it stops again. Do this a few times and post the results (including the "arguments:" output).

__Martin
Re: [Bacula-users] Restoring large directory does not work
Thanks for all your help you guys. I am impressed with the level of expertise here!

> > Error accessing memory address 0x7fbff000: Bad address.
> > #0 0x0040c043 in add_findex ()
>
> The function add_findex is interesting, but I think like your bacula-dir was [...]
> Try the following gdb commands (I assume you are running 64-bit FreeBSD):
>
>     break *add_findex
>     commands
>     printf "arguments: %x %x %x\n", $rdi, $rsi, $rdx
>     end
>     continue
>
> When it stops, enter the continue command again and time how long it takes before it stops again. Do this a few times and post the results (including the "arguments:" output).

Yes, it is FreeBSD 64 bit. The continue command comes right back with these arguments:

    Breakpoint 1, 0x0040bfc0 in add_findex ()
    arguments: 1b17068 a0 5fe00b
    arguments: 1b17068 a0 5fe00b
    arguments: 1b17068 a0 5fe00b
    arguments: 1b17068 a0 5fe00b
    (gdb) continue
    Continuing.

    Breakpoint 1, 0x0040bfc0 in add_findex ()
    arguments: 1b17068 a0 5fe039
    arguments: 1b17068 a0 5fe039
    arguments: 1b17068 a0 5fe039
    arguments: 1b17068 a0 5fe039
    (gdb) continue
    Continuing.

    Breakpoint 1, 0x0040bfc0 in add_findex ()
    arguments: 1b17068 a0 5fe055
    arguments: 1b17068 a0 5fe055
    arguments: 1b17068 a0 5fe055
    arguments: 1b17068 a0 5fe055
    (gdb) continue
    Continuing.

    Breakpoint 1, 0x0040bfc0 in add_findex ()
    arguments: 1b17068 a0 5fe060
    arguments: 1b17068 a0 5fe060
    arguments: 1b17068 a0 5fe060
    arguments: 1b17068 a0 5fe060
    (gdb) continue
    Continuing.

    Breakpoint 1, 0x0040bfc0 in add_findex ()
    arguments: 1b17068 a0 5fe071
    arguments: 1b17068 a0 5fe071
    arguments: 1b17068 a0 5fe071
    arguments: 1b17068 a0 5fe071
    (gdb) continue
    Continuing.

    Breakpoint 1, 0x0040bfc0 in add_findex ()
    arguments: 1b17068 a0 5fe079
    arguments: 1b17068 a0 5fe079
    arguments: 1b17068 a0 5fe079
    arguments: 1b17068 a0 5fe079
    (gdb) continue
    Continuing.

    Breakpoint 1, 0x0040bfc0 in add_findex ()
    arguments: 1b17068 a0 5fe0ac
    arguments: 1b17068 a0 5fe0ac
    arguments: 1b17068 a0 5fe0ac
    arguments: 1b17068 a0 5fe0ac
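As a quick check of the numbers in this output, the small Python snippet below converts the third argument of each breakpoint hit from hex and prints the gap between successive calls. It is only an illustration added for readers of the thread: the list is copied from the output above, and index_gaps is a hypothetical helper name.

```python
# Third argument printed at each breakpoint hit, copied from the gdb output.
third_args_hex = ["5fe00b", "5fe039", "5fe055", "5fe060",
                  "5fe071", "5fe079", "5fe0ac"]


def index_gaps(hex_values):
    """Return the difference between each value and the one before it."""
    values = [int(h, 16) for h in hex_values]
    return [b - a for a, b in zip(values, values[1:])]


gaps = index_gaps(third_args_hex)
print(gaps)  # [46, 28, 11, 17, 8, 51]
```

Every gap is larger than 1, so none of these successive calls passes a value that directly follows the previous one.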
[Bacula-users] Restoring large directory does not work
Trying to restore files using bconsole:

    * restore client=client1-fd fileset=Client1-Fileset select current all done.

It does the 'select', 'current', and 'all' but sits there on the 'done' part. I have left it like this overnight with no change in status. My setup is Bacula 2.4.4 DIR and SD on a FreeBSD 7.1.

    +-------+-------+------------+-----------------+---------------------+-------------+
    | jobid | level | jobfiles   | jobbytes        | starttime           | volumename  |
    +-------+-------+------------+-----------------+---------------------+-------------+
    | 160   | F     | 11,600,468 | 371,831,421,845 | 2009-06-17 14:15:37 | Volumes0004 |
    +-------+-------+------------+-----------------+---------------------+-------------+
    You have selected the following JobId: 160

    Building directory tree for JobId 160 ... +
    1 Job, 11,415,174 files inserted into the tree and marked for extraction.

and nothing more.

Anyone have any idea what I am doing wrong? Can I compile a newer bacula and connect to the current catalog database and try the restore again? Is there another way to restore?

Yudhvir
Re: [Bacula-users] Restoring large directory does not work
2009/6/23 mehma sarja mehmasa...@gmail.com:

> Trying to restore files using bconsole:
>
>     * restore client=client1-fd fileset=Client1-Fileset select current all done.
>
> It does the 'select', 'current', and 'all' but sits there on the 'done' part. I have left it like this overnight with no change in status. My setup is Bacula 2.4.4 DIR and SD on a FreeBSD 7.1.
>
>     +-------+-------+------------+-----------------+---------------------+-------------+
>     | jobid | level | jobfiles   | jobbytes        | starttime           | volumename  |
>     +-------+-------+------------+-----------------+---------------------+-------------+
>     | 160   | F     | 11,600,468 | 371,831,421,845 | 2009-06-17 14:15:37 | Volumes0004 |
>     +-------+-------+------------+-----------------+---------------------+-------------+
>     You have selected the following JobId: 160
>
>     Building directory tree for JobId 160 ... +
>     1 Job, 11,415,174 files inserted into the tree and marked for extraction.
>
> and nothing more
>
> Anyone have any idea what I am doing wrong? Can I compile a newer bacula and connect to the current catalog database and try the restore again? Is there another way to restore?

Are you running out of memory on the director, database or client machine?

John
Re: [Bacula-users] Restoring large directory does not work
John,

The dir and database are on the same machine and memory is not a problem. I tried a partial restore: it restores files but not recursively, meaning no subdirectories. Then I tried restoring the subdirectory. It gets that too, but no sub-subdirectories.

Yudhvir
Re: [Bacula-users] Restoring large directory does not work
Although the CPU is pegged at 100%.

Yudhvir

> The dir and database are on the same machine and memory is not a problem.
Re: [Bacula-users] Restoring large directory does not work
> Did you wait till the cpu went back to low cpu usage?

No, it stays high overnight and my patience runs out before cpu pegging does.

> Depending on your configuration and optimization of your database this could take anywhere from a few minutes to a few hours to finish. I assume the disk / array is thrashing during this time?
>
> John

There is no disk activity - zip. I do see:

    load averages: 0.99, 0.97, 0.92
    CPU: 24.8% user, 0.0% nice, 0.0% system, 0.1% interrupt, 75.1% idle
    Mem: 1650M Active, 1544M Inact, 832M Wired, 226M Cache, 214M Buf, 3647M Free
    Swap: 4096M Total, 252K Used, 4096M Free

      PID USERNAME THR PRI NICE   SIZE    RES STATE  C   TIME    WCPU COMMAND
    27410 bacula     4 118    0  1607M  1589M CPU1   1 101:59 100.00% bacula-dir
    27484 pgsql      1   4    0 54668K 37488K sbwait 0   1:20   0.00% postgres

Another interesting thing is that it is doing involuntary context switching like so:

      PID USERNAME VCSW IVCSW READ WRITE FAULT TOTAL PERCENT COMMAND
    27410 bacula      0    32    0     0     0     0   0.00% bacula-dir
    27484 pgsql       0     0    0     0     0     0   0.00% postgres
Re: [Bacula-users] Restoring large directory does not work
I'm pretty sure that a PostgreSQL server running with so little memory

    27484 pgsql      1   4    0 54668K 37488K sbwait 0   1:20   0.00% postgres

cannot give sufficient throughput. 54 MB tends to indicate the default values of a deb/rpm installation, which are very low.

While the restore is running, you can find the query being executed with console tools or pgAdmin. You can then copy it and run it directly against the PostgreSQL server to measure how long it takes.

I think you need to tweak the PostgreSQL config a bit. Information is available in the list archive, the wiki, and PostgreSQL-related websites.

mehma sarja wrote:

> > Did you wait till the cpu went back to low cpu usage?
>
> No, it stays high overnight and my patience runs out before cpu pegging does.
>
> > Depending on your configuration and optimization of your database this could take anywhere from a few minutes to a few hours to finish. I assume the disk / array is thrashing during this time?
> >
> > John
>
> There is no disk activity - zip. I do see:
>
>     load averages: 0.99, 0.97, 0.92
>     CPU: 24.8% user, 0.0% nice, 0.0% system, 0.1% interrupt, 75.1% idle
>     Mem: 1650M Active, 1544M Inact, 832M Wired, 226M Cache, 214M Buf, 3647M Free
>     Swap: 4096M Total, 252K Used, 4096M Free
>
>       PID USERNAME THR PRI NICE   SIZE    RES STATE  C   TIME    WCPU COMMAND
>     27410 bacula     4 118    0  1607M  1589M CPU1   1 101:59 100.00% bacula-dir
>     27484 pgsql      1   4    0 54668K 37488K sbwait 0   1:20   0.00% postgres
>
> Another interesting thing is that it is doing involuntary context switching like so:
>
>       PID USERNAME VCSW IVCSW READ WRITE FAULT TOTAL PERCENT COMMAND
>     27410 bacula      0    32    0     0     0     0   0.00% bacula-dir
>     27484 pgsql       0     0    0     0     0     0   0.00% postgres

--
Bruno Friedmann