Hello,

It appears that it is dying in the posix_fadvise() OS call. The call is in <bacula-source>/src/lib/bsock.c at line 492. All the arguments to the call are valid, so it would appear that something in posix_fadvise() itself (the OS, I think) is broken.
You can test this theory by disabling the posix_fadvise() call: put // at the beginning of the line and rebuild. You might also want to disable the posix_fadvise() calls in src/findlib/bfile.c (two of them) and in src/stored/spool.c (one). You can disable them all *after* doing a ./configure by editing src/config.h and commenting out the #define HAVE_POSIX_FADVISE line.

Actually, the crash is happening at the beginning of despooling attribute data.

Regards,

Kern

On Sunday 16 September 2007 13:53, Marc Schiffbauer wrote:
> * Kern Sibbald schrieb am 16.09.07 um 08:59 Uhr:
> > Please read the Kaboom chapter of the manual. It will explain how to
> > manually run the program under the debugger. I believe you left off the
> > -s -f options when running it so the traceback doesn't contain any useful
> > information.
>
> Ok thanks for the hint. Here is the traceback.
>
> Crash happens right after a job:
>
> *m
> 16-Sep 13:41 lisa-sd: Job write elapsed time = 00:02:37, Transfer rate = 4.489 M bytes/second
> 16-Sep 13:41 lisa-sd: Sending spooled attrs to the Director. Despooling 12,242 bytes ...
> *
>
> ... then kaboom
>
> [EMAIL PROTECTED]:/usr/sbin# gdb /usr/sbin/bacula-sd
> GNU gdb 6.3-debian
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "i386-linux"...Using host libthread_db library "/lib/libthread_db.so.1".
>
> (gdb) run -s -f -c /etc/bacula/bacula-sd.conf
> Starting program: /usr/sbin/bacula-sd -s -f -c /etc/bacula/bacula-sd.conf
> [Thread debugging using libthread_db enabled]
> [New Thread 16384 (LWP 14910)]
> [New Thread 32769 (LWP 14912)]
> [New Thread 16386 (LWP 14913)]
> [New Thread 32771 (LWP 14914)]
> [New Thread 49156 (LWP 14965)]
> [Thread 16386 (LWP 14913) exited]
> [New Thread 65541 (LWP 14975)]
> [Thread 65541 (LWP 14975) exited]
> [New Thread 81926 (LWP 14976)]
> [Thread 81926 (LWP 14976) exited]
> [New Thread 98311 (LWP 14994)]
> [Thread 98311 (LWP 14994) exited]
> [New Thread 114696 (LWP 14996)]
> [Thread 114696 (LWP 14996) exited]
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 49156 (LWP 14965)]
> 0x080e0018 in ?? ()
> (gdb) thread apply all bt
>
> Thread 5 (Thread 49156 (LWP 14965)):
> #0 0x080e0018 in ?? ()
> #1 0x00000000 in ?? ()
> #2 0x00000000 in ?? ()
> #3 0x0808751b in BSOCK::despool (this=0x80e0018, update_attr_spool_size=0x807df90 <update_attr_spool_size>, tsize=12242) at bsock.c:492
> #4 0x0807e15b in commit_attribute_spool (jcr=0x80e0368) at spool.c:636
> #5 0x08053dff in do_append_data (jcr=0x80e0368) at append.c:334
> #6 0x080691f8 in append_data_cmd (jcr=0x80e0368) at fd_cmds.c:194
> #7 0x08069131 in do_fd_commands (jcr=0x80e0368) at fd_cmds.c:165
> #8 0x08068f40 in run_job (jcr=0x80e0368) at fd_cmds.c:128
> #9 0x0806a517 in run_cmd (jcr=0x80e0368) at job.c:192
> #10 0x080636ba in handle_connection_request (arg=0x80e0018) at dircmd.c:224
> #11 0x080a2219 in workq_server (arg=0x80c0be0) at workq.c:357
> #12 0x40161e51 in pthread_start_thread () from /lib/libpthread.so.0
> #13 0x40161ecf in pthread_start_thread_event () from /lib/libpthread.so.0
> #14 0x404a68aa in clone () from /lib/libc.so.6
>
> Thread 4 (Thread 32771 (LWP 14914)):
> #0 0x40168456 in nanosleep () from /lib/libpthread.so.0
> #1 0x00000001 in ?? ()
> #2 0x4016452a in __pthread_timedsuspend_new () from /lib/libpthread.so.0
> #3 0x40161122 in pthread_cond_timedwait_relative () from /lib/libpthread.so.0
> #4 0x080a183d in watchdog_thread (arg=0x0) at watchdog.c:307
> #5 0x40161e51 in pthread_start_thread () from /lib/libpthread.so.0
> #6 0x40161ecf in pthread_start_thread_event () from /lib/libpthread.so.0
> #7 0x404a68aa in clone () from /lib/libc.so.6
>
> Thread 2 (Thread 32769 (LWP 14912)):
> #0 0x4049da5a in poll () from /lib/libc.so.6
> #1 0x40161b50 in __pthread_manager () from /lib/libpthread.so.0
> #2 0x40161d57 in __pthread_manager_event () from /lib/libpthread.so.0
> #3 0x404a68aa in clone () from /lib/libc.so.6
>
> Thread 1 (Thread 16384 (LWP 14910)):
> #0 0x404a0001 in select () from /lib/libc.so.6
> #1 0x00000009 in ?? ()
> #2 0x404fec80 in ?? () from /lib/libc.so.6
> #3 0xbfffec00 in ?? ()
> #4 0x00000000 in ?? ()
> #5 0x08085c32 in bnet_thread_server (addrs=0x80c24a8, max_clients=-514, client_wq=0x80c0be0, handle_client_request=0xfffffdfe) at bnet_server.c:161
> #6 0x0804d2a4 in main (argc=0, argv=0x804dde0) at stored.c:263
> #0 0x080e0018 in ?? ()
> (gdb)
>
> is this useful?
>
> > For valgrind, I don't know what options to use, please read their
> > documentation.
> >
> > For improving the output from smartalloc, find all the sm_check() calls
> > in the SD and modify them to have true as the last argument. You can
> > also add more sm_checks at various strategic places where you think the
> > code breaks to get closer to the problem.
>
> valgrind / smartalloc output will follow later...
>
> -Marc

_______________________________________________
Bacula-devel mailing list
https://lists.sourceforge.net/lists/listinfo/bacula-devel
