I'm reposting in case my first post was missed.

Bacula 1.38.11 installed from FreeBSD ports on FreeBSD 6.0.

The director reliably locks up on Saturday mornings.  Jobs stop
running.  I can start bconsole, but asking it for any information
(e.g. "status dir") makes bconsole freeze.  CPU and I/O activity are
minimal while this is happening.

The database is responsive.  I can connect with psql and run queries
with no problems.

Running btraceback while the director is frozen yields the following:
warning: Unable to get location for thread creation breakpoint: generic error
[New Thread 0x8133a00 (sleeping)]
[New Thread 0x8133800 (sleeping)]
[New Thread 0x8107e00 (sleeping)]
[New Thread 0x811cc00 (sleeping)]
[New Thread 0x811ce00 (runnable)]
[New Thread 0x8150000 (runnable)]
[New Thread 0x8107c00 (sleeping)]
[New Thread 0x8107a00 (runnable)]
[New Thread 0x8107600 (LWP 100161)]
[New Thread 0x80ea000 (sleeping)]
[New LWP 100164]
[Switching to LWP 100164]
0x2814a277 in pthread_testcancel () from /usr/lib/libpthread.so.2
$1 = "mindwipe-dir", '\0' <repeats 17 times>
$2 = 0x80ed018 "bacula-dir"
$3 = 0x80ed058 "/usr/local/sbin/"
$4 = "PostgreSQL"
$5 = 0x80cc358 "1.38.11 (28 June 2006)"
$6 = 0x80cc36f "i386-portbld-freebsd6.0"
$7 = 0x80cc387 "freebsd"
$8 = 0x80cc38f "6.0-RELEASE-p5"
#0  0x2814a277 in pthread_testcancel () from /usr/lib/libpthread.so.2
#1  0x281436f3 in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
#2  0x08107c00 in ?? ()

Thread 11 (LWP 100164):
#0  0x2814a277 in pthread_testcancel () from /usr/lib/libpthread.so.2
#1  0x281436f3 in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
#2  0x08107c00 in ?? ()

Thread 10 (Thread 0x80ea000 (sleeping)):
#0  0x28142e7f in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
#1  0x28143013 in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
#2  0x281474bd in _pthread_cond_wait () from /usr/lib/libpthread.so.2
#3  0x28147a06 in pthread_cond_wait () from /usr/lib/libpthread.so.2
#4  0x080a752e in rwl_writelock (rwl=0x8109e20) at rwlock.c:226
#5  0x08086ad6 in _db_lock (file=0x80c5aac "sql_create.c", line=69, mdb=0x8109e18) at sql.c:238
#6  0x08087870 in db_create_job_record (jcr=0x8126018, mdb=0x8109e18, jr=0x8126260) at sql_create.c:69
#7  0x0805c61a in run_job (jcr=0x8126018) at job.c:117
#8  0x0804c644 in main (argc=0, argv=0xbfbfebd0) at dird.c:246

Thread 9 (Thread 0x8107600 (LWP 100161)):
#0  0x2814a277 in pthread_testcancel () from /usr/lib/libpthread.so.2
#1  0x28142dac in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
#2  0x00000000 in ?? ()

Thread 8 (Thread 0x8107a00 (runnable)):
#0  0x284910b3 in select () from /lib/libc.so.6
#1  0x28133639 in select () from /usr/lib/libpthread.so.2
#2  0x080970b9 in bnet_thread_server (addrs=0x80ed1d8, max_clients=10, client_wq=0x80e05a0, handle_client_request=0x807e34c <handle_UA_client_request>) at bnet_server.c:148
#3  0x0807e23d in connect_thread (arg=0x80ed1d8) at ua_server.c:73
#4  0x28135ab1 in pthread_create () from /usr/lib/libpthread.so.2
#5  0x284ec45f in _ctx_start () from /lib/libc.so.6

Thread 7 (Thread 0x8107c00 (sleeping)):
#0  0x28142e7f in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
#1  0x28143013 in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
#2  0x28147dd9 in _pthread_cond_timedwait () from /usr/lib/libpthread.so.2
#3  0x28148342 in pthread_cond_timedwait () from /usr/lib/libpthread.so.2
#4  0x080b27cf in watchdog_thread (arg=0x0) at watchdog.c:292
#5  0x28135ab1 in pthread_create () from /usr/lib/libpthread.so.2
#6  0x284ec45f in _ctx_start () from /lib/libc.so.6

Thread 6 (Thread 0x8150000 (runnable)):
#0  0x28491833 in read () from /lib/libc.so.6
#1  0x08132818 in ?? ()
#2  0xbeff4fec in ?? ()
#3  0x280fa050 in ?? ()
#4  0xbeff4958 in ?? ()
#5  0x08093d22 in read_nbytes (bsock=0xa, ptr=0x4 <Error reading address 0x4: Bad address>, nbytes=-1090565752) at bnet.c:73
#6  0x00093d22 in ?? ()
#7  0x0000000a in ?? ()
#8  0x00000004 in ?? ()
#9  0xbeff4988 in ?? ()
#10 0x0809411a in bnet_recv (bsock=0x5b245c7e) at bnet.c:194
/usr/local/share/bacula/btraceback.gdb:10: Error in sourced command file:
Previous frame inner to this frame (corrupt stack?)
#0  0x2814a277 in pthread_testcancel () from /usr/lib/libpthread.so.2
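
To make a trace like the one above easier to scan, here is a minimal
Python sketch that lists, for each thread, the innermost frame that
carries source file information.  It is not part of Bacula or
btraceback; the names logical_lines and summarize are my own, and the
regular expressions assume gdb output shaped like the excerpt above,
including frames that got wrapped across lines.

```python
import re

# Thread headers such as "Thread 10 (Thread 0x80ea000 (sleeping)):"
THREAD_RE = re.compile(r"^Thread (\d+) \((.*)\):$")
# Frames with source info, e.g.
# "#4  0x080a752e in rwl_writelock (rwl=0x8109e20) at rwlock.c:226"
FRAME_RE = re.compile(r"^#\d+\s+0x[0-9a-f]+ in (\w+) \(.*\) at (\S+):(\d+)$")

# Excerpt copied from the trace in this post (wrapped lines kept on purpose).
SAMPLE = """\
Thread 10 (Thread 0x80ea000 (sleeping)):
#3  0x28147a06 in pthread_cond_wait () from /usr/lib/libpthread.so.2
#4  0x080a752e in rwl_writelock (rwl=0x8109e20) at rwlock.c:226
#5  0x08086ad6 in _db_lock (file=0x80c5aac "sql_create.c", line=69,
mdb=0x8109e18) at sql.c:238

Thread 8 (Thread 0x8107a00 (runnable)):
#1  0x28133639 in select () from /usr/lib/libpthread.so.2
#2  0x080970b9 in bnet_thread_server (addrs=0x80ed1d8, max_clients=10,
client_wq=0x80e05a0, handle_client_request=0x807e34c <handle_UA_client_request>)
    at bnet_server.c:148
"""


def logical_lines(trace):
    """Rejoin frames that were wrapped across several physical lines."""
    out = []
    for raw in trace.splitlines():
        line = raw.strip()
        if not line:
            continue
        is_new_record = line.startswith(("#", "Thread ", "["))
        if out and out[-1].startswith("#") and not is_new_record:
            out[-1] += " " + line  # continuation of the previous frame
        else:
            out.append(line)
    return out


def summarize(trace):
    """Map thread number -> (function, file, line) of the innermost
    frame that has source information.  Threads without such a frame
    are left out."""
    summary = {}
    current = None
    for line in logical_lines(trace):
        header = THREAD_RE.match(line)
        if header:
            current = int(header.group(1))
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None and current not in summary:
            summary[current] = (frame.group(1), frame.group(2), int(frame.group(3)))
    return summary


if __name__ == "__main__":
    for tid, (func, src, lineno) in sorted(summarize(SAMPLE).items()):
        print(f"thread {tid}: {func} at {src}:{lineno}")
```

On the excerpt, thread 10 comes out as rwl_writelock at rwlock.c:226,
reached through _db_lock, so that thread is waiting for the catalog
write lock.  The sketch only reorganizes what gdb already printed; it
cannot tell which thread is holding the lock.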

If there's more information I can collect to help track this problem
down, please let me know.

-- 
Bill Moran
Collaborative Fusion Inc.

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users