Public bug reported:

Binary package hint: bacula
Error message in daemon.log:

bacula-sd: Bacula interrupted by signal 11: Segmentation violation

Error messages given by the bacula-director:

Not accessible clients:

17-Aug 04:37 cxl05010-dir JobId 869: Warning: bsock.c:129 Could not connect to Client: CXW11010-fd on 192.168.11.10:9102. ERR=Connection timed out Retrying ...
17-Aug 04:40 cxl05010-dir JobId 869: Fatal error: bsock.c:135 Unable to connect to Client: CXW11010-fd on 192.168.11.10:9102. ERR=Connection timed out
17-Aug 04:42 cxl05010-dir JobId 870: Warning: bsock.c:129 Could not connect to Client: CXW11011-fd on 192.168.11.11:9102. ERR=Connection timed out Retrying ...
17-Aug 04:45 cxl05010-dir JobId 870: Fatal error: bsock.c:135 Unable to connect to Client: CXW11011-fd on 192.168.11.11:9102. ERR=Connection timed out
17-Aug 04:45 cxl05010-dir JobId 870: Error: openssl.c:86 TLS read/write failure.: ERR=error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac

Subsequent backup-jobs which failed:

17-Aug 04:45 cxl05010-dir JobId 0: Warning: bsock.c:129 Could not connect to Storage daemon on cxb24010.consultix.admin:9103. ERR=Connection refused
(notice that the JobID is 0)
17-Aug 04:15 cxl05006-fd JobId 860: Fatal error: backup.c:892 Network send error to SD. ERR=Broken pipe
17-Aug 04:50 cxl05010-dir JobId 871: Warning: bsock.c:129 Could not connect to Storage daemon on cxb24010.consultix.admin:9103. ERR=Connection refused
17-Aug 05:15 cxl05010-dir JobId 0: Fatal error: bsock.c:135 Unable to connect to Storage daemon on cxb24010.consultix.admin:9103. ERR=Connection refused
17-Aug 11:01 cxl05010-dir JobId 0: Fatal error: bsock.c:135 Unable to connect to Storage daemon on cxb24010.consultix.admin:9103. ERR=Connection refused

This error occurs if one (or more) File-daemons can't be contacted by the bacula-director. The Storage-daemon dies when the job whose file-daemon is unreachable is canceled by the director.
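For anyone triaging a similar log, the unreachable clients can be pulled out of the director messages with a short script. This is only a rough sketch: the regular expression is derived solely from the log lines quoted in this report, and the names CLIENT_ERR and unreachable_clients are made up for illustration (they are not part of Bacula).

```python
import re

# Assumed message format, taken only from the director log lines in this report:
#   "... Could not connect to Client: <name> on <host>:<port>. ERR=..."
#   "... Unable to connect to Client: <name> on <host>:<port>. ERR=..."
CLIENT_ERR = re.compile(
    r"(?:Could not|Unable to) connect to Client: (?P<client>\S+) "
    r"on (?P<host>[^\s:]+):(?P<port>\d+)\."
)


def unreachable_clients(log_text):
    """Return a sorted list of unique (client, host, port) tuples found in log_text."""
    found = set()
    for match in CLIENT_ERR.finditer(log_text):
        found.add((match.group("client"), match.group("host"), int(match.group("port"))))
    return sorted(found)


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py < director.log
    for client, host, port in unreachable_clients(sys.stdin.read()):
        print(f"{client} {host}:{port}")
```

Storage daemon connection errors are deliberately not matched, since in this report they are a consequence of the crash rather than its trigger.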
The biggest problem is that a single client whose file-daemon can't be reached will crash the storage-daemon, and all running and subsequent backups will fail. The segfault occurred several times: twice because the IP address of the client had changed but was not changed in bacula-fd.conf, and once when the client was shut down.

Performance should not be the problem, as the RAID array can handle a throughput to disk of 650 MByte/s. Network capacity should not be a problem either, as the storage server has a 10 Gbit connection and the clients have 1 Gbit connections. CPU and memory load on the storage server is low too (about 10% of one CPU core per backup job, about 900 MB of RAM usage).

To make sure it is not a temporary issue, the bacula-director and the storage-daemon were restarted.
To make sure it is not an issue depending on the number of concurrent jobs, we tested it with three jobs at a time.
To make sure it is not a hardware-related issue, another storage-daemon system was installed and tested.

We could always reproduce the crash of the storage-daemon on other hardware under similar conditions.

Additional Info:

ProblemType: Bug
Uname: 2.6.32-22-generic #33-Ubuntu SMP Wed Apr 28 13:28:05 UTC 2010 x86_64
Architecture: x86_64
Package the bug was found in: bacula-sd (5.0.1-1ubuntu1)
SourcePackage: bacula
Release: Ubuntu 10.04 LTS

** Affects: bacula (Ubuntu)
     Importance: Undecided
         Status: New

** Tags: bacula bacula-sd segfault

--
Bacula Storage-daemon dies with segfault if a File-daemon can't be contacted
https://bugs.launchpad.net/bugs/622742