Hi, The "broken pipe"s continue to happen. Here are some debug informations retrieved with 'strace'. I hope this will help, we have about 1 crash per week per server and we run hundreds.
We use : dpkg -l |grep bacula ii bacula-common 5.0.1-1ubuntu1 network backup, recovery and verification - ii bacula-common-sqlite3 5.0.1-1ubuntu1 network backup, recovery and verification - ii bacula-console 5.0.1-1ubuntu1 network backup, recovery and verification - ii bacula-director-common 5.0.1-1ubuntu1 network backup, recovery and verification - ii bacula-director-sqlite3 5.0.1-1ubuntu1 network backup, recovery and verification - ii bacula-fd 5.0.1-1ubuntu1 network backup, recovery and verification - ii bacula-pkg 2.3-eole54~3.gbpf865a6 Configuration (dictionnaire et templates) et ii bacula-sd 5.0.1-1ubuntu1 network backup, recovery and verification - ii bacula-sd-sqlite3 5.0.1-1ubuntu1 network backup, recovery and verification - linux-image-2.6.32-46-server SMP x86_64 GNU/Linux SD, FD and DIR are on the same computer. "/mnt/sauvegardes" is the directory where SD writes its volumes and this directory is a SMB share mounted using "mount -t cifs". Command used to retrieve debug : strace -s 1000 -ttT -f -o /tmp/<process_name>.out -p<pid_process> * bacula-fd : 13814 13:57:40.339464 write(5, "\0\1\0\32x\234\0\t@\366\277\302\236m\312\256\30\217\273\233\340\310\352\353~\276f\4\273\273w4\26\221O\263\315-D\364\226\347HKYN\305\373\247\326\357\217\210\313f\245\0212e\vm\3 55\34\200:l}:\327G\345\320\376\345\263:\244\335\322(\300\232\302\330/\323`w\225\224\223+\261\254\355sb8\350f\234V\270\275\275\354\22\373{\300\216\351J\v\34\304S\352\320)6zP\254\267\33\32\364\372\2\273\37Hj\2 642&\36:93\342\203\251\351^\272l\254'66\335\345\302ME\341\310\t\262\334\237\242\246\262\371\343u\2537l\232kU\220\345\317cAK\274\f\362\372\345\36R\230\272\0332\360\20\301\304\312\6\340A\rm\213R\352 z,\346\220 \30)\"3\221j\22)<\304\271\220i\33\r2\212H\24\221q\300\351\203\":P\371\376?\330\206\357\v\331F\347\361\377c\305s\1\17n%\26\343P\177C\243\224I&VqU\211\205\200\203\366\312(\3306[\330W\353\340\21~\371\312LU\rO\2 74\262rX\275\246\267\246<(\310\30\204i\327\274R\206\354K\213\267\236\232W\316T\200\360\325\276G{Q\270C\361\204\254\267\276\344\0\0\206\t\367\223/\30\355n\325\257\20\27\233\255QV\26\365a\\VvFaO\275d\3447\26\" \366|\26\37\326\353~\34\354aD\207\302=E\300\"D\267\204\30\2359$\306\16 [\31\236\246T\32\323G<\321\213'\300a\262+\215V\353\235\377\354\335\306\t\214\231\f\361]%LT\224\277\4\207\256}\377Z\253\225h\33\266\336\2 67\331;#\313\306I#Dm\3429\317\250\0]\362\340\31\321#\317\241|Q\227)\24\220\366\352m\22G\313\16A\3103\277R\256+\373\224\177N/\354\263,\t\325\311\345\225I\364\24\373\27\355\350IJ+\"}~c\257\206\245\361\302u\37\ 372Vw\343\204\211\374\244\363]\231\376\221n5\253\16l\241\205v\355[\37\35\264]\303\221\233\fp\0jv^\232-\260j\2034\206\356\351\":]\262<\357y\2262\353\177\300\v\371\365Q`/\273\226G\332L_\342.\375.\263\20~h\333\ 346\223\262Z\277\0173s\255\360\321\307[\225F\375\211\330\366\340\344\216\272\222S\310\2069?N\177\366D\235s\222\261\360#cBF\320r\260\4R@\201w\366\254\240t\32|\"\t\361\203\3\231\260\267X\376$[\347b9\7\373\235\ 355\251\1\205\36\3441V\260F\340N\177\322\206'\220eC\232Vv|=zO\370Ls|Y\231\376\364\17\311Z\336\230\216\23\235l\212kh:\271cdS\335o.\t\313\353u\1\262\233\202~\0\307\337\201\375\224\36\17\373\0227}\261\34\367R\2 211\353\260\324\351\311\364\204Zx\356n:I\232.)k\302\205\311H\305\20\240\224\10M\220.\304+9y\265}\244|NA3\207\304\25\35\265\333\331\30\342\317k\2z\267p\23Q\361\361\260\323\2273d\34\200\277\326C)u\217\2\201\32 4\25\375\r\226\310\335\311*d#\371\20\211\204\363\331}\352;\337\254\24\244\252\266\2142n\243[\r\212}9\222\367\\,\231\356\3016\25\255\360Z\302\254\304\36Q\35\244\2371\347c\220\272\2\330\253\344\331\241\210\213 \320\302'\356mz=$\224L\314\345f\307F<\270H\262l\272\361\247S\16\fv\1\327aoL]\336i\324\ft\312\327\24^\345\272If\233u\213\300\222rn\350\25\322b\340~\351>=o\24\367C\6\276\353\231\377p:J?\30\230\303\366\2339\352 +\320M\36s\233\237\332z\215\227p\31\353\206\240\353$Mub\f\310\353;~\216"..., 65566) = 65566 <0.000128> 13814 13:57:40.339965 read(16, "d\235\2534\"\312j\370\271\360\301\36\22?\20\f\255\216\317\24z\241\374/\321B\303<\220\10\267\36g\276N6<!\235\374F\4\254\vv@\311\3277x\20\373\353\376\231\304\24\326\350\373pt\37 3'G\32|}\0\232 E\352\321\353\3631\211D\3513g>h\273A\225W\361\300\334\205\3601\302\234\300S\5\375\235\210Q\206GDy\276\334\2310'gL\236\21\356\177U\20q\356\367j\362U8\321\213\344\30p\237L\37\246$\25K\256\330c\vzV\310w\361\3448\236\241\6^\372bX\372(6(q\332\357\303M\26be,X!\361\250\311\327\177\360;\214\211\323\263\314\231\247g\335\3232\332\362\202*:\"\22\36E\362\213\257\275\r\207\367d1\265v\373\250\336\20\37\270\267^\347\257q\360|>,\251Z\275\363\376\33\363\326N\3250H\10%\354\320\212\3541\312Im\217\34'B\251\375\314\10\377\271v~\316\356\230\266v\216;\214b\332\3758S\365\247\316\313F\v]m2\200\234?\200M\356\252>\312#\26{s\0}\321\317\253\"\177\376\26768\263\236Q\367\nO(\244dD\24\306|\361\263\221\315\336\0J2\264P \210\220\t\275\251\4 \10\242P(7v\245\330>p'\316;\237\215$\373\242\36\246\3036\246_\f\5\305\26\31ho\205\30>\201\36:\253\275!\376\21\365\17\324aQ[\264\336\352\343\"\306)O\222CBEJ$\364\371\6\3706\35y2\360\374\237\350\223\3\307\267{\314\355\350lOD\f\235?\374\357\n\21n?\366\211\201\322V}=\16\2128V\t79\310\205\360fr\31\330\336\321\23\344@Gg\376p\3I\373~\321Iq\21\10\347\353'\16\242\220R\216\376S\237g\377q\226\322\266\213\357\314\270\367\353\213\f\37\337Z\371\333\310\343\274\326\3\n8\261\210^>}OA\37)\210\300\16~\202~\371W\240\212{\360j\247\204\267|\267g\34c\264x\305r\215\367\f\345\325\31\2212|/:\366\271\245.b\2006c\276*\214\305\342\270\264s\330P\246\314\34NC2>h= \t\366\263\36\33\3123\377\306\277\343\32\257\274\237\261\322\335\2722\327z^s\246B\2460\26\221DL\303]\217\243\216\31R6\201\337\357]\206\204c\20YL\32\346\30F\344\240\203\354A\360\1\226\306o\306\225\361v\234\304k\372\223\30\23}\221\352\225\250\372\372q\34\244\237O'*z\203\320H\377\0318>\2\340q\235\23\371E\317=\225\"\214\207V\261\6\277\370\273\367Q\361a\37\355\"AC\272\311\307\250\271\27\342\32c\6FJr\332\30\247\306\262\21\222mw\235r\203x*\253\260\361G\3679\n\326=\255\332\351-\1I\343\7R\252\264\347\3\260\217\375\277][\32T\204/\177\0176\344#\316%\350\311\263O\376\212\365%!\317a\373\23G=\237\213U\206>\316\375O\373X\370\277\240\3048\314>\257\226[#\353\216\330\337\0\202#+&TW\247\177\246\360\316\21\360`W\301q\345\210O\203y\213\342C\342\355\3757\212\31,M\344\2352\203y\1\362\373\17;\213\rO\366\250\237\2639\3\17\376U\315L\20\215\20\6\204\357\23\234\f\310\fsv\314\335\v\204T\363\225\326\223B\262V\240=o\206CP\275\357D\320~?\374\344\200\252\335P\237\200\0\357\251\37\20b\276\37\240@ws$4\327\241P\366v\21\ 356@A\264\355\354E\236=0j\244Z\37\0029\374\275gD6\271\"M\360{\210|\350\232\2231ah\246\303\367\177\327Q\360d#\275\247\317\344\215\376\236'Rn\310\376\367|\352\240<\254Hs\244\33\362\260"..., 65536) = 65536 <0.000153> 13865 13:57:40.341229 <... select resumed> ) = 1 (in [5], left {3, 820192}) <1.179856> 13865 13:57:40.341293 read(5, "", 4) = 0 <0.000069> 13865 13:57:40.341476 madvise(0x7f6380185000, 8368128, MADV_DONTNEED) = 0 <0.000027> 13865 13:57:40.341559 _exit(0) = ? 13814 13:57:40.343830 write(5, "\0\1\0\32x\234\0\37@\340\277d\235\2534\"\312j\370\271\360\301\36\22?\20\f\255\216\317\24z\241\374/\321B\303<\220\10\267\36g\276N6<!\235\374F\4\254\vv@\311\3277x\20\373\353\376\231\304\24\326\350\373pt\373'G\32|}\0\232 E\352\321\353\3631\211D\3513g>h\273A\225W\361\300\334\205\3601\302\234\300S\5\375\235\210Q\206GDy\276\334\2310'gL\236\21\356\177U\20q\356\367j\362U8\321\213\344\30p\237L\37\246$\25K\256\330c\vzV\310w\361\3448\236\241\6^\372bX\372(6(q\332\357\303M\26be,X!\361\250\311\327\177\360;\214\211\323\263\314\231\247g\335\3232\332\362\202*:\"\22\36E\362\213\257\275\r\207\367d1\265v\373\250\336\20\37\270\267^\347\257q\360|>,\251Z\275\363\376\33\363\326N\3250H\10%\354\320\212\3541\312Im\217\34'B\251\375\314\10\377\271v~\316\356\230\266v\216;\214b\332\3758S\365\247\316\313F\v]m2\200\234?\200M\356\252>\312#\26{s\0}\321\317\253\"\177\376\26768\263\236Q\367\nO(\244dD\24\306|\361\263\221\315\336\0J2\264P \210\220\t\275\251\4 \10\242P(7v\245\330>p'\316;\237\215$\373\242\36\246\3036\246_\f\5\305\26\31ho\205\30>\201\36:\253\275!\376\21\365\17\324aQ[\264\336\352\343\"\306)O\222CBEJ$\364\371\6\3706\35y2\360\374\237\350\223\3\307\267{\314\355\350lOD\f\235?\374\357\n\21n?\366\211\201\322V}=\16\2128V\t79\310\205\360fr\31\330\336\321\23\344@Gg\376p\3I\373~\321Iq\21\10\347\353'\16\242\220R\216\376S\237g\377q\226\322\266\213\357\314\270\367\353\213\f\37\337Z\371\333\310\343\274\326\3\n8\261\210^>}OA\37)\210\300\16~\202~\371W\240\212{\360j\247\204\267|\267g\34c\264x\305r\215\367\f\345\325\31\2212|/:\366\271\245.b\2006c\276*\214\305\342\270\264s\330P\246\314\34NC2>h= \t\366\263\36\33\3123\377\306\277\343\32\257\274\237\261\322\335\2722\327z^s\246B\2460\26\221DL\303]\217\243\216\31R6\201\337\357]\206\204c\20YL\32\346\30F\344\240\203\354A\360\1\226\306o\306\225\361v\234\304k\372\223\30\23}\221\352\225\250\372\372q\34\244\237O'*z\203\320H\377\0318>\2\340q\235\23\371E\317=\225\"\214\207V\261\6\277\370\273\367Q\361a\37\355\"AC\272\311\307\250\271\27\342\32c\6FJr\332\30\247\306\262\21\222mw\235r\203x*\253\260\361G\3679\n\326=\255\332\351-\1I\343\7R\252\264\347\3\260\217\375\277][\32T\204/\177\0176\344#\316%\350\311\263O\376\212\365%!\317a\373\23G=\237\213U\206>\316\375O\373X\370\277\240\3048\314>\257\226[#\353\216\330\337\0\202#+&TW\247\177\246\360\316\21\360`W\301q\345\210O\203y\213\342C\342\355\3757\212\31,M\344\2352\203y\1\362\373\17;\213\rO\366\250\237\2639\3\17\376U\315L\20\215\20\6\204\357\23\234\f\310\fsv\314\335\v\204T\363\225\326\223B\262V\240=o\206CP\275\357D\320~?\374\344\200\252\335P\237\200\0\357\251\37\20b\276\37\240@ws$4\327\241P\366v\21\ 356@A\264\355\354E\236=0j\244Z\37\0029\374\275gD6\271\"M\360{\210|\350\232\2231ah\246\303\367\177\327Q\360d#\275\247\317\344\215\376\236'Rn\310\376\367"..., 65566) = -1 EPIPE (Broken pipe) <0.000079> 13814 13:57:40.344260 --- SIGPIPE (Broken pipe) @ 0 (0) --- 13814 13:57:40.344423 write(4, "\0\0\0\242Jmsg Job=JobSauvegarde.2013-04-26_13.47.36_09 type=3 level=1366977460 127.0.0.1-fd JobId 36: Fatal error: backup.c:1019 Network send error to SD. ERR=Broken pipe\n", 166) = 166 <0.000071> 13814 13:57:40.344608 close(16) = 0 <0.000047> 13814 13:57:40.344713 close(15) = 0 <0.000016> 13814 13:57:40.344790 close(14) = 0 <0.000047> 13814 13:57:40.344884 close(13) = 0 <0.000014> 13814 13:57:40.345010 close(12) = 0 <0.000018> 13814 13:57:40.345068 close(10) = 0 <0.000014> 13814 13:57:40.345117 close(8) = 0 <0.000012> 13814 13:57:40.345163 close(7) = 0 <0.000013> 13814 13:57:40.345210 close(6) = 0 <0.000016> 13814 13:57:40.345260 nanosleep({0, 50000000}, NULL) = 0 <0.050100> 13814 13:57:40.395482 nanosleep({0, 50000000}, NULL) = 0 <0.050145> 13814 13:57:40.445766 nanosleep({0, 50000000}, NULL) = 0 <0.050142> 13814 13:57:40.495980 nanosleep({0, 50000000}, NULL) = 0 <0.050112> 13814 13:57:40.546225 nanosleep({0, 50000000}, NULL) = 0 <0.050145> 13814 13:57:40.596512 nanosleep({0, 50000000}, NULL) = 0 <0.050145> 13814 13:57:40.646794 nanosleep({0, 50000000}, NULL) = 0 <0.050142> 13814 13:57:40.697076 nanosleep({0, 50000000}, NULL) = 0 <0.050145> 13814 13:57:40.747364 nanosleep({0, 50000000}, <unfinished ...> 4678 13:57:40.780160 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out) <30.000074> 4678 13:57:40.780288 futex(0x7f63835f65e0, FUTEX_WAKE_PRIVATE, 1) = 0 <0.000086> 4678 13:57:40.780521 futex(0x7f63835f6624, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1119, {1366977490, 780468000}, ffffffff <unfinished ...> 13814 13:57:40.797546 <... nanosleep resumed> NULL) = 0 <0.050145> 13814 13:57:40.797667 nanosleep({0, 50000000}, NULL) = 0 <0.050100> 13814 13:57:40.847863 nanosleep({0, 50000000}, NULL) = 0 <0.050137> 13814 13:57:40.898122 nanosleep({0, 50000000}, NULL) = 0 <0.050144> 13814 13:57:40.948406 nanosleep({0, 50000000}, NULL) = 0 <0.050145> 13814 13:57:40.998674 nanosleep({0, 50000000}, NULL) = 0 <0.050125> * bacula-sd : 13812 13:57:39.978936 read(6, "x\234\0\3@\374\277\20[\272\331\16M\323\337\205\3\313\22\211\355\323\303\376\3677\v\373s\255\26\272\30\253\351\351\377\373]\22\265\204\360\366\340p\270v'-M\267[\243\347\370\327?\4r\0160\360\325\350O\34\27\5\300\275\364R\344\262H\246\2468qm\10\246\372n\301#\21\250zasw(f\205L R\216\1\320G\270\265\216\240+5\24\264O\300\206~1&\355\320\247\250U?B\3\202\367M\354\204\32\361\253\271\336\242\334\301\0\372\327\214\355\256\325\370\305\20HT\210\177_z4;\256Fo\324\35\301q\4\323\3207\n\324<\322>\303t\227\217\306\317\212\370k\321R\316\353\36\247\213\363\204~\332\27\303\346\256\321(\376G,\321_&=\212\213;\30\310\323+\371\303jX^\31\254\207\264\316E\312\t\34\207\337\357\334jV\377\372\332n\334\26\27\336U:\3YPawI\346&\275\243\323\2069\"\324^hH\226\326\3550\33\262\376$\276\315\225\266\252\211\357\177`\260n\367&\201\350\367\10\0K\320\277\367G\364\277\201\177\203\207\374\313\300\224\270\315m\271\27\342\nxc1Ci\324\324\243\10\321WQ\310\315\21o\263\360\376\317$\235\246\363\3\367\364\224\333Fm7\307\312\5<\233\256\216\16]\341\3715l\332#mD\31\360J:\353\261\201A\231\303\10b\231j,\2150P\252\303I%\222\3503Y\332\312\250\265t\3275Fb\210\355\311\265GX\3105,Z\321\212>Z\21\r\26\34\346\207\233\212\tH\33\215]m\370N[S%v\373\3016\210\371\254\335\252Qb\20\346\305Z\0326z\307\320\231M\360\367\3319\2 4\373(\354\225\275\263\331h\237\363\342\361\1\355\271\300\326\227\243\205\32\364\304Z\266\220Y-\177a\223b\315ct\266Q\335\230N\273_\204C)y\372\36\206$\214\r\306\252\274\324\303\306I!\26'\307H\334\4\264]&\336\311=\324\304l\322P\331\217T3f\31\271\2079oZ\233\334\314\342\1\304}&j\360]\335\355F\217=]Yf\21\2317\303\302\231\306\3119\271f\263\234U\2x\330sy\311vH\232\302\351\305D\225\272\344\324\25G\305\254sF\v?\277Q\234\326\203k\247\301\354\356\235tJ\242Q-q\224*\23w\360\277\370*\213bj\34\320\3426H(\214j\313\37\321\257\367\217.\20\341\1\266\3770\20x[X\325\6\202\347h\316\375\371c\26p\347\204\r\1\3506\vr\271,\355\311r\36\275\377R\363\263\264\r\345\223\3523\277\237\326\2606\303zO \225\344\355p\314/_\222Q1\250\224vD\35L\1!|k\36\215N/%\303\22\343\241\316Z\264\353u\335Y\272|B\337Th\24\316\345\7\20F:\220\366\377j\302\364+\364\3\5\276\201>~\357\6\336\263s\340\221\331z\235g\0008qXu\267\315\35\313\260iHBu\314%\352\177\373\337\6\201\226\361\220\342\226\353$\304\r\253\277\333\267\356\1T\"\326\22\37\237\1Z \213\374\200vF\365\317\301C|\204\365;\371\302G\2723=`f\t\36\241\236\4\0:\3269\367\305\2724xK\377\353-\232\372\17Q\252\240+\313<ws\315r\242\301\235D\361\206\t\3124\254\342,\304{\220\376\260\243\310K\327v\4\363\312\10\215\3400\0254A\332x@\206\222~\236\371V7\222\214\205\364\2673*\263\364&\266\241KS)'\34'\212Z\31\266*\375S1v\"\234\245\274S]SUKGw\27\367+\365\20\372\260/.@`L\337\230*4x\372\340+2\300\5g\351\367\271\233\334\10\374\215: \201\200\276T`#/\366\22\343\33\334x\242\20\377\244q"..., 65562) = 65562 <0.000073> 13812 13:57:39.979711 write(5, "\0\0\0\305Jmsg Job=JobSauvegarde.2013-04-26_13.47.36_09 type=6 level=1366977459 127.0.0.1-sd JobId 36: User defined maximum volume capacity 2,000,000,000 exceeded on device \"FileStorage\" (/mnt/sauvegardes).\n", 201) = 201 <0.000039> 13812 13:57:39.979841 write(5, "\0\0\0\272CatReq Job=JobSauvegarde.2013-04-26_13.47.36_09 CreateJobMedia FirstIndex=31170 LastIndex=31170 StartFile=0 EndFile=0 StartBlock=1999872167 EndBlock=1999936678 Copy=0 Strip=0 MediaId=85\n", 190) = 190 <0.000031> 13812 13:57:39.979967 read(5, "\0\0\0\27", 4) = 4 <0.183571> 13812 13:57:40.163599 read(5, "1000 OK CreateJobMedia\n", 23) = 23 <0.000013> 13812 13:57:40.163672 write(5, "\0\0\1TCatReq Job=JobSauvegarde.2013-04-26_13.47.36_09 UpdateMedia VolName=127.0.0.1-dir-full-0085 VolJobs=1 VolFiles=0 VolBlocks=31001 VolBytes=1999936679 VolMounts=1 VolErrors=0 VolWrites=31002 MaxVolBytes=2000000000 EndTime=1366977460 VolStatus=Full Slot=0 relabel=0 InChanger=0 VolReadTime=0 VolWriteTime=15053773 VolFirstWritten=0 VolParts=0\n", 344) = 344 <0.000066> 13812 13:57:40.163831 read(5, "\0\0\1T", 4) = 4 <0.124515> 13812 13:57:40.288406 read(5, "1000 OK VolName=127.0.0.1-dir-full-0085 VolJobs=1 VolFiles=0 VolBlocks=31001 VolBytes=1999936679 VolMounts=1 VolErrors=0 VolWrites=31002 MaxVolBytes=2000000000 VolCapacityBytes=0 VolStatus=Full Slot=0 MaxVolJobs=0 MaxVolFiles=0 InChanger=0 VolReadTime=0 VolWriteTime=15053773 EndFile=0 EndBlock=1999936678 VolParts=0 LabelType=0 MediaId=85\n", 340) = 340 <0.000014> 13812 13:57:40.288543 write(5, "\0\0\0\307Jmsg Job=JobSauvegarde.2013-04-26_13.47.36_09 type=6 level=1366977460 127.0.0.1-sd JobId 36: End of medium on Volume \"127.0.0.1-dir-full-0085\" Bytes=1,999,936,679 Blocks=31,001 at 26-Apr-2013 13:57.\n", 203) = 203 <0.000069> 13812 13:57:40.288694 close(7) = 0 <0.000027> 13812 13:57:40.288809 pipe([7, 8]) = 0 <0.000019> 13812 13:57:40.288882 rt_sigaction(SIGPIPE, {SIG_IGN, [], SA_RESTORER, 0x7feeb5adf8f0}, {SIG_IGN, ~[KILL STOP RTMIN RT_1], SA_RESTORER, 0x7feeb5adf8f0}, 8) = 0 <0.000011> 13812 13:57:40.288968 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7feeb0d909d0) = 14391 <0.000243> 13812 13:57:40.289314 rt_sigaction(SIGPIPE, {SIG_IGN, ~[KILL STOP RTMIN RT_1], SA_RESTORER, 0x7feeb5adf8f0}, <unfinished ...> 14391 13:57:40.289351 getsockname(3, <unfinished ...> 13812 13:57:40.289377 <... rt_sigaction resumed> NULL, 8) = 0 <0.000030> 14391 13:57:40.289405 <... getsockname resumed> {sa_family=AF_INET, sin_port=htons(36934), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0 <0.000043> 13812 13:57:40.289432 close(8) = 0 <0.000011> 14391 13:57:40.289469 getpeername(3, <unfinished ...> 13812 13:57:40.289485 fcntl(7, F_GETFL) = 0 (flags O_RDONLY) <0.000009> 14391 13:57:40.289525 <... getpeername resumed> {sa_family=AF_INET, sin_port=htons(389), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0 <0.000044> 13812 13:57:40.289549 fstat(7, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 <0.000010> 14391 13:57:40.289597 fcntl(3, F_GETFD <unfinished ...> 13812 13:57:40.289614 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7feeb744a000 <0.000013> 13812 13:57:40.289660 lseek(7, 0, SEEK_CUR <unfinished ...> 14391 13:57:40.289676 <... fcntl resumed> ) = 0x1 (flags FD_CLOEXEC) <0.000067> 13812 13:57:40.289700 <... lseek resumed> ) = -1 ESPIPE (Illegal seek) <0.000028> 13812 13:57:40.289741 futex(0x7feeb67fd624, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7feeb67fd620, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1 <0.000017> 4691 13:57:40.289799 <... futex resumed> ) = 0 <29.473957> 14391 13:57:40.289825 dup(3 <unfinished ...> 13812 13:57:40.289841 nanosleep({0, 100000}, <unfinished ...> 14391 13:57:40.289858 <... dup resumed> ) = 9 <0.000022> 4691 13:57:40.289875 futex(0x7feeb67fd5e0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> 14391 13:57:40.289908 fcntl(9, F_SETFD, FD_CLOEXEC <unfinished ...> 4691 13:57:40.289932 <... futex resumed> ) = 0 <0.000046> 14391 13:57:40.289947 <... fcntl resumed> ) = 0 <0.000020> 14391 13:57:40.289982 socket(PF_INET, SOCK_STREAM, IPPROTO_IP <unfinished ...> 4691 13:57:40.290001 kill(14391, SIGTERM <unfinished ...> 14391 13:57:40.290014 <... socket resumed> ) = 10 <0.000018> 13812 13:57:40.290028 <... nanosleep resumed> NULL) = 0 <0.000174> 14391 13:57:40.290042 --- SIGTERM (Terminated) @ 0 (0) --- 13812 13:57:40.290115 read(7, <unfinished ...> 4691 13:57:40.290131 <... kill resumed> ) = 0 <0.000120> 4691 13:57:40.290157 futex(0x7feeb67fd624, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1301, {1366977465, 290146000}, ffffffff <unfinished ...> 14391 13:57:40.290237 socket(PF_FILE, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 11 <0.000016> 14391 13:57:40.290294 connect(11, {sa_family=AF_FILE, path="/dev/log"}, 110) = 0 <0.000019> 14391 13:57:40.290359 sendto(11, "<27>Apr 26 13:57:40 bacula-sd: Shutting down Bacula service: 127.0.0.1-sd ...\n", 78, MSG_NOSIGNAL, NULL, 0) = 78 <0.000039> 14391 13:57:40.290467 futex(0x7feeb67fd624, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7feeb67fd620, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 0 <0.000011> 14391 13:57:40.290517 nanosleep({0, 100000}, NULL) = 0 <0.000165> 14391 13:57:40.290759 tgkill(14391, 14391, SIGUSR2) = 0 <0.000011> 14391 13:57:40.290803 nanosleep({0, 50000000}, NULL) = 0 <0.050106> 14391 13:57:40.341067 shutdown(6, 2 /* send and receive */) = 0 <0.000120> 14391 13:57:40.341281 close(6) = 0 <0.000052> 14391 13:57:40.341514 unlink("/var/lib/bacula/bacula-sd.9103.state") = 0 <0.000079> 14391 13:57:40.341658 open("/var/lib/bacula/bacula-sd.9103.state", O_WRONLY|O_CREAT, 0640) = 6 <0.000032> 14391 13:57:40.341730 write(6, "Bacula State\n\0\0\0\4\0\0\0\0\0\0\0\300\0\0\0\0\0\0\0\224\10\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 192) = 192 <0.000026> 14391 13:57:40.341845 lseek(6, 192, SEEK_SET) = 192 <0.000010> 14391 13:57:40.341887 write(6, "\n\0\0\0", 4) = 4 <0.000014> Le 26/03/2013 11:31, le dahut a écrit : > Hi, > > I have : > * Bacula 5.0.1-1ubuntu1 > * Ubuntu 10.04 > > Sometimes, not always, I get this error : > 25-Mar 20:46 127.0.0.1-fd JobId 210: Fatal error: backup.c:1019 > Network send error to SD. ERR=Broken pipe > > > Any idea ? > > > Debug (-d 99 -dt) has been enabled on FD and SD and shows : > FD : > 25-mars-2013 20:43:28 127.0.0.1-fd: cram-md5.c:73-0 send: auth cram-md5 > <1671005128.1364240608@127.0.0.1-fd> ssl=0 > 25-mars-2013 20:43:28 127.0.0.1-fd: cram-md5.c:152-0 sending resp to > challenge: n6A1g//3Y3Rp551zi+/V3A > 25-mars-2013 20:45:23 127.0.0.1-fd: cram-md5.c:73-0 send: auth cram-md5 > <783764871.1364240723@127.0.0.1-fd> ssl=0 > 25-mars-2013 20:45:23 127.0.0.1-fd: cram-md5.c:152-0 sending resp to > challenge: +xIuXS+ObG/kF8ZWtiIc7D > 25-mars-2013 20:46:50 127.0.0.1-fd: cram-md5.c:73-0 send: auth cram-md5 > <817676241.1364240810@127.0.0.1-fd> ssl=0 > 25-mars-2013 20:46:50 127.0.0.1-fd: cram-md5.c:152-0 sending resp to > challenge: n7+zG5k9K60pZ/pNXG+lAB > 25-mars-2013 20:46:51 127.0.0.1-fd: cram-md5.c:152-38 sending resp to > challenge: 8+4K0S/3O/9lG5+lB9wmKB > 25-mars-2013 20:46:51 127.0.0.1-fd: cram-md5.c:80-38 send: auth cram-md5 > <868595135.1364240811@127.0.0.1-fd> ssl=0 > 25-mars-2013 20:46:51 127.0.0.1-fd: cram-md5.c:99-38 Authenticate OK > mjZjqzckW99/X++fn6+C1D > 25-mars-2013 20:47:18 127.0.0.1-fd: cram-md5.c:73-0 send: auth cram-md5 > <1676509308.1364240838@127.0.0.1-fd> ssl=0 > 25-mars-2013 20:47:18 127.0.0.1-fd: cram-md5.c:152-0 sending resp to > challenge: EkJV79/BX//3V+Ecyi/BkD > 25-mars-2013 20:49:13 127.0.0.1-fd: cram-md5.c:73-0 send: auth cram-md5 > <1557604326.1364240953@127.0.0.1-fd> ssl=0 > 25-mars-2013 20:49:13 127.0.0.1-fd: cram-md5.c:152-0 sending resp to > challenge: 55+5Q+/qvh+yKk/ID+lvxA > > SD : > 25-mars-2013 20:37:44 127.0.0.1-sd: cram-md5.c:73-0 send: auth cram-md5 > <67756834.1364240264@127.0.0.1-sd> ssl=0 > 25-mars-2013 20:37:44 127.0.0.1-sd: cram-md5.c:152-0 sending resp to > challenge: j+stRB8R/FVUs7/qj6FbOD > 25-mars-2013 20:37:44 127.0.0.1-sd: dircmd.c:211-0 Message channel init > completed. > 25-mars-2013 20:38:23 127.0.0.1-sd: block.c:788-37 *** Leave > terminate_writing_volume -- OK > 25-mars-2013 20:38:23 127.0.0.1-sd: device.c:125-37 set_unload > > Volume Label: > Id : Bacula 1.0 immortal > VerNo : 11 > VolName : 127.0.0.1-dir-full-0151 > PrevVolName : > VolFile : 0 > LabelType : PRE_LABEL > LabelSize : 0 > PoolName : Full-Pool > MediaType : File > PoolType : Backup > HostName : ng371378 > Date label written: 25-mars-2013 20:38 > > Volume Label: > Id : Bacula 1.0 immortal > VerNo : 11 > VolName : 127.0.0.1-dir-full-0151 > PrevVolName : > VolFile : 0 > LabelType : PRE_LABEL > LabelSize : 0 > PoolName : Full-Pool > MediaType : File > PoolType : Backup > HostName : ng371378 > Date label written: 25-mars-2013 20:38 > > Volume Label: > Id : Bacula 1.0 immortal > VerNo : 11 > VolName : 127.0.0.1-dir-full-0151 > PrevVolName : > VolFile : 0 > LabelType : PRE_LABEL > LabelSize : 181 > PoolName : Full-Pool > MediaType : File > PoolType : Backup > HostName : ng371378 > Date label written: 25-mars-2013 20:38 > 25-mars-2013 20:38:24 127.0.0.1-sd: device.c:133-37 must_unload=0 > 25-mars-2013 20:39:38 127.0.0.1-sd: cram-md5.c:73-0 send: auth cram-md5 > <1028214411.1364240378@127.0.0.1-sd> ssl=0 > 25-mars-2013 20:39:38 127.0.0.1-sd: cram-md5.c:152-0 sending resp to > challenge: 6B/qhC/ZyghPDjURZ3+KrC > 25-mars-2013 20:39:38 127.0.0.1-sd: dircmd.c:211-0 Message channel init > completed. > 25-mars-2013 20:41:33 127.0.0.1-sd: cram-md5.c:73-0 send: auth cram-md5 > <1607966024.1364240493@127.0.0.1-sd> ssl=0 > 25-mars-2013 20:41:33 127.0.0.1-sd: cram-md5.c:152-0 sending resp to > challenge: iX+Z80/F2l/0f61re6/DUB > 25-mars-2013 20:41:33 127.0.0.1-sd: dircmd.c:211-0 Message channel init > completed. > 25-mars-2013 20:43:28 127.0.0.1-sd: cram-md5.c:73-0 send: auth cram-md5 > <1118591179.1364240608@127.0.0.1-sd> ssl=0 > 25-mars-2013 20:43:28 127.0.0.1-sd: cram-md5.c:152-0 sending resp to > challenge: Q7/1I8+LO++bK7+4MQ/IGD > 25-mars-2013 20:43:28 127.0.0.1-sd: dircmd.c:211-0 Message channel init > completed. > 25-mars-2013 20:45:23 127.0.0.1-sd: cram-md5.c:73-0 send: auth cram-md5 > <362267873.1364240723@127.0.0.1-sd> ssl=0 > 25-mars-2013 20:45:23 127.0.0.1-sd: cram-md5.c:152-0 sending resp to > challenge: J+90J5tLb5+tJihH15/gLA > 25-mars-2013 20:45:23 127.0.0.1-sd: dircmd.c:211-0 Message channel init > completed. > 25-mars-2013 20:46:04 127.0.0.1-sd: block.c:788-37 *** Leave > terminate_writing_volume -- OK > 25-mars-2013 20:46:04 127.0.0.1-sd: device.c:125-37 set_unload > Orphaned buffer: 127.0.0.1-sd 272 bytes buf=857d358 allocated at > ../lib/mem_pool.h:93 > Orphaned buffer: 127.0.0.1-sd 1040 bytes buf=857d920 allocated at > ../lib/berrno.h:77 > Orphaned buffer: 127.0.0.1-sd 272 bytes buf=857ca60 allocated at > bpipe.c:82 > Orphaned buffer: 127.0.0.1-sd 1040 bytes buf=858e218 allocated at > bpipe.c:399 > Orphaned buffer: 127.0.0.1-sd 8 bytes buf=856a4a8 allocated at > workq.c:164 > Orphaned buffer: 127.0.0.1-sd 80 bytes buf=857c500 allocated at > block.c:116 > Orphaned buffer: 127.0.0.1-sd 64528 bytes buf=856c300 allocated at > block.c:128 > Orphaned buffer: 127.0.0.1-sd 4016 bytes buf=859ed58 allocated at > dev.c:2236 > Orphaned buffer: 127.0.0.1-sd 32001 bytes buf=859fd28 allocated at > bpipe.c:400 > Orphaned buffer: 127.0.0.1-sd 24 bytes buf=857d2a0 allocated at > bpipe.c:77 > Orphaned buffer: 127.0.0.1-sd 16 bytes buf=856a690 allocated at > watchdog.c:86 > Orphaned buffer: 127.0.0.1-sd 16 bytes buf=856a170 allocated at > watchdog.c:87 > > Volume Label: > Id : Bacula 1.0 immortal > VerNo : 11 > VolName : 127.0.0.1-dir-full-0152 > PrevVolName : > VolFile : 0 > LabelType : PRE_LABEL > LabelSize : 0 > PoolName : Full-Pool > MediaType : File > PoolType : Backup > HostName : ng371378 > Date label written: 25-mars-2013 20:46 > > Volume Label: > Id : Bacula 1.0 immortal > VerNo : 11 > VolName : 127.0.0.1-dir-full-0152 > PrevVolName : > VolFile : 0 > LabelType : PRE_LABEL > LabelSize : 0 > PoolName : Full-Pool > MediaType : File > PoolType : Backup > HostName : ng371378 > Date label written: 25-mars-2013 20:46 > > Volume Label: > Id : Bacula 1.0 immortal > VerNo : 11 > VolName : 127.0.0.1-dir-full-0152 > PrevVolName : > VolFile : 0 > LabelType : PRE_LABEL > LabelSize : 181 > PoolName : Full-Pool > MediaType : File > PoolType : Backup > HostName : ng371378 > Date label written: 25-mars-2013 20:46 > 25-mars-2013 20:46:07 127.0.0.1-sd: device.c:133-37 must_unload=0 > 25-mars-2013 20:46:07 127.0.0.1-sd: append.c:295-37 back from > write_end_session_label() > 25-mars-2013 20:46:14 127.0.0.1-sd: cram-md5.c:73-0 send: auth cram-md5 > <1539942765.1364240774@127.0.0.1-sd> ssl=0 > 25-mars-2013 20:46:14 127.0.0.1-sd: cram-md5.c:152-0 sending resp to > challenge: 4/dLN61EklY//69o378GIB > 25-mars-2013 20:46:14 127.0.0.1-sd: dircmd.c:211-0 Message channel init > completed. > 25-mars-2013 20:46:50 127.0.0.1-sd: cram-md5.c:73-0 send: auth cram-md5 > <1229506003.1364240810@127.0.0.1-sd> ssl=0 > 25-mars-2013 20:46:50 127.0.0.1-sd: cram-md5.c:152-0 sending resp to > challenge: /j/fZC/iQ/B4R9BoMWxcYB > 25-mars-2013 20:46:50 127.0.0.1-sd: dircmd.c:211-0 Message channel init > completed. > 25-mars-2013 20:46:50 127.0.0.1-sd: job.c:170-38 >dird jid=38: 3000 OK > Job SDid=3 SDtime=1364225914 > Authorization=CGEI-DDCB-PIOO-DJKN-EPHK-JGPD-IEOC-JCAK > 25-mars-2013 20:46:50 127.0.0.1-sd: job.c:203-38 > BackupCatalog.2013-03-25_18.08.36_51 waiting 1800 sec for FD to contact > SD key=CGEI-DDCB-PIOO-DJKN-EPHK-JGPD-IEOC-JCAK > 25-mars-2013 20:46:51 127.0.0.1-sd: dircmd.c:180-0 Hello Start Job > BackupCatalog.2013-03-25_18.08.36_51 > 25-mars-2013 20:46:51 127.0.0.1-sd: job.c:251-0 Found Job > BackupCatalog.2013-03-25_18.08.36_51 > 25-mars-2013 20:46:51 127.0.0.1-sd: cram-md5.c:73-0 send: auth cram-md5 > <84414128.1364240811@127.0.0.1-sd> ssl=0 > 25-mars-2013 20:46:51 127.0.0.1-sd: cram-md5.c:152-0 sending resp to > challenge: mjZjqzckW99/X++fn6+C1D > 25-mars-2013 20:46:51 127.0.0.1-sd: job.c:274-0 OK Authentication jid=38 > Job BackupCatalog.2013-03-25_18.08.36_51 > 25-mars-2013 20:46:51 127.0.0.1-sd: job.c:218-38 Auth=1 canceled=0 errstat=0 > 25-mars-2013 20:46:51 127.0.0.1-sd: job.c:224-38 Running job > BackupCatalog.2013-03-25_18.08.36_51 > 25-mars-2013 20:46:51 127.0.0.1-sd: cram-md5.c:73-0 send: auth cram-md5 > <1235087439.1364240811@127.0.0.1-sd> ssl=0 > 25-mars-2013 20:46:51 127.0.0.1-sd: cram-md5.c:152-0 sending resp to > challenge: sl+ne9+0rw/oCR/yTRQmbA > 25-mars-2013 20:46:51 127.0.0.1-sd: dircmd.c:211-0 Message channel init > completed. > 25-mars-2013 20:47:18 127.0.0.1-sd: cram-md5.c:73-0 send: auth cram-md5 > <1424382435.1364240838@127.0.0.1-sd> ssl=0 > 25-mars-2013 20:47:18 127.0.0.1-sd: cram-md5.c:152-0 sending resp to > challenge: m81Ro68OwRs43X/KX/8IMB > 25-mars-2013 20:47:18 127.0.0.1-sd: dircmd.c:211-0 Message channel init > completed. > 25-mars-2013 20:49:13 127.0.0.1-sd: cram-md5.c:73-0 send: auth cram-md5 > <1943585224.1364240953@127.0.0.1-sd> ssl=0 > 25-mars-2013 20:49:13 127.0.0.1-sd: cram-md5.c:152-0 sending resp to > challenge: s5+Kt/dzx4/dB0M8E4Q36C > 25-mars-2013 20:49:13 127.0.0.1-sd: dircmd.c:211-0 Message channel init > completed. > > ------------------------------------------------------------------------------ > Own the Future-Intel® Level Up Game Demo Contest 2013 > Rise to greatness in Intel's independent game demo contest. > Compete for recognition, cash, and the chance to get your game > on Steam. $5K grand prize plus 10 genre and skill prizes. > Submit your demo by 6/6/13. http://p.sf.net/sfu/intel_levelupd2d > _______________________________________________ > Bacula-users mailing list > Bacula-users@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/bacula-users > ------------------------------------------------------------------------------ Try New Relic Now & We'll Send You this Cool Shirt New Relic is the only SaaS-based application performance monitoring service that delivers powerful full stack analytics. Optimize and monitor your browser, app, & servers with just a few lines of code. Try New Relic and get this awesome Nerd Life shirt! http://p.sf.net/sfu/newrelic_d2d_apr _______________________________________________ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users