Re: [Bacula-users] bacula-fd hangs on ClientRunAfterJob
On Thursday 01 November 2007 22:22:40 Dane Miller wrote:
> Eric Bollengier wrote:
> > IMHO, I think that the script doesn't close stderr, stdout and stdin
> > properly.
>
> Hey you're right! stderr is the culprit here. The following Client Run
> After Job script works around the problem:
>
> #!/bin/sh
> # AfterScript.sh
> /usr/local/etc/rc.d/mysql-server start 2> /dev/null
>
> It might be an issue with mysql's launcher script mysqld_safe. I'll look
> into it a bit more.
>
> But why does this cause bacula-fd to hang?

You get the same behaviour in every program that runs commands (I see this in ssh sessions, for example). Bacula reads the script's stdout/stderr to collect the command's output, and if they are not closed properly (at the end of the script, for example), Bacula will wait forever.

Bye

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
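The waiting described in this message can be reproduced without Bacula. The sketch below is only an illustration under assumptions: fake_rc_start is a hypothetical stand-in for the rc script, and "sleep 2" plays the role of the daemon that keeps stderr open. It uses shell command substitution as the reader, which, like any program capturing a command's output through a pipe, only sees end-of-file once every process holding the write end has closed it.

```sh
#!/bin/sh
# Hypothetical stand-in for an rc script whose daemon keeps stderr open.
# "sleep 2" plays the role of the long-running daemon (mysqld in the thread).
fake_rc_start() {
    echo "Starting daemon."
    # stdout and stdin are detached, but stderr is inherited from the caller
    sleep 2 > /dev/null < /dev/null &
}

# Command substitution reads from a pipe until all writers have closed it.
# With 2>&1 the background process inherits the pipe as its stderr.
t0=$(date +%s)
captured=$(fake_rc_start 2>&1)
blocked_for=$(( $(date +%s) - t0 ))

echo "captured: $captured"
echo "reader blocked for about ${blocked_for}s"
```

The function itself returns at once, yet the reader stays blocked for as long as the background process lives. With a real daemon that never exits, the wait never ends, which would also fit the observation elsewhere in the thread that stopping mysql lets the job complete.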
Re: [Bacula-users] bacula-fd hangs on ClientRunAfterJob
Eric Bollengier wrote:
> IMHO, I think that the script doesn't close stderr, stdout and stdin
> properly.

Hey you're right! stderr is the culprit here. The following Client Run After Job script works around the problem:

#!/bin/sh
# AfterScript.sh
/usr/local/etc/rc.d/mysql-server start 2> /dev/null

It might be an issue with mysql's launcher script mysqld_safe. I'll look into it a bit more.

But why does this cause bacula-fd to hang?

Dane
Re: [Bacula-users] bacula-fd hangs on ClientRunAfterJob
Hi,

> I added "Client Run Before/After Job" scripts to two backup jobs in
> order to stop/start mysql. But the "After" script seems to hang the
> bacula-fd. Using 'status dir', the bacula console shows a terminated
> status for these jobs, and the command 'list jobs' shows their status as
> 'R'.
>
> The script being called is the FreeBSD rc script used to start/stop
> mysql: /usr/local/etc/rc.d/mysql-server [start|stop]. I've ensured that
> this script works when run manually, and that it returns 0.
>
> When I restart the offending bacula-fd's (kill doesn't work; requires
> kill -9), the offending jobs finish with errors and the rest of my
> queued jobs begin to run.
>
> Any ideas what's causing this? Suggestions for fixing it?

IMHO, I think that the script doesn't close stderr, stdout and stdin properly. You can try using something like "nohup" when the mysql-server script starts the database, or something like this (in the script):

/path/to/mysqld 2> /dev/null 1> /dev/null < /dev/null

Bye
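The redirection suggested in this message could also live in a small wrapper, so the rc script itself stays untouched. This is a minimal sketch under assumptions: run_detached is a hypothetical helper name, and the "sh -c" command is only a stand-in for an rc script that leaves a background process behind. It is not taken from Bacula or from the FreeBSD rc scripts.

```sh
#!/bin/sh
# Hypothetical helper: run a command with stdin, stdout and stderr all
# detached, so nothing it starts can keep the caller's pipe open.
run_detached() {
    "$@" > /dev/null 2>&1 < /dev/null
}

# Stand-in for an rc script that leaves a background process running.
t0=$(date +%s)
captured=$(run_detached sh -c 'echo "Starting daemon."; sleep 2 &' 2>&1)
status=$?
waited=$(( $(date +%s) - t0 ))

echo "exit status: $status, waited about ${waited}s"
```

A Client Run After Job script built this way would end in something like run_detached /usr/local/etc/rc.d/mysql-server start. The trade-off is that the start-up messages are discarded; redirecting to a log file instead of /dev/null would keep them while still leaving the caller's pipe free.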
Re: [Bacula-users] bacula-fd hangs on ClientRunAfterJob
Interesting... I've been testing this manually and have more info. When bacula-fd is hung, I can fix it by stopping mysql. As soon as I stop mysql, the bacula job completes successfully.

Also, it doesn't matter if my fileset includes mysql's data directory or not. For example, if I set my fileset to only include "File = /tmp/empty.txt", the job still hangs after starting mysql.

What the heck is bacula-fd doing during/after the Client Run After script executes? Is anyone else using a Client Run script to stop/start mysql on FreeBSD?

Dane
Re: [Bacula-users] bacula-fd hangs on ClientRunAfterJob
Michael Lewinger wrote:
> Is bacula running as root or "bacula" user ? "bacula" user cannot restart
> mysql.

Thanks Michael. bacula-fd is running as root. Also note that it *does* stop and start mysql, but hangs afterwards (see the job output from my o.p. below).

Dane

> On 10/30/07, Dane Miller <[EMAIL PROTECTED]> wrote:
> >
> > logged output on the director while job is hung:
> >   zeus-dir: sql_find.c:134 No Job record found: ERR=
> >   CMD=SELECT StartTime FROM Job WHERE JobStatus='T' AND Type='B' AND
> >     Level='F' AND Name='ritin' AND ClientId=5 AND FileSetId=5 ORDER BY
> >     StartTime DESC LIMIT 1
> >   zeus-dir: No prior or suitable Full backup found in catalog. Doing FULL backup.
> >   zeus-dir: Start Backup JobId 31, Job=ritin.2007-10-30_05.05.04
> >   zeus-dir: Created new Volume "ritin-Full-0002" in catalog.
> >   zeus-dir: Using Device "ritinFileStorage"
> >   ritin-fd: ClientRunBeforeJob: run command "/usr/local/etc/rc.d/mysql-server stop"
> >   ritin-fd: ClientRunBeforeJob: Stopping mysql.
> >   ritin-fd: ClientRunBeforeJob: Waiting for PIDS: 96261.
> >   zeus-sd: Labeled new Volume "ritin-Full-0002" on device "ritinFileStorage" (/bacula/disk2).
> >   zeus-sd: Wrote label to prelabeled Volume "ritin-Full-0002" on device "ritinFileStorage" (/bacula/disk2)
> >   zeus-dir: Max Volume jobs exceeded. Marking Volume "ritin-Full-0002" as Used.
> >   ritin-fd: Disallowed filesystem. Will not descend from / into /dev
> >   zeus-sd: Job write elapsed time = 00:57:03, Transfer rate = 9.499 M bytes/second
> >   ritin-fd: ClientAfterJob: run command "/usr/local/etc/rc.d/mysql-server start"
> >   ritin-fd: ClientAfterJob: Starting mysql.
Re: [Bacula-users] bacula-fd hangs on ClientRunAfterJob
Hi Dane,

Is bacula running as root or as the "bacula" user? The "bacula" user cannot restart mysql.

Michael

--
Michael Lewinger
MBR Computers
http://mbrcomp.co.il
[Bacula-users] bacula-fd hangs on ClientRunAfterJob
Hi,

I added "Client Run Before/After Job" scripts to two backup jobs in order to stop/start mysql. But the "After" script seems to hang the bacula-fd. Using 'status dir', the bacula console shows a terminated status for these jobs, and the command 'list jobs' shows their status as 'R'.

The script being called is the FreeBSD rc script used to start/stop mysql: /usr/local/etc/rc.d/mysql-server [start|stop]. I've ensured that this script works when run manually, and that it returns 0.

When I restart the offending bacula-fd's (kill doesn't work; requires kill -9), the offending jobs finish with errors and the rest of my queued jobs begin to run.

Any ideas what's causing this? Suggestions for fixing it?

Here are some details:

Director OS: FreeBSD 6.2-RELEASE
File Daemon OS: FreeBSD 6.1-RELEASE
bacula-dir/sd: 2.2.4
bacula-fd: 2.2.4
catalog: MySQL 5.0

Total # of jobs: 13, of which...
  7 "Priority = 10"
  5 "Priority = 12"
  1 "Priority = 20"

bacula-dir.conf: Director{Maximum Concurrent Jobs=10;...}
bacula-sd.conf: Storage{Maximum Concurrent Jobs=20;...}

Offending Client Run Before/After Job scripts:
  Client Run Before Job = "/usr/local/etc/rc.d/mysql-server stop"
  Client Run After Job = "/usr/local/etc/rc.d/mysql-server start"

logged output on the director while job is hung:
  zeus-dir: sql_find.c:134 No Job record found: ERR=
  CMD=SELECT StartTime FROM Job WHERE JobStatus='T' AND Type='B' AND
    Level='F' AND Name='ritin' AND ClientId=5 AND FileSetId=5 ORDER BY
    StartTime DESC LIMIT 1
  zeus-dir: No prior or suitable Full backup found in catalog. Doing FULL backup.
  zeus-dir: Start Backup JobId 31, Job=ritin.2007-10-30_05.05.04
  zeus-dir: Created new Volume "ritin-Full-0002" in catalog.
  zeus-dir: Using Device "ritinFileStorage"
  ritin-fd: ClientRunBeforeJob: run command "/usr/local/etc/rc.d/mysql-server stop"
  ritin-fd: ClientRunBeforeJob: Stopping mysql.
  ritin-fd: ClientRunBeforeJob: Waiting for PIDS: 96261.
  zeus-sd: Labeled new Volume "ritin-Full-0002" on device "ritinFileStorage" (/bacula/disk2).
  zeus-sd: Wrote label to prelabeled Volume "ritin-Full-0002" on device "ritinFileStorage" (/bacula/disk2)
  zeus-dir: Max Volume jobs exceeded. Marking Volume "ritin-Full-0002" as Used.
  ritin-fd: Disallowed filesystem. Will not descend from / into /dev
  zeus-sd: Job write elapsed time = 00:57:03, Transfer rate = 9.499 M bytes/second
  ritin-fd: ClientAfterJob: run command "/usr/local/etc/rc.d/mysql-server start"
  ritin-fd: ClientAfterJob: Starting mysql.

'status dir' output while job on 'ritin' is hung (truncated):

Running Jobs:
 JobId  Level  Name                         Status
======================================================================
    31  Full   ritin.2007-10-30_05.05.04    has terminated
    34  Full   comdev.2007-10-30_05.05.07   is waiting for higher priority jobs to finish
    35  Full   comstag.2007-10-30_05.05.08  is waiting execution

Dane

--
Dane Miller
Systems Administrator
Great Schools, Inc
http://greatschools.net