On Monday 07 November 2005 12:49, Russell Howe wrote:
> After seeing people post about performance improvements with a larger
> network buffer size, I thought I'd give it a try on one of our Windows
> file daemons.
>
> My attempt failed, but more worryingly than the configuration change not
> having the desired effect (this may be due to lack of RTFMing on my
> part), the failure of the Windows file daemon caused all further jobs on
> the same storage device to block, waiting for me to hit 'OK' on the
> console of the Windows server.
>
> The director was showing this:
>
> 1920 Full    Artemis.2005-11-04_22.05.05 is waiting on max Storage jobs
> 1919 Full    Zetafax.2005-11-04_22.05.04 is waiting on max Storage jobs
> 1918 Full    Thor.2005-11-04_22.05.03 has terminated
>
> Thor being the job which had errored.
>
> Presumably thor-fd was still holding the connection to the sd open, and
> so Zetafax couldn't get started.
>
> The message the fd was logging was this, from bacula/src/lib/bnet.c:
>
>       if (dbuf_size % TAPE_BSIZE != 0) {
>          Qmsg1(bs->jcr, M_ABORT, 0,
>                _("Network buffer size %d not multiple of tape block size.\n"),
>                dbuf_size);
>       }
>
> Could the win32 fd not log this message either to the director, or to
> the Windows Event Log instead of popping up a dialog box and waiting for
> operator intervention?
>
> Luckily, the machine was local, and intervening wasn't difficult, but
> had it been remote, things could have been trickier.
>
> It worries me that a failure on a client can affect jobs queued for
> other clients in this way.

I think the action that the FD took (i.e. the pop-up dialog) is correct, but 
the implementation is possibly not ideal.  The Win32 FD issues a pop-up 
dialog whenever it determines that there is a program error (as opposed to 
a config error) and the FD is going to terminate.  This, I believe, is 
correct.  However, since it blocked all other jobs by keeping the connection 
to the SD open, I would say that the implementation should be changed to 
clean up (i.e. release resources) before issuing the pop-up.
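
To illustrate the ordering I have in mind, here is a minimal sketch in C. It 
is not Bacula code: the fd_state struct and the functions 
release_sd_connection(), show_dialog() and abort_with_dialog() are all 
hypothetical names, and the "dialog" simply writes to stderr in place of the 
real blocking Win32 pop-up. The only point is that the SD connection is 
released before anything waits on the operator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the file daemon's state; not a Bacula type. */
typedef struct {
    bool sd_connected;                   /* connection to the storage daemon */
    bool dialog_shown;                   /* has the operator been notified?  */
    bool sd_connected_when_dialog_shown; /* recorded for checking the order  */
} fd_state;

/* Hypothetical: drop the SD connection so that queued jobs can proceed. */
static void release_sd_connection(fd_state *fd)
{
    assert(fd != NULL);
    fd->sd_connected = false;
}

/* Hypothetical: stands in for the pop-up dialog. A real dialog would block
 * here until the operator pressed OK; this one only prints the message and
 * records whether the SD connection was still open at that moment. */
static void show_dialog(fd_state *fd, const char *msg)
{
    assert(fd != NULL);
    fd->sd_connected_when_dialog_shown = fd->sd_connected;
    fd->dialog_shown = true;
    fprintf(stderr, "FATAL: %s\n", msg);
}

/* Proposed ordering: clean up first, then wait for the operator. */
static void abort_with_dialog(fd_state *fd, const char *msg)
{
    release_sd_connection(fd);  /* 1. release shared resources        */
    show_dialog(fd, msg);       /* 2. only now block on the operator  */
}
```

With this ordering, a call such as abort_with_dialog(&fd, "Network buffer 
size not multiple of tape block size.") leaves the operator with a dialog to 
acknowledge, but the SD is already free to start the next job.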

I'd appreciate it if you would submit an RFC to this effect -- please wait 
until you see my email about Bacula projects and RFCs later this week. If 
this request sounds a bit strange, I think you will understand when you see 
my email ...

-- 
Best regards,

Kern

  (">
  /\
  V_V

