[Bacula-users] Windows fd unable to communicate with linux sd

2016-09-04 Thread Hodges
I have a working bacula set-up on my linux box, which backs up to a
network drive that is permanently mounted on the linux box where the
director runs.
When I try to get the fd on my windows 10 box (where it is running fine,
as a service) to back up to the same network drive mounted on the linux
box (but to a different folder) I get:


04-Sep 11:58 mailserver-dir JobId 1711: Start Backup JobId 1711, Job=studydata.2016-09-04_11.58.35_16
04-Sep 11:58 mailserver-dir JobId 1711: Using Device "StudyStorage"
04-Sep 11:58 study-fd JobId 1711: Fatal error: Authorization key rejected by Storage daemon.

It is not the key that is at fault, as they are all the same in my
setup. I have read somewhere that when the windows box cannot
communicate for some other reason this error gets generated, but I have
no idea how to trace what is going on and still less how to correct it.


Any suggestions gratefully received. At present I back up the windows 
boxes with Genie, but I find it clunky and would like to standardise on 
bacula


Steve Hodge
--
--
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Windows fd unable to communicate with linux sd

2016-09-06 Thread Hodges

Thanks for the idea Ralf, but no, I don't think it's the firewall.

The system reports that port 9102, used by the windows-fd client, is
open. I don't know how to confirm this absolutely. Could one telnet to
port 9102 from the linux box, or something similar?
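
A minimal sketch of such a check, assuming Python 3 is available on the machine doing the probing. It only tests whether a TCP connection can be opened, which is roughly what a telnet to the port would show; it does not speak the Bacula protocol. The function name `port_is_open` is made up for this illustration.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This is only a reachability probe (like 'telnet host port');
    it does not authenticate or exchange any Bacula messages.
    """
    try:
        # create_connection resolves the host name and connects,
        # raising an OSError subclass on refusal, timeout or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

From the linux box one could call `port_is_open("<windows-host>", 9102)`, where `<windows-host>` stands for the windows machine's name or address. Since the file daemon itself opens a connection to the storage daemon during a backup, running the same probe from the windows box against the storage daemon's port (9103 by default) may be just as informative.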


Anyway I set the firewall open to any machine on the local network when 
I first hit the problem


Steve
On 06/09/2016 07:42, Ralf Brinkmann wrote:

On 04.09.2016 at 13:58, Hodges wrote:
> I have read somewhere that when the windows box cannot
> communicate for some other reason this error gets generated, but I
> have no idea how to trace what is going on and still less how to
> correct it

I suppose the Windows firewall blocks your required port.





Re: [Bacula-users] Windows fd unable to communicate with linux sd

2016-09-07 Thread Hodges

Ralf,

Thanks for this idea. I have installed netcat on the windows box, and of
course it is already on the linux box. Am struggling at the moment with
how to use it, but will persevere.


Steve


On 06/09/2016 12:13, Ralf Brinkmann wrote:

On 06.09.2016 at 11:49, Hodges wrote:

Anyway I set the firewall open to any machine on the local network when
I first hit the problem


hello Steve,

some years ago I made some tests with netcat to verify the network speed
between certain servers and the bacula host.

I'm not familiar with Bacula issues on Windows but there is a netcat
Windows version that might help:

https://joncraton.org/blog/46/netcat-for-windows/


 # netcat -h
[v1.10]
connect to somewhere:   netcat [-options] hostname port[s] [ports] ...
listen for inbound:     netcat -l -p port [-options] [hostname] [port]
options:
    -g gateway      source-routing hop point[s], up to 8
    -G num          source-routing pointer: 4, 8, 12, ...
    -h              this cruft
    -i secs         delay interval for lines sent, ports scanned
    -l              listen mode, for inbound connects
    -n              numeric-only IP addresses, no DNS
    -o file         hex dump of traffic
    -p port         local port number
    -r              randomize local and remote ports
    -s addr         local source address
    -t              answer TELNET negotiation
    -u              UDP mode
    -v              verbose [use twice to be more verbose]
    -w secs         timeout for connects and final net reads
    -z              zero-I/O mode [used for scanning]
port numbers can be individual or ranges: lo-hi [inclusive]






Re: [Bacula-users] Windows fd unable to communicate with linux sd

2016-09-16 Thread Hodges

Been struggling with this for the last week or so, with no real success.

The Console program on the windows box works very nicely, so no 
communications problems there. I can run the whole system on the linux 
box from the windows machine


Cannot get any debug information from the windows client. Looked at
from the linux box, the windows jobs seem to fail because the storage
daemon does not get a response from the windows client - it terminates after
'waiting on the windows fd' for a while. Nor can I get status
information on the windows client from either box.


I am now wondering whether the windows client has installed properly at
all. It did not seem to finish completely when I first installed it, or
when I reinstalled it today. It stopped with a nearly blank window, with just
a 'finish' box in it which did not respond to a click. Had to crash it
to get going.


bacula is running as a service, and as instructed it was installed by 
the Administrator account. Maybe I am using the wrong version or 
something - the file I am using is called bacula-enterprise-win-32 -7.4.0


Steve Hodge


On 09/09/2016 10:55, Kern Sibbald wrote:

Hello,

Probably the best source of information for how to "debug" problems 
such as you are having is the Windows chapter of the manual.  
Specifically it tells you how to get debug output, and for connection 
problems you should invoke the command line with -d50 or greater.  For 
SD problems, you will, of course, have to run a backup job from the 
Director while this debug trace is turned on.

You can also turn on trace output for the SD, which is *much* simpler.

The manual is a bit old and out of date, but what is written is still 
valid.  That said, for getting trace output, it is probably easier to 
turn on debugging, as well as output to a trace file, by using the 
bconsole "setdebug" command.  Again the manual (Console manual) 
explains the setdebug command in more detail.


Best regards,
Kern
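
As an illustration of the setdebug advice above, here is a small sketch that only builds the command line to type into bconsole; it does not run anything. The helper name `setdebug_command` is invented for this example, and the target `client=study-fd` is an assumption based on the daemon name that appears in the job log earlier in the thread.

```python
def setdebug_command(level: int, target: str, trace: bool = True) -> str:
    """Build a bconsole 'setdebug' line.

    level  -- debug level, e.g. 50 for connection problems
    target -- which daemon to address, e.g. 'client=study-fd',
              'storage=<storage-name>', 'dir' or 'all'
    trace  -- if True, ask the daemon to write output to its trace file
    """
    if level < 0:
        raise ValueError("debug level must not be negative")
    trace_flag = 1 if trace else 0
    return f"setdebug level={level} trace={trace_flag} {target}"


# Example: the line to paste into bconsole for the windows client.
print(setdebug_command(50, "client=study-fd"))
```

The printed line would be entered at the bconsole prompt before starting the backup job; calling the helper again with level 0 gives the line that switches debugging off afterwards.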




Re: [Bacula-users] Windows fd unable to communicate with linux sd

2016-09-25 Thread Hodges

Solved

Just a version problem. My linux box has version 5.2.6 (latest debian 
version apparently) and the 7.4.3 windows version does not like it. Wind 
the windows version back to 5.2.6 and it works OK.
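
The version mismatch can be expressed as a small check. This is a sketch under a stated assumption, not an official compatibility matrix: it encodes the commonly cited rule of thumb that the director and storage daemon should be the same version and that a file daemon should not be newer than them. The function names are invented for this example.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '5.2.6' into a comparable tuple (5, 2, 6)."""
    return tuple(int(part) for part in text.strip().split("."))


def fd_compatible(director: str, storage: str, client: str) -> bool:
    """Rule of thumb (assumption): dir == sd, and fd not newer than dir."""
    d = parse_version(director)
    s = parse_version(storage)
    f = parse_version(client)
    return d == s and f <= d


# The combination from this thread: 5.2.6 server daemons, 7.4.3 windows client.
print(fd_compatible("5.2.6", "5.2.6", "7.4.3"))
# After winding the client back to 5.2.6.
print(fd_compatible("5.2.6", "5.2.6", "5.2.6"))
```

Tuples are compared element by element, so 7.4.3 counts as newer than 5.2.6 without any string tricks.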


Anyone know why the Debian stable version is stuck at 5.2.6?


On 16/09/2016 12:04, Hodges wrote:


Been struggling with this for the last week or so, with no real success.

The Console program on the windows box works very nicely, so no 
communications problems there. I can run the whole system on the linux 
box from the windows machine


Cannot get any debug information from the windows client. Looked at 
from the linux box, the windows jobs seem to fail because the storage 
daemon does not get a response from the windows client - it terminates 
after 'waiting on the windows fd' for a while. Nor can I get status 
information on the windows client from either box.


I am now wondering whether the windows client has installed properly 
at all. It did not seem to finish completely when I first installed it, 
or when I reinstalled it today. It stopped with a nearly blank window, 
with just a 'finish' box in it which did not respond to a click. Had to 
crash it to get going.


bacula is running as a service, and as instructed it was installed by 
the Administrator account. Maybe I am using the wrong version or 
something - the file I am using is called bacula-enterprise-win-32 -7.4.0


Steve Hodge


On 09/09/2016 10:55, Kern Sibbald wrote:

Hello,

Probably the best source of information for how to "debug" problems 
such as you are having is the Windows chapter of the manual.  
Specifically it tells you how to get debug output, and for connection 
problems you should invoke the command line with -d50 or greater.  
For SD problems, you will, of course, have to run a backup job from 
the Director while this debug trace is turned on.

You can also turn on trace output for the SD, which is *much* simpler.

The manual is a bit old and out of date, but what is written is still 
valid.  That said, for getting trace output, it is probably easier to 
turn on debugging, as well as output to a trace file, by using the 
bconsole "setdebug" command.  Again the manual (Console manual) 
explains the setdebug command in more detail.


Best regards,
Kern




[Bacula-users] Will latest release run on i686

2019-01-22 Thread Hodges
Been thinking about updating my bacula, which is 5.2.6 and is running on 
an oldish linux server which runs Debian 8 Jessie on an intel Atom i686 
processor.


All the downloads on the website for 9.2.2 are for AMD64 so far as I can 
see, which I guess will not work on my system. Am I right on this? Do I 
have to change my server to run 9.2.2?


Stephen Hodge



Re: [Bacula-users] Will latest release run on i686

2019-01-23 Thread Hodges

Thanks Phil

I guess this means I am correct about  the  AMD64 binary not working then

Seems quite a process to get the source and compile it. I think it is 
probably beyond me


Steve

On 22/01/2019 22:45, Phil Stracchino wrote:

On 1/22/19 5:12 PM, Hodges wrote:

Been thinking about updating my bacula which is 5.2.6 and is running on
an oldish linux server which runs Debian 8 Jessie on an intel Atom i686
processor

All the downloads on the website for 9.2.2 are for AMD64 so far as I can
see, which I guess will not work on my system. Am I right on this? Do I
have to change my server to run 9.2.2


You could always compile from source, if you have an unusual hardware
platform.





Re: [Bacula-users] Will latest release run on i686

2019-01-23 Thread Hodges
Thanks Heitor. This is good, all in one place and written in 'noddy' 
language I can understand! I will have a go.


I will download the latest Dec 2018 version. I need a windows client 
too; will the latest bacula version work with the client on your website?


Steve

On 23/01/2019 10:35, Heitor Faria wrote:

Hello Hodges,

I guess this means I am correct about  the  AMD64 binary not
working then

Seems quite a process to get the source and compile it. I think it
is probably beyond me

It is pretty doable: http://bacula.us/compilation/
Client build instructions at the end. There are not many deps or caveats.

Steve





[Bacula-users] Installation of 9.x.x.

2019-01-29 Thread Hodges
Am following the Bacula Community Installation Guide in an attempt to 
upgrade from 5.0 to 9.20. I am on debian Jessie running on i686


Stuck at the end of paragraph 4.3 and start of 4.4. When I run apt-get 
update I get the following


Quote

Err:13 https://www.bacula.org/packages/5c46fdc554a35/debs/9.2.0./jessie/amd64 jessie Release
  Certificate verification failed: The certificate is NOT trusted. The certificate issuer is unknown.  Could not handshake: Error in the certificate verification. [IP: 80.244.178.6 443]

Reading package lists... Done
W: http://www.bacula.org/packages/5c46fdc554a35/debs/9.2.0./jessie/amd64/dists/jessie/InRelease: No system certificates available. Try installing ca-certificates.
W: http://www.bacula.org/packages/5c46fdc554a35/debs/9.2.0./jessie/amd64/dists/jessie/Release: No system certificates available. Try installing ca-certificates.
E: The repository 'http://www.bacula.org/packages/5c46fdc554a35/debs/9.2.0./jessie/amd64 jessie Release' does not have a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.


Unquote

Just on the offchance I ran apt-get install bacula-postgresql, but of 
course I just got 'cannot locate package bacula-postgresql'.
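
Apart from the certificate errors, the repository path in the output above ends in amd64 while the machine is an i686. A rough sketch of a check for that kind of mismatch follows. It assumes the architecture appears as its own component in the repository URL (as it does in the URL quoted above); the mapping table and function name are made up for this illustration and cover only a few common cases.

```python
# Map kernel machine names (as reported by 'uname -m') to Debian
# architecture names. Deliberately incomplete: illustration only.
DEB_ARCH = {
    "x86_64": "amd64",
    "i386": "i386",
    "i486": "i386",
    "i586": "i386",
    "i686": "i386",
    "aarch64": "arm64",
}


def repo_matches_machine(repo_url: str, machine: str) -> bool:
    """Return True if the Debian arch for 'machine' is a path component of repo_url."""
    deb_arch = DEB_ARCH.get(machine)
    if deb_arch is None:
        return False  # unknown machine type: cannot confirm a match
    return deb_arch in repo_url.rstrip("/").split("/")


url = "https://www.bacula.org/packages/5c46fdc554a35/debs/9.2.0./jessie/amd64"
print(repo_matches_machine(url, "i686"))
```

On the machine itself, `platform.machine()` from the Python standard library returns the value to pass as the second argument.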


I can install bacula 9.4.1 from the debian repositories but it doesn't 
install with same configuration as the Installation Guide, does not seem 
to recognise systemd and generally does not work


I tried compiling from source from Heitor Faria's website, but that 
doesn't work either, for me at least.


Can anyone help?

Steve



[Bacula-users] systemd bacula-director.service masked

2019-02-28 Thread Hodges
When I try to start bacula in systemd I get a 'failed' message saying 
unit bacula-director.service is masked


Does this matter - I guess so. What to do about it? I have just 
installed bacula 9.4.2 from the debian repository into Debian 10 and am 
trying to get it working again
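
A small sketch for confirming the masked state from a script, assuming Python 3 and systemd's `systemctl` on the path. `systemctl is-enabled` prints the unit's state as a single word; the helper names here are invented for this example, and why the unit was masked in the first place is not something this check can tell.

```python
import subprocess


def unit_state(unit: str) -> str:
    """Return the state word printed by 'systemctl is-enabled <unit>'."""
    result = subprocess.run(
        ["systemctl", "is-enabled", unit],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit status is normal for masked/disabled units
    )
    return result.stdout.strip()


def is_masked(state: str) -> bool:
    """True for the two states systemd uses for masked units."""
    return state.strip() in ("masked", "masked-runtime")
```

For example, `is_masked(unit_state("bacula-director.service"))` would be expected to return True on the system described. A mask is normally lifted with `systemctl unmask bacula-director.service`, after which the unit can be started as usual.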


Steve Hodge



Re: [Bacula-users] power failure problem

2015-09-09 Thread Kevin Hodges
Actually it was not a tape problem it seems, when the server rebooted
after the power failure the device names nst0 and nst1 switched which
physical device they linked to. Using persistent names with udev rules
fixed the problem and backups are running again.
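
For anyone hitting the same problem, the sketch below lists persistent tape device names and the kernel device each one currently points at. It assumes a Linux system where udev populates /dev/tape/by-id with symlinks; that directory and the function name are assumptions for this illustration, and the udev rules actually used here may differ.

```python
import os
from typing import Dict


def persistent_tape_names(by_id_dir: str = "/dev/tape/by-id") -> Dict[str, str]:
    """Map each persistent symlink to the device node it currently resolves to.

    Returns an empty dict if the directory does not exist
    (for example on a machine without tape drives).
    """
    mapping: Dict[str, str] = {}
    if not os.path.isdir(by_id_dir):
        return mapping
    for name in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, name)
        if os.path.islink(link):
            # realpath follows the symlink to e.g. /dev/nst0
            mapping[link] = os.path.realpath(link)
    return mapping


for link, node in persistent_tape_names().items():
    print(f"{link} -> {node}")
```

Running this before and after a reboot shows whether nst0 and nst1 have swapped. Pointing each Archive Device in the storage daemon configuration at a persistent name rather than /dev/nst0 or /dev/nst1 keeps "Drive-1" and "Drive-2" tied to the same physical drives.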

Thanks

Kevin

On Thu, 2015-09-03 at 06:53 +, Luc Van der Veken wrote:
> > I don't understand why it want to write on drive 1 and then tries to load a 
> > tape on drive 0!
> 
> Notice the hyphens: drive 0 (/dev/nst0) seems to have a tag "Drive-1" stuck 
> onto it, and Drive 1 the same with "Drive-2".
> 
> Your problem is probably that the power fail occurred half way through a 
> write operation, and now the drive can't locate the end of data anymore.
> 
> I suggest checking the tape condition with btape (commands like status, 
> readlabel, eod, scan).
> 
> 
> 
> -Original Message-
> From: Kevin I. Hodges [mailto:k.i.hod...@reading.ac.uk] 
> Sent: 02 September 2015 22:06
> To: bacula-users@lists.sourceforge.net
> Subject: [Bacula-users] power failure problem
> 
> hi
> 
> I've had bacula, version 7.0.5 running for sometime happily doing my 
> backups to a twin drive IBM autochanger. However, over the weekend we had a 
> power failure and bacula seems to have really got its knickers in a twist. 
> The autochanger itself seems to be fine, I can load and unload tapes with no 
> problems and all the volumes seem to be in the correct slot locations, 
> however when I try and run a backup job I get the following errors and no 
> backup:
> 
> 02-Sep 19:35 swlx1.rdg.ac.uk-dir JobId 396: Start Backup JobId 396, 
> Job=BackupClient1.2015-09-02_19.35.04_34
> 02-Sep 19:35 swlx1.rdg.ac.uk-dir JobId 396: Using Device "Drive-1" to write.
> 02-Sep 19:35 swlx1.rdg.ac.uk-sd JobId 396: 3304 Issuing autochanger "load 
> slot 3, drive 0" command.
> 02-Sep 19:40 swlx1.rdg.ac.uk-sd JobId 396: Fatal error: 3992 Bad autochanger 
> "load slot 3, drive 0": ERR=Child died from signal 15: Termination.
> Results=Program killed by Bacula (timeout)
> 
> 02-Sep 19:40 swlx1.rdg.ac.uk-fd JobId 396: Fatal error: job.c:2444 Bad 
> response from SD to Append Data command. Wanted 3000 OK data
> , got 3903 Error append data:
> 
> 02-Sep 19:40 swlx1.rdg.ac.uk-dir JobId 396: Error: Bacula swlx1.rdg.ac.uk-dir 
> 7.0.5 (28Jul14):
>   Build OS:   x86_64-unknown-linux-gnu redhat Enterprise release
>   JobId:  396
>   Job:BackupClient1.2015-09-02_19.35.04_34
>   Backup Level:   Incremental, since=2015-08-27 19:05:03
>   Client: "swlx1.rdg.ac.uk-fd" 7.0.5 (28Jul14) 
> x86_64-unknown-linux-gnu,redhat,Enterprise release
>   FileSet:"Athena" 2015-06-04 19:05:00
>   Pool:   "Default" (From Job resource)
>   Catalog:"MyCatalog" (From Client resource)
>   Storage:"DigitalTapeLibrary" (From Job resource)
>   Scheduled time: 02-Sep-2015 19:35:00
>   Start time: 02-Sep-2015 19:35:06
>   End time:   02-Sep-2015 19:40:07
>   Elapsed time:   5 mins 1 sec
>   Priority:   10
>   FD Files Written:   0
>   SD Files Written:   0
>   FD Bytes Written:   0 (0 B)
>   SD Bytes Written:   0 (0 B)
>   Rate:   0.0 KB/s
>   Software Compression:   None
>   VSS:no
>   Encryption: no
>   Accurate:   no
>   Volume name(s):
>   Volume Session Id:  1
>   Volume Session Time:1441218705
>   Last Volume Bytes:  7,912,094,883,840 (7.912 TB)
>   Non-fatal FD errors:1
>   SD Errors:  1
>   FD termination status:  Error
>   SD termination status:  Error
>   Termination:*** Backup Error ***
> 
> after this the tape from slot 3 is loaded in drive 0:
> 
> Data Transfer Element 0:Full (Storage Element 3 Loaded):VolumeTag = MD0011L6
> 
> the storage status after this is:
> 
> Device status:
> Autochanger "DigitalTapeLibrary" with devices:
>"Drive-1" (/dev/nst0)
>"Drive-2" (/dev/nst1)
> 
> Device "Drive-1" (/dev/nst0) is not open.
> Drive 0 is not loaded.
> ==
> 
> Device "Drive-2" (/dev/nst1) is not open.
> Drive 1 is not loaded.
> ==
> 
> 
> Used Volume status:
> Reserved volume: MD0011L6 on tape device "Drive-1" (/dev/nst0)
> Reader=0 writers=0 reserves=0 volinuse=1
> 
> I don't understand why it want to write on drive 1 and then tries to load a 
> tape on drive 0!

[Bacula-users] tape problem

2018-09-11 Thread Kevin Hodges
hi

   I came across a problem recently after installing a new single tape
drive for backups. This is a HPE LTO-8 Ultrium machine connected to a
Redhat linux box: Linux swlx1.rdg.ac.uk 3.10.0-862.9.1.el7.x86_64

The problem occurred whilst performing a backup that consists of several
millions of files which are several TB in total size. The backup
stopped after writing ~1.5TB with the director reporting the volume was
full and asking for a new labelled volume. LTO-8 should take at least
12TB (native). This was a surprise but I thought it might be a tape
problem so I unmounted the tape and tried to load a new tape to label
it and mount it to continue but I could not load the new blank tape.
It seemed like the machine continually tried to load the tape without
success and I had to keep pressing the eject button to extract the
tape.

Thinking this might be a hardware problem, I stopped the backup, shut down
the bacula daemons and ran all the vendor tests, which came back
reporting no errors. On restarting the bacula daemons I found I was
able to load the tapes again and restart the backup.

So my question is: if this is not a hardware or tape problem, what
prevents me loading a new tape and labelling it during an ongoing backup
job? Is there some way to pause the backup to allow a new tape to be
labelled?

My storage config is:

Device {
  Name = LTO-8
  Media Type = LTO-8
  Archive Device = /dev/nst0
  AutomaticMount = yes
  AlwaysOpen = yes
  RemovableMedia = yes
  RandomAccess = no
  AutoChanger = no
  Spool Directory = /opt/bacula/working2
  Maximum Spool Size = 100GB
  Maximum Job Spool Size = 50GB
}

Should the AutomaticMount be set to 'no' to stop attempts to
automatically mount any new tape even if it is not labelled?

The issue of the tape being labelled full well before its capacity is
still a mystery.

Thanks for any help

Kevin





Re: [Bacula-users] tape problem

2018-09-17 Thread Kevin Hodges
hi Martin

   found this in the log:

Writing spooled data to Volume. Despooling 50,000,033,271 bytes ...
09-Sep 10:55 swlx1.rdg.ac.uk-sd2 JobId 36: Error: block.c:255 Write error at 1512:13125 on device "LTO-8" (/dev/nst0). ERR=Input/output error.
09-Sep 10:55 swlx1.rdg.ac.uk-sd2 JobId 36: Error: Error writing final EOF to tape. This Volume may not be readable.
tape_dev.c:941 ioctl MTWEOF error on "LTO-8" (/dev/nst0). ERR=Input/output error.

I've restarted the backup from scratch and so far it seems to have got
past the point of failure that occurred last time, so fingers
crossed. There were some network issues around the time of the failure!

Regards

Kevin

On Mon, 2018-09-17 at 14:13 +0100, Martin Simmons wrote:
> When the director stopped at ~1.5TB, did it report any other messages
> (e.g. I/O errors)?
> 
> I suggest looking in the system logs / console for messages around
> that time
> as well.
> 
> __Martin
> 
> 
> > > > > > On Tue, 11 Sep 2018 10:30:31 +0100, Kevin Hodges said:
> > 
> > hi
> > 
> >    I came across a problem recently after installing a new single
> > tape
> > drive for backups. This is a HPE LTO-8 Ultrium machine connected to
> > a
> > Redhat linux box: Linux swlx1.rdg.ac.uk 3.10.0-862.9.1.el7.x86_64
> > 
> > The problem occurred whilst performing a backup that consists of
> > several
> > millions of files which are several TB in total size. The backup
> > stopped after writing ~1.5TB with the director reporting the volume
> > was
> > full and asking for a new labelled volume. LTO-8 should take at
> > least
> > 12TB (native). This was a surprise but I thought it might be a tape
> > problem so I unmounted the tape and tried to load a new tape to
> > label
> > it and mount it to continue but I could not load the new blank
> > tape.
> > It seemed like the machine continually tried to load the tape
> > without
> > success and I had to keep pressing the eject button to extract the
> > tape.
> > 
> > Thinking this might be a hardware problem I stopped the backup
> > shutdown
> > the bacula daemons and ran all the vendor tests which came back as
> > reporting no errors. On restarting the bacula daemons I found I was
> > able to load the tapes again and re-start the backup.
> > 
> > So my question is if this is not a hardware or tape problem what
> > prevents me loading a new tape and labelling during an ongoing
> > backup
> > job, is there some way to pause the backup to allow a new tape to
> > be
> > labelled?
> > 
> > My storage config is:
> > 
> > Device {
> >   Name = LTO-8
> >   Media Type = LTO-8
> >   Archive Device = /dev/nst0
> >   AutomaticMount = yes;  
> >   AlwaysOpen = yes;
> >   RemovableMedia = yes;
> >   RandomAccess = no;
> >   AutoChanger = no
> >   Spool Directory = /opt/bacula/working2 
> >   Maximum Spool Size = 100GB 
> >   Maximum Job Spool Size  = 50GB
> > }
> > 
> > Should the AutomaticMount be set to 'no' to stop attempts to
> > automatically mount any new tape even if it is not labelled?
> > 
> > The issue of the tape being labelled full well before its capacity
> > is
> > still a mystery.
> > 
> > Thanks for any help
> > 
> > Kevin


Re: [Bacula-users] tape problem

2018-09-17 Thread Kevin Hodges
Martin

   found the following around the same time:


Sep  9 10:54:58 swlx1 kernel: st 1:0:0:0: [st0] Sense Key : Medium Error [deferred]
Sep  9 10:54:58 swlx1 kernel: st 1:0:0:0: [st0] Add. Sense: Write append error
Sep  9 10:54:59 swlx1 kernel: st 1:0:0:0: [st0] Sense Key : Medium Error [current]
Sep  9 10:54:59 swlx1 kernel: st 1:0:0:0: [st0] Add. Sense: Write append error
Sep  9 10:55:00 swlx1 kernel: st 1:0:0:0: [st0] Sense Key : Medium Error [current]
Sep  9 10:55:00 swlx1 kernel: st 1:0:0:0: [st0] Add. Sense: Write append error
Sep  9 10:55:00 swlx1 kernel: st 1:0:0:0: [st0] Sense Key : Medium Error [current]
Sep  9 10:55:00 swlx1 kernel: st 1:0:0:0: [st0] Add. Sense: Write append error
Sep  9 10:55:01 swlx1 kernel: st 1:0:0:0: [st0] Sense Key : Medium Error [current]
Sep  9 10:55:01 swlx1 kernel: st 1:0:0:0: [st0] Add. Sense: Write append error
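
For searching a larger syslog for the same kind of message, a short sketch follows. It assumes each kernel message sits on a single line in the log file itself (the wrapping seen in mail is not present there) and that the lines have the "[stN] Sense Key : ... [current|deferred]" shape shown above; the pattern and function name are made up for this example.

```python
import re
from collections import Counter
from typing import Iterable

# Matches e.g. "[st0] Sense Key : Medium Error [current]"
SENSE_KEY = re.compile(
    r"\[(?P<dev>st\d+)\] Sense Key : (?P<key>.+?) \[(?P<state>current|deferred)\]"
)


def count_sense_keys(lines: Iterable[str]) -> Counter:
    """Count SCSI sense keys per tape device found in kernel log lines."""
    counts: Counter = Counter()
    for line in lines:
        match = SENSE_KEY.search(line)
        if match:
            counts[(match.group("dev"), match.group("key").strip())] += 1
    return counts
```

It could be fed an open log file, for example `count_sense_keys(open("/var/log/messages"))`, to see how often each drive reported a medium error and whether the errors cluster around particular tapes or dates.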

Kevin

On Mon, 2018-09-17 at 17:28 +0100, Martin Simmons wrote:
> The "ERR=Input/output error" can be caused by hardware problems, but
> I would
> not expect it from a network problem.  If you have the syslog
> (e.g. /var/log/messages) from that time, I would check for errors
> there too.
> 
> __Martin
> 
> 
> > > > > > On Mon, 17 Sep 2018 14:30:16 +0100, Kevin Hodges said:
> > 
> > hi Martin
> > 
> >    found this in the log:
> > 
> > Writing spooled data to Volume. Despooling 50,000,033,271 bytes ...
> > 09-Sep 10:55 swlx1.rdg.ac.uk-sd2 JobId 36: Error: block.c:255 Write
> > error at 1512:13125 on device "LTO-8" (/dev/nst0). ERR=Input/output
> > error.
> > 09-Sep 10:55 swlx1.rdg.ac.uk-sd2 JobId 36: Error: Error writing
> > final
> > EOF to tape. This Volume may not be readable.
> > tape_dev.c:941 ioctl MTWEOF error on "LTO-8" (/dev/nst0).
> > ERR=Input/output error.
> > 
> > I've restarted the backup from scratch and so far it seems to have
> > got
> > past the point of failure that occurred last time, so fingers
> > crossed. There were some network issues around the time of the
> > failure!
> > 
> > Regards
> > 
> > Kevin
> > 
> > On Mon, 2018-09-17 at 14:13 +0100, Martin Simmons wrote:
> > > When the director stopped at ~1.5TB, did it report any other
> > > messages
> > > (e.g. I/O errors)?
> > > 
> > > I suggest looking in the system logs / console for messages
> > > around
> > > that time
> > > as well.
> > > 
> > > __Martin
> > > 
> > > 
> > > > > > > > On Tue, 11 Sep 2018 10:30:31 +0100, Kevin Hodges said:
> > > > 
> > > > hi
> > > > 
> > > >    I came across a problem recently after installing a new
> > > > single
> > > > tape
> > > > drive for backups. This is a HPE LTO-8 Ultrium machine
> > > > connected to
> > > > a
> > > > Redhat linux box: Linux swlx1.rdg.ac.uk 3.10.0-
> > > > 862.9.1.el7.x86_64
> > > > 
> > > > The problem occurred whilst performing a backup that consists of
> > > > several
> > > > millions of files which are several TB in total size. The
> > > > backup
> > > > stopped after writing ~1.5TB with the director reporting the
> > > > volume
> > > > was
> > > > full and asking for a new labelled volume. LTO-8 should take at
> > > > least
> > > > 12TB (native). This was a surprise but I thought it might be a
> > > > tape
> > > > problem so I unmounted the tape and tried to load a new tape to
> > > > label
> > > > it and mount it to continue but I could not load the new blank
> > > > tape.
> > > > It seemed like the machine continually tried to load the tape
> > > > without
> > > > success and I had to keep pressing the eject button to extract
> > > > the
> > > > tape.
> > > > 
> > > > Thinking this might be a hardware problem I stopped the backup
> > > > shutdown
> > > > the bacula daemons and ran all the vendor tests which came back
> > > > as
> > > > reporting no errors. On restarting the bacula daemons I found I
> > > > was
> > > > able to load the tapes again and re-start the backup.
> > > > 
> > > > So my question is if this is not a hardware or tape problem
> > > > what
> > > > prevents me loading a new tape and labelling during an ongoing
> > > > backup
> > > > job, is there some way to pause the backup to allow 

[Bacula-users] power failure problem

2015-09-02 Thread Kevin I. Hodges



hi

I've had bacula, version 7.0.5, running for some time, happily doing my backups to a twin-drive IBM autochanger. However, over the weekend we had a power failure and bacula seems to have really got its knickers in a twist. The autochanger itself seems to be fine: I can load and unload tapes with no problems and all the volumes seem to be in the correct slot locations. However, when I try to run a backup job I get the following errors and no backup:

02-Sep 19:35 swlx1.rdg.ac.uk-dir JobId 396: Start Backup JobId 396, Job=BackupClient1.2015-09-02_19.35.04_34
02-Sep 19:35 swlx1.rdg.ac.uk-dir JobId 396: Using Device "Drive-1" to write.
02-Sep 19:35 swlx1.rdg.ac.uk-sd JobId 396: 3304 Issuing autochanger "load slot 3, drive 0" command.
02-Sep 19:40 swlx1.rdg.ac.uk-sd JobId 396: Fatal error: 3992 Bad autochanger "load slot 3, drive 0": ERR=Child died from signal 15: Termination.
Results=Program killed by Bacula (timeout)

02-Sep 19:40 swlx1.rdg.ac.uk-fd JobId 396: Fatal error: job.c:2444 Bad response from SD to Append Data command. Wanted 3000 OK data
, got 3903 Error append data: 

02-Sep 19:40 swlx1.rdg.ac.uk-dir JobId 396: Error: Bacula swlx1.rdg.ac.uk-dir 7.0.5 (28Jul14):
  Build OS:   x86_64-unknown-linux-gnu redhat Enterprise release
  JobId:  396
  Job:    BackupClient1.2015-09-02_19.35.04_34
  Backup Level:   Incremental, since=2015-08-27 19:05:03
  Client: "swlx1.rdg.ac.uk-fd" 7.0.5 (28Jul14) x86_64-unknown-linux-gnu,redhat,Enterprise release
  FileSet:    "Athena" 2015-06-04 19:05:00
  Pool:   "Default" (From Job resource)
  Catalog:    "MyCatalog" (From Client resource)
  Storage:    "DigitalTapeLibrary" (From Job resource)
  Scheduled time: 02-Sep-2015 19:35:00
  Start time: 02-Sep-2015 19:35:06
  End time:   02-Sep-2015 19:40:07
  Elapsed time:   5 mins 1 sec
  Priority:   10
  FD Files Written:   0
  SD Files Written:   0
  FD Bytes Written:   0 (0 B)
  SD Bytes Written:   0 (0 B)
  Rate:   0.0 KB/s
  Software Compression:   None
  VSS:    no
  Encryption: no
  Accurate:   no
  Volume name(s): 
  Volume Session Id:  1
  Volume Session Time:    1441218705
  Last Volume Bytes:  7,912,094,883,840 (7.912 TB)
  Non-fatal FD errors:    1
  SD Errors:  1
  FD termination status:  Error
  SD termination status:  Error
  Termination:    *** Backup Error ***

after this the tape from slot 3 is loaded in drive 0:

Data Transfer Element 0:Full (Storage Element 3 Loaded):VolumeTag = MD0011L6

the storage status after this is:

Device status:
Autochanger "DigitalTapeLibrary" with devices:
   "Drive-1" (/dev/nst0)
   "Drive-2" (/dev/nst1)

Device "Drive-1" (/dev/nst0) is not open.
    Drive 0 is not loaded.
==

Device "Drive-2" (/dev/nst1) is not open.
    Drive 1 is not loaded.
==


Used Volume status:
Reserved volume: MD0011L6 on tape device "Drive-1" (/dev/nst0)
    Reader=0 writers=0 reserves=0 volinuse=1

I don't understand why it wants to write on drive 1 and then tries to load a tape on drive 0!

I've seen several posts that seem to describe similar problems but no solutions. Does anyone have any idea how to recover from this situation? Any help gratefully received.

Kevin







