Re: [Bacula-users] Incomplete backup - due to bsock error

2017-09-25 Thread Jerry Lowry
Hi, again!
I hate to return to this, but I got the same errors on my other backup
server just minutes ago, running the same type of copy job.  This system
is running the same configuration:
CentOS 6.9
Linux 2.6.32-696.10.1.el6_x86_64
MariaDB 10.2.8
Bacula 9.0.3

Nothing has changed in the Bacula config files since before the upgrade to
the latest version.

Job {
  Name = "CopyWKDiskToDisk"
  Type = Copy
  Level = Full
  FileSet = "Bottom Set"
  Client = distress-fd
  Messages = Standard
  Storage = workstations
  Pool = WorkstationPool
  Maximum Concurrent Jobs = 4
  Selection Type = PoolUncopiedJobs
  Selection Pattern = "DC-*"
}


# File Pool definition
Pool {
  Name = OffsiteBottom
  Pool Type = Copy
  Next Pool = OffsiteBottom
  Storage = bottomswap
  Recycle = yes   # Bacula can automatically recycle Volumes
  AutoPrune = yes # Prune expired volumes
  Volume Retention = 30 years # thirty years
  Maximum Volume Bytes = 1800G  # Limit Volume to disk size
  Maximum Volumes = 10   # Limit number of Volumes in Pool
}


# Definition of file storage device
Storage {
  Name = bottomswap   # offsite disk
# Do not use "localhost" here
  #Address = distress.ACCOUNTING.EDT.LOCAL   # N.B. Use a fully qualified name here
  Address = 10.10.10.3  # N.B. Use a fully qualified name here
  SDPort = 9103
  Password = ""
  Device = BottomSwap
  Media Type = File
}

Device {
  Name = BottomSwap
  Media Type = File
  Archive Device = /BottomSwap
  LabelMedia = yes;   # lets Bacula label unlabeled media
  Random Access = Yes;
  AutomaticMount = yes;   # when device opened, read it
  RemovableMedia = no;
  AlwaysOpen = no;
}
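
The Storage and Device resources above have to agree with each other: as far as I know, the Director's Storage resource names a Device defined in the Storage daemon's configuration, and the Media Type on both sides is expected to match. A minimal Python sketch of that cross-check follows. It is not Bacula's own parser: it assumes one directive per line and no '#' inside values, and the function names parse_resources and storage_matches_device are made up for illustration.

```python
import re


def parse_resources(text):
    """Parse simple Bacula-style resource blocks into (type, directives) pairs."""
    resources, current = [], None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # naive comment stripping
        if not line:
            continue
        header = re.fullmatch(r"(\w+)\s*\{", line)
        if header:
            current = (header.group(1), {})
        elif line == "}":
            if current is not None:
                resources.append(current)
            current = None
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            # Directive names are compared without regard to case or spaces
            key = key.replace(" ", "").lower()
            current[1][key] = value.strip().rstrip(";").strip().strip('"')
    return resources


def storage_matches_device(storage, device):
    """True when the Storage resource points at this Device with the same Media Type."""
    return (storage.get("device") == device.get("name")
            and storage.get("mediatype") == device.get("mediatype"))
```

Fed the Storage and Device blocks from this message, the check passes (Device = BottomSwap and Media Type = File on both sides), so a mismatch there does not look like the cause of the bsock error.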

I checked the message log and there are no network errors.

dmesg shows the disk change that I just finished, but there are no errors!

I'm at a loss, as I don't want to keep restarting these backups due to time
constraints with the other backup jobs.

Were there any changes in this part of the code for v 9.0.3?

jerry


On Fri, Sep 22, 2017 at 11:19 AM, Jerry Lowry 
wrote:

> Yes, kilchis is a bona fide hardware server. The only VMs I have are test
> systems running on my desktop.
>
> There are 2 copy jobs on this system. This particular job is the one that
> typically runs long enough that it will need a new volume during the
> night.  The other one will if it is run late in the day and the current
> volume does not have very much space left on it. The other daily backup
> jobs will wait until the copy job is finished, but there is nothing else
> running on the system that utilizes the network except for VNC traffic.
> This problem happened two weeks in a row and this last week it worked just
> fine.  The one thing that is different is that I dropped all of the current
> backup files and purged them from the DB. I then recreated new files to
> backup to.  Just wondering if one of the files was writing on a
> questionable sector on disk.  Nothing in the logs and smart does not give
> any details on that.
>
> I think I will call it a fluke and keep a watch on it in the future..
> Thanks!
>
> On Fri, Sep 22, 2017 at 10:27 AM, Martin Simmons 
> wrote:
>
>> That's odd -- the reading side looks normal to me until the error is
>> detected.
>>
>> Also, "Connection reset by peer" doesn't normally occur when connected to
>> the current machine.
>>
>> Is kilchis a real computer (not a VM)?
>>
>> Is this the only copy job that waits overnight for someone to label a new
>> volume?
>>
>> Maybe something happens overnight on the system that causes networking to
>> be disrupted in some subtle way, causing "Connection reset by peer" when the
>> connection is closed cleanly?
>>
>> __Martin
>>
>>
>> > On Tue, 19 Sep 2017 15:31:46 -0700, Jerry Lowry said:
>> >
>> > The reading side is the same system.  It is a copy job setup to backup
>> > daily backups to the offsite backup disk.
>> > The attachment is the bacula jobid 35202.
>> >
>> > jerry
>> >
>> > On Tue, Sep 19, 2017 at 10:08 AM, Martin Simmons 
>> > wrote:
>> >
>> > > The email below is from the writing side of the copy job and the message:
>> > >
>> > > 13-Sep 08:43 kilchis JobId 35203: Error: bsock.c:849 Read error from
>> > > Storage daemon:kilchis:9103: ERR=Connection reset by peer
>> > >
>> > > shows that the connection to the reading side of the job was closed
>> > > unexpectedly from the reading end.
>> > >
>> > > Do you have the corresponding email from the reading side?  It will have a
>> > > different JobId (but should mention JobId 35203) and should start with
>> > > something like "Using Device ... to read."
>> > >
>> > > __Martin
>> > >
>> > >
>> > > > On Mon, 18 Sep 2017 13:42:19 -0700, Jerry Lowry said:
>> > > >
>> > > > Martin,
>> > > > Here is the complete email that was sent just before the "Copy Error"
>> > > > message:
>> > > >
>> > > > 

Re: [Bacula-users] Incomplete backup - due to bsock error

2017-09-22 Thread Jerry Lowry
Yes, kilchis is a bona fide hardware server. The only VMs I have are test
systems running on my desktop.

There are 2 copy jobs on this system. This particular job is the one that
typically runs long enough that it will need a new volume during the
night.  The other one will if it is run late in the day and the current
volume does not have very much space left on it. The other daily backup
jobs will wait until the copy job is finished, but there is nothing else
running on the system that utilizes the network except for VNC traffic.
This problem happened two weeks in a row and this last week it worked just
fine.  The one thing that is different is that I dropped all of the current
backup files and purged them from the DB. I then recreated new files to
backup to.  Just wondering if one of the files was writing on a
questionable sector on disk.  Nothing in the logs and smart does not give
any details on that.

I think I will call it a fluke and keep a watch on it in the future..
Thanks!

On Fri, Sep 22, 2017 at 10:27 AM, Martin Simmons 
wrote:

> That's odd -- the reading side looks normal to me until the error is
> detected.
>
> Also, "Connection reset by peer" doesn't normally occur when connected to
> the current machine.
>
> Is kilchis a real computer (not a VM)?
>
> Is this the only copy job that waits overnight for someone to label a new
> volume?
>
> Maybe something happens overnight on the system that causes networking to
> be disrupted in some subtle way, causing "Connection reset by peer" when the
> connection is closed cleanly?
>
> __Martin
>
>
> > On Tue, 19 Sep 2017 15:31:46 -0700, Jerry Lowry said:
> >
> > The reading side is the same system.  It is a copy job setup to backup
> > daily backups to the offsite backup disk.
> > The attachment is the bacula jobid 35202.
> >
> > jerry
> >
> > On Tue, Sep 19, 2017 at 10:08 AM, Martin Simmons 
> > wrote:
> >
> > > The email below is from the writing side of the copy job and the message:
> > >
> > > 13-Sep 08:43 kilchis JobId 35203: Error: bsock.c:849 Read error from
> > > Storage daemon:kilchis:9103: ERR=Connection reset by peer
> > >
> > > shows that the connection to the reading side of the job was closed
> > > unexpectedly from the reading end.
> > >
> > > Do you have the corresponding email from the reading side?  It will have a
> > > different JobId (but should mention JobId 35203) and should start with
> > > something like "Using Device ... to read."
> > >
> > > __Martin
> > >
> > >
> > > > On Mon, 18 Sep 2017 13:42:19 -0700, Jerry Lowry said:
> > > >
> > > > Martin,
> > > > Here is the complete email that was sent just before the "Copy Error"
> > > > message:
> > > >
> > > > 12-Sep 15:09 kilchis-dir JobId 35203: Using Device "MidSwap" to write.
> > > > 12-Sep 15:09 kilchis JobId 35203: Volume "homeMS-200" previously written, moving to end of data.
> > > > 12-Sep 15:27 kilchis JobId 35203: End of medium on Volume "homeMS-200" Bytes=1,932,735,274,146 Blocks=29,959,317 at 12-Sep-2017 15:27.
> > > > 12-Sep 15:28 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is waiting. Cannot find any appendable volumes.
> > > > Please use the "label" command to create a new Volume for:
> > > > Storage:  "MidSwap" (/MidSwap)
> > > > Pool: OffsiteMid
> > > > Media type:   File
> > > > 12-Sep 15:36 kilchis JobId 35203: Wrote label to prelabeled Volume "homeMS-201" on File device "MidSwap" (/MidSwap)
> > > > 12-Sep 15:36 kilchis JobId 35203: New volume "homeMS-201" mounted on device "MidSwap" (/MidSwap) at 12-Sep-2017 15:36.
> > > > 12-Sep 19:54 kilchis JobId 35203: End of medium on Volume "homeMS-201" Bytes=1,932,735,281,790 Blocks=29,959,315 at 12-Sep-2017 19:54.
> > > > 12-Sep 19:54 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is waiting. Cannot find any appendable volumes.
> > > > Please use the "label" command to create a new Volume for:
> > > > Storage:  "MidSwap" (/MidSwap)
> > > > Pool: OffsiteMid
> > > > Media type:   File
> > > > 12-Sep 20:57 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is waiting. Cannot find any appendable volumes.
> > > > Please use the "label" command to create a new Volume for:
> > > > Storage:  "MidSwap" (/MidSwap)
> > > > Pool: OffsiteMid
> > > > Media type:   File
> > > > 12-Sep 23:03 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is waiting. Cannot find any appendable volumes.
> > > > Please use the "label" command to create a new Volume for:
> > > > Storage:  "MidSwap" (/MidSwap)
> > > > Pool: OffsiteMid
> > > > Media type:   File
> > > > 13-Sep 03:15 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is waiting. Cannot find any appendable volumes.
> > > > Please use the "label" command to create a new Volume for:
> > > > Storage:  "MidSwap" (/MidSwap)
> > > > Pool: 

Re: [Bacula-users] Incomplete backup - due to bsock error

2017-09-22 Thread Martin Simmons
That's odd -- the reading side looks normal to me until the error is detected.

Also, "Connection reset by peer" doesn't normally occur when connected to the
current machine.

Is kilchis a real computer (not a VM)?

Is this the only copy job that waits overnight for someone to label a new
volume?

Maybe something happens overnight on the system that causes networking to be
disrupted in some subtle way, causing "Connection reset by peer" when the
connection is closed cleanly?
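
To make the difference between a clean close and "Connection reset by peer" concrete, here is a small Python sketch. It is not Bacula code and says nothing about which process misbehaved here; it only shows that a reset can occur on a connection that never leaves the machine when one end aborts instead of closing normally. It assumes a Linux/Unix host (SO_LINGER packed as two C ints), and the function name read_after_peer_close is hypothetical.

```python
import socket
import struct
import threading


def read_after_peer_close(abort):
    """Connect over loopback, let the peer close, and report what the reader sees."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # loopback only, OS-assigned port
    listener.listen(1)
    port = listener.getsockname()[1]

    def peer():
        conn, _ = listener.accept()
        if abort:
            # l_onoff=1, l_linger=0: close() drops the connection and sends RST
            conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                            struct.pack("ii", 1, 0))
        conn.close()

    worker = threading.Thread(target=peer)
    worker.start()
    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    worker.join()  # the peer has closed by the time we read
    try:
        data = client.recv(1024)
        return "clean EOF" if data == b"" else "data"
    except ConnectionResetError:
        return "connection reset by peer"
    finally:
        client.close()
        listener.close()
```

With abort=False the reader simply sees end of file; with abort=True it gets the same errno text that bsock.c reported, even though both ends are on 127.0.0.1.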

__Martin


> On Tue, 19 Sep 2017 15:31:46 -0700, Jerry Lowry said:
> 
> The reading side is the same system.  It is a copy job setup to backup
> daily backups to the offsite backup disk.
> The attachment is the bacula jobid 35202.
> 
> jerry
> 
> On Tue, Sep 19, 2017 at 10:08 AM, Martin Simmons 
> wrote:
> 
> > The email below is from the writing side of the copy job and the message:
> >
> > 13-Sep 08:43 kilchis JobId 35203: Error: bsock.c:849 Read error from
> > Storage daemon:kilchis:9103: ERR=Connection reset by peer
> >
> > shows that the connection to the reading side of the job was closed
> > unexpectedly from the reading end.
> >
> > Do you have the corresponding email from the reading side?  It will have a
> > different JobId (but should mention JobId 35203) and should start with
> > something like "Using Device ... to read."
> >
> > __Martin
> >
> >
> > > On Mon, 18 Sep 2017 13:42:19 -0700, Jerry Lowry said:
> > >
> > > Martin,
> > > Here is the complete email that was sent just before the "Copy Error"
> > > message:
> > >
> > > 12-Sep 15:09 kilchis-dir JobId 35203: Using Device "MidSwap" to write.
> > > 12-Sep 15:09 kilchis JobId 35203: Volume "homeMS-200" previously written, moving to end of data.
> > > 12-Sep 15:27 kilchis JobId 35203: End of medium on Volume "homeMS-200" Bytes=1,932,735,274,146 Blocks=29,959,317 at 12-Sep-2017 15:27.
> > > 12-Sep 15:28 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is waiting. Cannot find any appendable volumes.
> > > Please use the "label" command to create a new Volume for:
> > > Storage:  "MidSwap" (/MidSwap)
> > > Pool: OffsiteMid
> > > Media type:   File
> > > 12-Sep 15:36 kilchis JobId 35203: Wrote label to prelabeled Volume "homeMS-201" on File device "MidSwap" (/MidSwap)
> > > 12-Sep 15:36 kilchis JobId 35203: New volume "homeMS-201" mounted on device "MidSwap" (/MidSwap) at 12-Sep-2017 15:36.
> > > 12-Sep 19:54 kilchis JobId 35203: End of medium on Volume "homeMS-201" Bytes=1,932,735,281,790 Blocks=29,959,315 at 12-Sep-2017 19:54.
> > > 12-Sep 19:54 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is waiting. Cannot find any appendable volumes.
> > > Please use the "label" command to create a new Volume for:
> > > Storage:  "MidSwap" (/MidSwap)
> > > Pool: OffsiteMid
> > > Media type:   File
> > > 12-Sep 20:57 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is waiting. Cannot find any appendable volumes.
> > > Please use the "label" command to create a new Volume for:
> > > Storage:  "MidSwap" (/MidSwap)
> > > Pool: OffsiteMid
> > > Media type:   File
> > > 12-Sep 23:03 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is waiting. Cannot find any appendable volumes.
> > > Please use the "label" command to create a new Volume for:
> > > Storage:  "MidSwap" (/MidSwap)
> > > Pool: OffsiteMid
> > > Media type:   File
> > > 13-Sep 03:15 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is waiting. Cannot find any appendable volumes.
> > > Please use the "label" command to create a new Volume for:
> > > Storage:  "MidSwap" (/MidSwap)
> > > Pool: OffsiteMid
> > > Media type:   File
> > > 13-Sep 08:23 kilchis JobId 35203: Wrote label to prelabeled Volume "homeMS-202" on File device "MidSwap" (/MidSwap)
> > > 13-Sep 08:23 kilchis JobId 35203: New volume "homeMS-202" mounted on device "MidSwap" (/MidSwap) at 13-Sep-2017 08:23.
> > > 13-Sep 08:43 kilchis JobId 35203: Error: bsock.c:849 Read error from Storage daemon:kilchis:9103: ERR=Connection reset by peer
> > > 13-Sep 08:43 kilchis JobId 35203: Fatal error: append.c:271 Network error reading from FD. ERR=Connection reset by peer
> > > 13-Sep 08:43 kilchis JobId 35203: Elapsed time=04:56:15, Transfer rate=125.6 M Bytes/second
> > > 13-Sep 08:43 kilchis JobId 35203: Sending spooled attrs to the Director. Despooling 1,533,148,574 bytes ...
> > >
> > > I don't have the job log. Interestingly, I did not have any problems with
> > > this or any other copy job before I upgraded.  I went from 5.2.13 to 9.0.3
> > > of Bacula and latest version of MySql to Mariadb.  Not saying that this is
> > > a problem, because I have 5 other copy jobs that work without error still.
> > > This one just happens to be the biggest one.
> > >
> > > thanks,
> > > jerry
> > >
> > > On Mon, Sep 18, 2017 at 7:55 A

Re: [Bacula-users] Incomplete backup - due to bsock error

2017-09-19 Thread Jerry Lowry
The reading side is the same system.  It is a copy job setup to backup
daily backups to the offsite backup disk.
The attachment is the bacula jobid 35202.

jerry

On Tue, Sep 19, 2017 at 10:08 AM, Martin Simmons 
wrote:

> The email below is from the writing side of the copy job and the message:
>
> 13-Sep 08:43 kilchis JobId 35203: Error: bsock.c:849 Read error from
> Storage daemon:kilchis:9103: ERR=Connection reset by peer
>
> shows that the connection to the reading side of the job was closed
> unexpectedly from the reading end.
>
> Do you have the corresponding email from the reading side?  It will have a
> different JobId (but should mention JobId 35203) and should start with
> something like "Using Device ... to read."
>
> __Martin
>
>
> > On Mon, 18 Sep 2017 13:42:19 -0700, Jerry Lowry said:
> >
> > Martin,
> > Here is the complete email that was sent just before the "Copy Error"
> > message:
> >
> > 12-Sep 15:09 kilchis-dir JobId 35203: Using Device "MidSwap" to write.
> > 12-Sep 15:09 kilchis JobId 35203: Volume "homeMS-200" previously written, moving to end of data.
> > 12-Sep 15:27 kilchis JobId 35203: End of medium on Volume "homeMS-200" Bytes=1,932,735,274,146 Blocks=29,959,317 at 12-Sep-2017 15:27.
> > 12-Sep 15:28 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is waiting. Cannot find any appendable volumes.
> > Please use the "label" command to create a new Volume for:
> > Storage:  "MidSwap" (/MidSwap)
> > Pool: OffsiteMid
> > Media type:   File
> > 12-Sep 15:36 kilchis JobId 35203: Wrote label to prelabeled Volume "homeMS-201" on File device "MidSwap" (/MidSwap)
> > 12-Sep 15:36 kilchis JobId 35203: New volume "homeMS-201" mounted on device "MidSwap" (/MidSwap) at 12-Sep-2017 15:36.
> > 12-Sep 19:54 kilchis JobId 35203: End of medium on Volume "homeMS-201" Bytes=1,932,735,281,790 Blocks=29,959,315 at 12-Sep-2017 19:54.
> > 12-Sep 19:54 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is waiting. Cannot find any appendable volumes.
> > Please use the "label" command to create a new Volume for:
> > Storage:  "MidSwap" (/MidSwap)
> > Pool: OffsiteMid
> > Media type:   File
> > 12-Sep 20:57 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is waiting. Cannot find any appendable volumes.
> > Please use the "label" command to create a new Volume for:
> > Storage:  "MidSwap" (/MidSwap)
> > Pool: OffsiteMid
> > Media type:   File
> > 12-Sep 23:03 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is waiting. Cannot find any appendable volumes.
> > Please use the "label" command to create a new Volume for:
> > Storage:  "MidSwap" (/MidSwap)
> > Pool: OffsiteMid
> > Media type:   File
> > 13-Sep 03:15 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is waiting. Cannot find any appendable volumes.
> > Please use the "label" command to create a new Volume for:
> > Storage:  "MidSwap" (/MidSwap)
> > Pool: OffsiteMid
> > Media type:   File
> > 13-Sep 08:23 kilchis JobId 35203: Wrote label to prelabeled Volume "homeMS-202" on File device "MidSwap" (/MidSwap)
> > 13-Sep 08:23 kilchis JobId 35203: New volume "homeMS-202" mounted on device "MidSwap" (/MidSwap) at 13-Sep-2017 08:23.
> > 13-Sep 08:43 kilchis JobId 35203: Error: bsock.c:849 Read error from Storage daemon:kilchis:9103: ERR=Connection reset by peer
> > 13-Sep 08:43 kilchis JobId 35203: Fatal error: append.c:271 Network error reading from FD. ERR=Connection reset by peer
> > 13-Sep 08:43 kilchis JobId 35203: Elapsed time=04:56:15, Transfer rate=125.6 M Bytes/second
> > 13-Sep 08:43 kilchis JobId 35203: Sending spooled attrs to the Director. Despooling 1,533,148,574 bytes ...
> >
> > I don't have the job log. Interestingly, I did not have any problems with
> > this or any other copy job before I upgraded.  I went from 5.2.13 to 9.0.3
> > of Bacula and latest version of MySql to Mariadb.  Not saying that this is
> > a problem, because I have 5 other copy jobs that work without error still.
> > This one just happens to be the biggest one.
> >
> > thanks,
> > jerry
> >
> > On Mon, Sep 18, 2017 at 7:55 AM, Martin Simmons 
> > wrote:
> >
> > > A copy job will communicate using TCP between the Bacula daemons.  A bsock
> > > error could indicate that bacula-sd closed the connection unexpectedly and I
> > > would expect media errors to be logged.
> > >
> > > Your syslog did include some I/O errors.  Are they caused by something
> > > else?
> > >
> > > Do you have the complete job log (from the Bacula log, not the syslog)?
> > >
> > > __Martin
> > >
> > >
> > > > On Wed, 13 Sep 2017 09:35:07 -0700, Jerry Lowry said:
> > > >
> > > > Kern,
> > > > My Offsite Backup just failed again on the same drive, different
> disk. It
> > > > failed with the same bsock error.  If the backup is working on the
> same
> > > > system using the copy

Re: [Bacula-users] Incomplete backup - due to bsock error

2017-09-19 Thread Martin Simmons
The email below is from the writing side of the copy job and the message:

13-Sep 08:43 kilchis JobId 35203: Error: bsock.c:849 Read error from Storage 
daemon:kilchis:9103: ERR=Connection reset by peer

shows that the connection to the reading side of the job was closed
unexpectedly from the reading end.

Do you have the corresponding email from the reading side?  It will have a
different JobId (but should mention JobId 35203) and should start with
something like "Using Device ... to read."
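
The timestamps in the quoted writing-side log can be turned into a rough timeline. The Python sketch below is only arithmetic on times read off that log (to minute resolution); the names ACTIVE, active_time and longest_wait are invented for illustration, and parsing "Sep" with %b assumes an English or C locale.

```python
from datetime import datetime, timedelta

FMT = "%d-%b-%Y %H:%M"

# Periods during which JobId 35203 had a volume mounted, read off the log
ACTIVE = [
    ("12-Sep-2017 15:09", "12-Sep-2017 15:27"),  # homeMS-200 until end of medium
    ("12-Sep-2017 15:36", "12-Sep-2017 19:54"),  # homeMS-201 until end of medium
    ("13-Sep-2017 08:23", "13-Sep-2017 08:43"),  # homeMS-202 until the reset
]


def _parse(stamp):
    return datetime.strptime(stamp, FMT)


def active_time(periods):
    """Total time spent with a volume mounted."""
    return sum((_parse(end) - _parse(start) for start, end in periods),
               timedelta())


def longest_wait(periods):
    """Longest gap between losing one volume and mounting the next."""
    gaps = [_parse(periods[i + 1][0]) - _parse(periods[i][1])
            for i in range(len(periods) - 1)]
    return max(gaps)
```

The mounted periods add up to 4 hours 56 minutes, which agrees with the reported "Elapsed time=04:56:15", so the elapsed time appears to exclude the waits for a new volume. The longest wait is 12 hours 29 minutes (19:54 to 08:23), and the reset arrived 20 minutes after writing resumed.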

__Martin


> On Mon, 18 Sep 2017 13:42:19 -0700, Jerry Lowry said:
> 
> Martin,
> Here is the complete email that was sent just before the "Copy Error"
> message:
> 
> 12-Sep 15:09 kilchis-dir JobId 35203: Using Device "MidSwap" to write.
> 12-Sep 15:09 kilchis JobId 35203: Volume "homeMS-200" previously written, moving to end of data.
> 12-Sep 15:27 kilchis JobId 35203: End of medium on Volume "homeMS-200" Bytes=1,932,735,274,146 Blocks=29,959,317 at 12-Sep-2017 15:27.
> 12-Sep 15:28 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is waiting. Cannot find any appendable volumes.
> Please use the "label" command to create a new Volume for:
> Storage:  "MidSwap" (/MidSwap)
> Pool: OffsiteMid
> Media type:   File
> 12-Sep 15:36 kilchis JobId 35203: Wrote label to prelabeled Volume "homeMS-201" on File device "MidSwap" (/MidSwap)
> 12-Sep 15:36 kilchis JobId 35203: New volume "homeMS-201" mounted on device "MidSwap" (/MidSwap) at 12-Sep-2017 15:36.
> 12-Sep 19:54 kilchis JobId 35203: End of medium on Volume "homeMS-201" Bytes=1,932,735,281,790 Blocks=29,959,315 at 12-Sep-2017 19:54.
> 12-Sep 19:54 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is waiting. Cannot find any appendable volumes.
> Please use the "label" command to create a new Volume for:
> Storage:  "MidSwap" (/MidSwap)
> Pool: OffsiteMid
> Media type:   File
> 12-Sep 20:57 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is waiting. Cannot find any appendable volumes.
> Please use the "label" command to create a new Volume for:
> Storage:  "MidSwap" (/MidSwap)
> Pool: OffsiteMid
> Media type:   File
> 12-Sep 23:03 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is waiting. Cannot find any appendable volumes.
> Please use the "label" command to create a new Volume for:
> Storage:  "MidSwap" (/MidSwap)
> Pool: OffsiteMid
> Media type:   File
> 13-Sep 03:15 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is waiting. Cannot find any appendable volumes.
> Please use the "label" command to create a new Volume for:
> Storage:  "MidSwap" (/MidSwap)
> Pool: OffsiteMid
> Media type:   File
> 13-Sep 08:23 kilchis JobId 35203: Wrote label to prelabeled Volume "homeMS-202" on File device "MidSwap" (/MidSwap)
> 13-Sep 08:23 kilchis JobId 35203: New volume "homeMS-202" mounted on device "MidSwap" (/MidSwap) at 13-Sep-2017 08:23.
> 13-Sep 08:43 kilchis JobId 35203: Error: bsock.c:849 Read error from Storage daemon:kilchis:9103: ERR=Connection reset by peer
> 13-Sep 08:43 kilchis JobId 35203: Fatal error: append.c:271 Network error reading from FD. ERR=Connection reset by peer
> 13-Sep 08:43 kilchis JobId 35203: Elapsed time=04:56:15, Transfer rate=125.6 M Bytes/second
> 13-Sep 08:43 kilchis JobId 35203: Sending spooled attrs to the Director. Despooling 1,533,148,574 bytes ...
> 
> I don't have the job log. Interestingly, I did not have any problems with
> this or any other copy job before I upgraded.  I went from 5.2.13 to 9.0.3
> of Bacula and latest version of MySql to Mariadb.  Not saying that this is
> a problem, because I have 5 other copy jobs that work without error still.
> This one just happens to be the biggest one.
> 
> thanks,
> jerry
> 
> On Mon, Sep 18, 2017 at 7:55 AM, Martin Simmons 
> wrote:
> 
> > A copy job will communicate using TCP between the Bacula daemons.  A bsock
> > error could indicate that bacula-sd closed the connection unexpectedly and I
> > would expect media errors to be logged.
> >
> > Your syslog did include some I/O errors.  Are they caused by something
> > else?
> >
> > Do you have the complete job log (from the Bacula log, not the syslog)?
> >
> > __Martin
> >
> >
> > > On Wed, 13 Sep 2017 09:35:07 -0700, Jerry Lowry said:
> > >
> > > Kern,
> > > My Offsite Backup just failed again on the same drive, different disk. It
> > > failed with the same bsock error.  If the backup is working on the same
> > > system using the copy function, how far out of the network stack does it
> > > go.  My thinking is it does not get out of the application layer.  Is
> > this
> > > right?  Why would I get a bsock error?
> > >
> > > I have taken a look at the smart data for the disk and they seem to be
> > > running okay. I am getting some sector relocation errors, would that
> > cause
> > > the bsock error during a remap

Re: [Bacula-users] Incomplete backup - due to bsock error

2017-09-18 Thread Martin Simmons
A copy job will communicate using TCP between the Bacula daemons.  A bsock
error could indicate that bacula-sd closed the connection unexpectedly and I
would expect media errors to be logged.
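
Because the daemons talk over TCP even on one machine, a quick reachability probe of the daemon ports can rule out the simplest failure (nothing listening, as in the "Connection refused" line in the syslog). The following Python sketch is a generic TCP check, not a Bacula client; the function name tcp_port_open is hypothetical and the 3 second timeout is an arbitrary choice.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or name resolution failed
        return False
```

For the setup in this thread one would call tcp_port_open("kilchis", 9103) for the Storage daemon and tcp_port_open("kilchis", 9101) for the Director. A True result only shows that something accepts connections on the port; it cannot explain a reset in the middle of a long-running job.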

Your syslog did include some I/O errors.  Are they caused by something else?

Do you have the complete job log (from the Bacula log, not the syslog)?

__Martin


> On Wed, 13 Sep 2017 09:35:07 -0700, Jerry Lowry said:
> 
> Kern,
> My Offsite Backup just failed again on the same drive, different disk. It
> failed with the same bsock error.  If the backup is working on the same
> system using the copy function, how far out of the network stack does it
> go.  My thinking is it does not get out of the application layer.  Is this
> right?  Why would I get a bsock error?
> 
> I have taken a look at the smart data for the disk and they seem to be
> running okay. I am getting some sector relocation errors, would that cause
> the bsock error during a remap?  This procedure has been running flawlessly
> for many years ( except for human error ).  I am wondering if I should
> delete the present disk files and let bacula recreate new ones.
> 
> thanks for your help!
> 
> jerry
> 
> 
> On Wed, Sep 6, 2017 at 11:26 PM, Kern Sibbald  wrote:
> 
> > Hello,
> >
> > If the job is marked as Incomplete in the catalog ("I" I think), then you
> > can simply restart it and it should pickup where it left off.  If not you
> > must run it again from the beginning.
> >
> > If you are switching devices when one is full during a Job, it is unlikely
> > you can restore that job when it terminates. I recommend carefully testing
> > restores on your system.
> >
> > Best regards,
> >
> > Kern
> >
> > On 09/06/2017 05:38 PM, Jerry Lowry wrote:
> >
> > List,
> > I am running Bacula 9.0.3, MariaDB 10.2.8 on CentOS 6.9.  I got notice
> > last night that my Offsite backup failed due to a bsock error.  My offsite
> > drives are attached to an ATTO raid card which gives me hot swap
> > capability. This configuration works great as it allows me to hot swap a
> > drive when it fills up with a new drive to continue with.  The problem is
> > included below. The backup that I was doing is to the OffsiteMid drive
> > which is mounted as /dev/sde. Is there a way to restart this backup job or
> > am I left with an incomplete backup going forward.
> >
> > thanks for your help,
> >
> > jerry
> >
> >
> > Sep  5 08:46:01 kilchis bat[4339]: bsock.c:147 Unable to connect to Director daemon on kilchis:9101. ERR=Connection refused
> > Sep  5 10:37:20 kilchis attocfgd: [CRIT] [ExpressSAS R608,50:01:08:60:00:57:3d:c0] [FW] RAID Group state now Offline: OffsiteTop
> > Sep  5 10:39:06 kilchis kernel: scsi 5:0:1:0: Direct-Access ATTO OffsiteTop00 0001 PQ: 0 ANSI: 5
> > Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: Attached scsi generic sg6 type 0
> > Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> > Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] Write Protect is off
> > Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> > Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> > Sep  5 10:39:06 kilchis kernel: sdd: unknown partition table
> > Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> > Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] Attached SCSI disk
> > Sep  5 10:39:35 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> > Sep  5 10:39:35 kilchis kernel: sdd:
> > Sep  5 10:44:54 kilchis kernel: EXT4-fs (sdd): mounted filesystem with ordered data mode. Opts:
> > Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> > Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> > Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> > Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> > Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> > Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> > Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> > Sep  5 13:45:48 kilchis attocfgd: [CRIT] [ExpressSAS R608,50:01:08:60:00:57:3d:c0] [FW] RAID Group state now Offline: OffsiteMid
> > Sep  5 13:45:53 kilchis attocfgd: [CRIT] [ExpressSAS R608,50:01:08:60:00:57:3d:c0] [FW] RAID Group state now Offline: OffsiteTop
> > Sep  5 13:47:52 kilch

Re: [Bacula-users] Incomplete backup - due to bsock error

2017-09-13 Thread Jerry Lowry
No, the only thing that shows in the messages file is that I changed the
disk 3 times as they filled up.

jerry

On Wed, Sep 13, 2017 at 10:51 AM, Josip Deanovic  wrote:

> On Wednesday 2017-09-13 09:35:07 Jerry Lowry wrote:
> > Kern,
> > My Offsite Backup just failed again on the same drive, different disk.
> > It failed with the same bsock error.  If the backup is working on the
> > same system using the copy function, how far out of the network stack
> > does it go.  My thinking is it does not get out of the application
> > layer.  Is this right?  Why would I get a bsock error?
> >
> > I have taken a look at the smart data for the disk and they seem to be
> > running okay. I am getting some sector relocation errors, would that
> > cause the bsock error during a remap?  This procedure has been running
> > flawlessly for many years ( except for human error ).  I am wondering
> > if I should delete the present disk files and let bacula recreate new
> > ones.
> >
> > thanks for your help!
>
>
> Did you get any disk/file system related error messages in the dmesg
> output?
>
> The same question goes for the system logs (usually /var/log/messages).
>
> --
> Josip Deanovic
>
> 
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> ___
> Bacula-users mailing list
> Bacula-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-users
>


Re: [Bacula-users] Incomplete backup - due to bsock error

2017-09-13 Thread Josip Deanovic
On Wednesday 2017-09-13 09:35:07 Jerry Lowry wrote:
> Kern,
> My Offsite Backup just failed again on the same drive, different disk.
> It failed with the same bsock error.  If the backup is working on the
> same system using the copy function, how far out of the network stack
> does it go?  My thinking is it does not get out of the application
> layer.  Is this right?  Why would I get a bsock error?
>
> I have taken a look at the SMART data for the disk, and it seems to be
> running okay. I am getting some sector relocation errors; would that
> cause the bsock error during a remap?  This procedure has been running
> flawlessly for many years (except for human error).  I am wondering
> if I should delete the present disk files and let Bacula recreate new
> ones.
> 
> thanks for your help!


Did you get any disk/file system related error messages in the dmesg
output?

The same question goes for the system logs (usually /var/log/messages).
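
For anyone wanting to run that check over a large log, here is a minimal
Python sketch. It assumes a plain-text syslog file with one entry per line;
the function name and the pattern list are illustrative choices, not part of
Bacula or any system tool, and the patterns are not exhaustive.

```python
import re

# Kernel messages that usually point at disk or file system trouble.
# Illustrative list only; extend it for other file systems or drivers.
DISK_ERROR_RE = re.compile(
    r"kernel: .*(I/O error|EXT4-fs error|Aborting journal|"
    r"Remounting filesystem read-only)",
    re.IGNORECASE,
)
# Block device names such as sdd or sde (assumes SCSI/SATA-style naming).
DEVICE_RE = re.compile(r"\b(sd[a-z]{1,2})\b")


def find_disk_errors(lines):
    """Return (device, line) pairs for kernel lines that look like disk errors."""
    hits = []
    for line in lines:
        line = line.rstrip("\n")
        if DISK_ERROR_RE.search(line):
            match = DEVICE_RE.search(line)
            hits.append((match.group(1) if match else "unknown", line))
    return hits
```

To use it, open /var/log/messages (or a saved copy of the dmesg output), pass
the file object to find_disk_errors(), and print the pairs it returns. Grouping
hits by device makes it easier to see whether the errors belong to the disk
that was just swapped out or to the one the job is writing to.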

-- 
Josip Deanovic

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Incomplete backup - due to bsock error

2017-09-13 Thread Jerry Lowry
Kern,
My Offsite Backup just failed again on the same drive, different disk. It
failed with the same bsock error.  If the backup is working on the same
system using the copy function, how far out of the network stack does it
go?  My thinking is it does not get out of the application layer.  Is this
right?  Why would I get a bsock error?

I have taken a look at the SMART data for the disk, and it seems to be
running okay. I am getting some sector relocation errors; would that cause
the bsock error during a remap?  This procedure has been running flawlessly
for many years (except for human error).  I am wondering if I should
delete the present disk files and let Bacula recreate new ones.

thanks for your help!

jerry


On Wed, Sep 6, 2017 at 11:26 PM, Kern Sibbald  wrote:

> Hello,
>
> If the job is marked as Incomplete in the catalog ("I" I think), then you
> can simply restart it and it should pick up where it left off.  If not, you
> must run it again from the beginning.
>
> If you are switching devices when one is full during a Job, it is unlikely
> you can restore that job when it terminates. I recommend carefully testing
> restores on your system.
>
> Best regards,
>
> Kern
>
> On 09/06/2017 05:38 PM, Jerry Lowry wrote:
>
> List,
> I am running Bacula 9.0.3 and MariaDB 10.2.8 on CentOS 6.9.  I got notice
> last night that my Offsite backup failed due to a bsock error.  My offsite
> drives are attached to an ATTO RAID card, which gives me hot-swap
> capability. This configuration works great, as it lets me hot swap a full
> drive for a new one and continue.  The relevant log is
> included below. The backup that I was doing is to the OffsiteMid drive,
> which is mounted as /dev/sde. Is there a way to restart this backup job, or
> am I left with an incomplete backup going forward?
>
> thanks for your help,
>
> jerry
>
>
> Sep  5 08:46:01 kilchis bat[4339]: bsock.c:147 Unable to connect to Director daemon on kilchis:9101. ERR=Connection refused
> Sep  5 10:37:20 kilchis attocfgd: [CRIT] [ExpressSAS R608,50:01:08:60:00:57:3d:c0] [FW] RAID Group state now Offline: OffsiteTop
> Sep  5 10:39:06 kilchis kernel: scsi 5:0:1:0: Direct-Access ATTO OffsiteTop00 0001 PQ: 0 ANSI: 5
> Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: Attached scsi generic sg6 type 0
> Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] Write Protect is off
> Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> Sep  5 10:39:06 kilchis kernel: sdd: unknown partition table
> Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] Attached SCSI disk
> Sep  5 10:39:35 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> Sep  5 10:39:35 kilchis kernel: sdd:
> Sep  5 10:44:54 kilchis kernel: EXT4-fs (sdd): mounted filesystem with ordered data mode. Opts:
> Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> Sep  5 13:45:48 kilchis attocfgd: [CRIT] [ExpressSAS R608,50:01:08:60:00:57:3d:c0] [FW] RAID Group state now Offline: OffsiteMid
> Sep  5 13:45:53 kilchis attocfgd: [CRIT] [ExpressSAS R608,50:01:08:60:00:57:3d:c0] [FW] RAID Group state now Offline: OffsiteTop
> Sep  5 13:47:52 kilchis kernel: scsi 5:0:1:0: Direct-Access ATTO OffsiteMid00 0001 PQ: 0 ANSI: 5
> Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: Attached scsi generic sg6 type 0
> Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] Write Protect is off
> Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> Sep  5 13:47:52 kilchis kernel: sde: unknown p

Re: [Bacula-users] Incomplete backup - due to bsock error

2017-09-06 Thread Kern Sibbald

  
  
Hello,

If the job is marked as Incomplete in the catalog ("I" I think), then you
can simply restart it and it should pick up where it left off.  If not, you
must run it again from the beginning.

If you are switching devices when one is full during a Job, it is unlikely
you can restore that job when it terminates. I recommend carefully testing
restores on your system.

Best regards,
Kern
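
The rule above can be sketched in a few lines of Python. This assumes the
status letter has already been read from the JobStatus column of the catalog's
Job table; the helper names and the partial status table are illustrative and
are not a Bacula API.

```python
# Query one might run against the catalog to list Incomplete jobs.
# Job, JobId, Name and JobStatus are catalog names; run it with whatever
# client your catalog database provides.
INCOMPLETE_JOBS_SQL = "SELECT JobId, Name, JobStatus FROM Job WHERE JobStatus = 'I';"

# Partial list of JobStatus codes for jobs that did not finish cleanly.
JOB_STATUS = {
    "E": "terminated in error",
    "f": "fatal error",
    "A": "canceled by user",
    "I": "incomplete",
}


def can_resume(job_status):
    """True only for an Incomplete job, the one case that resumes mid-job."""
    return job_status == "I"


def describe(job_id, job_status):
    """Summarize what to do with an unfinished job, following the rule above."""
    meaning = JOB_STATUS.get(job_status, "unknown status")
    if can_resume(job_status):
        action = "restart it; it should resume where it stopped"
    else:
        action = "run it again from the beginning"
    return f"JobId {job_id}: {meaning} -> {action}"
```

For example, describe(1234, "I") reports a job that can be restarted, while
describe(1234, "f") reports one that must be rerun in full (1234 is a made-up
JobId). Either way, the restore test recommended above still applies.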


On 09/06/2017 05:38 PM, Jerry Lowry wrote:

> List,
> I am running Bacula 9.0.3 and MariaDB 10.2.8 on CentOS 6.9.  I got notice
> last night that my Offsite backup failed due to a bsock error.  My offsite
> drives are attached to an ATTO RAID card, which gives me hot-swap
> capability. This configuration works great, as it lets me hot swap a full
> drive for a new one and continue.  The relevant log is
> included below. The backup that I was doing is to the OffsiteMid drive,
> which is mounted as /dev/sde. Is there a way to restart this backup job, or
> am I left with an incomplete backup going forward?
>
> thanks for your help,
>
> jerry
>
> Sep  5 08:46:01 kilchis bat[4339]: bsock.c:147 Unable to connect to Director daemon on kilchis:9101. ERR=Connection refused
> Sep  5 10:37:20 kilchis attocfgd: [CRIT] [ExpressSAS R608,50:01:08:60:00:57:3d:c0] [FW] RAID Group state now Offline: OffsiteTop
> Sep  5 10:39:06 kilchis kernel: scsi 5:0:1:0: Direct-Access ATTO OffsiteTop00 0001 PQ: 0 ANSI: 5
> Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: Attached scsi generic sg6 type 0
> Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] Write Protect is off
> Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> Sep  5 10:39:06 kilchis kernel: sdd: unknown partition table
> Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] Attached SCSI disk
> Sep  5 10:39:35 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> Sep  5 10:39:35 kilchis kernel: sdd:
> Sep  5 10:44:54 kilchis kernel: EXT4-fs (sdd): mounted filesystem with ordered data mode. Opts:
> Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> Sep  5 13:45:48 kilchis attocfgd: [CRIT] [ExpressSAS R608,50:01:08:60:00:57:3d:c0] [FW] RAID Group state now Offline: OffsiteMid
> Sep  5 13:45:53 kilchis attocfgd: [CRIT] [ExpressSAS R608,50:01:08:60:00:57:3d:c0] [FW] RAID Group state now Offline: OffsiteTop
> Sep  5 13:47:52 kilchis kernel: scsi 5:0:1:0: Direct-Access ATTO OffsiteMid00 0001 PQ: 0 ANSI: 5
> Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: Attached scsi generic sg6 type 0
> Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] Write Protect is off
> Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] 488366336 4096-by

[Bacula-users] Incomplete backup - due to bsock error

2017-09-06 Thread Jerry Lowry
List,
I am running Bacula 9.0.3 and MariaDB 10.2.8 on CentOS 6.9.  I got notice
last night that my Offsite backup failed due to a bsock error.  My offsite
drives are attached to an ATTO RAID card, which gives me hot-swap
capability. This configuration works great, as it lets me hot swap a full
drive for a new one and continue.  The relevant log is
included below. The backup that I was doing is to the OffsiteMid drive,
which is mounted as /dev/sde. Is there a way to restart this backup job, or
am I left with an incomplete backup going forward?

thanks for your help,

jerry


Sep  5 08:46:01 kilchis bat[4339]: bsock.c:147 Unable to connect to Director daemon on kilchis:9101. ERR=Connection refused
Sep  5 10:37:20 kilchis attocfgd: [CRIT] [ExpressSAS R608,50:01:08:60:00:57:3d:c0] [FW] RAID Group state now Offline: OffsiteTop
Sep  5 10:39:06 kilchis kernel: scsi 5:0:1:0: Direct-Access ATTO OffsiteTop00 0001 PQ: 0 ANSI: 5
Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: Attached scsi generic sg6 type 0
Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] Write Protect is off
Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
Sep  5 10:39:06 kilchis kernel: sdd: unknown partition table
Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] Attached SCSI disk
Sep  5 10:39:35 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
Sep  5 10:39:35 kilchis kernel: sdd:
Sep  5 10:44:54 kilchis kernel: EXT4-fs (sdd): mounted filesystem with ordered data mode. Opts:
Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
Sep  5 13:45:48 kilchis attocfgd: [CRIT] [ExpressSAS R608,50:01:08:60:00:57:3d:c0] [FW] RAID Group state now Offline: OffsiteMid
Sep  5 13:45:53 kilchis attocfgd: [CRIT] [ExpressSAS R608,50:01:08:60:00:57:3d:c0] [FW] RAID Group state now Offline: OffsiteTop
Sep  5 13:47:52 kilchis kernel: scsi 5:0:1:0: Direct-Access ATTO OffsiteMid00 0001 PQ: 0 ANSI: 5
Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: Attached scsi generic sg6 type 0
Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] Write Protect is off
Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
Sep  5 13:47:52 kilchis kernel: sde: unknown partition table
Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] Attached SCSI disk
Sep  5 13:48:01 kilchis kernel: EXT4-fs error (device sdd): __ext4_get_inode_loc: unable to read inode block - inode=2, block=1057
Sep  5 13:48:01 kilchis kernel: Buffer I/O error on device sdd, logical block 0
Sep  5 13:48:01 kilchis kernel: lost page write due to I/O error on sdd
Sep  5 13:48:01 kilchis kernel: EXT4-fs error (device sdd) in ext4_reserve_inode_write: IO failure
Sep  5 13:48:01 kilchis kernel: EXT4-fs (sdd): previous I/O error to superblock detected
Sep  5 13:48:01 kilchis kernel: Buffer I/O error on device sdd, logical block 0
Sep  5 13:48:01 kilchis kernel: lost page write due to I/O error on sdd
Sep  5 13:48:06 kilchis kernel: Aborting journal on device sdd-8.
Sep  5 13:48:06 kilchis kernel: Buffer I/O error on device sdd, logical block 243826688
Sep  5 13:48:06 kilchis kernel: lost page write due to I/O error on sdd
Sep  5 13:48:06 kilchis kernel: JBD2: I/O error detected when updating journal superblock for sdd-8.
Sep  5 13:48:08 kilchis kernel: EXT4-fs error (device sdd): ext4_put_super: Couldn't clean up the journal
Sep  5 13:48:08 kilchis kernel: EXT4-fs (sdd): Remounting filesystem read-only
Sep  5 13:48:44 kilchis kernel: sd 5:0:1:0: [sde] 488366336 4096-byte logical blocks: (2.00 T