Re: [Bacula-users] Packet size too big (NOT a version mismatch)

2022-01-04 Thread Stephen Thompson



Thanks Bill, you nailed it.

04-Jan-2022 16:13:52 FD: backup.c:1356-884680 
fname=/Users/USER/Library/Containers/com.apple.Safari.CacheDeleteExtension 
snap=/Users/USER/Library/Containers/com.apple.Safari.CacheDeleteExtension link=/Users/USER/Library/Containers/com.apple.Safari.CacheDeleteExtension/



That's the last file before the error is thrown and the job craps out.

I wonder if that's the only file; I will try to exclude it and see how far the 
job can go (a FileSet exclusion sketch follows).
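
For reference, an untested sketch of what that exclusion might look like in 
bacula-dir.conf (the FileSet name and the /Users include are placeholders; only 
the Exclude clause matters here):

FileSet {
  Name = "MacUsers"
  Include {
    Options { signature = MD5 }
    File = /Users
  }
  Exclude {
    File = /Users/USER/Library/Containers/com.apple.Safari.CacheDeleteExtension
  }
}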


Also, note that a debug level of 150 is far more than needed to troubleshoot 
this; I canceled the attempt after the trace file reached 60G.  level=10 was 
enough to log which files were being backed up as well as the error that 
terminated the job.
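
For reference, the lighter-weight bconsole session would look something like 
this (the client name is a placeholder):

* setdebug level=10 options=tc trace=1 client=mac-client-fd
(run the job, then)
* setdebug level=0 trace=0 client=mac-client-fd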


Stephen



On 1/4/22 12:03 PM, Bill Arlofski via Bacula-users wrote:

On 1/4/22 12:26, Stephen Thompson wrote:


Yes, backing up a single file on my problem hosts does succeed.

H...

Stephen


Hello Stephen,

This issue looked familiar to me, so I checked internally and I think I found 
something.

I am pretty sure that this is an issue due to the larger possible size of 
extended attributes that Big Sur uses.

From what I can gather, this has been addressed and fixed in Bacula 
Enterprise, and the fix will appear in the next Bacula Community release. 
(No ETA that I am aware of yet, but I assume very soon.)

In the case I found, running the FD in debug mode, level=150 revealed there was 
an issue with one specific file:
8<
/Users//Library/Containers/com.apple.Safari.CacheDeleteExtension
8<
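
If you want to gauge whether oversized extended attributes really are the 
trigger, macOS can report them directly; for example (path per the trace, with 
USER as a placeholder):

ls -ld@ /Users/USER/Library/Containers/com.apple.Safari.CacheDeleteExtension
xattr -l /Users/USER/Library/Containers/com.apple.Safari.CacheDeleteExtension

ls -ld@ lists each attribute's size on the entry itself; xattr -l dumps their 
names and values.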

The temporary workaround at the time (Sept 2021) was to omit this file (or 
whichever file your system is working on when the
job fails) from the backups.

No idea if this means much, but there was also a mention made: "this seems to be 
related to Time Machine"


Setting the FD in debug mode:

* setdebug level=150 options=tc trace=1 client=

Then, run the backup until it fails, and stop debugging:

* setdebug level=0 trace=0 client=

In /opt/bacula/working on the FD (or wherever "WorkingDirectory" is set to), 
there will be a *.trace file. You will be
looking for the file mentioned before the error:
8<
bxattr.c:310-69825 Network send error to SD. ERR=Broken pipe
8<
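
A quick way to find that spot in a large trace file might be (file name will 
vary):

grep -n "Network send error to SD" /opt/bacula/working/*.trace

and then look at the fname= lines just above the match for the file that was 
in flight.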


Hope this helps.
Bill

--
Bill Arlofski
w...@protonmail.com



___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


--
Stephen Thompson   Berkeley Seismology Lab
stephen.thomp...@berkeley.edu  307 McCone Hall
Office: 510.664.9177   University of California
Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760


___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Packet size too big (NOT a version mismatch)

2022-01-04 Thread Stephen Thompson



Thanks.
I have large file support off, though I am not sure that's intentional. 
I will double check that.



On 1/4/22 11:55 AM, Graham Sparks wrote:

I'm afraid I don't enable encryption in my backup jobs (I know I should) so I 
don't know if that causes an issue.  I'll have a quick look some time to see 
what happens when I enable encryption.

I think I've reached my limit here, but it might be worth checking the following file to 
make sure all the compilation options took successfully (thinking aloud here, but 
"Large File Support" caught my attention):

$BHOME/bin/bacula_config
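
Assuming that file is the configure summary it appears to be, a quick check 
might be:

grep -i 'large file' $BHOME/bin/bacula_config

which should show whether Large File Support ended up enabled at build time.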

Thanks.


--
Stephen Thompson   Berkeley Seismology Lab
stephen.thomp...@berkeley.edu  307 McCone Hall
Office: 510.664.9177   University of California
Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760


___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Packet size too big (NOT a version mismatch)

2022-01-04 Thread Stephen Thompson



However, even just backing up /Users results in...

04-Jan 11:31 SD JobId 88: Fatal error: bsock.c:530 Packet 
size=1387166 too big from "client:1.2.3.4:9103". Maximum permitted 
1000000. Terminating connection.



Stephen




On 1/4/22 11:26 AM, Stephen Thompson wrote:




Yes, backing up a single file on my problem hosts does succeed.

H...

Stephen



On 1/4/22 11:23 AM, Stephen Thompson wrote:



That's a good test, which I apparently have not tried.  I will do so.

thanks,
Stephen


On 1/4/22 11:20 AM, Martin Simmons wrote:

Is this happening for all backups?

What happens if you run a backup with a minimal fileset that lists just one
small file?

__Martin
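
For what it's worth, a minimal test FileSet along those lines might look like 
this (name and path are placeholders):

FileSet {
  Name = "OneSmallFile"
  Include {
    Options { signature = MD5 }
    File = /Users/USER/Documents/test.txt
  }
}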



On Tue, 4 Jan 2022 08:13:46 -0800, Stephen Thompson said:


I am still seeing the same issue on Monterey as on Big Sur with 11.0.5
compiled from source and CoreFoundation linked in.

04-Jan 07:56 SD JobId 88: Fatal error: bsock.c:530 Packet size=1387165
too big from "client:1.2.3.4:9103". Maximum permitted 1000000.
Terminating connection.



Stephen

On Tue, Jan 4, 2022 at 7:02 AM Stephen Thompson <
stephen.thomp...@berkeley.edu> wrote:



Graham,

Thanks for presenting Monterey as a possibility!  I am seeing the same
issue under Monterey as I have under Big Sur, but to know someone else
does not see it means that it's possible.  I should double check that I am
using a freshly compiled client on Monterey and not just the one that I
compiled on Big Sur.

I am backing up Macs with bacula, but not really for system recovery, more
to backup user files/documents that they may not be backing up themselves.
I do note a number of Mac system files that refuse to be backed up, but
again for my purposes, I do not care too much.  It would be nice to be able
to BMR a Mac, but not a requirement where I am at, being operationally a
Linux shop.

Stephen




On Tue, Jan 4, 2022 at 6:20 AM Graham Sparks  
wrote:



Hi David,

I use Time Machine (for the System disk) as well as Bacula on my Mac, as
I'd still need the Time Machine backup to do a bare-metal restore (with
Apps). I use Bacula to back up this and an external data drive.

Rather than purchasing a separate "Time Capsule", I set up Samba on a
Linux VM to expose an SMB share that the Mac sees as a Time Capsule drive (
https://wiki.samba.org/index.php/Configure_Samba_to_Work_Better_with_Mac_OS_X
).

I had one problem with Time Machine a few months ago, where it stopped
backing up data and insisted on starting the backup 'chain' from scratch
again.  I was a little miffed.

I'm afraid I can only confirm that the Bacula v9.6 and v11 file daemons
worked for me under macOS Catalina and Monterey (I skipped Big Sur.  Not
for good reason---just laziness).  Both v9 and v11 clients were compiled
from source (setting the linker flags to "-framework CoreFoundation" as
already suggested).

I've personally not run into problems with System Integrity Protection,
although I do give the bacula-fd executable "Full Disk" permissions.

Thanks.
--
Graham Sparks



From: David Brodbeck 
Sent: 03 January 2022 18:36
Cc: bacula-users@lists.sourceforge.net <
bacula-users@lists.sourceforge.net>
Subject: Re: [Bacula-users] Packet size too big (NOT a version 
mismatch)


I'm curious if anyone has moved away from Bacula on macOS and what
alternatives they're using. Even before this, it was getting more and more
awkward to set up -- bacula really doesn't play well with SIP, for example,
and running "csrutil disable" on every system is not a security best
practice.

On Wed, Dec 8, 2021 at 4:46 PM Stephen Thompson <
stephen.thomp...@berkeley.edu> wrote:


Disappointing...  I am having the same issue on BigSur with the 11.0.5
release as I had with 9x.

08-Dec 15:42 SD JobId 878266: Fatal error: bsock.c:530 Packet
size=1387166 too big from "client:1.2.3.4:8103". Maximum permitted
1000000. Terminating connection.


Setting 'Maximum Network Buffer Size' does not appear to solve the issue.
Are there users out there successfully running a bacula client on Big
Sur??
Stephen



On Wed, Dec 1, 2021 at 3:25 PM Stephen Thompson <
stephen.thomp...@berkeley.edu> wrote:

Not sure if this is correct, but I've been able to at least compile
bacula client 11.0.5 on Big Sur by doing this before the configure step:

LDFLAGS='-framework CoreFoundation'

We'll see next up whether it runs and whether it exhibits the issue seen
under Big Sur for the 9x client.

Stephen

On Tue, Nov 23, 2021 at 7:32 AM Stephen Thompson <
stephen.thomp...@berkeley.edu> wrote:

Josh,

Thanks for the tip.  That did not appear to be the cause of this issue,
though perhaps it will fix a yet-to-be-found issue that I would have run
into after I get past this compilation error.

Stephen



On Mon, Nov 22, 2021 at 9:22 AM Josh Fisher  
wrote:


On 11/22/21 10:46, Stephen Thompson wrote:

All,

I too was having the issue with running a 9x client on Big 

Re: [Bacula-users] Packet size too big (NOT a version mismatch)

2022-01-04 Thread Stephen Thompson


Graham,

Thanks.

I am confident that it's not a networking issue (at least not one external 
to the Macs).  The new problem only shows on hosts that have been 
updated to Big Sur or Monterey (with or without a rebuilt client, both 9x 
and 11).  High Sierra and earlier hosts never yield the 'too big... 
Maximum permitted 1000000' error, but Big Sur/Monterey always do.


I use Xcode along with Homebrew OpenSSL 1.1.

To further describe: the Big Sur/Monterey host jobs do partially 
complete, successfully sending many GBs of data and files, including a 
few warnings about unreadable system files, but ultimately the jobs crap 
out with the same error.


My build options...

BHOME=/Users/bacula
EMAIL=bacula@DOMAINNAME

env CFLAGS='-g -O2' LDFLAGS='-framework CoreFoundation' \
./configure \
--prefix=$BHOME \
--sbindir=$BHOME/bin \
--sysconfdir=$BHOME/conf \
--with-working-dir=$BHOME/work \
--with-archivedir=$BHOME/archive \
--with-bsrdir=$BHOME/log \
--with-logdir=$BHOME/log \
--with-pid-dir=/var/run \
--with-subsys-dir=/var/run \
--with-basename=SERVER \
--with-hostname=SERVER.DOMAINNAME \
--with-dump-email=$EMAIL \
--with-openssl=/usr/local/opt/openssl\@1.1 \
--enable-smartalloc \
--disable-readline \
--enable-conio \
--enable-client-only \
| tee configure.out


thanks again,
Stephen



On 1/4/22 10:54 AM, Graham Sparks wrote:

Hi Stephen,

I've had a quick read of the archive (I'm late to the mailing list party) and 
see you've tried lots, so I'll try to say something constructive.

I tried to recreate the packet size error, crudely, by directing the Bacula server to a 
web page instead of the client FD (incidentally, this recreates it well).  Therefore, I 
think it's worth making sure the server and client are communicating without 
interruption, just in case something else is being returned (perhaps a transparent 
proxy/firewall/web filter "blocked" message, or similar).
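
If anyone wants to reproduce that deliberately, a throwaway Client resource 
pointed at any web server should do it (names and address are placeholders; 
the HTTP response bytes get read as a Bacula packet header, hence the nonsense 
size):

Client {
  Name = repro-client
  Address = www.example.com
  FDPort = 80
  Password = "irrelevant"
  Catalog = MyCatalog
}

A "status client=repro-client" should then fail with a similar 'Packet size 
too big' error.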

Maybe try:

1.  "status client=" in bconsole to check Bacula can communicate 
with the client.
2.  If not, issue "lsof -i -P | grep 9102" at the terminal on the client, to 
make sure 'bacula-fd' is running (on the default port).
3.  If 'bacula-fd' is listed as running, stop the Bacula File Daemon on the client to free port 9102, 
then run "nc -l 9102" to open a listener on the same port the file daemon uses, and send some 
text from the Bacula server using "nc  9102".  If TCP communications are 
good, you should see exactly the text you type on the server appear on the Mac's terminal after pressing 
return (a sample session is sketched below).
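
Roughly, that session would look like this (the hostname is a placeholder, 
with bacula-fd stopped):

mac$ nc -l 9102
server$ nc mac-client.example.com 9102
hello                 <- typed on the server
hello                 <- appears in the Mac's terminal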

Sorry in advance if this is stuff you've already tried.

Just for completeness, one of the few things I have done to the Mac in question 
is install Xcode (I think it replaces the shipped installation of 'make', so 
there's a chance it affects compilation).

I'm not a big Mac user, I'm afraid.  It seems that just owning a Mac automatically makes 
one the "Mac guy".

Thanks.


--
Stephen Thompson   Berkeley Seismology Lab
stephen.thomp...@berkeley.edu  307 McCone Hall
Office: 510.664.9177   University of California
Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760


___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Packet size too big (NOT a version mismatch)

2022-01-04 Thread Stephen Thompson




Yes, backing up a single file on my problem hosts does succeed.

H...

Stephen



On 1/4/22 11:23 AM, Stephen Thompson wrote:



That's a good test, which I apparently have not tried.  I will do so.

thanks,
Stephen


On 1/4/22 11:20 AM, Martin Simmons wrote:

Is this happening for all backups?

What happens if you run a backup with a minimal fileset that lists just one
small file?

__Martin



On Tue, 4 Jan 2022 08:13:46 -0800, Stephen Thompson said:


I am still seeing the same issue on Monterey as on Big Sur with 11.0.5
compiled from source and CoreFoundation linked in.

04-Jan 07:56 SD JobId 88: Fatal error: bsock.c:530 Packet size=1387165
too big from "client:1.2.3.4:9103". Maximum permitted 1000000.
Terminating connection.



Stephen

On Tue, Jan 4, 2022 at 7:02 AM Stephen Thompson <
stephen.thomp...@berkeley.edu> wrote:



Graham,

Thanks for presenting Monterey as a possibility!  I am seeing the same
issue under Monterey as I have under Big Sur, but to know someone else
does not see it means that it's possible.  I should double check that I am
using a freshly compiled client on Monterey and not just the one that I
compiled on Big Sur.

I am backing up Macs with bacula, but not really for system recovery, more
to backup user files/documents that they may not be backing up themselves.
I do note a number of Mac system files that refuse to be backed up, but
again for my purposes, I do not care too much.  It would be nice to be able
to BMR a Mac, but not a requirement where I am at, being operationally a
Linux shop.

Stephen




On Tue, Jan 4, 2022 at 6:20 AM Graham Sparks  
wrote:



Hi David,

I use Time Machine (for the System disk) as well as Bacula on my Mac, as
I'd still need the Time Machine backup to do a bare-metal restore (with
Apps). I use Bacula to back up this and an external data drive.

Rather than purchasing a separate "Time Capsule", I set up Samba on a
Linux VM to expose an SMB share that the Mac sees as a Time Capsule drive (
https://wiki.samba.org/index.php/Configure_Samba_to_Work_Better_with_Mac_OS_X
).

I had one problem with Time Machine a few months ago, where it stopped
backing up data and insisted on starting the backup 'chain' from scratch
again.  I was a little miffed.

I'm afraid I can only confirm that the Bacula v9.6 and v11 file daemons
worked for me under macOS Catalina and Monterey (I skipped Big Sur.  Not
for good reason---just laziness).  Both v9 and v11 clients were compiled
from source (setting the linker flags to "-framework CoreFoundation" as
already suggested).

I've personally not run into problems with System Integrity Protection,
although I do give the bacula-fd executable "Full Disk" permissions.

Thanks.
--
Graham Sparks



From: David Brodbeck 
Sent: 03 January 2022 18:36
Cc: bacula-users@lists.sourceforge.net <
bacula-users@lists.sourceforge.net>
Subject: Re: [Bacula-users] Packet size too big (NOT a version 
mismatch)


I'm curious if anyone has moved away from Bacula on macOS and what
alternatives they're using. Even before this, it was getting more and more
awkward to set up -- bacula really doesn't play well with SIP, for example,
and running "csrutil disable" on every system is not a security best
practice.

On Wed, Dec 8, 2021 at 4:46 PM Stephen Thompson <
stephen.thomp...@berkeley.edu> wrote:


Disappointing...  I am having the same issue on BigSur with the 11.0.5
release as I had with 9x.

08-Dec 15:42 SD JobId 878266: Fatal error: bsock.c:530 Packet
size=1387166 too big from "client:1.2.3.4:8103". Maximum permitted
1000000. Terminating connection.


Setting 'Maximum Network Buffer Size' does not appear to solve the issue.
Are there users out there successfully running a bacula client on Big
Sur??
Stephen



On Wed, Dec 1, 2021 at 3:25 PM Stephen Thompson <
stephen.thomp...@berkeley.edu> wrote:

Not sure if this is correct, but I've been able to at least compile
bacula client 11.0.5 on Big Sur by doing this before the configure step:

LDFLAGS='-framework CoreFoundation'

We'll see next up whether it runs and whether it exhibits the issue seen
under Big Sur for the 9x client.

Stephen

On Tue, Nov 23, 2021 at 7:32 AM Stephen Thompson <
stephen.thomp...@berkeley.edu> wrote:

Josh,

Thanks for the tip.  That did not appear to be the cause of this issue,
though perhaps it will fix a yet-to-be-found issue that I would have run
into after I get past this compilation error.

Stephen



On Mon, Nov 22, 2021 at 9:22 AM Josh Fisher  
wrote:


On 11/22/21 10:46, Stephen Thompson wrote:

All,

I too was having the issue with running a 9x client on Big Sur.  I've
tried compiling 11.0.5 but have not found my way past:

This might be due to a libtool.m4 bug having to do with MacOS changing
the major Darwin version from 19.x to 20.x. There is a patch at
https://www.mail-archive.com/libtool-patches@gnu.org/msg07396.html


Linking bacula-fd ...
/U

Re: [Bacula-users] Packet size too big (NOT a version mismatch)

2022-01-04 Thread Stephen Thompson



That's a good test, which I apparently have not tried.  I will do so.

thanks,
Stephen


On 1/4/22 11:20 AM, Martin Simmons wrote:

Is this happening for all backups?

What happens if you run a backup with a minimal fileset that lists just one
small file?

__Martin



On Tue, 4 Jan 2022 08:13:46 -0800, Stephen Thompson said:


I am still seeing the same issue on Monterey as on Big Sur with 11.0.5
compiled from source and CoreFoundation linked in.

04-Jan 07:56 SD JobId 88: Fatal error: bsock.c:530 Packet size=1387165
too big from "client:1.2.3.4:9103". Maximum permitted 1000000. Terminating
connection.



Stephen

On Tue, Jan 4, 2022 at 7:02 AM Stephen Thompson <
stephen.thomp...@berkeley.edu> wrote:



Graham,

Thanks for presenting Monterey as a possibility!  I am seeing the same
issue under Monterey as I have under Big Sur, but to know someone else
does not see it means that it's possible.  I should double check that I am using a
freshly compiled client on Monterey and not just the one that I compiled on
Big Sur.

I am backing up Macs with bacula, but not really for system recovery, more
to backup user files/documents that they may not be backing up themselves.
I do note a number of Mac system files that refuse to be backed up, but
again for my purposes, I do not care too much.  It would be nice to be able
to BMR a Mac, but not a requirement where I am at, being operationally a
Linux shop.

Stephen




On Tue, Jan 4, 2022 at 6:20 AM Graham Sparks  wrote:


Hi David,

I use Time Machine (for the System disk) as well as Bacula on my Mac, as
I'd still need the Time Machine backup to do a bare-metal restore (with
Apps). I use Bacula to back up this and an external data drive.

Rather than purchasing a separate "Time Capsule", I set up Samba on a
Linux VM to expose an SMB share that the Mac sees as a Time Capsule drive (
https://wiki.samba.org/index.php/Configure_Samba_to_Work_Better_with_Mac_OS_X
).

I had one problem with Time Machine a few months ago, where it stopped
backing up data and insisted on starting the backup 'chain' from scratch
again.  I was a little miffed .

I'm afraid I can only confirm that the Bacula v9.6 and v11 file daemons
worked for me under macOS Catalina and Monterey (I skipped Big Sur.  Not
for good reason---just laziness).  Both v9 and v11 clients were compiled
from source (setting the linker flags to "-framework CoreFoundation" as
already suggested).

I've personally not run into problems with System Integrity Protection,
although I do give the bacula-fd executable "Full Disk" permissions.

Thanks.
--
Graham Sparks



From: David Brodbeck 
Sent: 03 January 2022 18:36
Cc: bacula-users@lists.sourceforge.net <
bacula-users@lists.sourceforge.net>
Subject: Re: [Bacula-users] Packet size too big (NOT a version mismatch)

I'm curious if anyone has moved away from Bacula on macOS and what
alternatives they're using. Even before this, it was getting more and more
awkward to set up -- bacula really doesn't play well with SIP, for example,
and running "csrutil disable" on every system is not a security best
practice.

On Wed, Dec 8, 2021 at 4:46 PM Stephen Thompson <
stephen.thomp...@berkeley.edu> wrote:


Disappointing...  I am having the same issue on BigSur with the 11.0.5
release as I had with 9x.

08-Dec 15:42 SD JobId 878266: Fatal error: bsock.c:530 Packet
size=1387166 too big from "client:1.2.3.4:8103". Maximum permitted
1000000. Terminating connection.


Setting 'Maximum Network Buffer Size' does not appear to solve the issue.
Are there users out there successfully running a bacula client on Big
Sur??
Stephen



On Wed, Dec 1, 2021 at 3:25 PM Stephen Thompson <
stephen.thomp...@berkeley.edu> wrote:

Not sure if this is correct, but I've been able to at least compile
bacula client 11.0.5 on Big Sur by doing this before the configure step:

LDFLAGS='-framework CoreFoundation'

We'll see next up whether it runs and whether it exhibits the issue seen
under Big Sur for the 9x client.

Stephen

On Tue, Nov 23, 2021 at 7:32 AM Stephen Thompson <
stephen.thomp...@berkeley.edu> wrote:

Josh,

Thanks for the tip.  That did not appear to be the cause of this issue,
though perhaps it will fix a yet-to-be-found issue that I would have run
into after I get past this compilation error.

Stephen



On Mon, Nov 22, 2021 at 9:22 AM Josh Fisher  wrote:

On 11/22/21 10:46, Stephen Thompson wrote:

All,

I too was having the issue with running a 9x client on Big Sur.  I've
tried compiling 11.0.5 but have not found my way past:

This might be due to a libtool.m4 bug having to do with MacOS changing
the major Darwin version from 19.x to 20.x. There is a patch at
https://www.mail-archive.com/libtool-patches@gnu.org/msg07396.html


Linking bacula-fd ...
/Users/bacula/src/bacula-11.0.5-CLIENT.MAC/libtool --silent --tag=CXX
--mode=link /usr/bin/g++   -L../lib -L../findlib -o bacula-fd filed.o
authenticate.o backup.o crypto.o win

Re: [Bacula-users] Packet size too big (NOT a version mismatch)

2022-01-04 Thread Stephen Thompson
I am still seeing the same issue on Monterey as on Big Sur with 11.0.5
compiled from source and CoreFoundation linked in.

04-Jan 07:56 SD JobId 88: Fatal error: bsock.c:530 Packet size=1387165
too big from "client:1.2.3.4:9103". Maximum permitted 1000000. Terminating
connection.



Stephen

On Tue, Jan 4, 2022 at 7:02 AM Stephen Thompson <
stephen.thomp...@berkeley.edu> wrote:

>
> Graham,
>
> Thanks for presenting Monterey as a possibility!  I am seeing the same
> issue under Monterey as I have under Big Sur, but to know someone else
> does not see it means that it's possible.  I should double check that I am using a
> freshly compiled client on Monterey and not just the one that I compiled on
> Big Sur.
>
> I am backing up Macs with bacula, but not really for system recovery, more
> to backup user files/documents that they may not be backing up themselves.
> I do note a number of Mac system files that refuse to be backed up, but
> again for my purposes, I do not care too much.  It would be nice to be able
> to BMR a Mac, but not a requirement where I am at, being operationally a
> Linux shop.
>
> Stephen
>
>
>
>
> On Tue, Jan 4, 2022 at 6:20 AM Graham Sparks  wrote:
>
>> Hi David,
>>
>> I use Time Machine (for the System disk) as well as Bacula on my Mac, as
>> I'd still need the Time Machine backup to do a bare-metal restore (with
>> Apps). I use Bacula to back up this and an external data drive.
>>
>> Rather than purchasing a separate "Time Capsule", I set up Samba on a
>> Linux VM to expose an SMB share that the Mac sees as a Time Capsule drive (
>> https://wiki.samba.org/index.php/Configure_Samba_to_Work_Better_with_Mac_OS_X
>> ).
>>
>> I had one problem with Time Machine a few months ago, where it stopped
>> backing up data and insisted on starting the backup 'chain' from scratch
>> again.  I was a little miffed .
>>
>> I'm afraid I can only confirm that the Bacula v9.6 and v11 file daemons
>> worked for me under macOS Catalina and Monterey (I skipped Big Sur.  Not
>> for good reason---just laziness).  Both v9 and v11 clients were compiled
>> from source (setting the linker flags to "-framework CoreFoundation" as
>> already suggested).
>>
>> I've personally not run into problems with System Integrity Protection,
>> although I do give the bacula-fd executable "Full Disk" permissions.
>>
>> Thanks.
>> --
>> Graham Sparks
>>
>>
>>
>> From: David Brodbeck 
>> Sent: 03 January 2022 18:36
>> Cc: bacula-users@lists.sourceforge.net <
>> bacula-users@lists.sourceforge.net>
>> Subject: Re: [Bacula-users] Packet size too big (NOT a version mismatch)
>>
>> I'm curious if anyone has moved away from Bacula on macOS and what
>> alternatives they're using. Even before this, it was getting more and more
>> awkward to set up -- bacula really doesn't play well with SIP, for example,
>> and running "csrutil disable" on every system is not a security best
>> practice.
>>
>> On Wed, Dec 8, 2021 at 4:46 PM Stephen Thompson <
>> stephen.thomp...@berkeley.edu> wrote:
>>
>>
>> Disappointing...  I am having the same issue on BigSur with the 11.0.5
>> release as I had with 9x.
>>
>> 08-Dec 15:42 SD JobId 878266: Fatal error: bsock.c:530 Packet
>> size=1387166 too big from "client:1.2.3.4:8103". Maximum permitted
>> 1000000. Terminating connection.
>>
>>
>> Setting 'Maximum Network Buffer Size' does not appear to solve the issue.
>> Are there users out there successfully running a bacula client on Big
>> Sur??
>> Stephen
>>
>>
>>
>> On Wed, Dec 1, 2021 at 3:25 PM Stephen Thompson <
>> stephen.thomp...@berkeley.edu> wrote:
>>
>> Not sure if this is correct, but I've been able to at least compile
>> bacula client 11.0.5 on Big Sur by doing this before the configure step:
>>
>> LDFLAGS='-framework CoreFoundation'
>>
>> We'll see next up whether it runs and whether it exhibits the issue seen
>> under Big Sur for the 9x client.
>>
>> Stephen
>>
>> On Tue, Nov 23, 2021 at 7:32 AM Stephen Thompson <
>> stephen.thomp...@berkeley.edu> wrote:
>>
>> Josh,
>>
>> Thanks for the tip.  That did not appear to be the cause of this issue,
>> though perhaps it will fix a yet-to-be-found issue that I would have run
>> into after I get past this compilation error.
>>
>> Stephen
>>
>>
>>
>> On Mon, Nov 22, 2021 at 9:22 AM Josh Fisher  wrote:
>>
>> On 11/22/21 10:46, Steph

Re: [Bacula-users] Packet size too big (NOT a version mismatch)

2022-01-04 Thread Stephen Thompson
Graham,

Thanks for presenting Monterey as a possibility!  I am seeing the same
issue under Monterey as I have under Big Sur, but to know someone else
does not see it means that it's possible.  I should double check that I am using a
freshly compiled client on Monterey and not just the one that I compiled on
Big Sur.

I am backing up Macs with bacula, but not really for system recovery, more
to backup user files/documents that they may not be backing up themselves.
I do note a number of Mac system files that refuse to be backed up, but
again for my purposes, I do not care too much.  It would be nice to be able
to BMR a Mac, but not a requirement where I am at, being operationally a
Linux shop.

Stephen




On Tue, Jan 4, 2022 at 6:20 AM Graham Sparks  wrote:

> Hi David,
>
> I use Time Machine (for the System disk) as well as Bacula on my Mac, as
> I'd still need the Time Machine backup to do a bare-metal restore (with
> Apps). I use Bacula to back up this and an external data drive.
>
> Rather than purchasing a separate "Time Capsule", I set up Samba on a
> Linux VM to expose an SMB share that the Mac sees as a Time Capsule drive (
> https://wiki.samba.org/index.php/Configure_Samba_to_Work_Better_with_Mac_OS_X
> ).
>
> I had one problem with Time Machine a few months ago, where it stopped
> backing up data and insisted on starting the backup 'chain' from scratch
> again.  I was a little miffed .
>
> I'm afraid I can only confirm that the Bacula v9.6 and v11 file daemons
> worked for me under macOS Catalina and Monterey (I skipped Big Sur.  Not
> for good reason---just laziness).  Both v9 and v11 clients were compiled
> from source (setting the linker flags to "-framework CoreFoundation" as
> already suggested).
>
> I've personally not run into problems with System Integrity Protection,
> although I do give the bacula-fd executable "Full Disk" permissions.
>
> Thanks.
> --
> Graham Sparks
>
>
>
> From: David Brodbeck 
> Sent: 03 January 2022 18:36
> Cc: bacula-users@lists.sourceforge.net  >
> Subject: Re: [Bacula-users] Packet size too big (NOT a version mismatch)
>
> I'm curious if anyone has moved away from Bacula on macOS and what
> alternatives they're using. Even before this, it was getting more and more
> awkward to set up -- bacula really doesn't play well with SIP, for example,
> and running "csrutil disable" on every system is not a security best
> practice.
>
> On Wed, Dec 8, 2021 at 4:46 PM Stephen Thompson <
> stephen.thomp...@berkeley.edu> wrote:
>
>
> Disappointing...  I am having the same issue on BigSur with the 11.0.5
> release as I had with 9x.
>
> 08-Dec 15:42 SD JobId 878266: Fatal error: bsock.c:530 Packet size=1387166
> too big from "client:1.2.3.4:8103". Maximum permitted 1000000.
> Terminating connection.
>
>
> Setting 'Maximum Network Buffer Size' does not appear to solve the issue.
> Are there users out there successfully running a bacula client on Big Sur??
> Stephen
>
>
>
> On Wed, Dec 1, 2021 at 3:25 PM Stephen Thompson <
> stephen.thomp...@berkeley.edu> wrote:
>
> Not sure if this is correct, but I've been able to at least compile bacula
> client 11.0.5 on Big Sur by doing this before the configure step:
>
> LDFLAGS='-framework CoreFoundation'
>
> We'll see next up whether it runs and whether it exhibits the issue seen
> under Big Sur for the 9x client.
>
> Stephen
>
> On Tue, Nov 23, 2021 at 7:32 AM Stephen Thompson <
> stephen.thomp...@berkeley.edu> wrote:
>
> Josh,
>
> Thanks for the tip.  That did not appear to be the cause of this issue,
> though perhaps it will fix a yet-to-be-found issue that I would have run
> into after I get past this compilation error.
>
> Stephen
>
>
>
> On Mon, Nov 22, 2021 at 9:22 AM Josh Fisher  wrote:
>
> On 11/22/21 10:46, Stephen Thompson wrote:
>
> All,
>
> I too was having the issue with running a 9x client on Big Sur.  I've
> tried compiling 11.0.5 but have not found my way past:
>
> This might be due to a libtool.m4 bug having to do with MacOS changing the
> major Darwin version from 19.x to 20.x. There is a patch at
> https://www.mail-archive.com/libtool-patches@gnu.org/msg07396.html
>
>
> Linking bacula-fd ...
> /Users/bacula/src/bacula-11.0.5-CLIENT.MAC/libtool --silent --tag=CXX
> --mode=link /usr/bin/g++   -L../lib -L../findlib -o bacula-fd filed.o
> authenticate.o backup.o crypto.o win_efs.o estimate.o fdcollect.o
> fd_plugins.o accurate.o bacgpfs.o filed_conf.o runres_conf.o heartbeat.o
> hello.o job.o fd_snapshot.o restore.o status.o verify.o verify_vol.o
> fdcallsdir.o suspend.o org_filed_dedup.o bacl.o bacl_osx.o bxattr.o
> bxattr_osx.o \
> -lz -lb

Re: [Bacula-users] Packet size too big (NOT a version mismatch)

2021-12-08 Thread Stephen Thompson
Disappointing...  I am having the same issue on BigSur with the 11.0.5
release as I had with 9x.

08-Dec 15:42 SD JobId 878266: Fatal error: bsock.c:530 Packet
size=1387166 too big from "client:1.2.3.4:8103". Maximum permitted
1000000. Terminating connection.



Setting 'Maximum Network Buffer Size' does not appear to solve the issue.
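
(For reference, that directive lives in the FileDaemon resource of 
bacula-fd.conf; the value below is illustrative only, and the same directive 
also exists on the SD side:

FileDaemon {
  Name = mac-client-fd
  Maximum Network Buffer Size = 65536
}
)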

Are there users out there successfully running a bacula client on Big Sur??

Stephen




On Wed, Dec 1, 2021 at 3:25 PM Stephen Thompson <
stephen.thomp...@berkeley.edu> wrote:

>
> Not sure if this is correct, but I've been able to at least compile bacula
> client 11.0.5 on Big Sur by doing this before the configure step:
>
> LDFLAGS='-framework CoreFoundation'
>
> We'll see next up whether it runs and whether it exhibits the issue seen
> under Big Sur for the 9x client.
>
> Stephen
>
> On Tue, Nov 23, 2021 at 7:32 AM Stephen Thompson <
> stephen.thomp...@berkeley.edu> wrote:
>
>>
>> Josh,
>>
>> Thanks for the tip.  That did not appear to be the cause of this issue,
>> though perhaps it will fix a yet-to-be-found issue that I would have run
>> into after I get past this compilation error.
>>
>> Stephen
>>
>>
>>
>> On Mon, Nov 22, 2021 at 9:22 AM Josh Fisher  wrote:
>>
>>>
>>> On 11/22/21 10:46, Stephen Thompson wrote:
>>>
>>>
>>> All,
>>>
>>> I too was having the issue with running a 9x client on Big Sur.  I've
>>> tried compiling 11.0.5 but have not found my way past:
>>>
>>>
>>> This might be due to a libtool.m4 bug having to do with MacOS changing
>>> the major Darwin version from 19.x to 20.x. There is a patch at
>>> https://www.mail-archive.com/libtool-patches@gnu.org/msg07396.html
>>>
>>>
>>>
>>> Linking bacula-fd ...
>>>
>>> /Users/bacula/src/bacula-11.0.5-CLIENT.MAC/libtool --silent --tag=CXX
>>> --mode=link /usr/bin/g++   -L../lib -L../findlib -o bacula-fd filed.o
>>> authenticate.o backup.o crypto.o win_efs.o estimate.o fdcollect.o
>>> fd_plugins.o accurate.o bacgpfs.o filed_conf.o runres_conf.o heartbeat.o
>>> hello.o job.o fd_snapshot.o restore.o status.o verify.o verify_vol.o
>>> fdcallsdir.o suspend.o org_filed_dedup.o bacl.o bacl_osx.o bxattr.o
>>> bxattr_osx.o \
>>>
>>> -lz -lbacfind -lbaccfg -lbac -lm -lpthread  \
>>>
>>> -L/usr/local/opt/openssl@1.1/lib -lssl -lcrypto -framework IOKit
>>>
>>> Undefined symbols for architecture x86_64:
>>>
>>>   "___CFConstantStringClassReference", referenced from:
>>>
>>>   CFString in suspend.o
>>>
>>>   CFString in suspend.o
>>>
>>> ld: symbol(s) not found for architecture x86_64
>>>
>>> clang: error: linker command failed with exit code 1 (use -v to see
>>> invocation)
>>>
>>> make[1]: *** [bacula-fd] Error 1
>>>
>>>
>>>
>>> Seems like this might have something to do with the expectation of headers
>>> being here:
>>>
>>> /System/Library/Frameworks/CoreFoundation.framework/Headers
>>>
>>> when they are here:
>>>
>>>
>>> /Library/Developer/CommandLineTools/SDKs/MacOSX11.0.sdk/System/Library/Frameworks/CoreFoundation.framework/Headers/
>>> but that may be a red herring.
>>>
>>> There also appears to be a 'clang' in two locations on OS X, /usr and
>>> xcode subdir.  Hmm
>>>
>>> Stephen
>>>
>>> On Tue, Nov 16, 2021 at 12:00 AM Eric Bollengier via Bacula-users <
>>> bacula-users@lists.sourceforge.net> wrote:
>>>
>>>> Hello,
>>>>
>>>> On 11/15/21 21:46, David Brodbeck wrote:
>>>> > To do that I'd have to upgrade the director and the storage first,
>>>> right?
>>>> > (Director can't be an earlier version than the FD, and the SD must
>>>> have the
>>>> > same version as the director.)
>>>>
>>>> In general yes, the code is designed to support Old FDs but can have
>>>> problems
>>>> with newer FDs. In your case it may work.
>>>>
>>>> At least, you can try a status client to see if the problem is solved
>>>> and
>>>> if you can run a backup & a restore.
>>>>
>>>> Best Regards,
>>>> Eric
>>>>
>>>>
>>>> ___
>>>> Bacula-users mailing list
>>>> Ba

Re: [Bacula-users] Packet size too big (NOT a version mismatch)

2021-12-01 Thread Stephen Thompson
Not sure if this is correct, but I've been able to at least compile bacula
client 11.0.5 on Big Sur by doing this before the configure step:

LDFLAGS='-framework CoreFoundation'
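
(That is, the variable is exported into configure's environment; roughly:

env LDFLAGS='-framework CoreFoundation' ./configure --enable-client-only ...
make

The full configure invocation I use appears elsewhere in this thread.)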

We'll see next up whether it runs and whether it exhibits the issue seen
under Big Sur for the 9x client.

Stephen

On Tue, Nov 23, 2021 at 7:32 AM Stephen Thompson <
stephen.thomp...@berkeley.edu> wrote:

>
> Josh,
>
> Thanks for the tip.  That did not appear to be the cause of this issue,
> though perhaps it will fix a yet-to-be-found issue that I would have run
> into after I get past this compilation error.
>
> Stephen
>
>
>
> On Mon, Nov 22, 2021 at 9:22 AM Josh Fisher  wrote:
>
>>
>> On 11/22/21 10:46, Stephen Thompson wrote:
>>
>>
>> All,
>>
>> I too was having the issue with running a 9x client on Big Sur.  I've
>> tried compiling 11.0.5 but have not found my way past:
>>
>>
>> This might be due to a libtool.m4 bug having to do with MacOS changing
>> the major Darwin version from 19.x to 20.x. There is a patch at
>> https://www.mail-archive.com/libtool-patches@gnu.org/msg07396.html
>>
>>
>>
>> Linking bacula-fd ...
>>
>> /Users/bacula/src/bacula-11.0.5-CLIENT.MAC/libtool --silent --tag=CXX
>> --mode=link /usr/bin/g++   -L../lib -L../findlib -o bacula-fd filed.o
>> authenticate.o backup.o crypto.o win_efs.o estimate.o fdcollect.o
>> fd_plugins.o accurate.o bacgpfs.o filed_conf.o runres_conf.o heartbeat.o
>> hello.o job.o fd_snapshot.o restore.o status.o verify.o verify_vol.o
>> fdcallsdir.o suspend.o org_filed_dedup.o bacl.o bacl_osx.o bxattr.o
>> bxattr_osx.o \
>>
>> -lz -lbacfind -lbaccfg -lbac -lm -lpthread  \
>>
>> -L/usr/local/opt/openssl@1.1/lib -lssl -lcrypto -framework IOKit
>>
>> Undefined symbols for architecture x86_64:
>>
>>   "___CFConstantStringClassReference", referenced from:
>>
>>   CFString in suspend.o
>>
>>   CFString in suspend.o
>>
>> ld: symbol(s) not found for architecture x86_64
>>
>> clang: error: linker command failed with exit code 1 (use -v to see
>> invocation)
>>
>> make[1]: *** [bacula-fd] Error 1
>>
>>
>>
>> Seems like this might have something to do with the expectation of headers
>> being here:
>>
>> /System/Library/Frameworks/CoreFoundation.framework/Headers
>>
>> when they are here:
>>
>>
>> /Library/Developer/CommandLineTools/SDKs/MacOSX11.0.sdk/System/Library/Frameworks/CoreFoundation.framework/Headers/
>> but that may be a red herring.
>>
>> There also appears to be a 'clang' in two locations on OS X, /usr and
>> xcode subdir.  Hmm
>>
>> Stephen
>>
>> On Tue, Nov 16, 2021 at 12:00 AM Eric Bollengier via Bacula-users <
>> bacula-users@lists.sourceforge.net> wrote:
>>
>>> Hello,
>>>
>>> On 11/15/21 21:46, David Brodbeck wrote:
>>> > To do that I'd have to upgrade the director and the storage first,
>>> right?
>>> > (Director can't be an earlier version than the FD, and the SD must
>>> have the
>>> > same version as the director.)
>>>
>>> In general yes, the code is designed to support Old FDs but can have
>>> problems
>>> with newer FDs. In your case it may work.
>>>
>>> At least, you can try a status client to see if the problem is solved and
>>> if you can run a backup & a restore.
>>>
>>> Best Regards,
>>> Eric
>>>
>>>
>>> ___
>>> Bacula-users mailing list
>>> Bacula-users@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/bacula-users
>>>
>>
>>
>> --
>> Stephen Thompson   Berkeley Seismology Lab
>> stephen.thomp...@berkeley.edu  307 McCone Hall
>> Office: 510.664.9177   University of California
>> Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760
>>
>>
>> ___
>> Bacula-users mailing list
>> Bacula-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/bacula-users
>>
>>
>
> --
> Stephen Thompson   Berkeley Seismology Lab
> stephen.thomp...@berkeley.edu  307 McCone Hall
> Office: 510.664.9177   University of California
> Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760
>


-- 
Stephen Thompson   Berkeley Seismology Lab
stephen.thomp...@berkeley.edu  307 McCone Hall
Office: 510.664.9177   University of California
Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Packet size too big (NOT a version mismatch)

2021-11-23 Thread Stephen Thompson
Josh,

Thanks for the tip.  That did not appear to be the cause of this issue,
though perhaps it will fix a yet-to-be-found issue that I would have run
into after I get past this compilation error.

Stephen



On Mon, Nov 22, 2021 at 9:22 AM Josh Fisher  wrote:

>
> On 11/22/21 10:46, Stephen Thompson wrote:
>
>
> All,
>
> I too was having the issue with running a 9x client on Big Sur.  I've
> tried compiling 11.0.5 but have not found my way past:
>
>
> This might be due to a libtool.m4 bug having to do with MacOS changing the
> major Darwin version from 19.x to 20.x. There is a patch at
> https://www.mail-archive.com/libtool-patches@gnu.org/msg07396.html
>
>
>
> Linking bacula-fd ...
>
> /Users/bacula/src/bacula-11.0.5-CLIENT.MAC/libtool --silent --tag=CXX
> --mode=link /usr/bin/g++   -L../lib -L../findlib -o bacula-fd filed.o
> authenticate.o backup.o crypto.o win_efs.o estimate.o fdcollect.o
> fd_plugins.o accurate.o bacgpfs.o filed_conf.o runres_conf.o heartbeat.o
> hello.o job.o fd_snapshot.o restore.o status.o verify.o verify_vol.o
> fdcallsdir.o suspend.o org_filed_dedup.o bacl.o bacl_osx.o bxattr.o
> bxattr_osx.o \
>
> -lz -lbacfind -lbaccfg -lbac -lm -lpthread  \
>
> -L/usr/local/opt/openssl@1.1/lib -lssl -lcrypto -framework IOKit
>
> Undefined symbols for architecture x86_64:
>
>   "___CFConstantStringClassReference", referenced from:
>
>   CFString in suspend.o
>
>   CFString in suspend.o
>
> ld: symbol(s) not found for architecture x86_64
>
> clang: error: linker command failed with exit code 1 (use -v to see
> invocation)
>
> make[1]: *** [bacula-fd] Error 1
>
>
>
> Seems like this might have something to do with the expectation of headers
> being here:
>
> /System/Library/Frameworks/CoreFoundation.framework/Headers
>
> when they are here:
>
>
> /Library/Developer/CommandLineTools/SDKs/MacOSX11.0.sdk/System/Library/Frameworks/CoreFoundation.framework/Headers/
> but that may be a red herring.
>
> There also appears to be a 'clang' in two locations on OS X, /usr and
> xcode subdir.  Hmm
>
> Stephen
>
> On Tue, Nov 16, 2021 at 12:00 AM Eric Bollengier via Bacula-users <
> bacula-users@lists.sourceforge.net> wrote:
>
>> Hello,
>>
>> On 11/15/21 21:46, David Brodbeck wrote:
>> > To do that I'd have to upgrade the director and the storage first,
>> right?
>> > (Director can't be an earlier version than the FD, and the SD must have
>> the
>> > same version as the director.)
>>
>> In general yes, the code is designed to support Old FDs but can have
>> problems
>> with newer FDs. In your case it may work.
>>
>> At least, you can try a status client to see if the problem is solved and
>> if you can run a backup & a restore.
>>
>> Best Regards,
>> Eric
>>
>>
>> ___
>> Bacula-users mailing list
>> Bacula-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/bacula-users
>>
>
>
> --
> Stephen Thompson   Berkeley Seismology Lab
> stephen.thomp...@berkeley.edu  307 McCone Hall
> Office: 510.664.9177   University of California
> Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760
>
>
> ___
> Bacula-users mailing list
> Bacula-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-users
>
>

-- 
Stephen Thompson   Berkeley Seismology Lab
stephen.thomp...@berkeley.edu  307 McCone Hall
Office: 510.664.9177   University of California
Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Packet size too big (NOT a version mismatch)

2021-11-22 Thread Stephen Thompson
All,

I too was having the issue with running a 9x client on Big Sur.  I've tried
compiling 11.0.5 but have not found my way past:

Linking bacula-fd ...

/Users/bacula/src/bacula-11.0.5-CLIENT.MAC/libtool --silent --tag=CXX
--mode=link /usr/bin/g++   -L../lib -L../findlib -o bacula-fd filed.o
authenticate.o backup.o crypto.o win_efs.o estimate.o fdcollect.o
fd_plugins.o accurate.o bacgpfs.o filed_conf.o runres_conf.o heartbeat.o
hello.o job.o fd_snapshot.o restore.o status.o verify.o verify_vol.o
fdcallsdir.o suspend.o org_filed_dedup.o bacl.o bacl_osx.o bxattr.o
bxattr_osx.o \

-lz -lbacfind -lbaccfg -lbac -lm -lpthread  \

-L/usr/local/opt/openssl@1.1/lib -lssl -lcrypto -framework IOKit

Undefined symbols for architecture x86_64:

  "___CFConstantStringClassReference", referenced from:

  CFString in suspend.o

  CFString in suspend.o

ld: symbol(s) not found for architecture x86_64

clang: error: linker command failed with exit code 1 (use -v to see
invocation)

make[1]: *** [bacula-fd] Error 1



Seems like this might have something to do with the expectation of headers
being here:

/System/Library/Frameworks/CoreFoundation.framework/Headers

when they are here:

/Library/Developer/CommandLineTools/SDKs/MacOSX11.0.sdk/System/Library/Frameworks/CoreFoundation.framework/Headers/
but that may be a red herring.
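
One way to test that theory might be to point the build explicitly at the 
Command Line Tools SDK before configure, e.g.:

export SDKROOT=$(xcrun --show-sdk-path)

since xcrun --show-sdk-path prints the active SDK directory, which is where 
the CoreFoundation headers above actually live.  (Untested here; just a 
thought.)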

There also appears to be a 'clang' in two locations on OS X, /usr and xcode
subdir.  Hmm

Stephen

On Tue, Nov 16, 2021 at 12:00 AM Eric Bollengier via Bacula-users <
bacula-users@lists.sourceforge.net> wrote:

> Hello,
>
> On 11/15/21 21:46, David Brodbeck wrote:
> > To do that I'd have to upgrade the director and the storage first, right?
> > (Director can't be an earlier version than the FD, and the SD must have
> the
> > same version as the director.)
>
> In general yes, the code is designed to support Old FDs but can have
> problems
> with newer FDs. In your case it may work.
>
> At least, you can try a status client to see if the problem is solved and
> if you can run a backup & a restore.
>
> Best Regards,
> Eric
>
>
> ___
> Bacula-users mailing list
> Bacula-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-users
>


-- 
Stephen Thompson   Berkeley Seismology Lab
stephen.thomp...@berkeley.edu  307 McCone Hall
Office: 510.664.9177   University of California
Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Packet size too big (NOT a version mismatch)

2021-11-12 Thread Stephen Thompson



David,

Sorry I can't offer a solution, but I can report that I am getting the 
same error when trying to run bacula-fd 9.x on Big Sur (hand compiled).


I've tried the other suggestion of Maximum Network Buffer Size to no avail.

Stephen



On 11/12/21 2:14 PM, David Brodbeck wrote:
I'm getting this error trying to back up a macOS client. I recently 
re-installed bacula from macports on this client, after an upgrade to 
macOS Big Sur.


russell.math.ucsb.edu-sd JobId 80985: Fatal error: bsock.c:520 Packet 
size=1387166 too big from "client:128.111.88.29:62571". Maximum permitted 
1000000. Terminating connection.


Normally when I've seen this it's because of a version mismatch between 
the client and the director or storage daemon, but that's not the case 
here; the director, sd, and fd are all running the same version:


1000 OK: 103 self-help.math.ucsb.edu-dir Version: 9.4.4 (28 May 2019)
russell.math.ucsb.edu-sd Version: 9.4.4 (28 May 2019) x86_64-pc-linux-gnu redhat (Core)
noether.math.ucsb.edu-fd Version: 9.4.4 (28 May 2019) x86_64-apple-darwin20.6.0 osx 20.6.0


All except the fd are built directly from bacula source. (The fd was 
built with macports.)


Any suggestions on where to look? Other clients are backing up fine to 
the same sd, so I feel like it must be a client configuration issue, but 
I can't figure out how.


--
David Brodbeck (they/them)
System Administrator, Department of Mathematics
University of California, Santa Barbara



___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users



--
Stephen Thompson   Berkeley Seismology Lab
stephen.thomp...@berkeley.edu  307 McCone Hall
Office: 510.664.9177   University of California
Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760


___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] possibly new mtx timing bug in 9x?

2018-07-10 Thread Stephen Thompson




Update... it may very well be hardware, though it did not seem like that 
at first.  If it's a timing issue, it's not with the library but the drive.


On 7/6/18 7:26 AM, Stephen Thompson wrote:


Not sure if anyone else is seeing this, but sporadically, perhaps 2-3 
times a month, after running various versions of bacula on the same 
server/tape library for 10 years now and now running 9.0.6, we are 
seeing cases where bacula wants user intervention to mount a tape in a 
drive, but the tape is already in the drive AND bacula put it there.  The 
only thing I can think is that the tape load step is somehow timing out 
and then not making the check to see whether the tape made it to the 
drive or not.
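
When it happens, one way to confirm that Bacula and the changer disagree 
would be to compare their views (changer device path is a placeholder):

mtx -f /dev/sg1 status          # what the autochanger says is loaded
* status storage=Autochanger    # in bconsole, what Bacula thinks

If mtx shows the tape in the drive while Bacula is still asking for a mount, 
that would point at the load step timing out as described above.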


thanks,
Stephen


--
Stephen Thompson   Berkeley Seismo Lab
step...@seismo.berkeley.edu215 McCone Hall
Office: 510.664.9177   University of California
Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] client initiated backups - bconsole vs tray?

2018-07-08 Thread Stephen Thompson


I got this working without TLS enabled.

Not sure why that breaks it, but perhaps it is something with how proxy is 
handled and how the TLS settings may not be applied properly for a 
remote=yes config, even though they are accepted as options.


Oddly, even if I just enable TLS from the console to the FD, which 
allows for a local status of the FD, this breaks proxy, even when the proxy 
connection does not have TLS enabled (it works if the console-to-FD 
connection also has no TLS enabled).  That the proxy connection might not 
be expecting TLS I can understand, but it seems odd that a successful TLS 
connection from console to FD would break the FD's ability to proxy to the 
remote Director.


Stephen



On 7/6/18 2:44 PM, Stephen Thompson wrote:



Well, this led to unexpected results.  Still 9.0.6, but running both the FD 
and the DIR in the foreground with -d900, both show startup messages, show 
the console connecting to the FD, and show the FD connecting to the DIR when 
"proxy" is sent; but then, when any command is sent and hangs, NEITHER the 
FD NOR the DIRECTOR outputs anything at all!


2000 OK Hello 214
Enter a period to cancel a command.
*proxy
2000 proxy OK.
*status


FD output...
fd: hello.c:262-0 Connecting to Director DIRECTOR:9101
fd: watchdog.c:197-0 Registered watchdog 7fc73401fa68, interval 15 one shot
fd: btimers.c:145-0 Start thread timer 7fc73401d498 tid 7fc73bf4c700 for 
15 secs.

fd: bsock.c:237-0 Current A.B.C.D:9101 All W.X.Y.Z:9101
fd: bsock.c:166-0 who=Director daemon host=DIRECTOR port=9101
fd: bsock.c:349-0 OK connected to server  Director daemon DIRECTOR:9101.
fd: btimers.c:203-0 Stop thread timer 7fc73401d498 tid=7fc73bf4c700.
fd: watchdog.c:217-0 Unregistered watchdog 7fc73401fa68
fd: watchdog.c:197-0 Registered watchdog 7fc73401d498, interval 15 one shot
fd: btimers.c:177-0 Start bsock timer 7fc734005d18 tid=7fc73bf4c700 for 
15 secs at 1530890871
fd: cram-md5.c:133-0 cram-get received: auth cram-md5 
<195401314.1530890871@dir> ssl=2

fd: cram-md5.c:157-0 sending resp to challenge: jlJ1z7+S47xwcCkb2S+GGD
fd: cram-md5.c:76-0 send: auth cram-md5 challenge 
<88308421.1530890871@fd> ssl=2

fd: cram-md5.c:95-0 Authenticate OK GD/TjH/8Dwc+4C0mJ8+2oD
fd: tls.c:392-0 Check subject name name
fd: bnet.c:280-0 TLS client negotiation established.
fd: hello.c:335-0 >dird: 1000 OK auth
fd: hello.c:342-0 November 2017)

fd: hello.c:345-0 1000 OK: 103 DIRECTOR Version: 9.0.6 (20 November 2017)



DIR output
dir: bnet.c:569-0 socket=6 who=client host=A.B.C.D port=9101
dir: jcr.c:931-0 set_jcr_job_status(0, C)
dir: jcr.c:940-0 OnEntry JobStatus=0 newJobstatus=C
dir: jcr.c:951-0 Set new stat. old: 0,0 new: C,0
dir: jcr.c:956-0 leave setJobStatus old=0 new=C
dir: job.c:1760-0 wstorage=STORAGE
dir: job.c:1769-0 wstore=STORAGE where=Job resource
dir: job.c:1429-0 JobId=0 created Job=-Console-.2018-07-06_08.27.51_05
dir: jcr.c:931-0 set_jcr_job_status(0, R)
dir: jcr.c:940-0 OnEntry JobStatus=C newJobstatus=R
dir: jcr.c:951-0 Set new stat. old: C,0 new: R,0
dir: jcr.c:956-0 leave setJobStatus old=C new=R
dir: cram-md5.c:69-0 send: auth cram-md5 challenge 
<195401314.1530890871@dir> ssl=2
dir: cram-md5.c:133-0 cram-get received: auth cram-md5 
<88308421.1530890871@fd> ssl=2

dir: cram-md5.c:157-0 sending resp to challenge: GD/TjH/8Dwc+4C0mJ8+2oD
lawson-dir: bnet.c:230-0 TLS server negotiation established.


I'm going to build 9.0.8 and see if I get different results.
I believe I also tried skipping TLS, with the same results.
Stephen


On 7/6/18 7:23 AM, Stephen Thompson wrote:



Yes, it does print 2000 proxy OK, but then in my case, the 'run' below 
would hang.  And as I said, running bacula-fd in the foreground 
shows a successful connection to the Director when successful, but then 
nothing more.  Also, an unsuccessful connection (made on purpose) is output 
from both the FD and the DIR, so they are definitely talking.  Hmmm... 
I will try your foregrounded director suggestion.


BTW, I'm also using TLS, which I'm hoping is not muddying the waters.

Oh, and technically I'm running 9.0.6, so perhaps I should upgrade as 
well.



Stephen



On 7/6/18 3:50 AM, Martin Simmons wrote:

It works for me in 9.0.8:

Connecting to Director localhost:9102
2000 OK Hello 214
Enter a period to cancel a command.
*proxy
2000 proxy OK.
*run
Automatically selected Catalog: MyCatalog
Using Catalog "MyCatalog"
A job name must be specified.
The defined Job resources are:
  1: Client1
  ...

Does it print "2000 proxy OK." and the "*" prompt after the proxy 
command?

You could try running the Director in the foreground with -d900.

__Martin




On Thu, 5 Jul 2018 17:30:31 -0700, Stephen Thompson said:


Thanks Martin.

That got me a step closer, but still not working.

If I run bacula-fd in the foreground, I can see that when I execute the proxy
command the FD outputs a successful 'connected to Director' message.  But
running any other command under proxy in bconsole just hangs with no
output from the FD or from the Director.

Hmmm...
Ste

Re: [Bacula-users] client initiated backups - bconsole vs tray?

2018-07-06 Thread Stephen Thompson



Well, this led to unexpected results.  Still 9.0.6, but running both the FD 
and the DIR in the foreground with -d900, both show startup messages, show 
the console connecting to the FD, and show the FD connecting to the DIR when 
"proxy" is sent; but then, when any command is sent and hangs, NEITHER the 
FD NOR the DIRECTOR outputs anything at all!


2000 OK Hello 214
Enter a period to cancel a command.
*proxy
2000 proxy OK.
*status


FD output...
fd: hello.c:262-0 Connecting to Director DIRECTOR:9101
fd: watchdog.c:197-0 Registered watchdog 7fc73401fa68, interval 15 one shot
fd: btimers.c:145-0 Start thread timer 7fc73401d498 tid 7fc73bf4c700 for 
15 secs.

fd: bsock.c:237-0 Current A.B.C.D:9101 All W.X.Y.Z:9101
fd: bsock.c:166-0 who=Director daemon host=DIRECTOR port=9101
fd: bsock.c:349-0 OK connected to server  Director daemon DIRECTOR:9101.
fd: btimers.c:203-0 Stop thread timer 7fc73401d498 tid=7fc73bf4c700.
fd: watchdog.c:217-0 Unregistered watchdog 7fc73401fa68
fd: watchdog.c:197-0 Registered watchdog 7fc73401d498, interval 15 one shot
fd: btimers.c:177-0 Start bsock timer 7fc734005d18 tid=7fc73bf4c700 for 
15 secs at 1530890871
fd: cram-md5.c:133-0 cram-get received: auth cram-md5 
<195401314.1530890871@dir> ssl=2

fd: cram-md5.c:157-0 sending resp to challenge: jlJ1z7+S47xwcCkb2S+GGD
fd: cram-md5.c:76-0 send: auth cram-md5 challenge 
<88308421.1530890871@fd> ssl=2

fd: cram-md5.c:95-0 Authenticate OK GD/TjH/8Dwc+4C0mJ8+2oD
fd: tls.c:392-0 Check subject name name
fd: bnet.c:280-0 TLS client negotiation established.
fd: hello.c:335-0 >dird: 1000 OK auth
fd: hello.c:342-0 November 2017)

fd: hello.c:345-0 1000 OK: 103 DIRECTOR Version: 9.0.6 (20 November 2017)



DIR output
dir: bnet.c:569-0 socket=6 who=client host=A.B.C.D port=9101
dir: jcr.c:931-0 set_jcr_job_status(0, C)
dir: jcr.c:940-0 OnEntry JobStatus=0 newJobstatus=C
dir: jcr.c:951-0 Set new stat. old: 0,0 new: C,0
dir: jcr.c:956-0 leave setJobStatus old=0 new=C
dir: job.c:1760-0 wstorage=STORAGE
dir: job.c:1769-0 wstore=STORAGE where=Job resource
dir: job.c:1429-0 JobId=0 created Job=-Console-.2018-07-06_08.27.51_05
dir: jcr.c:931-0 set_jcr_job_status(0, R)
dir: jcr.c:940-0 OnEntry JobStatus=C newJobstatus=R
dir: jcr.c:951-0 Set new stat. old: C,0 new: R,0
dir: jcr.c:956-0 leave setJobStatus old=C new=R
dir: cram-md5.c:69-0 send: auth cram-md5 challenge 
<195401314.1530890871@dir> ssl=2
dir: cram-md5.c:133-0 cram-get received: auth cram-md5 
<88308421.1530890871@fd> ssl=2

dir: cram-md5.c:157-0 sending resp to challenge: GD/TjH/8Dwc+4C0mJ8+2oD
lawson-dir: bnet.c:230-0 TLS server negotiation established.


I'm going to build 9.0.8 and see if I get different results.
I believe I also tried skipping TLS, with the same results.
Stephen


On 7/6/18 7:23 AM, Stephen Thompson wrote:



Yes, it does print 2000 proxy OK, but then in my case, the 'run' below 
would hang.  And as I said, running bacula-fd in the foreground 
shows a successful connection to the Director when successful, but then 
nothing more.  Also, an unsuccessful connection (made on purpose) is output 
from both the FD and the DIR, so they are definitely talking.  Hmmm... I 
will try your foregrounded director suggestion.


BTW, I'm also using TLS, which I'm hoping is not muddying the waters.

Oh, and technically I'm running 9.0.6, so perhaps I should upgrade as well.


Stephen



On 7/6/18 3:50 AM, Martin Simmons wrote:

It works for me in 9.0.8:

Connecting to Director localhost:9102
2000 OK Hello 214
Enter a period to cancel a command.
*proxy
2000 proxy OK.
*run
Automatically selected Catalog: MyCatalog
Using Catalog "MyCatalog"
A job name must be specified.
The defined Job resources are:
  1: Client1
  ...

Does it print "2000 proxy OK." and the "*" prompt after the proxy 
command?

You could try running the Director in the foreground with -d900.

__Martin




On Thu, 5 Jul 2018 17:30:31 -0700, Stephen Thompson said:


Thanks Martin.

That got me a step closer, but still not working.

If I run bacula-fd in foreground, I can see that when I execute proxy
command, the FD outputs a successful 'connected to Director' message.  But
running any other command under proxy in bconsole just hangs with no
output from FD or from Director.

Hmmm...
Stephen


On 7/5/18 8:21 AM, Martin Simmons wrote:

On Tue, 3 Jul 2018 16:04:56 -0700, Stephen Thompson said:


All,

I've been trying to set up client initiated backups via FD remote=yes and
bconsole with no success.  Regardless of the ACLs defined on Director,
the only command available on client's bconsole is "status" and even
that is the status of the local FD, not the DIR status.  Every other
command yields...

2999 Invalid command


You are not connected directly to the Director command loop after connecting
bconsole to the local FD.  According to the test
(regress/tests/remote-console-test), you need to use the proxy command
(without any arguments) to connect to the Director.

__Martin



--
Stephen Thompson

[Bacula-users] possibly new mtx timing bug in 9x?

2018-07-06 Thread Stephen Thompson



Not sure if anyone else is seeing this, but sporadically, perhaps 2-3 
times a month, after running various versions of bacula on the same 
server/tape library for 10 years and currently running 9.0.6, we are 
seeing cases where bacula wants user intervention to mount a tape in a 
drive, but the tape is already in the drive AND bacula put it there. 
The only thing I can think of is that the tape load step is somehow 
timing out and then not checking whether the tape actually made it to 
the drive.
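
For what it's worth, a way to cross-check what bacula believes against 
the hardware is to query the changer and the drive directly; a minimal 
sketch, assuming hypothetical device paths /dev/sg4 for the changer and 
/dev/nst0 for the drive:

# ask the changer which slot each tape is in and what is loaded
mtx -f /dev/sg4 status
# ask the drive itself whether a tape is mounted
mt -f /dev/nst0 status

If mtx shows the tape sitting in the drive while bacula is still asking 
for a mount, that would point at the load/poll timing rather than the 
library itself.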


thanks,
Stephen
--
Stephen Thompson   Berkeley Seismo Lab
step...@seismo.berkeley.edu215 McCone Hall
Office: 510.664.9177   University of California
Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] client initiated backups - bconsole vs tray?

2018-07-06 Thread Stephen Thompson




Yes, it does print 2000 proxy OK, but then in my case, the 'run' below 
would hang.  And as I said, running the bacula-fd in the foreground 
shows a successful connection to the Director, but then nothing more.  
Also, an unsuccessful connection (made on purpose) produces output from 
both the FD and the DIR, so they are definitely talking.  Hmmm... I 
will try your foregrounded director suggestion.


BTW, I'm also using TLS, which I'm hoping is not muddying the waters.

Oh, and technically I'm running 9.0.6, so perhaps I should upgrade as well.


Stephen



On 7/6/18 3:50 AM, Martin Simmons wrote:

It works for me in 9.0.8:

Connecting to Director localhost:9102
2000 OK Hello 214
Enter a period to cancel a command.
*proxy
2000 proxy OK.
*run
Automatically selected Catalog: MyCatalog
Using Catalog "MyCatalog"
A job name must be specified.
The defined Job resources are:
  1: Client1
  ...

Does it print "2000 proxy OK." and the "*" prompt after the proxy command?
You could try running the Director in the foreground with -d900.

__Martin




On Thu, 5 Jul 2018 17:30:31 -0700, Stephen Thompson said:


Thanks Martin.

That got me a step closer, but still not working.

If I run bacula-fd in foreground, I can see that when I execute proxy
command, the FD outputs a successful 'connected to Director' message.  But
running any other command under proxy in bconsole just hangs with no
output from FD or from Director.

Hmmm...
Stephen


On 7/5/18 8:21 AM, Martin Simmons wrote:

On Tue, 3 Jul 2018 16:04:56 -0700, Stephen Thompson said:


All,

I've been trying to set up client initiated backups via FD remote=yes and
bconsole with no success.  Regardless of the ACLs defined on Director,
the only command available on client's bconsole is "status" and even
that is the status of the local FD, not the DIR status.  Every other
command yields...

2999 Invalid command


You are not connected directly to the Director command loop after connecting
bconsole to the local FD.  According to the test
(regress/tests/remote-console-test), you need to use the proxy command
(without any arguments) to connect to the Director.

__Martin



--
Stephen Thompson   Berkeley Seismo Lab
step...@seismo.berkeley.edu215 McCone Hall
Office: 510.664.9177   University of California
Remote: 510.214.6506 (Tue)     Berkeley, CA 94720-4760



--
Stephen Thompson   Berkeley Seismo Lab
step...@seismo.berkeley.edu215 McCone Hall
Office: 510.664.9177   University of California
Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] client initiated backups - bconsole vs tray?

2018-07-05 Thread Stephen Thompson



Thanks Martin.

That got me a step closer, but still not working.

If I run bacula-fd in foreground, I can see that when I execute proxy 
command, the FD outputs a successful 'connected to Director' message.  But 
running any other command under proxy in bconsole just hangs with no 
output from FD or from Director.


Hmmm...
Stephen


On 7/5/18 8:21 AM, Martin Simmons wrote:

On Tue, 3 Jul 2018 16:04:56 -0700, Stephen Thompson said:


All,

I've been trying to set up client initiated backups via FD remote=yes and
bconsole with no success.  Regardless of the ACLs defined on Director,
the only command available on client's bconsole is "status" and even
that is the status of the local FD, not the DIR status.  Every other
command yields...

2999 Invalid command


You are not connected directly to the Director command loop after connecting
bconsole to the local FD.  According to the test
(regress/tests/remote-console-test), you need to use the proxy command
(without any arguments) to connect to the Director.

__Martin



--
Stephen Thompson   Berkeley Seismo Lab
step...@seismo.berkeley.edu215 McCone Hall
Office: 510.664.9177   University of California
Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


[Bacula-users] client initiated backups - bconsole vs tray?

2018-07-03 Thread Stephen Thompson



All,

I've been trying to set up client initiated backups via FD remote=yes and 
bconsole with no success.  Regardless of the ACLs defined on Director, 
the only command available on client's bconsole is "status" and even 
that is the status of the local FD, not the DIR status.  Every other 
command yields...


2999 Invalid command


==My Configuration==

server...

bacula-dir.conf:
Director {
  Name = bacula-dir
  DIRport = 9101
  Password = "ABC123"
}
Console {
  Name = dir-con-fd
  Password = "DEF456"
  CommandACL = *all*
  ClientACL = *all*
  JobACL = *all*
  PoolACL = *all*
  StorageACL = *all*
  CatalogACL = *all*
  FileSetACL = *all*
}


client...

bacula-fd.conf:
Director {
  Name = bacula-dir
  Password = "ABC123"
}
Console {
  Name = dir-con-fd
  DIRPort = 9101
  Address = 
  Password = "DEF456"
}
Director {
  Name = fd-con
  Remote = yes
  Password = "GHI789"
}
FileDaemon {
  Name = bacula-fd
  FDport = 9102
}

bconsole.conf:
Director {
  Name = bacula-fd
  DIRport = 9102
  Address = localhost
  Password = "NOT_USED-SEE_CONSOLE_SECTION"
}
Console {
  Name = fd-con
  Password = "GHI789"
}
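
For reference, a minimal sketch of how the above is meant to be driven 
(the session shape follows this thread; "status dir" is just an example 
of a proxied command):

bconsole -c bconsole.conf
Connecting to Director localhost:9102
2000 OK Hello 214
*proxy
2000 proxy OK.
*status dir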


I see the docs on this lean heavily toward tray.  Does this even work 
for bconsole?  I saw a Kern comment that they use bconsole for testing 
this feature, but I just cannot get it to let me run any command but a 
local status of the FD.


Help?

thanks,
Stephen
--
Stephen Thompson   Berkeley Seismo Lab
step...@seismo.berkeley.edu215 McCone Hall
Office: 510.664.9177   University of California
Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] PurgedFiles in Job tables- Not toggled when Volumes are purged?

2018-04-19 Thread Stephen Thompson



Looks like I may have been seeing things like Canceled (A) jobs that 
never had any File records to begin with, and therefore were never 
deleted with Volume purging and still have PurgedFiles set to 0.


Stephen


On 04/19/2018 09:53 AM, Martin Simmons wrote:

The "purge volumes" command deletes the job records, so there is no row
anymore in which to set PurgedFiles.

What is the exact bconsole command line you are running to purge volumes?

__Martin





On Mon, 16 Apr 2018 09:25:11 -0700, Stephen Thompson said:


In looking at doing this out of band (not the pruning feature) I've run into
a tracking snag.  We tend to purge volumes manually after a year when we
want to reuse them, but it looks like purging volumes does not change
the PurgedFiles column in the Job table for the Jobs that have had
their Files Purged.  It appears that only happens if the Files are
purged at the Job level.

Can anyone confirm that is expected behaviour?

I may need to purge all the Jobs on a volume, before I purge the volume,
in order to get the flags set properly, so that I can more easily track
which Jobs have had their Files purged and which have not.

Stephen




On 04/11/2018 06:25 AM, Stephen Thompson wrote:


Thanks Kern.

I think given the limited nature of this need, I may use a postrun script
to simply wipe database records out of band.

Also if I did use multi-client definitions, I would need to use the same
pool as they all go to the same monthly tapes.

Stephen


On 4/10/18 11:59 PM, Kern Sibbald wrote:

Hello Stephen,

What you are asking for, as you suspect, does not exist and
implementing it would be a bit problematic because every Job would
need to keep its own retention period.  For one client, there can be
any number of Jobs -- typically thousands.  Thus the catalog would
grow faster (more data for the File table having the most records),
and the complexity of pruning including the time to prune would
probably explode -- probably thousands of times slower.

I have never used two Client definitions to backup the same machine,
but in principle it would work fine.  If you name your Clients
appropriately it might be easier to remember what was done.  E.g.
Client1-Normal-Files, Client1-Archived-Files, ... Also, if you put
clear comments on the resource definitions, it would help.  Note two
things, if you go this route:

1. Be sure to define each of your two Client1-xxx with different Pools
with different Volume retention periods
2. I would appreciate feedback on how this works -- especially
operationally

Best regards,
Kern

PS: At the current time the Enterprise version of Bacula has a number
of performance improvements that should significantly speed up the
backups of 50+million files.  It does this at a small extra expense
(size) of the catalog.

On 04/07/2018 06:21 AM, Stephen Thompson wrote:


I believe the answer is no, but as a happy bacula user for 10 years I
am somewhat surprised at the lack of flexibility.

The scenario is this:  A fileserver (1 client) with dozens of large
(size-wise) filesystems (12 jobs), but a couple of those filesystems
are large (filecount-wise).  We would really like to set different
file retention periods on those high-filecount jobs (50+million),
because they are forcing the Catalog to go beyond our size
constraints. However, we also don't want to lose the file retention
longevity of that client's other jobs (5 years).  The only hack I can
think of is to define 2 clients for 1 actual host, but I'd rather not
go down that route, because tracking jobs and associating them,
especially over multiple years, will get that much more tricky.

Ideas?

thanks,
Stephen




--
Stephen Thompson   Berkeley Seismo Lab
step...@seismo.berkeley.edu215 McCone Hall
Office: 510.664.9177   University of California
Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users



--
Stephen Thompson   Berkeley Seismo Lab
step...@seismo.berkeley.edu215 McCone Hall
Office: 510.664.9177   University of California
Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


[Bacula-users] PurgedFiles in Job tables- Not toggled when Volumes are purged?

2018-04-16 Thread Stephen Thompson



In looking at doing this out of band (not the pruning feature) I've run into 
a tracking snag.  We tend to purge volumes manually after a year when we 
want to reuse them, but it looks like purging volumes does not change 
the PurgedFiles column in the Job table for the Jobs that have had 
their Files Purged.  It appears that only happens if the Files are 
purged at the Job level.


Can anyone confirm that is expected behaviour?

I may need to purge all the Jobs on a volume, before I purge the volume, 
in order to get the flags set properly, so that I can more easily track 
which Jobs have had their Files purged and which have not.
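
A minimal sketch of that order of operations, assuming a MySQL catalog 
named bacula and a hypothetical volume Vol-0001:

-- find the jobs written to the volume
SELECT DISTINCT JobMedia.JobId
  FROM JobMedia JOIN Media ON Media.MediaId = JobMedia.MediaId
 WHERE Media.VolumeName = 'Vol-0001';

Then, in bconsole, purge files for each JobId returned before purging 
the volume itself:

*purge files jobid=1234
*purge volume=Vol-0001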


Stephen




On 04/11/2018 06:25 AM, Stephen Thompson wrote:


Thanks Kern.

I think given the limited nature of this need, I may use a postrun script 
to simply wipe database records out of band.


Also if I did use multi-client definitions, I would need to use the same 
pool as they all go to the same monthly tapes.


Stephen


On 4/10/18 11:59 PM, Kern Sibbald wrote:

Hello Stephen,

What you are asking for, as you suspect, does not exist and 
implementing it would be a bit problematic because every Job would 
need to keep its own retention period.  For one client, there can be 
any number of Jobs -- typically thousands.  Thus the catalog would 
grow faster (more data for the File table having the most records), 
and the complexity of pruning including the time to prune would 
probably explode -- probably thousands of times slower.


I have never used two Client definitions to backup the same machine, 
but in principle it would work fine.  If you name your Clients 
appropriately it might be easier to remember what was done.  E.g. 
Client1-Normal-Files, Client1-Archived-Files, ... Also, if you put 
clear comments on the resource definitions, it would help.  Note two 
things, if you go this route:


1. Be sure to define each of your two Client1-xxx with different Pools 
with different Volume retention periods
2. I would appreciate feedback on how this works -- especially 
operationally


Best regards,
Kern

PS: At the current time the Enterprise version of Bacula has a number 
of performance improvements that should significantly speed up the 
backups of 50+million files.  It does this at a small extra expense 
(size) of the catalog.


On 04/07/2018 06:21 AM, Stephen Thompson wrote:


I believe the answer is no, but as a happy bacula user for 10 years I 
am somewhat surprised at the lack of flexibility.


The scenario is this:  A fileserver (1 client) with dozens of large 
(size-wise) filesystems (12 jobs), but a couple of those filesystems 
are large (filecount-wise).  We would really like to set different 
file retention periods on those high-filecount jobs (50+million), 
because they are forcing the Catalog to go beyond our size 
constraints. However, we also don't want to lose the file retention 
longevity of that client's other jobs (5 years).  The only hack I can 
think of is to define 2 clients for 1 actual host, but I'd rather not 
go down that route, because tracking jobs and associating them, 
especially over multiple years, will get that much more tricky.


Ideas?

thanks,
Stephen




--
Stephen Thompson   Berkeley Seismo Lab
step...@seismo.berkeley.edu215 McCone Hall
Office: 510.664.9177   University of California
Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] which database version for bacula 9.0.6?

2018-04-14 Thread Stephen Thompson



Never mind.  I was looking in the wrong place.  I see that 9x involves an 
update from 15 to 16, and the script to do it with.

Sorry, though I still wonder if there's a mapping somewhere that lists 
db versions against bacula versions.
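
For anyone else trying to see where they stand, the catalog version can 
be read directly; a minimal sketch, assuming the catalog database is 
named bacula:

mysql -u bacula bacula -e 'SELECT VersionId FROM Version;'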


Stephen


On 4/14/18 7:44 PM, Stephen Thompson wrote:


I'm a little confused.  The release notes for 9.0.0 (and possibly all 
9x) say that a database upgrade is required, and to run the 
update_bacula_tables script.


I am running 7.4.4 at the moment and my database version (from the version 
table) says I'm at 15, but the script to update mysql tables with bacula 
9.0.6 apparently upgrades the database to 15.


Does this sound right?  How can I already be at a version that's for 9x?  
Or did 15 come during 7x, which is why I have it, and is the note in the 
9x release notes about needing a database upgrade there because it can't 
hurt to run the script and many non-9x installations might not yet be at 15?


Is the database version and how it corresponds to bacula versions 
documented in a list anywhere?


I ask, because I want to know ahead of time how risky this is.  I do 
back up my catalog, but it could take a week to restore from backup, so I 
just want to know whether I really will be doing a database upgrade or not.


thanks!
Stephen


--
Stephen Thompson   Berkeley Seismo Lab
step...@seismo.berkeley.edu215 McCone Hall
Office: 510.664.9177   University of California
Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


[Bacula-users] which database version for bacula 9.0.6?

2018-04-14 Thread Stephen Thompson


I'm a little confused.  The release notes for 9.0.0 (and possibly all 
9x) say that a database upgrade is required, and to run the 
update_bacula_tables script.


I am running 7.4.4 at the moment and my database version (from the version 
table) says I'm at 15, but the script to update mysql tables with bacula 
9.0.6 apparently upgrades the database to 15.


Does this sound right?  How can I already be at a version that's for 9x?  
Or did 15 come during 7x, which is why I have it, and is the note in the 
9x release notes about needing a database upgrade there because it can't 
hurt to run the script and many non-9x installations might not yet be at 15?


Is the database version and how it corresponds to bacula versions 
documented in a list anywhere?


I ask, because I want to know ahead of time how risky this is.  I do 
back up my catalog, but it could take a week to restore from backup, so I 
just want to know whether I really will be doing a database upgrade or not.


thanks!
Stephen
--
Stephen Thompson   Berkeley Seismo Lab
step...@seismo.berkeley.edu215 McCone Hall
Office: 510.664.9177   University of California
Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] can file retention be job rather than client based?

2018-04-11 Thread Stephen Thompson


Thanks Kern.

I think given the limited nature of this need, I may use a postrun script 
to simply wipe database records out of band.


Also if I did use multi-client definitions, I would need to use the same 
pool as they all go to the same monthly tapes.


Stephen


On 4/10/18 11:59 PM, Kern Sibbald wrote:

Hello Stephen,

What you are asking for, as you suspect, does not exist and implementing 
it would be a bit problematic because every Job would need to keep its 
own retention period.  For one client, there can be any number of Jobs 
-- typically thousands.  Thus the catalog would grow faster (more data 
for the File table having the most records), and the complexity of 
pruning including the time to prune would probably explode -- probably 
thousands of times slower.


I have never used two Client definitions to backup the same machine, but 
in principle it would work fine.  If you name your Clients appropriately 
it might be easier to remember what was done.  E.g. 
Client1-Normal-Files, Client1-Archived-Files, ... Also, if you put clear 
comments on the resource definitions, it would help.  Note two things, 
if you go this route:


1. Be sure to define each of your two Client1-xxx with different Pools 
with different Volume retention periods
2. I would appreciate feedback on how this works -- especially 
operationally


Best regards,
Kern

PS: At the current time the Enterprise version of Bacula has a number of 
performance improvements that should significantly speed up the backups 
of 50+million files.  It does this at a small extra expense (size) of 
the catalog.


On 04/07/2018 06:21 AM, Stephen Thompson wrote:


I believe the answer is no, but as a happy bacula user for 10 years I 
am somewhat surprised at the lack of flexibility.


The scenario is this:  A fileserver (1 client) with dozens of large 
(size-wise) filesystems (12 jobs), but a couple of those filesystems 
are large (filecount-wise).  We would really like to set different 
file retention periods on those high-filecount jobs (50+million), 
because they are forcing the Catalog to go beyond our size 
constraints. However, we also don't want to lose the file retention 
longevity of that client's other jobs (5 years).  The only hack I can 
think of is to define 2 clients for 1 actual host, but I'd rather not 
go down that route, because tracking jobs and associating them, 
especially over multiple years, will get that much more tricky.


Ideas?

thanks,
Stephen


--
Stephen Thompson   Berkeley Seismo Lab
step...@seismo.berkeley.edu215 McCone Hall
Office: 510.664.9177   University of California
Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] can file retention be job rather than client based?

2018-04-07 Thread Stephen Thompson


Thanks... Yeah, I'm leaning towards a post- or pre-job script that 
actually prunes (or more likely purges) the file records I need to jettison.
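
A minimal sketch of what such a script might issue against a MySQL 
catalog, assuming a hypothetical high-filecount job named big-fs-job and 
an assumed 90-day cutoff; this removes only the File records and leaves 
the Job records in place:

DELETE File
  FROM File JOIN Job ON Job.JobId = File.JobId
 WHERE Job.Name = 'big-fs-job'
   AND Job.StartTime < DATE_SUB(NOW(), INTERVAL 90 DAY);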


Stephen


On 4/7/18 3:38 AM, Heitor Faria wrote:

Hello Stephen,


I believe the answer is no, but as a happy bacula user for 10 years I am
somewhat surprised at the lack of flexibility.


Alternative solutions with proprietary catalog data are much more inflexible.


The scenario is this:  A fileserver (1 client) with dozens of large
(size-wise) filesystems (12 jobs), but a couple of those filesystems are
large (filecount-wise).  We would really like to set different file
retention periods on those high-filecount jobs (50+million), because
they are forcing the Catalog to go beyond our size constraints.
However, we also don't want to lose the file retention longevity of that
client's other jobs (5 years).  The only hack I can think of is to
define 2 clients for 1 actual host, but I'd rather not go down that
route, because tracking jobs and associating them, especially over
multiple years, will get that much more tricky.


File & Job Retention can be set in Pool resource instead of Client one.
You can also try to modify the manual Bacula pruning Perl script mentioned 
previously so that it only prunes the files you need and does not delete the whole job 
<http://blog.bacula.org/whitepapers/manual_prune.pl>.
Finally, you can even use dynamically generated Filesets with lower or greater file 
sizes to automate their backup distribution. 
<http://bacula.us/bacula-fileset-on-client-configuration-remote-fileset/>
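
As a concrete illustration of the first suggestion, a sketch of a Pool 
resource carrying its own retention (directive names per the standard 
Pool resource; all values here are assumptions):

Pool {
  Name = BigFilesetPool
  Pool Type = Backup
  File Retention = 30 days
  Job Retention = 6 months
  Volume Retention = 1 year
}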


Ideas?

thanks,
Stephen
--


Regards,



--
Stephen Thompson   Berkeley Seismo Lab
step...@seismo.berkeley.edu215 McCone Hall
Office: 510.664.9177   University of California
Remote: 510.214.6506 (Tue) Berkeley, CA 94720-4760

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


[Bacula-users] can file retention be job rather than client based?

2018-04-06 Thread Stephen Thompson


I believe the answer is no, but as a happy bacula user for 10 years I am 
somewhat surprised at the lack of flexibility.


The scenario is this:  A fileserver (1 client) with dozens of large 
(size-wise) filesystems (12 jobs), but a couple of those filesystems are 
large (filecount-wise).  We would really like to set different file 
retention periods on those high-filecount jobs (50+million), because 
they are forcing the Catalog to go beyond our size constraints. 
However, we also don't want to lose the file retention longevity of that 
client's other jobs (5 years).  The only hack I can think of is to 
define 2 clients for 1 actual host, but I'd rather not go down that 
route, because tracking jobs and associating them, especially over 
multiple years, will get that much more tricky.


Ideas?

thanks,
Stephen
--
Stephen Thompson   Berkeley Seismo Lab
step...@seismo.berkeley.edu215 McCone Hall
Office: 510.664.9177   University of California
   Berkeley, CA 94720-4760

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] 7.2 mysql issue?

2015-10-14 Thread Stephen Thompson


Yes, we've tuned the database a number of times and believe it's the 
best we can do.

On 10/14/15 3:17 AM, Alex Domoradov wrote:
> The same thing for me. I try not to use the mysql shipped with
> CentOS 6 and replace it with Percona 5.5/5.6 whenever possible.
>
> To Stephen:
> Have you tried running mysqltuner?
>
>
> On Mon, Oct 12, 2015 at 07:33:46AM -0700, Stephen Thompson wrote:
> >
> > update...
> >
> > After adding more RAM, we are back to getting about 3 queries a day
> > that run longer than 15 minutes.  This was our norm before upgrading.
> > No job errors since the first couple days from this month (Oct).  Not
> > sure if the reduction in long running queries was actually from
> > additional RAM or not, since last week before adding RAM, the number of
> > long running queries per day had already greatly diminished since
> > beginning of month.
> >
> > So, I guess, problem solved for now, though I'm not completely confident
> > about what actually happened or if I did anything to fix it.
> > Oh, well.
> >
> > Stephen
>
> Hi Stephen,
>
> you might also try giving MariaDB a shot which has been performing
> fine as a drop-in mysql replacement for us for the last few years with
> catalogs of similar size.
>
> Cheers, Uwe
>
>
>
>
>
>
>
>
>
> 
> --
> ___
> Bacula-users mailing list
> Bacula-users@lists.sourceforge.net
> <mailto:Bacula-users@lists.sourceforge.net>
> https://lists.sourceforge.net/lists/listinfo/bacula-users
>
>

-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall #4760
Office: 510.664.9177   University of California, Berkeley
Remote: 510.214.6506 (Tue,Wed) Berkeley, CA 94720-4760

--
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] 7.2 mysql issue?

2015-10-12 Thread Stephen Thompson

update...

After adding more RAM, we are back to getting about 3 queries a day
that run longer than 15 minutes.  This was our norm before upgrading.
No job errors since the first couple of days of this month (Oct).  Not
sure whether the reduction in long-running queries was actually from the
additional RAM or not, since last week, before adding RAM, the number of
long-running queries per day had already greatly diminished since the
beginning of the month.

So, I guess, problem solved for now, though I'm not completely confident 
about what actually happened or if I did anything to fix it.
Oh, well.

Stephen



On 10/9/15 2:08 PM, Stephen Thompson wrote:
>
>
> Eric,
>
> I appreciate all the feedback.  We went through a few iterations of
> tuning awhile back and have not generally had any significant issues
> over the years with database responsiveness.
>
> Back to the original post, it's only been since our upgrade that we
> started having database lock timeout issues.  Otherwise we've run for
> years (6 or so) without issue.  We also went through an orphan record
> cleanout earlier this year.
>
> Stat wise, it looks like our slow queries are still happening at twice
> the rate compared to recent months, but half as often as they were when
> I first reported the issue a week ago, so I am equally nonplussed about
> the improvement as I was about the lockouts.
>
> I did get a chance to double the ram from 8 to 16GB today though
> unfortunately we don't have the ready resources to do many hardware
> upgrades, though I quite understand why that's a recommendation.
>
> Stephen
>
>
>
> On 10/08/2015 10:58 PM, Eric Bollengier wrote:
>> Hello Stephen,
>>
>>
>> On 05. 10. 15 19:17, Stephen Thompson wrote:
>>>
>>> Eric,
>>>
>>> Thanks for the reply.
>>>
>>> I've heard the postgres recommendation a fair number of times.  A couple
>>> years back, we setup a parallel instance but even after tuning still
>>> wound up with _worse_ performance than with mysql.  I could not figure
>>> out what to attribute this to (because it was in such contrast to all
>>> the pro-postgres recommendations) except possibly our memory-poor server
>>> - 8Gb RAM.
>>>
>>> At any rate, the only thing that's changed was the upgrade from 7.0.5 to
>>> 7.2.0.  The table involved is definitely the File table.  We do have
>>> jobs with 20-30 million records, so those jobs can be slow when it comes
>>> time for attribute insertion into the database (or to read out a file
>>> list for Accurate backups).  This why we've historically had innodb lock
>>> timeout of 3600.  However, it's only last week after the upgrade that
>>> we've ever had queries extend beyond that hour mark.
>>>
>>> We also went through a database cleaning process last month due to
>>> nearly reaching 1Tb and I can pretty authoritatively claim that we don't
>>> have orphan records.  The database content and schema all appear to be
>>> appropriate.
>>
>> A 1TB database (running either Postgresql, MySQL or whatever other kind
>> of product) should be carefully tuned and monitored. My guess would be
>> that your my.cnf settings are not suitable for such database size. You
>> can run a tool such as MySQLtuner to check that everything is ok on
>> MySQL side, increase the size of the memory of your server or try to
>> cleanup orphan filename records.
>>
>> The size of the File table should not impact performances on Backup, but
>> other tables such as Path or Filename are important (and they are pretty
>> big on your site).
>>
>>   > I was worried that queries had been rewritten that made it
>>   > more efficient for other databases, but less so for mysql.
>>
>> We didn't wrote database query specifically for PostgreSQL or MySQL but
>> we optimize them when it's possible, some SQLite queries were optimized
>> by a contributor 2 or 3 years ago, and it was way faster for some parts
>> of Bacula afterward.
>>
>> If you look the database world from outside, you might think that
>> everything is nice and smooth because all products seem to talk the
>> same language (SQL), but they all have a different way to handle the
>> work and the SQL specifications (and the lack of specifications).
>> For myself, I'm a PostgreSQL user for a quite long time, I have good
>> relationships with the PostgreSQL community, and we got huge help when
>> we wrote the "Batch Mode" few years ago. I know that it works well and
>> we can analyze problems quite easily, doing so I always advise strongly
>> to use PostgreSQ

Re: [Bacula-users] 7.2 mysql issue?

2015-10-09 Thread Stephen Thompson


Eric,

I appreciate all the feedback.  We went through a few iterations of 
tuning awhile back and have not generally had any significant issues 
over the years with database responsiveness.

Back to the original post, it's only been since our upgrade that we 
started having database lock timeout issues.  Otherwise we've run for 
years (6 or so) without issue.  We also went through an orphan record 
cleanout earlier this year.

Stat wise, it looks like our slow queries are still happening at twice 
the rate compared to recent months, but half as often as they were when 
I first reported the issue a week ago, so I am equally nonplussed about 
the improvement as I was about the lockouts.

I did get a chance to double the ram from 8 to 16GB today though 
unfortunately we don't have the ready resources to do many hardware 
upgrades, though I quite understand why that's a recommendation.

Stephen



On 10/08/2015 10:58 PM, Eric Bollengier wrote:
> Hello Stephen,
>
>
> On 05. 10. 15 19:17, Stephen Thompson wrote:
>>
>> Eric,
>>
>> Thanks for the reply.
>>
>> I've heard the postgres recommendation a fair number of times.  A couple
>> years back, we setup a parallel instance but even after tuning still
>> wound up with _worse_ performance than with mysql.  I could not figure
>> out what to attribute this to (because it was in such contrast to all
>> the pro-postgres recommendations) except possibly our memory-poor server
>> - 8Gb RAM.
>>
>> At any rate, the only thing that's changed was the upgrade from 7.0.5 to
>> 7.2.0.  The table involved is definitely the File table.  We do have
>> jobs with 20-30 million records, so those jobs can be slow when it comes
>> time for attribute insertion into the database (or to read out a file
>> list for Accurate backups).  This why we've historically had innodb lock
>> timeout of 3600.  However, it's only last week after the upgrade that
>> we've ever had queries extend beyond that hour mark.
>>
>> We also went through a database cleaning process last month due to
>> nearly reaching 1Tb and I can pretty authoritatively claim that we don't
>> have orphan records.  The database content and schema all appear to be
>> appropriate.
>
> A 1TB database (running either Postgresql, MySQL or whatever other kind
> of product) should be carefully tuned and monitored. My guess would be
> that your my.cnf settings are not suitable for such database size. You
> can run a tool such as MySQLtuner to check that everything is ok on
> MySQL side, increase the size of the memory of your server or try to
> cleanup orphan filename records.
>
> The size of the File table should not impact performances on Backup, but
> other tables such as Path or Filename are important (and they are pretty
> big on your site).
>
>  > I was worried that queries had been rewritten that made it
>  > more efficient for other databases, but less so for mysql.
>
> We didn't wrote database query specifically for PostgreSQL or MySQL but
> we optimize them when it's possible, some SQLite queries were optimized
> by a contributor 2 or 3 years ago, and it was way faster for some parts
> of Bacula afterward.
>
> If you look the database world from outside, you might think that
> everything is nice and smooth because all products seem to talk the
> same language (SQL), but they all have a different way to handle the
> work and the SQL specifications (and the lack of specifications).
> For myself, I'm a PostgreSQL user for a quite long time, I have good
> relationships with the PostgreSQL community, and we got huge help when
> we wrote the "Batch Mode" few years ago. I know that it works well and
> we can analyze problems quite easily, doing so I always advise strongly
> to use PostgreSQL for all large setup. For other products, developers
> uses MySQL and the PostgreSQL driver is not good at all.
>
> With the time, I found that you can do "more" with "less" hardware when
> using the PostgreSQL catalog. In your case (a fairly big database), it
> might be the time to spend a bit of money to get more RAM and/or make
> sure that your Path/Filename indexes stay in RAM.
>
>
> Hope it helps.
>
> Best Regards,
> Eric
>
>>
>>
>> More info...
>>
>> example from slow query logfile:
>> # Time: 151001  1:28:14
>> # User@Host: bacula[bacula] @ localhost []
>> # Query_time: 3675.052083  Lock_time: 73.719795 Rows_sent: 0
>> Rows_examined: 3
>> SET timestamp=1443688094;
>> INSERT INTO File (FileIndex, JobId, PathId, FilenameId, LStat, MD5,
>> DeltaSeq) SELECT batch.FileIndex, batch.JobId, Path.PathId,
>> Filename.FilenameId,batch.LStat, batc

Re: [Bacula-users] 7.2 mysql issue?

2015-10-09 Thread Stephen Thompson
973 |
|   41 | 219826 |
|   53 | 219767 |
|   63 | 219749 |
|  135 | 219746 |
|  141 | 219344 |
|  124 | 219157 |
|   57 | 219070 |
|  134 | 215349 |
|  227 | 154642 |
|  112 | 134792 |
|  125 | 114623 |
|   31 |  99493 |
|   49 |  98341 |
|   34 |  92193 |
|   50 |  90190 |
|   46 |  88746 |
|  111 |  87960 |
|  148 |  70591 |
|   62 |  68151 |
|  145 |  65377 |
|   42 |  65290 |
|   25 |  63220 |
|   60 |  62653 |
|   38 |  62183 |
|   43 |  46063 |
|  228 |  45989 |
|   44 |  45433 |
|  113 |  44317 |
|  186 |      1 |
|    5 |      0 |
|   56 |      0 |
|  172 |      0 |
|  195 |      0 |
|  174 |      0 |
|   48 |      0 |
|   61 |      0 |
+------+--------+
221 rows in set (0.21 sec)
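
For reference, the output above appears to come from Eric's query 
quoted below; folding in his PurgedFiles suggestion gives:

SELECT ClientId, SUM(JobFiles) AS NB FROM Job
 WHERE PurgedFiles = 0
 GROUP BY ClientId ORDER BY NB DESC;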




On 10/09/2015 10:01 AM, Eric Bollengier wrote:
> Very good point Ana,
>
> So, you might want to add to the query "AND PurgedFiles = 0"
>
> Thanks,
>
> Eric
>
> On 09. 10. 15 14:24, Ana Emília M. Arruda wrote:
>> Hello Eric!
>>
>> Thank you. I thought that you were looking for the number of filename
>> per Client that had not been pruned yet :).
>>
>> Best regards,
>> Ana
>>
>> On Fri, Oct 9, 2015 at 3:17 AM, Eric Bollengier
>> <eric.bolleng...@baculasystems.com
>> <mailto:eric.bolleng...@baculasystems.com>> wrote:
>>
>> Thanks Ana!
>>
>> Something such as
>>
>> SELECT ClientId, SUM(JobFiles) AS NB FROM Job GROUP BY ClientId
>> ORDER BY NB DESC;
>>
>> should also do the trick a bit more faster ;-)
>>
>> Best Regards,
>> Eric
>>
>> On 07. 10. 15 15:23, Ana Emília M. Arruda wrote:
>>
>> Hello Stephen,
>>
>> On Mon, Oct 5, 2015 at 2:17 PM, Stephen Thompson
>> <step...@seismo.berkeley.edu
>> <mailto:step...@seismo.berkeley.edu>
>> <mailto:step...@seismo.berkeley.edu
>> <mailto:step...@seismo.berkeley.edu>>> wrote:
>>
>>
>>  Regarding:
>>> Would be nice also if you can give the number of
>> Filename per Client
>>  (from the job table).
>>
>>  Do you have a sample SQL to retrieve this stat?
>>
>>
>> ​​select Client.Name, count(distinct Filename.FilenameId) from
>> Client,
>> Filename, File, Job where Filename.FilenameId=File.FilenameId and
>> File.JobId=Job.JobId and Job.ClientId=Client.ClientId group by
>> Client.ClientId;
>>
>> ​The above query should work.
>>
>> Best regards,
>> Ana​
>>
>>
>>
>>  thanks,
>>  Stephen
>>
>>
>>
>>
>>
>>
>>
>>  On 10/03/2015 12:02 AM, Eric Bollengier wrote:
>>   > Hello Stephen,
>>   >
>>   > On 10/03/2015 12:00 AM, Stephen Thompson wrote:
>>   >>
>>   >>
>>   >> All,
>>   >>
>>   >> I believe I'm having mysql database issues since
>> upgrading to
>>  7.2 (from
>>   >> 7.0.2).  I run mysql innodb with 900Gb database that's
>> largely
>>  the File
>>   >> table.
>>   >
>>   > For large catalog, we usually advise to use PostgreSQL
>> where we have
>>   > multi-terabytes databases in production.
>>   >
>>   >> Since upgrading, I lose a few jobs a night due to
>> database locking
>>   >> timeouts, which I have set to 3600.  I also log slow
>> queries.
>>   >
>>   > Can you get some information about these locks? On which
>> table?
>>  Can you
>>   > give some statistics on your catalog like the size and
>> the number of
>>   > records of the File, Filename and Path table? Would be
>> nice also
>>  if you
>>   > can give the number of Filename per Client (from the job
>> table).
>>   >
>>   > Y

Re: [Bacula-users] 7.2 mysql issue?

2015-10-07 Thread Stephen Thompson


Thanks for the help.  Though, this is giving me a syntax error.

ERROR 1064 (42000): You have an error in your SQL syntax; check the 
manual that corresponds to your MySQL server version for the right 
syntax to use near '​​select Client.Name, count(distinct 
Filename.FilenameId) from Client, Filen' at line 1

On 10/7/15 6:23 AM, Ana Emília M. Arruda wrote:
> ​​select Client.Name, count(distinct Filename.FilenameId) from Client,
> Filename, File, Job where Filename.FilenameId=File.FilenameId and
> File.JobId=Job.JobId and Job.ClientId=Client.ClientId group by
> Client.ClientId;
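
The error text complains right at the start of the query, which 
suggests invisible characters (zero-width spaces) came along with the 
copy/paste from the mail client; re-typing the query by hand avoids 
that.  Ana's query again, clean:

select Client.Name, count(distinct Filename.FilenameId)
  from Client, Filename, File, Job
 where Filename.FilenameId=File.FilenameId
   and File.JobId=Job.JobId
   and Job.ClientId=Client.ClientId
 group by Client.ClientId;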

-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall #4760
Office: 510.664.9177   University of California, Berkeley
Remote: 510.214.6506 (Tue,Wed) Berkeley, CA 94720-4760

--
Full-scale, agent-less Infrastructure Monitoring from a single dashboard
Integrate with 40+ ManageEngine ITSM Solutions for complete visibility
Physical-Virtual-Cloud Infrastructure monitoring from one console
Real user monitoring with APM Insights and performance trend reports 
Learn More http://pubads.g.doubleclick.net/gampad/clk?id=247754911=/4140
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] 7.2 mysql issue?

2015-10-05 Thread Stephen Thompson

mysql> show indexes from File;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| File  |          0 | PRIMARY  |            1 | FileId      | A         |  4494348205 |     NULL | NULL   |      | BTREE      |         |
| File  |          1 | JobId    |            1 | JobId       | A         |          19 |     NULL | NULL   |      | BTREE      |         |
| File  |          1 | JobId    |            2 | PathId      | A         |   408577109 |     NULL | NULL   |      | BTREE      |         |
| File  |          1 | JobId    |            3 | FilenameId  | A         |  4494348205 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+




On 10/05/2015 10:30 AM, Stephen Thompson wrote:
>
>
> Phil,
>
> Good question.  I vaguely recollect doing that a few years back, but I
> don't immediately see any additional indexing.  Where can I reference
> what the default indexes are supposed to be?
>
> thanks,
> Stephen
>
>
>
> On 10/05/2015 10:28 AM, Phil Stracchino wrote:
>> On 10/05/15 13:17, Stephen Thompson wrote:
>>> At any rate, the only thing that's changed was the upgrade from 7.0.5 to
>>> 7.2.0.  The table involved is definitely the File table.  We do have
>>> jobs with 20-30 million records, so those jobs can be slow when it comes
>>> time for attribute insertion into the database (or to read out a file
>>> list for Accurate backups).  This why we've historically had innodb lock
>>> timeout of 3600.  However, it's only last week after the upgrade that
>>> we've ever had queries extend beyond that hour mark.
>>
>> Stephen,
>> Just as a thought, there have been a number of threads on this mailing
>> list recommending additional or modified indexes on the File table.
>> Have you added the suggested additional indexes?
>>
>>
>

-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
Office: 510.664.9177   University of California, Berkeley
Remote: 510.214.6506 (Tue,Wed) Berkeley, CA 94720-4760

--
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] 7.2 mysql issue?

2015-10-05 Thread Stephen Thompson


Never mind my question concerning the Snapshot table.  I see what happened 
there.

On 10/05/2015 10:17 AM, Stephen Thompson wrote:
>
> Eric,
>
> Thanks for the reply.
>
> I've heard the postgres recommendation a fair number of times.  A couple
> years back, we setup a parallel instance but even after tuning still
> wound up with _worse_ performance than with mysql.  I could not figure
> out what to attribute this to (because it was in such contrast to all
> the pro-postgres recommendations) except possibly our memory-poor server
> - 8Gb RAM.
>
> At any rate, the only thing that's changed was the upgrade from 7.0.5 to
> 7.2.0.  The table involved is definitely the File table.  We do have
> jobs with 20-30 million records, so those jobs can be slow when it comes
> time for attribute insertion into the database (or to read out a file
> list for Accurate backups).  This why we've historically had innodb lock
> timeout of 3600.  However, it's only last week after the upgrade that
> we've ever had queries extend beyond that hour mark.
>
> We also went through a database cleaning process last month due to
> nearly reaching 1Tb and I can pretty authoritatively claim that we don't
> have orphan records.  The database content and schema all appear to be
> appropriate.  I was worried that queries had been rewritten that made it
> more efficient for other databases, but less so for mysql.
>
>
> More info...
>
> example from slow query logfile:
> # Time: 151001  1:28:14
> # User@Host: bacula[bacula] @ localhost []
> # Query_time: 3675.052083  Lock_time: 73.719795 Rows_sent: 0
> Rows_examined: 3
> SET timestamp=1443688094;
> INSERT INTO File (FileIndex, JobId, PathId, FilenameId, LStat, MD5,
> DeltaSeq) SELECT batch.FileIndex, batch.JobId, Path.PathId,
> Filename.FilenameId,batch.LStat, batch.MD5, batch.DeltaSeq FROM batch
> JOIN Path ON (batch.Path = Path.Path) JOIN Filename ON (batch.Name =
> Filename.Name);
>
> mysqld:
> mysql-5.1.73-5.el6_6.x86_64
>
> record counts per table:
> File  4,315,675,600
> Filename  154,748,787
> Path  28,534,411
>
> innodb file sizes:
> 847708500 File.ibd
> 19488772  Filename.ibd
> 8216580   Path.ibd
> 106500PathHierarchy.ibd
> 57344 JobMedia.ibd
> 40960 PathVisibility.ibd
> 27648 Job.ibd
> 512   Media.ibd
> 176   FileSet.ibd
> 144   JobHisto.ibd
> 144   Client.ibd
> 112   RestoreObject.ibd
> 112   Pool.ibd
> 112   Log.ibd
> 112   BaseFiles.ibd
> 96Version.ibd
> 96UnsavedFiles.ibd
> 96Storage.ibd
> 96Status.ibd
> 96MediaType.ibd
> 96LocationLog.ibd
> 96Location.ibd
> 96Device.ibd
> 96Counters.ibd
> 96CDImages.ibd
> 4 Snapshot.MYI
> 0 Snapshot.MYD
>
>
>
> Not related, but I just noticed that somehow the new Snapshot table is
> MyISAM format.  How did that happen?
>
> Regarding:
>   > Would be nice also if you can give the number of Filename per Client
> (from the job table).
>
> Do you have a sample SQL to retrieve this stat?
>
>
> thanks,
> Stephen
>
>
>
>
>
>
>
> On 10/03/2015 12:02 AM, Eric Bollengier wrote:
>> Hello Stephen,
>>
>> On 10/03/2015 12:00 AM, Stephen Thompson wrote:
>>>
>>>
>>> All,
>>>
>>> I believe I'm having mysql database issues since upgrading to 7.2 (from
>>> 7.0.2).  I run mysql innodb with 900Gb database that's largely the File
>>> table.
>>
>> For large catalog, we usually advise to use PostgreSQL where we have
>> multi-terabytes databases in production.
>>
>>> Since upgrading, I lose a few jobs a night due to database locking
>>> timeouts, which I have set to 3600.  I also log slow queries.
>>
>> Can you get some information about these locks? On which table? Can you
>> give some statistics on your catalog like the size and the number of
>> records of the File, Filename and Path table? Would be nice also if you
>> can give the number of Filename per Client (from the job table).
>>
>> You might have many orphan Filenames, and MySQL is not always very good
>> to join large tables (it uses nested loops, and cannot use the index on
>> the Text column in all queries).
>>
>>> It appears that typically during a months I have about 90-100 queries
>>> that take longer than 15 minutes to run.  Already this month (upgraded
>>> earlier this week), I have 32 queries that take longer than 15 minutes.

Re: [Bacula-users] 7.2 mysql issue?

2015-10-05 Thread Stephen Thompson


Phil,

Good question.  I vaguely recollect doing that a few years back, but I 
don't immediately see any additional indexing.  Where can I reference 
what the default indexes are supposed to be?

thanks,
Stephen
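
One way to check: the stock schema is defined in the catalog make 
script shipped with the source; a sketch, assuming a source checkout 
(path per the usual source layout) and a catalog database named bacula:

# stock index definitions as shipped
grep -in "INDEX" src/cats/make_mysql_tables.in
# what the live catalog actually has
mysql -u bacula bacula -e "SHOW INDEX FROM File;"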



On 10/05/2015 10:28 AM, Phil Stracchino wrote:
> On 10/05/15 13:17, Stephen Thompson wrote:
>> At any rate, the only thing that's changed was the upgrade from 7.0.5 to
>> 7.2.0.  The table involved is definitely the File table.  We do have
>> jobs with 20-30 million records, so those jobs can be slow when it comes
>> time for attribute insertion into the database (or to read out a file
>> list for Accurate backups).  This why we've historically had innodb lock
>> timeout of 3600.  However, it's only last week after the upgrade that
>> we've ever had queries extend beyond that hour mark.
>
> Stephen,
> Just as a thought, there have been a number of threads on this mailing
> list recommending additional or modified indexes on the File table.
> Have you added the suggested additional indexes?
>
>

-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
Office: 510.664.9177   University of California, Berkeley
Remote: 510.214.6506 (Tue,Wed) Berkeley, CA 94720-4760

--
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] 7.2 mysql issue?

2015-10-05 Thread Stephen Thompson

Eric,

Thanks for the reply.

I've heard the postgres recommendation a fair number of times.  A couple 
of years back, we set up a parallel instance but even after tuning still 
wound up with _worse_ performance than with mysql.  I could not figure 
out what to attribute this to (because it was in such contrast to all 
the pro-postgres recommendations) except possibly our memory-poor server 
- 8Gb RAM.

At any rate, the only thing that's changed was the upgrade from 7.0.5 to 
7.2.0.  The table involved is definitely the File table.  We do have 
jobs with 20-30 million records, so those jobs can be slow when it comes 
time for attribute insertion into the database (or to read out a file 
list for Accurate backups).  This is why we've historically had an innodb 
lock timeout of 3600.  However, it's only since last week, after the 
upgrade, that we've ever had queries extend beyond that hour mark.

We also went through a database cleaning process last month due to 
nearly reaching 1Tb and I can pretty authoritatively claim that we don't 
have orphan records.  The database content and schema all appear to be 
appropriate.  I was worried that queries had been rewritten in ways that 
made them more efficient for other databases, but less so for mysql.


More info...

example from slow query logfile:
# Time: 151001  1:28:14
# User@Host: bacula[bacula] @ localhost []
# Query_time: 3675.052083  Lock_time: 73.719795 Rows_sent: 0 
Rows_examined: 3
SET timestamp=1443688094;
INSERT INTO File (FileIndex, JobId, PathId, FilenameId, LStat, MD5, 
DeltaSeq) SELECT batch.FileIndex, batch.JobId, Path.PathId, 
Filename.FilenameId,batch.LStat, batch.MD5, batch.DeltaSeq FROM batch 
JOIN Path ON (batch.Path = Path.Path) JOIN Filename ON (batch.Name = 
Filename.Name);

mysqld:
mysql-5.1.73-5.el6_6.x86_64

record counts per table:
File       4,315,675,600
Filename     154,748,787
Path          28,534,411

innodb file sizes:
847708500   File.ibd
19488772Filename.ibd
8216580 Path.ibd
106500  PathHierarchy.ibd
57344   JobMedia.ibd
40960   PathVisibility.ibd
27648   Job.ibd
512 Media.ibd
176 FileSet.ibd
144 JobHisto.ibd
144 Client.ibd
112 RestoreObject.ibd
112 Pool.ibd
112 Log.ibd
112 BaseFiles.ibd
96  Version.ibd
96  UnsavedFiles.ibd
96  Storage.ibd
96  Status.ibd
96  MediaType.ibd
96  LocationLog.ibd
96  Location.ibd
96  Device.ibd
96  Counters.ibd
96  CDImages.ibd
4   Snapshot.MYI
0   Snapshot.MYD
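
The counts and sizes above can also be pulled in one query from 
information_schema; a sketch, assuming the catalog database is named 
bacula (note that table_rows is only an estimate for InnoDB):

SELECT table_name, table_rows,
       ROUND((data_length + index_length)/1024/1024) AS size_mb
  FROM information_schema.tables
 WHERE table_schema = 'bacula'
 ORDER BY data_length DESC;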



Not related, but I just noticed that somehow the new Snapshot table is 
MyISAM format.  How did that happen?

Regarding:
 > Would be nice also if you can give the number of Filename per Client 
(from the job table).

Do you have a sample SQL to retrieve this stat?


thanks,
Stephen







On 10/03/2015 12:02 AM, Eric Bollengier wrote:
> Hello Stephen,
>
> On 10/03/2015 12:00 AM, Stephen Thompson wrote:
>>
>>
>> All,
>>
>> I believe I'm having mysql database issues since upgrading to 7.2 (from
>> 7.0.2).  I run mysql innodb with 900Gb database that's largely the File
>> table.
>
> For large catalog, we usually advise to use PostgreSQL where we have
> multi-terabytes databases in production.
>
>> Since upgrading, I lose a few jobs a night due to database locking
>> timeouts, which I have set to 3600.  I also log slow queries.
>
> Can you get some information about these locks? On which table? Can you
> give some statistics on your catalog like the size and the number of
> records of the File, Filename and Path table? Would be nice also if you
> can give the number of Filename per Client (from the job table).
>
> You might have many orphan Filenames, and MySQL is not always very good
> to join large tables (it uses nested loops, and cannot use the index on
> the Text column in all queries).
>
>> It appears that typically during a months I have about 90-100 queries
>> that take longer than 15 minutes to run.  Already this month (upgraded
>> earlier this week), I have 32 queries that take longer than 15 minutes.
>>At this rate (after 2 days) that will up my regular average of 90-100
>> to 480!
>>
>> Something is wrong and the coincidence is pretty strong that it's
>> related to the upgrade.
>
> Maybe, but I'm not sure, we did not change a lot of thing in this area,
> we did mostly refactoring.
>
> Best Regards,
> Eric
>

-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
Office: 510.664.9177   University of California, Berkeley
Remote: 510.214.6506 (Tue,Wed) Berkeley, CA 94720-4760

--

[Bacula-users] 7.2 mysql issue?

2015-10-02 Thread Stephen Thompson


All,

I believe I'm having mysql database issues since upgrading to 7.2 (from 
7.0.2).  I run mysql innodb with a 900Gb database that's largely the File 
table.

Since upgrading, I lose a few jobs a night due to database locking 
timeouts, which I have set to 3600.  I also log slow queries.
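
For context, those two statements correspond to my.cnf settings along 
these lines (a sketch; the log file path is an assumption):

# my.cnf excerpt
innodb_lock_wait_timeout = 3600
slow_query_log           = 1
long_query_time          = 900    # i.e. 15 minutes
slow_query_log_file      = /var/log/mysql-slow.log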

It appears that during a typical month I have about 90-100 queries 
that take longer than 15 minutes to run.  Already this month (upgraded 
earlier this week), I have 32 queries that take longer than 15 minutes. 
At this rate (after 2 days) that will push my regular average of 90-100 
up to 480!

Something is wrong and the coincidence is pretty strong that it's 
related to the upgrade.

Ideas?

thanks,
Stephen



On 09/25/2015 09:02 AM, Stephen Thompson wrote:
>
>
> So far so good.  Minor snafu on my part when updating database, but I'm
> running 7.2 now.  Looking good so far.  Will find out more when hundreds
> of jobs run tonight.
>
> Stephen
>
>
>
> On 09/24/2015 08:40 AM, Stephen Thompson wrote:
>>
>> All,
>>
>> I typically patch bacula pretty frequently, but I saw the somewhat
>> unusual notice on the latest release notes that warns it may not be
>> ready for use in production.  How stable is it?  I don't really have the
>> resources to test this out, but rather would have to go straight to
>> production with it.  I could always roll back, but that might entail the
>> recovery from dump of a 900GB database.  Opinions?
>>
>> thanks,
>> Stephen
>>
>

-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
Office: 510.664.9177   University of California, Berkeley
Remote: 510.214.6506 (Tue,Wed) Berkeley, CA 94720-4760

--
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] is 7.2 ready for prime time?

2015-09-25 Thread Stephen Thompson

Help?

Well, the compile and install went fine, but the update tables script is 
having issue.

I was running 7.0.5 before.  Not sure what database version, but likely 
whatever was appropriate to 7.0.5.


First time I ran script as su'ed user which caused this...
--
Altering mysql tables

This script will update a Bacula MySQL database from version 12 to 14
  which is needed to convert from Bacula Community version 5.0.x to 5.2.x

ERROR 1045 (28000): Access denied for user 'stephen'@'localhost' (using 
password: YES)
/home/bacula/conf/update_mysql_tables: line 31: [: !=: unary operator 
expected
ERROR 1045 (28000): Access denied for user 'stephen'@'localhost' (using 
password: YES)
Update of Bacula MySQL tables failed.
--

Because of the access-denied errors, I assumed the script had failed 
entirely, but then I ran it again as the proper user...

Second time...
---
./update_bacula_tables
Altering mysql tables

This script will update a Bacula MySQL database from version 12 to 14
  which is needed to convert from Bacula Community version 5.0.x to 5.2.x

/home/bacula/conf/update_mysql_tables: line 31: [: too many arguments
ERROR 1050 (42S01) at line 1: Table 'RestoreObject' already exists
ERROR 1061 (42000) at line 17: Duplicate key name 'jobhisto_jobid_idx'
ERROR 1060 (42S21) at line 19: Duplicate column name 'DeltaSeq'
Update of Bacula MySQL tables succeeded.
--

Seems like it either partially ran before or I had changes already 
present from 7.0.5 update.

However, my Director will not start due to database version number not 
being 15, and if I run the script any more times...
--
Altering mysql tables

This script will update a Bacula MySQL database from version 12 to 14
  which is needed to convert from Bacula Community version 5.0.x to 5.2.x


The existing database is version 14 !!
This script can only update an existing version 12 database to version 14.
Error. Cannot upgrade this database.
--


If it updated the database to 14, why is it not able to update to 15 if 
that's what the Director requires?
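
For what it's worth, the version the Director checks is just a row in
the catalog, so something like this (assuming the database is named
bacula) shows the current value:

  mysql -u bacula bacula -e 'SELECT VersionId FROM Version;'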


thanks!
Stephen





On 09/24/2015 11:21 AM, Kern Sibbald wrote:
> Hello,
>
> We put a caution message in every release, particularly for new features
> which are generally tested but not always tested in production. Normally
> most of the issues turn up for non-Linux distributions where we either
> have not tested or have tested less than Linux.
>
> Version 7.2.0 is as stable or more so than any prior major release. That
> said, there are always a few minor problems for each release and this
> one is no different.  All the important problems (build issues on
> Solaris and FreeBSD) have been corrected in the public git repository.
>
> Best regards,
> Kern
>
> On 15-09-24 11:40 AM, Stephen Thompson wrote:
>> All,
>>
>> I typically patch bacula pretty frequently, but I saw the somewhat
>> unusual notice on the latest release notes that warns it may not be
>> ready for use in production.  How stable is it?  I don't really have the
>> resources to test this out, but rather would have to go straight to
>> production with it.  I could always roll back, but that might entail the
>> recovery from dump of a 900GB database.  Opinions?
>>
>> thanks,
>> Stephen

-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
Office: 510.664.9177   University of California, Berkeley
Remote: 510.214.6506 (Tue,Wed) Berkeley, CA 94720-4760

--
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] is 7.2 ready for prime time?

2015-09-25 Thread Stephen Thompson



I run daily backups of my database and had finished my monthly full run 
for September, so I was technically covered.  However, I was not looking 
forward to restoring a 900+GB mysql database from a text dump, which on 
my system would take days, if not an entire week.  The last time I had 
to restore the database from backup was 4 or so years ago, and my 
database was only 300-400GB back then.

Stephen



On 09/25/2015 08:50 AM, Raymond Burns Jr. wrote:
> Did you run a backup of the database?
> If not, I bet you were terrified with all the errors :)
> Same thing happened to me going to 7.0.5, and it sent me into a frenzy. I
> didn't run a backup of the database because of all the great responses
> from people.
>
> When is the 7.2.0 rpm expected? Not running update until the rpm is there.
>
> On Fri, Sep 25, 2015 at 10:43 AM Stephen Thompson
> <step...@seismo.berkeley.edu> wrote:
>
>
>
> Spoke too soon, I see what's going on, I was running update script from
> new location (7.2.0) and it's referencing old location (7.0.5) and
> running the wrong mysql script.
>
>
>
> On 09/25/2015 08:34 AM, Stephen Thompson wrote:
>  >
>  > Help?
>  >
>  > Well, the compile and install went fine, but the update tables
> script is
>  > having issue.
>  >
>  > I was running 7.0.5 before.  Not sure what database version, but
> likely
>  > whatever was appropriate to 7.0.5.
>  >
>  >
>  > First time I ran script as su'ed user which caused this...
>  > --
>  > Altering mysql tables
>  >
>  > This script will update a Bacula MySQL database from version 12 to 14
>  >which is needed to convert from Bacula Community version 5.0.x
> to 5.2.x
>  >
>  > ERROR 1045 (28000): Access denied for user 'stephen'@'localhost'
> (using
>  > password: YES)
>  > /home/bacula/conf/update_mysql_tables: line 31: [: !=: unary operator
>  > expected
>  > ERROR 1045 (28000): Access denied for user 'stephen'@'localhost'
> (using
>  > password: YES)
>  > Update of Bacula MySQL tables failed.
>  > --
>  >
>  > Because of the access-denied errors, I assumed the script had failed
>  > entirely, but then I ran it again as the proper user...
>  >
>  > Second time...
>  > ---
>  > ./update_bacula_tables
>  > Altering mysql tables
>  >
>  > This script will update a Bacula MySQL database from version 12 to 14
>  >which is needed to convert from Bacula Community version 5.0.x
> to 5.2.x
>  >
>  > /home/bacula/conf/update_mysql_tables: line 31: [: too many arguments
>  > ERROR 1050 (42S01) at line 1: Table 'RestoreObject' already exists
>  > ERROR 1061 (42000) at line 17: Duplicate key name
> 'jobhisto_jobid_idx'
>  > ERROR 1060 (42S21) at line 19: Duplicate column name 'DeltaSeq'
>  > Update of Bacula MySQL tables succeeded.
>  > --
>  >
>  > Seems like it either partially ran before or I had changes already
>  > present from 7.0.5 update.
>  >
>  > However, my Director will not start due to database version
> number not
>  > being 15, and if I run the script any more times...
>  > --
>  > Altering mysql tables
>  >
>  > This script will update a Bacula MySQL database from version 12 to 14
>  >which is needed to convert from Bacula Community version 5.0.x
> to 5.2.x
>  >
>  >
>  > The existing database is version 14 !!
>  > This script can only update an existing version 12 database to
> version 14.
>  > Error. Cannot upgrade this database.
>  > --
>  >
>  >
>  > If it updated the database to 14, why is it not able to update to
> 15 if
>  > that's what the Director requires?
>  >
>  >
>  > thanks!
>  > Stephen
>  >
>  >
>  >
>  >
>  >
>  > On 09/24/2015 11:21 AM, Kern Sibbald wrote:
>  >> Hello,
>  >>
>  >> We put a caution message in every release, particularly for new
> features
>  >> which are generally tested but not always tested in production.
> Normally
>  >> most of the issues turn up for non-Linux distributions where we
> either
>  >> have not tested or have tested less than Linux.
>

Re: [Bacula-users] is 7.2 ready for prime time?

2015-09-25 Thread Stephen Thompson


Spoke too soon, I see what's going on, I was running update script from 
new location (7.2.0) and it's referencing old location (7.0.5) and 
running the wrong mysql script.
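
For anyone who trips over the same thing: update_bacula_tables is just a
wrapper that calls the per-database script, so a quick sanity check is
to grep the copy you are about to run and make sure it references the
new install tree (the path below is only an example from my layout,
adjust to yours):

  grep -n update_mysql_tables /opt/src/bacula/bacula-7.2.0/scripts/update_bacula_tables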



On 09/25/2015 08:34 AM, Stephen Thompson wrote:
>
> Help?
>
> Well, the compile and install went fine, but the update tables script is
> having issue.
>
> I was running 7.0.5 before.  Not sure what database version, but likely
> whatever was appropriate to 7.0.5.
>
>
> First time I ran script as su'ed user which caused this...
> --
> Altering mysql tables
>
> This script will update a Bacula MySQL database from version 12 to 14
>which is needed to convert from Bacula Community version 5.0.x to 5.2.x
>
> ERROR 1045 (28000): Access denied for user 'stephen'@'localhost' (using
> password: YES)
> /home/bacula/conf/update_mysql_tables: line 31: [: !=: unary operator
> expected
> ERROR 1045 (28000): Access denied for user 'stephen'@'localhost' (using
> password: YES)
> Update of Bacula MySQL tables failed.
> --
>
> Because of the access-denied errors, I assumed the script had failed
> entirely, but then I ran it again as the proper user...
>
> Second time...
> ---
> ./update_bacula_tables
> Altering mysql tables
>
> This script will update a Bacula MySQL database from version 12 to 14
>which is needed to convert from Bacula Community version 5.0.x to 5.2.x
>
> /home/bacula/conf/update_mysql_tables: line 31: [: too many arguments
> ERROR 1050 (42S01) at line 1: Table 'RestoreObject' already exists
> ERROR 1061 (42000) at line 17: Duplicate key name 'jobhisto_jobid_idx'
> ERROR 1060 (42S21) at line 19: Duplicate column name 'DeltaSeq'
> Update of Bacula MySQL tables succeeded.
> --
>
> Seems like it either partially ran before or I had changes already
> present from 7.0.5 update.
>
> However, my Director will not start due to database version number not
> being 15, and if I run the script any more times...
> --
> Altering mysql tables
>
> This script will update a Bacula MySQL database from version 12 to 14
>which is needed to convert from Bacula Community version 5.0.x to 5.2.x
>
>
> The existing database is version 14 !!
> This script can only update an existing version 12 database to version 14.
> Error. Cannot upgrade this database.
> --
>
>
> If it updated the database to 14, why is it not able to update to 15 if
> that's what the Director requires?
>
>
> thanks!
> Stephen
>
>
>
>
>
> On 09/24/2015 11:21 AM, Kern Sibbald wrote:
>> Hello,
>>
>> We put a caution message in every release, particularly for new features
>> which are generally tested but not always tested in production. Normally
>> most of the issues turn up for non-Linux distributions where we either
>> have not tested or have tested less than Linux.
>>
>> Version 7.2.0 is as stable or more so than any prior major release. That
>> said, there are always a few minor problems for each release and this
>> one is no different.  All the important problems (build issues on
>> Solaris and FreeBSD) have been corrected in the public git repository.
>>
>> Best regards,
>> Kern
>>
>> On 15-09-24 11:40 AM, Stephen Thompson wrote:
>>> All,
>>>
>>> I typically patch bacula pretty frequently, but I saw the somewhat
>>> unusual notice on the latest release notes that warns it may not be
>>> ready for use in production.  How stable is it?  I don't really have the
>>> resources to test this out, but rather would have to go straight to
>>> production with it.  I could always roll back, but that might entail the
>>> recovery from dump of a 900GB database.  Opinions?
>>>
>>> thanks,
>>> Stephen
>

-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
Office: 510.664.9177   University of California, Berkeley
Remote: 510.214.6506 (Tue,Wed) Berkeley, CA 94720-4760

--
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] is 7.2 ready for prime time?

2015-09-25 Thread Stephen Thompson


So far so good.  Minor snafu on my part when updating database, but I'm 
running 7.2 now.  Looking good so far.  Will find out more when hundreds 
of jobs run tonight.

Stephen



On 09/24/2015 08:40 AM, Stephen Thompson wrote:
>
> All,
>
> I typically patch bacula pretty frequently, but I saw the somewhat
> unusual notice on the latest release notes that warns it may not be
> ready for use in production.  How stable is it?  I don't really have the
> resources to test this out, but rather would have to go straight to
> production with it.  I could always roll back, but that might entail the
> recovery from dump of a 900GB database.  Opinions?
>
> thanks,
> Stephen
>

-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
Office: 510.664.9177   University of California, Berkeley
Remote: 510.214.6506 (Tue,Wed) Berkeley, CA 94720-4760

--
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] is 7.2 ready for prime time?

2015-09-25 Thread Stephen Thompson

Thanks, I'll be upgrading soon.

What known bugs are in the update_bacula_tables scripts?

thanks,
Stephen


On 9/24/15 10:51 PM, Uwe Schuerkamp wrote:
> On Thu, Sep 24, 2015 at 08:40:05AM -0700, Stephen Thompson wrote:
>>
>> All,
>>
>> I typically patch bacula pretty frequently, but I saw the somewhat
>> unusual notice on the latest release notes that warns it may not be
>> ready for use in production.  How stable is it?  I don't really have the
>> resources to test this out, but rather would have to go straight to
>> production with it.  I could always roll back, but that might entail the
>> recovery from dump of a 900GB database.  Opinions?
>>
>
> I upgraded five bacula instances of varying size over the last four
> weeks or so, starting with the smallest (all were on 7.0.5 compiled
> from source on CentOS), no issues so far apart from the little bugs in
> the update_bacula_tables script.
>
> Cheers, Uwe
>

-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall #4760
Office: 510.664.9177   University of California, Berkeley
Remote: 510.214.6506 (Tue,Wed) Berkeley, CA 94720-4760

--
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


[Bacula-users] is 7.2 ready for prime time?

2015-09-24 Thread Stephen Thompson

All,

I typically patch bacula pretty frequently, but I saw the somewhat 
unusual notice on the latest release notes that warns it may not be 
ready for use in production.  How stable is it?  I don't really have the 
resources to test this out, but rather would have to go straight to 
production with it.  I could always roll back, but that might entail the 
recovery from dump of a 900GB database.  Opinions?

thanks,
Stephen
-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
Office: 510.664.9177   University of California, Berkeley
Remote: 510.214.6506 (Tue,Wed) Berkeley, CA 94720-4760

--
Monitor Your Dynamic Infrastructure at Any Scale With Datadog!
Get real-time metrics from all of your servers, apps and tools
in one place.
SourceForge users - Click here to start your Free Trial of Datadog now!
http://pubads.g.doubleclick.net/gampad/clk?id=241902991=/4140
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


[Bacula-users] bacula-fd.service systemd file?

2015-09-23 Thread Stephen Thompson

All,

I built bacula 7.2 on RHEL 7.1 but have no systemd file for bacula-fd. 
  Is there an example available?

I thought perhaps that building bacula would make one, as I have this at 
the end of my configure output:

systemd support: yes /etc/systemd/system

But I do not appear to see any systemd file example in the source tree. 
  Am I just not looking in the right place?  If one does not exist, does 
anyone have one that I could see?

thanks,
Stephen
-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall #4760
Office: 510.664.9177   University of California, Berkeley
Remote: 510.214.6506 (Tue,Wed) Berkeley, CA 94720-4760

--
Monitor Your Dynamic Infrastructure at Any Scale With Datadog!
Get real-time metrics from all of your servers, apps and tools
in one place.
SourceForge users - Click here to start your Free Trial of Datadog now!
http://pubads.g.doubleclick.net/gampad/clk?id=241902991=/4140
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] bacula-fd.service systemd file?

2015-09-23 Thread Stephen Thompson

Sorry, I don't know how I missed this before in the source tree...

./platforms/systemd/bacula-fd.service
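
For the archives, installing it is roughly this (run from the build
tree; double-check that the ExecStart path inside the unit matches your
--sbindir before enabling):

  cp platforms/systemd/bacula-fd.service /etc/systemd/system/
  systemctl daemon-reload
  systemctl enable bacula-fd
  systemctl start bacula-fd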


On 9/23/15 10:45 AM, Stephen Thompson wrote:
>
> All,
>
> I built bacula 7.2 on RHEL 7.1 but have no systemd file for bacula-fd.
>   Is there an example available?
>
> I thought perhaps that building bacula would make one, as I have this at
> the end of my configure output:
>
> systemd support: yes /etc/systemd/system
>
> But I do not appear to see any systemd file example in the source tree.
>   Am I just not looking in the right place?  If one does not exist, does
> anyone have one that I could see?
>
> thanks,
> Stephen

-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall #4760
Office: 510.664.9177   University of California, Berkeley
Remote: 510.214.6506 (Tue,Wed) Berkeley, CA 94720-4760

--
Monitor Your Dynamic Infrastructure at Any Scale With Datadog!
Get real-time metrics from all of your servers, apps and tools
in one place.
SourceForge users - Click here to start your Free Trial of Datadog now!
http://pubads.g.doubleclick.net/gampad/clk?id=241902991=/4140
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


[Bacula-users] Error: block.c:255 Write errors?

2014-09-05 Thread Stephen Thompson

Hello,

I sporadically get these types of alerts for one of my bacula tape 
libraries...

05-Sep 00:41 lawson-sd_L100_ JobId 389348: Error: block.c:255 Write 
error at 610:412 on device L100-Drive-0 (/dev/L100-Drive-0). 
ERR=Input/output error.

Am I correct in assuming that this was indeed a tape write error, but 
that bacula will attempt a second write of the same block of data, and, 
if that second attempt succeeds, proceed on and ultimately produce a 
successful job (one that can be restored without issue)?

In other words, should this error worry me if it doesn't happen often?
It does consistently happen -- with 100's of jobs a night, it probably 
happens 3-4 times a week.

thanks,
Stephen


--
Slashdot TV.  
Video for Nerds.  Stuff that matters.
http://tv.slashdot.org/
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Error: block.c:255 Write errors?

2014-09-05 Thread Stephen Thompson

Huh, maybe this is a misdiagnosis of the end of tape and a write error 
only in the sense that there is no tape left.

05-Sep 00:41 SD_L100_ JobId 389348: Error: block.c:255 Write error at 
610:412 on device L100-Drive-0 (/dev/L100-Drive-0). ERR=Input/output 
error.
05-Sep 00:41 SD_L100_ JobId 389348: Re-read of last block succeeded.
05-Sep 00:41 SD_L100_ JobId 389348: End of medium on Volume IM0161 
Bytes=1,090,307,051,520 Blocks=520,103 at 05-Sep-2014 00:41.


On 09/05/2014 09:42 AM, Stephen Thompson wrote:

 Hello,

 I sporadically get these types of alerts for one of my bacula tape
 libraries...

 05-Sep 00:41 lawson-sd_L100_ JobId 389348: Error: block.c:255 Write
 error at 610:412 on device L100-Drive-0 (/dev/L100-Drive-0).
 ERR=Input/output error.

 Am I correct in assuming that this was indeed a tape write error, but
 that bacula will attempt a second write of the same block of data, and,
 if that second attempt succeeds, proceed on and ultimately produce a
 successful job (one that can be restored without issue)?

 In other words, should this error worry me if it doesn't happen often?
 It does consistently happen -- with 100's of jobs a night, it probably
 happens 3-4 times a week.

 thanks,
 Stephen


 --
 Slashdot TV.
 Video for Nerds.  Stuff that matters.
 http://tv.slashdot.org/
 ___
 Bacula-users mailing list
 Bacula-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/bacula-users


--
Slashdot TV.  
Video for Nerds.  Stuff that matters.
http://tv.slashdot.org/
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


[Bacula-users] solaris sparc 7.0.5 clients crash

2014-08-19 Thread Stephen Thompson

Anyone with success in running a 7x client on Solaris 10 SPARC?
We've recently attempted to upgrade clients from 5x to 7.0.5 and it 
works fine on Solaris 10 x86, but on SPARC nothing but crashes once jobs 
are submitted.  SPARC clients build and run (without jobs) fine.

http://bugs.bacula.org/view.php?id=2094

thanks,
Stephen
-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
510.214.6506 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

--
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] solaris sparc 7.0.5 clients crash

2014-08-19 Thread Stephen Thompson


Additionally, we run with Accurate backups.  It looks like the crash may 
be occurring after the SD sends the list for accurate backups but before 
the client traverses the fileset.

Stephen


On 8/19/14 7:47 AM, Stephen Thompson wrote:

 Anyone with success in running a 7x client on Solaris 10 SPARC?
 We've recently attempted to upgrade clients from 5x to 7.0.5 and it
 works fine on Solaris 10 x86, but on SPARC nothing but crashes once jobs
 are submitted.  SPARC clients build and run (without jobs) fine.

 http://bugs.bacula.org/view.php?id=2094

 thanks,
 Stephen


-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
510.214.6506 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

--
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] solaris sparc 7.0.5 clients crash

2014-08-19 Thread Stephen Thompson
 --tag=CXX 
--mode=link /usr/sfw/bin/g++  -shared bpipe-fd.lo -o bpipe-fd.la -rpath 
/opt/bacula/lib -module -export-dynamic -avoid-version
/opt/src/bacula/bacula-7.0.5-CLIENT/libtool --silent --tag=CXX 
--mode=compile /usr/sfw/bin/g++   -fno-strict-aliasing -fno-exceptions 
-fno-rtti -g -O2 -Wall -fno-strict-aliasing -fno-exceptions -fno-rtti 
-I../.. -I../../filed -c test-plugin-fd.c
/opt/src/bacula/bacula-7.0.5-CLIENT/libtool --silent --tag=CXX 
--mode=link /usr/sfw/bin/g++  -shared test-plugin-fd.lo -o 
test-plugin-fd.la -rpath /opt/bacula/lib -module -export-dynamic 
-avoid-version
/opt/src/bacula/bacula-7.0.5-CLIENT/libtool --silent --tag=CXX 
--mode=compile /usr/sfw/bin/g++   -fno-strict-aliasing -fno-exceptions 
-fno-rtti -g -O2 -Wall -fno-strict-aliasing -fno-exceptions -fno-rtti 
-I../.. -I../../filed -c test-deltaseq-fd.c
/opt/src/bacula/bacula-7.0.5-CLIENT/libtool --silent --tag=CXX 
--mode=link /usr/sfw/bin/g++  -shared test-deltaseq-fd.lo -o 
test-deltaseq-fd.la -rpath /opt/bacula/lib -module -export-dynamic 
-avoid-version
==Entering directory /opt/src/bacula/bacula-7.0.5-CLIENT/manpages




On 8/19/14 8:31 AM, Heitor Faria wrote:
 Stephen,

  Sorry for insisting on this subject, but I saw that even with you using
  --enable-client-only, the configuration output said it would build the
  Director:

 client-only:yes
 build-dird:   yes
 build-stored:   yes

  Last night I compiled for Debian, and if the MYSQL_LIBS path wasn't
  correct, the director and file daemon were not built.
 Again: this is just a wild guess that could be tested. Sorry I don't
 have a Solaris here installed to test for you.

 It's a client-only build not linked against any database.

 env CC='/usr/sfw/bin/gcc' \
 env CXX='/usr/sfw/bin/g++' \
 env CFLAGS='-g -O2' \
 env CXXFLAGS='-g -O2' \
  ./configure \
  --prefix=$BHOME \
  --sbindir=$BHOME/bin \
  --sysconfdir=$BHOME/conf \
  --with-working-dir=$BHOME/work \
  --with-bsrdir=$BHOME/log \
  --with-logdir=$BHOME/log \
  --with-pid-dir=/var/run \
  --with-subsys-dir=/var/run \
  --with-basename=lawson \
  --with-hostname=lawson \
  --with-dump-email=$EMAIL \
  --enable-smartalloc \
  --enable-client-only \
  --with-openssl=no


 Same configure works fine with 5X source; been running with this for
 literally years, though many versions of 5X.  Same config works fine
 on Solaris 10 x86.

 Stephen



 On 8/19/14 8:08 AM, Heitor Faria wrote:


Anyone with success in running a 7x client on Solaris 10 SPARC?
We've recently attempted to upgrade clients from 5x to 7.0.5 and it
works fine on Solaris 10 x86, but on SPARC nothing but crashes once jobs
are submitted.  SPARC clients build and run (without jobs) fine.

  Couldn't authenticate at the link, but a wild hint here: did you change
  the /src/cats/Makefile to put the correct path to the database libs?

   
http://bugs.bacula.org/view.php?id=2094

thanks,
Stephen
--
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
510.214.6506 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

--
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


 --
 Stephen Thompson   Berkeley Seismological Laboratory
 step...@seismo.berkeley.edu215 McCone Hall # 4760
 510.214.6506 (phone)   University of California, Berkeley
 510.643.5811 (fax) Berkeley, CA 94720-4760




 --
 
 Heitor Medrado de Faria | Need Bacula training? 10% discount coupon code
 at Udemy: bacula-users
 https://www.udemy.com/bacula-backup-software/?couponCode=bacula-users
 +55 61 2021-8260
 +55 61 8268-4220
 Site: www.bacula.com.br

Re: [Bacula-users] solaris sparc 7.0.5 clients crash

2014-08-19 Thread Stephen Thompson


Ah.  I think that fixed it.  Thanks!
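
Concretely, that meant rebuilding the client with optimization off: the
same configure invocation as before, minus the -O2.  A sketch (keep
whatever other --prefix/--with options you normally pass):

  env CC='/usr/sfw/bin/gcc' \
  env CXX='/usr/sfw/bin/g++' \
  env CFLAGS='-g' \
  env CXXFLAGS='-g' \
   ./configure --enable-smartalloc --enable-client-only --with-openssl=no
  make && make install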


On 8/19/14 10:28 AM, Martin Simmons wrote:
 On Tue, 19 Aug 2014 07:47:39 -0700, Stephen Thompson said:

 Anyone with success in running a 7x client on Solaris 10 SPARC?
 We've recently attempted to upgrade clients from 5x to 7.0.5 and it
 works fine on Solaris 10 x86, but on SPARC nothing but crashes once jobs
 are submitted.  SPARC clients build and run (without jobs) fine.

 http://bugs.bacula.org/view.php?id=2094

 Maybe a compiler bug?  Try without -O2.

 __Martin

 --
 ___
 Bacula-users mailing list
 Bacula-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/bacula-users


-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
510.214.6506 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

--
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


[Bacula-users] 7.0.4 director crashes

2014-08-12 Thread Stephen Thompson

Hello,

We've had 3 director crashes since updating to 7.0.4, which is highly 
unusual for us.  We've had a stable bacula for years now.  Don't know if 
anyone else has had this issue.

We're running on Redhat 6.5 x86_64.

I have yet to get a trace.  For the first crash, I hadn't enabled sudo, 
and for the second and third crashes, I hadn't disabled sudo's 
requirement for a tty, so in all three cases btraceback was not able to 
run properly.  I believe I have this resolved in case it crashes again, 
but I thought I'd ping this list to see if anyone had thoughts.
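
In case it saves someone else a round of missed traces, the sudoers side
(edited via visudo) amounts to something like the following; 'bacula'
stands in for whichever user your daemons run as, and the btraceback
path depends on your --sbindir:

  Defaults:bacula !requiretty
  bacula ALL=(root) NOPASSWD: /opt/bacula/bin/btraceback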

thanks,
Stephen
-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
510.214.6506 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

--
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] 7.0.4 director crashes

2014-08-12 Thread Stephen Thompson

Thanks for the feedback.  We've been running on 7.0.4 since June 10th 
and have had 3 crashes.  Have 130+ clients with nightly incrementals and 
monthly fulls.

Stephen



On 8/12/14 10:07 AM, Francisco Rafael wrote:
 I'm using 7.0.5 with 40+ clients, no crash so far... CentOS 6.5 x64.



 2014-08-12 13:50 GMT-03:00 John Drescher dresche...@gmail.com:

   We've had 3 director crashes since updating to 7.0.4, which is highly
   unusual for us.  We've had a stable bacula for years now.  Don't
 know if
   anyone else has had this issue.
  

 I have not had any crashes on gentoo with 7.0.4 and 35+ clients.

 John

 
 --
 ___
 Bacula-users mailing list
 Bacula-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/bacula-users




 --



 ___
 Bacula-users mailing list
 Bacula-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/bacula-users


-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
510.214.6506 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

--
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] 7.0.4 director crashes

2014-08-12 Thread Stephen Thompson

My most recent crashes created lockdump files, but not my initial one.

Stephen


On 8/12/14 11:45 AM, Clark, Patricia A. wrote:
 I have 2 separate instances of Bacula v7.0.5 on RHEL6.5 x86_64.  One has had 
 the server FD segfault once.  The second instance has had the director 
 segfault twice now.  The server has large file systems mounted and is backing 
 these up.  It does not have any external clients at this time.  It is now 
 generating lockdump files in the spool area when this happens.  I have not 
 gone further into debugging as of yet since it has been only on the weekend.

 Patti Clark
 Linux System Administrator
 R&D Systems Support, Oak Ridge National Laboratory

 From: Stephen Thompson step...@seismo.berkeley.edu
 Date: Tuesday, August 12, 2014 at 1:20 PM
 To: bacula-users@lists.sourceforge.net
 Subject: Re: [Bacula-users] 7.0.4 director crashes


 Thanks for the feedback.  We've been running on 7.0.4 since June 10th
 and have had 3 crashes.  Have 130+ clients with nightly incrementals and
 monthly fulls.

 Stephen



 On 8/12/14 10:07 AM, Francisco Rafael wrote:
 I'm using 7.0.5 with 40+ clients, no crash so far... CentOS 6.5 x64.



  2014-08-12 13:50 GMT-03:00 John Drescher dresche...@gmail.com:

 We've had 3 director crashes since updating to 7.0.4, which is highly
 unusual for us.  We've had a stable bacula for years now.  Don't
   know if
 anyone else has had this issue.


   I have not had any crashes on gentoo with 7.0.4 and 35+ clients.

   John

   
 --
   ___
   Bacula-users mailing list
   
  Bacula-users@lists.sourceforge.net
    https://lists.sourceforge.net/lists/listinfo/bacula-users




 --



 ___
 Bacula-users mailing list
 Bacula-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/bacula-users


 --
 Stephen Thompson   Berkeley Seismological Laboratory
 step...@seismo.berkeley.edu215 McCone Hall # 4760
 510.214.6506 (phone)   University of California, Berkeley
 510.643.5811 (fax) Berkeley, CA 94720-4760

 --
 ___
 Bacula-users mailing list
 Bacula-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/bacula-users


 --
 ___
 Bacula-users mailing list
 Bacula-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/bacula-users


-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
510.214.6506 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

--
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] 7.0.4 director crashes

2014-08-12 Thread Stephen Thompson


Additionally, I see that my 'crashes' were segmentation violations.

Aug  6 02:20:04 HOST bacula-dir: Bacula interrupted by signal 11: 
Segmentation violation
Aug  7 03:40:05 HOST bacula-dir: Bacula interrupted by signal 11: 
Segmentation violation


Stephen



On 8/12/14 1:00 PM, Stephen Thompson wrote:

 My most recent crashes created lockdump files, but not my initial one.

 Stephen


 On 8/12/14 11:45 AM, Clark, Patricia A. wrote:
 I have 2 separate instances of Bacula v7.0.5 on RHEL6.5 x86_64.  One
 has had the server FD segfault once.  The second instance has had the
 director segfault twice now.  The server has large file systems
 mounted and is backing these up.  It does not have any external
 clients at this time.  It is now generating lockdump files in the
 spool area when this happens.  I have not gone further into debugging
 as of yet since it has been only on the weekend.

 Patti Clark
 Linux System Administrator
 R&D Systems Support, Oak Ridge National Laboratory

 From: Stephen Thompson step...@seismo.berkeley.edu
 Date: Tuesday, August 12, 2014 at 1:20 PM
 To: bacula-users@lists.sourceforge.net

 Subject: Re: [Bacula-users] 7.0.4 director crashes


 Thanks for the feedback.  We've been running on 7.0.4 since June 10th
 and have had 3 crashes.  Have 130+ clients with nightly incrementals and
 monthly fulls.

 Stephen



 On 8/12/14 10:07 AM, Francisco Rafael wrote:
 I'm using 7.0.5 with 40+ clients, no crash so far... CentOS 6.5 x64.



  2014-08-12 13:50 GMT-03:00 John Drescher dresche...@gmail.com:

 We've had 3 director crashes since updating to 7.0.4, which
 is highly
 unusual for us.  We've had a stable bacula for years now.  Don't
   know if
 anyone else has had this issue.


   I have not had any crashes on gentoo with 7.0.4 and 35+ clients.

   John


 --

   ___
   Bacula-users mailing list

  Bacula-users@lists.sourceforge.net
   https://lists.sourceforge.net/lists/listinfo/bacula-users




 --




 ___
 Bacula-users mailing list
  Bacula-users@lists.sourceforge.net

 https://lists.sourceforge.net/lists/listinfo/bacula-users


 --
 Stephen Thompson   Berkeley Seismological Laboratory
  step...@seismo.berkeley.edu215 McCone Hall # 4760
 510.214.6506 (phone)   University of California, Berkeley
 510.643.5811 (fax) Berkeley, CA 94720-4760

 --

 ___
 Bacula-users mailing list
  Bacula-users@lists.sourceforge.net

 https://lists.sourceforge.net/lists/listinfo/bacula-users


 --

 ___
 Bacula-users mailing list
 Bacula-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/bacula-users



-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
510.214.6506 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

--
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] issue with setuid/gid on restored files

2014-07-23 Thread Stephen Thompson


Redhat 6.5 x86_64

On 7/23/14 12:50 AM, Kern Sibbald wrote:
 Different Linux OSes have very different behaviors, which OS are you
 running (distribution and version)?

 On 07/23/2014 12:10 AM, Stephen Thompson wrote:
 I'm running 7.0.4.



 Here's an example...

 (before backup)
 # ls -ld /bin
 dr-xr-xr-x 2 root root 4096 Jul 22 09:56 /bin
 # ls -l /bin/ping
 -rwsr-xr-x 1 root root 40760 Sep 17  2013 /bin/ping

 (after restore selecting file /bin/ping)
 # ls -ld  /bin
 drwsr-xr-x 2 root root 4096 Jul 22 14:38 bin
 # ls -l /bin/ping
 -rwxr-xr-x 1 root root 40760 Sep 17  2013 ping

 (after restore selecting file /bin/ping and directory /bin)
 # ls -ld  /bin
 dr-xr-xr-x 2 root root 4096 Jul 22 14:38 bin
 # ls -l /bin/ping
 -rwxr-xr-x 1 root root 40760 Sep 17  2013 ping


 In the first restore case, looks like the dir has user-write permissions
 as well, which isn't right, but perhaps that comes from the umask of the
 restore since the directory wasn't part of the restore selection.
 However, the setuid bit certainly wouldn't be coming from the umask.
 I'm jumping to the conclusion that whatever's doing the setuid bit is
 messing up and doing it to the parent directory instead of to the file.

 Stephen





 On 7/22/14 2:58 PM, Stephen Thompson wrote:

 Sorry if I have not researched this enough before bringing it to the
 list, but what I'm seeing is very odd.  Someone else must have run into
 this before me.

 If I restore a setuid or setgid file, the file is restored without the
 setuid/setgid bit set.  However, the directory containing the file
 (which did not have it's setuid/setgid bit set during the backup) winds
 up with the setuid/setgid bit being set.

 If I restore both the directory and the file, the directory ends up with
 the proper non-setuid/setgid attributes, but the file once again ends
 up without the setuid/setgid bit set.  I'm assuming the directory has
 the bit set during an interim stage of the restore, but is then properly
  set when its attributes are set during the restore (which must happen
 after the files that it contains).

 I can't say authoritatively, but I don't believe this is the way bacula
 used to behave for me.  And to say the least, this is far from
 acceptable.  I discovered this during a bare metal restore, and have
 loads of issues from no setuid or setgid bits being set on the restored
 system.

 thanks,
 Stephen

-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
510.214.6506 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

--
Want fast and easy access to all the code in your enterprise? Index and
search up to 200,000 lines of code with a free copy of Black Duck
Code Sight - the same software that powers the world's largest code
search on Ohloh, the Black Duck Open Hub! Try it now.
http://p.sf.net/sfu/bds
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] issue with setuid/gid on restored files

2014-07-23 Thread Stephen Thompson

compiled from scratch.



On 7/23/14 8:02 AM, Simone Caronni wrote:
 On 23 July 2014 16:18, Kern Sibbald k...@sibbald.com wrote:

 On 07/23/2014 04:04 PM, Stephen Thompson wrote:
   Redhat 6.5 x86_64

 OK, that is a particularly tricky system as they have added additional
 system security which does not permit certain sequences of API calls
 even as root which other Linux OSes permit :-(  I.e. we test on the
 latest debian/ubuntu and the code works, but not on RHEL 6.x ...

 I will look at the code as I may have a patch that will help, but I
 don't remember it having to do with the setuid bit.

 I recommend that you submit a bug report on this, because if I get
 distracted this weekend, I might miss coming back to this problem.  With
 a bug report, it remains very visible until it is corrected.


 Stephen, can you please add me in CC to the bug?
 I'm the current Fedora Bacula maintainer.

 BTW, have you compiled Bacula from scratch or used backported packages [1]?

 Thanks,
 --Simone

 [1] http://repos.fedorapeople.org/repos/slaanesh/bacula7/

 --
 You cannot discover new oceans unless you have the courage to lose sight
 of the shore (R. W. Emerson).

 http://xkcd.com/229/
 http://negativo17.org/

-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
510.214.6506 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

--
Want fast and easy access to all the code in your enterprise? Index and
search up to 200,000 lines of code with a free copy of Black Duck
Code Sight - the same software that powers the world's largest code
search on Ohloh, the Black Duck Open Hub! Try it now.
http://p.sf.net/sfu/bds
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


[Bacula-users] issue with setuid/gid on restored files

2014-07-22 Thread Stephen Thompson


Sorry if I have not researched this enough before bringing it to the 
list, but what I'm seeing is very odd.  Someone else must have run into 
this before me.

If I restore a setuid or setgid file, the file is restored without the 
setuid/setgid bit set.  However, the directory containing the file 
(which did not have its setuid/setgid bit set during the backup) winds 
up with the setuid/setgid bit being set.

If I restore both the directory and the file, the directory ends up with 
the proper non-setuid/setgid attributes, but the file once again ends 
up without the setuid/setgid bit set.  I'm assuming the directory has 
the bit set during an interim stage of the restore, but is then properly 
set when its attributes are set during the restore (which must happen 
after the files that it contains).

I can't say authoritatively, but I don't believe this is the way bacula 
used to behave for me.  And to say the least, this is far from 
acceptable.  I discovered this during a bare metal restore, and have 
loads of issues from no setuid or setgid bits being set on the restored 
system.

thanks,
Stephen
-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
510.214.6506 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

--
Want fast and easy access to all the code in your enterprise? Index and
search up to 200,000 lines of code with a free copy of Black Duck
Code Sight - the same software that powers the world's largest code
search on Ohloh, the Black Duck Open Hub! Try it now.
http://p.sf.net/sfu/bds
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] issue with setuid/gid on restored files

2014-07-22 Thread Stephen Thompson

I'm running 7.0.4.



Here's an example...

(before backup)
# ls -ld /bin
dr-xr-xr-x 2 root root 4096 Jul 22 09:56 /bin
# ls -l /bin/ping
-rwsr-xr-x 1 root root 40760 Sep 17  2013 /bin/ping

(after restore selecting file /bin/ping)
# ls -ld  /bin
drwsr-xr-x 2 root root 4096 Jul 22 14:38 bin
# ls -l /bin/ping
-rwxr-xr-x 1 root root 40760 Sep 17  2013 ping

(after restore selecting file /bin/ping and directory /bin)
# ls -ld  /bin
dr-xr-xr-x 2 root root 4096 Jul 22 14:38 bin
# ls -l /bin/ping
-rwxr-xr-x 1 root root 40760 Sep 17  2013 ping


In the first restore case, looks like the dir has user-write permissions 
as well, which isn't right, but perhaps that comes from the umask of the 
restore since the directory wasn't part of the restore selection. 
However, the setuid bit certainly wouldn't be coming from the umask. 
I'm jumping to the conclusion that whatever's doing the setuid bit is 
messing up and doing it to the parent directory instead of to the file.

Stephen





On 7/22/14 2:58 PM, Stephen Thompson wrote:


 Sorry if I have not researched this enough before bringing it to the
 list, but what I'm seeing is very odd.  Someone else must have run into
 this before me.

 If I restore a setuid or setgid file, the file is restored without the
 setuid/setgid bit set.  However, the directory containing the file
  (which did not have its setuid/setgid bit set during the backup) winds
 up with the setuid/setgid bit being set.

 If I restore both the directory and the file, the directory ends up with
 the proper non-setuid/setgid attributes, but the file once again ends
 up without the setuid/setgid bit set.  I'm assuming the directory has
 the bit set during an interim stage of the restore, but is then properly
  set when its attributes are set during the restore (which must happen
 after the files that it contains).

 I can't say authoritatively, but I don't believe this is the way bacula
 used to behave for me.  And to say the least, this is far from
 acceptable.  I discovered this during a bare metal restore, and have
 loads of issues from no setuid or setgid bits being set on the restored
 system.

 thanks,
 Stephen

-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
510.214.6506 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

--
Want fast and easy access to all the code in your enterprise? Index and
search up to 200,000 lines of code with a free copy of Black Duck
Code Sight - the same software that powers the world's largest code
search on Ohloh, the Black Duck Open Hub! Try it now.
http://p.sf.net/sfu/bds
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Bug when canceling a job in bconsole on 7.0.2?

2014-06-10 Thread Stephen Thompson


ver 7.0.4 does not appear to have the canceling job issue I saw in 
7.0.2. yay! ...and thanks.



On 5/22/14 8:37 AM, Bill Arlofski wrote:
 On 05/22/14 11:28, Kern Sibbald wrote:
 Hello Bill,

 I have also pushed a patch that may well fix the problem you are
 having with cancel.  I have never been able to reproduce the problem,
 but I did yet another rewrite of the sellist routine as well as
 designed a number of tests, none of which every failed.  However, in
 the process I noticed that the source code that called the sellist
 methods was using the wrong calling sequence (my own fault).  I am
 pretty sure that is what was causing your problem.  In any case, this
 new code is in the current git public repo and I would appreciate it
 if you would test it.

 Best regards,
 Kern


 Hi Kern, I saw that you wrote the above as an add-on to another thread,
 I am posting it here so that this thread is complete too.

 I currently don't have time to test this, but perhaps Stephen who is
 also seeing this issue might.

 I will test it as soon as I have some free time, unless of course
 Stephen or someone else has confirmed that the patch fixes the issue.

 Thanks Kern!


 Bill


 --
 Bill Arlofski
 Reverse Polarity, LLC
 http://www.revpol.com/
 -- Not responsible for anything below this line --


-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
510.214.6506 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

--
HPCC Systems Open Source Big Data Platform from LexisNexis Risk Solutions
Find What Matters Most in Your Big Data with HPCC Systems
Open Source. Fast. Scalable. Simple. Ideal for Dirty Data.
Leverages Graph Analysis for Fast Processing  Easy Data Exploration
http://p.sf.net/sfu/hpccsystems
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] RESTORE PRUNED FILE (WITH CATALOG BACKUPS)

2014-05-29 Thread Stephen Thompson


If you have the flexibility to do this, the simplest way might be to 
restore the catalog from tape, shut down bacula, temporarily move your 
up-to-date database aside and put the restored database in its place 
(this likely means restoring the database from a dump file), and do your 
restore now that you have a version of the database with the purged 
files.  Then, once the restore is complete, shut down bacula and move 
your up-to-date database back into place.
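
As a concrete sketch of that dance (MySQL flavour; the database name,
dump paths, and the start/stop commands are placeholders for whatever
your site uses):

  # keep a safety copy of the current, up-to-date catalog
  mysqldump bacula > /tmp/catalog-current.sql
  # stop bacula and swap in the catalog restored from tape
  /etc/init.d/bacula stop
  mysql -e 'DROP DATABASE bacula; CREATE DATABASE bacula;'
  mysql bacula < /path/to/catalog-restored-from-tape.sql
  /etc/init.d/bacula start
  # run the file restore, then stop bacula and swap the real catalog back
  /etc/init.d/bacula stop
  mysql -e 'DROP DATABASE bacula; CREATE DATABASE bacula;'
  mysql bacula < /tmp/catalog-current.sql
  /etc/init.d/bacula start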


Stephen


On 5/29/14 6:49 AM, david parada wrote:
 Thanks John,

  I am not very confident with BSCAN. Can you show me an example of adding
  the files back to the catalog your way?

 Kind regards,


 David

 +--
 |This was sent by david.par...@techex.es via Backup Central.
 |Forward SPAM to ab...@backupcentral.com.
 +--



 --
 Time is money. Stop wasting it! Get your web API in 5 minutes.
 www.restlet.com/download
 http://p.sf.net/sfu/restlet
 ___
 Bacula-users mailing list
 Bacula-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/bacula-users


-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
510.214.6506 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

--
Time is money. Stop wasting it! Get your web API in 5 minutes.
www.restlet.com/download
http://p.sf.net/sfu/restlet
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] RESTORE PRUNED FILE (WITH CATALOG BACKUPS)

2014-05-29 Thread Stephen Thompson


I didn't mention this, but of course you would not want to run any 
other jobs (or really do anything with bacula at all!) while running the 
old database, beyond the restore of the files; otherwise those changes 
won't make it into the up-to-date database you ultimately run with.

On 5/29/14 7:21 AM, Stephen Thompson wrote:


  If you have the flexibility to do this, the simplest way might be to
  restore the catalog from tape, shut down bacula, temporarily move your
  up-to-date database aside and put the restored database in its place
  (this likely means restoring the database from a dump file), and do your
  restore now that you have a version of the database with the purged
  files.  Then, once the restore is complete, shut down bacula and move
  your up-to-date database back into place.


 Stephen


 On 5/29/14 6:49 AM, david parada wrote:
 Thanks John,

  I am not very confident with BSCAN. Can you show me an example of adding
  the files back to the catalog your way?

 Kind regards,


 David

 +--
 |This was sent by david.par...@techex.es via Backup Central.
 |Forward SPAM to ab...@backupcentral.com.
 +--



 --
 Time is money. Stop wasting it! Get your web API in 5 minutes.
 www.restlet.com/download
 http://p.sf.net/sfu/restlet
 ___
 Bacula-users mailing list
 Bacula-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/bacula-users



-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
510.214.6506 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

--
Time is money. Stop wasting it! Get your web API in 5 minutes.
www.restlet.com/download
http://p.sf.net/sfu/restlet
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Bug when canceling a job in bconsole on 7.0.2?

2014-05-23 Thread Stephen Thompson


I may be able to test at the end of the month.  Right now I have 
continuous jobs running that I'd rather not inadvertently cancel.

Stephen


On 5/22/14 8:37 AM, Bill Arlofski wrote:
 On 05/22/14 11:28, Kern Sibbald wrote:
 Hello Bill,

 I have also pushed a patch that may well fix the problem you are
 having with cancel.  I have never been able to reproduce the problem,
 but I did yet another rewrite of the sellist routine as well as
 designed a number of tests, none of which every failed.  However, in
 the process I noticed that the source code that called the sellist
 methods was using the wrong calling sequence (my own fault).  I am
 pretty sure that is what was causing your problem.  In any case, this
 new code is in the current git public repo and I would appreciate it
 if you would test it.

 Best regards,
 Kern


 Hi Kern, I saw that you wrote the above as an add-on to another thread,
 I am posting it here so that this thread is complete too.

 I currently don't have time to test this, but perhaps Stephen who is
 also seeing this issue might.

 I will test it as soon as I have some free time, unless of course
 Stephen or someone else has confirmed that the patch fixes the issue.

 Thanks Kern!


 Bill


 --
 Bill Arlofski
 Reverse Polarity, LLC
 http://www.revpol.com/
 -- Not responsible for anything below this line --


-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
510.214.6506 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

--
Accelerate Dev Cycles with Automated Cross-Browser Testing - For FREE
Instantly run your Selenium tests across 300+ browser/OS combos.
Get unparalleled scalability from the best Selenium testing platform available
Simple to use. Nothing to install. Get started now for free.
http://p.sf.net/sfu/SauceLabs
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Fatal error: askdir.c:340 NULL Volume name. This shouldn't happen!!!

2014-05-05 Thread Stephen Thompson


Hello,

I believe this bug is present in version 7.0.3.

I just had it happen last night, much like I saw about 2 years ago.  I 
run 100s of incrementals each night across 2 LTO tape drives, running 
with a concurrency limit, so that jobs start whenever others are 
finished (i.e. I cannot stagger their start times).  I'm assuming this 
is again a race condition, but one that, as an end-user, I really cannot 
work around.

So far the problem is not frequent, but does still appear to be an issue.

thanks,
Stephen




On 02/20/2014 09:30 AM, Kern Sibbald wrote:
 Hello Wolfgang,

 The drive is allocated first.  Your analysis is correct, but
 obviously something is wrong.  I don't think this is happening
 any more with the Enterprise version, so it will very likely
  be fixed in the next release as we will backport (or flowback) some
  rather massive changes we have made during the freeze to the community
  version.

 If you want to see what is going on a little more, turn on
 a debug level in the SD of about 100.  Likewise you can set a debug
 level in the SD of say 1 or 2, then when you do a status,
 if Bacula is having difficulties reserving a drive, it will print
 out more detailed information on what is going on -- this last
 is most effective if jobs end up waiting because a resource
 (drive or volume) is not available.

 Best regards,
 Kern

 On 02/17/2014 11:54 PM, Wolfgang Denk wrote:
 Dear Kern Sibbald,

 In message 5301db23.6010...@sibbald.com you wrote:
 Were you careful to change the actual volume retention period in
 the catalog entry for the volume?  That requires a manual step after
 changing the conf file.  You can check two ways:
 Yes, I was. list volumes shows the new retention period for all
 volumes.

 1. Look at the full output from all the jobs and see if any
 volumes were recycled while the batch of jobs ran.
 Not in this run, and not in any of the last 15 or so before that.

 2. Do a llist on all the volumes that were used during the
 period the problem happened and see if they were freshly
 recycled and that the retention period is set to your new
 value.
 retention period is as expected, no recycling happened.

 In any case, I will look over your previous emails to see if I see
 anything that could point to a problem, and I will look at the bug
 report, but without a test case, this is one of those nightmare
 bugs that take huge resources and time to fix.
 Hm... I wonder why the DIR allocates two pairs of (DRIVE, VOLUME) for
 two simultaneously running jobs, with each job using not the volume
 currently mounted in its drive, but the one in the other drive.  I
 would expect that when a job starts, either a volume or a drive is
 selected first:

 - if the drive is selected first, and it has a tape loaded which is in
   the right pool and in status append, then there should be no need
   to ask for any other tape.
 - if the volume is allocated first, and it is already loaded in a
   suitable drive, then that drive should be used, and not the other
   one.

 Best regards,

 Wolfgang Denk



 --
 Managing the Performance of Cloud-Based Applications
 Take advantage of what the Cloud has to offer - Avoid Common Pitfalls.
 Read the Whitepaper.
 http://pubads.g.doubleclick.net/gampad/clk?id=121054471iu=/4140/ostg.clktrk
 ___
 Bacula-users mailing list
 Bacula-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/bacula-users



-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu    215 McCone Hall # 4760
510.664.9177 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Bug when canceling a job in bconsole on 7.0.2?

2014-04-29 Thread Stephen Thompson


I believe I've seen this unwanted behaviour as well.  I cannot test, as 
at the moment I have a running job that I cannot risk accidentally 
canceling, but this past weekend I attempted to cancel a running 
Incremental job by number (as I have done successfully many times in 
the past), and somehow a different Full job that was also running at 
the time got canceled as well.

Stephen


On 4/28/14 7:15 PM, Bill Arlofski wrote:

 Whoops... Clicked send too soon.

 Just a follow-up.

 I went ahead and chose #1 in the list to see if it would cancel both jobs. It 
 did:

 *can
 Select Job(s):
   1: JobId=25775 Job=Helpdesk.2014-04-28_20.30.00_52
   2: JobId=25776 Job=Postbooks.2014-04-28_20.30.00_53
 Choose Job list to cancel (1-2): 1
 JobId=25775 Job=Helpdesk.2014-04-28_20.30.00_52
 JobId=25776 Job=Postbooks.2014-04-28_20.30.00_53
 Confirm cancel of 2 Jobs (yes/no): yes
 2001 Job Helpdesk.2014-04-28_20.30.00_52 marked to be canceled.
 3000 JobId=25775 Job=Helpdesk.2014-04-28_20.30.00_52 marked to be canceled.
 2001 Job Postbooks.2014-04-28_20.30.00_53 marked to be canceled.
 3000 JobId=25776 Job=Postbooks.2014-04-28_20.30.00_53 marked to be canceled.



-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu    215 McCone Hall # 4760
510.214.6506 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


[Bacula-users] bconsole 7.0.2 storage status issue

2014-04-16 Thread Stephen Thompson

Hello,

Wanting to confirm something new I'm seeing in 7.0.2 with bconsole.  I 
have multiple storage daemons with multiple devices.  It used to be 
(5.2.13) that a "status" and then "2: Storage" in bconsole would present 
a list of storage devices to query.  Now it immediately returns only the 
status of the first device I have configured for my Director.  A "mount" 
command, in comparison, presents me with what I am used to -- the 
list of devices to choose from.  Is this a feature?  A bug?
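
As a workaround sketch, naming the storage explicitly in the status 
command seems to sidestep the menu entirely -- the storage name here is 
illustrative, substitute one from your Director config:

   *status storage=SD1-changer

Querying the SD directly like this should report all of the devices 
configured on it.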

thanks,
Stephen
-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu    215 McCone Hall # 4760
510.214.6506 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] choosing database.

2013-09-19 Thread Stephen Thompson


The answer may partly come from how much RAM the system running the 
database has.  I've seen numerous preferences for postgres on this 
mailing list, but I've personally found that on my 8Gb RAM system I get 
better performance out of mysql.  We back up about 130+ hosts: 
incrementals nightly, differentials weekly, fulls monthly (~40TB).

Stephen


On 9/19/13 8:06 AM, Mauro wrote:
 Hello.
 I'm using bacula on a linux debian system.
 I have to back up about 30 hosts.
 I've chosen postgresql as the database.
 What do you think about that?
 Is mysql or postgres better?





 ___
 Bacula-users mailing list
 Bacula-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/bacula-users


-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu    215 McCone Hall # 4760
510.214.6506 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] choosing database.

2013-09-19 Thread Stephen Thompson
On 09/19/2013 08:51 AM, Mauro wrote:
 On 19 September 2013 17:20, Stephen Thompson
 step...@seismo.berkeley.edu wrote:



 The answer may partly come from how much RAM the system running the
 database has.  I've seen numerous preferences for postgres on this
 mailing list, but I've personally found on my 8Gb RAM system, I get
 better performance out of mysql.  We backup about 130+ hosts,
 incrementals nightly, differentials weekly, fulls monthly (~40TB).


 In my case the ram is not a problem; the bacula server is in a virtual
 machine (I'm using xen), and my ram is currently 4G but I can increase it.
 I have to back up about 30 hosts, four of which have a lot of data to
 be backed up.
 One has about 80G of data, multimedia files and other things.
 I've always used postgres for all my needs, so I thought to use it
 also for the bacula server.


Given what you're going to back up, I don't think it's really going to 
matter which database you choose.  Pick whichever database you're more 
familiar with, as that's likely going to be the only difference you'll 
notice between them.

Also, in this discussion folks don't always immediately bring up 
retention, as that (along with the number, not size, of files you back 
up) is going to determine your database size.  Since 90+% of the bacula 
database is the File table, that's where good or poor performance is 
going to exhibit itself.

We have a 300-400Gb File table and get reasonable performance from mysql 
and 8Gb of RAM.  We run the innodb engine for bacula itself (less 
blocking than myisam), and the myisam engine on a slave server for 
catalog dumps (faster dumps than innodb).
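
If you want to see where your own catalog's bulk sits, here's a quick 
sketch -- it assumes the catalog database is named "bacula", which is 
the default but may differ on your install:

   SELECT table_name,
          ROUND((data_length + index_length)/1024/1024/1024, 1) AS size_gb,
          table_rows
   FROM information_schema.tables
   WHERE table_schema = 'bacula'
   ORDER BY (data_length + index_length) DESC;

On most installs the File table dwarfs everything else, as described 
above.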


Stephen
-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu    215 McCone Hall # 4760
510.664.9177 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


[Bacula-users] duplicate job storage device bug?

2013-08-03 Thread Stephen Thompson


Hey all,

Figured I'd throw this out there before opening a ticket in case this is 
already known or I'm just confused.

We use duplicate job control for the following reason:  We run nightly 
Incrementals of _all_ jobs.  Then rather than running Fulls on a cyclic 
schedule, we run them back-to-back, injecting a few at a time via 
scripts.  Note, we also have two tape libraries (and two SDs), one for 
Incremental Pools and one for Full Pools.

Where duplicate job control comes in is that we want a running 
Incremental to be canceled if a Full of the same job is launched on any 
given night, since the Full, in our case, should take precedence and be 
run immediately.  What we see is that the Full does indeed cancel the 
running Incremental and then runs itself, HOWEVER the Full job takes on 
the storage properties (storage device) of the canceled Incremental job 
rather than using its own settings.  The Full job then expects its Full 
Pool tape to be in the Incremental tape library, which it is not, and 
the job stalls waiting for operator intervention.

Here's some config snippets:

   Maximum Concurrent Jobs = 2
   Allow Duplicate Jobs = no
   Cancel Lower Level Duplicates = yes
   Cancel Running Duplicates = no
   Cancel Queued Duplicates = no

Log snippets:

(incremental launches)
03-Aug 04:05 DIRECTOR JobId 316646: Start Backup JobId 316646, 
Job=CLIENT.2013-08-02_22.01.01_50
03-Aug 04:05 DIRECTOR JobId 316646: Using Device L100-Drive-0 to write.

(full launches and cancels incremental)
03-Aug 06:20 DIRECTOR JobId 316677: Cancelling duplicate JobId=316646.
03-Aug 06:20 DIRECTOR JobId 316677: 2001 Job 
sutter_5.2013-08-02_22.01.01_50 marked to be canceled.
03-Aug 06:20 DIRECTOR JobId 316677: Cancelling duplicate JobId=316646.
03-Aug 06:20 DIRECTOR JobId 316677: 2901 Job 
sutter_5.2013-08-02_22.01.01_50 not found.
03-Aug 06:20 DIRECTOR JobId 316677: 3904 Job 
sutter_5.2013-08-02_22.01.01_50 not found.
03-Aug 08:20 DIRECTOR JobId 316677: Start Backup JobId 316677, 
Job=sutter_5.2013-08-03_06.20.02_04

(full complains that the volume it tried to load is an incremental tape 
instead of a full tape)
03-Aug 08:22 DIRECTOR JobId 316677: Using Device L100-Drive-0 to write.
03-Aug 08:22 SD_L100_ JobId 316677: 3304 Issuing autochanger load slot 
72, drive 0 command.
03-Aug 08:23 SD_L100_ JobId 316677: 3305 Autochanger load slot 72, 
drive 0, status is OK.
03-Aug 08:23 SD_L100_ JobId 316677: Warning: Director wanted Volume 
FB0718.
 Current Volume IM0097 not acceptable because:
 1998 Volume IM0097 catalog status is Full, not in Pool.

NOTE: The Full job launch command was "run job=sutter_5 level=Full 
storage=SL500-Drive-1 yes", and yet, apparently due to the duplicate-job 
cancellation, the Full job instead attempted to use storage=L100-Drive-0.


thanks,
Stephen
-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu    215 McCone Hall # 4760
510.214.6506 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Migrating from myisam to innodb

2013-03-01 Thread Stephen Thompson


Another perspective...

I've personally found that if your memory is limited (my bacula db 
server has 8Gb of RAM), then for a bacula database mysql performs 
_better_ than postgres.  My File table currently has 2,856,394,323 rows.

I've seen so many recommendations here and elsewhere about postgres 
being an obvious choice over mysql, but in real life practice, we've 
found at our site that mysql gave us better results (even after weeks of 
tuning postgres).

Our hybrid solution is to run mysql INNODB as the active database, to 
avoid the table-locking which causes all kinds of problems, especially 
with operator access to bconsole.  However, due to the painfully slow 
dumps from INNODB, we have a slave mysql server running MYISAM that we 
use for regular ole mysql dumps.

In general this works out fairly well for us.
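
A sketch of the kind of dump we mean, run against the MyISAM slave -- 
the database name and dump path here are illustrative, not our actual 
settings:

   mysqldump --opt bacula > /backup/catalog/bacula.sql

Running it on the slave keeps the long table scan off the master that 
the Director is actively using.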

The only unresolved issue that we have is that some of the bacula 
queries can take a while to return.  I've tracked it down to the way the 
db engine responds to the query, but the odd thing is that the first 
time these queries run they are quick; then the mysql engine changes 
the recipe it uses to a slower one.  I haven't figured out why, or how 
to keep it running the quick way.
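
For anyone chasing the same thing, comparing the fast and slow recipes 
is a one-liner (the query itself is elided here since it will vary):

   EXPLAIN SELECT ... ;

If the plan shown differs between a fast run and a slow one -- say, a 
different index chosen on the File table -- that points at the 
optimizer rather than the server or disks.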

Stephen




On 03/01/2013 03:16 AM, Uwe Schuerkamp wrote:
 On Tue, Feb 26, 2013 at 04:23:20PM +, Alan Brown wrote:
 On 26/02/13 09:42, Uwe Schuerkamp wrote:


 for the record I'd like to give you some stats from our recent myisam
 - innodb conversion.


 For the sizes you're talking about, I'd recommend:

 1: A _lot_ more memory. 100Gb or so.

 and even more strongly:

 2: Postgresql


 Mysql is fast and good for small databases, but postgresql scales to
 large sizes with a lot less pain and suffering. Conversion here was
 relatively painless.



 Hi Alan  list,

 can you point me to some good conversion guides and esp. utilities? I
 checked the postgres documentation wiki, but half of the scripts
 linked there seem to be dead. I tried converting a mysql dump to pg
 using my2pg.pl, but the poor script ran out of memory 30 minutes into
 the conversion on the test machine (Centos 6, 8GB RAM ;-)

 I'm hoping our File table will get a lot smaller now over time as
 we've moved away from copy jobs for the time being, so the conversion
 should also get easier as tape volumes with millions of files on them
 get recycled and pruned.

 All the best, Uwe



-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu    215 McCone Hall # 4760
510.214.6506 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


[Bacula-users] Fwd: Re: wanted on DEVICE-0, is in use by device DEVICE-1

2012-11-06 Thread Stephen Thompson


A quick test of this scenario seems to work:
Leaving Prefer Mounted Volumes = yes (the default).
Setting both drives in the autochanger to have 1/2 of the total 
concurrency limit.  This per-device setting seems to allow for multiple 
drives using the same Pool.

Not very well documented IMHO.
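
A minimal sketch of what that looks like, assuming the SD-wide 
Maximum Concurrent Jobs = 8 from the configs quoted later in this 
thread (the point is simply that the per-device limits sum to the 
SD-wide one):

   Device {
     Name = L100-Drive-0
     ...
     Maximum Concurrent Jobs = 4   # half of the SD-wide limit of 8
   }
   Device {
     Name = L100-Drive-1
     ...
     Maximum Concurrent Jobs = 4
   }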

Stephen





 Original Message 
Return-Path: bob_het...@hotmail.com





Are you using the setting:

prefer mounted volumes=yes or no
?

If you had it set to yes, then you'd never use the 2nd tape drive, but
if you set it to no, sometimes you'd hit a deadlock.

I used to have an environment with more than a hundred daily jobs and
would hit a contention issue occasionally.  The developers eventually
abandoned that code in favor of setting the maximum concurrent jobs per
device:

http://www.bacula.org/5.2.x-manuals/en/main/main/New_Features_in_5_0_0.html#SECTION0091


In addition, another problem I hit occasionally would appear after
upgrading the OS.  If you update your system you may need to rebuild
bacula.  Before I started rebuilding bacula at the end of system updates
I would hit race conditions and process crashes.

  Bob




___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


[Bacula-users] wanted on DEVICE-0, is in use by device DEVICE-1

2012-11-05 Thread Stephen Thompson

Hello all,

I've had the following problem for ages (meaning multiple major 
revisions of bacula) and I've seen this come up from time to time on the 
mailing list, but I've never actually seen a resolution (please point me 
to one if it's been found).


background:

I run monthly Fulls and nightly Incrementals.  I have a 2 drive 
autochanger dedicated to my Incrementals.  I launch something like ~150 
Incremental jobs each night.  I am configured for 8 concurrent jobs on 
the Storage Daemon.


PROBLEM:

The first job(s) grab one of the 2 devices available in the changer 
(which is set to AutoSelect) and either load a tape, or use a tape from 
the previous evening.  All tapes in the changer are in the same 
Incremental-Pool.

The second job(s) grab the other of the 2 devices available in the 
changer, but want to use the same tape that's just been mounted (or put 
into use) by the jobs that were launched first.  They will often 
literally wait the entire evening, until 100s of jobs have run through 
on only one device, for that tape to be freed up, at which point it is 
unmounted from the first device and moved to the second.

Note, the behaviour seems to be to round-robin my concurrency limit of 
8 between the 2 available drives, which means 4 jobs will run and 4 
jobs will block waiting for the wanted Volume.  When the original 4 
jobs complete (not at the same time), additional jobs are launched that 
keep that wanted Volume in use.


LOG:

03-Nov 22:00 DIRECTOR JobId 267433: Start Backup JobId 267433, 
Job=JOB.2012-11-03_22.00.00_04
03-Nov 22:00 DIRECTOR JobId 267433: Using Device L100-Drive-0
03-Nov 22:00 DIRECTOR JobId 267433: Sending Accurate information.
03-Nov 22:00 sd_L100_ JobId 267433: 3307 Issuing autochanger unload 
slot 82, drive 0 command.
03-Nov 22:06 lawson-sd_L100_ JobId 267433: Warning: Volume IM0108 
wanted on L100-Drive-0 (/dev/L100-Drive-0) is in use by device 
L100-Drive-1 (/dev/L100-Drive-1)
03-Nov 22:09 sd_L100_ JobId 267433: Warning: Volume IM0108 wanted on 
L100-Drive-0 (/dev/L100-Drive-0) is in use by device L100-Drive-1 
(/dev/L100-Drive-1)
03-Nov 22:09 sd_L100_ JobId 267433: Warning: mount.c:217 Open device 
L100-Drive-0 (/dev/L100-Drive-0) Volume IM0108 failed: ERR=dev.c:513 
Unable to open device L100-Drive-0 (/dev/L100-Drive-0): ERR=No medium 
found
.
.
.


CONFIGS (partial and seem pretty straight-forward):

Schedule {
   Name = DefaultSchedule
   Run = Level=Incremental   sat-thu at 22:00
   Run = Level=Differential  fri at 22:00
}

JobDefs {
   Name = DefaultJob
   Type = Backup
   Level = Full
   Schedule = DefaultSchedule
   Incremental Backup Pool = Incremental-Pool
   Differential Backup Pool = Incremental-Pool
}

Pool {
   Name = Incremental-Pool
   Pool Type = Backup
   Storage = L100-changer
}

Storage {
   Name = L100-changer
   Device = L100-changer
   Media Type = LTO-3
   Autochanger = yes
   Maximum Concurrent Jobs = 8
}

Autochanger {
   Name = L100-changer
   Device = L100-Drive-0
   Device = L100-Drive-1
   Changer Device = /dev/L100-changer
}

Device {
   Name = L100-Drive-0
   Drive Index = 0
   Media Type = LTO-3
   Archive Device = /dev/L100-Drive-0
   AutomaticMount = yes;
   AlwaysOpen = yes;
   RemovableMedia = yes;
   RandomAccess = no;
   AutoChanger = yes;
   AutoSelect = yes;
}

Device {
   Name = L100-Drive-1
   Drive Index = 1
   Media Type = LTO-3
   Archive Device = /dev/L100-Drive-1
   AutomaticMount = yes;
   AlwaysOpen = yes;
   RemovableMedia = yes;
   RandomAccess = no;
   AutoChanger = yes;
   AutoSelect = yes;
}



thanks!
Stephen
-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu    215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] wanted on DEVICE-0, is in use by device DEVICE-1

2012-11-05 Thread Stephen Thompson


On 11/5/12 7:59 AM, John Drescher wrote:
 I've had the following problem for ages (meaning multiple major
 revisions of bacula) and I've seen this come up from time to time on the
 mailing list, but I've never actually seen a resolution (please point me
 to one if it's been found).


 background:

 I run monthly Fulls and nightly Incrementals.  I have a 2 drive
 autochanger dedicated to my Incrementals.  I launch something like ~150
 Incremental jobs each night.  I am configured for 8 concurrent jobs on
 the Storage Daemon.


 PROBLEM:

 The first job(s) grab one of the 2 devices available in the changer
 (which is set to AutoSelect) and either load a tape, or use a tape from
 the previous evening.  All tapes in the changer are in the same
 Incremenal-Pool.

 The second jobs(s) grab the other of the 2 devices available in the
 changer, but want to use the same tape that's just been mounted (or put
 into use) on the jobs that got launched first.  They will often literal
 wait the entire evening until 100's of jobs run through on only one
 device, until that tape is freed up, at which point it is unmounted from
 the first device and moved to the second.

 Note, the behaviour seems to be to round-robin my 8 concurrency limit
 between the 2 available drives, which mean 4 jobs will run, and 4 jobs
 will block on waiting for the wanted Volume.  When the original 4 jobs
 are completed (not at the same time) additional jobs are launched that
 keep that wanted Volume in use.


 LOG:

 03-Nov 22:00 DIRECTOR JobId 267433: Start Backup JobId 267433, Job=JOB.
 2012-11-03_22.00.00_0403-Nov 22:00 DIRECTOR JobId 267433: Using Device
 L100-Drive-003-Nov 22:00 DIRECTOR JobId 267433: Sending Accurate
 information.
 03-Nov 22:00 sd_L100_ JobId 267433: 3307 Issuing autochanger unload
 slot 82, drive 0 command.
 03-Nov 22:06 lawson-sd_L100_ JobId 267433: Warning: Volume IM0108
 wanted on L100-Drive-0 (/dev/L100-Drive-0) is in use by device
 L100-Drive-1 (/dev/L100-Drive-1)
 03-Nov 22:09 sd_L100_ JobId 267433: Warning: Volume IM0108 wanted on
 L100-Drive-0 (/dev/L100-Drive-0) is in use by device L100-Drive-1
 (/dev/L100-Drive-1)
 03-Nov 22:09 sd_L100_ JobId 267433: Warning: mount.c:217 Open device
 L100-Drive-0 (/dev/L100-Drive-0) Volume IM0108 failed: ERR=dev.c:513
 Unable to open device L100-Drive-0 (/dev/L100-Drive-0): ERR=No medium
 found
 .
 .
 .


 CONFIGS (partial and seem pretty straight-forward):

 Schedule {
 Name = DefaultSchedule
 Run = Level=Incremental   sat-thu at 22:00
 Run = Level=Differential  fri at 22:00
 }

 JobDefs {
 Name = DefaultJob
 Type = Backup
 Level = Full
 Schedule = DefaultSchedule
 Incremental Backup Pool = Incremental-Pool
 Differential Backup Pool = Incremental-Pool
 }

 Pool {
 Name = Incremental-Pool
 Pool Type = Backup
 Storage = L100-changer
 }

 Storage {
 Name = L100-changer
 Device = L100-changer
 Media Type = LTO-3
 Autochanger = yes
 Maximum Concurrent Jobs = 8
 }

 Autochanger {
 Name = L100-changer
 Device = L100-Drive-0
 Device = L100-Drive-1
 Changer Device = /dev/L100-changer
 }

 Device {
 Name = L100-Drive-0
 Drive Index = 0
 Media Type = LTO-3
 Archive Device = /dev/L100-Drive-0
 AutomaticMount = yes;
 AlwaysOpen = yes;
 RemovableMedia = yes;
 RandomAccess = no;
 AutoChanger = yes;
 AutoSelect = yes;
 }

 Device {
 Name = L100-Drive-1
 Drive Index = 0
 Media Type = LTO-3
 Archive Device = /dev/L100-Drive-1
 AutomaticMount = yes;
 AlwaysOpen = yes;
 RemovableMedia = yes;
 RandomAccess = no;
 AutoChanger = yes;
 AutoSelect = yes;
 }


 I do not have a good solution but I know by default bacula does not
 want to load the same pool into more than 1 storage device at the same
 time.

 John


I think it's something in the automated logic.  Because if I launch jobs 
by hand (same pool across 2 tape devices in the same autochanger), 
everything works fine.  I think it has more to do with the Scheduler 
assigning the same Volume to all jobs and then not wanting to change 
that choice if that Volume is in use.

If I do a status on the Director, for instance, I see the jobs for the 
next day lined up in Scheduled jobs, and they all have the same Volume 
listed.

thanks,
Stephen
-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu    215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760


Re: [Bacula-users] wanted on DEVICE-0, is in use by device DEVICE-1

2012-11-05 Thread Stephen Thompson
On 11/05/12 08:03, Stephen Thompson wrote:


 On 11/5/12 7:59 AM, John Drescher wrote:
 I've had the following problem for ages (meaning multiple major
 revisions of bacula) and I've seen this come up from time to time on the
 mailing list, but I've never actually seen a resolution (please point me
 to one if it's been found).


 background:

 I run monthly Fulls and nightly Incrementals.  I have a 2 drive
 autochanger dedicated to my Incrementals.  I launch something like ~150
 Incremental jobs each night.  I am configured for 8 concurrent jobs on
 the Storage Daemon.


 PROBLEM:

 The first job(s) grab one of the 2 devices available in the changer
 (which is set to AutoSelect) and either load a tape, or use a tape from
 the previous evening.  All tapes in the changer are in the same
 Incremenal-Pool.

 The second jobs(s) grab the other of the 2 devices available in the
 changer, but want to use the same tape that's just been mounted (or put
 into use) on the jobs that got launched first.  They will often literal
 wait the entire evening until 100's of jobs run through on only one
 device, until that tape is freed up, at which point it is unmounted from
 the first device and moved to the second.

 Note, the behaviour seems to be to round-robin my 8 concurrency limit
 between the 2 available drives, which mean 4 jobs will run, and 4 jobs
 will block on waiting for the wanted Volume.  When the original 4 jobs
 are completed (not at the same time) additional jobs are launched that
 keep that wanted Volume in use.


 LOG:

 03-Nov 22:00 DIRECTOR JobId 267433: Start Backup JobId 267433, Job=JOB.
 2012-11-03_22.00.00_0403-Nov 22:00 DIRECTOR JobId 267433: Using Device
 L100-Drive-003-Nov 22:00 DIRECTOR JobId 267433: Sending Accurate
 information.
 03-Nov 22:00 sd_L100_ JobId 267433: 3307 Issuing autochanger unload
 slot 82, drive 0 command.
 03-Nov 22:06 lawson-sd_L100_ JobId 267433: Warning: Volume IM0108
 wanted on L100-Drive-0 (/dev/L100-Drive-0) is in use by device
 L100-Drive-1 (/dev/L100-Drive-1)
 03-Nov 22:09 sd_L100_ JobId 267433: Warning: Volume IM0108 wanted on
 L100-Drive-0 (/dev/L100-Drive-0) is in use by device L100-Drive-1
 (/dev/L100-Drive-1)
 03-Nov 22:09 sd_L100_ JobId 267433: Warning: mount.c:217 Open device
 L100-Drive-0 (/dev/L100-Drive-0) Volume IM0108 failed: ERR=dev.c:513
 Unable to open device L100-Drive-0 (/dev/L100-Drive-0): ERR=No medium
 found
 .
 .
 .


 CONFIGS (partial and seem pretty straight-forward):

 Schedule {
  Name = DefaultSchedule
  Run = Level=Incremental   sat-thu at 22:00
  Run = Level=Differential  fri at 22:00
 }

 JobDefs {
  Name = DefaultJob
  Type = Backup
  Level = Full
  Schedule = DefaultSchedule
  Incremental Backup Pool = Incremental-Pool
  Differential Backup Pool = Incremental-Pool
 }

 Pool {
  Name = Incremental-Pool
  Pool Type = Backup
  Storage = L100-changer
 }

 Storage {
  Name = L100-changer
  Device = L100-changer
  Media Type = LTO-3
  Autochanger = yes
  Maximum Concurrent Jobs = 8
 }

 Autochanger {
  Name = L100-changer
  Device = L100-Drive-0
  Device = L100-Drive-1
  Changer Device = /dev/L100-changer
 }

 Device {
  Name = L100-Drive-0
  Drive Index = 0
  Media Type = LTO-3
  Archive Device = /dev/L100-Drive-0
  AutomaticMount = yes;
  AlwaysOpen = yes;
  RemovableMedia = yes;
  RandomAccess = no;
  AutoChanger = yes;
  AutoSelect = yes;
 }

 Device {
  Name = L100-Drive-1
  Drive Index = 0
  Media Type = LTO-3
  Archive Device = /dev/L100-Drive-1
  AutomaticMount = yes;
  AlwaysOpen = yes;
  RemovableMedia = yes;
  RandomAccess = no;
  AutoChanger = yes;
  AutoSelect = yes;
 }


 I do not have a good solution but I know by default bacula does not
 want to load the same pool into more than 1 storage device at the same
 time.

 John


 I think it's something in the automated logic.  Because if I launch jobs
 by hand (same pool across 2 tapes devices in same autochanger)
 everything works fine.  I think it has more to do with the Scheduler
 assigning the same same Volume to all jobs and then not wanting to
 change that choice if that Volume is in use.


I also use Accurate backups, which can take a bit of time before a job 
gets to its volume/drive assignments, so it might be a race condition: 
when the blocked jobs start, they still want the same Volume as the 
jobs already running, because those jobs are still setting up the 
Accurate backup and haven't been solidly assigned that Volume yet.  I 
don't know.  It's rather annoying, especially as we attempt to ramp up 
our backup capacity.

Lastly, it doesn't ALWAYS happen, though it does seem to happen more 
often than not.



 If I do a status on the Director for instance and see the jobs for the
 next day lined up in Scheduled jobs, they all have the same Volume

Re: [Bacula-users] wanted on DEVICE-0, is in use by device DEVICE-1

2012-11-05 Thread Stephen Thompson
On 11/05/2012 01:17 PM, Josh Fisher wrote:

 On 11/5/2012 11:03 AM, Stephen Thompson wrote:

 On 11/5/12 7:59 AM, John Drescher wrote:
 I've had the following problem for ages (meaning multiple major
 revisions of bacula) and I've seen this come up from time to time on the
 mailing list, but I've never actually seen a resolution (please point me
 to one if it's been found).


 background:

 I run monthly Fulls and nightly Incrementals.  I have a 2 drive
 autochanger dedicated to my Incrementals.  I launch something like ~150
 Incremental jobs each night.  I am configured for 8 concurrent jobs on
 the Storage Daemon.


 PROBLEM:

 The first job(s) grab one of the 2 devices available in the changer
 (which is set to AutoSelect) and either load a tape, or use a tape from
 the previous evening.  All tapes in the changer are in the same
 Incremenal-Pool.

 The second jobs(s) grab the other of the 2 devices available in the
 changer, but want to use the same tape that's just been mounted (or put
 into use) on the jobs that got launched first.  They will often literal
 wait the entire evening until 100's of jobs run through on only one
 device, until that tape is freed up, at which point it is unmounted from
 the first device and moved to the second.

 Note, the behaviour seems to be to round-robin my 8 concurrency limit
 between the 2 available drives, which mean 4 jobs will run, and 4 jobs
 will block on waiting for the wanted Volume.  When the original 4 jobs
 are completed (not at the same time) additional jobs are launched that
 keep that wanted Volume in use.


 LOG:

 03-Nov 22:00 DIRECTOR JobId 267433: Start Backup JobId 267433, Job=JOB.
 2012-11-03_22.00.00_0403-Nov 22:00 DIRECTOR JobId 267433: Using Device
 L100-Drive-003-Nov 22:00 DIRECTOR JobId 267433: Sending Accurate
 information.
 03-Nov 22:00 sd_L100_ JobId 267433: 3307 Issuing autochanger unload
 slot 82, drive 0 command.
 03-Nov 22:06 lawson-sd_L100_ JobId 267433: Warning: Volume IM0108
 wanted on L100-Drive-0 (/dev/L100-Drive-0) is in use by device
 L100-Drive-1 (/dev/L100-Drive-1)
 03-Nov 22:09 sd_L100_ JobId 267433: Warning: Volume IM0108 wanted on
 L100-Drive-0 (/dev/L100-Drive-0) is in use by device L100-Drive-1
 (/dev/L100-Drive-1)
 03-Nov 22:09 sd_L100_ JobId 267433: Warning: mount.c:217 Open device
 L100-Drive-0 (/dev/L100-Drive-0) Volume IM0108 failed: ERR=dev.c:513
 Unable to open device L100-Drive-0 (/dev/L100-Drive-0): ERR=No medium
 found
 .
 .
 .


 CONFIGS (partial and seem pretty straight-forward):

 Schedule {
   Name = DefaultSchedule
   Run = Level=Incremental   sat-thu at 
 22:00
   Run = Level=Differential  fri at 
 22:00
 }

 JobDefs {
   Name = DefaultJob
   Type = Backup
   Level = Full
   Schedule = DefaultSchedule
   Incremental Backup Pool = Incremental-Pool
   Differential Backup Pool = Incremental-Pool
 }

 Pool {
   Name = Incremental-Pool
   Pool Type = Backup
   Storage = L100-changer
 }

 Storage {
   Name = L100-changer
   Device = L100-changer
   Media Type = LTO-3
   Autochanger = yes
   Maximum Concurrent Jobs = 8
 }

 Autochanger {
   Name = L100-changer
   Device = L100-Drive-0
   Device = L100-Drive-1
   Changer Device = /dev/L100-changer
 }

 Device {
   Name = L100-Drive-0
   Drive Index = 0
   Media Type = LTO-3
   Archive Device = /dev/L100-Drive-0
   AutomaticMount = yes;
   AlwaysOpen = yes;
   RemovableMedia = yes;
   RandomAccess = no;
   AutoChanger = yes;
   AutoSelect = yes;
 }

 Device {
   Name = L100-Drive-1
   Drive Index = 0
   Media Type = LTO-3
   Archive Device = /dev/L100-Drive-1
   AutomaticMount = yes;
   AlwaysOpen = yes;
   RemovableMedia = yes;
   RandomAccess = no;
   AutoChanger = yes;
   AutoSelect = yes;
 }

 I do not have a good solution but I know by default bacula does not
 want to load the same pool into more than 1 storage device at the same
 time.

 John

 I think it's something in the automated logic.  Because if I launch jobs
 by hand (same pool across 2 tapes devices in same autochanger)
 everything works fine.  I think it has more to do with the Scheduler
 assigning the same same Volume to all jobs and then not wanting to
 change that choice if that Volume is in use.

 When both jobs start at the same time and same priority, they see the
 exact same next available volume for the pool, and so both select the
 same volume. When they select different drives, it is a problem, since
 the volume can only be in one drive.

 When you start the jobs manually, I assume you are starting them at
 different times. This works because the first job is up and running
 with the volume loaded before the second job begins its selection
 process. One way to handle this issue is to have a different Schedule
 for each job and start the jobs at different times with one
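
 A minimal sketch of that staggered-schedule idea -- the schedule names
 and the 5-minute offset are illustrative assumptions:

 Schedule {
   Name = Stagger-A
   Run = Level=Incremental sat-thu at 22:00
 }
 Schedule {
   Name = Stagger-B
   Run = Level=Incremental sat-thu at 22:05
 }

 The first batch of jobs is then up and running with its volume mounted
 before the second batch begins volume selection.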

Re: [Bacula-users] wanted on DEVICE-0, is in use by device DEVICE-1

2012-11-05 Thread Stephen Thompson


Going to try this out.

Stephen



On 11/05/2012 02:40 PM, Josh Fisher wrote:

 On 11/5/2012 4:28 PM, Stephen Thompson wrote:
 On 11/05/2012 01:17 PM, Josh Fisher wrote:
 On 11/5/2012 11:03 AM, Stephen Thompson wrote:
 On 11/5/12 7:59 AM, John Drescher wrote:
 I've had the following problem for ages (meaning multiple major
 revisions of bacula) and I've seen this come up from time to time on the
 mailing list, but I've never actually seen a resolution (please point me
 to one if it's been found).


 background:

 I run monthly Fulls and nightly Incrementals.  I have a 2 drive
 autochanger dedicated to my Incrementals.  I launch something like ~150
 Incremental jobs each night.  I am configured for 8 concurrent jobs on
 the Storage Daemon.


 PROBLEM:

 The first job(s) grab one of the 2 devices available in the changer
 (which is set to AutoSelect) and either load a tape, or use a tape from
 the previous evening.  All tapes in the changer are in the same
 Incremenal-Pool.

 The second jobs(s) grab the other of the 2 devices available in the
 changer, but want to use the same tape that's just been mounted (or put
 into use) on the jobs that got launched first.  They will often literal
 wait the entire evening until 100's of jobs run through on only one
 device, until that tape is freed up, at which point it is unmounted from
 the first device and moved to the second.

 Note, the behaviour seems to be to round-robin my 8 concurrency limit
 between the 2 available drives, which mean 4 jobs will run, and 4 jobs
 will block on waiting for the wanted Volume.  When the original 4 jobs
 are completed (not at the same time) additional jobs are launched that
 keep that wanted Volume in use.


 LOG:

 03-Nov 22:00 DIRECTOR JobId 267433: Start Backup JobId 267433, Job=JOB.
 2012-11-03_22.00.00_0403-Nov 22:00 DIRECTOR JobId 267433: Using Device
 L100-Drive-003-Nov 22:00 DIRECTOR JobId 267433: Sending Accurate
 information.
 03-Nov 22:00 sd_L100_ JobId 267433: 3307 Issuing autochanger unload
 slot 82, drive 0 command.
 03-Nov 22:06 lawson-sd_L100_ JobId 267433: Warning: Volume IM0108
 wanted on L100-Drive-0 (/dev/L100-Drive-0) is in use by device
 L100-Drive-1 (/dev/L100-Drive-1)
 03-Nov 22:09 sd_L100_ JobId 267433: Warning: Volume IM0108 wanted on
 L100-Drive-0 (/dev/L100-Drive-0) is in use by device L100-Drive-1
 (/dev/L100-Drive-1)
 03-Nov 22:09 sd_L100_ JobId 267433: Warning: mount.c:217 Open device
 L100-Drive-0 (/dev/L100-Drive-0) Volume IM0108 failed: ERR=dev.c:513
 Unable to open device L100-Drive-0 (/dev/L100-Drive-0): ERR=No medium
 found
 .
 .
 .


 CONFIGS (partial and seem pretty straight-forward):

 Schedule {
 Name = DefaultSchedule
 Run = Level=Incremental   sat-thu at 
 22:00
 Run = Level=Differential  fri at 
 22:00
 }

 JobDefs {
 Name = DefaultJob
 Type = Backup
 Level = Full
 Schedule = DefaultSchedule
 Incremental Backup Pool = Incremental-Pool
 Differential Backup Pool = Incremental-Pool
 }

 Pool {
 Name = Incremental-Pool
 Pool Type = Backup
 Storage = L100-changer
 }

 Storage {
 Name = L100-changer
 Device = L100-changer
 Media Type = LTO-3
 Autochanger = yes
 Maximum Concurrent Jobs = 8
 }

 Autochanger {
 Name = L100-changer
 Device = L100-Drive-0
 Device = L100-Drive-1
 Changer Device = /dev/L100-changer
 }

 Device {
 Name = L100-Drive-0
 Drive Index = 0
 Media Type = LTO-3
 Archive Device = /dev/L100-Drive-0
 AutomaticMount = yes;
 AlwaysOpen = yes;
 RemovableMedia = yes;
 RandomAccess = no;
 AutoChanger = yes;
 AutoSelect = yes;
 }

 Device {
 Name = L100-Drive-1
 Drive Index = 0
 Media Type = LTO-3
 Archive Device = /dev/L100-Drive-1
 AutomaticMount = yes;
 AlwaysOpen = yes;
 RemovableMedia = yes;
 RandomAccess = no;
 AutoChanger = yes;
 AutoSelect = yes;
 }

 I do not have a good solution but I know by default bacula does not
 want to load the same pool into more than 1 storage device at the same
 time.

 John

 I think it's something in the automated logic.  Because if I launch jobs
 by hand (same pool across 2 tapes devices in same autochanger)
 everything works fine.  I think it has more to do with the Scheduler
 assigning the same same Volume to all jobs and then not wanting to
 change that choice if that Volume is in use.
 When both jobs start at the same time and same priority, they see the
 same exact next available volume for the pool, and so both select the
 same volume. When they select different drives, it is a problem, since
 the volume can only be in one drive.

 When you start the jobs manually, I assume you are starting them at
 different times. This works, because the first job is up

Re: [Bacula-users] wanted on DEVICE-0, is in use by device DEVICE-1

2012-11-05 Thread Stephen Thompson


No such luck.  I already have Prefer Mounted Volumes = no set for all 
jobs.  That's apparently not a solution.

Stephen



On 11/5/12 2:57 PM, Stephen Thompson wrote:


 Going to try this out.

 Stephen



 On 11/05/2012 02:40 PM, Josh Fisher wrote:

 On 11/5/2012 4:28 PM, Stephen Thompson wrote:
 On 11/05/2012 01:17 PM, Josh Fisher wrote:
 On 11/5/2012 11:03 AM, Stephen Thompson wrote:
 On 11/5/12 7:59 AM, John Drescher wrote:
 I've had the following problem for ages (meaning multiple major
 revisions of bacula) and I've seen this come up from time to time on the
 mailing list, but I've never actually seen a resolution (please point me
 to one if it's been found).


 background:

 I run monthly Fulls and nightly Incrementals.  I have a 2 drive
 autochanger dedicated to my Incrementals.  I launch something like ~150
 Incremental jobs each night.  I am configured for 8 concurrent jobs on
 the Storage Daemon.


 PROBLEM:

 The first job(s) grab one of the 2 devices available in the changer
 (which is set to AutoSelect) and either load a tape, or use a tape from
 the previous evening.  All tapes in the changer are in the same
 Incremenal-Pool.

 The second jobs(s) grab the other of the 2 devices available in the
 changer, but want to use the same tape that's just been mounted (or put
 into use) on the jobs that got launched first.  They will often literal
 wait the entire evening until 100's of jobs run through on only one
 device, until that tape is freed up, at which point it is unmounted from
 the first device and moved to the second.

 Note, the behaviour seems to be to round-robin my 8 concurrency limit
 between the 2 available drives, which mean 4 jobs will run, and 4 jobs
 will block on waiting for the wanted Volume.  When the original 4 jobs
 are completed (not at the same time) additional jobs are launched that
 keep that wanted Volume in use.


 LOG:

 03-Nov 22:00 DIRECTOR JobId 267433: Start Backup JobId 267433, Job=JOB.
 2012-11-03_22.00.00_0403-Nov 22:00 DIRECTOR JobId 267433: Using Device
 L100-Drive-003-Nov 22:00 DIRECTOR JobId 267433: Sending Accurate
 information.
 03-Nov 22:00 sd_L100_ JobId 267433: 3307 Issuing autochanger unload
 slot 82, drive 0 command.
 03-Nov 22:06 lawson-sd_L100_ JobId 267433: Warning: Volume IM0108
 wanted on L100-Drive-0 (/dev/L100-Drive-0) is in use by device
 L100-Drive-1 (/dev/L100-Drive-1)
 03-Nov 22:09 sd_L100_ JobId 267433: Warning: Volume IM0108 wanted on
 L100-Drive-0 (/dev/L100-Drive-0) is in use by device L100-Drive-1
 (/dev/L100-Drive-1)
 03-Nov 22:09 sd_L100_ JobId 267433: Warning: mount.c:217 Open device
 L100-Drive-0 (/dev/L100-Drive-0) Volume IM0108 failed: ERR=dev.c:513
 Unable to open device L100-Drive-0 (/dev/L100-Drive-0): ERR=No medium
 found
 .
 .
 .


 CONFIGS (partial and seem pretty straight-forward):

 Schedule {
  Name = DefaultSchedule
  Run = Level=Incremental   sat-thu 
 at 22:00
  Run = Level=Differential  fri 
 at 22:00
 }

 JobDefs {
  Name = DefaultJob
  Type = Backup
  Level = Full
  Schedule = DefaultSchedule
  Incremental Backup Pool = Incremental-Pool
  Differential Backup Pool = Incremental-Pool
 }

 Pool {
  Name = Incremental-Pool
  Pool Type = Backup
  Storage = L100-changer
 }

 Storage {
  Name = L100-changer
  Device = L100-changer
  Media Type = LTO-3
  Autochanger = yes
  Maximum Concurrent Jobs = 8
 }

 Autochanger {
  Name = L100-changer
  Device = L100-Drive-0
  Device = L100-Drive-1
  Changer Device = /dev/L100-changer
 }

 Device {
  Name = L100-Drive-0
  Drive Index = 0
  Media Type = LTO-3
  Archive Device = /dev/L100-Drive-0
  AutomaticMount = yes;
  AlwaysOpen = yes;
  RemovableMedia = yes;
  RandomAccess = no;
  AutoChanger = yes;
  AutoSelect = yes;
 }

 Device {
  Name = L100-Drive-1
  Drive Index = 0
  Media Type = LTO-3
  Archive Device = /dev/L100-Drive-1
  AutomaticMount = yes;
  AlwaysOpen = yes;
  RemovableMedia = yes;
  RandomAccess = no;
  AutoChanger = yes;
  AutoSelect = yes;
 }

 I do not have a good solution but I know by default bacula does not
 want to load the same pool into more than 1 storage device at the same
 time.

 John

 I think it's something in the automated logic.  Because if I launch jobs
 by hand (same pool across 2 tapes devices in same autochanger)
 everything works fine.  I think it has more to do with the Scheduler
 assigning the same same Volume to all jobs and then not wanting to
 change that choice if that Volume is in use.
 When both jobs start at the same time and same priority, they see the
 same exact next available volume for the pool, and so both select the
 same volume. When

Re: [Bacula-users] Is tape filling up too early?

2012-10-17 Thread Stephen Thompson




I recently found out that I had a bad tape drive.

With the tape in the drive, run the following and see if it reports 
errors:

smartctl -a /dev/nst0


If there are errors, the drive is rewriting data, which wastes tape and 
hence reduces capacity.

Stephen



On 10/17/2012 11:14 AM, Sergio Belkin wrote:
 Hi folks

 I'm using LTO3 tapes and they are filling up too fast. They supposedly
 hold 800 GB. I know they never reach that capacity, but I am somewhat
 surprised that a tape is full with only ~333 GB!!  (less than half)


 If I issue a list media pool command I get

 +---------+--------------+-----------+---------+-----------------+----------+--------------+---------+------+-----------+-----------+---------------------+
 | MediaId | VolumeName   | VolStatus | Enabled | VolBytes        | VolFiles | VolRetention | Recycle | Slot | InChanger | MediaType | LastWritten         |
 +---------+--------------+-----------+---------+-----------------+----------+--------------+---------+------+-----------+-----------+---------------------+
 |     100 | LUNOCT12LTO3 | Full      |       1 | 421,590,177,792 |      431 |   31,536,000 |       0 |    0 |         0 | LTO3      | 2012-10-16 08:11:08 |
 +---------+--------------+-----------+---------+-----------------+----------+--------------+---------+------+-----------+-----------+---------------------+


 Output of mt -f /dev/nst0  status

 SCSI 2 tape drive:
 File number=0, block number=0, partition=0.
 Tape block size 0 bytes. Density code 0x44 (no translation).
 Soft error count since last status=0
 General status bits on (4101):
   BOT ONLINE IM_REP_EN

 The volume was recycled with 'mt -f /dev/nst0 rewind;mt -f /dev/nst0 weof'

 My storage daemon config is as follow

 Storage { # definition of myself
Name = superbackup-sd
SDPort = 9103  # Director's port
WorkingDirectory = /var/bacula/working
Pid Directory = /var/run
Maximum Concurrent Jobs = 20

 }
 Director {
Name = superbackup-dir
Password = ucuc
 }
 Director {
Name = superbackup-mon
Password = ucuc
Monitor = yes
 }
 Device {
Name = LTO3
Media Type = LTO3
   Archive Device = /dev/nst0  # change to 1 to use the DAT4S
AutomaticMount = yes;   # when device opened, read it
AlwaysOpen = yes;
RemovableMedia = yes;
Maximum Spool Size = 30g
Maximum Job Spool Size = 20gb
Spool Directory = /var/spool/bacula
#Maximum Network Buffer Size =  10240
#Hardware end of medium = No;
Fast Forward Space File = yes
#TWO EOF = yes
 }

 Messages {
Name = Standard
director = supernoc-dir = all
 }


 Could you suggest something to improve this?

 Thanks in advance!



-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu    215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] LTO3 tape capacity (variable?)

2012-10-05 Thread Stephen Thompson

Thank you everyone for your help!

Oracle replaced the drive, and while it's not running with as high a 
throughput as I would like, it's at least up at the 60MB/s (random data) 
that my other drives are at, rather than its previous 30MB/s.

I'm still going to experiment with some of the ideas that were tossed 
out and see if I can't get even better throughput for bacula.

thanks again,
Stephen



On 10/2/12 2:47 AM, Alan Brown wrote:
 On 02/10/12 01:35, Stephen Thompson wrote:


 Correction, the non-problem drive has a higher ECC fast error count,
 but the problem drive has a significantly higher Corrective algorithm
 invocations count.


 What that means is that it rewrote the data, which accounts for the
 lower throughput.

 LTO drives read as they write and if there are errors, they write again.

 If a cleaning tape doesn't work then you need to get the drive looked
 at/replaced under warranty.


 On 10/1/12 5:33 PM, Stephen Thompson wrote:

 On 10/1/12 4:06 PM, Alan Brown wrote:
 On 01/10/12 23:38, Stephen Thompson wrote:
 More importantly, I realized that my testing 6 months ago was not on
 all 4 of my drives, but only 2 of them.  Today, I discovered one of my
 drives (untested in the past) is getting 1/2 the throughput for random
 data writes as the others!!
 smartctl -a /dev/sg(drive) will tell you a lot

 Put a cleaning tape in it






 Cleaning tape did not improve results.

 I see some errors in the counter log on the problem drive, but I see
 even more errors on another drive which isn't having a throughput
 problem (specifically SL500 Drive 1 is the lower throughput, but C4
 Drive 1 actually has a higher error count).



 SL500 Drive 0 (~60MB/s random data throughput)
 ==============================================
 Error counter log:
           Errors Corrected by          Total   Correction    Gigabytes    Total
               ECC          rereads/   errors   algorithm     processed    uncorrected
           fast | delayed   rewrites  corrected invocations  [10^9 bytes]  errors
 read:          0        0         0         0           0        0.000         0
 write:         0        0         0         0           0        0.000         0


 SL500 Drive 1 (~30MB/s random data throughput)
 ==============================================
 Error counter log:
           Errors Corrected by          Total   Correction    Gigabytes    Total
               ECC          rereads/   errors   algorithm     processed    uncorrected
           fast | delayed   rewrites  corrected invocations  [10^9 bytes]  errors
 read:          0        0         0         0           0        0.000         0
 write:     10454        0         0         0      821389        0.000         0


 C4 Drive 0 (~60MB/s random data throughput)
 ===========================================
 Error counter log:
           Errors Corrected by          Total   Correction    Gigabytes    Total
               ECC          rereads/   errors   algorithm     processed    uncorrected
           fast | delayed   rewrites  corrected invocations  [10^9 bytes]  errors
 read:          2        0         0         0           2        0.000         0
 write:         0        0         0         0           0        0.000         0


 C4 Drive 1 (~60MB/s random data throughput)
 ===========================================
 Error counter log:
           Errors Corrected by          Total   Correction    Gigabytes    Total
               ECC          rereads/   errors   algorithm     processed    uncorrected
           fast | delayed   rewrites  corrected invocations  [10^9 bytes]  errors
 read:          2        0         0         0           2        0.000         0
 write:     18961        0         0         0       48261        0.000         0




 Stephen




-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu    215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] LTO3 tape capacity (variable?)

2012-10-01 Thread Stephen Thompson



Hi,

I ran some btape tests today to verify that I'd be improving throughput 
by changing blocksize from 256KB to 2MB and found that this does indeed 
appear to be true in terms of increasing compression efficiency, but it 
doesn't seem to affect incompressible data much, if at all.  Still, it 
seems worth changing and I thank you for pointing me in that direction.
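
For anyone following along, the block-size change goes in the SD's 
Device resource; a sketch using the values suggested in this thread 
(the drive name is illustrative, and note the warning below about 
marking existing tapes used and restarting the SD first):

   Device {
     Name = SL500-Drive-0
     ...
     Maximum Block Size = 2097152   # 2MB, up from 262144 (256KB)
     Maximum File Size = 10gb
   }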

More importantly, I realized that my testing 6 months ago was not on all 
4 of my drives, but only 2 of them.  Today, I discovered one of my 
drives (untested in the past) is getting 1/2 the throughput on random 
data writes compared to the others!!

btape
*speed file_size=4 nb_file=4 skip_raw


SL500 Drive 0   SL500 Drive 1   C4 Drive 0  C4 Drive 1

256KB block size:
  Zeros =   92.86 MB/s   92.36 MB/s  91.38 MB/s  92.86 MB/s
  Random=   63.16 MB/s   27.53 MB/s  63.39 MB/s  63.60 MB/s

2MB block size:
  Zeros =  123.5  MB/s  122.7  MB/s 122.7  MB/s 122.7  MB/s
  Random=   62.24 MB/s   28.44 MB/s  63.62 MB/s  63.62 MB/s


thanks,
Stephen





On 09/28/2012 05:08 AM, Alan Brown wrote:
 On 28/09/12 02:38, Stephen Thompson wrote:

 Aren't these considered reasonable settings for LTO3?

 Maximum block size = 262144   # 256kb
 Maximum File Size = 2gb

 Not really.

 Change maximum file size to 10Gb and maximum block size to 2M

 You _must_ set all open tapes to used and restart the storage daemon
 when changing the block size. Bacula can't cope with varying maximum
 block sizes on a tape.

 Even with those changes, if you have a lot of small, incompressible
 files you'll see high tape overheads.




 thanks for the help!
 Stephen





-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu    215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] LTO3 tape capacity (variable?)

2012-10-01 Thread Stephen Thompson
On 10/01/2012 03:52 PM, James Harper wrote:

 Hi,

 I ran some btape tests today to verify that I'd be improving throughput by
 changing blocksize from 256KB to 2MB and found that this does indeed
 appear to be true in terms of increasing compression efficiency, but it
 doesn't seem to affect incompressible data much, if at all.  Still, it seems
 worth changing and I thank you for pointing me in that direction.

 More importantly, I realized that my testing 6 months ago was not on all
 4 of my drives, but only 2 of them.  Today, I discovered one of my drives
 (untested in the past) is getting 1/2 the throughput for random data writes 
 as
 the others!!


 Is it definitely LTO3 and definitely using LTO3 media? LTO2 was about half 
 the speed, including using LTO2 media in an LTO3 drive.

 James


Yes, all 4 drives are HP Ultrium 3 drives.
And the same LTO3 bacula volume was used in all 4 testing runs today.
All drives are connected via 2Gb fiber.
All tests were done independently of each other, with no other activity 
on the backup server during the time of the testing.


Stephen
-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu    215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] LTO3 tape capacity (variable?)

2012-10-01 Thread Stephen Thompson


On 10/1/12 4:06 PM, Alan Brown wrote:
 On 01/10/12 23:38, Stephen Thompson wrote:

 More importantly, I realized that my testing 6 months ago was not on
 all 4 of my drives, but only 2 of them.  Today, I discovered one of my
 drives (untested in the past) is getting 1/2 the throughput for random
 data writes as the others!!

 smartctl -a /dev/sg(drive) will tell you a lot

 Put a cleaning tape in it







Cleaning tape did not improve results.

I see some errors in the counter log on the problem drive, but I see 
even more errors on another drive which isn't having a throughput 
problem (specifically, SL500 Drive 1 has the lower throughput, but C4 
Drive 1 actually has a higher error count).



SL500 Drive 0 (~60MB/s random data throughput)
==============================================
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0          0.000           0
write:         0        0         0         0          0          0.000           0


SL500 Drive 1 (~30MB/s random data throughput)
==============================================
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0          0.000           0
write:     10454        0         0         0     821389          0.000           0


C4 Drive 0 (~60MB/s random data throughput)
===========================================
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          2        0         0         0          2          0.000           0
write:         0        0         0         0          0          0.000           0


C4 Drive 1 (~60MB/s random data throughput)
===========================================
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          2        0         0         0          2          0.000           0
write:     18961        0         0         0      48261          0.000           0
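(For anyone pulling these counters: assuming lsscsi is available to find the
/dev/sg node Alan mentioned, something like:

    lsscsi -g | grep tape    # map each drive to its /dev/sgN node
    smartctl -a /dev/sg5     # full drive health plus the error counter log

where sg5 is just an example.)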




Stephen
-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] LTO3 tape capacity (variable?)

2012-10-01 Thread Stephen Thompson



Correction: the non-problem drive has a higher ECC fast error count, 
but the problem drive has a significantly higher Correction algorithm 
invocations count.


On 10/1/12 5:33 PM, Stephen Thompson wrote:


 On 10/1/12 4:06 PM, Alan Brown wrote:
 On 01/10/12 23:38, Stephen Thompson wrote:

 More importantly, I realized that my testing 6 months ago was not on
 all 4 of my drives, but only 2 of them.  Today, I discovered one of my
 drives (untested in the past) is getting 1/2 the throughput for random
 data writes as the others!!

 smartctl -a /dev/sg(drive) will tell you a lot

 Put a cleaning tape in it







 Cleaning tape did not improve results.

 I see some errors in the counter log on the problem drive, but I see
 even more errors on another drive which isn't having a throughput
 problem (specifically SL500 Drive 1 is the lower throughput, but C4
 Drive 1 actually has a higher error count).



 SL500 Drive 0 (~60MB/s random data throughput)
 ==============================================
 Error counter log:
            Errors Corrected by           Total   Correction     Gigabytes    Total
                ECC          rereads/    errors   algorithm      processed    uncorrected
            fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
 read:          0        0         0         0          0          0.000           0
 write:         0        0         0         0          0          0.000           0


 SL500 Drive 1 (~30MB/s random data throughput)
 ==============================================
 Error counter log:
            Errors Corrected by           Total   Correction     Gigabytes    Total
                ECC          rereads/    errors   algorithm      processed    uncorrected
            fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
 read:          0        0         0         0          0          0.000           0
 write:     10454        0         0         0     821389          0.000           0


 C4 Drive 0 (~60MB/s random data throughput)
 ===========================================
 Error counter log:
            Errors Corrected by           Total   Correction     Gigabytes    Total
                ECC          rereads/    errors   algorithm      processed    uncorrected
            fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
 read:          2        0         0         0          2          0.000           0
 write:         0        0         0         0          0          0.000           0


 C4 Drive 1 (~60MB/s random data throughput)
 ===========================================
 Error counter log:
            Errors Corrected by           Total   Correction     Gigabytes    Total
                ECC          rereads/    errors   algorithm      processed    uncorrected
            fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
 read:          2        0         0         0          2          0.000           0
 write:     18961        0         0         0      48261          0.000           0




 Stephen


-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] LTO3 tape capacity (variable?)

2012-09-27 Thread Stephen Thompson
On 09/25/2012 10:43 AM, Alan Brown wrote:
 On 25/09/12 17:43, Stephen Thompson wrote:
 Our Sun/Oracle service engineer claims that our drives do not require
 cleaning tapes.  Does that sound legit?

 In general: true (as in, Don't do it as a scheduled item), but all LTO
 drives require cleaning tapes from time to time and sometimes benefit
 from loading one even if the clean light isn't on. It primarily
 depends on the cleanliness of the room where the drive is.

 Our throughput is pretty reasonable for our hardware -- we do use disk
 staging and get something like 60 MB/s to tape.

 60 MB/s is _slow_ for LTO3. You need to take a serious look at what
 you're using as stage disk and consider using a RAID-0 array of SSDs in
 order to keep up.

 Lastly, the tapes that get 200 vs 800 are from the same batch of tapes,
 same number of uses, and used by the same pair of SL500 drives.  That's
 primarily why I wondered if it could be data dependent (or a bacula bug).


 What happens if you mark the volumes as append and put them back in
 the library?


I haven't had a lot of time to look into this today, but I did this quick 
test and it immediately marked the volume Full again.
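(The quick test, for the record, was essentially just re-marking the volume
in bconsole and letting a job at it -- something like:

    update volume=FB0763 volstatus=Append

with the volume name as in the log that follows.)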

27-Sep 14:20 sd-SL500 JobId 260069: Volume FB0763 previously written, 
moving to end of data.
27-Sep 14:21 sd-SL500 JobId 260069: Ready to append to end of Volume 
FB0763 at file=110.
27-Sep 14:21 sd-SL500 JobId 260069: Spooling data ...
27-Sep 14:21 sd-SL500 JobId 260069: Job write elapsed time = 00:00:01, 
Transfer rate = 759.3 K Bytes/second
27-Sep 14:21 sd-SL500 JobId 260069: Committing spooled data to Volume 
FB0763. Despooling 762,358 bytes ...
27-Sep 14:21 sd-SL500 JobId 260069: End of Volume FB0763 at 110:1 on 
device SL500-Drive-0 (/dev/SL500-Drive-0). Write of 262144 bytes got -1.
27-Sep 14:21 sd-SL500 JobId 260069: Re-read of last block succeeded.
27-Sep 14:21 sd-SL500 JobId 260069: End of medium on Volume FB0763 
Bytes=219,730,936,832 Blocks=838,207 at 27-Sep-2012 14:21.
27-Sep 14:21 sd-SL500 JobId 260069: 3307 Issuing autochanger "unload 
slot 36, drive 0" command.
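(A sanity check on those numbers: 219,730,936,832 bytes over 838,207 blocks
works out to ~262,144 bytes per block, i.e. the drive was fed full 256KB
blocks right up to end-of-medium, so the shortfall isn't Bacula writing
short blocks.)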





 I've seen transient scsi errors result in tapes being marked as full.

 What does smartctl show for the drive and tape in question? (run this
 against the /dev/sg of the tape drive)





-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] LTO3 tape capacity (variable?)

2012-09-27 Thread Stephen Thompson


On 9/27/12 6:17 PM, Alan Brown wrote:
 On 27/09/12 22:25, Stephen Thompson wrote:
 What happens if you mark the volumes as append and put them back in
 the library?



 I haven't had a lot of time to look into this today, but I do this
 quick test and it immediately marks the volume Full again.


 Then it really is full and the rest is down to overheads.

 Consider using larger block sizes.




Aren't these considered reasonable settings for LTO3?

   Maximum block size = 262144   # 256kb
   Maximum File Size = 2gb



thanks for the help!
Stephen
-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] LTO3 tape capacity (variable?)

2012-09-26 Thread Stephen Thompson
On 09/25/2012 02:29 PM, Cejka Rudolf wrote:
 Stephen Thompson wrote (2012/09/25):
 The tape in question have only been used once or twice.

 Do you mean just one or two drive loads and unloads?


Yes, I mean the tapes have only been in a drive once or twice, possibly 
for a dozen sequential jobs while in the drive, but only in and out of 
the drive once or twice.

I have seen this 200-300Gb capacity on new tapes as well as used.

I see it in both my SL500 library as well as my C4 library, which is a 
combined 4 LTO3 drives (2 in each library).


 The library is a StorageTek whose SLConsole reports no media (or drive)
 errors, though I will look into those linux-based tools.

 There are several types of errors, recoverable and non-recoverable, and
 I'm afraid that you see just non-recoverable, but it is too late to see
 them.

 Our Sun/Oracle service engineer claims that our drives do not require
 cleaning tapes.  Does that sound legit?

 If you are interested, you can study
 http://www.tarconis.com/documentos/LTO_Cleaning_wp.pdf ;o)
 So in HP's case, it is possible to agree. However, you still
 have to have at least one cleaning cartridge prepared ;o)

 Our throughput is pretty reasonable for our hardware -- we do use disk
 staging and get something like 60 MB/s to tape.

 HP LTO-3 drives can slow their physical speed down to 27 MB/s, IBM LTO-3
 to 40 MB/s. Native speed is 80 MB/s, but all these speeds are after
 compression. If you have 60 MB/s before compression and there are
 some places with somewhat better compression than 2:1, then you are not
 able to feed an HP LTO-3. For an IBM drive, it is sufficient to have places
 with just 2:1 to need repositions.
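(Putting numbers on that: at the native 80 MB/s physical rate, 2:1-compressible
data is consumed at 160 MB/s pre-compression, while a 60 MB/s feed of 2:1 data
keeps the heads busy at only 30 MB/s physical -- above HP's 27 MB/s minimum
step, but below IBM's 40 MB/s floor, hence the repositions.)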

 Lastly, the tapes that get 200 vs 800 are from the same batch of tapes,
 same number of uses, and used by the same pair of SL500 drives.  That's
 primarily why I wondered if it could be data dependent (or a bacula bug).

 And what about the reason to switch to the next tape? Do you have something
 like this in your reports?

 22-Sep 02:22 backup-sd JobId 74990: End of Volume 1 at 95:46412 on device 
 drive0 (/dev/nsa0). Write of 65536 bytes got 0.
 22-Sep 02:22 backup-sd JobId 74990: Re-read of last block succeeded.
 22-Sep 02:22 backup-sd JobId 74990: End of medium on Volume 1 
 Bytes=381,238,317,056 Blocks=5,817,238 at 22-Sep-2012 02:22.


Here's an example of a tape that had one job and only wrote ~278Gb to 
the tape:

10-Sep 10:08 sd-SL500 JobId 256773: Recycled volume FB0095 on device 
SL500-Drive-1 (/dev/SL500-Drive-1), all previous data lost.
10-Sep 10:08 sd-SL500 JobId 256773: New volume FB0095 mounted on 
device SL500-Drive-1 (/dev/SL500-Drive-1) at 10-Sep-2012 10:08.
10-Sep 13:02 sd-SL500 JobId 256773: End of Volume FB0095 at 149:5906 
on device SL500-Drive-1 (/dev/SL500-Drive-1). Write of 262144 bytes 
got -1.
10-Sep 13:02 sd-SL500 JobId 256773: Re-read of last block succeeded.
10-Sep 13:02 sd-SL500 JobId 256773: End of medium on Volume FB0095 
Bytes=299,532,813,312 Blocks=1,142,627 at 10-Sep-2012 13:02.


 Don't you use any of the following in your bacula configuration?
  UseVolumeOnce
  Maximum Volume Jobs
  Maximum Volume Bytes
  Volume Use Duration


No, none of those are configured.


Stephen
-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] LTO3 tape capacity (variable?)

2012-09-26 Thread Stephen Thompson
On 09/26/2012 02:35 PM, Stephen Thompson wrote:
 On 09/25/2012 02:29 PM, Cejka Rudolf wrote:
 Stephen Thompson wrote (2012/09/25):
 The tape in question have only been used once or twice.

 Do you mean just one or two drive loads and unloads?


 Yes, I mean the tapes have only been in a drive once or twice, possibly
 for a dozen sequential jobs while in the drive, but only in and out of
 the drive once or twice.

 I have seen this 200-300Gb capacity on new tapes as well as used.


I think I pointed this out before, but I also have used and new tapes 
with 400-800Gb on them.  It seems really hit or miss, though the tapes 
with 400Gb or less are probably a third of my tapes; the other two-thirds 
are above 400Gb.



 I see it in both my SL500 library as well as my C4 library, which is a
 combined 4 LTO3 drives (2 in each library).


 The library is a StorageTek whose SLConsole reports no media (or drive)
 errors, though I will look into those linux-based tools.

 There are several types of errors, recoverable and non-recoverable, and
 I'm afraid that you see just non-recoverable, but it is too late to see
 them.

 Our Sun/Oracle service engineer claims that our drives do not require
 cleaning tapes.  Does that sound legit?

 If you are interested, you can study
 http://www.tarconis.com/documentos/LTO_Cleaning_wp.pdf ;o)
 So in HP's case, it is possible to agree. However, you still
 have to have at least one cleaning cartridge prepared ;o)

 Our throughput is pretty reasonable for our hardware -- we do use disk
 staging and get something like 60 MB/s to tape.

 HP LTO-3 drives can slow their physical speed down to 27 MB/s, IBM LTO-3
 to 40 MB/s. Native speed is 80 MB/s, but all these speeds are after
 compression. If you have 60 MB/s before compression and there are
 some places with somewhat better compression than 2:1, then you are not
 able to feed an HP LTO-3. For an IBM drive, it is sufficient to have places
 with just 2:1 to need repositions.

 Lastly, the tapes that get 200 vs 800 are from the same batch of tapes,
 same number of uses, and used by the same pair of SL500 drives.  That's
 primarily why I wondered if it could be data dependent (or a bacula bug).

 And what about the reason to switch to the next tape? Do you have something
 like this in your reports?

 22-Sep 02:22 backup-sd JobId 74990: End of Volume 1 at 95:46412 on device 
 drive0 (/dev/nsa0). Write of 65536 bytes got 0.
 22-Sep 02:22 backup-sd JobId 74990: Re-read of last block succeeded.
 22-Sep 02:22 backup-sd JobId 74990: End of medium on Volume 1 
 Bytes=381,238,317,056 Blocks=5,817,238 at 22-Sep-2012 02:22.


 Here's an example of a tape that had one job and only wrote ~278Gb to
 the tape:

 10-Sep 10:08 sd-SL500 JobId 256773: Recycled volume FB0095 on device
 SL500-Drive-1 (/dev/SL500-Drive-1), all previous data lost.
 10-Sep 10:08 sd-SL500 JobId 256773: New volume FB0095 mounted on
 device SL500-Drive-1 (/dev/SL500-Drive-1) at 10-Sep-2012 10:08.
 10-Sep 13:02 sd-SL500 JobId 256773: End of Volume FB0095 at 149:5906
 on device SL500-Drive-1 (/dev/SL500-Drive-1). Write of 262144 bytes
 got -1.
 10-Sep 13:02 sd-SL500 JobId 256773: Re-read of last block succeeded.
 10-Sep 13:02 sd-SL500 JobId 256773: End of medium on Volume FB0095
 Bytes=299,532,813,312 Blocks=1,142,627 at 10-Sep-2012 13:02.


 Don't you use any of the following in your bacula configuration?
   UseVolumeOnce
   Maximum Volume Jobs
   Maximum Volume Bytes
   Volume Use Duration


 No, none of those are configured.


 Stephen



-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] LTO3 tape capacity (variable?)

2012-09-25 Thread Stephen Thompson



Thanks everyone for the suggestions, they at least give me somewhere to 
look, as I was running low on ideas.


More info...

The tape in question have only been used once or twice.

The library is a StorageTek whose SLConsole reports no media (or drive) 
errors, though I will look into those linux-based tools.

Our Sun/Oracle service engineer claims that our drives do not require 
cleaning tapes.  Does that sound legit?

Our throughput is pretty reasonable for our hardware -- we do use disk 
staging and get something like 60 MB/s to tape.

Lastly, the tapes that get 200 vs 800 are from the same batch of tapes, 
same number of uses, and used by the same pair of SL500 drives.  That's 
primarily why I wondered if it could be data dependent (or a bacula bug).


thanks!
Stephen


On 09/25/12 02:29, Cejka Rudolf wrote:
 We've been using LTO3 tapes with bacula for a few years now.  Recently I've
 noticed how variable our tape capacity is, ranging from 200-800 Gb.
 Is that strictly governed by the compressibility of the actual data being
 backed up?

 Hello,
 the lower bound of 200 GB on 400 GB LTO-3 tapes is not possible due
 to the drive compression, because the drive always checks whether the
 compressed data are shorter than the original; otherwise, it writes the
 data uncompressed. So, in all cases, you should see at least
 400,000,000,000 bytes written on all tapes.

 Or is there some chance that bacula isn't squeezing as much
 onto my tapes as I would expect? 200Gb is not very much!

 In bacula, look mainly for the reasons why there is just 200 GB written.
 If the tape is full, think about these:

 - Worn tapes. Typical tape service life is quoted as 200 full cycles.
   However, read
   http://www.xma4govt.co.uk/Libraries/Manufacturer/ultriumwhitepaper_EEE.sflb
   where they experienced problems with some tapes after only
   30 cycles! How many cycles might your tapes have?

 - Do you use disk staging, so that tape writes are done at full speed?
   Do you have good disk staging? Considering SSDs for staging
   is very wise. If the data rate is lower than 1/3 to 1/2 of the native tape
   speed (based on drive vendor, HP or IBM), then the drive has to perform
   tape repositions, which means excessive drive and tape wear.

   My experience is that even HW RAID-0 with four 10k disks may not
   be sufficient: when there are data writes and reads in parallel,
   it cannot put 80 MB/s to the drive, typically just 50-70 MB/s,
   which is still acceptable for LTO-3, but not good.

   Currently, I have 4 x 450 GB SSDs in HW RAID-0 at over 1500 MB/s, with no
   problem running writes and reads in parallel, and only now do I trust
   that it is really sufficient for >= LTO-3 staging while keeping drive and
   tape wear to a minimum.

 - Dirty heads. You can force a cleaning cycle, but then return to the
   two points above and the other suggestions, like using a monitoring
   tool such as ltt on Linux (I have a home-made reporting tool using
   camcontrol on FreeBSD), to make sure that your problem really is
   worn tapes, or something else.

 Best regards.



-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] LTO3 tape capacity (variable?)

2012-09-25 Thread Stephen Thompson
On 09/25/2012 10:43 AM, Alan Brown wrote:
 On 25/09/12 17:43, Stephen Thompson wrote:
 Our Sun/Oracle service engineer claims that our drives do not require
 cleaning tapes.  Does that sound legit?

 In general: true (as in, Don't do it as a scheduled item), but all LTO
 drives require cleaning tapes from time to time and sometimes benefit
 from loading one even if the clean light isn't on. It primarily
 depends on the cleanliness of the room where the drive is.

 Our throughput is pretty reasonable for our hardware -- we do use disk
 staging and get something like 60 MB/s to tape.

 60 MB/s is _slow_ for LTO3. You need to take a serious look at what
 you're using as stage disk and consider using a RAID-0 array of SSDs in
 order to keep up.


Why do you say that's slow when the max native speed appears to be 80 MB/s?

http://en.wikipedia.org/wiki/Linear_Tape-Open




 Lastly, the tapes that get 200 vs 800 are from the same batch of tapes,
 same number of uses, and used by the same pair of SL500 drives.  That's
 primarily why I wondered if it could be data dependent (or a bacula bug).


 What happens if you mark the volumes as append and put them back in
 the library?

 I've seen transient scsi errors result in tapes being marked as full.

 What does smartctl show for the drive and tape in question? (run this
 against the /dev/sg of the tape drive)





-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] LTO3 tape capacity (variable?)

2012-09-25 Thread Stephen Thompson
On 09/25/2012 11:17 AM, Konstantin Khomoutov wrote:
 On Tue, 25 Sep 2012 11:00:07 -0700
 Stephen Thompson step...@seismo.berkeley.edu wrote:

 60 MB/s is _slow_ for LTO3. You need to take a serious look at what
 you're using as stage disk and consider using a RAID-0 array of SSDs
 in order to keep up.
 Why do you say that's slow when the max speed appears to be 80 MB/s?
 http://en.wikipedia.org/wiki/Linear_Tape-Open
 It's quite logical: to not starve the consumer, the producer
 should be at least as fast or faster, so you have to provide at least
 an 80 MB/s sustained read rate from your spooling media to be sure the
 tape drive is kept busy.


No, I mean, there's slow and there's __SLOW__.  He seemed to be 
indicating that it was unacceptably slow.  I understand it's not optimal.

Stephen
-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


[Bacula-users] LTO3 tape capacity (variable?)

2012-09-24 Thread Stephen Thompson

Hello all,

This is likely not a bacula question, but on the chance that it is, or 
that there's relevant experience on this list, I figured I would ask.

We've been using LTO3 tapes with bacula for a few years now.  Recently 
I've noticed how variable our tape capacity is, ranging from 200-800 Gb. 
Is that strictly governed by the compressibility of the actual data 
being backed up?  Or is there some chance that bacula isn't squeezing as 
much onto my tapes as I would expect?

200Gb is not very much!

thanks,
Stephen
-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] LTO3 tape capacity (variable?)

2012-09-24 Thread Stephen Thompson



Thanks for the info, John.

Is there anyone else in the bacula community with LTO3's seeing this 
behaviour?  I don't believe (but am not 100% sure) that I'm having any 
hardware-related issues.

Not sure what to make of this.  About 25% of the tapes in a monthly run (70 
tapes) are under the 400Gb native capacity, but the other 75% are above it, 
some even hitting the 800Gb ceiling.
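One way to survey that spread straight from the catalog (a sketch; columns
as in the stock Bacula Media table):

    SELECT VolumeName, VolStatus, VolJobs, VolErrors,
           ROUND(VolBytes/1000000000) AS GB
    FROM Media
    WHERE MediaType = 'LTO-3'
    ORDER BY VolBytes;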

Stephen



On 09/24/2012 12:02 PM, John Drescher wrote:
 This is not likely a bacula questions, but in the chance that it is, or
 the experience on this list, I figured I would ask.

 We've been using LTO3 tapes with bacula for a few years now.  Recently
 I've noticed how variable our tape capacity is, ranging from 200-800 Gb.
 Is that strictly governed by the compressibility of the actual data
 being backed up?  Or is there some chance that bacula isn't squeezing as
 much onto my tapes as I would expect?

 200Gb is not very much!

 These tapes are 400GB native. If you get substantially less than that
 you have a configuration problem (you set limits on the volume size or
 duration) or a hardware problem. Compression should be handled
 entirely and automatically by the tape drive. Bacula does not enable
 or disable hardware compression it just passes the data to the drive
 and writes as much as it can up until it hits its first hardware
 error. At that point bacula calls the tape full and verifies that it
 can read the last block. I believe if it can't read the last block
 this block will be the first block written on the next volume.

 John



-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Bacula 5.2.11: Director crashes

2012-09-12 Thread Stephen Thompson


We updated our bacula server from 5.2.10 to 5.2.11 earlier today.
A few hours later the bacula-dir crashed.  This is on RedHat 6.3.

No traceback generated.


Stephen




On 09/12/2012 05:45 AM, Uwe Schuerkamp wrote:
 Hi folks,

 I updated one of our bacula servers to 5.2.11 today (CentOS 6.x,
 compiled from source), but sadly the director crashes after a couple
 of copy jobs which were due this morning. Any idea how to go about
 debugging the issue?

 The server has a dir-bactrace file, but it appears to be empty, also
 the last couple of lines in the log file don't give away much beyond
 the selected jobids for copying.

 All the best,

 Uwe



-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] BAT and qt vesrion

2012-08-13 Thread Stephen Thompson

You can also use the depkgs-qt package from the bacula website.
It contains the necessary Qt, which you can statically link without 
installing a non-Red Hat Qt on your system.

Stephen



On 08/09/2012 12:55 PM, Thomas Lohman wrote:
 I downloaded the latest stable Qt open source version (4.8.2 at the
 time) and built it before building Bacula 5.2.10.  Bat seems to work
 fine with it.  If you do this, just be aware that the first time you
 build it, it will probably find the older 4.6.x RH Qt libraries and
 embed their location in the shared library path, so when you go to use
 it, it won't work.  The first time I built it, I told it to explicitly
 look in its own source tree for its libraries (by setting LDFLAGS),
 installed that version, and then re-built it, telling it to now look
 in the install directory.
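A sketch of that two-pass trick (Qt paths purely illustrative):

    # pass 1: link bat against the Qt build tree
    LDFLAGS="-L/path/to/qt-4.8.2/lib -Wl,-rpath,/path/to/qt-4.8.2/lib" \
        ./configure --enable-bat ...
    make && make install
    # pass 2: rebuild with the rpath pointing at the installed Qt location
    LDFLAGS="-L/usr/local/qt/lib -Wl,-rpath,/usr/local/qt/lib" \
        ./configure --enable-bat ...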


 --tom

 I tried to compile bacula-5.2.10 with BAT on a RHEL6.2 server. I
 found that BAT did not get installed because it needs qt version
 4.7.4 or higher but RHEL6.2 has version qt-4.6.2-24 as the latest.  I
 would like to know what the others are doing about this issue?

 Uthra

 ___
 Bacula-users mailing list
 Bacula-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/bacula-users



-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] bacula confused about volumes

2012-08-05 Thread Stephen Thompson


We're seeing this with a lot more frequency, though we've changed no 
configuration.  Jobs are often left waiting an entire run in order to 
use a volume that's in use by the other drive within a two-drive changer.
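For what it's worth, the directive that governs this trade-off is Prefer
Mounted Volumes in the Job resource (default yes, i.e. wait for the drive
holding the wanted volume); a sketch, using a job name from the status
output below:

    Job {
      Name = "AAA"
      ...
      Prefer Mounted Volumes = no   # reserve the other drive and another
                                    # volume rather than wait
    }

though the manual has caveats about either setting with multi-drive changers.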

Stephen



On 7/25/12 7:38 AM, Stephen Thompson wrote:

 Hey all,

 I've been meaning to post about this for a while, but it comes up pretty
 rarely (maybe once every few months running hundreds of jobs a night).

 With an autochanger with 2 drives, each set to AutoSelect, it's possible
 for bacula to want the same volume in both drives at the same time,
 which creates an Operator Intervention situation.

 Here's an example where apparently previous jobs were using a particular
 volume in one drive and somehow jobs assigned to the other drives wanted
 the exact same volume, causing them to pause and require operator
 intervention.


 sd_C4 Version: 5.2.10 (28 June 2012) x86_64-unknown-linux-gnu redhat
 Enterprise release
 Daemon started 23-Jul-12 10:13. Jobs: run=295, running=3.
Heap: heap=135,168 smbytes=2,089,365 max_bytes=3,689,580 bufs=299
 max_bufs=396
Sizes: boffset_t=8 size_t=8 int32_t=4 int64_t=8 mode=0,0

 Running Jobs:
 Writing: Incremental Backup job AAA JobId=247971 Volume=IM0081
   pool=Incremental-Pool device=C4-Drive-0 (/dev/C4-Drive-0)
   spooling=0 despooling=0 despool_wait=0
   Files=0 Bytes=0 Bytes/sec=0
   FDReadSeqNo=6 in_msg=6 out_msg=4 fd=9
 Writing: Incremental Backup job BBB JobId=247973 Volume=IM0081
   pool=Incremental-Pool device=C4-Drive-0 (/dev/C4-Drive-0)
   spooling=0 despooling=0 despool_wait=0
   Files=0 Bytes=0 Bytes/sec=0
   FDReadSeqNo=6 in_msg=6 out_msg=4 fd=13
 Writing: Incremental Backup job CCC JobId=247975 Volume=IM0081
   pool=Incremental-Pool device=C4-Drive-0 (/dev/C4-Drive-0)
   spooling=0 despooling=0 despool_wait=0
   Files=0 Bytes=0 Bytes/sec=0
   FDReadSeqNo=6 in_msg=6 out_msg=4 fd=15
 Writing: Incremental Backup job DDD JobId=247976 Volume=IM0081
   pool=Incremental-Pool device=C4-Drive-0 (/dev/C4-Drive-0)
   spooling=0 despooling=0 despool_wait=0
   Files=0 Bytes=0 Bytes/sec=0
   FDReadSeqNo=6 in_msg=6 out_msg=4 fd=18
 

 Jobs waiting to reserve a drive:
 

 Terminated Jobs:
JobId  LevelFiles  Bytes   Status   FinishedName
 ===
 XXX
 

 Device status:
 Autochanger C4-changer with devices:
  C4-Drive-0 (/dev/C4-Drive-0)
  C4-Drive-1 (/dev/C4-Drive-1)
 Device C4-Drive-0 (/dev/C4-Drive-0) is not open.
   Device is BLOCKED waiting for mount of volume IM0081,
  Pool:Incremental-Pool
  Media type:  LTO-3
   Drive 0 is not loaded.
 Device C4-Drive-1 (/dev/C4-Drive-1) is mounted with:
   Volume:  IM0081
   Pool:Incremental-Pool
   Media type:  LTO-3
   Slot 32 is loaded in drive 1.
   Total Bytes=369,270,534,144 Blocks=1,408,808 Bytes/block=262,115
   Positioned at File=203 Block=0
 

 Used Volume status:
 IM0070 on device C4-Drive-1 (/dev/C4-Drive-1)
   Reader=0 writers=0 devres=0 volinuse=0
 IM0081 on device C4-Drive-0 (/dev/C4-Drive-0)
   Reader=0 writers=0 devres=4 volinuse=0
 



 Anyone else have this happen?
 Race condition?

 thanks,
 Stephen


-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Long running jobs and BackupCatalog

2012-08-02 Thread Stephen Thompson


The enterprise version may have a pause feature, but the open-source one 
does not.

We run a slave database server and make a daily dump from that, knowing 
that it will not preserve the records of still-running jobs; but since 
those jobs aren't complete when the dump begins, they wouldn't be useful 
records to have anyway (and we're willing to be a day behind on our 
backups if a disaster were to occur).
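A minimal sketch of that arrangement, assuming the catalog database has the
default name bacula:

    # on the replication slave; --single-transaction only gives a consistent
    # view with InnoDB -- for MyISAM, stop the slave SQL thread first
    mysqldump --single-transaction --quick bacula | gzip \
        > /backup/bacula-$(date +%F).sql.gz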

It's also possible to run a transactional engine on your master db and 
do a dump while jobs are running, but we found the dump times to be 
ridiculously high (like 12+ hours).  Our Catalog is something like 300Gb.

There are other options out there as well, like using a snapshot of your 
underlying filesystem, but, yeah, a pause feature sure would be nice for 
many, many reasons.
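The filesystem-snapshot route usually looks something like this LVM sketch
(volume names illustrative; the read lock is held only while the snapshot
is created):

    mysql> FLUSH TABLES WITH READ LOCK;   -- session 1, keep it open
    # from a shell: lvcreate --snapshot --size 10G --name catsnap /dev/vg0/mysql
    mysql> UNLOCK TABLES;
    # then mount the snapshot read-only and back it up at leisure
    mount -o ro /dev/vg0/catsnap /mnt/catsnap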

Stephen




On 8/2/12 6:36 AM, Clark, Patricia A. wrote:
 Because I have quite a few long running jobs, my BackupCatalog job is not 
 getting run more than once or twice per week.  I understand the potential 
 instability of backing up the catalog while there are running jobs.  Is there 
 anything in the bacula pipeline that would pause running jobs so that the 
 catalog could be backed up?  Say a snapshot capability?

 Patti Clark
 Information International Associates, Inc.
 Linux Administrator and subcontractor to:
 Research and Development Systems Support Oak Ridge National Laboratory


 ___
 Bacula-users mailing list
 Bacula-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/bacula-users


-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


[Bacula-users] bacula confused about volumes

2012-07-25 Thread Stephen Thompson

Hey all,

I've been meaning to post about this for a while, but it comes up pretty 
rarely (maybe once every few months running hundreds of jobs a night).

With an autochanger with 2 drives, each set to AutoSelect, it's possible 
for bacula to want the same volume in both drives at the same time, 
which creates an Operator Intervention situation.

Here's an example where apparently previous jobs were using a particular 
volume in one drive and somehow jobs assigned to the other drives wanted 
the exact same volume, causing them to pause and require operator 
intervention.


sd_C4 Version: 5.2.10 (28 June 2012) x86_64-unknown-linux-gnu redhat 
Enterprise release
Daemon started 23-Jul-12 10:13. Jobs: run=295, running=3.
  Heap: heap=135,168 smbytes=2,089,365 max_bytes=3,689,580 bufs=299 
max_bufs=396
  Sizes: boffset_t=8 size_t=8 int32_t=4 int64_t=8 mode=0,0

Running Jobs:
Writing: Incremental Backup job AAA JobId=247971 Volume=IM0081
 pool=Incremental-Pool device=C4-Drive-0 (/dev/C4-Drive-0)
 spooling=0 despooling=0 despool_wait=0
 Files=0 Bytes=0 Bytes/sec=0
 FDReadSeqNo=6 in_msg=6 out_msg=4 fd=9
Writing: Incremental Backup job BBB JobId=247973 Volume=IM0081
 pool=Incremental-Pool device=C4-Drive-0 (/dev/C4-Drive-0)
 spooling=0 despooling=0 despool_wait=0
 Files=0 Bytes=0 Bytes/sec=0
 FDReadSeqNo=6 in_msg=6 out_msg=4 fd=13
Writing: Incremental Backup job CCC JobId=247975 Volume=IM0081
 pool=Incremental-Pool device=C4-Drive-0 (/dev/C4-Drive-0)
 spooling=0 despooling=0 despool_wait=0
 Files=0 Bytes=0 Bytes/sec=0
 FDReadSeqNo=6 in_msg=6 out_msg=4 fd=15
Writing: Incremental Backup job DDD JobId=247976 Volume=IM0081
 pool=Incremental-Pool device=C4-Drive-0 (/dev/C4-Drive-0)
 spooling=0 despooling=0 despool_wait=0
 Files=0 Bytes=0 Bytes/sec=0
 FDReadSeqNo=6 in_msg=6 out_msg=4 fd=18


Jobs waiting to reserve a drive:


Terminated Jobs:
  JobId  LevelFiles  Bytes   Status   FinishedName
===
XXX


Device status:
Autochanger C4-changer with devices:
C4-Drive-0 (/dev/C4-Drive-0)
C4-Drive-1 (/dev/C4-Drive-1)
Device C4-Drive-0 (/dev/C4-Drive-0) is not open.
 Device is BLOCKED waiting for mount of volume IM0081,
Pool:Incremental-Pool
Media type:  LTO-3
 Drive 0 is not loaded.
Device C4-Drive-1 (/dev/C4-Drive-1) is mounted with:
 Volume:  IM0081
 Pool:Incremental-Pool
 Media type:  LTO-3
 Slot 32 is loaded in drive 1.
 Total Bytes=369,270,534,144 Blocks=1,408,808 Bytes/block=262,115
 Positioned at File=203 Block=0


Used Volume status:
IM0070 on device C4-Drive-1 (/dev/C4-Drive-1)
 Reader=0 writers=0 devres=0 volinuse=0
IM0081 on device C4-Drive-0 (/dev/C4-Drive-0)
 Reader=0 writers=0 devres=4 volinuse=0




Anyone else have this happen?
Race condition?

thanks,
Stephen
-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Fatal error: askdir.c:339 NULL Volume name. This shouldn't happen!!!

2012-07-15 Thread Stephen Thompson


Update.  We are still seeing this in 5.2.10 as well.
It seems to happen more often towards the beginning of a series of jobs, 
when a tape is first chosen (i.e. not when a job is directly using a 
tape that's already been chosen and loaded into a drive by a previous job).

Stephen



On 7/5/12 7:44 AM, Stephen Thompson wrote:


 Update.  We have seen the problem 2-3 times this past month running
 5.2.9 on Redhat 6.2 -- much less frequently than before, but still there.

 Stephen



 On 6/20/12 7:40 AM, Stephen Thompson wrote:


 Well, since we upgraded to 5.2.9 we have not seen the problem.
 Also when running 5.2.6 we were seeing it 2-3 times a week, during which
 we run hundreds of incrementals and several fulls per day.
 The error happened both with fulls and incrementals (which we have in
 two different LTO3 libraries).  There was nothing amiss with our catalog
 or volumes, or at least nothing obvious.  The error occurred when
 attempting to use different volumes (mostly previously used ones,
 including recycled), but those same volumes were successful for other
 jobs that attempted to use them.  Lastly, it wasn't reproducible; like I
 said, it happened 2-3 times out of several hundred jobs, but it kept
 happening over the course of a month or two while we ran 5.2.6 on RedHat
 6.2.

 Here was our config for 5.2.6


 PATH=/usr/lib64/qt4/bin:$PATH
 BHOME=/home/bacula
 EMAIL=bac...@seismo.berkeley.edu

 env CFLAGS='-g -O2' \
   ./configure \
     --prefix=$BHOME \
     --sbindir=$BHOME/bin \
     --sysconfdir=$BHOME/conf \
     --with-working-dir=$BHOME/work \
     --with-bsrdir=$BHOME/log \
     --with-logdir=$BHOME/log \
     --with-pid-dir=/var/run \
     --with-subsys-dir=/var/run \
     --with-dump-email=$EMAIL \
     --with-job-email=$EMAIL \
     --with-mysql \
     --with-dir-user=bacula \
     --with-dir-group=bacula \
     --with-sd-user=bacula \
     --with-sd-group=bacula \
     --with-openssl \
     --with-tcp-wrappers \
     --enable-smartalloc \
     --with-readline=/usr/include/readline \
     --disable-conio \
     --enable-bat \
     | tee configure.out




 On 6/20/12 7:23 AM, Igor Blazevic wrote:
 On 18.06.2012 16:26, Stephen Thompson wrote:


 hello,

 Hello:)


 Anyone run into this error before?

 We hadn't until we upgraded our bacula server from Centos 5.8 to Redhat
 6.2, after which we of course had to recompile bacula.  However, we used
 the same source, version, and options, the exception being that we added
 readline for improved bconsole functionality.

 Can you post your config options, please? I've compiled versions 5.0.3 and
 5.2.6 on RHEL 6.2 with the following options:

 CFLAGS=-g -Wall ./configure \
  --sysconfdir=/etc/bacula \
  --with-dir-user=bacula \
  --with-dir-group=bacula \
  --with-sd-user=bacula \
  --with-sd-group=bacula \
  --with-fd-user=root \
  --with-fd-group=root \
  --with-dir-password=somepasswd \
  --with-fd-password=somepasswd \
  --with-sd-password=somepasswd \
  --with-mon-dir-password=somepasswd \
  --with-mon-fd-password=somepasswd \
  --with-mon-sd-password=somepasswd \
  --with-working-dir=/var/lib/bacula \
  --with-scriptdir=/etc/bacula/scripts \
  --with-smtp-host=localhost \
  --with-subsys-dir=/var/lib/bacula/lock/subsys \
  --with-pid-dir=/var/lib/bacula/run \
  --enable-largefile \
  --disable-tray-monitor \
  --enable-build-dird  \
  --enable-build-stored \
  --with-openssl \
  --with-tcp-wrappers \
  --with-python \
  --enable-smartalloc \
  --with-x \
  --enable-bat \
  --disable-libtool \
  --with-postgresql \
  --with-readline=/usr/include/readline \
  --disable-conio

 and can attest that everything works just fine, although I only used NEW
 volumes with it. Maybe there is something amiss with your catalog or
 volume media?





 --

 Igor Blažević





-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760



___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] bacula jobs use volumes from the wrong pool - bug?

2012-07-10 Thread Stephen Thompson
On 07/10/2012 10:53 AM, Martin Simmons wrote:
 On Mon, 09 Jul 2012 12:55:14 -0700, Stephen Thompson said:

 On 07/09/12 11:37, Martin Simmons wrote:
 On Fri, 06 Jul 2012 11:12:35 -0700, Stephen Thompson said:

 On 07/06/2012 11:01 AM, Martin Simmons wrote:
 On Thu, 05 Jul 2012 11:35:15 -0700, Stephen Thompson said:

 Hello again,

 Here's something even stranger...  Another Full job logs that it's
 written to a volume in the Full pool (FB0956), but then the status
 output of the job lists a volume in the Incremental pool (IM0093).  This
 Incremental volume was never even mentioned in the log as a volume to
 which the job despooled.

 It could be a database problem (the volumes listed in the status output 
 come
 from a query).  What is the output of the sql commands below?

 SELECT VolumeName,JobMedia.* FROM JobMedia,Media WHERE 
 JobMedia.JobId=242323 AND JobMedia.MediaId=Media.MediaId;

 SELECT MediaId,VolumeName FROM Media WHERE Media.VolumeName in 
 ('IM0093','FB0956');


 Looks like it did in fact write to the Incremental tape IM0093 instead
 of the requested Full tape BUT logged that it wrote to the Full tape
 FB0956.  This raises two questions: 1) Why is it writing to a tape in
 another pool? and 2) Why is it logging that it wrote to a different tape
 than it did?

 You could verify that IM0093 contains the data by using bls -j with the tape
 loaded (but not mounted in Bacula).

 It looks like you have concurrent jobs (non-consecutive JobMediaId values).
 Was another job trying to use IM0093?  Maybe IM0093 was in another drive and
 Bacula mixed up the drives somehow?


 Yes, I believe that FB0956 was in one drive and IM0093 in the other,
 though I do not understand how bacula 'mixed up' which volume to use, or
 which drive a particular volume was in.

 Not sure how closely related this is, but I've seen cases occasionally
 where bacula will say that it cannot mount a certain volume in Drive0
 and requires user intervention, only to find that the volume requested
 is already mounted and in use in Drive1 by other jobs.  So it is
 possible for bacula either to lose track of which drive a volume is in
 or to not be sure if a volume is already in use.

 I did a partial restore of the job and it did in fact load and read off
 IM0093 successfully.  So in some sense I know what happened, I just
 don't know why it happened or how to prevent it (other than isolating
 jobs, but that defeats the point of concurrency).

 You could try upgrading to 5.2.10.  If that doesn't fix it, then reporting it
 in the bug tracker might be the next step
 (http://www.bacula.org/en/?page=bugs).


Already upgraded.  We'll see if it happens again.

thanks,
Stephen



 __Martin

 ___
 Bacula-users mailing list
 Bacula-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/bacula-users



-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760



___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] bacula jobs use volumes from the wrong pool - bug?

2012-07-09 Thread Stephen Thompson
On 07/09/12 11:37, Martin Simmons wrote:
 On Fri, 06 Jul 2012 11:12:35 -0700, Stephen Thompson said:

 On 07/06/2012 11:01 AM, Martin Simmons wrote:
 On Thu, 05 Jul 2012 11:35:15 -0700, Stephen Thompson said:

 Hello again,

 Here's something even stranger...  Another Full job logs that it's
 written to a volume in the Full pool (FB0956), but then the status
 output of the job lists a volume in the Incremental pool (IM0093).  This
 Incremental volume was never even mentioned in the log as a volume to
 which the job despooled.

 It could be a database problem (the volumes listed in the status output come
 from a query).  What is the output of the sql commands below?

 SELECT VolumeName,JobMedia.* FROM JobMedia,Media WHERE 
 JobMedia.JobId=242323 AND JobMedia.MediaId=Media.MediaId;

 SELECT MediaId,VolumeName FROM Media WHERE Media.VolumeName in 
 ('IM0093','FB0956');


 Looks like it did in fact write to the Incremental tape IM0093 instead
 of the requested Full tape BUT logged that it wrote to the Full tape
 FB0956.  This raises two questions: 1) Why is it writing to a tape in
 another pool? and 2) Why is it logging that it wrote to a different tape
 than it did?

 You could verify that IM0093 contains the data by using bls -j with the tape
 loaded (but not mounted in Bacula).
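A sketch of that bls invocation, with the volume, device, and config path as
they appear elsewhere in this digest (the drive must be idle or the SD
stopped):

    bls -j -V IM0093 -c /home/bacula/conf/bacula-sd.conf C4-Drive-1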

 It looks like you have concurrent jobs (non-consecutive JobMediaId values).
 Was another job trying to use IM0093?  Maybe IM0093 was in another drive and
 Bacula mixed up the drives somehow?


Yes, I believe that FB0956 was in one drive and IM0093 in the other, 
though I do not understand how bacula 'mixed up' which volume to use, or 
which drive a particular volume was in.

Not sure how closely related this is, but I've seen cases occasionally 
where bacula will say that it cannot mount a certain volume in Drive0 
and requires user intervention, only to find that the volume requested 
is already mounted and in use in Drive1 by other jobs.  So it is 
possible for bacula either to lose track of which drive a volume is in 
or to not be sure if a volume is already in use.

I did a partial restore of the job and it did in fact load and read off 
IM0093 successfully.  So in some sense I know what happened, I just 
don't know why it happened or how to prevent it (other than isolating 
jobs, but that defeats the point of concurrency).

Stephen




 __Martin

 ___
 Bacula-users mailing list
 Bacula-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/bacula-users



-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760



___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users

