[Bacula-users] Questions about spooling

2011-09-01 Thread frank_sg
Hi,

I am running some tests with spooling. I have created a spool filesystem for
testing purposes and decided to try quite a small one - just 10 GB. Well, I get
ugly performance (10 MB/s) with this one. So I assume that the spool
filesystem is too small - is this assumption right?

To get a bigger spool fs, there are some options:
1) What is better: a bigger spool fs or a faster spool fs? First option: a 3.6 TB
RAID0 with 12 direct-attached SAS disks; second option: 2 or 3 120 GB SSDs.
2) Does it make any sense to have a spool fs much bigger than the tape size?
(LTO4 - 800 GB, with compression up to 1.6 TB - so does it make sense to use a
fs > 1.6 TB?)
3) Especially with the SSDs - will I run into problems because of MTBF? Is anybody
using SSDs for a spool fs?

I have an autoloader with a SAS LTO4 drive and I would like to get the drive
to stream as fast as possible.

Thanks for your answers!
Regards, 
Frank

+--
|This was sent by f...@frankeseidel.de via Backup Central.
|Forward SPAM to ab...@backupcentral.com.
+--



--
Special Offer -- Download ArcSight Logger for FREE!
Finally, a world-class log management solution at an even better 
price-free! And you'll get a free Love Thy Logs t-shirt when you
download Logger. Secure your free ArcSight Logger TODAY!
http://p.sf.net/sfu/arcsisghtdev2dev
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Questions about spooling

2011-09-01 Thread Alexandre Chapellon

  
  
I have no answer, sorry, just more questions :)

How do you measure the performance?
Does the 10MB/s come from the Bacula job report?

Regards.

-- 
Alexandre Chapellon
Open source systems and network engineering.
Follow me on twitter: @alxgomz


Re: [Bacula-users] Questions about spooling

2011-09-01 Thread Marcello Romani
I'll add my 2 cents about spool filesystem size.
IME the spool space gets emptied after every job run, so you need enough 
space just for the biggest job run you'll ever have, not for the entire 
tape.
I'm not sure if this is a general rule or applies only to my setup though.

About performance: to test the spool space performance I'd run some
local disk benchmarks. Bacula job speed depends on so many other factors
that it's almost impossible to use it as a benchmark for a single
subsystem of the entire backup process.
In particular, look for bottlenecks at the client side: cpu, network and 
disk seek times.

HTH.

-- 
Marcello Romani



Re: [Bacula-users] Questions about spooling

2011-09-01 Thread Marcello Romani
I almost forgot: the spool filesystem must be able to provide a 
sustained transfer rate a little greater than the nominal tape write 
rate, otherwise the benefits of spooling on tape wear will be lost due 
to the tape having to stop to wait for data from spool area.
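
The rule above can be sketched as a quick sanity check in Python. The 120 MB/s figure is the native (uncompressed) LTO-4 streaming rate; the 10% headroom and the two sample disk rates are assumptions chosen for illustration, not measurements from this thread. With hardware compression the drive may ask for more than the native rate.

```python
# Native (uncompressed) LTO-4 streaming rate in MB/s.
LTO4_NATIVE_MB_S = 120.0


def can_keep_tape_streaming(spool_read_mb_s, tape_mb_s=LTO4_NATIVE_MB_S, headroom=1.1):
    """Return True if the spool area can feed the drive with some margin.

    headroom=1.1 stands for "a little greater than the nominal rate";
    the exact margin is an assumption.
    """
    return spool_read_mb_s >= tape_mb_s * headroom


# Hypothetical sustained sequential read rates, in MB/s.
for name, rate in [("single disk", 90.0), ("striped array", 400.0)]:
    print(f"{name}: keeps drive streaming = {can_keep_tape_streaming(rate)}")
```

The number to plug in is the sustained sequential read rate of the spool area, measured with a local disk benchmark while nothing else is running.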

HTH.

-- 
Marcello Romani



[Bacula-users] Questions about spooling

2011-09-01 Thread frank_sg
Thanks for replying.

@Alexandre: Yes, exactly, the 10 MB/s (average) comes from the Bacula job report.

@Marcello: No - every time the spool fs is full (or the maximum spool size per
job etc. is reached), the spool fs is despooled to tape. And that is where I
hope to get the advantage from: despooling at full speed to tape. You are
right, the fs has to be fast enough - which SSDs obviously would be. That is the
reason for the idea about them. A RAID0 with 12 direct-attached SAS disks
should also be fast enough - but which option is the one to choose?

Regards, 
Frank



Re: [Bacula-users] Questions about spooling

2011-09-01 Thread Marcello Romani
I have not investigated deeply into bacula spooling... After your 
clarification I re-read this:

http://www.bacula.org/en/dev-manual/main/main/Data_Spooling.html

but I'm still not sure how the whole thing works if one doesn't set a 
limit on spool area size (I didn't).

What puzzles me is that every e-mail I get from bacula (one e-mail per 
job run) contains lines similar to these:

Job write elapsed time = 00:01:35, Transfer rate = 4.455 M Bytes/second
Committing spooled data to Volume WednesdayTape. Despooling 423,960,356 bytes ...
Despooling elapsed time = 00:00:25, Transfer rate = 16.95 M Bytes/second
Sending spooled attrs to the Director. Despooling 2,392,621 bytes ...
[...]
Rate:   3405.1 KB/s
Software Compression:   None

Which makes me wonder whether each job is spooled independently...

About job speed: notice how the final speed reported for the whole job 
is very low compared to the despooling speed.
That figure can be misleading. I think one has to consider transfer rate 
and despooling speed separately.
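
The arithmetic behind those report lines can be checked with a short Python sketch. It uses only the byte count and the two elapsed times quoted above. Treating the job size as equal to the despooled byte count, and using 1 MB = 10^6 bytes, are assumptions, so the results only approximate what Bacula prints.

```python
despooled_bytes = 423_960_356
spool_seconds = 1 * 60 + 35   # "Job write elapsed time = 00:01:35"
despool_seconds = 25          # "Despooling elapsed time = 00:00:25"


def rate_mb_s(nbytes, seconds):
    """Average rate in MB/s, with 1 MB taken as 10**6 bytes (an assumption)."""
    return nbytes / seconds / 1e6


spool_rate = rate_mb_s(despooled_bytes, spool_seconds)      # about 4.46
despool_rate = rate_mb_s(despooled_bytes, despool_seconds)  # about 16.96
# The same bytes divided by the sum of both phases: about 3.53
combined_rate = rate_mb_s(despooled_bytes, spool_seconds + despool_seconds)

print(f"spooling:   {spool_rate:.2f} MB/s")
print(f"despooling: {despool_rate:.2f} MB/s")
print(f"combined:   {combined_rate:.2f} MB/s")
```

The combined figure lands close to the 3405.1 KB/s "Rate" line, which suggests the final rate covers the whole job duration (both phases plus some overhead) rather than either phase alone.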

-- 
Marcello Romani



Re: [Bacula-users] Questions about spooling

2011-09-01 Thread John Drescher
> I am doing some tests about spooling. I have created a spool filesystem for
> testing purposes. I decided to try a quite small one - just 10 GB. Well, I
> get an ugly performance (10MB/s) with this one. So I assume that the spool
> file system is too small - is this assumption right?


Remember that a job does not spool while it is despooling, so spooling
will not be as efficient as possible unless you run more than one job at a
time or have a spool size larger than your largest job. Since I use
concurrency, I use a 5 or 10 GB spool file, as I have been doing for
over 5 of the 8+ years I have used Bacula. On my LTO2 archive I
regularly see despool rates of 20 to 60 MB/s to a single LTO2 tape
drive. However, backup rates are highly dependent on the type of backup
and the percentage of small files. Incrementals will show low rates
because a large percentage of the backup time is spent searching for
the few files to back up instead of actually doing the backup. A full
will generally have a much higher backup rate because all files are
selected and no time is spent searching. That said, incrementals
generally finish much faster than fulls even though they show a much
lower rate.

On top of this source filesystem performance and database performance
are very important. Also make sure you are not using software
compression if you have a tape drive. Software compression even on the
fastest i7 with an SSD will most likely be less than 10MB/s while tape
drive HW compression can easily achieve 10 times that rate.

John



Re: [Bacula-users] Questions about spooling

2011-09-01 Thread Alexandre Chapellon



On 01/09/2011 13:38, frank_sg wrote:

> Thanks for replying.
>
> @Alexandre: Yes, exactly, the 10MB/s (average) come from bacula job report.

At least in my reports, the speed shown is the amount of data transferred
divided by the total time taken to complete the job.
For example, I had a job which kept waiting for an appendable volume for
almost 9 hours.
The backup itself completed quickly, as it was almost finished as soon as
a new volume was made available. However, the report told me the speed
was 4 kB/s... which is not representative at all, of course.
If I recall the documentation correctly, spooling allows data to be written
to disk before it is written to tape. This allows for better network
throughput. In any case, Bacula only considers the job complete once all
the data has been written to *tape* (or whatever other final destination).


From the documentation:
/When the backup has only been spooled to disk, it is not complete yet/

In other words, you should not trust the Bacula reports to measure
network/disk throughput.

Use something like iostat or iptraf, if that's what you care about.

Also from the documentation: "Of course, if your spool device is not large
enough to hold all the data from your File daemon, you may actually slow
down the overall backup."


But spooling is also useful when you have concurrent backups. In that
case, I guess your spool directory should be at least as large as the sum
of all your concurrent backups.


I really don't think SSDs would help here. Any SATA disk is capable of at
least, let's say, 50 MB/s throughput.

--
http://www.horoa.net

Alexandre Chapellon

Open source systems and network engineering.
Follow me on twitter: @alxgomz http://www.twitter.com/alxgomz



Re: [Bacula-users] Questions about spooling

2011-09-01 Thread Adrian Reyer
On Thu, Sep 01, 2011 at 02:36:08AM -0700, frank_sg wrote:
> To get a bigger spool fs, there might be some options:
> 1) What is better: bigger spool fs or faster spool fs? So first option: 3,6
> TB RAID0 with 12 SAS disks direct attached vs second option: 2 or 3 120 GB
> SSDs?
> 2) Does it make any sense to have a spool fs much bigger than the tape size?
> (LTO4 - 800GB, with compression up to 1,6 TB - so does it make sense to use a
> fs > 1,6 TB?)
> 3) Especially with the SSDs - will I run into problems because of MTBF? Is
> anybody using SSDs for spool fs?
> I have an autoloader with an SAS-LTO4 drive and I would like to get the drive
> to stream as fast as possible.

I have LTO-4 here as well. I see data rates between 45 MB/s and
95 MB/s, so there is no use in SAS or SSD; recent SATA disks are enough
to feed that, as you operate them basically in streaming mode.
We run 30-40 concurrent jobs into the same 1 TB spool directory, residing
on a RAID-5 of 4 recent 2 TB SATA2 disks. The limits are the tape drive and
the network, not the disks.
As for the right size of the spool: I set it big enough that a job never
needs to wait for the tape before it can continue spooling, as there are
a few very big clients and a few that are attached via low bandwidth,
but I limit a single job's spool to 800 GB.

Regards,
Adrian
-- 
LiHAS - Adrian Reyer - Hessenwiesenstraße 10 - D-70565 Stuttgart
Fon: +49 (7 11) 78 28 50 90 - Fax:  +49 (7 11) 78 28 50 91
Mail: li...@lihas.de - Web: http://lihas.de
Linux, Netzwerke, Consulting & Support - USt-ID: DE 227 816 626 Stuttgart



Re: [Bacula-users] Questions about spooling

2011-09-01 Thread mark . bergman
In the message dated: Thu, 01 Sep 2011 14:09:36 +0200,
The pithy ruminations from Marcello Romani on 
Re: [Bacula-users] Questions about spooling were:
> On 01/09/2011 13:38, frank_sg wrote:
> > Thanks for replying.
> >
> > @Alexandre: Yes, exactly, the 10MB/s (average) come from bacula job
> > report.
> >
> > @Marcello: No - every time the spool fs is full (or the maximum spool
> > size per job etc.) is reached the spool fs is despooled to tape. And
> > that is where I hope to get the advantage from: despooling with full

There is an advantage from despooling from a fast filesystem to tape--but
that advantage is more in terms of saving wear and tear on the tape
drive and media than in speed.

In my tests (and others [1][2]), because bacula stops collecting files
from clients as soon as it begins despooling to tape, the overall
throughput will never be as great as the filesystem is capable of
providing.

[1] http://copilotco.com/mail-archives/bacula-devel.2007/msg02642.html
[2] http://www.bacula.org/git/cgit.cgi/bacula/plain/bacula/projects?h=Branch-5.1

> > speed to tape. You are right, the fs has to be fast enough - which
> > SSDs would be obviously. That is the reason for the idea about them. A
> > RAID0 with 12 disks direct attached with SAS should also be fast
> > enough - but which option is the one to choose?

Either option is much more than fast enough.


> > Regards, Frank

[SNIP!]

> About job speed: notice how the final speed reported for the whole job
> is very low compared to the despooling speed.
> That figure can be misleading. I think one has to consider transfer rate
> and despooling speed separately.

Maybe. From the point of view of selecting components and tuning, yes,
it does make sense to consider the transfer rates (client to spool file
and spool file to tape) separately. However, the most important value
for any business is the cumulative time: how long it takes to back up a
client, not the individual parts that make up that time.

Here are some numbers from our environment:

[A] Throughput w/o spooling: ~22MB/s
    This represents the aggregate of the speed to read data
    from disk and write to tape, with shoe-shining, network
    congestion, disk contention, etc.

[B] Throughput to spool file: ~55MB/s
    This represents the aggregate of the speed to read data
    from disk (a 9TB logical volume made up from multiple
    RAID5 and RAID6 LUNs) via 4Gb/s fibre and write to the
    RAID-10 spool partition on 10K RPM SAS disks. This
    includes any SAN congestion, disk contention, etc.

[C] Throughput from disk spool file to LTO-4 tape: ~108MB/s
    This is the raw despooling speed from 10K SAS disks
    to the tape drive over 4Gb/s fibre.

[D] End-to-end throughput with spooling: ~27MB/s
    This is very disappointing... this is the overall
    throughput of [B] followed by [C] above. While eliminating
    shoe-shining is much better for the tape media and tape
    drive, the overall performance is almost identical to
    [A], while it should be close to [B]. The reason for
    the decrease in performance is that bacula stops all
    spooling as soon as it starts de-spooling.


In this case, the important value is in [D]...that determines the total time
for a backup. It really doesn't matter that the throughput from the spool disk
to the tape drive is 4x greater than the aggregate throughput, because
bacula's design prohibits better performance.
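
Why [D] ends up below [B] can be illustrated with an idealised model, sketched here in Python. It assumes a single job that strictly alternates between spooling at the [B] rate and despooling at the [C] rate, with no other overhead; the function name is made up for this example and the model is not taken from Bacula itself.

```python
def serial_throughput(spool_mb_s, despool_mb_s):
    """Effective MB/s when every megabyte is first spooled, then despooled.

    Each megabyte costs 1/spool + 1/despool seconds because the two
    phases never overlap in this model.
    """
    return 1.0 / (1.0 / spool_mb_s + 1.0 / despool_mb_s)


# [B] = ~55 MB/s to the spool file, [C] = ~108 MB/s from spool to tape.
model = serial_throughput(55.0, 108.0)
print(f"best case with alternating phases: {model:.1f} MB/s")  # about 36.4
```

Even this best case stays well under [B], and the measured ~27MB/s in [D] is lower still, so something beyond pure alternation also costs time in that environment. If spooling and despooling could overlap, the slower of the two phases (here [B]) would set the limit instead.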

In an ideal configuration, there could be multiple spool directories defined,
and bacula would open a new spool file in the next directory as soon as it
begins despooling.

Thanks,

Mark



Re: [Bacula-users] Questions about spooling

2011-09-01 Thread Adrian Reyer
On Thu, Sep 01, 2011 at 11:06:47AM -0400, mark.berg...@uphs.upenn.edu wrote:
> [A], while it should be close to [B]. The reason for
> the decrease in performance is that bacula stops all
> spooling as soon as it starts de-spooling.
> In an ideal configuration, there could be multiple spool directories defined,
> and bacula would open a new spool file in the next directory as soon as it
> begins despooling.

We run 1 spool directory and several concurrent jobs spool to it. While
one gets despooled, the others continue spooling. However, if you run
out of spool space, spooling is stopped on all jobs till a complete
despooling is done.
Best practice IMHO with big disks:
- SpoolSize large enough to hold most of your backups completely
- Spool Directory large enough to hold several jobs

In my case most backups are 50GB, incrementals often only 100MB. I use
a 1TB Spool Directory and a Spool Size of 800GB, large enough for all
but my biggest backup job. It works fine here.
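
To illustrate why the per-job spool limit matters, a small Python sketch counts how many spool/despool rounds a single job needs. The helper name is hypothetical, and the model assumes a job despools each time it fills its spool limit and once more at the end; the 50GB job, 800GB limit and 10GB limit are the sizes mentioned in this thread.

```python
import math


def despool_rounds(job_gb, job_spool_limit_gb):
    """Number of times a job must hand its spooled data to the tape drive."""
    return max(1, math.ceil(job_gb / job_spool_limit_gb))


print(despool_rounds(50, 800))  # typical 50GB job, 800GB limit: one round
print(despool_rounds(50, 10))   # the same job with a 10GB limit: five rounds
```

With a limit larger than the job, the client is read in one pass and the tape is written in one pass; with a small limit the client sits idle during every intermediate despool unless other jobs are spooling concurrently.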

Regards,
Adrian
-- 
LiHAS - Adrian Reyer - Hessenwiesenstraße 10 - D-70565 Stuttgart
Fon: +49 (7 11) 78 28 50 90 - Fax:  +49 (7 11) 78 28 50 91
Mail: li...@lihas.de - Web: http://lihas.de
Linux, Netzwerke, Consulting & Support - USt-ID: DE 227 816 626 Stuttgart
