[Bacula-users] Questions about spooling
Hi,

I am running some tests on spooling and have created a spool filesystem for testing purposes. I decided to try quite a small one, just 10 GB. With it I get poor performance (10 MB/s), so I assume the spool filesystem is too small. Is this assumption right?

There are a few options for getting a bigger spool filesystem:

1) Which is better: a bigger spool filesystem or a faster one? The first option is a 3.6 TB RAID0 of 12 direct-attached SAS disks; the second is two or three 120 GB SSDs.
2) Does it make sense to have a spool filesystem much bigger than the tape? (LTO-4 holds 800 GB, or up to 1.6 TB with compression, so does a 1.6 TB filesystem make sense?)
3) Especially with SSDs: will I run into problems because of MTBF? Is anybody using SSDs for a spool filesystem?

I have an autoloader with a SAS LTO-4 drive and I would like to get the drive to stream as fast as possible.

Thanks for your answers!

Regards,
Frank

This was sent by f...@frankeseidel.de via Backup Central.
Forward SPAM to ab...@backupcentral.com.
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
Re: [Bacula-users] Questions about spooling
I have no answer, sorry, just more questions :)

How do you measure the performance? Does the 10 MB/s figure come from the Bacula job report?

Regards.

On 01/09/2011 11:36, frank_sg wrote:
> I decided to try quite a small one, just 10 GB. With it I get poor performance (10 MB/s).

--
Alexandre Chapellon
Open source systems and network engineering.
Follow me on twitter: @alxgomz
Re: [Bacula-users] Questions about spooling
On 01/09/2011 11:36, frank_sg wrote:
> 2) Does it make sense to have a spool filesystem much bigger than the tape?

I'll add my 2 cents about spool filesystem size. In my experience the spool space gets emptied after every job run, so you need enough space only for the biggest job run you will ever have, not for the entire tape. I'm not sure whether this is a general rule or applies only to my setup, though.

About performance: to test the spool space I would run some local disk benchmarks. Bacula job speed depends on so many other factors that it is almost impossible to use it as a benchmark for a single subsystem of the backup process. In particular, look for bottlenecks on the client side: CPU, network and disk seek times.

HTH.

--
Marcello Romani
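A local disk benchmark of the kind suggested above can be sketched in a few lines of Python. This is only a rough illustration, not a replacement for a proper benchmarking tool: the function name `sequential_write_rate` and its default sizes are made up for this example, and a run that writes less data than the machine has RAM will mostly measure the page cache rather than the disks.

```python
import os
import tempfile
import time


def sequential_write_rate(directory, total_bytes=256 * 1024 * 1024,
                          chunk_bytes=1024 * 1024):
    """Write total_bytes sequentially to a temp file in `directory`.

    Returns the observed rate in MB/s (1 MB = 10**6 bytes).
    Hypothetical helper for illustration; sizes are arbitrary defaults.
    """
    chunk = os.urandom(chunk_bytes)  # random data, so nothing compresses it away
    written = 0
    start = time.perf_counter()
    with tempfile.NamedTemporaryFile(dir=directory) as spool_file:
        while written < total_bytes:
            spool_file.write(chunk)
            written += len(chunk)
        spool_file.flush()
        os.fsync(spool_file.fileno())  # wait until the data has reached the device
        elapsed = time.perf_counter() - start
    return written / elapsed / 1e6
```

Calling `sequential_write_rate("/path/to/spool")` with a `total_bytes` value well above the amount of RAM gives a figure that can be compared with the rate the tape drive needs. The path is a placeholder for wherever the spool filesystem is mounted.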
Re: [Bacula-users] Questions about spooling
On 01/09/2011 11:36, frank_sg wrote:
> I have an autoloader with a SAS LTO-4 drive and I would like to get the drive to stream as fast as possible.

I almost forgot: the spool filesystem must be able to sustain a transfer rate a little greater than the nominal tape write rate. Otherwise the benefit of spooling for tape wear is lost, because the tape has to stop and wait for data from the spool area.

HTH.

--
Marcello Romani
[Bacula-users] Questions about spooling
Thanks for replying.

@Alexandre: Yes, exactly, the 10 MB/s (average) comes from the Bacula job report.

@Marcello: No. Every time the spool filesystem is full (or the maximum spool size per job etc. is reached), the spool is despooled to tape. That is where I hope to gain the advantage: despooling to tape at full speed. You are right, the filesystem has to be fast enough, which SSDs obviously would be. That is the reason for considering them. A RAID0 of 12 direct-attached SAS disks should also be fast enough, but which option is the one to choose?

Regards,
Frank
Re: [Bacula-users] Questions about spooling
On 01/09/2011 13:38, frank_sg wrote:
> @Marcello: No. Every time the spool filesystem is full (or the maximum spool size per job etc. is reached), the spool is despooled to tape. That is where I hope to gain the advantage: despooling to tape at full speed.

I have not looked deeply into Bacula spooling. After your clarification I re-read this:

http://www.bacula.org/en/dev-manual/main/main/Data_Spooling.html

but I'm still not sure how the whole thing works if one doesn't set a limit on the spool area size (I didn't).

What puzzles me is that every e-mail I get from Bacula (one per job run) contains lines similar to these:

    Job write elapsed time = 00:01:35, Transfer rate = 4.455 M Bytes/second
    Committing spooled data to Volume WednesdayTape. Despooling 423,960,356 bytes ...
    Despooling elapsed time = 00:00:25, Transfer rate = 16.95 M Bytes/second
    Sending spooled attrs to the Director. Despooling 2,392,621 bytes ...
    [...]
    Rate: 3405.1 KB/s
    Software Compression: None

This makes me wonder whether each job is spooled independently.

About job speed: notice how the final speed reported for the whole job is very low compared to the despooling speed. That figure can be misleading. I think one has to consider the transfer rate and the despooling speed separately.

--
Marcello Romani
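The relationship between the figures in that report can be checked with a little arithmetic. The sketch below is an illustration only: the helper names are invented, it parses just the three lines it needs, and it uses the despooled byte count for both phases, which is an approximation. Bacula's own rounding is not modelled, so the results land close to, not exactly on, the reported values.

```python
import re

# Three lines copied from the job report quoted above.
REPORT = """\
Job write elapsed time = 00:01:35, Transfer rate = 4.455 M Bytes/second
Committing spooled data to Volume WednesdayTape. Despooling 423,960,356 bytes ...
Despooling elapsed time = 00:00:25, Transfer rate = 16.95 M Bytes/second
"""


def hms_to_seconds(text):
    """Convert 'HH:MM:SS' to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def phase_rates(report):
    """Return spool, despool and combined rates in MB/s (hypothetical helper)."""
    spool_s = hms_to_seconds(
        re.search(r"Job write elapsed time = (\d+:\d+:\d+)", report).group(1))
    despool_s = hms_to_seconds(
        re.search(r"Despooling elapsed time = (\d+:\d+:\d+)", report).group(1))
    nbytes = int(
        re.search(r"Despooling ([\d,]+) bytes", report).group(1).replace(",", ""))
    return {
        "spool_mb_s": nbytes / spool_s / 1e6,
        "despool_mb_s": nbytes / despool_s / 1e6,
        # Both phases run one after the other, so their times add up.
        "combined_mb_s": nbytes / (spool_s + despool_s) / 1e6,
    }
```

With these inputs the spool phase works out to roughly 4.46 MB/s, the despool phase to roughly 16.96 MB/s, and the two phases taken together to roughly 3.53 MB/s. The last figure is in the same range as the 3405.1 KB/s overall rate in the report, which is consistent with the overall rate covering both phases plus some additional time.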
Re: [Bacula-users] Questions about spooling
> I am running some tests on spooling and have created a spool filesystem for testing purposes. I decided to try quite a small one, just 10 GB. With it I get poor performance (10 MB/s), so I assume the spool filesystem is too small. Is this assumption right?

Remember that a job does not spool while it is despooling, so spooling will not be as efficient as possible unless you run more than one job at a time or have a spool size larger than your largest job. Since I use concurrency, I use a 5 or 10 GB spool file, as I have been doing for over 5 of the 8+ years I have used Bacula. On my LTO-2 archive I regularly see despool rates of 20 to 60 MB/s to a single LTO-2 tape drive.

However, backup rates depend heavily on the type of backup and the percentage of small files. Incrementals show low rates because a large part of the backup time is spent searching for the few files to back up rather than actually backing them up. A full backup generally shows a much higher rate because all files are selected and no time is spent searching. That said, incrementals generally finish much faster than fulls even though they show a much lower rate.

On top of this, source filesystem performance and database performance are very important. Also make sure you are not using software compression if you have a tape drive. Software compression, even on the fastest i7 with an SSD, will most likely manage less than 10 MB/s, while tape drive hardware compression can easily achieve 10 times that rate.

John
Re: [Bacula-users] Questions about spooling
On 01/09/2011 13:38, frank_sg wrote:
> @Alexandre: Yes, exactly, the 10 MB/s (average) comes from the Bacula job report.

At least in my reports, the speed shown is the amount of data transferred divided by the time taken to complete the job. For example, I had a job that kept waiting for an appendable volume for almost 9 hours. The backup itself completed quickly: it finished almost as soon as a new volume was made available. However, the report told me the speed was 4 kB/s, which is not representative at all, of course.

If I recall the documentation correctly, spooling writes data to disk before it is written to tape, which allows for better network throughput. Bacula only considers the job complete once all the data has been written to *tape* (or whatever the final destination is). From the documentation:

> When the backup has only been spooled to disk, it is not complete yet

In other words, you should not trust the Bacula reports to measure network or disk throughput. Use something like iostat or iptraf if that is what you care about.

Also from the documentation:

> Of course, if your spool device is not large enough to hold all the data from your File daemon, you may actually slow down the overall backup

Spooling is also useful when you have concurrent backups. In that case I guess your spool directory should be at least as large as the sum of all your concurrent backups.

I really don't think SSDs would help here. Any SATA disk is capable of at least, let's say, 50 MB/s throughput.

--
Alexandre Chapellon
http://www.horoa.net
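The effect described above, where waiting time drags down the reported rate, can be shown with a small calculation. The numbers below are hypothetical: the message gives only the 9-hour wait and the 4 kB/s result, so the job size and transfer time are assumptions picked to land near that figure, and `reported_rate_kb_s` is an illustrative helper, not anything Bacula provides.

```python
def reported_rate_kb_s(job_bytes, transfer_seconds, wait_seconds):
    """Rate as a job report would show it if it divides bytes by total elapsed time.

    Hypothetical helper; returns kB/s (1 kB = 1000 bytes).
    """
    return job_bytes / (transfer_seconds + wait_seconds) / 1e3


# Assumed values for illustration only: a 130 MB job that needs 60 seconds of
# actual transfer but waits 9 hours for an appendable volume.
job_bytes = 130e6
transfer_seconds = 60
wait_seconds = 9 * 3600

diluted = reported_rate_kb_s(job_bytes, transfer_seconds, wait_seconds)
actual = reported_rate_kb_s(job_bytes, transfer_seconds, 0)
```

Under these assumptions the diluted rate is about 4 kB/s, while the rate during the actual transfer is over 2000 kB/s. The same job looks several hundred times slower purely because of the wait.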
Re: [Bacula-users] Questions about spooling
On Thu, Sep 01, 2011 at 02:36:08AM -0700, frank_sg wrote:
> There are a few options for getting a bigger spool filesystem:
> 1) Which is better: a bigger spool filesystem or a faster one? The first option is a 3.6 TB RAID0 of 12 direct-attached SAS disks; the second is two or three 120 GB SSDs.
> 2) Does it make sense to have a spool filesystem much bigger than the tape? (LTO-4 holds 800 GB, or up to 1.6 TB with compression, so does a 1.6 TB filesystem make sense?)
> 3) Especially with SSDs: will I run into problems because of MTBF? Is anybody using SSDs for a spool filesystem?
> I have an autoloader with a SAS LTO-4 drive and I would like to get the drive to stream as fast as possible.

I have LTO-4 here as well. I see data rates between 45 MB/s and 95 MB/s. There is no point in SAS or SSD; recent SATA disks are enough to feed that, as you operate them basically in streaming mode. We run 30-40 concurrent jobs into the same 1 TB spool directory, residing on a RAID-5 of 4 recent 2 TB SATA2 disks. The limits are the tape drive and the network, not the disks.

As for the right size of the spool: I set it big enough that a job never needs to wait for the tape before it can continue spooling, as there are a few very big clients and a few that are attached via low bandwidth. I do limit a single job's spool to 800 GB.

Regards,
Adrian
--
LiHAS - Adrian Reyer - Hessenwiesenstraße 10 - D-70565 Stuttgart
Fon: +49 (7 11) 78 28 50 90 - Fax: +49 (7 11) 78 28 50 91
Mail: li...@lihas.de - Web: http://lihas.de
Linux, Netzwerke, Consulting Support - USt-ID: DE 227 816 626 Stuttgart
Re: [Bacula-users] Questions about spooling
In the message dated: Thu, 01 Sep 2011 14:09:36 +0200,
The pithy ruminations from Marcello Romani on Re: [Bacula-users] Questions about spooling were:

> On 01/09/2011 13:38, frank_sg wrote:
>> @Marcello: No. Every time the spool filesystem is full (or the maximum spool size per job etc. is reached), the spool is despooled to tape. That is where I hope to gain the advantage: despooling to tape at full speed.

There is an advantage in despooling from a fast filesystem to tape, but that advantage lies more in saving wear and tear on the tape drive and media than in speed. In my tests (and those of others [1][2]), because Bacula stops collecting files from clients as soon as it begins despooling to tape, the overall throughput will never be as great as the filesystem is capable of providing.

[1] http://copilotco.com/mail-archives/bacula-devel.2007/msg02642.html
[2] http://www.bacula.org/git/cgit.cgi/bacula/plain/bacula/projects?h=Branch-5.1

>> You are right, the filesystem has to be fast enough, which SSDs obviously would be. That is the reason for considering them. A RAID0 of 12 direct-attached SAS disks should also be fast enough, but which option is the one to choose?

Either option is much more than fast enough.

> About job speed: notice how the final speed reported for the whole job is very low compared to the despooling speed. That figure can be misleading. I think one has to consider the transfer rate and the despooling speed separately.

Maybe. From the point of view of selecting components and tuning, yes, it does make sense to consider the transfer rates (client to spool file and spool file to tape) separately. However, the most important value for any business is the cumulative time: how long it takes to back up a client, not the individual parts that make up that time.
Here are some numbers from our environment:

[A] Throughput without spooling: ~22 MB/s
    This is the aggregate speed of reading data from disk and writing it to tape, including shoe-shining, network congestion, disk contention, etc.

[B] Throughput to spool file: ~55 MB/s
    This is the aggregate speed of reading data from disk (a 9 TB logical volume made up of multiple RAID5 and RAID6 LUNs) via 4 Gb/s fibre and writing it to the RAID-10 spool partition on 10K RPM SAS disks. This includes any SAN congestion, disk contention, etc.

[C] Throughput from disk spool file to LTO-4 tape: ~108 MB/s
    This is the raw despooling speed from 10K SAS disks to the tape drive over 4 Gb/s fibre.

[D] End-to-end throughput with spooling: ~27 MB/s
    This is very disappointing. It is the overall throughput of [B] and [C] combined. While eliminating shoe-shining is much better for the tape media and tape drive, the overall performance is almost identical to [A], when it should be close to [B]. The reason for the decrease in performance is that Bacula stops all spooling as soon as it starts despooling.

In this case the important value is [D], since that determines the total time for a backup. It really doesn't matter that the throughput from the spool disk to the tape drive is 4x greater than the aggregate throughput, because Bacula's design prohibits better performance.

In an ideal configuration, there could be multiple spool directories defined, and Bacula would open a new spool file in the next directory as soon as it begins despooling.

Thanks,
Mark
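A simple way to see why [D] cannot approach [B] when spooling and despooling alternate is to treat the two phases as strictly serial: every byte first costs time at the spool rate and then again at the despool rate. The sketch below is a simplified model, not a description of Bacula's internals, and the function name `serial_throughput` is made up for this example.

```python
def serial_throughput(spool_rate, despool_rate):
    """Effective rate when each byte is spooled, then despooled, with no overlap.

    Time per byte is the sum of the two phases, so the rates combine like
    resistors in parallel. Units follow the inputs (MB/s here).
    """
    return 1.0 / (1.0 / spool_rate + 1.0 / despool_rate)


# Figures [B] and [C] from the message above, in MB/s.
expected = serial_throughput(55.0, 108.0)
```

With 55 MB/s and 108 MB/s the model gives about 36 MB/s, well below [B]. The measured ~27 MB/s is lower still, so there is presumably further overhead that this two-term model does not capture.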
Re: [Bacula-users] Questions about spooling
On Thu, Sep 01, 2011 at 11:06:47AM -0400, mark.berg...@uphs.upenn.edu wrote:
> [A], when it should be close to [B]. The reason for the decrease in performance is that Bacula stops all spooling as soon as it starts despooling.
> In an ideal configuration, there could be multiple spool directories defined, and Bacula would open a new spool file in the next directory as soon as it begins despooling.

We run one spool directory and several concurrent jobs spool to it. While one job is being despooled, the others continue spooling. However, if you run out of spool space, spooling stops on all jobs until a complete despooling is done.

Best practice, in my opinion, with big disks:
- Spool Size large enough to hold most of your backups completely
- Spool Directory large enough to hold several jobs

In my case most backups are 50 GB, and incrementals are often only 100 MB. I use a 1 TB spool directory and a spool size of 800 GB, large enough for all but my biggest backup job. It works fine here.

Regards,
Adrian
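The two rules of thumb above can be turned into a quick sanity check. This is a rough sketch under stated assumptions: the function name, the dictionary keys and the example job sizes are invented for illustration, and it only compares sizes; it does not model how Bacula actually schedules spooling and despooling.

```python
GB = 10**9


def check_spool_sizing(job_sizes, spool_dir_bytes, job_spool_limit_bytes):
    """Compare expected concurrent job sizes with the spool limits.

    Returns which jobs exceed the per-job spool limit and whether all jobs
    together would fit in the spool directory at once. Hypothetical helper.
    """
    too_big = [size for size in job_sizes if size > job_spool_limit_bytes]
    return {
        "jobs_over_limit": too_big,
        "all_fit_at_once": sum(job_sizes) <= spool_dir_bytes,
    }


# Example: four 50 GB backups and one assumed 900 GB backup running together,
# against the 1 TB directory and 800 GB per-job limit mentioned above.
result = check_spool_sizing(
    job_sizes=[50 * GB] * 4 + [900 * GB],
    spool_dir_bytes=1000 * GB,
    job_spool_limit_bytes=800 * GB,
)
```

In this made-up case the 900 GB job exceeds the per-job limit and the five jobs together (1100 GB) do not fit into the directory, so at least one job would have to pause and despool before it could finish spooling.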