software vs hardware compression with large backups to LTO7
I just wanted to drop a note to the list with my experience. I've been using Amanda since around 2005 with 3 Amanda servers, originally with AIT5, then transitioned to LTO6, and now to LTO7 on one server (still LTO6 on the other 2). During that time, disk storage on two of these servers has escalated almost exponentially. The AIT5 was 400GB native and seemed capacious. Server disk drives had gotten as big as 300GB, but we still had many smaller drives. Then we started seeing drives in the 1TB, 2TB and then 4TB range, and setting up multiple RAID arrays with these. We transitioned to LTO6 on all three servers (Sony dropped out of the competition and AIT died). More recently we started seeing 6TB and 10TB drives. We now have arrays of 10TB drives on two servers, and have added a new library with two LTO7 drives on one server. The other department is struggling with costs. We have nearly 100TB total disk capacity on each of these two servers now.

One of the issues I have been dealing with is how long it takes Amanda to do the backups. With the LTO6, doing about 2.5TB of compressed backups each night on a 7-day backup cycle, it was not uncommon for the backups to run over 24 hours. I took to checking on it around the time the next run was to start in the late evening and making a considered decision whether to do an amcleanup -k so that the next run could start, i.e. sacrifice one or two DLEs to get a fresh run that would include all of the DLEs. Typically, when looking at the runs with top, I would see multiple gzips occupying CPU time. I understand that I could use pigz, but these are multi-purpose servers, and I didn't want to dominate the CPUs even more. The situation was becoming almost untenable, and disk space was escalating all the time.

I've now had a bit of time running with LTO7, Amanda 3.4.5, and letting the tape drive do hardware compression.
It happens that I (and some others on the list) couldn't figure out how to turn off compression on the LTO7, but I was already motivated to try running with hardware compression. I'm now routinely getting anywhere from 5TB to 8TB of uncompressed data (a substantial portion of it is incompressible, e.g. large TIFFs), and the run is often completing in about 8-10 hours without affecting the server load during the work day.

I'm using two 4TB SSDs for holding space, and I'm using include and exclude to keep DLEs to no more than about 900GB, though many are smaller. I had previously aimed at staying under 500GB and shooting for 200 to 300GB per DLE. I've converted all the dumptypes that I'm using to the application am-gtar with compress none. I set the size of the LTO7 to ~12TB in the tapetype definition, and I set the maxdumpsize parameter to ~15TB. Everything is finally running quite smoothly. With a 48-slot library and two LTO7 drives, I'm also ready for some additional growth.

I'm seeing top speeds of about 230MB/s for tape writes, which is much faster than the top rating for LTO6, but not up to the top rating for LTO7 of 300MB/s. I had previously gotten up into the 150s for LTO6, which has a top rating of 160MB/s. I'm not sure where the bottleneck is for the LTO7. The library is running off a SAS HBA separate from the SAS HBA for the external disk cabinets. However, tape writing is not a bottleneck for my backups overall, so it's not a big deal.

The basic point here is that I've moved away from my long-standing thinking that it is better to do software compression so that Amanda's planner knows how much tape space is actually being used. While it feels a little uncomfortable not knowing how well the compression is working, it was always a heuristic that end of tape is how you know the tape is full. If it seemed like too little data was written, then you assumed tape errors, shoe-shining, or some other issue. So, it's not really that different, and now it's much faster.
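For anyone wanting to try the same switch, the pieces involved are roughly the following. This is only a sketch: the tapetype and dumptype names are made up for illustration, and the sizes are the round numbers mentioned above, not my exact files.

```
# tapetype: tell Amanda the capacity you expect *after* drive compression.
# LTO-7 native is 6TB; ~12TB assumes roughly 2:1 on the compressible portion.
define tapetype LTO7-HWCOMP {
    comment "LTO-7 with hardware compression left on"
    length 12000 gbytes
}

# dumptype: use the amgtar application and turn software compression off,
# so the drive is the only thing compressing.
define application-tool app_amgtar {
    plugin "amgtar"
}

define dumptype hw-comp-gtar {
    program "APPLICATION"
    application "app_amgtar"
    compress none
    index yes
}

# global cap on how much the planner will schedule per run
maxdumpsize 15 tbytes
```

Since the planner can no longer see compressed sizes, the tapetype length is a guess you tune over time; setting it optimistically and letting end-of-tape be the backstop is exactly the trade-off discussed below.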
-- --- Chris Hoogendyk - O__ Systems Administrator c/ /'_ --- Biology & Geosciences Departments (*) \(*) -- 315 Morrill Science Center ~~ - University of Massachusetts, Amherst --- Erdös 4
Re: Software vs: Hardware compression
I could. But I already have on the order of 150 DLEs spread across several servers for each Amanda server. Amanda's inherent parallelism allows for multiple dumpers per server being backed up, restricted by spindle number. So, typically, when Amanda fires off, I might have 6 or 8 dump processes going, and I can juggle my configuration to adjust that. Under optimal circumstances, those would be going on simultaneously with taping, and the SSDs would see all that action coming and going.

What I'm getting is that the tape is on the order of 10 times faster than the dump process when gzip is in the mix, so the tape is seeing a lot of idle time. When gzip is not in the mix, the dump is significantly faster than the tape, though not double.

On 10/28/16 4:27 PM, Charles Curley wrote:
> On Fri, 28 Oct 2016 15:44:09 -0400 Chris Hoogendyk wrote:
>> If it were "only" the Amanda server, I could go all out with pigz.
>> However, it is also the department server and runs mail, web, samba,
>> printing, etc. If I were to top out all the cores with pigz, I would
>> have everyone in the department complaining about performance on
>> other services.
>
> Can you sneak in a -p option to pigz?
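For list members who haven't used these knobs, the parallelism I'm describing is controlled roughly like this (host names, dumptype name, and values are illustrative, not my actual configuration):

```
# amanda.conf: global parallelism
inparallel 8      # total simultaneous dumper processes
maxdumps 2        # simultaneous dumps per client

# disklist: host  disk  dumptype  spindle
# DLEs sharing a spindle number are never dumped at the same time,
# which keeps two dumpers from thrashing the same RAID array.
bigserver /export/array1/a  comp-gtar  1
bigserver /export/array1/b  comp-gtar  1
bigserver /export/array2    comp-gtar  2
```

Juggling DLE sizes and spindle numbers is how I tune how many dumpers actually run at once.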
Re: Software vs: Hardware compression
On Friday 28 October 2016 14:27:22 Charles Curley wrote:
> On Fri, 28 Oct 2016 15:44:09 -0400 Chris Hoogendyk wrote:
>> If it were "only" the Amanda server, I could go all out with pigz.
>> However, it is also the department server and runs mail, web, samba,
>> printing, etc. If I were to top out all the cores with pigz, I would
>> have everyone in the department complaining about performance on
>> other services.
>
> Can you sneak in a -p option to pigz?

Maybe set client_custom_compress to a script that does something like "nice -n 19 pigz $@". pigz shouldn't interfere much with other workloads, especially niced.
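Concretely, the wrapper could be as small as this (the path, nice level, and thread count are just examples; without -p, pigz uses all cores):

```sh
#!/bin/sh
# Example wrapper for Amanda's client_custom_compress (untested sketch):
# run pigz niced and capped at a few threads so the other services on
# the box keep getting CPU.
exec nice -n 19 pigz -p 4 "$@"
```

The dumptype would then point at it with something like `compress client custom` plus `client_custom_compress "/usr/local/sbin/amanda-pigz"` (path hypothetical).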
Re: Software vs: Hardware compression
On Fri, 28 Oct 2016 15:44:09 -0400 Chris Hoogendyk wrote:
> If it were "only" the Amanda server, I could go all out with pigz.
> However, it is also the department server and runs mail, web, samba,
> printing, etc. If I were to top out all the cores with pigz, I would
> have everyone in the department complaining about performance on
> other services.

Can you sneak in a -p option to pigz?

-- 
The right of the people to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures, shall not be violated, and no Warrants shall issue, but upon probable cause, supported by Oath or affirmation, and particularly describing the place to be searched, and the persons or things to be seized. -- U.S. Const. Amendment IV
Key fingerprint = CE5C 6645 A45A 64E4 94C0 809C FFF6 4C48 4ECD DFDB
Re: Software vs: Hardware compression
If it were "only" the Amanda server, I could go all out with pigz. However, it is also the department server and runs mail, web, samba, printing, etc. If I were to top out all the cores with pigz, I would have everyone in the department complaining about performance on other services.

On 10/28/16 3:37 PM, J Chapman Flack wrote:
> On 10/28/2016 02:37 PM, Chris Hoogendyk wrote:
>> all of the data is being compressed, and the compression is
>> significant, but it has become the bottleneck. Top shows multiple of
>> Amanda's gz processes at the top of the list all day.
>
> In your setup, are there enough DLEs to compress that Amanda can keep
> all your cores busy with gz processes working on different DLEs? Or do
> you have some cores underutilized, that you could maybe bring into the
> game by using pigz instead of gz?
Re: Software vs: Hardware compression
On 10/28/2016 02:37 PM, Chris Hoogendyk wrote:
> all of the data is being compressed, and the compression is
> significant, but it has become the bottleneck. Top shows multiple of
> Amanda's gz processes at the top of the list all day.

In your setup, are there enough DLEs to compress that Amanda can keep all your cores busy with gz processes working on different DLEs? Or do you have some cores underutilized, that you could maybe bring into the game by using pigz instead of gz?

Chapman Flack
Re: Software vs: Hardware compression
On 10/28/2016 02:37 PM, Chris Hoogendyk wrote:
> It also knows what a particular DLE can be compressed to based
> on the history. If the tape drive is doing the compression, then it is
> a black box. Amanda doesn't know what the DLE got compressed to, and it
> doesn't know how that relates to the capacity of the tape. That makes
> planning more difficult.

This might be a good place to plug an idea I floated back in January:

http://marc.info/?l=amanda-hackers&m=145312885902912&w=2

... it turns out that in the SCSI standard for tape drive commands, there are ways to ask the drive, after writing each fileset, "hey drive! what did that last set compress to?" I haven't had time to pursue it. I think the part I could probably do without help is teaching the tape driver to ask the drive for the statistics. It might take a more seasoned Amanda hacker to work out how to get those numbers stored in the history, similarly to what happens with software compression, so they can be used by the planner later. It still strikes me as worth exploring.

Chapman Flack
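For anyone who wants to poke at this by hand before any Amanda integration exists, the sg3_utils package can already read the drive's compression counters. Something like the following, with the device names illustrative and the exact log page support varying by drive, so treat it as a starting point rather than a recipe:

```
# The SSC Data Compression log page (1Bh) carries bytes-in/bytes-out
# counters; sg_logs decodes it on drives that implement it.
# Run "sg_logs -a /dev/nst0" first to see which pages your drive offers.
sg_logs --page=0x1b /dev/nst0

# Many LTO drives also report compression status via the mtx tools:
tapeinfo -f /dev/sg1
```

Comparing those counters before and after writing a fileset gives exactly the "what did that last set compress to?" number by hand.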
Re: Software vs: Hardware compression
That is a somewhat complicated question. The simplest statement is that if Amanda manages the compression, and you have told it the capacity of the tape, then it knows what can fit on the tape. It also knows what a particular DLE can be compressed to based on its history. If the tape drive is doing the compression, then it is a black box: Amanda doesn't know what the DLE got compressed to, and it doesn't know how that relates to the capacity of the tape. That makes planning more difficult. Also, computers are getting faster and are typically multi-core, so having gz running compressions for multiple DLEs on multiple cores is easily manageable.

Then there are the howevers. I'm currently dealing with a couple of servers that are each getting into the range of 50 to 100TB of capacity that needs to be backed up to LTO6. One of those servers has been too frequently running into 36-hour or even 60-hour backup cycles. As I was comparing the two servers, I noticed that on one server, the largest amount of data consists of TIFF files for the digitized herbarium collection. Those don't compress, so I had set those DLEs to not use compression. I was getting well over 200MB/s from disk to holding disk for those, and then on the order of 155MB/s out to tape. Then, on both this same server and on the server that was running over a day, the DLEs that were being compressed were getting something on the order of 15MB/s from disk to holding disk, followed by on the order of 155MB/s out to tape. On the one server, that wasn't such a big deal, because the largest amount of data was not being compressed. On the other server, all of the data is being compressed, and the compression is significant, but it has become the bottleneck. Top shows multiple of Amanda's gz processes at the top of the list all day.

So, I'm beginning to rethink things for this server. These are SuperMicro servers with two AMD Opteron 6344 2.6GHz 12-core processors, running Ubuntu LTS 14.04.
They both have large external SAS multipath disk cabinets that are managed with mdadm and lvm. They both currently have about 24 external drives ranging from 1TB to 6TB built into a number of RAID5 and RAID6 arrays, and they both have two 1TB enterprise SSDs for holding disks. The tape systems are Overland NEO 200 series with IBM LTO6 tape drives. My understanding of LTO6 is that the compression is hardware accelerated and is not supposed to slow down the data transfer.

It is certainly going to be a bit of an experiment, but I'm reaching the point where I need to figure out how to get these backups done more quickly. As it is now, the tape is getting a lot of idle time while it waits for DLEs to be completed and ready to be written out to tape.

I've been using Amanda for more than 10 years on these servers and their predecessors with LTO6 and previously with AIT5, and it has always worked well. It's only now that the rapidly increasing demand for large data arrays is putting real stress on our backup capabilities. I've got 3 Amanda servers with LTO6 libraries backing up about 12 servers in 4 different departments.

On 10/28/16 12:40 PM, Ochressandro Rettinger wrote:
> Why does Amanda recommend the use of software compression vs: the
> built in hardware compression of the tape drive itself? Is that in
> fact still the current recommendation?
>
> -Sandro
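To put those speeds in back-of-envelope terms (single-stream, ignoring the parallel dumpers, so purely illustrative): pushing one night's ~2.5TB through a 15MB/s gzip stage versus a 155MB/s tape stage looks like this:

```shell
# Rough single-stream timing for 2.5 TB through each stage.
data_mb=$((2500 * 1024))              # ~2.5 TB expressed in MB
gzip_h=$((data_mb / 15 / 3600))       # hours at ~15 MB/s (gzip-bound dump)
tape_h=$((data_mb / 155 / 3600))      # hours at ~155 MB/s (LTO6 write)
echo "gzip stage: ~${gzip_h}h, tape stage: ~${tape_h}h"
```

With 6 or 8 dumpers running in parallel the gzip figure divides down accordingly, but it shows why the drive sits idle: even several simultaneous gzip streams together struggle to feed one LTO6 drive at full speed.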
Software vs: Hardware compression
Why does Amanda recommend the use of software compression vs: the built in hardware compression of the tape drive itself? Is that in fact still the current recommendation? -Sandro