That is somewhat of a complicated question.
The simplest statement is that if Amanda manages the compression, and you have told it the capacity
of the tape, then it knows what can fit on the tape. It also knows, from history, what a particular
DLE compresses to. If the tape drive is doing the compression, then it is a black box: Amanda
doesn't know what the DLE got compressed to, nor how that relates to the capacity of the tape,
which makes planning more difficult. Also, computers are getting faster and are typically
multi-core, so having gzip run compressions for multiple DLEs on multiple cores is easily
manageable.
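For reference, this choice is made per dumptype in amanda.conf; a minimal sketch (the dumptype names here are illustrative, not from any real configuration):

```
# Software compression on the client: Amanda records the compressed
# size of each DLE and can plan tape usage from that history.
define dumptype sw-compressed {
    compress client fast
}

# No software compression: the drive's hardware compression (if
# enabled on the device) is a black box to Amanda's planner.
define dumptype hw-only {
    compress none
}
```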
Then there are the howevers. I'm currently dealing with a couple of servers that are each getting
into the range of 50 to 100 TB of capacity that needs to be backed up to LTO6. One of those servers
has been too frequently running into 36 hour or even 60 hour backup cycles. As I was comparing the
two servers, I noticed that on one server, the largest amount of data consists of TIFF files for the
digitized herbarium collection. Those don't compress, so I had set those DLEs to not use
compression. I was getting well over 200MB/s from disk to holding disk for those, and then on the
order of 155MB/s out to tape. Then, on both this same server and on the server that was running over
a day, the DLEs that were being compressed were getting something on the order of 15MB/s from disk
to holding disk, followed by on the order of 155MB/s out to tape. On the one server, that wasn't
such a big deal, because the largest amount of data was not being compressed. On the other server,
all of the data is being compressed, and the compression is significant, but it has become the
bottleneck. top shows several of Amanda's gzip processes at the top of the list all day.
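That single-stream gzip bottleneck is easy to confirm outside Amanda; a rough sketch (paths and sizes are illustrative; run it against real data for meaningful numbers):

```shell
#!/bin/sh
# Rough single-stream vs parallel compression throughput check.

# Build a 64 MB test file (zeros compress trivially; substitute real data).
dd if=/dev/zero of=/tmp/comp-test bs=1M count=64 2>/dev/null

# Single-stream gzip, the same codec "compress client" uses.
time gzip -c /tmp/comp-test > /tmp/comp-test.gz

# Parallel compression with pigz, if installed, for comparison.
if command -v pigz >/dev/null 2>&1; then
    time pigz -c /tmp/comp-test > /tmp/comp-test.pigz
fi
```

If the parallel run is markedly faster on real data, Amanda can be pointed at an external compressor via "compress client custom" with client_custom_compress in the dumptype; check the amanda.conf man page for your version.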
So, I'm beginning to rethink things for this server. These are SuperMicro servers, each with two
AMD Opteron 6344 processors (2.6GHz, 12 cores) running Ubuntu 14.04 LTS. They both have large
external SAS multipath disk cabinets that are managed with mdadm and LVM. They both currently have
about 24 external drives, ranging from 1TB to 6TB, built into a number of RAID5 and RAID6 arrays,
and they both have two 1TB enterprise SSDs for holding disks. The tape systems are Overland NEO
200s series libraries with
IBM LTO6 tape drives. My understanding of LTO6 is that the compression is hardware accelerated and
is not supposed to slow down the data transfer. It is certainly going to be a bit of an experiment,
but I'm reaching the point where I need to figure out how to get these backups done more quickly. As
it is now, the tape is getting a lot of idle time while it waits for DLEs to be completed and ready
to be written out to tape.
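For the tape-idle side specifically, Amanda's flushing behavior is tunable in amanda.conf; a sketch of the relevant knobs (values are illustrative; check the amanda.conf man page for your version):

```
# Start writing to tape before a full tape's worth of dumps has
# accumulated on the holding disk, so the drive idles less.
flush-threshold-dumped 0
flush-threshold-scheduled 0

# Pick the next dump to write by best fit rather than arrival order.
taperalgo largestfit

# Flush any leftover dumps from previous runs automatically.
autoflush yes
```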
I've been using Amanda for more than 10 years on these servers and their predecessors with LTO6 and
previously with AIT5, and it has always worked well. I'm only now getting the rapidly increasing
demand for large data arrays that is putting real stress on our backup capabilities. I've got 3
Amanda servers with LTO6 libraries backing up about 12 servers in 4 different departments.
On 10/28/16 12:40 PM, Ochressandro Rettinger wrote:
Why does Amanda recommend the use of software compression vs: the built in
hardware compression of the tape drive itself? Is that in fact still the current recommendation?
-Sandro
--
---------------
Chris Hoogendyk
-
O__ ---- Systems Administrator
c/ /'_ --- Biology & Geosciences Departments
(*) \(*) -- 315 Morrill Science Center
~~~~~~~~~~ - University of Massachusetts, Amherst
<hoogen...@bio.umass.edu>
---------------
Erdös 4