Hi All

I have a TSM 6.3.4 server on newish P7 hardware and AIX V7.1. HBAs are
all 8Gb.  The SANs behind it are 8Gb or 4Gb depending on which path they
take, as we are in the middle of a SAN upgrade and there is still an old
switch in the mix.

Disk is XIV behind SVC.  Tape is TS3500 and LTO5.

According to the LTO Wikipedia entry I should be able to get 140MB/sec
raw out of the drive.  I have an internal company document that suggests
sustained 210MB/sec (compressed) is attainable in the real world.

So far my server backs up 500GB per night of DB2 and Oracle databases
onto file pools, without deduplication.  Housekeeping then does a
single-streamed simultaneous migrate and copy to onsite and offsite
tapes.  Inter-site bandwidth is 4Gb and I have most of that to myself.

That process takes over 5 hours so I'm seeing less than 100MB/sec.

Accordingly I started a tuning exercise.  I copied 50GB of my filepool
twice to give me a 100GB test dataset and ran the tests only when there
was no other activity on the TSM box.

The data comes off disk at 500MB/sec to /dev/null, so that is not a
bottleneck.

Copying to tape with dd runs at a peak of 120MB/sec, with periods much
lower than that, as measured using nmon's FC stats on the HBAs. I
presume some of that slowdown is where the tape reaches its end and has
to reverse direction.

Elapsed time for 100GB is 18 min, with little variation, so average
speed is 95MB/sec.
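
As a rough check of the arithmetic behind these figures, here is a small Python sketch. It assumes 1GB = 1024MB (decimal units give results about 2% lower) and, for the nightly run, that the whole 500GB is migrated in exactly 5 hours; since the real run takes longer, that figure is an upper bound. The function name is only illustrative.

```python
# Back-of-envelope throughput from the figures quoted in this post.
# Assumption: 1 GB = 1024 MB.

def mb_per_sec(gigabytes: float, seconds: float) -> float:
    """Average throughput in MB/sec for a transfer of the given size and duration."""
    return gigabytes * 1024 / seconds

dd_test = mb_per_sec(100, 18 * 60)    # 100GB test dataset in 18 minutes
nightly = mb_per_sec(500, 5 * 3600)   # 500GB nightly migrate, assumed to take 5 hours

print(f"dd test: {dd_test:.0f} MB/sec")
print(f"nightly: {nightly:.0f} MB/sec")
```

Under those assumptions the dd test works out to about 95MB/sec, and the nightly migrate to no more than about 28MB/sec per stream.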

I varied the dd ibs and obs values; ibs=256K obs=1024K seems to give
the best result.
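
For anyone wanting to repeat a block-size sweep like this, below is a minimal Python stand-in for the dd runs. It is a sketch only, not the test described above: the scratch file and os.devnull are hypothetical placeholders for a filepool volume and a tape device, and timed_copy is an invented helper that reads in one block size and writes in another, roughly as dd does with ibs and obs.

```python
import os
import tempfile
import time

def timed_copy(src: str, dst: str, read_size: int, write_size: int) -> float:
    """Copy src to dst, reading read_size bytes at a time and writing
    write_size-byte blocks (similar to dd ibs/obs). Returns MB/sec."""
    total = 0
    pending = bytearray()
    start = time.perf_counter()
    with open(src, "rb", buffering=0) as fin, open(dst, "wb", buffering=0) as fout:
        while True:
            chunk = fin.read(read_size)
            if not chunk:
                break
            pending += chunk
            total += len(chunk)
            # Flush every complete output block that has accumulated.
            # Assumes each write is accepted in full, which holds for
            # regular files; a real device may need a retry loop.
            while len(pending) >= write_size:
                fout.write(pending[:write_size])
                del pending[:write_size]
        if pending:
            fout.write(pending)  # final short block
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total / (1024 * 1024) / elapsed

if __name__ == "__main__":
    # Hypothetical stand-ins: an 8MB scratch file for the source and
    # os.devnull for the target.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(8 * 1024 * 1024))
    try:
        for ibs_kb, obs_kb in [(64, 64), (256, 256), (256, 1024), (1024, 1024)]:
            rate = timed_copy(tmp.name, os.devnull, ibs_kb * 1024, obs_kb * 1024)
            print(f"ibs={ibs_kb}K obs={obs_kb}K: {rate:.0f} MB/sec")
    finally:
        os.unlink(tmp.name)
```

Rates measured against a scratch file mostly reflect the filesystem cache, so the numbers only become meaningful when the source and target are the real volumes and drives.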

Elapsed time is very consistent.

Copying to a local drive on the same switch blade as the tape HBA or
copying across blades made no difference.

Copying to a drive at the remote site increased elapsed time by 2
minutes, as one would expect with more switches in the path and a longer
turnaround time.

Tape to tape copy was not noticeably different to disk to tape.

Reading from tape to /dev/null was no different.

In all cases CPU time was about half of the elapsed time.

lsattr on the drives shows that compression is on (this is also the default).

The tape FC adapters are set to use the large transfer size.

The test was also run using 64KB pages and svmon was used to verify the
setting was effective. Again no difference.

I'm running out of ideas here.  num_cmd_elements on the HBAs is 500 (the
default).  I'm thinking of increasing that to 2000, but it will require
an outage and hence change control.

Does anyone have any ideas, references I could look at or practical
advice as to how to get this to perform?

Thanks

Steve

Steven Harris
TSM Admin
Canberra Australia
