Re: Frustrated by slowness in TSM 6.2
Maybe something simple like verifying TCPWIN on the receiving side is 2x TCPBUF on the sender.

Set COMPRESS=NO to make sure you're not misreading retransmits.

Check topas during your local backup to itself.
Check nmon's disk stats during the backup to see if you've got a hot LUN.
Check the same from any disk perf monitoring.
Check errpt.
Check your db2 logs for any sort of errors.

With the XIV, streaming throughput should be fine. It's only the IOPS that will be weak. Your physical limit would be around 16k IOPS, though you have on-disk cache and write combining, as well as the 120GB of cache (8*15).

You could run into some back-side 10GE saturation if your LUN pathing isn't well balanced.

VIO servers also have some limitations. If you're using VIO MPIO, are you set for round robin at every stage? By default, you'll be active/passive between the two vscsi adapters, and then whatever you're doing for load balance on the VIO servers.

Also, the VIO servers will need CPU to drive IOPS. Check topas on the VIO servers during your tests.

NPIV is preferred for lower latency through the VIO server, plus you can run 4-path multipath with load balancing on the client rather than having the VIO server(s) muddle through.

---
Sincerely,
Josh-Daniel S. Davis

On Fri, Oct 8, 2010 at 11:27, Andrew Carlson wrote:
> Hi all
>
> I am running TSM 6.2.1.1 on AIX V615 in an LPAR on a P770. The LPAR
> has 6 shared CPUs, 12 virtual CPUs, and 64GB of memory. There are 2
> VIO servers with 4 Fibre Channel connections to XIV storage for the DB
> and LOG, and 2 10Gbit Ethernet in each VIO in an Etherchannel
> configuration. The storage pool is on Data Domain DD880s, 2 per AIX,
> 1 per instance.
>
> I am seeing consistently poor performance from this setup. I have
> tested network from VIO to cloud, and LPAR to VIO, which seems fine.
> I tested LPAR to Data Domain, and things seem fine. But, when backups
> are running (and I only have a few nodes there yet, this is a new
> setup), TSM doesn't seem to want to go over 20 to 30MB/s throughput.
> I tried backing up the TSM server over lo, and that was a little
> better at 50MB/s, but not screaming. I tried using a chunk of SAN as a
> disk pool ahead of the Data Domain, no change. I am at my wits' end.
>
> If anyone has any ideas, please let me know. Thanks.
>
> --
> Andy Carlson
> ---
> Gamecube:$150,PSO:$50,Broadband Adapter: $35, Hunters License: $8.95/month,
> The feeling of seeing the red box with the item you want in it:Priceless.
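The TCPWIN/TCPBUF rule of thumb in Josh's reply can be checked mechanically. Below is a minimal Python sketch, assuming both settings live in plain-text option files as "NAME value" lines, that "*" starts a comment, and that both values use the same unit. The option names used in the example (TCPWINDOWSIZE, TCPBUFFSIZE) and the sample values are my assumptions about how TCPWIN and TCPBUF are spelled on a given system; substitute whatever your own files contain.

```python
def parse_options(text):
    """Parse 'NAME value' lines into a dict keyed by upper-cased option name."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("*"):   # assumed comment marker
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            options[parts[0].upper()] = parts[1].strip()
    return options


def window_covers_buffer(receiver_opts, sender_opts,
                         window_name="TCPWINDOWSIZE",   # assumed option name
                         buffer_name="TCPBUFFSIZE",     # assumed option name
                         factor=2):
    """True if the receiver's window is at least `factor` times the sender's buffer."""
    window = int(receiver_opts[window_name])
    buffer_size = int(sender_opts[buffer_name])
    return window >= factor * buffer_size


# Hypothetical option-file contents, for illustration only.
receiver = parse_options("* receiving side\nTCPWINDOWSIZE 64\n")
sender = parse_options("tcpbuffsize 32\n")
rule_holds = window_covers_buffer(receiver, sender)
```

The check only compares the configured numbers. It says nothing about what the operating system actually negotiates, so treat a passing result as a starting point rather than proof.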
Re: Frustrated by slowness in TSM 6.2
Andrew,

The crowd may be right, and the XIV may be your bottleneck for the DB, but I wouldn't focus on that. In your test environment, with only a small number of backups running at once, there probably isn't all that much database traffic generated, is there? And not many database reads, if much of your database should fit in memory. Database writes should be going to cache in the XIV, if it is as lightly loaded as you say, so I don't see that as much of a bottleneck when only a few clients are getting backed up.

What kind of client backups are you testing? Are they large file database backups? Those can generate very good I/O throughput, because the client is sending the data as fast as possible. Or incremental filesystem backups on Windows servers? Those can generate very poor I/O throughput, if they have to examine thousands of files for each file that needs to be sent to the server.

Can you say with assurance that the clients themselves are able to send more than 20-30MB/sec? Do you know what performance those same clients get when they back up to your production environment? Try backing them up to their production environment, at some time of night when the TSM server is not maxed out. Use that as a known starting point.

If you just want to test throughput, and don't care about anything else:
1) Turn off client compression, if it is on.
2) Do "selective" backups of the whole filesystem, so the clients send everything without having to make any time-consuming decisions about what gets sent.
3) Pick a time for the test when the client is very lightly loaded.
4) Try to pick a client with a small number of very large (multi-GB) files, not zillions of small files.

Andrew, I know you already know these things, but I include them for the benefit of the rest of the list. The point I am making is to let the TSM client shove data across as fast as it can, and if it performs really well, then the device that is absorbing all that incoming data (the Data Domain, or other disk storage pool) is performing well. If another client is sending zillions of files, but performing very slowly, maybe that client is creating a lot more traffic to the database, and that is where your bottleneck is. In other words, different clients can be used to show what part of the TSM server is the slowest performer.

Best Regards,

John D. Schneider
The Computer Coaching Community, LLC
Office: (314) 635-5424 / Toll Free: (866) 796-9226
Cell: (314) 750-8721

-------- Original Message --------
Subject: Re: [ADSM-L] Frustrated by slowness in TSM 6.2
From: Paul Zarnowski
Date: Fri, October 08, 2010 11:37 pm
To: ADSM-L@VM.MARIST.EDU

Rick,

I think their response would be something along these lines... The XIV can perform better than other traditional arrays because the [cache miss] I/Os are spread across so many more spindles. I get that. But it seems to me that that can break down when the overall I/O load gets sufficiently high, across all of the spindles. In an I/O intensive environment such as TSM, I think this could be more likely to happen - particularly if you are using XIV for storage pools as well as for database volumes.

I'm still skeptical about how far it can go. I can buy that it has good performance --- for a SATA-based product. But not compared to a pure 15K spindle-based product. Oh, and the SATA drives are larger than the SAS or FC drives, which doesn't help.

..Paul

At 01:57 PM 10/8/2010, Richard Rhodes wrote:
>> I would be suspicious of having the db on XIV. Do you have any FC
>> or SAS Disk you could try putting the DB on? I know XIV has lots
>> of CPU & cache, but underneath it all is still SATA. I've heard
>> Marketing types rave about how fast XIV is, even with SATA,
>> because I/O can be spread across many spindles, but I'm not
>> entirely convinced it's as good as 15k FC or SAS.
>
>This is _exactly_ what IBM has not explained, and seems unwilling to.
>
>Soon after IBM finalized the purchase of XIV, they had a series
>of seminars around the country (USA) about the box. This wasn't some
>little out of the way seminar . . . Moshe (inventor of the box)
>was there and gave much of the presentation. I attended one - Let's
>just say it was strange!!! They hammered on "high performance", over
>and over. They threw up one graph where they claimed 25k iops at
>3ms response time for a "cache miss" workload. Let's see, cache miss
>means having to go to the spindle to do the I/O. SATA drives come
>nowhere close to this response time. The workload was either
>not cache miss, or, they effectively short-stroked the drive such
>that the heads never moved. When I questioned this claim I
>got nowhere - just run-around.
>
>Rick
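John's suggestion to compare a large-file client against a many-small-files client can be turned into a small bookkeeping aid. The Python sketch below is only an illustration under stated assumptions: the class name, the function name, the example figures, and the target rate are all hypothetical, and the two-way split into "data path" and "metadata/database path" is a simplification of his argument, not a TSM feature.

```python
from dataclasses import dataclass


@dataclass
class BackupRun:
    """Totals taken from one test backup (hypothetical record layout)."""
    name: str
    bytes_sent: int
    files_sent: int
    seconds: float

    @property
    def mb_per_s(self):
        return self.bytes_sent / 1e6 / self.seconds

    @property
    def files_per_s(self):
        return self.files_sent / self.seconds


def likely_bottleneck(large_file_run, small_file_run, target_mb_per_s):
    """Rough triage following the reasoning in the message above."""
    if large_file_run.mb_per_s < target_mb_per_s:
        # Even the easy streaming case is slow: look at network and storage pool.
        return "data path"
    if small_file_run.mb_per_s < target_mb_per_s:
        # Streaming is fine but many small files are slow: look at per-file work.
        return "metadata/database path"
    return "none observed"


# Made-up numbers for illustration only.
big = BackupRun("few large files", bytes_sent=60_000_000_000, files_sent=12, seconds=600)
small = BackupRun("many small files", bytes_sent=3_000_000_000, files_sent=900_000, seconds=600)
verdict = likely_bottleneck(big, small, target_mb_per_s=50.0)
```

With these invented figures the large-file run reaches 100 MB/s while the small-file run manages 5 MB/s, so the sketch points at per-file overhead rather than the storage pool.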
Re: Frustrated by slowness in TSM 6.2
Rick,

I think their response would be something along these lines... The XIV can perform better than other traditional arrays because the [cache miss] I/Os are spread across so many more spindles. I get that. But it seems to me that that can break down when the overall I/O load gets sufficiently high, across all of the spindles. In an I/O intensive environment such as TSM, I think this could be more likely to happen - particularly if you are using XIV for storage pools as well as for database volumes.

I'm still skeptical about how far it can go. I can buy that it has good performance --- for a SATA-based product. But not compared to a pure 15K spindle-based product. Oh, and the SATA drives are larger than the SAS or FC drives, which doesn't help.

..Paul

At 01:57 PM 10/8/2010, Richard Rhodes wrote:
>> I would be suspicious of having the db on XIV. Do you have any FC
>> or SAS Disk you could try putting the DB on? I know XIV has lots
>> of CPU & cache, but underneath it all is still SATA. I've heard
>> Marketing types rave about how fast XIV is, even with SATA,
>> because I/O can be spread across many spindles, but I'm not
>> entirely convinced it's as good as 15k FC or SAS.
>
>This is _exactly_ what IBM has not explained, and seems unwilling to.
>
>Soon after IBM finalized the purchase of XIV, they had a series
>of seminars around the country (USA) about the box. This wasn't some
>little out of the way seminar . . . Moshe (inventor of the box)
>was there and gave much of the presentation. I attended one - Let's
>just say it was strange!!! They hammered on "high performance", over
>and over. They threw up one graph where they claimed 25k iops at
>3ms response time for a "cache miss" workload. Let's see, cache miss
>means having to go to the spindle to do the I/O. SATA drives come
>nowhere close to this response time. The workload was either
>not cache miss, or, they effectively short-stroked the drive such
>that the heads never moved. When I questioned this claim I
>got nowhere - just run-around.
>
>Rick
>
>-
>The information contained in this message is intended only for the personal
>and confidential use of the recipient(s) named above. If the reader of this
>message is not the intended recipient or an agent responsible for delivering
>it to the intended recipient, you are hereby notified that you have received
>this document in error and that any review, dissemination, distribution, or
>copying of this message is strictly prohibited. If you have received this
>communication in error, please notify us immediately, and delete the original
>message.

--
Paul Zarnowski                            Ph: 607-255-4757
Manager, Storage Services                 Fx: 607-255-8521
719 Rhodes Hall, Ithaca, NY 14853-3801    Em: p...@cornell.edu
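Paul's point that wide striping "can break down when the overall I/O load gets sufficiently high" can be illustrated with a textbook queueing formula. The Python sketch below treats every spindle as an independent M/M/1 queue with evenly spread load. That model, the spindle count of 180, and the 12.5 ms service time are all my assumptions for illustration; none of them are measurements from the thread, and a real array with cache behaves differently.

```python
import math


def spindle_response_ms(total_iops, spindles, service_ms):
    """Mean response time per cache-miss I/O, modelling each drive as M/M/1."""
    per_spindle_rate = total_iops / spindles              # I/Os per second per drive
    utilization = per_spindle_rate * service_ms / 1000.0  # fraction of time busy
    if utilization >= 1.0:
        return math.inf                                   # offered load exceeds capacity
    return service_ms / (1.0 - utilization)


# Assumed figures: 180 drives, 12.5 ms per random I/O.
for load in (1_800, 7_200, 12_960):
    print(load, "IOPS ->", round(spindle_response_ms(load, 180, 12.5), 1), "ms")
```

Under these assumptions, response time stays close to the bare service time at light load and climbs steeply as the drives approach saturation, which is the shape of the concern raised above.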
Re: Frustrated by slowness in TSM 6.2
> I would be suspicious of having the db on XIV. Do you have any FC
> or SAS Disk you could try putting the DB on? I know XIV has lots
> of CPU & cache, but underneath it all is still SATA. I've heard
> Marketing types rave about how fast XIV is, even with SATA,
> because I/O can be spread across many spindles, but I'm not
> entirely convinced it's as good as 15k FC or SAS.

This is _exactly_ what IBM has not explained, and seems unwilling to.

Soon after IBM finalized the purchase of XIV, they had a series of seminars around the country (USA) about the box. This wasn't some little out of the way seminar . . . Moshe (inventor of the box) was there and gave much of the presentation. I attended one - Let's just say it was strange!!! They hammered on "high performance", over and over. They threw up one graph where they claimed 25k iops at 3ms response time for a "cache miss" workload. Let's see, cache miss means having to go to the spindle to do the I/O. SATA drives come nowhere close to this response time. The workload was either not cache miss, or, they effectively short-stroked the drive such that the heads never moved. When I questioned this claim I got nowhere - just run-around.

Rick

-
The information contained in this message is intended only for the personal and confidential use of the recipient(s) named above. If the reader of this message is not the intended recipient or an agent responsible for delivering it to the intended recipient, you are hereby notified that you have received this document in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify us immediately, and delete the original message.
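Rick's objection to the "3ms cache miss" figure rests on simple drive arithmetic, sketched here in Python. The rotational-latency formula (half a revolution on average) is standard; the 7200 rpm spindle speed for the SATA drives and the example seek times are my assumptions, since the thread does not state them.

```python
def avg_rotational_latency_ms(rpm):
    """Average wait for the sector to come around: half of one revolution."""
    return 60_000.0 / rpm / 2.0


def random_iops_per_drive(rpm, avg_seek_ms):
    """Rough random-I/O ceiling for one drive, ignoring transfer time and queuing."""
    return 1000.0 / (avg_seek_ms + avg_rotational_latency_ms(rpm))


sata_rotation = avg_rotational_latency_ms(7200)    # assumed 7200 rpm SATA
fc_rotation = avg_rotational_latency_ms(15000)     # 15k FC/SAS, as named in the thread

# Seek times below are assumed values, for illustration only.
sata_iops = random_iops_per_drive(7200, avg_seek_ms=8.5)
fc_iops = random_iops_per_drive(15000, avg_seek_ms=3.5)
```

At 7200 rpm the average rotational latency alone is about 4.2 ms, before any seek time is added, so a true random cache-miss average of 3 ms would require either very little head movement or help from cache, which matches the two explanations offered above.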
Re: Frustrated by slowness in TSM 6.2
It is a lightly loaded XIV, and the disk system does not seem under pressure, unless I force it with dd or something in testing, but I will check it out. Any other ideas out there?

On Fri, Oct 8, 2010 at 11:37 AM, Paul Zarnowski wrote:
> I would be suspicious of having the db on XIV. Do you have any FC or SAS Disk
> you could try putting the DB on? I know XIV has lots of CPU & cache, but
> underneath it all is still SATA. I've heard Marketing types rave about how
> fast XIV is, even with SATA, because I/O can be spread across many spindles,
> but I'm not entirely convinced it's as good as 15k FC or SAS.
>
> ..Paul
>
>
> On Oct 8, 2010, at 12:27 PM, "Andrew Carlson" wrote:
>
>> Hi all
>>
>> I am running TSM 6.2.1.1 on AIX V615 in an LPAR on a P770. The LPAR
>> has 6 shared CPUs, 12 virtual CPUs, and 64GB of memory. There are 2
>> VIO servers with 4 Fibre Channel connections to XIV storage for the DB
>> and LOG, and 2 10Gbit Ethernet in each VIO in an Etherchannel
>> configuration. The storage pool is on Data Domain DD880s, 2 per AIX,
>> 1 per instance.
>>
>> I am seeing consistently poor performance from this setup. I have
>> tested network from VIO to cloud, and LPAR to VIO, which seems fine.
>> I tested LPAR to Data Domain, and things seem fine. But, when backups
>> are running (and I only have a few nodes there yet, this is a new
>> setup), TSM doesn't seem to want to go over 20 to 30MB/s throughput.
>> I tried backing up the TSM server over lo, and that was a little
>> better at 50MB/s, but not screaming. I tried using a chunk of SAN as a
>> disk pool ahead of the Data Domain, no change. I am at my wits' end.
>>
>> If anyone has any ideas, please let me know. Thanks.
>>
>> --
>> Andy Carlson
>> ---
>> Gamecube:$150,PSO:$50,Broadband Adapter: $35, Hunters License: $8.95/month,
>> The feeling of seeing the red box with the item you want in it:Priceless.
>

--
Andy Carlson
---
Gamecube:$150,PSO:$50,Broadband Adapter: $35, Hunters License: $8.95/month,
The feeling of seeing the red box with the item you want in it:Priceless.
Re: Frustrated by slowness in TSM 6.2
I would be suspicious of having the db on XIV. Do you have any FC or SAS Disk you could try putting the DB on? I know XIV has lots of CPU & cache, but underneath it all is still SATA. I've heard Marketing types rave about how fast XIV is, even with SATA, because I/O can be spread across many spindles, but I'm not entirely convinced it's as good as 15k FC or SAS.

..Paul

On Oct 8, 2010, at 12:27 PM, "Andrew Carlson" wrote:
> Hi all
>
> I am running TSM 6.2.1.1 on AIX V615 in an LPAR on a P770. The LPAR
> has 6 shared CPUs, 12 virtual CPUs, and 64GB of memory. There are 2
> VIO servers with 4 Fibre Channel connections to XIV storage for the DB
> and LOG, and 2 10Gbit Ethernet in each VIO in an Etherchannel
> configuration. The storage pool is on Data Domain DD880s, 2 per AIX,
> 1 per instance.
>
> I am seeing consistently poor performance from this setup. I have
> tested network from VIO to cloud, and LPAR to VIO, which seems fine.
> I tested LPAR to Data Domain, and things seem fine. But, when backups
> are running (and I only have a few nodes there yet, this is a new
> setup), TSM doesn't seem to want to go over 20 to 30MB/s throughput.
> I tried backing up the TSM server over lo, and that was a little
> better at 50MB/s, but not screaming. I tried using a chunk of SAN as a
> disk pool ahead of the Data Domain, no change. I am at my wits' end.
>
> If anyone has any ideas, please let me know. Thanks.
>
> --
> Andy Carlson
> ---
> Gamecube:$150,PSO:$50,Broadband Adapter: $35, Hunters License: $8.95/month,
> The feeling of seeing the red box with the item you want in it:Priceless.
Frustrated by slowness in TSM 6.2
Hi all

I am running TSM 6.2.1.1 on AIX V615 in an LPAR on a P770. The LPAR has 6 shared CPUs, 12 virtual CPUs, and 64GB of memory. There are 2 VIO servers with 4 Fibre Channel connections to XIV storage for the DB and LOG, and 2 10Gbit Ethernet in each VIO in an Etherchannel configuration. The storage pool is on Data Domain DD880s, 2 per AIX, 1 per instance.

I am seeing consistently poor performance from this setup. I have tested network from VIO to cloud, and LPAR to VIO, which seems fine. I tested LPAR to Data Domain, and things seem fine. But, when backups are running (and I only have a few nodes there yet, this is a new setup), TSM doesn't seem to want to go over 20 to 30MB/s throughput. I tried backing up the TSM server over lo, and that was a little better at 50MB/s, but not screaming. I tried using a chunk of SAN as a disk pool ahead of the Data Domain, no change. I am at my wits' end.

If anyone has any ideas, please let me know. Thanks.

--
Andy Carlson
---
Gamecube:$150,PSO:$50,Broadband Adapter: $35, Hunters License: $8.95/month,
The feeling of seeing the red box with the item you want in it:Priceless.