Re: mediawait during backup to disk
>> On Mon, 3 May 2010 13:55:52 +1000, Mehdi Salehi said:

> Here is my question from the group:
> Which one performs better?
> a- One big TSM volume on a 2TB LUN
> b- More than one TSM volume on a 2TB LUN

If it's SATA RAID5 underneath, I'd be scared to do more than 4 or so, and that's if you've got the LUN all to yourself.

- Allen S. Rout
Re: mediawait during backup to disk
The server has 16 processors and 128 GB RAM.

> From Michael Green: "Have you actually configured queue depth...?"

No, we haven't. It is set at the default of '16'. It can be as high as 256. The HBAs are HP FC1242SR, which are re-branded QLogic QLE2462.

All best wishes,
Keith
Re: mediawait during backup to disk
We use EMC storage. We use the following settings on our AIX servers, which are specified by EMC in their host attachment guides.

    # hba settings
    chdev -l fcsX -a num_cmd_elems=2048    # hba queue depth
    chdev -l fcsX -a max_xfer_size=0x100   # max transfer size

    # scsi protocol adapter settings
    chdev -l fscsiX -a fc_err_recov=fast_fail   # fail I/Os fast
    chdev -l fscsiX -a dyntrk=yes               # allow AIX to track binding changes

    # Symmetrix/DMX hdisk
    chdev -l $i -a queue_depth=128   # hdisk queue depth

    # Clariion
    chdev -l $i -a queue_depth=32    # hdisk queue depth

Please note: just maxing out the queue depth on the host can cause problems. The storage system port also has a queue depth. If your storage system ports are shared across multiple servers, then you don't want the total queue depth across the servers to overflow the queue depth on the storage port. In other words, you don't want a server to fire an I/O at a storage system only to have the storage system send back an error saying its queue is full, requiring a retry of the I/O. This is a major performance hit.

How do you do this? In all practical sense, you don't. Find out what your vendor recommends and use that.

On AIX you can tell whether the host's own hdisk queues have filled up from "iostat -Dl", although this tells you nothing about the storage system or HBA.

Michael Green wrote on 05/03/2010 09:26 AM:

> Have you actually configured queue depth (at HBA level I guess) to
> anything other than default? If yes, would you please share the
> details with us?
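The warning above about shared storage ports can be sketched as simple arithmetic. This is only a rough illustration, not an EMC procedure: the function names, the three-server layout, and the port limit of 2048 are all assumptions made up for the example, and the real limit has to come from the array vendor's documentation. The model also ignores the HBA-level cap (num_cmd_elems) and assumes every hdisk queue could be full at once.

```python
def port_queue_demand(hosts):
    """Worst-case outstanding I/Os a shared storage port could see.

    hosts: list of (luns_on_port, hdisk_queue_depth) pairs, one per server.
    """
    return sum(luns * depth for luns, depth in hosts)


def fits_port(hosts, port_queue_depth):
    """True if the combined host queues cannot overflow the port queue."""
    return port_queue_demand(hosts) <= port_queue_depth


# Hypothetical layout: three servers, each with 6 LUNs behind one port.
hosts_at_32 = [(6, 32)] * 3     # hdisk queue_depth=32, as in the Clariion setting
hosts_at_128 = [(6, 128)] * 3   # hdisk queue_depth=128, as in the Symmetrix/DMX setting
PORT_LIMIT = 2048               # assumed value, NOT a vendor figure

for label, hosts in (("qd=32", hosts_at_32), ("qd=128", hosts_at_128)):
    demand = port_queue_demand(hosts)
    print(label, "demand:", demand, "fits:", fits_port(hosts, PORT_LIMIT))
```

With these made-up numbers the same three servers stay under the port limit at a queue depth of 32 and exceed it at 128, which is the situation the post warns about when the host setting is simply maxed out.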
Re: mediawait during backup to disk
I'm fairly sure I remember an IBM recommendation that suggested you should try to have broadly as many individual disk volumes defined in a stgpool as you will have concurrent client sessions writing to that stgpool.

David McClelland
London

On 3 May 2010, at 13:48, Howard Coles wrote:

> Smaller storage pool volumes and more of them. Then TSM only has to
> open and use what it needs at the time, and more processes can be run
> successfully at the same time. Of course this all depends on how much
> RAM and processor power you have as well. The Single LUN shouldn't be a
> problem as long as you have a good queue depth, but I would definitely
> break up the pool into multiple volumes.
Re: mediawait during backup to disk
On Mon, May 3, 2010 at 3:48 PM, Howard Coles wrote:

> Smaller storage pool volumes and more of them. Then TSM only has to
> open and use what it needs at the time, and more processes can be run
> successfully at the same time. Of course this all depends on how much
> RAM and processor power you have as well. The Single LUN shouldn't be a
> problem as long as you have a good queue depth, but I would definitely
> break up the pool into multiple volumes.

Have you actually configured queue depth (at HBA level I guess) to anything other than default? If yes, would you please share the details with us?
Re: mediawait during backup to disk
Smaller storage pool volumes and more of them. Then TSM only has to open and use what it needs at the time, and more processes can be run successfully at the same time. Of course this all depends on how much RAM and processor power you have as well. The Single LUN shouldn't be a problem as long as you have a good queue depth, but I would definitely break up the pool into multiple volumes.

See Ya'
Howard Coles Jr.
John 3:16!

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ads...@vm.marist.edu] On Behalf Of Mehdi Salehi
Sent: Sunday, May 02, 2010 10:56 PM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] mediawait during backup to disk

- One point is that you gain virtually nothing by 6 volume groups. You can put the six LUNs in a single VG, no performance difference I believe.
- Have you tuned the queue_depth of the LUNs based on AMS documentation. Check it by "lsattr -El hdiskX"

Here is my question from the group:
Which one performs better?
a- One big TSM volume on a 2TB LUN
b- More than one TSM volumes on a 2TB LUN

Thanks
Re: mediawait during backup to disk
- One point is that you gain virtually nothing from 6 volume groups. You can put the six LUNs in a single VG; there is no performance difference, I believe.
- Have you tuned the queue_depth of the LUNs based on the AMS documentation? Check it with "lsattr -El hdiskX".

Here is my question from the group:
Which one performs better?
a- One big TSM volume on a 2TB LUN
b- More than one TSM volume on a 2TB LUN

Thanks
Re: mediawait during backup to disk
Allen,

Thank you for that bit of advice. It was just what I needed. The question of how small to make the TSM volumes was on my mind.

Our disk pools are on Hitachi AMS2300 SAS drives. We have use of six 2 TB LUNs. One volume group is defined per LUN, then 14 TSM volumes per VG at 125 GB. The TSM volumes are defined to the storage pools going across (then down) the volume groups rather than simply down one volume group at a time. Our database is on USP-V.

If you or others wish to offer any more specific advice about this architecture I would be glad to receive it.

With my thanks and best wishes,
Keith
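The "across (then down)" ordering described above can be sketched as follows. This is only an illustration of the ordering and the capacity arithmetic from the figures in the post (six VGs, 14 volumes each, 125 GB per volume); the function name and the (vg, volume) numbering are hypothetical, and no TSM commands are involved.

```python
def across_then_down(num_vgs, vols_per_vg):
    """Return (vg, vol) pairs so consecutive definitions land on different VGs.

    Volume 1 of every VG comes first, then volume 2 of every VG, and so on.
    """
    return [(vg, vol)
            for vol in range(1, vols_per_vg + 1)
            for vg in range(1, num_vgs + 1)]


NUM_VGS = 6        # one volume group per 2 TB LUN
VOLS_PER_VG = 14   # TSM volumes per volume group
VOL_SIZE_GB = 125  # size of each TSM volume

order = across_then_down(NUM_VGS, VOLS_PER_VG)
per_vg_gb = VOLS_PER_VG * VOL_SIZE_GB   # space used in each VG
total_gb = len(order) * VOL_SIZE_GB     # space across the whole pool

print("first definitions:", order[:7])
print("volumes:", len(order), "GB per VG:", per_vg_gb, "GB total:", total_gb)
```

The point of the ordering is that any run of six consecutive volumes touches all six volume groups, so a pool that fills volumes in definition order spreads its writes across the LUNs instead of loading one at a time.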
Re: mediawait during backup to disk
Robert, Thank you. It is a mix of Windows and Linux clients. And, we impose a cloptset on all clients that includes a dirmc that writes to disk. Best wishes, Keith
Re: mediawait during backup to disk
>> On Fri, 30 Apr 2010 14:07:56 -0400, Keith Arbogast said:

> I'll redefine smaller, more plentiful TSM volumes for this pool, and
> see if that keeps the sand out of my socks.

Heh. :) Good luck.

Don't go nuts with the "smaller" direction. There are major performance problems at the extremes. Unless you're running on something radically distributed like an XIV, you're going to want to keep the underlying spindles in mind.

Every RAID group constitutes a contention domain. 100 processes trying to write to the same RAID5 is similar to 100 processes trying to write to the same spindle. In fact, they _are_ all trying to write to the same spindle, times the number of spindles in your group.

If you don't have some other force guiding your decision, I'd suggest a one-digit number of volumes per contention domain. Maybe even less than 5.

- Allen S. Rout
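One way to combine this advice with the rule of thumb quoted elsewhere in the thread (roughly one volume per concurrent client session) is sketched below. It is not an IBM or TSM formula. The function names are hypothetical, the default cap of 4 is just one reading of "maybe even less than 5", and the sketch assumes sessions spread evenly and that each RAID group is a separate contention domain.

```python
def sessions_per_domain(concurrent_sessions, raid_groups):
    """Sessions landing on each RAID group if they spread evenly (rounded up)."""
    return -(-concurrent_sessions // raid_groups)  # ceiling division


def volume_plan(concurrent_sessions, raid_groups, max_vols_per_domain=4):
    """Return (volumes per RAID group, total volumes).

    Aim for one volume per concurrent session, but never exceed the cap
    per contention domain.
    """
    wanted = sessions_per_domain(concurrent_sessions, raid_groups)
    per_domain = min(wanted, max_vols_per_domain)
    return per_domain, per_domain * raid_groups


# Hypothetical workloads against six RAID groups.
print(volume_plan(12, 6))    # light load: the session count decides
print(volume_plan(100, 6))   # heavy load: the per-domain cap decides
```

Under heavy load the cap wins, so some sessions share a volume rather than every session getting its own and all of them hammering the same spindles.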
Re: mediawait during backup to disk
Are these Windows clients? Is the dirmc (directory management class) pointed to disk or tape? You may have Windows clients queueing up to write their directories to tape.

[RC]

From: Keith Arbogast
To: ADSM-L@VM.MARIST.EDU
Date: 04/30/2010 08:09 AM
Subject: [ADSM-L] mediawait during backup to disk
Sent by: "ADSM: Dist Stor Manager"

> What does it mean when a node incurs mediawait during a backup to disk?
> The disk pool was not full, or even close to the Hi Mig trigger. The
> mediawait condition is reported by TSMManager, not the Activity Log, but
> the node record does show roughly 50% pct mediawait.
>
> Several clients showed this condition in TSMManager and the Pct Mediawait
> field, and all during the same period of the night. All tape drives would
> have been busy during this time, writing offsite backups of disk pools at
> our other data center.
>
> With hopeful best wishes,
> Keith Arbogast
Re: mediawait during backup to disk
Thanks to all who offered advice. I may have been taking the book definition of mediawait too literally -- as pertaining to removable volumes, not disk.

I was able to correlate the MediaWait condition shown by TSMManager with dsmaccnt.log records for the nodes last night. 20 nodes were affected between 2:30 and 3:45 AM. None the rest of the night. And none the rest of the month, except for one other night.

The Maximum Size Threshold on the pool is set to No Limit. So, I believe biggee files are ruled out.

Finishing one backup and housekeeping cycle before the next one begins is like sweeping sand back into the ocean. We are always eager to learn of a better broom. I'll redefine smaller, more plentiful TSM volumes for this pool, and see if that keeps the sand out of my socks.

With my thanks and best wishes,
Keith Arbogast
Re: mediawait during backup to disk
Check out the "Maximum Size Threshold" that is defined for the disk pool in question. Perhaps you are trying to back up a file larger than the threshold and now TSM needs to mount a tape to handle that file...

-Jeff Nast
SMDC Health Systems
Duluth MN

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ads...@vm.marist.edu] On Behalf Of John D. Schneider
Sent: Friday, April 30, 2010 10:33 AM
To: ADSM-L@vm.marist.edu
Subject: Re: mediawait during backup to disk

Keith,

You could mitigate this somewhat by creating more disk storage pool volumes, and spreading them across more physical disks. I don't entirely agree with Allen that your disk pool is going to be slower than your incoming networks. It is certainly possible to build a disk storage pool environment that can respond that fast. However, you have to decide whether it is worth spending any money to do so.

How close are you to running outside your backup window? If you are getting all your backups done in a timely manner, then the actual speed doesn't matter. But if you are pressing your window, you might have justification for some engineering improvements.
Re: mediawait during backup to disk
Keith,

You could mitigate this somewhat by creating more disk storage pool volumes, and spreading them across more physical disks. I don't entirely agree with Allen that your disk pool is going to be slower than your incoming networks. It is certainly possible to build a disk storage pool environment that can respond that fast. However, you have to decide whether it is worth spending any money to do so.

How close are you to running outside your backup window? If you are getting all your backups done in a timely manner, then the actual speed doesn't matter. But if you are pressing your window, you might have justification for some engineering improvements.

Best Regards,
John D. Schneider
The Computer Coaching Community, LLC
Office: (314) 635-5424 / Toll Free: (866) 796-9226
Cell: (314) 750-8721

-------- Original Message --------
Subject: Re: [ADSM-L] mediawait during backup to disk
From: "Allen S. Rout"
Date: Fri, April 30, 2010 10:14 am
To: ADSM-L@VM.MARIST.EDU

>> On Fri, 30 Apr 2010 11:07:35 -0400, Keith Arbogast said:

> What does it mean when a node incurs mediawait during a backup to
> disk?

It means that the process was blocking on disk writes. Your disk (usually) can't write as fast as your various networks can pass you data. So there's some fraction of media wait, on a fine granularity.

- Allen S. Rout
Re: mediawait during backup to disk
>> On Fri, 30 Apr 2010 11:07:35 -0400, Keith Arbogast said:

> What does it mean when a node incurs mediawait during a backup to
> disk?

It means that the process was blocking on disk writes. Your disk (usually) can't write as fast as your various networks can pass you data. So there's some fraction of media wait, on a fine granularity.

- Allen S. Rout
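The explanation above can be turned into a deliberately crude model. This is not how TSM computes its media wait figure, which the thread does not describe; it only shows why a disk pool that absorbs data at half the rate the network delivers it would leave a session blocked about half the time. The function name and the throughput numbers are assumptions for illustration.

```python
def media_wait_fraction(network_mb_s, disk_mb_s):
    """Toy estimate of the share of session time spent blocked on disk writes.

    Assumes the session is limited only by the slower of the two rates and
    that all time beyond the network-limited transfer time is media wait.
    """
    if network_mb_s <= 0 or disk_mb_s <= 0:
        raise ValueError("rates must be positive")
    if disk_mb_s >= network_mb_s:
        return 0.0  # disk keeps up, so no waiting in this model
    return 1.0 - disk_mb_s / network_mb_s


# Hypothetical rates: network can deliver 100 MB/s, disk pool absorbs 50 MB/s.
print(media_wait_fraction(100, 50))
# Disk faster than the network: the model reports no media wait.
print(media_wait_fraction(100, 200))
```

In this model anything that lowers the write rate a session actually gets, such as many sessions contending for the same RAID group, raises the fraction even though the pool is nowhere near full.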