Well, I'd say that seems to be what is causing your intermittent failures, then. Unfortunately, there is no "magic bullet" to fix this situation -- it requires the cooperation of all the admins involved (TSM, DBA, Unix, applications), and the TSM admin has the responsibility to educate all parties about the interactions. For example, the DBAs must be made aware that if they set their parallelism (is that the right term?) too high (higher than MAXNUMMP), some channels will not be able to get a mount point, and the RMAN jobs will fail.
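
To make that interaction concrete, here is a minimal sketch of the kind of pre-flight check a DBA or TSM admin could run before changing either setting. It is only an illustration of the rule above for backups that go directly to tape. The function names, the node name, and the numbers are all hypothetical, and nothing here talks to TSM or RMAN; you would plug in the values yourself.

```python
from typing import List


def channels_without_mount_point(rman_channels: int, maxnummp: int) -> int:
    """Channels that cannot get a tape mount point (direct-to-tape backups only)."""
    return max(0, rman_channels - maxnummp)


def check_node(node: str, rman_channels: int, maxnummp: int,
               installed_drives: int) -> List[str]:
    """Return human-readable warnings for one node; an empty list means no conflict found."""
    warnings = []
    if maxnummp > installed_drives:
        warnings.append(
            f"{node}: MAXNUMMP={maxnummp} exceeds the {installed_drives} installed drives"
        )
    blocked = channels_without_mount_point(rman_channels, maxnummp)
    if blocked:
        warnings.append(
            f"{node}: {blocked} of {rman_channels} RMAN channels exceed MAXNUMMP={maxnummp}"
        )
    return warnings


if __name__ == "__main__":
    # Hypothetical node: 4 RMAN channels, MAXNUMMP=2, library with 4 drives.
    for line in check_node("ORAPROD", rman_channels=4, maxnummp=2, installed_drives=4):
        print(line)
```

An empty result only means the two settings agree with each other; it says nothing about whether the drives will actually be free when the backup starts.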

Do NOT set MAXNUMMP higher than the number of installed drives; that will almost guarantee failures. If you set it equal to the number of installed drives, then all of those drives must be available to that node whenever it wants them, or there will be failures. That requires coordinated scheduling. The approach I would take is to set MAXNUMMP only as high as the client needs to get its backup done in the time allotted. If a particular node MUST back up 100 GB in ten minutes (as an absurd example), it will need several drives, but if it has four hours to complete its backup, then one drive is plenty.

Another approach, if you have it available, is to use an external scheduler (such as Control-M) rather than the TSM scheduler. Most enterprise-class schedulers can manage the drives as a resource pool, and will only start a backup that needs four drives if four drives are actually available. This is a labor-intensive approach (initially), and it still is not foolproof.

The approach we use is that ALL backups go to a disk pool initially. Nothing goes directly to tape. Disk pools do not use the MAXNUMMP value, so you can run as many channels and sessions as your hardware/OS/TSM can handle. This also eliminates the shoe-shining problem with streaming tape drives such as LTO and DLT (or at least postpones it).

However, this does introduce another problem, at least for TDP Oracle clients: if the disk pool fills up, TDPO will not go to the next pool in the hierarchy like the BA client does; it will fail. In practice this means keeping the migration threshold on those disk pools low enough that there will always be enough space available for TDPO. Again, this requires some careful analysis and cooperation between the TSM admin and the DBA team.

Sorry for the long-winded response, I was on a roll! Hope it helps...
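
One more note on the sizing point above: the arithmetic for "how many drives does this node really need" can be sketched roughly as follows. The per-drive rate of 30 MB/s is purely an assumed figure for illustration, not a measurement; use whatever sustained rate your own drives and network actually deliver. The function name is hypothetical.

```python
import math


def drives_needed(backup_gb: float, window_hours: float, drive_mb_per_sec: float) -> int:
    """Estimate the drives (and so the MAXNUMMP) needed to finish within the window."""
    required_mb_per_sec = backup_gb * 1024 / (window_hours * 3600)
    # A backup always needs at least one drive, however small the required rate.
    return max(1, math.ceil(required_mb_per_sec / drive_mb_per_sec))


if __name__ == "__main__":
    ASSUMED_DRIVE_RATE = 30.0  # MB/s, an assumption for illustration only

    # 100 GB with a four-hour window: about 7 MB/s, so one drive is plenty.
    print(drives_needed(100, 4, ASSUMED_DRIVE_RATE))        # 1

    # 100 GB in ten minutes (the absurd example): about 171 MB/s.
    print(drives_needed(100, 10 / 60, ASSUMED_DRIVE_RATE))  # 6
```

This ignores mount and positioning time and assumes the client can actually feed every drive at full speed, so treat the result as a lower bound.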

Robin Sharpe
Berlex Labs

"ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU> wrote on 09/14/2006 04:29:48 PM:

> Is it possible that there are two or more sessions running for the TDP
> client simultaneously?
>
> Yes, absolutely. The Oracle DBAs have observed that this happens most
> frequently when there's media wait, in which case multiple log backups
> could be running simultaneously.
>
> How do others handle this? Does it make sense to set MAXNUMMP to the
> number of drives...or even higher? I remember being very unhappy when I
> set the value of MAXNUMMP to the number of drives, but I can't remember
> what happened. Maybe I was running into some other problem.
>
> Thanks again for all the ideas,
> anker