Hello All,

Many thanks to all who responded to my inquiry. It looks like Mark's response, item #2, was the answer. I had my region size at 1280M and TSM was running just awful. After a phone conversation with Mark, I tried his suggestion of REDUCING the region size. Note the before/after output of the "show memu SHORT" (case sensitive!) command:

Region Size = 1280M

  MAX initial storage 1342177280 (1280.0 MB)
  Freeheld bytes 145620 (0.1 MB)
  MaxQuickFree bytes 26387005 (25.2MB)
  56 Page buffers of 32210 : 315 buffers of 4026.
  4 Large buffers of 2013 : 222 XLarge buffers of 251.
  202 buffers free: 336 hiAlloc buffers: 134 current buffers.
  50 units of 688 bytes hiAlloc: 44 units of 72 bytes hiCur.

Region Size = 512M

  MAX initial storage 536870912 (512.0 MB)
  Freeheld bytes 10280787 (9.8 MB)
  MaxQuickFree bytes 10280878 (9.8 MB)
  56 Page buffers of 12549 : 4 buffers of 1568.
  2 Large buffers of 784 : 18 XLarge buffers of 98.
  66992 buffers free: 81083 hiAlloc buffers: 1903 current buffers.
  28969 units of 56 bytes hiAlloc: 1532 units of 104 bytes hiCur.

Look at the second line of the displays. It appears that with region=1280M the "Freeheld bytes" buffer was WAY under allocated. Only 145K was allocated. With the region size set to 512M 9.8MB was allocated to the buffer and TSM is running significantly better.

Whether or not this will help someone else I do not know. This is the first I've heard that REDUCING region size will help performance. It is counter-intuitive. I had been increasing it slowly over a period of time based on information I had found on ADSM.ORG. It's hard to argue with results however. My maintenance cycle is currently around 3 hours further along today than it usually is.

Take care,
Al

Alan Davenport
Senior Storage Administrator
Selective Insurance Co. of America
[EMAIL PROTECTED]
(973) 948-1306

-----Original Message-----
From: Darby, Mark [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, February 12, 2003 12:03 PM
To: [EMAIL PROTECTED]
Subject: Re: OS390 TSM Performance questions.

Hello, Al.

We have much to share.

We are OS/390 2.10 on a 7060-H50 (~120 MIPS) with approx. 100Mbit network connectivity and have had many, long-standing TSM performance problems. We are currently running 4.2.3.2.
We have discovered in working with TSM support (without a technical explanation as to "why?") that reducing the TSM server's region to 512M and setting (by reducing) bufpoolsize to 131072 (i.e., 128MB) works for us. We had previously tried several region settings from 1.75G down to 960M with the same problematic results until "happening upon" the severely reduced, storage-constrained "settings" with which we are now running (or should I say, limping). This was determined with the help of the Tivoli "performance team" in response to a long string of numerous performance-related PMRs.

Here are some things we have discovered - and which work best for us:

1. Region over 512M causes serious and pervasive performance problems
2. BufPoolSize much over 131072 MAY also cause/contribute similarly (and definitely doesn't help)
3. CPU utilization is VERY high for any database-intensive processes
4. Database corruption may be the root cause for our severe symptoms (this is purely conjecture on my part at this point, but supported, to some degree, by TSM support statements recommending we fix known DB corruption - which, of course, with dump/reload/audit performance being what it is, is an impossible "hit" to take).

FYI: We plan to "move out" of the TSM server with database corruption "into" a new, virgin server(s) as soon as time and other factors permit.

Prior to adjusting our "settings" as indicated above, we were experiencing severe, pervasive, and nearly continual performance problems (and CPU over-utilization), server unresponsiveness, and what I would call "stress-related" failures of all sorts, and a whole plethora of other, unmentioned "problems".

After making "the adjustments" we have found that, although the TSM server still frequently gets "tangled up in its shorts", the problems are not as severe nor are they as frequent or pervasive, and performance is better than when we ran it in the "larger memory footprint".
Although it is closer to acceptable, it is still well below the kind of performance I expect from an application running on the platform (i.e., S/390).

We cannot even imagine a reason why these adjustments have helped, but they have. It is totally counter-intuitive to me that reducing the memory footprint would yield these results, but it has.

I would call IBM/Tivoli support, if I were you, and start a diagnostic regimen with them on your particular issues.

We were told by them that many OS/390 shops are getting far superior performance, throughput, and (I presume) a much better CPU utilization picture than we experience. Further, their stated position is that some environmental factor, unique to "us", is the root cause for our performance issues.

Aside from our limited bandwidth and database corruption "issues", I cannot think of any other factor that sets us apart from all the other users of the TSM server on OS/390.

You are the first shop I have heard reporting an experience similar to ours.

Please feel free to explore this further with me off-line if you wish.

Regards,
Mark Darby
(301) 903-5229

-----Original Message-----
From: Alan Davenport [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, February 12, 2003 10:46 AM
To: [EMAIL PROTECTED]
Subject: OS390 TSM Performance questions.

Hello,

We're running TSM v5.1.5.4 on an IBM 20660A2 processor running OS390 v10. There is a 100Mbit, single port OSA card on the processor. We are backing up 197 clients per night. MAXSCHEDSESSIONS is set to allow 116 simultaneous backup sessions. Our backup window begins at 20:00 and ends at 07:30 the next morning.

We are seeing poor performance on our backups during the window. For example, one server that will back up in 6-7 minutes outside the window takes hours to complete during the window. The TSM server has a region size of 1280M and MPTHREADING is set to YES. Self-tuning of buffer size and TXN size is enabled.
We are backing up to a 100GB disk buffer on an EMC model 8830 drive array. On average we back up 30-40GB per night with a peak of 75-80GB.

I know there are much larger shops backing up many more servers out there running OS390 also. What I would like to know is, at large shops, what is your OSA configuration? Are you running multi-port OSAs and/or gigabit cards? For comparison, I would also like to know how many clients you are backing up per night.

Where do you think the bottleneck is? Have you seen similar problems and what did you do to help alleviate the problem? I am fairly confident that TSM is not CPU constrained during the window. We recently moved TSM to a higher service class with little effect on the problem. Do you feel we are saturating the OSA card?

Any thoughts and suggestions would be greatly appreciated.

Take care,
Al

Alan Davenport
Senior Storage Administrator
Selective Insurance Co. of America
[EMAIL PROTECTED]
(973) 948-1306
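
For anyone weighing the OSA saturation question with their own numbers, here is a rough back-of-envelope sketch in Python. It uses only the figures quoted in the original question (100Mbit link, 20:00 to 07:30 window, 30-40GB average and 75-80GB peak per night). The function names are made up for illustration, and the calculation assumes decimal gigabytes and ignores protocol overhead, so treat the result as an upper bound on what the link could carry, not a measurement.

```python
# Back-of-envelope check of the OSA saturation question.
# All inputs come from the original post; decimal units (1 GB = 10**9 bytes)
# and zero protocol overhead are simplifying assumptions.

LINK_MBIT_PER_S = 100      # single-port 100Mbit OSA card
WINDOW_HOURS = 11.5        # 20:00 to 07:30 the next morning


def link_capacity_gb(link_mbit_per_s, hours):
    """Most data (in GB) the link could move if it ran flat out for the whole window."""
    bytes_per_second = link_mbit_per_s * 1_000_000 / 8
    return bytes_per_second * hours * 3600 / 1_000_000_000


def required_mbit_per_s(nightly_gb, hours):
    """Average line rate (in Mbit/s) needed to move nightly_gb within the window."""
    bits = nightly_gb * 1_000_000_000 * 8
    return bits / (hours * 3600) / 1_000_000


if __name__ == "__main__":
    capacity = link_capacity_gb(LINK_MBIT_PER_S, WINDOW_HOURS)
    print(f"Theoretical link capacity over the window: {capacity:.1f} GB")
    for label, gb in (("average night (40GB)", 40), ("peak night (80GB)", 80)):
        rate = required_mbit_per_s(gb, WINDOW_HOURS)
        print(f"{label}: about {rate:.1f} Mbit/s averaged over the window")
```

Averaged over the whole window, even the peak night needs only a fraction of the link's nominal rate, which is consistent with the bottleneck turning out to be elsewhere. An average says nothing about bursts, though: with up to 116 sessions starting together, the card could still be busy for short periods, so this does not rule out the network on its own.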