Re: Tcp / OSA question

2006-08-18 Thread Macioce, Larry
The prod guest is 1234464k mem and we have 1048936k swap space, and the SRM
settings are:
IABIAS : INTENSITY=90%; DURATION=2
LDUBUF : Q1=300% Q2=200% Q3=100%
STORBUF: Q1=300% Q2=250% Q3=200%
DSPBUF : Q1=32767 Q2=32767 Q3=32767

But the part that's really weird is that the VM LPAR timed out on a ping,
and this was confirmed by the VM screen blanking out and kicking back a
658.
When I say (or type) that the guest goes into a wait, I'm looking at
top on the guest. The number will reach into the mid 40s.


Thanks
Mace
-----Original Message-----
From: The IBM z/VM Operating System [mailto:[EMAIL PROTECTED] On
Behalf Of Neale Ferguson
Sent: Friday, August 18, 2006 11:43 AM
To: IBMVM@LISTSERV.UARK.EDU
Subject: Re: Tcp / OSA question

Sounds like the guest was in the eligible list. When you notice this again,
can you do a #CP IND on a power user like MAINT and see if the E3 figure
is non-zero? What are your SRM settings? How big is the guest? How much
real memory do you have? How much expanded?

-----Original Message-----
A strange thing happened this morning. First off, let me say that the zbox
is sitting on a public addr while the rest of the network is on a private
network. We have been experiencing poor ttl numbers and mediocre response
times when pinging for the public addr.
Anyway, this morning I noticed the wait time on one of the Linux guests was
poor, so I pinged the private router and got good numbers. I pinged the
public router and got poor ttl and decent response numbers. I then tried
to ping the VM machine (from the command line on my Windows desktop). The VM
screen hung for a second and I got an LB658 and the ping timed out. After
about 5 secs the VM session came back and the addr pinged.
The Linux guests never died and they were still going when I looked.
Any ideas???

***
The information transmitted is intended solely for the individual
or entity to which it is addressed and may contain confidential and/or
privileged material. Any review, retransmission, dissemination or
other use of or taking action in reliance upon this information by
persons or entities other than the intended recipient is
prohibited. If you have received this email in error please contact
the sender and delete the material from any computer.
***



Re: Tcp / OSA question

2006-08-18 Thread Adam Thornton

On Aug 18, 2006, at 10:12 AM, Macioce, Larry wrote:


The prod guest is 1234464k mem and we have 1048936k swap space, and the SRM
settings are:
IABIAS : INTENSITY=90%; DURATION=2
LDUBUF : Q1=300% Q2=200% Q3=100%
STORBUF: Q1=300% Q2=250% Q3=200%
DSPBUF : Q1=32767 Q2=32767 Q3=32767

But the part that's really weird is that the VM LPAR timed out on a ping,
and this was confirmed by the VM screen blanking out and kicking back a
658.
When I say (or type) that the guest goes into a wait, I'm looking at
top on the guest. The number will reach into the mid 40s.


How big is your VM system?  That's an awfully large Linux system, and  
maybe what you're seeing is that VM is finding it difficult to get  
the entire guest into working storage at once.  Look at the output of  
free, pay special attention to how much is in buffers and cache, and  
resize the guest's real store to effectively eliminate that amount.


Also, what is going on on the system to make the number of eligible  
processes jump to 40?  I'm going to hazard a guess that it's some  
hideously exuberantly-threaded Java app.  Am I right?


Adam


Re: Tcp / OSA question

2006-08-18 Thread Macioce, Larry
Here are the numbers straight from the horse's mouth:
Mem:   1234464k total,  1224416k used,    10048k free,     2284k buffers
Swap:  1048936k total,   939956k used,   108980k free,   268164k cached
The VM LPAR is 3G and yes, all of the guest's apps are in Java.
The thing about the free and buffer numbers is that they are dynamic, so
I wouldn't know where to change them.
thx 

Mace

-----Original Message-----
From: The IBM z/VM Operating System [mailto:[EMAIL PROTECTED] On
Behalf Of Adam Thornton
Sent: Friday, August 18, 2006 1:33 PM
To: IBMVM@LISTSERV.UARK.EDU
Subject: Re: Tcp / OSA question

On Aug 18, 2006, at 10:12 AM, Macioce, Larry wrote:

 The prod guest is 1234464k mem and we have 1048936k swap space, and the SRM
 settings are:
 IABIAS : INTENSITY=90%; DURATION=2
 LDUBUF : Q1=300% Q2=200% Q3=100%
 STORBUF: Q1=300% Q2=250% Q3=200%
 DSPBUF : Q1=32767 Q2=32767 Q3=32767

 But the part that's really weird is that the VM LPAR timed out on a ping,
 and this was confirmed by the VM screen blanking out and kicking back a
 658.
 When I say (or type) that the guest goes into a wait, I'm looking at
 top on the guest. The number will reach into the mid 40s.

How big is your VM system?  That's an awfully large Linux system, and  
maybe what you're seeing is that VM is finding it difficult to get  
the entire guest into working storage at once.  Look at the output of  
free, pay special attention to how much is in buffers and cache, and  
resize the guest's real store to effectively eliminate that amount.

Also, what is going on on the system to make the number of eligible  
processes jump to 40?  I'm going to hazard a guess that it's some  
hideously exuberantly-threaded Java app.  Am I right?

Adam




Re: Tcp / OSA question

2006-08-18 Thread Adam Thornton

On Aug 18, 2006, at 10:50 AM, Macioce, Larry wrote:


Here are the numbers straight from the horse's mouth:
Mem:   1234464k total,  1224416k used,    10048k free,     2284k buffers
Swap:  1048936k total,   939956k used,   108980k free,   268164k cached

The VM LPAR is 3G and yes, all of the guest's apps are in Java.
The thing about the free and buffer numbers is that they are dynamic, so
I wouldn't know where to change them.


Well, looking at this, you're hitting swap pretty hard, which is  
alarming, but you also have 268M of memory that's being used as DASD  
cache.  I think this implies that the memory needed by the app is  
very spiky: it needs a whole bunch (perhaps reading large files or  
database tables into memory all at once?), and then gives it back.
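
As a rough worked example of the sizing advice earlier in the thread (take
the amount sitting in buffers and cache out of the guest's storage), the
figures quoted above can be plugged into a short Python sketch. The function
name is made up for illustration, and treating the whole "cached" figure as
memory the guest could do without is an assumption, not a measurement.

```python
# Figures from the top output quoted in this thread, in kilobytes.
MEM_TOTAL_K = 1234464
BUFFERS_K = 2284
CACHED_K = 268164


def suggested_guest_size_k(total_k, buffers_k, cached_k):
    """Guest storage left after removing what Linux holds as buffers and cache."""
    return total_k - buffers_k - cached_k


size_k = suggested_guest_size_k(MEM_TOTAL_K, BUFFERS_K, CACHED_K)
print(f"{size_k}k, about {size_k / 1024:.0f} MB")
```

This prints 964016k, about 941 MB, which is a little under 1GB.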


So I'd try making the size of the guest 1GB and seeing if that  
helps.  I would also recommend breaking that swap into at least three  
different tiers at different priorities; maybe 300MB or so of VDISK  
swap at the highest priority, 400 MB of DASD swap at a lower  
priority, and 500 MB of DASD swap below that.  (Note that your swap  
will grow to cover the main storage you're taking away).
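
One way to picture the three tiers is as /etc/fstab swap entries with
different pri= values, since Linux fills the swap area with the highest
priority first. The Python sketch below only prints such entries. The device
names, the priority values, and the function name are hypothetical
placeholders, not details of this system.

```python
# Tiers from the suggestion above: (device, size in MB, backing).
# The device names are hypothetical placeholders.
SWAP_TIERS = [
    ("/dev/dasdb1", 300, "VDISK"),
    ("/dev/dasdc1", 400, "DASD"),
    ("/dev/dasdd1", 500, "DASD"),
]


def fstab_swap_lines(tiers, top_priority=30, step=10):
    """Return fstab lines that give earlier tiers a higher swap priority."""
    lines = []
    for index, (device, size_mb, backing) in enumerate(tiers):
        priority = top_priority - index * step
        # Comment goes on its own line; fstab comments start with '#'.
        lines.append(f"# {size_mb} MB {backing} swap")
        lines.append(f"{device}  swap  swap  pri={priority}  0 0")
    return lines


print("\n".join(fstab_swap_lines(SWAP_TIERS)))
```

The sizes add up to 1200 MB; only the ordering of the priorities matters, so
the actual numbers can be anything that keeps the VDISK tier highest.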


The *real* problem is probably that your java apps are ill-behaved;  
if your Java programmers are like the latte-sipping
little-black-rectangular-glasses-wearing goatee-stroking^W^W^W^W^Wtypical, you  
won't have much luck convincing them to write something that doesn't  
assume it's running in an environment where memory and CPU cycles are  
free.  So I don't think you can do much more than try to tune your  
way around the symptoms.


If the apps are defensively coded, then you might be able to get  
somewhere by restricting the Java heap size and thereby forcing the  
app to *not* slurp entire huge files or tables all at once.  OTOH, if  
they are not well coded, this will just break them, as they will run  
out of memory when they *try* to slurp in data and hit the wall; in  
this case, the developer's solution to running out of memory in his  
app will not have been to restrict the size of the data in core at  
any one time, but just to raise the heap size until it all fit.  Of  
course, as your data grows, this approach becomes less scalable.  And  
without access to the actual app, this is all guesswork anyway.


So my recommendation would be: make that a 1024MB guest, and tier  
your swap space, with at least the first 200MB being swap-to-VDISK.


Adam