Interesting!  Thanks for your response Vic.
I'm not sure it is working as designed.  Eventually, when we use up
our swap, WAS crashes OOM (that's *our* real issue, at least our
biggest one anyway :).  But if we are able to swapoff/swapon and
recover that space without crashing WAS, that kind of says to me that
it didn't need it anyway - of course, I haven't tried that while
workload was running through...  Maybe it is destructive.
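For reference, the drain/recover cycle I have in mind is just the
following (the device name is made up; swapoff has to migrate any
in-use pages elsewhere first, so it can fail if there's nowhere to
put them):

    # swapoff /dev/dasdb1   # forces in-use pages back to RAM or other swap
    # swapon /dev/dasdb1    # re-enable the (now empty) device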

We plan to experiment some with vm.swappiness and see if that helps.
I guess at the very least, we can add enough VDISKs and enough VM
paging packs to get through the week without a recycle until we
figure this out, as long as response time & CPU savings remain this
good with 6.1.
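For the archives, the knobs we plan to play with are along these
lines (the value 10 is just a guess to start from):

    # sysctl vm.swappiness                            # show current value (default 60)
    # sysctl -w vm.swappiness=10                      # try a lower value on the fly
    # echo "vm.swappiness = 10" >> /etc/sysctl.conf   # make it stick across boots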


Marcy Cortes 
 
"This message may contain confidential and/or privileged information. If
you are not the addressee or authorized to receive this for the
addressee, you must not use, copy, disclose, or take any action based on
this message or any information herein. If you have received this
message in error, please advise the sender immediately by reply e-mail
and delete this message. Thank you for your cooperation."


-----Original Message-----
From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of
Vic Cross
Sent: Monday, October 29, 2007 7:58 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: [LINUX-390] Swap oddities

On Sun, 28 Oct 2007 08:41:16 am Marcy Cortes wrote:

> So, if I'm understanding right, those would be dirty pages no longer 
> needed hanging out there in swap?

That's right -- but you'll get arguments on the definition of "no longer
needed".  Having sent a page to the swap device, Linux will keep it out
there even if the page gets swapped in.  The reason: if the page again
needs to be swapped out, and it wasn't modified while it was swapped
back in, you save an I/O (so the claim is that it's not that it's "no
longer needed", it's that it's "not needed right now but might be again
soon").

I read about this and other interesting behaviours at
http://linux-mm.org -- it seems that the operation of Linux's memory
management has generated enough discussion for someone to start a wiki
on it. :)

The real issue in terms of VDISK is that even if we could eliminate the
"keep it in case we need it" behaviour of Linux, there's no way for
Linux to inform CP that a page of a VDISK is no longer needed and can be
de-allocated.  Even doing swapoff/swapon with an intervening mkswap,
or even chccwdev-ing the thing offline from Linux and back on again,
won't tell CP that it can flush the disk -- AFAIK, only DELETE/DEFINE
would do it.
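If you really wanted to hand the space back to CP, the whole dance
would look something like this (device numbers and the block count
are invented, vmcp needs the vmcp module loaded, and I'm assuming the
FBA VDISK comes back as the same dasd name):

    # swapoff /dev/dasde1
    # chccwdev -d 0.0.0203                      # take the VDISK offline in Linux
    # vmcp detach 203                           # CP frees the backing pages here
    # vmcp define vfb-512 as 203 blk 1024000    # recreate it, empty
    # chccwdev -e 0.0.0203
    # mkswap /dev/dasde1 && swapon /dev/dasde1

Not something you'd want to script into normal operations, though.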

> I thought the point of the prioritized swap was that it'd keep
> reusing those on the highest numbered disks before starting down to
> the next disk.  It was well into the 3rd disk (they are like 250M,
> 500M, 1G, 1G).  (At least I think it used to work that way!)  Could
> there be a Linux bug here?

From what I've seen, Linux is working as designed unfortunately.  The
hierarchy of swap devices was a theory (tested by others much more
skilled and equipped than me, even though I drew the funny pictures of
it in the ISP/ASP Redbook).  Regardless, it was only meant as an
indicator for how big your *central storage* needs to be; as soon as the
guest touched the second disk it was a flag to increase the central.
(Can't increase central?  Divide the workload across a number of
guests.)  Ideally you *never* want to swap; having a swap device that's
almost as fast as memory helps mitigate the cost of swapping, but using
that fast swap is not a habit to keep up.
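For completeness: the priorities come from the pri= option, either on
the swapon command line or in /etc/fstab, and higher numbers are used
first.  /proc/swaps shows the ordering and fill levels.  Devices,
sizes, and priorities here are illustrative:

    /dev/dasdb1   swap   swap   pri=30   0 0   # 250M VDISK, used first
    /dev/dasdc1   swap   swap   pri=20   0 0   # 500M
    /dev/dasdd1   swap   swap   pri=10   0 0   # 1G, overflow only

    # cat /proc/swaps    # check Used and Priority per device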

It's also quite possible that your smaller devices became fragmented and
unable to satisfy a request for a large number of contiguous pages.
Such fragmentation would make it ever more likely that the later devices
would get swapped-onto as your uptime wore on.

> Seems like vm.swappiness=0 (or at least a lower number than the
> default of 60) would be a good setting for Linux under VM.  Has
> anyone studied this?

/proc/sys/vm/swappiness was introduced with kernel 2.6 [1].  The doco
suggests that using swappiness=0 makes the kernel behave like it used
to in the 2.4 (and earlier) days -- sacrifice cache to reduce
swapping.  I have seen SLES 9 systems (with 2.6 kernels) appear to
use far more memory than equivalent SLES 8 systems (kernel 2.4), so
from experience a low value is useful for the z/VM environment [2].
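A quick way to see the difference is to watch how much of a guest's
memory goes to page cache on each kernel (numbers invented, output
trimmed):

    # grep -E '^(Buffers|Cached)' /proc/meminfo
    Buffers:           18240 kB
    Cached:           412672 kB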

CMM is meant to be the remedy to all of this of course.  Now we can give
all our Linux guests a central storage allocation beyond their wildest
dreams (I'm kidding), and let VMRM handle the dirty work for us.  I
could imagine that we could be a bit more relaxed about our
vm.swappiness value then -- we still don't want each of our penguins to
buffer up its disks, but perhaps the consequences aren't as severe when
allocations are more fluid and more effective sharing is taking
place [3].  Unfortunately I haven't used CMM in anger as I'm a little
light on systems to play with nowadays.
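For anyone wanting to dabble: on the Linux side CMM is just the cmm
module, and you can drive it by hand without VMRM to get a feel for
it (the values below are made up):

    # modprobe cmm sender=VMRMSVM          # VMRMSVM is the usual VMRM service machine
    # echo 8192 > /proc/sys/vm/cmm_pages   # hand 8192 pages (32M) back to CP
    # cat /proc/sys/vm/cmm_pages           # how many pages are currently given up

With VMRM doing the driving, of course, you leave cmm_pages alone.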

Cheerio,
Vic Cross

[1] "Swappiness" controls the likelihood that a given page of memory
will be retained as cache if the kernel needs memory -- it's a range
from 100 (means cache pages are preserved and non-cache pages are
swapped out to satisfy the
request) to 0 (means cache pages are flushed to free memory to satisfy
the request).
[2] If only to preserve the way that we used to tune our guests prior to
2.6. :) [3] We might even be able to do the Embedded Linux thing and
disable swapping entirely!


----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions, send
email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or
visit http://www.marist.edu/htbin/wlvindex?LINUX-390
