Interesting! Thanks for your response, Vic. I'm not sure it is working as designed. Eventually, when we use up our swap, WAS crashes OOM (that's *our* real issue, at least our biggest one anyway :). But if we are able to swapoff/swapon and recover that space without crashing WAS, that kind of says to me that it didn't need it anyway -- of course I haven't tried that while workload was running through... Maybe it is destructive.
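For anyone wanting to try the drain-and-reclaim experiment described above, the sequence would look something like this (the device name is a placeholder -- substitute your own VDISK swap device; note swapoff fails with "Cannot allocate memory" if the in-use pages won't fit back into real storage, so this can itself push a tight guest toward OOM):

```shell
# Drain a swap device back into memory, then re-enable it.
# /dev/dasdx1 is an illustrative device name, not from the thread.
swapoff /dev/dasdx1        # pulls in-use pages back to RAM; may fail if RAM is short
mkswap /dev/dasdx1         # rebuild the swap signature
swapon -p 10 /dev/dasdx1   # re-enable at the desired priority
```

As Vic notes below, none of this tells CP it can de-allocate the VDISK pages -- it only empties the swap area from Linux's point of view.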
We plan to experiment some with vm.swappiness and see if that helps. I guess at the very least, we can add enough VDISKs and enough VM paging packs to get through a week without a recycle until we figure this out, as long as response time & CPU savings remain this good with 6.1.

Marcy Cortes

"This message may contain confidential and/or privileged information. If you are not the addressee or authorized to receive this for the addressee, you must not use, copy, disclose, or take any action based on this message or any information herein. If you have received this message in error, please advise the sender immediately by reply e-mail and delete this message. Thank you for your cooperation."

-----Original Message-----
From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of Vic Cross
Sent: Monday, October 29, 2007 7:58 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: [LINUX-390] Swap oddities

On Sun, 28 Oct 2007 08:41:16 am Marcy Cortes wrote:
> So, if I'm understanding right, those would be dirty pages no longer
> needed hanging out there in swap?

That's right -- but you'll get arguments on the definition of "no longer needed". Having sent a page to the swap device, Linux will keep it out there even if the page gets swapped in. The reason: if the page again needs to be swapped out, and it wasn't modified while it was swapped back in, you save an I/O (so the claim is that it's not that it's "no longer needed", it's that it's "not needed right now but might be again soon").

I read about this and other interesting behaviours at http://linux-mm.org -- it seems that the operation of Linux's memory management has generated enough discussion for someone to start a wiki on it. :)

The real issue in terms of VDISK is that even if we could eliminate the "keep it in case we need it" behaviour of Linux, there's no way for Linux to inform CP that a page of a VDISK is no longer needed and can be de-allocated.
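You can watch the "kept out there in case we need it again" behaviour Vic describes through the standard /proc interfaces on a 2.6 kernel -- SwapCached counts pages that exist both in memory and on a swap device at the same time:

```shell
# SwapCached = pages resident in RAM that still hold a slot on the
# swap device (saved I/O if they must be swapped out again unmodified).
grep -E '^(SwapCached|SwapTotal|SwapFree)' /proc/meminfo

# Per-device usage and priority, i.e. the fill order across your VDISKs.
cat /proc/swaps
```

A nonzero SwapCached alongside a large used figure in /proc/swaps is consistent with "working as designed" rather than leaked swap.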
Even doing swapon/swapoff, with an intervening mkswap, even chccwdev-ing the thing off from Linux and back on again, won't tell CP that it can flush the disk -- AFAIK, only DELETE/DEFINE would do it.

> I thought the point of the prioritized swap was that it'd keep reusing
> those on the highest numbered disks before starting down to the next
> disk. It was well into the 3rd disk (they are like 250M, 500M, 1G, 1G).
> (at least I think it used to work that way!). Could there be a linux
> bug here?

From what I've seen, Linux is working as designed, unfortunately. The hierarchy of swap devices was a theory (tested by others much more skilled and equipped than me, even though I drew the funny pictures of it in the ISP/ASP Redbook). Regardless, it was only meant as an indicator for how big your *central storage* needs to be; as soon as the guest touched the second disk it was a flag to increase the central. (Can't increase central? Divide the workload across a number of guests.) Ideally you *never* want to swap; having a swap device that's almost as fast as memory helps mitigate the cost of swapping, but using that fast swap is not a habit to keep up.

It's also quite possible that your smaller devices became fragmented and unable to satisfy a request for a large number of contiguous pages. Such fragmentation would make it ever more likely that the later devices would get swapped onto as your uptime wore on.

> Seems like vm.swappiness=0 (or at least a lower number than the default
> of 60) would be a good setting for Linux under VM. Has anyone studied
> this?

/proc/sys/vm/swappiness was introduced with kernel 2.6 [1]. The doco suggests that using swappiness=0 makes the kernel behave like it used to in the 2.4 (and earlier) days -- sacrifice cache to reduce swapping. I have seen SLES 9 systems (with 2.6 kernels) appear to use far more memory than equivalent SLES 8 systems (kernel 2.4), so from experience a low value is useful for the z/VM environment [2].
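For the record, the knob itself is a one-liner to experiment with (the value 10 below is purely illustrative -- the thread doesn't settle on a number, so treat it as something to measure, not a recommendation):

```shell
# Read the current value (the 2.6 default is 60).
cat /proc/sys/vm/swappiness

# Lower it at runtime (root required; reverts at reboot).
sysctl -w vm.swappiness=10

# Make it persistent across reboots.
echo 'vm.swappiness = 10' >> /etc/sysctl.conf
```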
CMM is meant to be the remedy to all of this, of course. Now we can give all our Linux guests a central storage allocation beyond their wildest dreams (I'm kidding), and let VMRM handle the dirty work for us. I could imagine that we could be a bit more relaxed about our vm.swappiness value then -- we still don't want each of our penguins to buffer up its disks, but perhaps the consequences aren't as severe when allocations are more fluid and more effective sharing is taking place [3]. Unfortunately I haven't used CMM in anger, as I'm a little light on systems to play with nowadays.

Cheerio,
Vic Cross

[1] "Swappiness" controls the likelihood that a given page of memory will be retained as cache if the kernel needs memory -- it's a range from 100 (cache pages are preserved and non-cache pages are swapped out to satisfy the request) to 0 (cache pages are flushed to free memory to satisfy the request).
[2] If only to preserve the way that we used to tune our guests prior to 2.6. :)
[3] We might even be able to do the Embedded Linux thing and disable swapping entirely!

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
----------------------------------------------------------------------