> On Jan 27, 2010, at 11:42 AM, RayLicon wrote:
> > Has anyone done research into the performance of
> > SWAP on the traditional partitioned based SWAP device
> > as compared to a SWAP area set up on ZFS with a zvol?
> 
> If you have to use the swap device, you have no
> performance. Case closed.

Umm, I think that very much depends on the use case.  I know one where high 
swap rates make a big difference, but it is kind of specific to HPC clusters 
such as the 12,000 core cluster that Sun is installing downstairs from where I 
work.  See http://nf.nci.org.au for more on the machine.

On this machine, and the previous 1920 core Altix 3000 cluster, heavy swapping 
is an integral part of the compute job preemption strategy.  Large parallel 
compute jobs (certainly up to hundreds of CPUs, not sure of the largest jobs 
people are running) can be scheduled on nodes running small jobs by pre-empting 
the small jobs (using SIGSTOP sent to the job process group) and allowing the 
new job to push the pre-empted job out to swap.  The jobs scheduled and 
actually running on each node are always constrained to the amount of physical 
memory, but there will also be large amounts (multiple GB) of swap actively 
used to hold scheduled but pre-empted jobs.  The high swap rates achievable 
with Linux are a big part of why this strategy works well.  Once the large 
parallel job finishes, the small jobs are resumed and have to page themselves 
back in.
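
For anyone who wants to watch how much of a pre-empted job has actually been 
pushed out, here is a rough sketch in Python.  It assumes a Linux kernel recent 
enough to report a VmSwap line in /proc/<pid>/status; the function names are 
mine and have nothing to do with the cluster's real tooling.

```python
import re
from typing import Optional


def parse_vmswap_kb(status_text: str) -> Optional[int]:
    """Return the VmSwap value (in kB) from /proc/<pid>/status text.

    Returns None when the field is missing, e.g. for kernel threads or
    kernels that do not report per-process swap usage.
    """
    match = re.search(r"^VmSwap:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def swapped_kb(pid: int) -> Optional[int]:
    """Read how many kB of the given process currently sit in swap (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vmswap_kb(f.read())
```

Polling swapped_kb() for the pids of a stopped job should show the figure 
climbing as the incoming job claims memory, and dropping again after the job 
is continued and touches its pages.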

The key distinction is between two kinds of heavy swapping.  One is 
pathological: the sum of your process working sets exceeds physical memory, 
but all those processes are still getting the CPU and causing page faults.  
The other, the job preemption case, is heavy swapping to clear out pages that 
are no longer part of any working set because the process(es) being swapped 
out are suspended.  The latter is much more like "ye olde schoole" UNIX whole 
process swapping.  Page-in rates are also a big deal when those suspended 
processes get sent a SIGCONT to start them running again.
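
The SIGSTOP/SIGCONT mechanics are simple enough to sketch.  This is only a 
minimal illustration in Python, not the scheduler's actual code: preempt(), 
resume() and start_job() are hypothetical names, and it assumes each job runs 
in its own process group.

```python
import os
import signal
import subprocess
import sys


def start_job(argv):
    """Launch a job in a new session, so its process group id equals its pid."""
    return subprocess.Popen(argv, start_new_session=True)


def preempt(pgid: int) -> None:
    """Suspend every process in the group with SIGSTOP.

    Stopped processes no longer touch their pages, so the kernel is free
    to push them out to swap when another job needs the memory.
    """
    os.killpg(pgid, signal.SIGSTOP)


def resume(pgid: int) -> None:
    """Let the group run again; its pages fault back in on demand."""
    os.killpg(pgid, signal.SIGCONT)


if __name__ == "__main__":
    # Stand-in for a small compute job.
    job = start_job([sys.executable, "-c", "import time; time.sleep(60)"])
    pgid = os.getpgid(job.pid)

    preempt(pgid)
    # WUNTRACED makes waitpid report the stop instead of waiting for exit.
    _, status = os.waitpid(job.pid, os.WUNTRACED)
    print("stopped:", os.WIFSTOPPED(status))

    resume(pgid)
    job.terminate()
    job.wait()
```

Note that nothing here asks for swapping explicitly.  Stopping the group just 
freezes its working set, and memory pressure from the new job does the rest.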

You might say this is a rare use case, but it's something that the cluster 
admin always mentions as a real deal-breaker, whenever a gentle Solaris vs 
Linux discussion starts up...

-Jason
-- 
This message posted from opensolaris.org
