On 04/24/2013 08:24 AM, Robert Haas wrote:

> Are you referring to the fact that vm.zone_reclaim_mode = 1 is an
> idiotic default?

Well... it is. But even on systems where it's not the default, or where it's explicitly disabled, there's just something hideously wrong with NUMA in general. Take a look at the NUMA distribution on one of our heavily loaded systems:

available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14 16 18 20 22
node 0 size: 36853 MB
node 0 free: 14315 MB
node 1 cpus: 1 3 5 7 9 11 13 15 17 19 21 23
node 1 size: 36863 MB
node 1 free: 300 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10

What the hell? Seriously? Starting under numactl in interleave mode didn't fix this, either. The kernel just... arbitrarily ignores a huge chunk of memory (about 14 GB sitting free on node 0 while node 1 is down to 300 MB) for no discernible reason.
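To put a number on that imbalance, here is a minimal Python sketch that parses `numactl --hardware` style text and reports the free fraction per node. The function names and the SAMPLE constant are made up for illustration, and the regex assumes the "node N size/free: X MB" layout shown above; other numactl versions may print something different.

```python
import re

# Sample text copied from the numactl output quoted in this message.
SAMPLE = """\
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14 16 18 20 22
node 0 size: 36853 MB
node 0 free: 14315 MB
node 1 cpus: 1 3 5 7 9 11 13 15 17 19 21 23
node 1 size: 36863 MB
node 1 free: 300 MB
"""

# Matches lines such as "node 0 size: 36853 MB" or "node 1 free: 300 MB".
_NODE_LINE = re.compile(r"^node (\d+) (size|free): (\d+) MB\s*$", re.MULTILINE)


def parse_numa_memory(text):
    """Return {node_id: {"size": MB, "free": MB}} for each node found in text."""
    nodes = {}
    for node_id, field, megabytes in _NODE_LINE.findall(text):
        nodes.setdefault(int(node_id), {})[field] = int(megabytes)
    return nodes


def free_fraction(nodes):
    """Return {node_id: free/size} for nodes that report both values."""
    return {
        node_id: info["free"] / info["size"]
        for node_id, info in nodes.items()
        if "free" in info and info.get("size")
    }


if __name__ == "__main__":
    for node_id, fraction in sorted(free_fraction(parse_numa_memory(SAMPLE)).items()):
        print(f"node {node_id}: {fraction:.1%} free")
```

On the figures above, node 0 comes out at roughly 39% free while node 1 is under 1%.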

The memory pressure code in Linux is extremely fucked up. I can't find the reference right now, but the memory management algorithm makes some pretty ridiculous assumptions about what belongs in the active and inactive cache once you pass half of memory in use.

I hate to rant, but it gets clearer to me every day that Linux is optimized for desktop systems, and generally only kinda works for servers. Once you start throwing vast amounts of memory, CPU, and processes at it though, things start to get unpredictable.

That all goes back to my earlier threads, where disabling process autogrouping via the kernel.sched_autogroup_enabled setting magically gave us 20-30% better performance. The optimal setting for a server is clearly to disable process autogrouping, and yet it's enabled by default, and strongly advocated by Linus himself as a vast improvement.
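For anyone who wants to check their own boxes, here is a rough Python sketch that reads the two settings mentioned in this thread straight from /proc/sys. The helper names are invented for illustration, and the "preferred" values of 0 are just the preferences argued for here, not a universal recommendation. A setting that is absent (for example, a kernel built without autogroup support) is reported as unavailable.

```python
from pathlib import Path

# Values this thread argues for on a database server; an assumption, not a rule.
PREFERRED = {
    "vm.zone_reclaim_mode": "0",
    "kernel.sched_autogroup_enabled": "0",
}


def read_sysctl(name, root="/proc/sys"):
    """Return the current value of a sysctl as a string, or None if unreadable.

    Maps a dotted name such as "vm.zone_reclaim_mode" to
    <root>/vm/zone_reclaim_mode, which is where Linux exposes it.
    """
    path = Path(root, *name.split("."))
    try:
        return path.read_text().strip()
    except OSError:
        return None


def report(root="/proc/sys"):
    """Return one human-readable status line per setting in PREFERRED."""
    lines = []
    for name, wanted in PREFERRED.items():
        current = read_sysctl(name, root)
        if current is None:
            status = "not available on this kernel"
        elif current == wanted:
            status = f"{current} (matches the preferred value)"
        else:
            status = f"{current} (this thread would prefer {wanted})"
        lines.append(f"{name} = {status}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

This only reads the values. Changing them still means sysctl or /etc/sysctl.conf, run as root.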

I get it. It's better for desktop systems. But the LAMP stack alone probably has a couple orders of magnitude more use cases than Joe Blow's Pentium 4 in his basement. Yet it's the latter case that gets optimized for.

Servers are getting shafted in a lot of cases, and it's actually starting to make me angry.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-676-8870
stho...@optionshouse.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
