Hi,
For anyone interested, I just posted a new 1.6 branch called "bagLRU" at https://github.com/rajiv-kapoor/memcached/tree/bagLRU that includes an optimized engine called "bag_lru_engine". The Bag LRU engine implements striped locks for parallel hash table accesses and a request-handling scheme that gives us lock-free GETs.

Additionally, I have added a command line option for setting the CPU affinity of Memcached worker threads. I have noticed that CPU affinity, along with NIC queue (IRQ) affinity, can help improve performance significantly.

To use the Bag LRU engine, pass the "-E path_to_memcached_root/engines/.libs/bag_lru_engine.so" option to Memcached.

To affinitize Memcached worker threads to "CPUs" (cores or logical threads), use the command line option "-T n", where "n" is the increment used for identifying the next CPU to bind a Memcached thread to. For example, "-T 1" will bind the Memcached threads to CPUs sequentially, starting with CPU 0. "-T 2" will skip every other CPU, starting with CPU 0. The default is "-T 0", meaning no thread affinity.

Using mcblaster to generate a GET-only load, I have tested the Bag LRU engine on a 16-core Sandy Bridge system from 1 to 16 threads, and it shows almost linear scaling with the number of cores. For this test, the thread pinning option used was "-T 1".

If you do try it out, I would love to get feedback on it.

Thanks,
\rajiv
Intel, Corp.
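
P.S. To illustrate the "-T n" increment described above, here is a minimal sketch of the thread-to-CPU mapping. It is not code from the branch: the function name cpu_for_thread is hypothetical, and the wraparound once the index passes the last CPU is an assumption, since the behavior in that case is not described here.

```python
def cpu_for_thread(thread_index, increment, num_cpus):
    """Return the CPU a worker thread would be bound to, or None for no affinity.

    Hypothetical helper for illustration only; not taken from the bagLRU branch.
    """
    if increment == 0:
        # "-T 0" is the default: worker threads are not pinned.
        return None
    # Thread 0 goes to CPU 0, and each later thread moves ahead by `increment`.
    # Assumption: wrap around when the index runs past the last CPU.
    return (thread_index * increment) % num_cpus


if __name__ == "__main__":
    # Show the mapping for the first four worker threads on a 16-CPU machine.
    for n in (0, 1, 2):
        mapping = [cpu_for_thread(i, n, 16) for i in range(4)]
        print(f"-T {n}: {mapping}")
```

With "-T 1" the first four threads map to consecutive CPUs starting at CPU 0, and with "-T 2" they land on every other CPU, which matches the examples given above.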