No crashes or errors found. I will keep it running in the dev environment.
On Tue, Jan 20, 2015 at 2:37 PM, dormando <dorma...@rydia.net> wrote:
> Thanks!
>
> No crashes is interesting/useful at least? No errors or other problems?
>
> I'm still hoping someone can side-by-side in production with the
> recommended settings. I can come up with synthetic tests all day and it
> doesn't educate in the same way.
>
> On Tue, 20 Jan 2015, Zhiwei Chan wrote:
>
> > Test result:
> > I ran this test last night; the results are as follows:
> >
> > 1. Environment:
> > [root@jason3 code]# lsb_release -a
> > LSB Version: :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
> > Distributor ID: CentOS
> > Description:    CentOS release 6.5 (Final)
> > Release:        6.5
> > Codename:       Final
> > [root@jason3 code]# free
> >              total       used       free     shared    buffers     cached
> > Mem:       8003888    3434536    4569352          0     263324    1372600
> > -/+ buffers/cache:    1798612    6205276
> > Swap:      8142840      11596    8131244
> > [root@jason3 code]# cat /proc/cpuinfo
> > processor  : 0
> > vendor_id  : GenuineIntel
> > cpu family : 6
> > model      : 58
> > model name : Intel(R) Xeon(R) CPU E3-1225 V2 @ 3.20GHz
> > stepping   : 9
> > cpu MHz    : 1600.000
> > cache size : 8192 KB
> > .... 4 cores.
> >
> > 2. Running options:
> > [root@jason3 code]# ps -ef | grep memcached-
> > root  7898     1 11 Jan19 ?     02:12:46 ./memcached-master -c 10240 -o tail_repair_time=7200 -m 64 -u root -p 33333 -d
> > root  8092     1 11 Jan19 ?     02:11:22 ./memcached-lrurework -d -c 10240 -o lru_maintainer lru_crawler -m 64 -u root -p 44444
> > root 10265  9447  0 11:30 pts/1 00:00:00 grep memcached-
> > root 10325     1 11 Jan19 ?     02:06:14 ./memcached-release -d -c 10240 -m 64 -u root -p 55555 -o slab_reassign lru_crawler slab_automove=3 release_mem_sleep=1 release_mem_start=40 release_mem_stop=80 lru_crawler_interval=3600
> >
> > memcached-master: the latest memcached from the master branch, on port 33333.
> > memcached-lrurework: the latest lrurework branch of dormando's memcached, on port 44444.
> > memcached-release: the latest master branch plus the release-memory patch, on port 55555.
> >
> > 3. What is the traffic mode?
> > It simulates the traffic distribution of one of our pools, with the
> > expire-time and value-length distributions as follows:
> > # the expire times of keys
> > expire_time = [1, 5, 10, 30, 60, 300, 600, 3600, 86400, 0]
> > expire_time_weight = [1, 1, 2, 5, 8, 5, 6, 5, 3, 1]
> >
> > # the lengths of values
> > value_len = [4, 10, 50, 100, 200, 500, 1000, 2000, 5000, 10000]
> > value_len_weight = [3, 4, 5, 8, 8, 10, 5, 5, 2, 1]
> >
> > The test is driven by the Python script "compare_test.py":
> > python ./compare_test.py 192.168.116.213:33333,192.168.116.213:44444,192.168.116.213:55555
> >
> > I run the test process on the same machine as the memcached processes,
> > so it is easy to generate a heavy workload.
> >
> > I have the results of the last 12 hours, watched in Cacti. It seems
> > there is no difference between the three for this traffic mode.
> > gets/sets = 9:1
> > hit_rate ~ 50%
> > [IMAGE]
> > I also print some detailed statistics in the test script:
> >
> > Cache list: ['192.168.116.213:33333', '192.168.116.213:44444', '192.168.116.213:55555']
> > send_key_number: 127306                 --> number of unique keys
> > test_loop: 0                            --> loop forever, no limit
> > weight of get/set command: [10, 1]      --> the weights of the get and set commands. Note: if a get misses, the key is set immediately, and that set is not counted into this weight.
> > show_interval: 10                       --> the interval (seconds) for showing statistics
> > stats_interval: 5                       --> the interval (seconds) for fetching memcached's stats
> > show_stats_interval: [60, 3600, 43200]  --> the time ranges shown, in seconds; e.g. 60 means "last 60s", 3600 means "last 3600s"
> > len of keys: [4, 10, 50, 100, 200, 500, 1000, 2000, 5000, 10000]  --> possible lengths of the values we set to memcached
> > weight of keys' len: [3, 4, 5, 8, 8, 10, 5, 5, 2, 1]  --> weights of the different value lengths
> > expire-time of keys: [1, 5, 10, 30, 60, 300, 600, 3600, 86400, 0]  --> possible expire times used in set commands, independent of the value length
> > weight of keys' expire-time: [1, 1, 2, 5, 8, 5, 6, 5, 3, 1]  --> weights of the different expire times
> > ...
> >
> > #28190284 command: 281902842  --> the first number has no meaning; the second is the number of commands sent to memcached.
> >
> > All of the following numbers are recorded as increments, except the second number of the "items" field.
> > 192.168.116.213:33333
> > [60s]   gets: 523063,   hit: 49%, updates: 52141,   dels: 0, items: -8/69423,  read: 53891331,   write: 215106364,   OOMs: 0, evict: 6626
> > [3600s] gets: 29664649, hit: 49%, updates: 2966798, dels: 0, items: 13/69423,  read: 3038408576, write: 12218798832, OOMs: 0, evict: 356323
> > 192.168.116.213:44444
> > [60s]   gets: 523007,   hit: 50%, updates: 52202,   dels: 0, items: -62/69348, read: 53528995,   write: 218847446,   OOMs: 0, evict: 6539
> > [3600s] gets: 29667232, hit: 50%, updates: 2964220, dels: 0, items: -14/69348, read: 3030860658, write: 12405356058, OOMs: 0, evict: 359460
> > 192.168.116.213:55555
> > [60s]   gets: 523093,   hit: 49%, updates: 52116,   dels: 0, items: 28/69396,  read: 52993446,   write: 215231210,   OOMs: 0, evict: 6491
> > [3600s] gets: 29669464, hit: 49%, updates: 2961988, dels: 0, items: -25/69396, read: 3038356827, write: 12219764097, OOMs: 0, evict: 355644
> > ...
> >
> > On Fri, Jan 16, 2015 at 9:29 PM, Zhiwei Chan <z.w.chan.ja...@gmail.com> wrote:
> > Our maintenance team tends to be conservative, especially about basic
> > software that affects performance, so I think it is unlikely we can put
> > this into production soon. But I have written a fairly convenient Python
> > tool for an A/B test. The tool can fake traffic with random expire times
> > and random value lengths, lets you specify the weights of the different
> > expire times and lengths, and has many other functions. It is almost
> > complete, and I can post a result next Monday.
> >
> > On Fri, Jan 16, 2015 at 11:12 AM, dormando <dorma...@rydia.net> wrote:
> > If you want?
> >
> > What would make you confident enough to try the branch in production? Or
> > do you rely on your other patches and that's not really possible?
> >
> > On Thu, 15 Jan 2015, Zhiwei Chan wrote:
> >
> > > I tried to use real application traffic for a comparison test, but it
> > > seems that not all of the teams use a cache client with consistent
> > > hashing in the dev environment. The result is that the traffic is not
> > > distributed as evenly as I expected.
> > > Should I fake the traffic for the comparison test instead of using
> > > real traffic, e.g. send sets and gets with random expire times to
> > > memcached?
> > >
> > > ---------------
> > > Host mc56 runs the latest LRU-rework branch's memcached, started like
> > > "/usr/local/bin/memcached -u nobody -d -c 10240 -o lru_maintainer lru_crawler -m 64 -p 11811";
> > > host mc57 runs version 1.4.20_7_gb118a6c, started like
> > > "/usr/bin/memcached -u nobody -d -c 10240 -o tail_repair_time=7200 -m 64 -p 11811".
> > >
> > > I sum up the stats of all memcached instances on each host and make
> > > the following analysis:
> > >
> > > [Inline image 1]
> > >
> > > On Wed, Jan 14, 2015 at 1:58 AM, dormando <dorma...@rydia.net> wrote:
> > > Last update to the branch was 3 days ago. I'm not planning on doing
> > > any more work on it at the moment, so people have a chance to test it.
> > >
> > > thanks!
> > >
> > > On Tue, 13 Jan 2015, Zhiwei Chan wrote:
> > >
> > > > I compile directly from your branch on the test server; please tell
> > > > me if it needs updating and I will re-compile.
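[Editor's note: the weighted expire-time and value-length distributions quoted in the test report above are easy to reproduce with Python's random.choices. The sketch below is illustrative only, not the actual compare_test.py; make_set_args is a hypothetical helper name.]

```python
import random

# Distributions quoted in the test report above.
expire_time = [1, 5, 10, 30, 60, 300, 600, 3600, 86400, 0]
expire_time_weight = [1, 1, 2, 5, 8, 5, 6, 5, 3, 1]
value_len = [4, 10, 50, 100, 200, 500, 1000, 2000, 5000, 10000]
value_len_weight = [3, 4, 5, 8, 8, 10, 5, 5, 2, 1]

def make_set_args(key_id):
    """Build (key, value, exptime) for one fake 'set' command,
    drawing exptime and value length from the weighted distributions."""
    exptime = random.choices(expire_time, weights=expire_time_weight, k=1)[0]
    length = random.choices(value_len, weights=value_len_weight, k=1)[0]
    return ("k%07d" % key_id, "x" * length, exptime)

key, value, exptime = make_set_args(42)
```

A driver loop would then pick get vs. set at a 10:1 weight and, on a get miss, issue a set for that key immediately (uncounted), matching the script parameters described above.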
> > > > On Tue, Jan 13, 2015 at 4:20 AM, dormando <dorma...@rydia.net> wrote:
> > > > That sounds like an okay place to start. Can you please make sure the
> > > > other dev server is running the very latest version of the branch? A lot
> > > > changed since last Friday... a few pretty bad bugs.
> > > >
> > > > Please use the startup options described in the middle of the PR.
> > > >
> > > > If anyone's brave enough to try the latest branch on one production
> > > > instance (if they have a low-traffic one somewhere, maybe?) that'd be
> > > > good. I ran the branch under a load tester for a few hours, it passes
> > > > tests, etc. If I merge it, it'll just go into people's productions without
> > > > ever having a production test first, so hopefully someone can try it?
> > > >
> > > > thanks
> > > >
> > > > On Mon, 12 Jan 2015, Zhiwei Chan wrote:
> > > >
> > > > > I have run it since last Friday; so far no crashes. As I finished the
> > > > > haproxy work today, I will try a comparison test of this LRU work
> > > > > tomorrow, as follows: there are two servers (CentOS 5.8, 8 cores, 8 GB
> > > > > memory) in the dev environment. Both servers run 32 memcached instances
> > > > > (processes) with a maximum memory of 128 MB each. One server runs
> > > > > version 1.4.21, the other runs this branch. There are lots of "pools"
> > > > > using these memcached servers, and every pool uses two memcached
> > > > > instances on different servers. The pool clients use a consistent-hash
> > > > > algorithm to distribute keys across their two memcached instances. I
> > > > > will watch the hit rate and other performance metrics in Cacti.
> > > > > I think it will work, but usually there is not much traffic in our dev
> > > > > environment. Please tell me if you have any other advice.
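[Editor's note: the pool clients above spread keys over their two memcached instances with consistent hashing. Below is a minimal hash-ring sketch of that general technique; it is not the actual client library, and the server names are just borrowed from the test setup.]

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring: each server contributes many
    virtual points on a ring; a key maps to the next point clockwise."""
    def __init__(self, servers, vnodes=100):
        self.ring = []  # sorted list of (point, server)
        for server in servers:
            for i in range(vnodes):
                self.ring.append((self._hash("%s#%d" % (server, i)), server))
        self.ring.sort()
        self.points = [p for p, _ in self.ring]

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def get_server(self, key):
        # Find the first ring point at or past the key's hash, wrapping around.
        i = bisect.bisect(self.points, self._hash(key)) % len(self.ring)
        return self.ring[i][1]

ring = HashRing(["mc56:11811", "mc57:11811"])
server = ring.get_server("user:12345")  # deterministic for a given key
```

With virtual nodes, removing one server moves only that server's share of keys, which is why such clients keep hit rates stable across the pool.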
> > > > > 2015-01-08 4:21 GMT+08:00 dormando <dorma...@rydia.net>:
> > > > > Hey,
> > > > >
> > > > > To all three of you: Just run it anywhere you can (but not more than
> > > > > one machine, yet?), with the options prescribed in the PR. Ideally you
> > > > > have graphs of the hit ratio and maybe cache fullness and can compare
> > > > > before/after.
> > > > >
> > > > > And let me know if it hangs or crashes, obviously. If so a backtrace
> > > > > and/or coredump would be fantastic.
> > > > >
> > > > > On Thu, 8 Jan 2015, Zhiwei Chan wrote:
> > > > >
> > > > > > I will deploy it to one of our test environments on CentOS 5.8 for a
> > > > > > comparison test against 1.4.21, although the workload is not as heavy
> > > > > > as in the production environment. Tell me if there is anything I can
> > > > > > help with.
> > > > > >
> > > > > > 2015-01-07 23:30 GMT+08:00 Eric McConville <erichasem...@gmail.com>:
> > > > > > Same here. Do you want any findings posted to the mailing list, or
> > > > > > the PR thread?
> > > > > >
> > > > > > On Wed, Jan 7, 2015 at 5:56 AM, Ryan McCullagh <m...@ryanmccullagh.com> wrote:
> > > > > > I'm willing to help out in any way possible. What can I do?
> > > > > >
> > > > > > -----Original Message-----
> > > > > > From: memcached@googlegroups.com [mailto:memcached@googlegroups.com] On
> > > > > > Behalf Of dormando
> > > > > > Sent: Wednesday, January 7, 2015 3:52 AM
> > > > > > To: memcached@googlegroups.com
> > > > > > Subject: memory efficiency / LRU refactor branch
> > > > > >
> > > > > > Yo,
> > > > > >
> > > > > > https://github.com/memcached/memcached/pull/97
> > > > > >
> > > > > > Opening to a wider audience. I need some folks willing to poke at it
> > > > > > and see if their workloads fare better or worse with respect to hit
> > > > > > ratios.
> > > > > >
> > > > > > The rest of the work remaining on my end is more testing, and some
> > > > > > TODO's noted in the PR. The remaining work is relatively small aside
> > > > > > from the page mover idea. It hasn't been crashing or hanging in my
> > > > > > testing so far, but that might still happen.
> > > > > >
> > > > > > I can't/won't merge this until I get some evidence that it's useful.
> > > > > > Hoping someone out there can lend a hand. I don't know what the actual
> > > > > > impact would be, but for some workloads it could be large. Even for
> > > > > > folks who have set all items to never expire, it could still
> > > > > > potentially improve hit ratios by better protecting active items.
> > > > > >
> > > > > > It will work best if you at least have a mix of items with TTL's that
> > > > > > expire in reasonable amounts of time.
> > > > > >
> > > > > > thanks,
> > > > > > -Dormando
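[Editor's note: the before/after hit-ratio comparison requested above can be computed by diffing the cmd_get and get_hits counters across two snapshots of memcached's plain-text "stats" reply. A minimal sketch; the snapshot strings below are made-up sample data, not real server output.]

```python
def parse_stats(text):
    """Parse a plain-text 'stats' reply (lines of 'STAT <name> <value>')
    into a dict of integer counters."""
    out = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "STAT":
            try:
                out[parts[1]] = int(parts[2])
            except ValueError:
                pass  # skip non-numeric stats such as version strings
    return out

def hit_ratio(before, after):
    """Hit ratio over the interval between two stats snapshots."""
    gets = after["cmd_get"] - before["cmd_get"]
    hits = after["get_hits"] - before["get_hits"]
    return hits / gets if gets else 0.0

# Two hypothetical snapshots taken a few seconds apart.
before = parse_stats("STAT cmd_get 1000\nSTAT get_hits 400\nSTAT uptime 60")
after = parse_stats("STAT cmd_get 2000\nSTAT get_hits 900\nSTAT uptime 65")
ratio = hit_ratio(before, after)  # 500 hits over 1000 gets = 0.5
```

Polling this per interval and graphing the ratio for each server is essentially what the Cacti graphs and the test script's [60s]/[3600s] lines above report.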
--
---
You received this message because you are subscribed to the Google Groups "memcached" group.
To unsubscribe from this group and stop receiving emails from it, send an email to memcached+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.