listen_disabled_num doesn't seem to be a likely culprit...

stats
STAT pid 11435
STAT uptime 4457974
STAT time 1417457018
STAT version 1.4.5
STAT pointer_size 64
STAT rusage_user 19038.393825
STAT rusage_system 42581.905202
STAT curr_connections 264
STAT total_connections 1572308
STAT connection_structures 402
STAT cmd_get 658366591
STAT cmd_set 649621925
STAT cmd_flush 0
STAT get_hits 328785935
STAT get_misses 329580656
STAT delete_misses 20884653
STAT delete_hits 100083
STAT incr_misses 2779284
STAT incr_hits 44211787
STAT decr_misses 0
STAT decr_hits 0
STAT cas_misses 0
STAT cas_hits 0
STAT cas_badval 0
STAT auth_cmds 0
STAT auth_errors 0
STAT bytes_read 12821501027510
STAT bytes_written 3338632258667
STAT limit_maxbytes 4294967296
STAT accepting_conns 1
STAT listen_disabled_num 0
STAT threads 4
STAT conn_yields 441
STAT bytes 2786601635
STAT curr_items 5046673
STAT total_items 48200778
STAT evictions 0
STAT reclaimed 30123302
END
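For reference, the counters above can be reduced to a few derived figures. The following is a minimal Python sketch, not something from the thread's tooling: the helper names `parse_stats` and `summarize` are made up for illustration, and it assumes the `stats` text is pasted in as a string (one stat per line or run together). It only does arithmetic on the dump and does not talk to memcached.

```python
import re

# A subset of the counters from the dump above, pasted as plain text.
SAMPLE = (
    "STAT uptime 4457974 STAT cmd_get 658366591 STAT cmd_set 649621925 "
    "STAT get_hits 328785935 STAT get_misses 329580656 "
    "STAT listen_disabled_num 0 STAT evictions 0 END"
)


def parse_stats(text):
    """Turn 'STAT name value' pairs into a dict of strings (hypothetical helper)."""
    return dict(re.findall(r"STAT (\S+) (\S+)", text))


def summarize(stats):
    """Derive a few ratios and rates from the raw counters (hypothetical helper)."""
    hits = int(stats["get_hits"])
    misses = int(stats["get_misses"])
    uptime = int(stats["uptime"])
    return {
        # Fraction of gets that found a key.
        "hit_ratio": hits / (hits + misses),
        # Average request rates over the whole uptime, not peak rates.
        "gets_per_sec": int(stats["cmd_get"]) / uptime,
        "sets_per_sec": int(stats["cmd_set"]) / uptime,
        # Times the server stopped accepting because it hit its connection limit.
        "listen_disabled_num": int(stats["listen_disabled_num"]),
        "evictions": int(stats["evictions"]),
    }


summary = summarize(parse_stats(SAMPLE))
for name, value in summary.items():
    print(name, value)
```

On the dump above this gives a hit ratio just under 0.50, about 148 gets/sec and 146 sets/sec averaged over the uptime, and zero for both listen_disabled_num and evictions.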
The web servers are very lightly loaded and have approximately 20GB free memory all the time.

The utility showed nothing:

# time ./mc_conn_tester.pl
Averages: (conn: 0.00045183) (set: 0.00047043) (get: 0.00031982)

real    53m25.697s
user    0m17.721s
sys     0m11.093s

Even though we saw 14 failures during this time period. Will look more to see if this is a problem on our end.

On Sat, Nov 29, 2014 at 4:46 PM, dormando <dorma...@rydia.net> wrote:

> Hey,
>
> http://memcached.org/timeouts - sounds like you've already done some tcp
> dumping, so checking the stats as mentioned in here and running the test
> script a bit should illuminate things a bit.
>
> On Fri, 21 Nov 2014, kgo...@bepress.com wrote:
>
> > A couple months ago, we moved our memcached nodes from a dedicated VM to having one each on our four baremetal web servers (mod_perl).
> > Since we moved, we've been seeing 10-20 failures per hour across our entire environment, where $c->set returns false.
> >
> > I just spent some time with tcpdump and wireshark watching the memcached traffic over port 11211. The keys that are failing are *not* in the
> > tcpdump, so I'm thinking Cache::Memcached has lost a connection or got a non-functioning socket somehow?
> >
> > Does anything in this scenario give anybody any ideas of what might be going wrong?
> >
> > Each memcached node has about 250 connections at any given time and is handling up to 350 gets/sets per second. The load on these webservers is
> > around "1" (eight-core boxes). Their total network traffic is about 30 Mb/sec, and memcached traffic is about 3 Mb/sec. There's nothing in
> > memcached's logs.
> > This is debian 6 (squeeze).
> >
> > $ dpkg -l | grep memcached
> > ii  libcache-memcached-perl  1.29-1          Perl module for using memcached servers
> > ii  memcached                1.4.5-1+deb6u1  A high-performance memory object caching system
> >
> > --
> >
> > ---
> > You received this message because you are subscribed to the Google Groups "memcached" group.
> > To unsubscribe from this group and stop receiving emails from it, send an email to memcached+unsubscr...@googlegroups.com.
> > For more options, visit https://groups.google.com/d/optout.
> >

--
Joe Steffee
Linux Systems Administrator
bepress