No command can take up much time. If all other commands hang, it's either a long-running stats command like the ones I listed before, or a hang bug (though I don't know why it would recover on its own). We've fixed a lot of those since 1.4.13, so I'd still advocate upgrading at least some instances to see if they become immune to it.
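The quoted message below suggests watching command counters to make sure nobody is running flushes. A minimal sketch of that check, assuming the standard text protocol reply to `stats` (lines of `STAT <name> <value>` ending with `END`) and the `cmd_flush` counter; the helper names `parse_stats`, `fetch_stats` and `flushes_between`, and the default host and port, are made up for illustration and are not part of memcached:

```python
import socket


def parse_stats(raw: str) -> dict:
    """Parse a text-protocol `stats` reply into a name -> value dict."""
    stats = {}
    for line in raw.splitlines():
        parts = line.split(" ", 2)
        # Only "STAT <name> <value>" lines carry data; "END" is skipped.
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2].strip()
    return stats


def fetch_stats(host: str = "127.0.0.1", port: int = 11211,
                timeout: float = 2.0) -> dict:
    """Send `stats` to one instance (host/port are assumptions)."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"stats\r\n")
        buf = b""
        while not buf.endswith(b"END\r\n"):
            chunk = conn.recv(4096)
            if not chunk:  # connection closed before END arrived
                break
            buf += chunk
    return parse_stats(buf.decode("ascii", errors="replace"))


def flushes_between(before: dict, after: dict) -> int:
    """Number of flush commands issued between two stats samples."""
    return int(after.get("cmd_flush", 0)) - int(before.get("cmd_flush", 0))
```

Taking one sample per minute (as the per-minute collection described in the thread already does) and calling `flushes_between` on consecutive samples would show whether a flush lines up with a CPU spike. It says nothing about "stats sizes" or "stats cachedump", which, as noted in the thread, probably have no counters.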
On Thu, 7 Aug 2014, Claudio Santana wrote:

> I think this issue has something to do with our access pattern (although we run very limited commands and not very high traffic either).
>
> We always start having issues on the same instance (I guess because of the system accessing a specific key). When we notice the issue we bounce the instance within 15/20 mins; I don't know if you think this is not enough time to recover.
>
> Sometimes the issue "moves" to other instances in other servers (our client doesn't rebalance so the system is trying to access completely different keys). On the other servers sometimes the issue goes away on its own or the spike is not at 100pct.
>
> On Aug 7, 2014 6:36 PM, "dormando" <dorma...@rydia.net> wrote:
> Those three stats commands aren't problematic. The others I listed are. Sadly there aren't stats counters for them, I think... Are you sure it's not completely crashing after the CPU spike? It actually recovers on its own?
>
> On Thu, 7 Aug 2014, Claudio Santana wrote:
> >
> > I run every minute stats, stats items and stats slabs.
> >
> > The only commands executed are remove, incr, add, get, set and cas.
> >
> > I'm running now with 6 threads per instance with 3 per server and haven't had the issue again, not that this change fixed it.
> >
> > I'll definitely update.
> >
> > On Aug 7, 2014 6:13 PM, "dormando" <dorma...@rydia.net> wrote:
> > Please upgrade. If you have problems with the latest version we can look into it more.
> >
> > You can also look at command counters for odd commands being given: make sure nobody's running flushes, or "stats sizes", or "stats cachedump" since those can cause CPU spikes and hangs.
> >
> > With 1.4.20 you can use "stats conns" to see what the connections are doing during the cpu spike.
> >
> > On Thu, 7 Aug 2014, Claudio Santana wrote:
> >
> > > Forgot to say I'm running version 1.4.13 libevent 2.0.16-stable
> > >
> > > On Thu, Aug 7, 2014 at 6:08 PM, Claudio Santana <claudio.sant...@gmail.com> wrote:
> > > Sorry for the late response.
> > >
> > > My CPU utilization normally is min 2.5% to 6.5% max.
> > >
> > > So it's interesting you ask this. The reason why I submitted the 1st question is because I've experienced some random CPU utilization spikes. From about 6% CPU utilization it all of a sudden spikes to 100% and I can see the offending process is one of the Memcached instances. Sadly this CPU spike is accompanied by all requests timing out, causing the whole system to become unusable.
> > >
> > > I collect minute by minute stats of all these memcached instances and according to my stats this issue happens within 2 minutes. I can see in the number of commands there's no increase in number of commands being issued right before the CPU spike nor increase in the number of bytes in/out.
> > >
> > > Does anybody have any ideas of what could be going on?
> > >
> > > I have all Memcached stats collected by minute in Graphite, I can provide other stats that could help explain this issue if necessary.
> > >
> > > On Mon, Aug 4, 2014 at 9:36 PM, dormando <dorma...@rydia.net> wrote:
> > > You could run one instance with one thread and serve all of that just fine. Have you actually looked at graphs of the CPU usage of the host? memcached should be practically idle with load that low.
> > >
> > > One with -t 6 or -t 8 would do it just fine.
> > >
> > > On Mon, 4 Aug 2014, Claudio Santana wrote:
> > >
> > > > Dormando, thanks for the quick response. Sorry for the confusion, I don't have exact metrics per second but per minute 1.12 million sets and 1.8 million gets which translates to 18,666 sets per second and 30,000 gets per second.
> > > >
> > > > These stats are per Memcached instance which I currently run 3 on each server.
> > > >
> > > > Claudio.
> > > >
> > > > On Mon, Aug 4, 2014 at 6:22 PM, dormando <dorma...@rydia.net> wrote:
> > > > On Mon, 4 Aug 2014, Claudio Santana wrote:
> > > >
> > > > > I have this Memcached cluster where 3 instances of Memcached run in a single server. These servers have 24 cores, each instance is configured to have 8 threads. Each individual instance serves about 5000G gets/sets a day and about 3k current connections.
> > > >
> > > > I don't know what "5000G gets/sets a day" translates to in per-second (nor what the G-unit even is?), can you define this?
> > > >
> > > > > What would be better? Consolidate these 3 instances to a single instance per server with 24 threads? I've read in a few articles that Memcached's performance starts suffering with more than 4-6 threads per instance, is this generally true?
> > > > >
> > > > > How about keeping the 3 instances per server and decreasing the number of threads to say 4 or 6? Or creating 4 instances in the same servers instead of 3 and decreasing the number of threads per instance to 6 so there is one thread per core.
> > > > >
> > > > > Is there a guide you could recommend to configure the right number of threads and strategies to get the most out of a Memcached server/instance?
> > > > >
> > > > > Thanks,
> > > > > Claudio
> > > > >
> > > > > --
> > > > >
> > > > > ---
> > > > > You received this message because you are subscribed to the Google Groups "memcached" group.
> > > > > To unsubscribe from this group and stop receiving emails from it, send an email to memcached+unsubscr...@googlegroups.com.
> > > > > For more options, visit https://groups.google.com/d/optout.