This helps a lot. I think 600 seconds seems like a fine idle-reap timeout.
I need to investigate why some lookups take a second or more. Maybe there's
mutex contention on my end somewhere.

Thanks!
-Josh

On Thu, Sep 27, 2012 at 2:08 PM, Jeff Trawick <traw...@gmail.com> wrote:

> On Thu, Sep 27, 2012 at 1:55 PM, Joshua Marantz <jmara...@google.com> wrote:
> > That one call-site is HTTP_24/src/modules/cache/mod_socache_memcache.c,
> > right? That was where I stole my args from.
>
> no, subversion
>
> > As the TCP/IP layer is a lower level abstraction than the apr_memcache
> > interface, I'm still not clear on exactly what that means. Does a value of
> > 600 mean that a single multiget must complete in 600 microseconds otherwise
> > it fails with APR_TIMEUP?
>
> ttl only affects connections which are not currently used; it does not
> control I/O timeouts
>
> > That might explain the behavior I saw.
> >
> > I've now jacked that up by x1e6 to 600 seconds and I don't see timeouts,
> > but I'm hoping someone can bridge the gap between the socket-level
> > explanation and the apr_memcache API call.
> >
> > I was assuming that apr_memcache created the TCP/IP connection when I called
> > apr_memcache_server_create, and there even 600 seconds seems too short. Is
> > the functionality more like it will create connections on-demand and leave
> > them running for N microseconds, re-using the connection for multiple
> > requests until TTL microseconds have elapsed since creation?
>
> create on demand
> reuse existing idle connections when possible
> when performing maintenance on the idle connections, clean up any
> which were idle for N microseconds
>
> If a connection is always reused before it is idle for N microseconds,
> it will live as long as memcached allows.
>
> > If that's the case then I guess that every 10 minutes one of my cache
> > lookups may have high latency to re-establish the connection, is that right?
> > I've been histogramming this under load and seeing some long tail requests
> > with very high latency. My median latency is only 143us which is great.
> > My 90%, 95% and 99% are all around 5ms, which is fine as well. But I've got
> > a fairly significant number of long-tail lookups that take hundreds of ms or
> > even seconds to finish, and one crazy theory is that this is all reconnect
> > cost.
> >
> > It would be nice if the TTL were interpreted as a maximum idle time before
> > the connection is reaped, rather than stuttering response-time on a very
> > active channel.
>
> It is. The ttl is interpreted by the reslist layer, which won't touch
> objects until they're returned to the list.
>
> > This testing is all using a single memcached running on localhost.
> >
> > -Josh
> >
> > On Thu, Sep 27, 2012 at 11:24 AM, Jeff Trawick <traw...@gmail.com> wrote:
> >>
> >> On Thu, Sep 27, 2012 at 11:15 AM, Joshua Marantz <jmara...@google.com>
> >> wrote:
> >> > On Thu, Sep 27, 2012 at 10:58 AM, Ben Noordhuis <i...@bnoordhuis.nl>
> >> > wrote:
> >> >>
> >> >> If dlsym() is called with the special handle NULL, it is interpreted as a
> >> >> reference to the executable or shared object from which the call is being
> >> >> made. Thus a shared object can reference its own symbols.
> >> >>
> >> >> And that's how it works on Linux, Solaris, NetBSD and probably OpenBSD as
> >> >> well.
> >> >
> >> > Cool, thanks.
> >> >>
> >> >> > Do you have a feel for the exact meaning of that TTL parameter to
> >> >> > apr_memcache_server_create?
> >> >>
> >> >> You mean what units it uses? Microseconds (at least, in 2.4).
> >> >
> >> > Actually what I meant was what that value is used for in the library. The
> >> > phrase "time to live of client connection" confuses me. Does it really mean
> >> > "the maximum number of seconds apr_memcache is willing to wait for a single
> >> > operation"? Or does it mean *both*, implying that a fresh TCP/IP connection
> >> > is made for every new operation, but will stay alive for only a certain
> >> > number of seconds?
> >>
> >> TCP/IP connections, once created, will be retained for the specified
> >> (ttl) number of seconds. They'll be created when needed.
> >>
> >> The socket connect timeout is hard-coded to 1 second, and there's no
> >> timeout for I/O.
> >>
> >> > It is a little disturbing from a module-developer perspective to have the
> >> > meaning of that parameter change by a factor of 1M between versions. Would
> >> > it be better to revert the recent change and instead change the doc to match
> >> > the current behavior?
> >>
> >> The doc was already changed to match the behavior, but I missed that.
> >> The caller I know of used the wrong unit, and I'll submit a patch to
> >> fix that in the caller, as well as revert my screw-up from yesterday.
> >>
> >> > -Josh
> >>
> >> --
> >> Born in Roswell... married an alien...
> >> http://emptyhammock.com/
>
> --
> Born in Roswell... married an alien...
> http://emptyhammock.com/
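
For anyone reading this thread later: the idle-reap behaviour Jeff describes
(create on demand, reuse idle connections when possible, clean up any that
sat idle for the ttl) can be illustrated with a toy model. This is only a
sketch and is not apr_reslist or apr_memcache code. Every name in it
(conn_pool, pool_acquire, pool_release, POOL_MAX) is hypothetical, and time
is a simulated microsecond counter supplied by the caller rather than a real
clock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_MAX 4  /* hypothetical cap on idle connections kept around */

/* An idle "connection": just an id plus the time it was handed back. */
typedef struct {
    int     id;
    int64_t idle_since;  /* simulated time in microseconds */
} idle_conn;

/* Toy pool; assumption: idle age is measured from the last release. */
typedef struct {
    idle_conn idle[POOL_MAX];
    size_t    nidle;
    int64_t   ttl_usec;  /* maximum idle time before a connection is reaped */
    int       next_id;   /* id given to the next newly created connection */
    int       created;   /* number of times a connect cost was paid */
} conn_pool;

static void pool_init(conn_pool *p, int64_t ttl_usec)
{
    assert(ttl_usec > 0);
    p->nidle = 0;
    p->ttl_usec = ttl_usec;
    p->next_id = 1;
    p->created = 0;
}

/* Reuse a fresh idle connection if there is one; drop stale ones;
 * otherwise create a new connection on demand. */
static int pool_acquire(conn_pool *p, int64_t now)
{
    while (p->nidle > 0) {
        idle_conn c = p->idle[--p->nidle];
        if (now - c.idle_since < p->ttl_usec) {
            return c.id;  /* idle for less than the ttl: reuse it */
        }
        /* idle for the ttl or longer: reaped, keep looking */
    }
    p->created++;
    return p->next_id++;
}

/* Hand a connection back; its idle clock starts now. */
static void pool_release(conn_pool *p, int id, int64_t now)
{
    if (p->nidle < POOL_MAX) {
        p->idle[p->nidle].id = id;
        p->idle[p->nidle].idle_since = now;
        p->nidle++;
    }
    /* a full idle list simply discards the connection in this sketch */
}
```

With a ttl of 600 seconds (600 * 1000000 microseconds) in this model, a
connection that is released and reacquired within the ttl keeps being reused
no matter how long ago it was created, which matches "it will live as long
as memcached allows". Only a connection left idle for the full ttl is
dropped, and the next acquire pays for a fresh connect.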