Re: aprmemcache question
This helps a lot. I think 600 seconds seems like a fine idle-reap timeout. I need to investigate why some lookups take a second or more. Maybe there's a mutex contention on my end somewhere.

Thanks!
-Josh
Re: aprmemcache question
On Thu, Sep 27, 2012 at 1:55 PM, Joshua Marantz wrote:
> That one call-site is HTTP_24/src/modules/cache/mod_socache_memcache.c,
> right? That was where I stole my args from.

no, subversion

> As the TCP/IP layer is a lower level abstraction than the apr_memcache
> interface, I'm still not clear on exactly what that means. Does a value of
> 600 mean that a single multiget must complete in 600 microseconds otherwise
> it fails with APR_TIMEUP?

ttl only affects connections which are not currently used; it does not
control I/O timeouts

> That might explain the behavior I saw.
>
> I've now jacked that up by x1e6 to 600 seconds and I don't see timeouts,
> but I'm hoping someone can bridge the gap between the socket-level
> explanation and the apr_memcache API call.
>
> I was assuming that apr_memcache created the TCP/IP connection when I called
> apr_memcache_server_create, and there even 600 seconds seems too short. Is
> the functionality more like it will create connections on-demand and leave
> them running for N microseconds, re-using the connection for multiple
> requests until TTL microseconds have elapsed since creation?

create on demand
reuse existing idle connections when possible
when performing maintenance on the idle connections, clean up any
which were idle for N microseconds

If a connection is always reused before it is idle for N microseconds,
it will live as long as memcached allows.

> If that's the case then I guess that every 10 minutes one of my cache
> lookups may have high latency to re-establish the connection, is that right?
> I've been histogramming this under load and seeing some long tail requests
> with very high latency. My median latency is only 143us which is great.
> My 90%, 95% and 99% are all around 5ms, which is fine as well. But I've got
> a fairly significant number of long-tail lookups that take hundreds of ms or
> even seconds to finish, and one crazy theory is that this is all reconnect
> cost.
>
> It would be nice if the TTL were interpreted as a maximum idle time before
> the connection is reaped, rather than stuttering response-time on a very
> active channel.

It is. The ttl is interpreted by the reslist layer, which won't touch
objects until they're returned to the list.

> This testing is all using a single memcached running on localhost.
>
> -Josh

--
Born in Roswell... married an alien...
http://emptyhammock.com/
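The idle-reap behavior described above can be modeled with a toy pool. This is a simplified sketch in C, not the actual apr_reslist code: the struct, the function names and the exact comparison are assumptions made purely for illustration. The point it tries to show is that the clock only runs while a connection sits idle in the list, so a connection that keeps being reused is never reaped.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of one pooled connection (hypothetical, not an APR type). */
typedef struct {
    bool    open;        /* socket currently connected */
    bool    in_use;      /* checked out by a caller */
    int64_t released_at; /* time (usec) it was last returned to the pool */
} toy_conn;

/* Check a connection out, connecting on demand if it is closed.
 * Returns true when a fresh connect was needed. */
bool toy_acquire(toy_conn *c)
{
    bool reconnected = !c->open;
    c->open = true;
    c->in_use = true;
    return reconnected;
}

/* Return a connection to the pool and start its idle clock. */
void toy_release(toy_conn *c, int64_t now)
{
    c->in_use = false;
    c->released_at = now;
}

/* Maintenance pass: close connections that have been idle for at least
 * ttl microseconds. Connections that are in use are never touched.
 * Returns the number of connections reaped. */
size_t toy_reap_idle(toy_conn *conns, size_t n, int64_t now, int64_t ttl)
{
    size_t reaped = 0;
    for (size_t i = 0; i < n; i++) {
        toy_conn *c = &conns[i];
        if (c->open && !c->in_use && now - c->released_at >= ttl) {
            c->open = false;
            reaped++;
        }
    }
    return reaped;
}
```

In this model a reconnect cost is only paid by the first toy_acquire after a connection has sat unused for the full ttl, which matches the "maximum idle time" reading discussed in the thread.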
Re: aprmemcache question
That one call-site is HTTP_24/src/modules/cache/mod_socache_memcache.c, right? That was where I stole my args from.

As the TCP/IP layer is a lower level abstraction than the apr_memcache interface, I'm still not clear on exactly what that means. Does a value of 600 mean that a single multiget must complete in 600 microseconds otherwise it fails with APR_TIMEUP? That might explain the behavior I saw.

I've now jacked that up by x1e6 to 600 seconds and I don't see timeouts, but I'm hoping someone can bridge the gap between the socket-level explanation and the apr_memcache API call.

I was assuming that apr_memcache created the TCP/IP connection when I called apr_memcache_server_create, and there even 600 seconds seems too short. Is the functionality more like it will create connections on-demand and leave them running for N microseconds, re-using the connection for multiple requests until TTL microseconds have elapsed since creation?

If that's the case then I guess that every 10 minutes one of my cache lookups may have high latency to re-establish the connection, is that right? I've been histogramming this under load and seeing some long tail requests with very high latency. My median latency is only 143us which is great. My 90%, 95% and 99% are all around 5ms, which is fine as well. But I've got a fairly significant number of long-tail lookups that take hundreds of ms or even seconds to finish, and one crazy theory is that this is all reconnect cost.

It would be nice if the TTL were interpreted as a maximum idle time before the connection is reaped, rather than stuttering response-time on a very active channel.

This testing is all using a single memcached running on localhost.

-Josh
Re: aprmemcache question
On Thu, Sep 27, 2012 at 11:15 AM, Joshua Marantz wrote:
> On Thu, Sep 27, 2012 at 10:58 AM, Ben Noordhuis wrote:
>> If dlsym() is called with the special handle NULL, it is interpreted as a
>> reference to the executable or shared object from which the call is being
>> made. Thus a shared object can reference its own symbols.
>>
>> And that's how it works on Linux, Solaris, NetBSD and probably OpenBSD as
>> well.
>
> Cool, thanks.
>
>>> Do you have a feel for the exact meaning of that TTL parameter to
>>> apr_memcache_server_create?
>>
>> You mean what units it uses? Microseconds (at least, in 2.4).
>
> Actually what I meant was what that value is used for in the library. The
> phrase "time to live of client connection" confuses me. Does it really mean
> "the maximum number of seconds apr_memcache is willing to wait for a single
> operation"? Or does it mean *both*, implying that a fresh TCP/IP connection
> is made for every new operation, but will stay alive for only a certain
> number of seconds?

TCP/IP connections, once created, will be retained for the specified
(ttl) number of seconds. They'll be created when needed.

The socket connect timeout is hard-coded to 1 second, and there's no
timeout for I/O.

> It is a little disturbing from a module-developer perspective to have the
> meaning of that parameter change by a factor of 1M between versions. Would
> it be better to revert the recent change and instead change the doc to match
> the current behavior?

The doc was already changed to match the behavior, but I missed that.
The caller I know of used the wrong unit, and I'll submit a patch to
fix that in the caller, as well as revert my screw-up from yesterday.

> -Josh

--
Born in Roswell... married an alien...
http://emptyhammock.com/
Re: aprmemcache question
On Thu, Sep 27, 2012 at 10:58 AM, Ben Noordhuis wrote:
> On Thu, Sep 27, 2012 at 4:29 PM, Joshua Marantz wrote:
>> Thanks Ben,
>>
>> That might be an interesting hack to try, although I wonder whether some of
>> our friends running mod_pagespeed on FreeBSD might run into trouble with
>> it. I did confirm that my prefork build has APR built with
>> APR_HAS_THREADS, which for some reason I had earlier thought was not the
>> case.
>
> It should work, provided you linked against libapr. The FreeBSD man
> page says this:
>
>   If dlsym() is called with the special handle NULL, it is interpreted as a
>   reference to the executable or shared object from which the call is being
>   made. Thus a shared object can reference its own symbols.
>
> And that's how it works on Linux, Solaris, NetBSD and probably OpenBSD as
> well.
>
>> Do you have a feel for the exact meaning of that TTL parameter to
>> apr_memcache_server_create?
>
> You mean what units it uses? Microseconds (at least, in 2.4).

Right. I screwed up on changing that yesterday. The APR doc was already
fixed long ago to indicate it was microseconds instead of seconds, the
Subversion code hasn't been fixed to respect that, and the bug that was
opened to fix the code to use seconds put me in the wrong frame of
mind :(

What does ttl mean for this particular API?

All resources in the resource list are cleaned up when the memcache
server is deleted/pool is cleared/destroyed.

Individual resources are returned to the list at the end of individual
memcache operations.

When a resource is returned to the list, "old" resources are
destroyed, where "old" is determined by the ttl. Destroying a
memcache resource means it sends the "quit" message to memcached and
closes the socket.

So ttl sets a limit on how long a particular connection to memcached
can be used.

--
Born in Roswell... married an alien...
http://emptyhammock.com/
Re: aprmemcache question
On Thu, Sep 27, 2012 at 10:58 AM, Ben Noordhuis wrote:
> If dlsym() is called with the special handle NULL, it is interpreted as a
> reference to the executable or shared object from which the call is being
> made. Thus a shared object can reference its own symbols.
>
> And that's how it works on Linux, Solaris, NetBSD and probably OpenBSD as
> well.

Cool, thanks.

>> Do you have a feel for the exact meaning of that TTL parameter to
>> apr_memcache_server_create?
>
> You mean what units it uses? Microseconds (at least, in 2.4).

Actually what I meant was what that value is used for in the library. The phrase "time to live of client connection" confuses me. Does it really mean "the maximum number of seconds apr_memcache is willing to wait for a single operation"? Or does it mean *both*, implying that a fresh TCP/IP connection is made for every new operation, but will stay alive for only a certain number of seconds?

It is a little disturbing from a module-developer perspective to have the meaning of that parameter change by a factor of 1M between versions. Would it be better to revert the recent change and instead change the doc to match the current behavior?

-Josh
Re: aprmemcache question
On Thu, Sep 27, 2012 at 4:29 PM, Joshua Marantz wrote:
> Thanks Ben,
>
> That might be an interesting hack to try, although I wonder whether some of
> our friends running mod_pagespeed on FreeBSD might run into trouble with
> it. I did confirm that my prefork build has APR built with
> APR_HAS_THREADS, which for some reason I had earlier thought was not the
> case.

It should work, provided you linked against libapr. The FreeBSD man
page says this:

  If dlsym() is called with the special handle NULL, it is interpreted as a
  reference to the executable or shared object from which the call is being
  made. Thus a shared object can reference its own symbols.

And that's how it works on Linux, Solaris, NetBSD and probably OpenBSD as
well.

> Do you have a feel for the exact meaning of that TTL parameter to
> apr_memcache_server_create?

You mean what units it uses? Microseconds (at least, in 2.4).
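Since the ttl argument is in microseconds, a caller thinking in seconds has to scale by one million. A minimal sketch in C, assuming a standalone helper named ttl_usec_from_sec (hypothetical, written only for this illustration; APR itself provides the apr_time_from_sec macro in apr_time.h for the same conversion):

```c
#include <assert.h>
#include <stdint.h>

#define USEC_PER_SEC INT64_C(1000000)

/* Hypothetical helper: convert a ttl given in whole seconds to the
 * microsecond value that, per this thread, apr_memcache_server_create
 * expects. */
int64_t ttl_usec_from_sec(int64_t seconds)
{
    assert(seconds >= 0);                        /* a negative ttl makes no sense */
    assert(seconds <= INT64_MAX / USEC_PER_SEC); /* guard against overflow */
    return seconds * USEC_PER_SEC;
}
```

With this, a 600-second idle timeout becomes ttl_usec_from_sec(600), i.e. 600000000, whereas passing a bare 600 would mean 600 microseconds.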
Re: aprmemcache question
Thanks Ben,

That might be an interesting hack to try, although I wonder whether some of our friends running mod_pagespeed on FreeBSD might run into trouble with it. I did confirm that my prefork build has APR built with APR_HAS_THREADS, which for some reason I had earlier thought was not the case.

Do you have a feel for the exact meaning of that TTL parameter to apr_memcache_server_create?

-Josh
Re: aprmemcache question
On Thu, Sep 27, 2012 at 4:05 AM, Joshua Marantz wrote:
> RE "failing the build of my module" -- the dominant usage is via
> precompiled binaries we supply. Is there an apr query for determining
> whether apr was compiled with threads I could do on startup?

I don't think there's an official way but you know apr was compiled
with APR_HAS_THREADS when dlsym(NULL, "apr_os_thread_current") !=
NULL.

Using dlsym() like that is not quite compatible with POSIX but it
works on all the major Unices.
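The dlsym() check described above can be sketched in C roughly as follows. This is only an illustration: the helper names has_symbol and apr_built_with_threads are made up here, and passing NULL as the handle relies on the platform behavior mentioned in the thread rather than on anything POSIX guarantees. On older glibc versions the program may also need to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper (not part of APR): returns 1 if `name` resolves
 * via dlsym() with a NULL handle, 0 otherwise. The NULL handle is the
 * non-POSIX trick suggested in this thread. */
int has_symbol(const char *name)
{
    return dlsym(NULL, name) != NULL;
}

/* Assumption taken from the thread: apr_os_thread_current is only
 * present when APR was compiled with APR_HAS_THREADS, so finding the
 * symbol implies a threaded APR build is loaded in this process. */
int apr_built_with_threads(void)
{
    return has_symbol("apr_os_thread_current");
}
```

A module could call apr_built_with_threads() once at startup and fall back to a non-threaded code path when it returns 0.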