On Tue, Mar 12, 2013 at 1:56 PM, Jeff Trawick <traw...@gmail.com> wrote:
> On Mon, Mar 11, 2013 at 3:50 PM, Joshua Marantz <jmara...@google.com> wrote:
>>
>> ping!
>>
>> Please don't hesitate to push back and tell me if I can supply the patch
>> or update in some easier-to-digest form.  In particular, while I have
>> rigorously stress-tested this change using mod_pagespeed's unit test,
>> system-test, and load-test framework, I don't really understand what the
>> testing flow is for APR.  I'd be happy to add unit-tests for that if
>> someone points me to a change-list or patch-file that does it properly.
>>
>> -Josh
>
>
> I'll try hard to work on this in the next couple of days.  It would be
> great to have fixes in APR-Util 1.5.x, which we hope to work on later
> this week.
Attached is a first pass at getting your patch to apr trunk.  Can you
confirm that I didn't break your tests?  (It passes the apr tests, but that
may not mean much.)

>
>>
>> On Thu, Nov 1, 2012 at 8:04 AM, Joshua Marantz <jmara...@google.com> wrote:
>>
>> > I have completed a solution to this problem, which can be a drop-in
>> > update for the existing apr_memcache.c.  It is now checked in for my
>> > module as
>> > http://code.google.com/p/modpagespeed/source/browse/trunk/src/third_party/aprutil/apr_memcache2.c
>> > .
>> >
>> > It differs from the solution in
>> > https://issues.apache.org/bugzilla/show_bug.cgi?id=51065 in that:
>> >
>> >    - It doesn't require an API change, but it enforces the 50ms
>> >      timeout that already exists for apr_multgetp for all operations.
>> >    - It works under my load test (which I found is not true of the
>> >      patch in 51065).
>> >
>> > For my own purposes, I will be shipping my module with apr_memcache2
>> > so I get the behavior I want regardless of what version of Apache is
>> > installed.  But I'd like to propose my patch for apr_memcache.c.  The
>> > patch is attached, and I've also submitted it as an alternative patch
>> > to bug 51065.
>> >
>> > If you agree with the strategy I used to solve this problem, then
>> > please let me know if I can help with any changes required to get this
>> > into the main distribution.
>> >
>> >
>> > On Mon, Oct 22, 2012 at 5:21 PM, Joshua Marantz <jmara...@google.com> wrote:
>> >
>> >> I've had some preliminary success with my own variant of
>> >> apr_memcache.c (creatively called apr_memcache2.c).  Rather than
>> >> setting the socket timeout, I've been mimicking the timeout strategy
>> >> I saw in apr_memcache_multgetp, by adding a new helper method:
>> >>
>> >> static apr_status_t wait_for_server_or_timeout(apr_pool_t* temp_pool,
>> >>                                                apr_memcache2_conn_t* conn) {
>> >>   apr_pollset_t* pollset;
>> >>   apr_status_t rv = apr_pollset_create(&pollset, 1, temp_pool, 0);
>> >>   if (rv == APR_SUCCESS) {
>> >>     apr_pollfd_t pollfd;
>> >>     pollfd.desc_type = APR_POLL_SOCKET;
>> >>     pollfd.reqevents = APR_POLLIN;
>> >>     pollfd.p = temp_pool;
>> >>     pollfd.desc.s = conn->sock;
>> >>     pollfd.client_data = NULL;
>> >>     apr_pollset_add(pollset, &pollfd);
>> >>     apr_int32_t queries_recvd;
>> >>     const apr_pollfd_t* activefds;
>> >>     rv = apr_pollset_poll(pollset, MULT_GET_TIMEOUT, &queries_recvd,
>> >>                           &activefds);
>> >>     if (rv == APR_SUCCESS) {
>> >>       assert(queries_recvd == 1);
>> >>       assert(activefds->desc.s == conn->sock);
>> >>       assert(activefds->client_data == NULL);
>> >>     }
>> >>   }
>> >>   return rv;
>> >> }
>> >>
>> >> And calling that before many of the existing calls to get_server_line as:
>> >>
>> >>     rv = wait_for_server_or_timeout_no_pool(conn);
>> >>     if (rv != APR_SUCCESS) {
>> >>       ms_release_conn(ms, conn);
>> >>       return rv;
>> >>     }
>> >>
>> >> This is just an experiment; I think I can streamline this by
>> >> pre-populating the pollfd structure as part of the apr_memcache_conn_t
>> >> (actually now apr_memcache2_conn_t).
>> >>
>> >> I have two questions about this:
>> >>
>> >> 1. I noticed the version of apr_memcache.c that ships with Apache 2.4
>> >> is somewhat different from the one that ships with Apache 2.2.  In
>> >> particular, the 2.4 version cannot be compiled against the headers
>> >> that come with a 2.2 distribution.  Is there any downside to taking my
>> >> hacked 2.2 apr_memcache.c and running it in Apache 2.4?  Or should I
>> >> maintain two hacks?
>> >>
>> >> 2. This seems wasteful in terms of system calls.  I am making an extra
>> >> call to poll, rather than relying on the socket timeout.  The socket
>> >> timeout didn't work as well as this, though.  Does anyone have any
>> >> theories as to why, or what could be done to make the patch in
>> >> https://issues.apache.org/bugzilla/show_bug.cgi?id=51065 work?
>> >>
>> >> -Josh
>> >>
>> >>
>> >> On Fri, Oct 19, 2012 at 9:25 AM, Joshua Marantz <jmara...@google.com> wrote:
>> >>
>> >>> Following up: I tried doing what I suggested above: patching that
>> >>> change into my own copy of apr_memcache.c.  It was, first of all, a
>> >>> bad idea to pull in only part of apr_memcache.c, because that file
>> >>> changed slightly between 2.2 and 2.4 and our module works in both.
>> >>>
>> >>> I was successful making my own version of apr_memcache (renaming
>> >>> entry-points apr_memcache2*) that I could hack.  But if I changed the
>> >>> socket timeout from -1 to 10 seconds, then the system behaved very
>> >>> poorly under load test (though it worked fine in our unit-tests and
>> >>> system-tests).  In other words, I think the proposed patch that Jeff
>> >>> pointed to above is not really working (as he predicted).  This test
>> >>> was done without SIGSTOPping the memcached; it would time out under
>> >>> our load anyway and thereafter behave poorly.
>> >>>
>> >>> I'm going to follow up on that bugzilla entry, but for now I'm going
>> >>> to pursue my own complicated mechanism of timing out the calls from
>> >>> my side.
>> >>>
>> >>> -Josh
>> >>>
>> >>>
>> >>> On Thu, Oct 18, 2012 at 10:46 AM, Joshua Marantz <jmara...@google.com> wrote:
>> >>>
>> >>>> Thanks Jeff, that is very helpful.  We are considering a course of
>> >>>> action, and before doing any work toward this, I'd like to understand
>> >>>> the pitfalls from people that understand Apache better than us.
>> >>>>
>> >>>> Here's our reality: we believe we need to incorporate memcached for
>> >>>> mod_pagespeed <http://modpagespeed.com> to scale effectively for very
>> >>>> large sites & hosting providers.  We are fairly close (we think) to
>> >>>> releasing this functionality as beta.  However, in such large sites,
>> >>>> stuff goes wrong: machines crash, power failure, fiber cut, etc.
>> >>>> When it does, we want to fall back to serving partially unoptimized
>> >>>> sites rather than hanging the servers.
>> >>>>
>> >>>> I understand the realities of backward-compatible APIs.  My
>> >>>> expectation is that this would take years to make it into an APR
>> >>>> distribution we could depend on.  We want to deploy this
>> >>>> functionality in weeks.  The workarounds we have tried (backgrounding
>> >>>> the apr_memcache calls in a thread and timing out in mainline) are
>> >>>> complex and, even once they work 100%, will be very unsatisfactory
>> >>>> (resource leaks; Apache refusing to exit cleanly on 'apachectl stop')
>> >>>> if this happens more than (say) once a month.
>> >>>>
>> >>>> Our plan is to copy the patched implementation of
>> >>>> apr_memcache_server_connect and the static methods it calls into a
>> >>>> new .c file we will link into our module, naming the new entry-point
>> >>>> something else (apr_memcache_server_connect_with_timeout seems good).
>> >>>> From a CS/SE perspective this is offensive, and we admit it, but
>> >>>> from a product-quality perspective we believe this beats freezes and
>> >>>> complicated/imperfect workarounds with threads.
>> >>>>
>> >>>> So I have two questions for the Apache community:
>> >>>>
>> >>>>    1. What are the practical problems with this approach?  Note that
>> >>>>    in any case a new APR rev would require editing/ifdefing our code
>> >>>>    anyway, so I think immunity from APR updates such as this patch
>> >>>>    being applied is not a distinguishing drawback.
>> >>>>    2. Is there an example of the correct solution to the technical
>> >>>>    problem Jeff highlighted: "it is apparently missing a call to
>> >>>>    adjust the socket timeout and to discard the connection if the
>> >>>>    timeout is reached"?  That sounds like a pattern that might be
>> >>>>    found elsewhere in the Apache HTTPD code base.
>> >>>>
>> >>>> Thanks in advance for your help!
>> >>>> -Josh
>> >>>>
>> >>>>
>> >>>> On Wed, Oct 17, 2012 at 8:16 PM, Jeff Trawick <traw...@gmail.com> wrote:
>> >>>>
>> >>>>> On Wed, Oct 17, 2012 at 3:36 PM, Joshua Marantz <jmara...@google.com>
>> >>>>> wrote:
>> >>>>> > Is there a mechanism to time out individual operations?
>> >>>>>
>> >>>>> No, the socket connect timeout is hard-coded at 1 second and the
>> >>>>> socket I/O timeout is disabled.
>> >>>>>
>> >>>>> Bugzilla bug https://issues.apache.org/bugzilla/show_bug.cgi?id=51065
>> >>>>> has a patch, though it is apparently missing a call to adjust the
>> >>>>> socket timeout and to discard the connection if the timeout is
>> >>>>> reached.  More importantly, the API can only be changed in future
>> >>>>> APR 2.0; alternate, backwards-compatible API(s) could be added in a
>> >>>>> future APR-Util 1.6.
>> >>>>>
>> >>>>> >
>> >>>>> > If memcached freezes, then it appears my calls to 'get' will block
>> >>>>> > until memcached wakes up.  Is there any way to set a timeout for
>> >>>>> > that call?
>> >>>>> >
>> >>>>> > I can repro this in my unit tests by sending a SIGSTOP to memcached
>> >>>>> > before doing a 'get'.
>> >>>>> >
>> >>>>> > Here are my observations:
>> >>>>> >
>> >>>>> > apr_memcache_multgetp seems to time out in bounded time if I
>> >>>>> > SIGSTOP the memcached process.  Yes!
>> >>>>> >
>> >>>>> > apr_memcache_getp seems to hang indefinitely if I SIGSTOP the
>> >>>>> > memcached process.
>> >>>>> >
>> >>>>> > apr_memcache_set seems to hang indefinitely if I SIGSTOP the
>> >>>>> > memcached process.
>> >>>>> >
>> >>>>> > apr_memcache_delete seems to hang indefinitely if I SIGSTOP the
>> >>>>> > memcached process.
>> >>>>> >
>> >>>>> > apr_memcache_stats seems to hang indefinitely if I SIGSTOP the
>> >>>>> > memcached process.
>> >>>>> >
>> >>>>> > That last one really sucks, as I am using that to print the status
>> >>>>> > of all my cache shards to the log file if I detect a problem :(
>> >>>>> >
>> >>>>> >
>> >>>>> > Why does apr_memcache_multgetp do what I want and not the others?
>> >>>>> > Can I induce the others to have reasonable timeout behavior?
>> >>>>> >
>> >>>>> > When I SIGSTOP memcached, this makes it hard to even restart
>> >>>>> > Apache, at least with graceful-stop.
>> >>>>> >
>> >>>>> >
>> >>>>> > On a related note, the apr_memcache documentation
>> >>>>> > <http://apr.apache.org/docs/apr-util/1.4/group___a_p_r___util___m_c.html>
>> >>>>> > is very thin.  I'd be happy to augment it with my observations on
>> >>>>> > its usage and the meaning of some of the arguments if that was
>> >>>>> > desired.  How would I go about that?
>> >>>>>
>> >>>>> Check out APR trunk from Subversion, adjust the doxygen docs in the
>> >>>>> include files, build (make dox) and inspect the results, submit a
>> >>>>> patch to dev@apr.apache.org.
>> >>>>>
>> >>>>> >
>> >>>>> > -Josh
>> >>>>>
>> >>>>>
>> >>>>> --
>> >>>>> Born in Roswell... married an alien...
>> >>>>> http://emptyhammock.com/
>> >>>>>
>> >>>>
>> >>>>
>> >>>
>> >>
>> >
>
>
>
> --
> Born in Roswell... married an alien...
> http://emptyhammock.com/



--
Born in Roswell... married an alien...
http://emptyhammock.com/
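For readers who do not want to wade through the attached diff below, the core
idea from the thread is to poll the connection's socket for readability, with
a bounded timeout, before each reply is read, so that a stalled memcached
produces a timely error instead of an indefinite hang.  The following is a
minimal, compilable sketch of that pattern; the helper name and the 50ms
constant are illustrative assumptions only, and the attached patch instead
keeps a one-entry pollset on each apr_memcache_conn_t so it does not have to
be rebuilt on every call.

/* Illustrative sketch only; not part of the attached patch. */
#include "apr_pools.h"
#include "apr_network_io.h"
#include "apr_poll.h"

#define EXAMPLE_READ_TIMEOUT_USEC 50000  /* mirrors MULT_GET_TIMEOUT */

static apr_status_t example_wait_until_readable(apr_socket_t *sock,
                                                apr_pool_t *pool)
{
    apr_pollset_t *pollset;
    apr_pollfd_t pollfd;
    apr_int32_t num_ready;
    const apr_pollfd_t *active;
    apr_status_t rv;

    /* One-entry pollset watching just this connection's socket. */
    rv = apr_pollset_create(&pollset, 1, pool, 0);
    if (rv != APR_SUCCESS) {
        return rv;
    }

    pollfd.p = pool;
    pollfd.desc_type = APR_POLL_SOCKET;
    pollfd.reqevents = APR_POLLIN;
    pollfd.desc.s = sock;
    pollfd.client_data = NULL;
    rv = apr_pollset_add(pollset, &pollfd);
    if (rv != APR_SUCCESS) {
        return rv;
    }

    /* apr_pollset_poll() returns APR_TIMEUP if nothing becomes readable
     * within the timeout; the caller can then drop the connection. */
    rv = apr_pollset_poll(pollset, EXAMPLE_READ_TIMEOUT_USEC,
                          &num_ready, &active);
    return rv;
}

When the poll fails or times out, the caller marks the server dead and
releases the connection, which is what the patch below does in
poll_server_releasing_connection_on_failure().
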
Index: memcache/apr_memcache.c
===================================================================
--- memcache/apr_memcache.c	(revision 1456414)
+++ memcache/apr_memcache.c	(working copy)
@@ -30,8 +30,18 @@
     apr_bucket_brigade *bb;
     apr_bucket_brigade *tb;
     apr_memcache_server_t *ms;
+    apr_pollset_t *pollset;   /* Set-up to poll only for this one connection */
 };
 
+/**
+ * Indicates whether the caller already holds the server lock when a
+ * helper function is invoked.
+ */
+typedef enum {
+    LOCK_NOT_HELD,
+    LOCK_HELD
+} caller_lock_status_t;
+
 /* Strings for Client Commands */
 
 #define MC_EOL "\r\n"
@@ -109,15 +119,20 @@
 #define MULT_GET_TIMEOUT 50000
 
-static apr_status_t make_server_dead(apr_memcache_t *mc, apr_memcache_server_t *ms)
+static apr_status_t mark_server_dead(apr_memcache_server_t *ms,
+                                     caller_lock_status_t caller_lock_status)
 {
 #if APR_HAS_THREADS
-    apr_thread_mutex_lock(ms->lock);
+    if (caller_lock_status == LOCK_NOT_HELD) {
+        apr_thread_mutex_lock(ms->lock);
+    }
 #endif
     ms->status = APR_MC_SERVER_DEAD;
     ms->btime = apr_time_now();
 #if APR_HAS_THREADS
-    apr_thread_mutex_unlock(ms->lock);
+    if (caller_lock_status == LOCK_NOT_HELD) {
+        apr_thread_mutex_unlock(ms->lock);
+    }
 #endif
     return APR_SUCCESS;
 }
 
@@ -143,7 +158,7 @@
     return rv;
 }
 
-static apr_status_t mc_version_ping(apr_memcache_server_t *ms);
+static apr_status_t mc_version_ping_lock_held(apr_memcache_server_t *ms);
 
 APR_DECLARE(apr_memcache_server_t *) apr_memcache_find_server_hash(apr_memcache_t *mc,
                                                                    const apr_uint32_t hash)
@@ -181,10 +196,10 @@
 #if APR_HAS_THREADS
         apr_thread_mutex_lock(ms->lock);
 #endif
-        /* Try the dead server, every 5 seconds */
+        /* Try the dead server, every 5 seconds, keeping the lock */
         if (curtime - ms->btime > apr_time_from_sec(5)) {
             ms->btime = curtime;
-            if (mc_version_ping(ms) == APR_SUCCESS) {
+            if (mc_version_ping_lock_held(ms) == APR_SUCCESS) {
                 make_server_live(mc, ms);
 #if APR_HAS_THREADS
                 apr_thread_mutex_unlock(ms->lock);
@@ -282,9 +297,48 @@
 APR_DECLARE(apr_status_t) apr_memcache_disable_server(apr_memcache_t *mc,
                                                        apr_memcache_server_t *ms)
 {
-    return make_server_dead(mc, ms);
+    return mark_server_dead(ms, LOCK_NOT_HELD);
 }
 
+/*
+ * Cleans up connections and/or bad servers as required.
+ *
+ * This function should only be called after a fatal error communicating
+ * with a server.
+ */
+static void disable_server_and_connection(apr_memcache_server_t *ms,
+                                          caller_lock_status_t lock_status,
+                                          apr_memcache_conn_t *conn) {
+    if (conn != NULL) {
+        ms_bad_conn(ms, conn);
+    }
+    mark_server_dead(ms, lock_status);
+}
+
+/*
+ * Polls a socket to see whether the next I/O operation call will block.
+ * The status returned is based on apr_pollset_poll:
+ * http://apr.apache.org/docs/apr/trunk/group__apr__poll.html#ga6b31d7b3a7b2d356370403dd2b79ecf3
+ *
+ * In practice this appears to return APR_TIMEUP on timeout (hardcoded to 50ms).
+ *
+ * Returns APR_SUCCESS if it's OK to read.  Otherwise it releases the
+ * connection and returns the status.
+ */
+static apr_status_t poll_server_releasing_connection_on_failure(apr_memcache_server_t *ms,
+                                                                caller_lock_status_t lock_status,
+                                                                apr_memcache_conn_t *conn) {
+    apr_int32_t queries_recvd;
+    const apr_pollfd_t *activefds;
+
+    apr_status_t rv = apr_pollset_poll(conn->pollset, MULT_GET_TIMEOUT,
+                                       &queries_recvd, &activefds);
+    if (rv != APR_SUCCESS) {
+        disable_server_and_connection(ms, lock_status, conn);
+    }
+    return rv;
+}
+
 static apr_status_t conn_connect(apr_memcache_conn_t *conn)
 {
     apr_status_t rv = APR_SUCCESS;
@@ -339,6 +393,7 @@
     conn->p = np;
     conn->tp = tp;
 
+
     rv = apr_socket_create(&conn->sock, APR_INET, SOCK_STREAM, 0, np);
 
     if (rv != APR_SUCCESS) {
@@ -346,6 +401,26 @@
         return rv;
     }
 
+    rv = apr_pollset_create(&conn->pollset, 1, np, 0);
+    if (rv == APR_SUCCESS) {
+        /*
+         * Populate an 'is readable' pollset for this socket to allow early
+         * timeout detection.
+         */
+        apr_pollfd_t pollfd;
+
+        pollfd.desc_type = APR_POLL_SOCKET;
+        pollfd.reqevents = APR_POLLIN;
+        pollfd.p = np;
+        pollfd.desc.s = conn->sock;
+        pollfd.client_data = NULL;
+        rv = apr_pollset_add(conn->pollset, &pollfd);
+    }
+    if (rv != APR_SUCCESS) { /* apr_pollset_create or apr_pollset_add */
+        apr_pool_destroy(np);
+        return rv;
+    }
+
     conn->buffer = apr_palloc(conn->p, BUFFER_SIZE);
     conn->blen = 0;
     conn->ms = ms;
@@ -588,6 +663,31 @@
     return apr_brigade_cleanup(conn->tb);
 }
 
+static apr_status_t sendv_and_get_server_line_with_timeout(apr_memcache_server_t *ms,
+                                                           apr_memcache_conn_t *conn,
+                                                           struct iovec* vec,
+                                                           int vec_size,
+                                                           caller_lock_status_t lock_status) {
+    apr_size_t written;
+    apr_status_t rv = apr_socket_sendv(conn->sock, vec, vec_size, &written);
+    if (rv != APR_SUCCESS) {
+        disable_server_and_connection(ms, lock_status, conn);
+        return rv;
+    }
+
+    rv = poll_server_releasing_connection_on_failure(ms, lock_status, conn);
+    if (rv != APR_SUCCESS) {
+        return rv;
+    }
+
+    rv = get_server_line(conn);
+    if (rv != APR_SUCCESS) {
+        disable_server_and_connection(ms, LOCK_NOT_HELD, conn);
+    }
+
+    return rv;
+}
+
 static apr_status_t storage_cmd_write(apr_memcache_t *mc,
                                       char *cmd,
                                       const apr_size_t cmd_size,
@@ -601,7 +701,6 @@
     apr_memcache_server_t *ms;
     apr_memcache_conn_t *conn;
     apr_status_t rv;
-    apr_size_t written;
     struct iovec vec[5];
     apr_size_t klen;
 
@@ -641,22 +740,13 @@
     vec[4].iov_base = MC_EOL;
     vec[4].iov_len = MC_EOL_LEN;
 
-    rv = apr_socket_sendv(conn->sock, vec, 5, &written);
+    rv = sendv_and_get_server_line_with_timeout(ms, conn, vec, 5,
+                                                LOCK_NOT_HELD);
 
     if (rv != APR_SUCCESS) {
-        ms_bad_conn(ms, conn);
-        apr_memcache_disable_server(mc, ms);
         return rv;
     }
 
-    rv = get_server_line(conn);
-
-    if (rv != APR_SUCCESS) {
-        ms_bad_conn(ms, conn);
-        apr_memcache_disable_server(mc, ms);
-        return rv;
-    }
-
     if (strcmp(conn->buffer, MS_STORED MC_EOL) == 0) {
         rv = APR_SUCCESS;
     }
@@ -730,7 +820,6 @@
     apr_memcache_server_t *ms;
     apr_memcache_conn_t *conn;
     apr_uint32_t hash;
-    apr_size_t written;
     apr_size_t klen = strlen(key);
     struct iovec vec[3];
 
@@ -756,21 +845,12 @@
     vec[2].iov_base = MC_EOL;
     vec[2].iov_len = MC_EOL_LEN;
 
-    rv = apr_socket_sendv(conn->sock, vec, 3, &written);
-
+    rv = sendv_and_get_server_line_with_timeout(ms, conn, vec, 3,
+                                                LOCK_NOT_HELD);
     if (rv != APR_SUCCESS) {
-        ms_bad_conn(ms, conn);
-        apr_memcache_disable_server(mc, ms);
         return rv;
     }
 
-    rv = get_server_line(conn);
-    if (rv != APR_SUCCESS) {
-        ms_bad_conn(ms, conn);
-        apr_memcache_disable_server(mc, ms);
-        return rv;
-    }
-
     if (strncmp(MS_VALUE, conn->buffer, MS_VALUE_LEN) == 0) {
         char *flags;
         char *length;
@@ -862,7 +942,6 @@
     apr_memcache_server_t *ms;
     apr_memcache_conn_t *conn;
     apr_uint32_t hash;
-    apr_size_t written;
     struct iovec vec[3];
     apr_size_t klen = strlen(key);
 
@@ -890,21 +969,12 @@
     vec[2].iov_base = conn->buffer;
     vec[2].iov_len = klen;
 
-    rv = apr_socket_sendv(conn->sock, vec, 3, &written);
-
+    rv = sendv_and_get_server_line_with_timeout(ms, conn, vec, 3,
+                                                LOCK_NOT_HELD);
     if (rv != APR_SUCCESS) {
-        ms_bad_conn(ms, conn);
-        apr_memcache_disable_server(mc, ms);
        return rv;
     }
 
-    rv = get_server_line(conn);
-    if (rv != APR_SUCCESS) {
-        ms_bad_conn(ms, conn);
-        apr_memcache_disable_server(mc, ms);
-        return rv;
-    }
-
     if (strncmp(MS_DELETED, conn->buffer, MS_DELETED_LEN) == 0) {
         rv = APR_SUCCESS;
     }
@@ -931,7 +1001,6 @@
     apr_memcache_server_t *ms;
     apr_memcache_conn_t *conn;
     apr_uint32_t hash;
-    apr_size_t written;
     struct iovec vec[3];
     apr_size_t klen = strlen(key);
 
@@ -959,21 +1028,12 @@
     vec[2].iov_base = conn->buffer;
     vec[2].iov_len = klen;
 
-    rv = apr_socket_sendv(conn->sock, vec, 3, &written);
-
+    rv = sendv_and_get_server_line_with_timeout(ms, conn, vec, 3,
+                                                LOCK_NOT_HELD);
    if (rv != APR_SUCCESS) {
-        ms_bad_conn(ms, conn);
-        apr_memcache_disable_server(mc, ms);
        return rv;
     }
 
-    rv = get_server_line(conn);
-    if (rv != APR_SUCCESS) {
-        ms_bad_conn(ms, conn);
-        apr_memcache_disable_server(mc, ms);
-        return rv;
-    }
-
     if (strncmp(MS_ERROR, conn->buffer, MS_ERROR_LEN) == 0) {
         rv = APR_EGENERAL;
     }
@@ -1030,7 +1090,6 @@
 {
     apr_status_t rv;
     apr_memcache_conn_t *conn;
-    apr_size_t written;
     struct iovec vec[2];
 
     rv = ms_find_conn(ms, &conn);
@@ -1046,19 +1105,12 @@
     vec[1].iov_base = MC_EOL;
     vec[1].iov_len = MC_EOL_LEN;
 
-    rv = apr_socket_sendv(conn->sock, vec, 2, &written);
-
+    rv = sendv_and_get_server_line_with_timeout(ms, conn, vec, 2,
+                                                LOCK_NOT_HELD);
     if (rv != APR_SUCCESS) {
-        ms_bad_conn(ms, conn);
        return rv;
     }
 
-    rv = get_server_line(conn);
-    if (rv != APR_SUCCESS) {
-        ms_bad_conn(ms, conn);
-        return rv;
-    }
-
     if (strncmp(MS_VERSION, conn->buffer, MS_VERSION_LEN) == 0) {
         *baton = apr_pstrmemdup(p, conn->buffer+MS_VERSION_LEN+1,
                                 conn->blen - MS_VERSION_LEN - 2);
@@ -1073,10 +1125,9 @@
     return rv;
 }
 
-apr_status_t mc_version_ping(apr_memcache_server_t *ms)
+static apr_status_t mc_version_ping_lock_held(apr_memcache_server_t *ms)
 {
     apr_status_t rv;
-    apr_size_t written;
     struct iovec vec[2];
     apr_memcache_conn_t *conn;
 
@@ -1093,14 +1144,11 @@
     vec[1].iov_base = MC_EOL;
     vec[1].iov_len = MC_EOL_LEN;
 
-    rv = apr_socket_sendv(conn->sock, vec, 2, &written);
-
+    rv = sendv_and_get_server_line_with_timeout(ms, conn, vec, 2, LOCK_HELD);
     if (rv != APR_SUCCESS) {
-        ms_bad_conn(ms, conn);
         return rv;
     }
 
-    rv = get_server_line(conn);
     ms_release_conn(ms, conn);
     return rv;
 }
@@ -1664,6 +1712,11 @@
         return rv;
     }
 
+    rv = poll_server_releasing_connection_on_failure(ms, LOCK_NOT_HELD, conn);
+    if (rv != APR_SUCCESS) {
+        return rv;
+    }
+
     ret = apr_pcalloc(p, sizeof(apr_memcache_stats_t));
 
     do {
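
As a rough illustration of the intended effect (not part of the patch), a
client built against the patched apr_memcache should see blocking calls fail
within roughly the 50ms MULT_GET_TIMEOUT window when memcached is SIGSTOPped,
rather than hanging.  The host, port, pool-manager parameters, and keys below
are placeholder assumptions for a local test:

#include <stdio.h>
#include "apr_general.h"
#include "apr_errno.h"
#include "apr_pools.h"
#include "apr_memcache.h"

int main(void)
{
    apr_pool_t *pool;
    apr_memcache_t *mc;
    apr_memcache_server_t *server;
    char *value;
    apr_size_t len;
    apr_status_t rv;
    char errbuf[120];

    apr_initialize();
    apr_pool_create(&pool, NULL);

    /* One local memcached; host/port are placeholders. */
    apr_memcache_create(pool, 1, 0, &mc);
    apr_memcache_server_create(pool, "localhost", 11211, 0, 1, 1, 60, &server);
    apr_memcache_add_server(mc, server);

    rv = apr_memcache_set(mc, "example-key", "example-value", 13, 0);
    printf("set: %s\n", apr_strerror(rv, errbuf, sizeof(errbuf)));

    /* With memcached SIGSTOPped, this should fail within ~50ms rather
     * than blocking indefinitely. */
    rv = apr_memcache_getp(mc, pool, "example-key", &value, &len, NULL);
    printf("get: %s\n", apr_strerror(rv, errbuf, sizeof(errbuf)));

    apr_pool_destroy(pool);
    apr_terminate();
    return 0;
}

With memcached running normally, both calls should report success; after
'kill -STOP' on the memcached process, the get should return an error status
quickly and the server is marked dead, to be retried on the usual 5-second
cycle.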