Re: Scoreboard 3.0
On May 11, 2016 4:16 PM, "Eric Covener" wrote:
>
> On Wed, May 11, 2016 at 8:28 AM, Stefan Eissing wrote:
> > I got some such feedback when h2 connections truly went async and no
> > longer showed up as occupying a slot. I share your idea that in an
> > event based server, threads are not the primary dimension but
> > connections are.
>
> Threads (slots) are the primary dimension for what you need to monitor
> to not get overrun. The state of connections NOT running on a thread
> (keepalive, write completion, lingering close) seems to me to still be
> a distant second for routine work. I see value in listing individual
> write-completion requests, since they could be an issue from the user's
> POV but would be de-emphasized in current server-status.

In the next iteration for trunk I see value in having both the basic activity flag per worker thread, so we can see those states at a glance as we can today, and a summary table per process of the number of recognized workers and of inactive connections/requests by state. Those should be detailed whether or not ExtendedStatus is enabled. That table should help us understand the nature of any server overload.
bugzilla spam
Infra is aware and firefighting. -- Eric Covener cove...@gmail.com
Re: Scoreboard 3.0
On Wed, May 11, 2016 at 8:28 AM, Stefan Eissing wrote:
> I got some such feedback when h2 connections truly went async and no longer
> showed up as occupying a slot. I share your idea that in an event based
> server, threads are not the primary dimension but connections are.

Threads (slots) are the primary dimension for what you need to monitor to not get overrun. The state of connections NOT running on a thread (keepalive, write completion, lingering close) seems to me to still be a distant second for routine work. I see value in listing individual write-completion requests, since they could be an issue from the user's POV but would be de-emphasized in current server-status.
Re: Detecting client aborts and stream resets
Quoting Stefan Eissing:

> Thanks for the patch! I applied it to trunk in r1743335; it will be part
> of the next 1.5.4 release. I only omitted the last change, as I do not
> want to set aborted on the main connection every time the session closes.

Ok, that's fine for me. Thanks a lot Stefan!

Regards,
Michael
Re: Scoreboard 3.0
On Wed, May 11, 2016 at 10:28 AM, Stefan Eissing <stefan.eiss...@greenbytes.de> wrote:
>
> > On 11.05.2016 at 17:03, William A Rowe Jr wrote:
> >
> > It seems like a lot of the scoreboard confusion lies in the event
> > MPM's work-performed-by-thread model of the thread-based scoreboard.
> >
> > Let's call this existing model 'ExtendedStatus threads'.
> >
> > What it seems some users and committers are looking for is
> > a different model, let's call this 'ExtendedStatus connections'.
>
> I got some such feedback when h2 connections truly went async and no
> longer showed up as occupying a slot. I share your idea that in an event
> based server, threads are not the primary dimension but connections are.

I think they are both valid representations; I'm curious if there is a use case for configuring both in verbose/extended mode at the same time.

> The information which thread handled a connection at which point in time
> seems irrelevant. If a connection occupies a thread for more than x
> seconds, it certainly becomes interesting. Also the waiting time, should
> all threads be occupied, is interesting, as well as statistics about how
> many keepalive connections were closed early due to worker exhaustion.

+1 - e.g. track the time at which the connection awoke based on the actual triggers in the pollset, even while the dispatcher is still trying to find a slot for each of the now-ready sockets. But until we poll again - and we won't until we dispatch all of the sockets which just woke us - we won't be seeing other connections which are potentially now awake.

> > If we adopted an ExtendedStatus connections model, the scoreboard
> > table would be much larger than the number of event threads,
> > and would report all of the open connections to the server (at
> > least, all those that had passed the httpready threshold of our
> > awareness). It can't be a simple ring, because the slots would
> > expire at different times relative to their injection, so we would
> > probably want to track the oldest request. The most optimal
> > implementation would have score pools per-process so that
> > connection score slot reuse is a simple thread lock, but we
> > could alternately perform this tracking across processes.
> >
> > The first question that comes to mind is whether there is any
> > value in supporting ExtendedStatus threads+connections?
> > Two entire detailed lists residing in two different score shms?
> > Note that the traditional thread status flag scoreboard would
> > certainly be preserved, for a thread summary, even when the
> > user chooses ExtendedStatus connections.
> >
> > Feedback and more ideas?
>
> It seems the information which request had been processed is so
> interesting (which I totally destroyed in my ignorance) that the
> scoreboard should make every attempt to keep it until it is replaced by
> something new. This is fine for the "static" HTML view that there is now.
>
> What I would find really cool is a feature that records connection/worker
> changes and lets me play that back in a browser. Maybe even a live feed of
> (a subset of) activities. Something that you see in the background on Star
> Trek, only it would be real.

Turning the conventional per-thread score and per-connection score into two score-notice handlers, and allowing the user to then load other modules which register against the score-notice hook, would accomplish this... It could record individual deltas or snapshots at timed intervals.

> Not sure what is more feasible. I once did a status monitor of a system
> with various props that reported green/yellow/red and had that in a
> horizontal scroll in a browser. Customer was not happy as it became
> apparent how often his system was in red though... but that cannot happen
> to us!

I'm thinking ActiveMQ or a similar message broker might be our friend here for creating such an app. Not "logging", per se, but notifications of every actual server connection/request state change. Digging the idea :)
Re: End of the road of 2.2.x maintenance?
> On 10 May 2016, at 21:38, William A Rowe Jr wrote:
>
> Are we ready to start the 12 month countdown as of the next/final bug
> fix release of 2.2, and highlight this in both the 2.2 and 2.4 announce
> broadcasts?
>
> I'm hoping we conclude some fixes of 2.4 scoreboard regressions and
> get to the point of releasing 2.4 with mod_proxy_http2 sometime within
> the next month or so, and that we can reach a consensus about how
> we will proceed on the 2.2 branch, before we get to that release.
>
> Feedback desired

As a big consumer of Apache 2.2 in my day job, where we are obliged to track Apache's policies very closely, I would prefer to delay this a bit. When Apache announces the formal end-of-life date of 2.2, we will be required to engineer the migration of 6000+ wildly diverse sites to Apache 2.4 to meet internal audit policies. I would propose the 12 month countdown starts no earlier than Jan 2017 (as a consumer). What's the cost of maintaining (but maybe not updating) Apache 2.2?

Cheers,
Mark
Re: End of the road of 2.2.x maintenance?
On 05/10/2016 10:38 PM, William A Rowe Jr wrote:
> It's been a year, and seems to be a good time to revisit this topic
> while those folks who are present at ApacheCon can discuss f2f
> the merits of bringing the 2.2.x chapter to a close, and share their
> thoughts back here on-list.
>
> According to http://w3techs.com/technologies/history_details/ws-apache/2
> the inflection point of a majority of 2.4 instances over 2.2 appears
> to occur about 9 months from now.
>
> OpenSUSE 13.1 adopted 2.4 way back in Nov of '13.
>
> Ubuntu 14.04 LTS and Debian 8 (Jessie) switched to 2.4 in April '14.
>
> RHEL / CentOS 7 are well over a year old, adopted 2.4 in June '14.
> Fedora 18 shipped 2.4 way back in Jan '13.
>
> E.g. every user of the broadly distributed Linux releases will have had
> three full years to adopt 2.4 by June of 2017. I expect the BSD world
> looks similar (modulo any Apache License 2.0 stupidity we are all
> too familiar with.) If someone in the BSD, Solaris and other spheres
> wants to chime in here with those milestones, that would be great.
>
> I am prepared to RM a final bug-fix release of 2.2 in conjunction with
> the next 2.4 release effort, to gather in any final requests for fixes
> before we move to a 12-month, security-fixes-only window on that branch.
> Once those 12 months expire, as we've done with 1.3 and 2.0, there's
> the possibility that relatively few committers would collect some critical
> patches/apply-to-2.2.xx final security fixes, but no further releases would
> occur.
>
> Are we ready to start the 12 month countdown as of the next/final bug
> fix release of 2.2, and highlight this in both the 2.2 and 2.4 announce
> broadcasts?
>
> I'm hoping we conclude some fixes of 2.4 scoreboard regressions and
> get to the point of releasing 2.4 with mod_proxy_http2 sometime within
> the next month or so, and that we can reach a consensus about how
> we will proceed on the 2.2 branch, before we get to that release.
>
> Feedback desired,

Sounds like a sensible plan.

Regards

Rüdiger
Re: Scoreboard 3.0
> On 11.05.2016 at 17:03, William A Rowe Jr wrote:
>
> It seems like a lot of the scoreboard confusion lies in the event
> MPM's work-performed-by-thread model of the thread-based scoreboard.
>
> Let's call this existing model 'ExtendedStatus threads'.
>
> What it seems some users and committers are looking for is
> a different model, let's call this 'ExtendedStatus connections'.

I got some such feedback when h2 connections truly went async and no longer showed up as occupying a slot. I share your idea that in an event based server, threads are not the primary dimension but connections are. The information which thread handled a connection at which point in time seems irrelevant. If a connection occupies a thread for more than x seconds, it certainly becomes interesting. Also the waiting time, should all threads be occupied, is interesting, as well as statistics about how many keepalive connections were closed early due to worker exhaustion.

> If we adopted an ExtendedStatus connections model, the scoreboard
> table would be much larger than the number of event threads,
> and would report all of the open connections to the server (at
> least, all those that had passed the httpready threshold of our
> awareness). It can't be a simple ring, because the slots would
> expire at different times relative to their injection, so we would
> probably want to track the oldest request. The most optimal
> implementation would have score pools per-process so that
> connection score slot reuse is a simple thread lock, but we
> could alternately perform this tracking across processes.
>
> The first question that comes to mind is whether there is any
> value in supporting ExtendedStatus threads+connections?
> Two entire detailed lists residing in two different score shms?
> Note that the traditional thread status flag scoreboard would
> certainly be preserved, for a thread summary, even when the
> user chooses ExtendedStatus connections.
>
> Feedback and more ideas?

It seems the information which request had been processed is so interesting (which I totally destroyed in my ignorance) that the scoreboard should make every attempt to keep it until it is replaced by something new. This is fine for the "static" HTML view that there is now.

What I would find really cool is a feature that records connection/worker changes and lets me play that back in a browser. Maybe even a live feed of (a subset of) activities. Something that you see in the background on Star Trek, only it would be real. It could record individual deltas or snapshots at timed intervals. Not sure what is more feasible. I once did a status monitor of a system with various props that reported green/yellow/red and had that in a horizontal scroll in a browser. Customer was not happy as it became apparent how often his system was in red though... but that cannot happen to us!

-Stefan
Scoreboard 3.0
It seems like a lot of the scoreboard confusion lies in the event MPM's work-performed-by-thread model of the thread-based scoreboard.

Let's call this existing model 'ExtendedStatus threads'.

What it seems some users and committers are looking for is a different model; let's call this 'ExtendedStatus connections'.

If we adopted an ExtendedStatus connections model, the scoreboard table would be much larger than the number of event threads, and would report all of the open connections to the server (at least, all those that had passed the httpready threshold of our awareness). It can't be a simple ring, because the slots would expire at different times relative to their injection, so we would probably want to track the oldest request. The most optimal implementation would have score pools per-process so that connection score slot reuse is a simple thread lock, but we could alternately perform this tracking across processes.

The first question that comes to mind is whether there is any value in supporting ExtendedStatus threads+connections? Two entire detailed lists residing in two different score shms? Note that the traditional thread status flag scoreboard would certainly be preserved, for a thread summary, even when the user chooses ExtendedStatus connections.

Feedback and more ideas?

Cheers,
Bill
Re: svn commit: r1742794 - /httpd/httpd/branches/2.4.x/STATUS
On Tue, May 10, 2016 at 11:36 PM, William A Rowe Jr wrote:
> On Tue, May 10, 2016 at 7:26 PM, Yann Ylavic wrote:
>>
>> The case where this happens is for keepalive/lingering-close handling.
>> Suppose thread t1 handles request r1 on connection c1, and t1 is
>> released after r1 (when c1 is kept alive).

So thread t1 has traditionally reported the 'K'eepalive state, but that is based on the fact that t1 is still 'busy' waiting on the next request from c1, in the classic prefork or worker models. In the event case, thread t1 should now be reported as idle, because that worker no longer has any work assigned to it; however, the score slot should continue to display t1's last request r1 until a new connection or request occupies that slot, at which time everything related to r1 must be zeroed out (replaced again by c1 where appropriate). What we might want in the score is to track "unaffiliated connections" with no thread assigned, in the state they were last in, but right now we have no such reporting mechanism.

>> Same for t2 with r2 on c2.
>>
>> Now the following cases (not exhaustive):
>> 1. r3 arrives on c1, t2 handles the event (read).
>> 2. c2 is closed/reset remotely, t1 handles the event (read).
>> 3. c1 is closed/reset remotely, t2 handles the event (read).
>>
>> Without my patch:
>> 1. t2 reports r3 on c1, "losing" r2 on c2 (no more reported by the
>> scoreboard), while t1 still reports r1 on c1 (although "superseded" by
>> r3).

That behavior is by design, because t2 isn't performing any further work on r2 from c2 - it is performing work for c1. At this moment, c1 should appear in both score slots: t1 still reports r1, but in an idle state so that we know that isn't what the thread is doing right now, while t2 also reports c1, which it picked up, in an active processing state. t2 should not report r1, because r1 is "done and finished" and t2 never performed work on r1. And c2 shouldn't appear in the scoreboard at this time; as I observed above, that scoreboard functionality doesn't exist right now.

>> 2. t1 reads/closes c2 and clobbers r1 (blank).

That's fine; t1's only work on c2 was to read a request, which it did not find. t1 is working on c2. t1 did not handle the request r1.

>> 3. t2 reads/closes c1 and clobbers r3 (blank).
>> => t1 and t2 scoreboard's request/vhost are blank.

At the time we read no request, if we have not yet yielded the thread, the existing score slot for t2 should simply be updated with the status of closing - no change to the request or vhost, because these fields should not be updated if no request is resolved. If t2 yields after completing r3, when it resumes it has no request. This is true whether you have jumped threads, or managed to land back in the same thread as you illustrate above for c2.

>> With my patch:
>> 1. t2 reports r3 on c1, "losing" r2 on c2 (no more reported by the
>> scoreboard), while t1 still reports r1 on c1 (although "superseded" by
>> r3).

AIUI the proposed trunk patch introduces an edge case where t2 will briefly report r1 while t1 is still reporting r1, until the request is read complete and we are toggling to write status... is this correct?

>> 2. t1 reads/closes c2 and restores r2.

Incorrect behavior, IMO. t1 performed no work on r2.

>> 3. t2 reads/closes c1 and restores r3.
>> => t1 and t2 scoreboard's request/vhost contain relevant (last) info.

So in bullet 2 above, it isn't the last relevant activity of the thread t1. You can argue that r3 was the most recent activity performed on t2, but we would want to perform an identity check that the score slot has simply resumed the same connection, that it had not jumped threads, and that there was no intervening activity on that thread. In httpd 2.4 and prior, the scoreboard is relating what each thread's action is.

In httpd 3.0, you could certainly propose that the score is connection oriented, with one slot per connection irrespective of which thread (if any) is handling the request. Then the proposed behavior makes a great deal of sense.

I agree that we should simply not update the scoreboard with a fresh "W" update when we have no request to write (the client disconnected and went away). That's why I cast a -1 on the test you added for r->the_request; that isn't the right way to know that there is no request being written. I still need to dig further, but am really glad you pointed our attention at the offending status update. I expect we can dodge that update altogether for a no-request state. I believe that we may be missing an update to the scoreboard at the moment we wake up an event thread, so I'm looking in that corner as well.

In httpd 3.0 we will have to recall what request is in-flight, since we will be able to set aside a request and wake it back up on a different thread, and update the scoreboard appropriately, but I still don't believe that's the correct behavior on 2.4 maintenance.
Re: Detecting client aborts and stream resets
Thanks for the patch! I applied it to trunk in r1743335; it will be part of the next 1.5.4 release. I only omitted the last change, as I do not want to set aborted on the main connection every time the session closes.

Cheers,
Stefan

> On 10.05.2016 at 14:37, Michael Kaufmann wrote:
>
> Quoting William A Rowe Jr:
>
>> On Wed, May 4, 2016 at 3:44 PM, Michael Kaufmann wrote:
>>
>>>> William is right, this is not a good idea. The ->aborted flag should
>>>> serve this purpose of telling anyone interested that this connection
>>>> is no longer delivering. I will make a github release soon where that
>>>> is working and you can test.
>>>
>>> Thank you Stefan! It is now working for stream resets, but it's not yet
>>> working if the client closes the whole TCP connection.
>>
>> As expected... this is why I pointed out in my first reply that you don't
>> want a single-protocol solution to this puzzle.
>
> Of course I'd also prefer a general solution.
>
> I have created a patch for mod_http2: with the patch, it sets c->aborted
> on the master connection and also on the "dummy" connections (streams)
> when the client closes the TCP connection.
>
> @Stefan: It would be great if you could check this code and add it to
> mod_http2 :-)
>
> Index: modules/http2/h2_mplx.c
> ===================================================================
> --- modules/http2/h2_mplx.c (revision 1743146)
> +++ modules/http2/h2_mplx.c (working copy)
> @@ -488,6 +488,15 @@
>      return 1;
>  }
>
> +static int task_abort_connection(void *ctx, void *val)
> +{
> +    h2_task *task = val;
> +    if (task->c) {
> +        task->c->aborted = 1;
> +    }
> +    return 1;
> +}
> +
>  apr_status_t h2_mplx_release_and_join(h2_mplx *m, apr_thread_cond_t *wait)
>  {
>      apr_status_t status;
> @@ -537,6 +546,8 @@
>       * and workers *should* return in a timely fashion.
>       */
>      for (i = 0; m->workers_busy > 0; ++i) {
> +        h2_ihash_iter(m->tasks, task_abort_connection, m);
> +
>          m->join_wait = wait;
>          status = apr_thread_cond_timedwait(wait, m->lock,
>                                             apr_time_from_sec(wait_secs));
>
> @@ -591,6 +602,7 @@
>      AP_DEBUG_ASSERT(m);
>      if (!m->aborted && enter_mutex(m, &acquired) == APR_SUCCESS) {
>          m->aborted = 1;
> +        m->c->aborted = 1;
>          h2_ngn_shed_abort(m->ngn_shed);
>          leave_mutex(m, acquired);
>      }
>
>> See my later reply about detecting connection tear-down; patches welcome.
>
> Sounds good! If someone sends a patch, I will help to test it.