On Thu, Apr 28, 2016 at 4:37 AM, Yann Ylavic <ylavic....@gmail.com> wrote:

> On Thu, Apr 28, 2016 at 4:30 AM, William A Rowe Jr <wr...@rowe-clan.net>
> wrote:
> > On Wed, Apr 27, 2016 at 6:16 PM, Yann Ylavic <ylavic....@gmail.com>
> wrote:
> >>
> >> On Wed, Apr 27, 2016 at 8:41 PM,  <wr...@apache.org> wrote:
> >> >   Ensure http2 follows http in the meaning of
> >> >   status WRITE (meaning 'in the request processing
> >> >   phase' even if still consuming the request body,
> >> >   not literally in a 'now writing' state).
> >>
> >> This is indeed consistent with how we report http1 state currently,
> >> but maybe it could be more intuitive to report READ until the body is
> >> consumed in http1 rather than changing http2?
> >> Unless we want to minimize scoreboard updates, for performance
> reasons...
> >
> >
> > Well, we always want to be considerate of performance.
> >
> > That said, we absolutely must not change the semantic meanings of
> > the server-status results on the maintenance branch.  Change those
> > meanings on trunk for a future 2.6/3.0?  Sure... but 2.4.x needs to
> > behave as 2.4.x has behaved from the start.
>
> Agreed, we can't change it for http1 in 2.4.x, but maybe http2 has
> other considerations WRT reading/writing on the master connection,
> reporting READ or WRITE for http2 is not really a compatibility issue
> IMHO.
>

It is, yes, the server-status has well understood semantics and changing
them on a released branch is never acceptable. The master worker already
has special semantics which are new. No reason to change the slave
workers between http/1 and http/2 as they are doing the same things.


> >> >   Ensure a number of MPMs and the h2 connection io
> >> >   no longer clobber the request status line during
> >> >   state-only changes.  While at it, clean up some
> >> >   very ugly formatting and unnecessary decoration,
> >> >   and avoid the wordy _from_conn() flavor when we
> >> >   are not passing a connection_rec.
> >>
> >> Why ap_update_child_status_from_conn(), given a real or NULL c, would
> >> clobber the request line?
> >
> >
> > Given a real conn_rec record, we have no request line.  Therefore the
> result
> > is NULL.  There is no purpose in modifying the conn_rec related fields
> once
> > correctly recorded.  The next chance they have for being modified in any
> > meaningful way is during an ssl renegotiation or the processing of the
> Host:
> > and X-F-F: headers during the read_request phase.
>
> That's more an optimization (not rewrite existing values with the same
> ones), but driven by how http1 in *2.4.x* works currently (this relies
> on the same worker handling the connection for the lifetime of the
> request).
>



> When SUSPENDED is used (trunk does it already at several places) we
> must overwrite the fields with actual values.
>

SUSPENDED within a request cannot be supported on 2.4.x as modules
are allowed to make thread storage and thread-local storage assumptions.
E.g. it breaks modules in an unexpected manner, and it is our responsibility
to inform users of this by bumping the version major.  This wasn't seen as
a big of an issue when we introduced event and launched 2.2.x, because
far more third party modules are request handlers, and are not connection-
based.  But we still bumped the version minor when we allowed the
connection to float between threads. This wasn't changed in maintained
release branch.

On trunk, it will have to become the responsibility of the MPM to restore
the right score slot state before reentering the processing loop on a given
worker thread, multiple times per connection or request, as that MPM
sends requests across threads.


> >> The request line is untouched unless a non-NULL r is passed to
> >> update_child_status_internal(), and ap_update_child_status_from_conn()
> >> sets r to NULL.
> >
> > That was incorrect - to the extend that you've changed it since 2.4.1,
> > such a change must be reverted.
>
> I meant ap_update_child_status_from_conn() gives a NULL r to
> update_child_status_internal(), so it won't change the existing
> request line (as opposed to setting it to the NULL value in the
> scoreboard).


What I meant is that the request line *MUST* be touched - wiped
clean - for any status_from_conn(c) request.

When we say we preserve the most recent state of the worker in
the scoreboard, that old request line is gone.  Every time we are
doing a conn-based refresh of that worker status, the req-based
data is no longer relevant and needs to go away.

> At the time we process a new connection (before, and again after we
> > have parsed any SNI host handshake) - there is NO REQUEST.  Any
> > lingering request value must be blanked out at that time.  Once the
> > request URI is parsed we again invoke update_child_status, this time
> > with the request_rec with a now-known request string.
>
> So there is a window where some values need to be blanked or restored
> from elsewhere, so to avoid mixing previous values of the worker with
> new ones not yet available (see below).
>

It's a straightforward flow as illustrated below, yes.


> >> AIUI, what sets the request line to NULL in mpm_worker is when
> >> ap_read_request() fails (mostly after connection closed remotely or
> >> keep-alive timeout, actually any empty/NULL r->the_request from
> >> read_request_line() would do it).
> >> This may also happen in mpm_event but since the worker (scoreboard
> >> entry) has probably changed in between, this is possibly less visible.
> >
> >
> > You don't understand it.
> >
> > Envision this sequence, which is how the scoreboard was designed;
> >
> > Initialization -> entire score record is voided
> >
> > Connection -> score entry tagged READ, what is known of the
> > connecting client and the target vhost is recorded
> >
> > Connection SNI -> score entry updated with new target vhost
> >
> > Request read -> score entry again updated with new target vhost,
> > the request identifier is updated, score entry tagged WRITE
> >
> > Request body is read, Response body is prepared
> >
> > Request complete -> score entry tagged LOGGING, and NO
> > other fields need to be updated
> >
> > Request logged, left in keepalive state -> score entry tagged
> > KEEPALIVE, NO other fields need to be updated
> >
> > Connection closed -> score entry tagged IDLE, NO OTHER
> > FIELDS SHOULD BE UPDATED until the worker is reused
> > (even in the case of event MPM).
>
> OK, so the worker scoreboard entry is up to date with the latest
> connection/request handled.
>

Right


> Now let's consider the case in KEEAPALIVE state where a new request
> arrives (possibly on another connection for mpm_event), or in
> READY/closed state (for any MPM) where a new connection+request is
> handled by the same worker.
>

When we transition to READ just before calling ap_read_request(), we
> update the client IP in the scoreboard, but keep the request line
> (from the previous request) untouched, mixed data...
>

No, no, no.  That's not what httpd did until it was broken.

See
http://svn.apache.org/viewvc/httpd/httpd/branches/2.4.x/server/scoreboard.c?r1=1239125&r2=1741240&diff_format=h

for the delta between 2.4.1 and 2.4.broken (current state).

         else if (c) {
-            apr_cpystrn(ws->client, ap_get_remote_host(c, NULL,
-                        REMOTE_NOLOOKUP, NULL), sizeof(ws->client));
-            ws->request[0]='\0';
-            ws->vhost[0]='\0';

Please don't point me to the most recent release and state it as the
correct behavior, when the patches that got us there was ill-advised
and not thought out, as illustrated by PR 59333.


> For mpm_event this is a very transient state since the worker was
> scheduled only because the connection is POLLIN ready, so
> ap_read_request() won't block.
>

Good... ready to read and set the request in a processing (W) state.
If we are *reading* (and not polling for something to read) the request
should be flipped to the (R) state with the original logic that fills back
in the connection info and empties the request line.


> For mpm_worker though, this may last KeepAliveTimeout.
> Moreover for both MPMs, if the connection is closed by the client the
> worker score will stay as is until next (real) request is read on
> another connection, or for mpm worker if keepalive expires the request
> line will be set to the value NULL (unless patch below is applied).
>

When we enter the keepalive state, status should be set to (K) without
otherwise modifying the previously processed conn/req fields. If the
connection must be sent away, we toggle the state to (C), once it is
gone, we toggle an idle state for that slot.

When we pick back up the connection on another thread in the event
MPM, the correct behavior would be to call status_from_conn() before
handing it back to core in 2.4.x.  On trunk, the behavior should be
enhanced to call status with a request_rec instead if that request
had been SUSPENDed on another thread.

When we resume a connection on another thread, we must revert
the breakage and clear the new score slot request_rec upon the
connection_rec context, remember the prior request isn't interesting
*on this thread* and this is not the thread that serviced the prior
request.

That's what is observed in PR 59333, and I don't see where your commit
> change this.
> We need to either accept/document mixing, or blank the request line
> (but it will be very noticeable by users as it is already in 2.4.20),
> or restore the latest request line (and vhost) on the connection
> (hence also store it by connection, in conn_rec).
>
> >> So how about:
> >>
> >> Index: server/protocol.c
> >> ===================================================================
> >> --- server/protocol.c    (revision 1741108)
> >> +++ server/protocol.c    (working copy)
> >> @@ -1093,13 +1093,14 @@ request_rec *ap_read_request(conn_rec *conn)
> >>              access_status = r->status;
> >>              r->status = HTTP_OK;
> >>              ap_die(access_status, r);
> >> -            ap_update_child_status(conn->sbh, SERVER_BUSY_LOG, r);
> >> +            ap_update_child_status(conn->sbh, SERVER_BUSY_LOG,
> >> +                                   r->the_request ? r : NULL);
>

No.  If the_request is blank the right answer is blank.  We don't log
a non-request.


> Or even do not update the status at all if !r->the_request (and/or
> !r->the_request[0]?) since it will pass to CLOSING very soon.
>

You don't want to know what's going on when httpd threads get
jammed in the logging phase blocking on some logging backend?

Reply via email to