On Wed, Apr 27, 2016 at 6:16 PM, Yann Ylavic <ylavic....@gmail.com> wrote:

> I was offline today so couldn't comment on the different messages on
> the subject, so I'll try to summarize (here) my understanding, so
> far...
>
> On Wed, Apr 27, 2016 at 8:41 PM,  <wr...@apache.org> wrote:
> > Author: wrowe
> > Date: Wed Apr 27 18:41:49 2016
> > New Revision: 1741310
> >
> > URL: http://svn.apache.org/viewvc?rev=1741310&view=rev
> > Log:
> >
> >   Ensure http2 follows http in the meaning of
> >   status WRITE (meaning 'in the request processing
> >   phase' even if still consuming the request body,
> >   not literally in a 'now writing' state).
>
> This is indeed consistent with how we report http1 state currently,
> but maybe it could be more intuitive to report READ until the body is
> consumed in http1 rather than changing http2?
> Unless we want to minimize scoreboard updates, for performance reasons...
>

Well, we always want to be considerate of performance.

That said, we absolutely must not change the semantic meanings of
the server-status results on the maintenance branch.  Change those
meanings on trunk for a future 2.6/3.0?  Sure... but 2.4.x needs to
behave as 2.4.x has behaved from the start.


> >   Ensure a number of MPMs and the h2 connection io
> >   no longer clobber the request status line during
> >   state-only changes.  While at it, clean up some
> >   very ugly formatting and unnecessary decoration,
> >   and avoid the wordy _from_conn() flavor when we
> >   are not passing a connection_rec.
>
> Why ap_update_child_status_from_conn(), given a real or NULL c, would
> clobber the request line?
>

Given a real conn_rec record, we have no request line.  Therefore the result
is NULL.  There is no purpose in modifying the conn_rec related fields once
correctly recorded.  The next chance they have for being modified in any
meaningful way is during an ssl renegotiation or the processing of the
Host:
and X-F-F: headers during the read_request phase.


> The request line is untouched unless a non-NULL r is passed to
> update_child_status_internal(), and ap_update_child_status_from_conn()
> sets r to NULL.
>

That was incorrect - to the extend that you've changed it since 2.4.1,
such a change must be reverted.

At the time we process a new connection (before, and again after we
have parsed any SNI host handshake) - there is NO REQUEST.  Any
lingering request value must be blanked out at that time.  Once the
request URI is parsed we again invoke update_child_status, this time
with the request_rec with a now-known request string.


> AIUI, what sets the request line to NULL in mpm_worker is when
> ap_read_request() fails (mostly after connection closed remotely or
> keep-alive timeout, actually any empty/NULL r->the_request from
> read_request_line() would do it).
> This may also happen in mpm_event but since the worker (scoreboard
> entry) has probably changed in between, this is possibly less visible.
>

You don't understand it.

Envision this sequence, which is how the scoreboard was designed;

Initialization -> entire score record is voided

Connection -> score entry tagged READ, what is known of the
connecting client and the target vhost is recorded

Connection SNI -> score entry updated with new target vhost

Request read -> score entry again updated with new target vhost,
the request identifier is updated, score entry tagged WRITE

Request body is read, Response body is prepared

Request complete -> score entry tagged LOGGING, and NO
other fields need to be updated

Request logged, left in keepalive state -> score entry tagged
KEEPALIVE, NO other fields need to be updated

Connection closed -> score entry tagged IDLE, NO OTHER
FIELDS SHOULD BE UPDATED until the worker is reused
(even in the case of event MPM).




So how about:
>
> Index: server/protocol.c
> ===================================================================
> --- server/protocol.c    (revision 1741108)
> +++ server/protocol.c    (working copy)
> @@ -1093,13 +1093,14 @@ request_rec *ap_read_request(conn_rec *conn)
>              access_status = r->status;
>              r->status = HTTP_OK;
>              ap_die(access_status, r);
> -            ap_update_child_status(conn->sbh, SERVER_BUSY_LOG, r);
> +            ap_update_child_status(conn->sbh, SERVER_BUSY_LOG,
> +                                   r->the_request ? r : NULL);
>

That could be updated, or we could equally pass NULL always, but it doesn't
impact the correctness of the proposed backport.


>              ap_run_log_transaction(r);
>              r = NULL;
>              apr_brigade_destroy(tmp_bb);
>              goto traceout;
>          case HTTP_REQUEST_TIME_OUT:
> -            ap_update_child_status(conn->sbh, SERVER_BUSY_LOG, r);
> +            ap_update_child_status(conn->sbh, SERVER_BUSY_LOG, NULL);
>              if (!r->connection->keepalives)
>                  ap_run_log_transaction(r);
>              apr_brigade_destroy(tmp_bb);
> ?
>
>
> Regarding this commit, the goal seems more about not clobbering the
> *client IP* until we read the next/real request on each connection,
> but this implies that we report more the last requests handled by each
> worker than the workers states themselves...
> Isn't the access log more appropriate for that purpose finally,
> whereas the scoreboard is more about workers' activity?
>

The goal is to avoid clobbering any of the fields, including the request.
The patch satisfies that goal, the original behavior of voiding out the old
request line during any _from_conn() update should be restored as you
just identified above.

You are missing the point, the scoreboard is not to be abused until that
worker has *new* work to do, and even the previous request on that worker
needs to persist in the keepalive case until a new request arrives on that
connection.


> However I agree that without doing something like this (I tried a
> different approach in PR 59333, not working yet... but it could with
> more work IMHO), with mpm_event we may find mixed data (from different
> requests/connections) in the same scoreboard entry :/
>

This is why it is important to only record the client/vhost/request datum
at the time they are first known, and not to keep tickling those values,
for no effect.


> I think Stefan tried to address this (observed by testing http2
> scores?) in 2.4.20 by blanking the fields before reading each request
> (unfortunately that broke the usual reporting, but I think the
> intention was laudable).
>

Not questioning anyone's motives, as Stefan points out, the design
of the scoreboard isn't blatantly obvious.  My patch would have also
corrected the unclear doxygen description, but there is not such
description as we are all aware :)

So now, either we choose to not report workers' activity in between
> requests (there are just multi-purpose workers/request-pollers after
> all), or we save some "last request" informations in each conn_rec and
> restore them on the worker (score) handling the connection at each
> time (this is more reporting the workers' activity than the requests
> lines, which will pass from one worker to the other, even possibly
> duplicating requests lines with low activity).
>

If that slot does other work, it should be recorded.  But while that slot
sits unused, the last activity for that score slot must be preserved.

Reply via email to