Re: Exposing more loggable data from the proxy

2013-10-25 Thread Jeff Trawick
On Fri, Oct 25, 2013 at 9:16 AM, Jeff Trawick  wrote:

> On Fri, Oct 25, 2013 at 9:07 AM, Daniel Ruggeri wrote:
>
>> As I stand up a simple IPv6 test proxy that supports both AF_INET and
>> AF_INET6 addresses, I was looking for a way to log what addr family (and
>> maybe the IP address) mod_proxy settled on for each request in the
>> access_log. I'm not seeing a way to do that (but correct me if I'm
>> missing something) and was poking through the code and got to thinking
>> that there are all kinds of data bits that'd be interesting to have
>> available in the ENV.
>>
>> I'm thinking it'd be worth adding a directive (ProxyAddEnvironment?)
>> that adds these ENV entries to each r->subprocess_env:
>>  * Host header sent to backend (useful when dynamic targets are used)
>>  * Target DNS name if set
>>  * Target IP address
>>  * Target Address family
>>  * Target port
>>  * Target connection protocol
>>  * Flag for SSL enabled
>>
>> All of the data is readily available once a connection is acquired in
>> ap_proxy_acquire_connection sans the HTTP Host header.
>> Aside from logging, exporting these as ENV entries to the request allows
>> us to do all sorts of stuff in other modules, too
>>
>> Any thoughts? Is there something I should include or exclude before I
>> begin?
>>
>> --
>> Daniel Ruggeri
>>
>>
> (unrefined, right out of my ...  head)
>
> useful to have a convention (if not API) for how this info is made
> available for logging, etc., so that other modules can play the same game
> (e.g., mod_jk, FastCGI, whatever)
>
> what about a plugin with optional functions that has APIs for recording
> backend state that is meaningful across variety of "gateway" modules?  for
> now maybe it is just for logging, but it could save in shared memory for
> extraction in mod_status or other reports
>
> maybe that solution is a bit farfetched, but I guess the theme is that
> creating a proxy-specific solution can be a wasted effort given the same
> need for any number of other "gateway" modules
>
> --
> Born in Roswell... married an alien...
> http://emptyhammock.com/
>


A little more refined...  (see attached .h file)

It wouldn't take too many lines of code to provide logging (custom access
log format with various format strings).  A "few" more and it is in shared
memory for a scoreboard display so you can see where your requests are
getting stalled when talking to LDAP servers or app servers or whatever...

Having a callable API for modules will help with consistency of
representation and largely eliminate questions of what can/should be
represented.

-- 
Born in Roswell... married an alien...
http://emptyhammock.com/
/* Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/**
 * @file  mod_backend_state.h
 * @brief State collection and report services for modules which communicate
 * with backend servers
 *
 * @defgroup MOD_BACKEND_STATE mod_backend_state
 * @ingroup  APACHE_MODS
 * @{
 */

#ifndef MOD_BACKEND_STATE_H
#define MOD_BACKEND_STATE_H

#include "httpd.h"
#include "apr_optional.h"

#ifdef __cplusplus
extern "C" {
#endif

typedef enum {
    BS_PHASE_CERT_VAL,
    BS_PHASE_AUTHN,
    BS_PHASE_AUTHZ,
    BS_PHASE_HANDLER,
    BS_PHASE_FILTER_IN,
    BS_PHASE_FILTER_OUT,
} backend_state_phase;

typedef enum {
    BS_SUBPHASE_GENERIC,
    BS_SUBPHASE_LOOKUP,
    BS_SUBPHASE_CONNECT,
    BS_SUBPHASE_WRITE_FIRST,
    BS_SUBPHASE_WRITE_MORE,
    BS_SUBPHASE_READ_FIRST,
    BS_SUBPHASE_READ_MORE,
    BS_SUBPHASE_CLOSING,
} backend_state_subphase;

#define BS_PROTOCOL_HTTPS "https"
#define BS_PROTOCOL_HTTP "http"
#define BS_PROTOCOL_AJP "ajp"
#define BS_PROTOCOL_FASTCGI "fcgi"
/* or some custom string */

/**
 * State reporting function.
 * Call this when a peer is determined or when connection/request
 * processing phase or subphase changes.
 */
APR_DECLARE_OPTIONAL_FN(int, backend_state, (conn_rec *c,
                                             request_rec *r,
                                             backend_state_phase,
                                             backend_state_subphase,
                                             apr_sockaddr_t *peer,
                                             const char *protocol));

/** 
 * Data transfer reporting function.
 * Call this to report

Re: Exposing more loggable data from the proxy

2013-10-25 Thread Daniel Ruggeri
On 10/25/2013 8:16 AM, Jeff Trawick wrote:
> (unrefined, right out of my ...  head)

:-)


> useful to have a convention (if not API) for how this info is made
> available for logging, etc., so that other modules can play the same
> game (e.g., mod_jk, FastCGI, whatever)

I think what I'd be proposing is to follow in the footsteps that mod_ssl
set but MAKE it the general convention. A lot of env variables are
available so you can do all sorts of stuff. The *other* convention could
simply be the notes mechanism but I think there are better ways. In any
event, there should be some good documentation to let server admins know
what's available - that's the main reason I even suggested adding a
directive (so the list of ENV entries is exposed like mod_ssl has
documented)


> what about a plugin with optional functions that has APIs for
> recording backend state that is meaningful across variety of "gateway"
> modules?  for now maybe it is just for logging, but it could save in
> shared memory for extraction in mod_status or other reports
>
> maybe that solution is a bit farfetched, but I guess the theme is that
> creating a proxy-specific solution can be a wasted effort given the
> same need for any number of other "gateway" modules

The main reason I lean toward env is because a few modules already use
env variables in meaningful ways today (big use-cases for me are log,
rewrite, headers, expr).
So... in that case, I think unrefined kinda works in our favor.



On 10/25/2013 9:10 AM, Yann Ylavic wrote:
> How about setting backend->r to r->backend (when applicable)?

At a cursory look, that'd probably suffice if I wanted to log FROM
mod_proxy. I wasn't clear, but my initial goal was to just add a
%{proxy_addr_family}e chunk in the access log format so the request and
target info is handy on a single line. As mentioned to Jeff, I'd also
like a little more utility from the work involved. Plus, as you say, I
haven't looked to see how well those lifecycles mesh.
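
For illustration, the single-line logging goal might look like the
fragment below once such a variable exists. This is only a sketch:
proxy_addr_family is the name used in the paragraph above and nothing
sets it today, and the format nickname "proxied" and the log path are
made up for the example. The %{VARNAME}e format string itself is the
existing mod_log_config way to log an environment variable.

```
# Hypothetical: assumes something has set proxy_addr_family in the
# request environment before the logging phase runs.
LogFormat "%h %l %u %t \"%r\" %>s %b %{proxy_addr_family}e" proxied
CustomLog "logs/access_log" proxied
```

If the variable is unset for a request, mod_log_config logs "-" in
that position.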



On 10/25/2013 9:21 AM, Eric Covener wrote:
> Given how simple the data is compared to that underlying
> structure, I would rather see it in a flat table that other proxy-like
> things could emulate.  Similar to SSL variables.

That was kind of the idea... the stuff I'm looking for is pretty basic
but I wanted to shoot an email along the lines of, "Anyone else want a
beer while I'm in the fridge?" to see if there were some other
attributes that are worth gathering. And yes, SSL led me to the suggestion.

--
Daniel Ruggeri




Re: deflate_in_filter double ap_get_brigade() call

2013-10-25 Thread Yann Ylavic
On Fri, Oct 25, 2013 at 4:38 PM, Yann Ylavic  wrote:

> The deflate_in_filter() (in trunk) currently does :
>

Same in 2.4.x.


deflate_in_filter double ap_get_brigade() call

2013-10-25 Thread Yann Ylavic
Hello,

The deflate_in_filter() (in trunk) currently does :

    rv = ap_get_brigade(f->next, ctx->bb, AP_MODE_READBYTES, block, 10);
    if (rv != APR_SUCCESS) {
        return rv;
    }

    /* zero length body? step aside */
    bkt = APR_BRIGADE_FIRST(ctx->bb);
    if (APR_BUCKET_IS_EOS(bkt)) {
        ap_remove_input_filter(f);
        return ap_get_brigade(f->next, bb, mode, block, readbytes);
    }

Is it intended to call ap_get_brigade() a second time when the EOS was
already there the first time?
The second call is likely to return APR_EOF, which is quite different from
an EOS brigade...

Regards.
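
To make the concern concrete, here is a small stand-alone mock, not httpd
code. Everything in it (mock_upstream, mock_brigade, mock_get_brigade,
read_twice, read_once) is invented for the sketch, and it ASSUMES the
upstream filter reports end-of-file on any read after it has delivered
EOS, which is exactly the behaviour the question above calls "likely".

```c
#include <assert.h>

typedef enum { READ_OK, READ_EOF } read_status;

/* Mock upstream filter state: has the EOS marker been handed out yet? */
typedef struct {
    int eos_delivered;
} mock_upstream;

/* Mock brigade: only records whether it carries the EOS marker. */
typedef struct {
    int has_eos;
} mock_brigade;

/* ASSUMPTION: first read yields EOS, every later read yields EOF. */
static read_status mock_get_brigade(mock_upstream *up, mock_brigade *bb)
{
    assert(up != 0 && bb != 0);
    bb->has_eos = 0;
    if (up->eos_delivered) {
        return READ_EOF;
    }
    up->eos_delivered = 1;
    bb->has_eos = 1;
    return READ_OK;
}

/* Pattern from the mail: read, see EOS, then read a second time. */
static read_status read_twice(mock_upstream *up, mock_brigade *out)
{
    mock_brigade peek;
    read_status rv = mock_get_brigade(up, &peek);
    if (rv != READ_OK) {
        return rv;
    }
    if (peek.has_eos) {
        return mock_get_brigade(up, out);   /* the second call */
    }
    *out = peek;
    return READ_OK;
}

/* Alternative: pass the EOS that was already read on to the caller. */
static read_status read_once(mock_upstream *up, mock_brigade *out)
{
    return mock_get_brigade(up, out);
}
```

Under that assumption the caller of read_twice() sees an error status
and no EOS, while the caller of read_once() sees success with EOS.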


Re: Exposing more loggable data from the proxy

2013-10-25 Thread Eric Covener
On Fri, Oct 25, 2013 at 10:10 AM, Yann Ylavic  wrote:
> How about setting backend->r to r->backend (when applicable)?
>
> For now backend->r is pooled to backend's connection, and has a different
> lifetime than r but maybe it is worth having it destroyed with r (setting
> things related to the origin's connection to NULL, when released, to avoid
> invalid accesses)...
>
> Hence r->backend->useragent_addr/useragent_ip (useragent is the origin here)
> are accessible everywhere, and so are backend->r->headers_* and everything
> related to the request/response sent/received to/from the backend.
>
> I may be missing a lot of things with backend->r's lifetime, since it is now
> attached to origin->pool, surely for good reasons...
> Also, not all proxy modules use backend->r, but maybe these can create a
> fake request too with everything NULL except sockaddr (and other relevant
> things, if any).

Given how simple the data is compared to that underlying
structure, I would rather see it in a flat table that other proxy-like
things could emulate.  Similar to SSL variables.


Re: Exposing more loggable data from the proxy

2013-10-25 Thread Yann Ylavic
How about setting backend->r to r->backend (when applicable)?

For now backend->r is pooled to backend's connection, and has a different
lifetime than r but maybe it is worth having it destroyed with r (setting
things related to the origin's connection to NULL, when released, to avoid
invalid accesses)...

Hence r->backend->useragent_addr/useragent_ip (useragent is the origin
here) are accessible everywhere, and so are backend->r->headers_* and
everything related to the request/response sent/received to/from the
backend.

I may be missing a lot of things with backend->r's lifetime, since it is now
attached to origin->pool, surely for good reasons...
Also, not all proxy modules use backend->r, but maybe these can create a
fake request too with everything NULL except sockaddr (and other relevant
things, if any).



On Fri, Oct 25, 2013 at 3:07 PM, Daniel Ruggeri wrote:

> As I stand up a simple IPv6 test proxy that supports both AF_INET and
> AF_INET6 addresses, I was looking for a way to log what addr family (and
> maybe the IP address) mod_proxy settled on for each request in the
> access_log. I'm not seeing a way to do that (but correct me if I'm
> missing something) and was poking through the code and got to thinking
> that there are all kinds of data bits that'd be interesting to have
> available in the ENV.
>
> I'm thinking it'd be worth adding a directive (ProxyAddEnvironment?)
> that adds these ENV entries to each r->subprocess_env:
>  * Host header sent to backend (useful when dynamic targets are used)
>  * Target DNS name if set
>  * Target IP address
>  * Target Address family
>  * Target port
>  * Target connection protocol
>  * Flag for SSL enabled
>
> All of the data is readily available once a connection is acquired in
> ap_proxy_acquire_connection sans the HTTP Host header.
> Aside from logging, exporting these as ENV entries to the request allows
> us to do all sorts of stuff in other modules, too
>
> Any thoughts? Is there something I should include or exclude before I
> begin?
>
> --
> Daniel Ruggeri
>
>


Re: error log providers, multiple vhosts, mod_syslog

2013-10-25 Thread Jeff Trawick
I'm hoping to get back to this discussion "soon."  I also have a 2.4.x
patch I've played with that is roughly in sync with trunk (though syslog
logging support isn't a provider in the 2.4.x version).


On Thu, Oct 17, 2013 at 8:33 AM, Plüm, Rüdiger, Vodafone Group <
ruediger.pl...@vodafone.com> wrote:

>
>
> > -----Original Message-----
> > From: Jan Kaluža
> > Sent: Thursday, 17 October 2013 13:21
> > To: dev@httpd.apache.org
> > Subject: Re: error log providers, multiple vhosts, mod_syslog
> >
> > There's another problem with log providers and vhosts and I think I have
> > no idea how to fix it without doing dirty hacks...
> >
> > The problem is with the ap_open_logs function, which does the following:
> >
> > 1. Main server log is opened (open_error_log()). If this log uses error
> > log provider, s_main->error_log is set to NULL.
> >
> > 2. When there is no s_main->error_log, stderr is redirected to /dev/null.
>
> Hmm. This points out another issue when using an error log provider for
> the main server log:
> We lose everything that the server or other programs like CGI scripts
> write to the stderr FD as it is simply written to /dev/null. Don't we
> need to have a separate process in this case that, like a piped logger,
> reads from the reading end of the "stderr pipe" and writes it via
> ap_server_conf->errorlog_provider->writer to the log?
>
> >
> > 3. Error logs for vhosts are opened (another open_error_log()). If there
> > is some problem when opening these logs, any ap_log_error() call is sent
> > to /dev/null.
>
> Wouldn't it try to call s->errorlog_provider->writer in line 1196 and
> segfault if s->errorlog_provider is NULL because of some bad "classic"
> configuration that simply failed?
>
> Regards
>
> Rüdiger
>



-- 
Born in Roswell... married an alien...
http://emptyhammock.com/


Re: stop copying footers to r->headers_in?

2013-10-25 Thread Eric Covener
On Fri, Oct 25, 2013 at 9:12 AM, Yann Ylavic  wrote:
> On Thu, Oct 24, 2013 at 7:42 PM, Eric Covener  wrote:
>>
>> On Wed, Oct 23, 2013 at 8:04 AM, Yann Ylavic  wrote:
>> >> 1) add r->footers_in and use it in 2.2 and up by default
> >> > Does that mean no API/ABI change?
>>
>> In the sense that it needs to be backportable, yes -- but it will mean
>> a behavior change to "existing" APIs. Depends on how you interpret it
>> I guess, but I think the confusion over trailers/headers is something
> >> that needs to be forcibly corrected in those service releases.
>
>
> Concretely, is request_rec::trailers_in/out (at the very end of the struct)
> something backportable or should the footers be stored elsewhere?

yes, that's fine.

>
> According to [my understanding of] the rfc2616 and/or
> draft-ietf-httpbis-p1-messaging-24 about the HTTP trailer (chunking, TE and
> Trailer sections), they are hop-by-hop (anyway negotiable/negotiated
> hop-by-hop).
> They could then be stored in something more related to the request's
> connection(s), for *example* in the http_ctx_t used by ap_http_filter for
> input, which could probably be shared with protocol output filters too for
> the trailers_out.
> There is still the need to access them from r, for *example* (still) with
> something like the request_config of the http_module.
> But this looks like a big hack compared to the (simple)
> r->trailers_in/out...
>
> So, is the backportability something to care about for trunk (now) or things
> should be kept [as] simple [as possible] there and complications arise when
> backporting (using a different storage/access for example)?

Generally would consider it as it goes into trunk, and jump through
some level of hoops to keep them common when possible.

>
> My plan was to use request_rec for now and change this later if needed,
> but it would be useful if someone could suggest a backportable way for
> that before I hit the wall...

+1
-- 
Eric Covener
cove...@gmail.com


Re: Exposing more loggable data from the proxy

2013-10-25 Thread Jeff Trawick
On Fri, Oct 25, 2013 at 9:07 AM, Daniel Ruggeri wrote:

> As I stand up a simple IPv6 test proxy that supports both AF_INET and
> AF_INET6 addresses, I was looking for a way to log what addr family (and
> maybe the IP address) mod_proxy settled on for each request in the
> access_log. I'm not seeing a way to do that (but correct me if I'm
> missing something) and was poking through the code and got to thinking
> that there are all kinds of data bits that'd be interesting to have
> available in the ENV.
>
> I'm thinking it'd be worth adding a directive (ProxyAddEnvironment?)
> that adds these ENV entries to each r->subprocess_env:
>  * Host header sent to backend (useful when dynamic targets are used)
>  * Target DNS name if set
>  * Target IP address
>  * Target Address family
>  * Target port
>  * Target connection protocol
>  * Flag for SSL enabled
>
> All of the data is readily available once a connection is acquired in
> ap_proxy_acquire_connection sans the HTTP Host header.
> Aside from logging, exporting these as ENV entries to the request allows
> us to do all sorts of stuff in other modules, too
>
> Any thoughts? Is there something I should include or exclude before I
> begin?
>
> --
> Daniel Ruggeri
>
>
(unrefined, right out of my ...  head)

useful to have a convention (if not API) for how this info is made
available for logging, etc., so that other modules can play the same game
(e.g., mod_jk, FastCGI, whatever)

what about a plugin with optional functions that has APIs for recording
backend state that is meaningful across variety of "gateway" modules?  for
now maybe it is just for logging, but it could save in shared memory for
extraction in mod_status or other reports

maybe that solution is a bit farfetched, but I guess the theme is that
creating a proxy-specific solution can be a wasted effort given the same
need for any number of other "gateway" modules

-- 
Born in Roswell... married an alien...
http://emptyhammock.com/


Re: stop copying footers to r->headers_in?

2013-10-25 Thread Yann Ylavic
On Thu, Oct 24, 2013 at 7:42 PM, Eric Covener  wrote:

> On Wed, Oct 23, 2013 at 8:04 AM, Yann Ylavic  wrote:
> >> 1) add r->footers_in and use it in 2.2 and up by default
> > Does that mean no API/ABI change?
>
> In the sense that it needs to be backportable, yes -- but it will mean
> a behavior change to "existing" APIs. Depends on how you interpret it
> I guess, but I think the confusion over trailers/headers is something
> that needs to be forcibly corrected in those service releases.
>

Concretely, is request_rec::trailers_in/out (at the very end of the struct)
something backportable or should the footers be stored elsewhere?

According to [my understanding of] the rfc2616 and/or
draft-ietf-httpbis-p1-messaging-24 about the HTTP trailer (chunking, TE and
Trailer sections), they are hop-by-hop (anyway negotiable/negotiated
hop-by-hop).
They could then be stored in something more related to the request's
connection(s), for *example* in the http_ctx_t used by ap_http_filter for
input, which could probably be shared with protocol output filters too for
the trailers_out.
There is still the need to access them from r, for *example* (still) with
something like the request_config of the http_module.
But this looks like a big hack compared to the (simple)
r->trailers_in/out...

So, is backportability something to care about for trunk (now), or should
things be kept [as] simple [as possible] there, with complications handled
when backporting (using a different storage/access, for example)?

My plan was to use request_rec for now and change this later if needed,
but it would be useful if someone could suggest a backportable way for
that before I hit the wall...


> Thanks again for the help!
>

And you for your feedback.

Regards.


Exposing more loggable data from the proxy

2013-10-25 Thread Daniel Ruggeri
As I stand up a simple IPv6 test proxy that supports both AF_INET and
AF_INET6 addresses, I was looking for a way to log what addr family (and
maybe the IP address) mod_proxy settled on for each request in the
access_log. I'm not seeing a way to do that (but correct me if I'm
missing something) and was poking through the code and got to thinking
that there are all kinds of data bits that'd be interesting to have
available in the ENV.

I'm thinking it'd be worth adding a directive (ProxyAddEnvironment?)
that adds these ENV entries to each r->subprocess_env:
 * Host header sent to backend (useful when dynamic targets are used)
 * Target DNS name if set
 * Target IP address
 * Target Address family
 * Target port
 * Target connection protocol
 * Flag for SSL enabled

All of the data is readily available once a connection is acquired in
ap_proxy_acquire_connection sans the HTTP Host header.
Aside from logging, exporting these as ENV entries to the request allows
us to do all sorts of stuff in other modules, too

Any thoughts? Is there something I should include or exclude before I begin?

--
Daniel Ruggeri
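
As a rough sketch of the per-target values in the list above, the
stand-alone C below derives the address family, printable IP address and
port from a plain POSIX socket address. It is not mod_proxy code: the
proxy_target_info struct and describe_target() are hypothetical names
invented for the example. In httpd the resulting strings would
presumably be stored in r->subprocess_env (for instance with
apr_table_setn()), under variable names that are still to be decided.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical flat record of values the proposal would export. */
typedef struct {
    const char *family;           /* "AF_INET", "AF_INET6" or "unknown" */
    char addr[INET6_ADDRSTRLEN];  /* printable target IP address */
    unsigned port;                /* target port, host byte order */
} proxy_target_info;

/* Fill 'out' from a peer address. Returns 0 on success, -1 when the
 * address family is unsupported or the address cannot be formatted. */
static int describe_target(const struct sockaddr *sa, proxy_target_info *out)
{
    memset(out, 0, sizeof(*out));
    out->family = "unknown";

    if (sa->sa_family == AF_INET) {
        const struct sockaddr_in *in4 = (const struct sockaddr_in *)sa;
        out->family = "AF_INET";
        out->port = ntohs(in4->sin_port);
        return inet_ntop(AF_INET, &in4->sin_addr,
                         out->addr, sizeof(out->addr)) ? 0 : -1;
    }
    if (sa->sa_family == AF_INET6) {
        const struct sockaddr_in6 *in6 = (const struct sockaddr_in6 *)sa;
        out->family = "AF_INET6";
        out->port = ntohs(in6->sin6_port);
        return inet_ntop(AF_INET6, &in6->sin6_addr,
                         out->addr, sizeof(out->addr)) ? 0 : -1;
    }
    return -1;
}
```

Keeping the result as a handful of flat strings and numbers is what
makes it easy to expose as ENV entries, in the same spirit as the
mod_ssl variables.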



Re: Error "scoreboard is full, not at MaxRequestWorkers" without traffic

2013-10-25 Thread Thomas Eckert
Regarding normal traffic handling, where the thread count might change over
time, I can see your point. But with a 'static' state, such as right after
start up with no traffic, how would

  (idle_thread_count < min_spare_threads)

be triggered ? I would expect the thread count to be unchanging as there
are no reasons to shut down active threads - both since (MinSpareThreads <
MaxSpareThreads) and (idle_thread_count == min_spare_threads).
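
To spell out the arithmetic behind that expectation, here is a tiny
stand-alone sketch of the two conditions quoted below. It is not the MPM
code: the mpm_snapshot struct and both helper functions are hypothetical,
and the start-up figure of 150 idle threads is my ASSUMPTION (StartServers
3 x ThreadsPerChild 50, all threads idle).

```c
#include <stdbool.h>

/* Hypothetical snapshot of the counters named in the quoted conditions. */
typedef struct {
    int idle_thread_count;
    int active_thread_count;
    int min_spare_threads;   /* MinSpareThreads */
    int daemons_limit;       /* ServerLimit */
    int threads_per_child;   /* ThreadsPerChild */
} mpm_snapshot;

/* Outer condition: is the server short of idle threads at all? */
static bool needs_more_idle_threads(const mpm_snapshot *s)
{
    return s->idle_thread_count < s->min_spare_threads;
}

/* Inner condition: is every possible worker slot already busy? */
static bool at_worker_limit(const mpm_snapshot *s)
{
    return s->active_thread_count >= s->daemons_limit * s->threads_per_child;
}
```

With idle = 150, active = 0, MinSpareThreads = 25, ServerLimit = 30 and
ThreadsPerChild = 50, neither condition holds, which is why the message
right after start-up is surprising.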



On Fri, Oct 18, 2013 at 5:19 PM, Jim Jagielski  wrote:

> The mojo for this, in both worker and event is:
>
> (active_thread_count >= ap_daemons_limit * threads_per_child)
>
> which is only worried about if:
>
> (idle_thread_count < min_spare_threads)
>
> On Oct 18, 2013, at 10:22 AM, Thomas Eckert 
> wrote:
>
> > Hey folks,
> >
> > there's been quite a few issues with the "scoreboard is full, not at
> MaxRequestWorkers" error log message. From what I've found all of them
> involved some sort of heavy traffic. I've been wondering about the root
> cause for this but now I have a reverse proxy showing this message right
> after starting up - no traffic involved at all.
> >
> > I'm getting the message for a simple reverse proxy with about a hundred
> virtual hosts which is strange because
> >
> > StartServers 3
> > ServerLimit 30
> > MaxClients 1500
> > MinSpareThreads 25
> > MaxSpareThreads 75
> > ThreadsPerChild 50
> > MaxRequestsPerChild 0
> >
> > doesn't look like it cannot handle 100 hosts without traffic. The vhosts
> themselves look like
> >
> > 
> > ServerName my_domain_3
> > SSLProxyEngine On
> > RequestHeader set X-Forwarded-Proto http
> > 
> > ProxyPass balancer://58be3a18b1c6590f106e3536d31d0447/
> lbmethod=bybusyness
> > ProxyPassReverse
> balancer://58be3a18b1c6590f106e3536d31d0447/
> > Require all granted
> > 
> >  >
> > which is really basic. There is only one virtual host with a "real"
> configuration (setup is part of a scalability test).
> >
> > Again, there is no traffic involved, it's happening right after start
> up. I checked the processes and the start up is clean, meaning there are no
> old left-overs that might interfere somehow.
> >
> > Any hints on how to debug this ?
>
>