Re: Problem with mod_fcgid handling ErrorDocuments

2010-07-08 Thread Edgar Frank
2010/07/07 Graham Dumpleton:
>On 7 July 2010 11:43, Graham Dumpleton wrote:
>> On 6 July 2010 22:56, Edgar Frank  wrote:
>>> Hi mod_fcgid developers,
>>>
>>> I'm currently exploring a potential problem with mod_fcgid.
>>> Let's assume a setup with mod_security and mod_fcgid
>>> (has nothing to do with mod_security itself - it just helps to
>>> trigger the problem).
>>>
>>> Now we have a large POST request which mod_security blocks
>>> (by SecRequestBodyLimit) with "413 Request Entity Too Large".
>>
>> Presumably it might also occur with LimitRequestBody and not even
>> need mod_security. Unless that is that this is a problem with
>> mod_security itself and how it handles a 413.
>>
>> So, is the issue with mod_fcgid? If there is an issue here with
>> ErrorDocument for a 413 where the handler is a proxy of some form,
>> then likely could affect other modules besides mod_fcgid.
>>
>> I would be investigating where ErrorDocument for 413 is handed off
>> to URL implemented by CGI or even mod_proxy to see what happens.

Sorry, this isn't clear to me. Could you please clarify?

Debugging showed that when mod_security rejects the request because
it's oversized, ap_run_post_read_request in protocol.c:1011 returns 413,
in which case the regular ap_die() handling kicks in at protocol.c:1012
(line numbers refer to the 2.2.15 tag).
Then, in http_request.c:425 (internal_internal_redirect calls
ap_run_post_read_request), mod_security returns DECLINED, allowing an
error page to be served instead of recursively returning 413.

So I'd assume that it's not a mod_security issue?

>>> The ErrorDocument for 413 is configured to a Location which
>>> mod_fcgid serves. (Please don't argue that it's this way - I know
>>> the problems and I'm not happy with it, but it's not my decision
>>> to do it that way.)
>>>
>>> HTTPD issues a GET subrequest for the ErrorDocument and
>>> mod_fcgid kicks in. But now it starts consuming the request body
>>> we just blocked - or if the request body size is larger than
>>> FcgidMaxRequestLen, ErrorDocument generation fails.
>>>
>>> I wonder how to circumvent this. In fcgid_bridge.c:bridge_request
>>> I found:
>>>
>>> if (role == FCGI_RESPONDER) {
>>>  rc = add_request_body( [...] );
>>> }
>>>
>>> Could one change this to something like the following without
>>> causing trouble?
>>>
>>> if (role == FCGI_RESPONDER && !ap_is_HTTP_ERROR(r->status)) {
>>>  rc = add_request_body( [...] );
>>> }
>>>
>>> Or maybe something like an HTTP method check? (Is there a
>>> reliable way to detect if we're in ErrorDocument generation
>>> anyway?)
>
> Addressing this question, r->prev attribute should refer to original
> request and so can possibly check r->prev->status.
>
> Also, the following gets set:
>
> apr_table_setn(new->subprocess_env, "REDIRECT_STATUS",
> apr_itoa(r->pool, r->status));
>
> before internal redirect. This is more of interest where CGI, or
> FASTCGI script actually gets executed as that wouldn't have access to
> the original request object.

Oh, my fault - it's an internal redirect, not a subrequest. Thanks for
pointing this out.

At this point in mod_fcgid, why not access r->prev->status directly? Or
have I misunderstood you? This would save a table lookup and a
string-to-int conversion. Or did you mean in the CGI script (after a
request body might have been dropped)?
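
For the script side, a minimal sketch in C of reading REDIRECT_STATUS from the environment might look like this. It assumes the value reaches the (Fast)CGI process as an ordinary environment variable; the function names parse_redirect_status and running_as_error_document are hypothetical and not part of httpd or mod_fcgid.

```c
#include <assert.h>
#include <stdlib.h>

/* Parse a REDIRECT_STATUS value as set by httpd before an internal
 * redirect. Returns the status code, or 0 if the value is missing or
 * is not a plausible HTTP status. */
int parse_redirect_status(const char *value)
{
    char *end = NULL;
    long status;

    if (value == NULL || *value == '\0')
        return 0;

    status = strtol(value, &end, 10);
    assert(end != NULL);            /* strtol always sets the end pointer */

    if (*end != '\0' || status < 100 || status > 599)
        return 0;                   /* trailing garbage or out of range */
    return (int)status;
}

/* Hypothetical helper: returns 1 when the script was reached through an
 * internal redirect for an error status (4xx or 5xx), otherwise 0. */
int running_as_error_document(void)
{
    return parse_redirect_status(getenv("REDIRECT_STATUS")) >= 400;
}
```

A script could call running_as_error_document() before deciding whether to read anything from stdin.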

I thought about it for a while and came to the conclusion that one would
have to check ap_status_drops_connection(r->prev->status) (assuming that
r->prev != NULL). If this is false, we'd have to consume the request
body from the input filter chain to allow future (keep-alive) requests
to be parsed correctly. If it's true, the input filter chain shouldn't
be touched (the connection will be dropped anyway) and the Content-Length,
Transfer-Encoding and Expect headers should be removed so as not to
confuse the (Fast)CGI script.
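
The decision described above can be sketched in plain C without the httpd headers. Everything here is a stand-in: mock_request replaces request_rec, and mock_status_drops_connection replaces ap_status_drops_connection with a status list that is only an assumption modeled on the httpd macro (httpd.h has the authoritative list). The names body_action and choose_body_action are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for httpd's request_rec; only the fields this sketch needs. */
typedef struct mock_request {
    int status;
    const struct mock_request *prev;  /* original request, or NULL */
} mock_request;

typedef enum {
    BODY_FORWARD,  /* normal request: pass the body to the backend */
    BODY_DISCARD,  /* error document, connection kept: drain the body */
    BODY_IGNORE    /* error document, connection dropped: leave the input
                    * filters alone and strip Content-Length,
                    * Transfer-Encoding and Expect before calling the
                    * backend */
} body_action;

/* Stand-in for ap_status_drops_connection(). The status list is an
 * assumption; consult httpd.h for the real macro. */
int mock_status_drops_connection(int status)
{
    switch (status) {
    case 400: case 408: case 411: case 413:
    case 414: case 500: case 501: case 503:
        return 1;
    default:
        return 0;
    }
}

/* Decide what to do with the request body of request r. */
body_action choose_body_action(const mock_request *r)
{
    assert(r != NULL);

    /* Not an internal redirect from an error: behave as before. */
    if (r->prev == NULL || r->prev->status < 400 || r->prev->status >= 600)
        return BODY_FORWARD;

    if (mock_status_drops_connection(r->prev->status))
        return BODY_IGNORE;

    return BODY_DISCARD;
}
```

In a real module the same test would run in bridge_request before add_request_body is called, with the header cleanup done before the environment for the backend is built.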

But these are just my thoughts, and I'm interested in whether I'm totally
off or whether you agree. Looking forward to hearing your comments.

Regards,
Edgar


Problem with mod_fcgid handling ErrorDocuments

2010-07-06 Thread Edgar Frank
Hi mod_fcgid developers,

I'm currently exploring a potential problem with mod_fcgid.
Let's assume a setup with mod_security and mod_fcgid
(has nothing to do with mod_security itself - it just helps to
trigger the problem).

Now we have a large POST request which mod_security blocks
(by SecRequestBodyLimit) with "413 Request Entity Too Large".

The ErrorDocument for 413 is configured to a Location which
mod_fcgid serves. (Please don't argue that it's this way - I know
the problems and I'm not happy with it, but it's not my decision
to do it that way.)

HTTPD issues a GET subrequest for the ErrorDocument and
mod_fcgid kicks in. But now it starts consuming the request body
we just blocked - or if the request body size is larger than
FcgidMaxRequestLen, ErrorDocument generation fails.

I wonder how to circumvent this. In fcgid_bridge.c:bridge_request
I found:

if (role == FCGI_RESPONDER) {
 rc = add_request_body( [...] );
}

Could one change this to something like the following without
causing trouble?

if (role == FCGI_RESPONDER && !ap_is_HTTP_ERROR(r->status)) {
 rc = add_request_body( [...] );
}

Or maybe something like an HTTP method check? (Is there a
reliable way to detect if we're in ErrorDocument generation
anyway?) But at this point we have put the Content-Length header
already into the stream to the FCGI backend, so one would also
have to take action earlier.

What do you think in general about handling this? I'd really
appreciate a detailed answer: if you find it fix-worthy,
first ideas on how to fix it, and if not, why not.

Regards,
Edgar

FYI:
mod_fcgid 2.3.5
with httpd 2.2.15
on CentOS 5.4 x64
built from source with gcc 4.2.4


Re: [mod_fcgid] Feedback / Suggestions

2009-11-26 Thread Edgar Frank
On Tue, Nov 24, 2009, Jeff Trawick wrote:
> On Tue, Nov 24, 2009, Edgar Frank wrote:
> > (1)
> > mod_fcgid should be capable of specifying an external FCGI server.
> > (2)
> > In conjunction with (1), mod_fcgid should be able to select the backend
> > server based on request data.
> 
> I'd much rather see effort put into mod_proxy_fcgi to support this use
> case.  I wish somebody, perhaps myself, had time to work on it.  It
> doesn't seem that hard a task.

Just an idea I thought about: what would you think about bringing
mod_proxy_fcgi up to the FCGI code base of mod_fcgid? This could be
difficult, as mod_fcgid is maintained separately at the moment, but IMHO
mod_fcgid has much more sophisticated code for FCGI communication (e.g.
using its own bucket types) and for adhering to the CGI standard.

If, in theory, both were sitting on the same code base, any fixes, changes or
enhancements in FCGI handling would improve mod_proxy_fcgi, too.
This way, mod_proxy_fcgi would be the superior choice for externally spawned
workers (where dynamic worker selection really starts to make sense) while
mod_fcgid takes care of local workers and process management, and each module
would serve its own purpose.

Okay, this has some downsides, too, like increased testing overhead, but in
general I like the idea.

I would be glad to hear your thoughts.

Regards,
Edgar



Re: [mod_fcgid] Feedback / Suggestions

2009-11-25 Thread Edgar Frank
2009/11/25 Olaf van der Spek 
> > Yes, you're right. In a FCGI_GET_VALUES request, the backend can send
> > arbitrary name-value-pairs. Unfortunately there is no standard way to tell
> > the frontend that this feature is supported. Maybe, making the name (and
> > expected value) of this name-value-pair configurable in mod_fcgid could
> > be a reasonable way.
> 
> Doesn't sound reasonable either. If you introduce such a feature, it
> should simply be coordinated with other FastCGI stakeholders.

You're right, again. I'm just wondering if such a change is likely to happen.
Last update was 2002. I'd like to propose a change to the FastCGI
stakeholders, if this makes any sense from a single, unrelated person.

Edgar


Re: [mod_fcgid] Feedback / Suggestions

2009-11-25 Thread Edgar Frank
2009/11/26 Olaf van der Spek 
> On Wed, Nov 25, 2009 at 11:37 AM, Edgar Frank  wrote:
> > Maybe, in implementing this in mod_fcgid and making it configurable,
> > Apache can serve more intelligent backends better.
> 
> Wouldn't it be better to have the backend tell the frontend that it
> supports this feature? Manual configuration should be avoided if
> possible.

Yes, you're right. In a FCGI_GET_VALUES request, the backend can send
arbitrary name-value-pairs. Unfortunately there is no standard way to tell the
frontend that this feature is supported. Maybe, making the name (and expected
value) of this name-value-pair configurable in mod_fcgid could be a reasonable
way.
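
For reference, a minimal sketch in C of how one name-value pair is encoded in FCGI_GET_VALUES and FCGI_GET_VALUES_RESULT records, as I read the FastCGI spec: each length takes one byte below 128, otherwise four bytes with the high bit of the first byte set. The function names are hypothetical, and lengths above 2^31-1 are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write a FastCGI length field: one byte for values below 128,
 * otherwise four bytes big-endian with the high bit set. */
static size_t put_length(unsigned char *out, size_t len)
{
    if (len < 128) {
        out[0] = (unsigned char)len;
        return 1;
    }
    out[0] = (unsigned char)(((len >> 24) & 0x7f) | 0x80);
    out[1] = (unsigned char)(len >> 16);
    out[2] = (unsigned char)(len >> 8);
    out[3] = (unsigned char)len;
    return 4;
}

/* Encode one name-value pair into out (capacity cap).
 * Returns the number of bytes written, or 0 if out is too small. */
size_t fcgi_encode_pair(unsigned char *out, size_t cap,
                        const char *name, const char *value)
{
    size_t nlen, vlen, need, pos;

    assert(out != NULL && name != NULL && value != NULL);
    nlen = strlen(name);
    vlen = strlen(value);
    need = (nlen < 128 ? 1 : 4) + (vlen < 128 ? 1 : 4) + nlen + vlen;
    if (need > cap)
        return 0;

    pos = put_length(out, nlen);
    pos += put_length(out + pos, vlen);
    memcpy(out + pos, name, nlen);
    pos += nlen;
    memcpy(out + pos, value, vlen);
    pos += vlen;
    return pos;
}
```

A configurable capability name, as suggested above, would simply be another name passed to such an encoder next to the standard ones like FCGI_MPXS_CONNS.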

Edgar



Re: [mod_fcgid] Feedback / Suggestions

2009-11-25 Thread Edgar Frank
2009/11/25 Graham Dumpleton 
> 2009/11/25 Edgar Frank :
> > While delving into the FCGI and CGI spec, I encountered another reason not
> > to stream client data directly. CGI wants an explicitly set CONTENT_LENGTH
> > and FCGI enforces rather than obsoletes this (last sentence in 6.2 of the
> > FCGI spec).
> > If the client sends for any reason a message body with no CONTENT_LENGTH
> > set or CONTENT_LENGTH to be ignored as defined by RFC2616, you have to
> > read the full message body to determine the correct content length
> > which should be transferred to the backend.
> 
> Things can get worse. Even if CONTENT_LENGTH is sent, if you have
> requests with compressed content which is decompressed by mod_deflate,
> the amount of content will not actually match what CONTENT_LENGTH says
> there will be as it reflects how things are before content is
> decompressed.

I meant to imply this: what I originally had in mind was "read the full
message body through all input filters". Thanks for pointing this out.

> Don't know about FASTCGI in general, but for WSGI (Python higher level
> interface that can sit on CGI or FASTCGI) they have the stupid
> requirement that you take CONTENT_LENGTH as being precise and that you
> must not read more than CONTENT_LENGTH. If CONTENT_LENGTH isn't
> provided, WSGI says you are supposed to take it as meaning no data.
> 
> [...]
>
> Anyway, don't know if this is at all relevant to FASTCGI. As you point
> out though, the CONTENT_LENGTH requirement does at least prevent
> FASTCGI from handling chunked request content. WSGI specification has
> same stupid limitation.

At least if you adhere strictly to the spec, you have to do it this way for
FastCGI, too, although FastCGI provides the means to explicitly tell the
backend when the end of the stream is hit.
I can't say whether this is relevant in the real world, as there is still a
chance of more intelligent backends. I'll try it with PHP as soon as I find
time.
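
To make the end-of-stream point concrete, here is a minimal sketch in C of framing a request body as FCGI_STDIN records, closed by the empty record that marks end-of-stream. The header layout and the constants follow my reading of the FastCGI spec; the function names are hypothetical and the sketch writes no padding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FCGI_VERSION_1   1
#define FCGI_STDIN       5
#define FCGI_HEADER_LEN  8
#define FCGI_MAX_CONTENT 65535

/* Fill an 8-byte FastCGI record header. */
void fcgi_header(unsigned char *h, int type, unsigned request_id,
                 size_t content_len)
{
    assert(content_len <= FCGI_MAX_CONTENT);
    h[0] = FCGI_VERSION_1;
    h[1] = (unsigned char)type;
    h[2] = (unsigned char)(request_id >> 8);
    h[3] = (unsigned char)request_id;
    h[4] = (unsigned char)(content_len >> 8);
    h[5] = (unsigned char)content_len;
    h[6] = 0;   /* padding length: none */
    h[7] = 0;   /* reserved */
}

/* Frame body as FCGI_STDIN records followed by the empty record that
 * signals end-of-stream. Returns bytes written, or 0 if out is too small. */
size_t fcgi_frame_stdin(unsigned char *out, size_t cap, unsigned request_id,
                        const unsigned char *body, size_t body_len)
{
    size_t pos = 0;

    while (body_len > 0) {
        size_t chunk = body_len > FCGI_MAX_CONTENT ? FCGI_MAX_CONTENT
                                                   : body_len;
        if (cap - pos < FCGI_HEADER_LEN + chunk)
            return 0;
        fcgi_header(out + pos, FCGI_STDIN, request_id, chunk);
        memcpy(out + pos + FCGI_HEADER_LEN, body, chunk);
        pos += FCGI_HEADER_LEN + chunk;
        body += chunk;
        body_len -= chunk;
    }

    if (cap - pos < FCGI_HEADER_LEN)
        return 0;
    fcgi_header(out + pos, FCGI_STDIN, request_id, 0);  /* end-of-stream */
    return pos + FCGI_HEADER_LEN;
}
```

A backend that relies on the empty FCGI_STDIN record rather than on CONTENT_LENGTH could in principle consume a body whose length was unknown up front.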

Maybe, by implementing this in mod_fcgid and making it configurable, Apache
could serve more intelligent backends better.

Regards,
Edgar



Re: [mod_fcgid] Feedback / Suggestions

2009-11-25 Thread Edgar Frank
> On Tue, Nov 24, 2009 at 05:07 PM, Jeff Trawick  wrote:
> >>> Or otherwise, can someone explain the details to me why it is as it is?
> >>> Especially in terms of not pipelining data directly (maybe after a little
> >>> buffering to build proper FCGI packets)? The comment in
> >>> fcgid_bridge.c:452 (add_request_body) left me clueless. Why would this
> >>> keep the server in processing too long? Processing takes its time either
> >>> way, I'd assume. Looking forward to enlightenment. :)
> >>
> >> I can only guess that the problem at hand when this was implemented
> >> was that some backend application processes were so expensive that
> >> they couldn't be tied up until all data had been read from slow
> >> clients.
> >>
> > Yes, Jeff is right :)
> 
> This is a reasonable feature; once streaming to the app is implemented
> this alternate mechanism can be enabled with a per-request envvar
> (e.g., SetEnv in the directory or location).

Thanks for explaining this to me.

While delving into the FCGI and CGI specs, I encountered another reason not
to stream client data directly. CGI wants an explicitly set CONTENT_LENGTH,
and FCGI enforces rather than obsoletes this (last sentence in 6.2 of the
FCGI spec).
If the client for any reason sends a message body with no CONTENT_LENGTH
set, or with a CONTENT_LENGTH that is to be ignored as defined by RFC 2616,
you have to read the full message body to determine the correct content
length to transfer to the backend.
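
A minimal sketch in C of that idea: collect the chunks as they come out of the input filters, and only then derive the CONTENT_LENGTH value. The type body_buffer and both functions are hypothetical; a real module would use APR buckets and pools, and overflow of the capacity doubling is not handled here.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Growable buffer holding the request body after all input filters. */
typedef struct {
    unsigned char *data;
    size_t len;
    size_t cap;
} body_buffer;

/* Append one chunk. Returns 0 on success, -1 on allocation failure. */
int body_append(body_buffer *b, const void *chunk, size_t n)
{
    assert(b != NULL);
    if (n == 0)
        return 0;

    if (n > b->cap - b->len) {
        size_t new_cap = b->cap ? b->cap : 256;
        unsigned char *p;

        while (new_cap - b->len < n)
            new_cap *= 2;
        p = realloc(b->data, new_cap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = new_cap;
    }
    memcpy(b->data + b->len, chunk, n);
    b->len += n;
    return 0;
}

/* Format the CONTENT_LENGTH value once the whole body has been read.
 * Returns what snprintf returns. */
int body_content_length(const body_buffer *b, char *out, size_t cap)
{
    assert(b != NULL && out != NULL);
    return snprintf(out, cap, "%zu", b->len);
}
```

The length is only known after the last chunk, which is why the parameters sent to the backend cannot be completed while the body is still arriving.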

Regards,
Edgar



Re: [mod_fcgid] Feedback / Suggestions

2009-11-24 Thread Edgar Frank
> > (1)
> > mod_fcgid should be capable of specifying an external FCGI server.
> > (2)
> > In conjunction with (1), mod_fcgid should be able to select the
> > backend server based on request data.
> 
> I'd much rather see effort put into mod_proxy_fcgi to support this use
> case.  I wish somebody, perhaps myself, had time to work on it.  It
> doesn't seem that hard a task.

mod_proxy_fcgi already supports this use case completely, if it weren't
for the two bugs I filed.

While I see BZ#48273 as a not-so-big issue, BZ#48272 presents a potential
crash of the Apache process, which rules it out for production use.
As far as I understand the problem in BZ#48272, this is a real
design issue in mod_proxy_fcgi, as it tries to stream the response by
design while ap_scan_script_header_err_core wants the complete script
output by design.

> In the interim, is mod_fastcgi really that bad?

Let's say it works (fine) in the sense that you can specify an external
server.

I considered modifying mod_fastcgi to add the functionality for
dynamic backend switching, but refrained from doing so, as the source
code is made to work with Apache 1.x/2.0/2.2 with lots of #ifdefs, has
only partial APR support (a handful of things are done at the OS level)
and contains additional Win32 kludges.

Additionally, mod_fastcgi reads its backend servers at configuration
time and accesses them read-only at runtime. Modifying this would mean
adding thread safety to the source if you want to use it with the worker
MPM. Considering the source code, this would be an unpleasant job.



[mod_fcgid] Feedback / Suggestions

2009-11-24 Thread Edgar Frank
Hi dev,

I'd like to suggest the following changes / offer feedback for mod_fcgid:

(1)
mod_fcgid should be capable of specifying an external FCGI server.

Assume a scenario where the backend FCGI server (e.g. PHP with
php-fpm) runs prespawned on a separate machine for security reasons,
with a firewall letting only FCGI traffic through.

This is possible with mod_proxy_fcgi or mod_fastcgi (with
mod_proxy_fcgi only being in trunk and mod_fastcgi kinda outdated).

(2)
In conjunction with (1), mod_fcgid should be able to select the backend
server based on request data.

Currently, this kind of behaviour is possible with mod_proxy and
mod_proxy_fcgi using interpolated environment variables and/or
mod_rewrite with the [P] flag in a RewriteRule.

Unfortunately mod_proxy_fcgi currently exists only in trunk, seems to
be kind of abandoned and has some bugs (I filed two bug reports in BZ
under 48272 and 48273).

As mod_fcgid seems to be the current "weapon of choice", it would be
great to have these features in mod_fcgid (or else to have
mod_proxy_fcgi fixed).

Admittedly I have an urgent need for this feature in a production
environment. If this feature can't be implemented (or fixed, respectively)
in the near future (which I can fully understand as a software
developer), what would be your suggestion for getting Apache running this
way? I'd really hate to abandon Apache because of this (and am
willing to do something about it as far as I still have time in my
current project).

(3)
mod_fcgid currently buffers the complete input from the client
(occasionally in a temp file if the request is large) before it passes it
through to an FCGI backend. Could this be made configurable in a way like
File|Memory|Pipeline? (file - as is | memory - always buffer in memory
| pipeline - pass the read data directly to the backend)
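
As a rough sketch of what such a setting could look like on the parsing side, here is a small C helper. No such option exists in mod_fcgid as far as I know; the enum buffer_mode and the function parse_buffer_mode are purely hypothetical and only illustrate the three proposed values.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* The three proposed buffering strategies (hypothetical). */
typedef enum {
    BUF_FILE,      /* as is: spill large bodies to a temp file */
    BUF_MEMORY,    /* always buffer in memory */
    BUF_PIPELINE   /* pass data to the backend as it is read */
} buffer_mode;

/* Case-insensitive string comparison; returns 1 when equal. */
static int equals_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;   /* both strings must end at the same position */
}

/* Parse a directive argument. Returns 0 and sets *out on success,
 * -1 (leaving *out untouched) for an unknown value. */
int parse_buffer_mode(const char *arg, buffer_mode *out)
{
    assert(arg != NULL && out != NULL);

    if (equals_ignore_case(arg, "File")) {
        *out = BUF_FILE;
        return 0;
    }
    if (equals_ignore_case(arg, "Memory")) {
        *out = BUF_MEMORY;
        return 0;
    }
    if (equals_ignore_case(arg, "Pipeline")) {
        *out = BUF_PIPELINE;
        return 0;
    }
    return -1;
}
```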

Or otherwise, can someone explain to me in detail why it is as it is?
Especially in terms of not pipelining data directly (maybe after a little
buffering to build proper FCGI packets)? The comment in
fcgid_bridge.c:452 (add_request_body) left me clueless. Why would this
keep the server in processing too long? Processing takes its time either
way, I'd assume. Looking forward to enlightenment. :)

(4)
Would it make sense to use the FCGI feature to multiplex several
requests over a single connection? Does any backend support this
feature anyway?
When thinking of an external FCGI backend with a socket connection and
a very high request rate, this could keep the number of open connections
and used ports much lower.
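
To show what multiplexing means on the wire, here is a minimal sketch in C that walks a buffer of interleaved FastCGI records and sums the content bytes per request ID. The header layout follows my reading of the FastCGI spec; fcgi_record_info, fcgi_parse_header and fcgi_demux_count are hypothetical names, and a real implementation would dispatch the content instead of just counting it.

```c
#include <assert.h>
#include <stddef.h>

#define FCGI_VERSION_1  1
#define FCGI_HEADER_LEN 8

typedef struct {
    int type;
    unsigned request_id;
    size_t content_len;
    size_t padding_len;
} fcgi_record_info;

/* Decode an 8-byte record header. Returns 0, or -1 on a bad version. */
int fcgi_parse_header(const unsigned char *h, fcgi_record_info *info)
{
    assert(h != NULL && info != NULL);
    if (h[0] != FCGI_VERSION_1)
        return -1;
    info->type = h[1];
    info->request_id = ((unsigned)h[2] << 8) | h[3];
    info->content_len = ((size_t)h[4] << 8) | h[5];
    info->padding_len = h[6];
    return 0;
}

/* Walk interleaved records and add each record's content length to
 * totals[request_id]. Returns the number of records, or -1 if the
 * buffer is malformed or a request ID is >= max_requests. */
int fcgi_demux_count(const unsigned char *buf, size_t len,
                     size_t *totals, size_t max_requests)
{
    size_t pos = 0;
    int records = 0;

    assert(buf != NULL && totals != NULL);
    while (len - pos >= FCGI_HEADER_LEN) {
        fcgi_record_info info;
        size_t total;

        if (fcgi_parse_header(buf + pos, &info) != 0)
            return -1;
        total = FCGI_HEADER_LEN + info.content_len + info.padding_len;
        if (total > len - pos || info.request_id >= max_requests)
            return -1;
        totals[info.request_id] += info.content_len;
        pos += total;
        records++;
    }
    return pos == len ? records : -1;
}
```

The request ID in every header is what would let one connection carry several requests at once, provided the backend actually supports it.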


Really looking forward to your comments.
Regards,
Edgar Frank