mod_fcgid and streaming large request bodies

2013-12-27 Thread Justin Erenkrantz
Hi all!

I'm currently using FastCGI for an application - in particular, Ceph's
radosgw (S3 endpoint).  I was told to use mod_fastcgi as mod_fcgid doesn't
handle large request bodies appropriately.  Yet, when I looked at the
mod_fastcgi code, I shrieked in horror.  =)

In looking at add_request_body in fcgid_bridge.c, it looks like we read the
request body into memory and if it crosses a threshold, we spool it to
disk.  After everything is read from the client, then mod_fcgid begins to
write to the FastCGI socket.  In my use case, we would be PUT'ing large
request bodies that are many GB in size - so, I'd rather just stream it to
the FastCGI socket and take the hit on potentially blocking other incoming
requests.  (In our case, it makes no sense to spool the request body to
local disk.)

Anyway, I'm diving into the code a bit - but, I figured it might be useful
to see if anyone else has any thoughts about how to handle large request
bodies with mod_fcgid.

Happy holidays!  -- justin


Ceph patches for httpd

2013-12-27 Thread Justin Erenkrantz
Hi all!

There are two patches that the Ceph community has applied to their httpd
packages in combination with radosgw (S3 endpoint) - (see
https://github.com/ceph/apache2).

One of them is to allow Content-Length of '0' to be emitted from HEAD
requests:

https://github.com/ceph/apache2/commit/0d9948f1e483386adef0841896484db8422127b2

The use case here is that someone could store a zero-byte file inside of
radosgw.  Amazon's S3 clients expect to see a Content-Length on HEAD
requests - IOW, they don't infer the lack of a Content-Length as being '0'.
 If we weren't comfortable allowing this as a default, I'm guessing that we
could expose this as a directive override.

The other patch is to relax some of the checks around Expect headers as not
all S3 clients emit compliant headers:

https://github.com/ceph/apache2/commit/5ae1b4a081b05fcacf55e7114eec87d9b2a0a5da

Again, I guess that we could apply a directive here to relax the check.

If we go the directive route, both are relatively easy to whip up patches
for, but I wanted to get some feedback before I commit anything to trunk.

Cheers.  -- justin


Re: please sign new apache releases only with strong keys -- trimming the KEYS file

2013-12-27 Thread Daniel Kahn Gillmor
On 12/26/2013 06:18 PM, Nick Kew wrote:
 You're ahead of us.  Individual Apache folks like Jim have taken
 responsibility and moved to 4096-bit keys, but we haven't as a
 community had the discussion that might lead to pruning KEYS.
 My inclination is to say NO to requiring anyone to remove old keys,
 but YES to encouraging strong keys to sign all releases.

Thanks for considering this, Nick.

At the moment, some of your downstreams have the impression that KEYS
indicates all of the keys that apache might use to sign releases.  For
example, in  http://bugs.debian.org/732450#22 Arno Töll writes:

 Therefore, I thought a more complete patch would be a keyring which
 includes all signatures of people allowed to sign and release code on
 behalf of the httpd project.

Maybe you could update the preamble of KEYS to indicate that only strong
keys (and please clearly define strong if y'all are making this
policy) will be used to sign releases?

Nick again:
 That may be an issue for some Apache folks.  For myself, my newer
 (4096-bit) key has fewer sigs than my old 1024-bit[1], though not
 catastrophically so.  What is perhaps more of an issue is that hardly
 any of the signatures on the new key are from Apache folks, as I have
 (alas) not made it to Apachecon for a couple of years now.  Others may
 have a range of reasons for retaining older keys.

Your 4096-bit key (0x3CE3BAC2EB7BBC624D1D22D8F3B9D88CB87F79A9) appears
to be certified by nearly 60 other keys -- including at least a couple
debian developers and Nikos Mavrogiannopoulos (the lead GnuTLS
developer).  I can't speak for all of debian, but i think a strong key
connected by a few paths to other established free software developers
is more reliable than a weak key connected by dozens of paths.

The keys themselves should not be the weak point in our cryptosystems.

 [1] Key IDs 40581837 and B87F79A9

(i recommend using full fingerprints instead of keyids if you have to
communicate about a specific key:
  https://www.debian-administration.org/users/dkg/weblog/105)

Regards,

--dkg






Re: please sign new apache releases only with strong keys -- trimming the KEYS file

2013-12-27 Thread Frederick Miller
Please remove me from this email list. Please unsubscribe me.  Thanks.


On Fri, Dec 27, 2013 at 10:49 AM, Daniel Kahn Gillmor
d...@fifthhorseman.net wrote:

 [...]





Re: Ceph patches for httpd

2013-12-27 Thread Ask Bjørn Hansen

On Dec 27, 2013, at 14:57, Justin Erenkrantz jus...@erenkrantz.com wrote:

 The use case here is that someone could store a zero-byte file inside of 
 radosgw.  Amazon's S3 clients expect to see a Content-Length on HEAD requests 
 - IOW, they don't infer the lack of a Content-Length as being '0'.  If we 
 weren't comfortable allowing this as a default, I'm guessing that we could 
 expose this as a directive override.

Why would that not be a reasonable default?


Ask

Re: Ceph patches for httpd

2013-12-27 Thread Frederick Miller
How can I unsubscribe from these emails?


On Fri, Dec 27, 2013 at 10:53 AM, Ask Bjørn Hansen a...@develooper.com wrote:

 [...]


Re: Ceph patches for httpd

2013-12-27 Thread Frederick Miller
Please stop sending me these emails.


On Fri, Dec 27, 2013 at 10:53 AM, Ask Bjørn Hansen a...@develooper.com wrote:

 [...]


how to unsubscribe form a mailing-list

2013-12-27 Thread Reindl Harald
On 27.12.2013 at 16:53, Frederick Miller wrote:
 Please remove me from this email list. Please unsubscribe me.  Thanks.

first, do *not* hijack threads

please realize that every mailing list sends a welcome message,
which should be read after subscribing and not carelessly thrown
away; the list headers are also present in each message

nobody but you can unsubscribe you from a list

list-help: mailto:dev-h...@httpd.apache.org
list-unsubscribe: mailto:dev-unsubscr...@httpd.apache.org
List-Post: mailto:dev@httpd.apache.org
List-Id: dev.httpd.apache.org





Re: Ceph patches for httpd

2013-12-27 Thread Justin Erenkrantz
On Fri, Dec 27, 2013 at 10:53 AM, Ask Bjørn Hansen a...@develooper.com wrote:


 On Dec 27, 2013, at 14:57, Justin Erenkrantz jus...@erenkrantz.com
 wrote:

  The use case here is that someone could store a zero-byte file inside of
 radosgw.  Amazon's S3 clients expect to see a Content-Length on HEAD
 requests - IOW, they don't infer the lack of a Content-Length as being '0'.
  If we weren't comfortable allowing this as a default, I'm guessing that we
 could expose this as a directive override.

 Why would that not be a reasonable default?


As we approach 2014, I'd be fine with allowing '0' as a legitimate C-L
value.  Ryan has a comment in line 1237 that says:

---
/* This is a hack, but I can't find anyway around it.  The idea is that
 * we don't want to send out 0 Content-Lengths if it is a head request.
 * This happens when modules try to outsmart the server, and return
 * if they see a HEAD request.  Apache 1.3 handlers were supposed to
 * just return in that situation, and the core handled the HEAD.  In
 * 2.0, if a handler returns, then the core sends an EOS bucket down
 * the filter stack, and the content-length filter computes a C-L of
 * zero and that gets put in the headers, and we end up sending a
 * zero C-L to the client.  We can't just remove the C-L filter,
 * because well behaved 2.0 handlers will send their data down the stack,
 * and we will compute a real C-L for the head request. RBB
 */
if (r->header_only
    && (clheader = apr_table_get(r->headers_out, "Content-Length"))
    && !strcmp(clheader, "0")) {
    apr_table_unset(r->headers_out, "Content-Length");
}
---

The commit log is:

r86976 | rbb | 2000-11-15 17:08:44 -0500 (Wed, 15 Nov 2000) | 6 lines

Do not send a content-length if and only if this is a HEAD request and
the content-length is 0.  The problem is that the C-L on a HEAD response
has to be the correct C-L, but if a handler returns saying they handled
the request without sending data, the core sends an EOS down the filter
stack, and we compute a 0 C-L.



+1 for just removing this whole block; but, I guess that there could
*still* be some broken filters/generators that set C-L to 0 when they
shouldn't.  -- justin


Re: mod_fcgid and streaming large request bodies

2013-12-27 Thread Justin Erenkrantz
On Fri, Dec 27, 2013 at 8:49 AM, Justin Erenkrantz
jus...@erenkrantz.com wrote:
 Anyway, I'm diving into the code a bit - but, I figured it might be useful
 to see if anyone else has any thoughts about how to handle large request
 bodies with mod_fcgid.

Here's a first-cut patch that compiles at least.  Cheers.  -- justin

Add FcgidStreamRequestBody directive to send request body as it arrives without
storing it in memory or on disk.

* modules/fcgid/mod_fcgid.c
  (fcgid_cmds): Add FcgidStreamRequestBody directive.
* modules/fcgid/fcgid_conf.h
  (fcgid_server_conf): Add stream_request_body/stream_request_body_set flags.
* modules/fcgid/fcgid_conf.c
  (DEFAULT_STREAM_REQUEST_BODY): Set to 0.
  (create_fcgid_server_config): Init stream_request_body flag.
  (merge_fcgid_server_config): Merge new stream_request_body flag.
  (set_stream_request_body): New config helper.
* modules/fcgid/fcgid_bridge.c
  (add_request_body): Add a forward declaration to reduce diff (for now);
  take fcgid_bucket_ctx param; if stream_request_body is set, don't copy
  bucket and then call proc_write_ipc to send out that brigade and clear it
  out before next loop iteration.
  (bridge_request): Delay reading the request body until later.
  (handle_request_ipc): Move add_request_body call to here and append the EOS
  bucket before we do the final write of the request body.

Index: modules/fcgid/fcgid_bridge.c
===
--- modules/fcgid/fcgid_bridge.c (revision 1553671)
+++ modules/fcgid/fcgid_bridge.c (working copy)
@@ -287,6 +287,10 @@ static int getsfunc_fcgid_BRIGADE(char *buf, int l
 return done;
 }

+static int add_request_body(request_rec *r, apr_pool_t *request_pool,
+apr_bucket_brigade *output_brigade,
+fcgid_bucket_ctx *bucket_ctx);
+
 static int
 handle_request_ipc(request_rec *r, int role,
apr_bucket_brigade *output_brigade,
@@ -295,9 +299,21 @@ handle_request_ipc(request_rec *r, int role,
 int cond_status;
 apr_status_t rv;
 apr_bucket_brigade *brigade_stdout;
+apr_bucket *bucket_eos;
 char sbuf[MAX_STRING_LEN];
 const char *location;

+if (role == FCGI_RESPONDER) {
+rv = add_request_body(r, r->pool, output_brigade, bucket_ctx);
+if (rv) {
+return rv;
+}
+}
+
+/* The eos bucket now */
+bucket_eos = apr_bucket_eos_create(r->connection->bucket_alloc);
+APR_BRIGADE_INSERT_TAIL(output_brigade, bucket_eos);
+
 /* Write output_brigade to fastcgi server */
 if ((rv = proc_write_ipc(bucket_ctx->ipc,
  output_brigade)) != APR_SUCCESS) {
@@ -304,6 +320,7 @@ handle_request_ipc(request_rec *r, int role,
 bucket_ctx->has_error = 1;
 return HTTP_INTERNAL_SERVER_ERROR;
 }
+apr_brigade_cleanup(output_brigade);

 /* Create brigade */
 brigade_stdout =
@@ -522,7 +539,8 @@ handle_request(request_rec * r, int role, fcgid_cm
 }

 static int add_request_body(request_rec *r, apr_pool_t *request_pool,
-apr_bucket_brigade *output_brigade)
+apr_bucket_brigade *output_brigade,
+fcgid_bucket_ctx *bucket_ctx)
 {
 apr_bucket *bucket_input, *bucket_header;
 apr_file_t *fd = NULL;
@@ -563,8 +581,6 @@ static int add_request_body(request_rec *r, apr_po
 return HTTP_INTERNAL_SERVER_ERROR;
 }

-
-
 while ((bucket_input = APR_BRIGADE_FIRST(input_brigade)) !=
APR_BRIGADE_SENTINEL(input_brigade)) {
 const char *data;
 apr_size_t len;
@@ -615,7 +631,9 @@ static int add_request_body(request_rec *r, apr_po
 return HTTP_INTERNAL_SERVER_ERROR;
 }

-if (request_size > sconf->max_mem_request_len) {
+if (sconf->stream_request_body) {
+bucket_stdin = bucket_input;
+} else if (request_size > sconf->max_mem_request_len) {
 apr_size_t wrote_len;
 static const char *fd_key = "fcgid_fd";

@@ -701,6 +719,16 @@ static int add_request_body(request_rec *r, apr_po
 }
 APR_BRIGADE_INSERT_TAIL(output_brigade, bucket_header);
 APR_BRIGADE_INSERT_TAIL(output_brigade, bucket_stdin);
+
+if (sconf->stream_request_body) {
+/* Write output_brigade to fastcgi server */
+if ((rv = proc_write_ipc(bucket_ctx->ipc,
+ output_brigade)) != APR_SUCCESS) {
+bucket_ctx->has_error = 1;
+return HTTP_INTERNAL_SERVER_ERROR;
+}
+apr_brigade_cleanup(output_brigade);
+}
 }

 apr_brigade_cleanup(input_brigade);
@@ -731,7 +759,6 @@ static int add_request_body(request_rec *r, apr_po
 int bridge_request(request_rec * r, int role, fcgid_cmd_conf *cmd_conf)
 {
 

Re: mod_fcgid and streaming large request bodies

2013-12-27 Thread Jeff Trawick
On Fri, Dec 27, 2013 at 6:14 PM, Justin Erenkrantz jus...@erenkrantz.com wrote:

 On Fri, Dec 27, 2013 at 8:49 AM, Justin Erenkrantz
 jus...@erenkrantz.com wrote:
  Anyway, I'm diving into the code a bit - but, I figured it might be
 useful
  to see if anyone else has any thoughts about how to handle large request
  bodies with mod_fcgid.

 Here's a first-cut patch that compiles at least.  Cheers.  -- justin

 Add FcgidStreamRequestBody directive to send request body as it arrives
 without
 storing it in memory or on disk.


It would be quite valuable if there is a limit on how much can be pre-read
(0 for pure streaming).  Pre-reading the request body reduces the number of
application processes or threads required, and they are usually fatter
(sometimes dramatically so) than httpd workers.  While the current
implementation is truly ridiculous, there's some goodness that could be
kept.  I'll try to look through this in the next few days and see if it is
practical to keep the best of both versions.  If no round tuits are
available we should at least proceed with this, perhaps changing the
directive syntax to allow a hybrid implementation in the future with no
legacy cruft.  (Wild suggestion: FcgiPreReadRequestBody
Off|On|Unlimited|<n>K, where On is a default <n>K.)



Re: mod_fcgid and streaming large request bodies

2013-12-27 Thread Justin Erenkrantz
On Fri, Dec 27, 2013 at 6:42 PM, Jeff Trawick traw...@gmail.com wrote:
 It would be quite valuable if there is a limit on how much can be pre-read
 (0 for pure streaming).  Pre-reading the request body reduces the number of
 application processes or threads required, and they are usually fatter
 (sometimes dramatically so) than httpd workers.  While the current
 implementation is truly ridiculous, there's some goodness that could be
 kept.  I'll try to look through this in the next few days and see if it is
 practical to keep the best of both versions.  If no round tuits are
 available we should at least proceed with this, perhaps changing the
 directive syntax to allow a hybrid implementation in the future with no
 legacy cruft.  (Wild suggestion: FcgiPreReadRequestBody
 Off|On|Unlimited|<n>K, where On is a default <n>K.)

I definitely agree that doing some type of pre-read may often be
useful before we start sending stuff via FastCGI.  Yet, the current
code structure doesn't lend itself to that very well - I guess what we
could do is convert the IPC to be opened lazily rather than always
open it.  If we do that, then I think the code changes would be
manageable.  (In our use case, reading the first say 64KB into RAM is
fine, but we definitely want to avoid spooling to disk.)

I'll see if I can find some round tuits...

Cheers.  -- justin