Re: RFC: Who owns a passed brigade?
Rici Lake wrote: I favour having ap_pass_brigade do it. ... as discussed in IRC. Yes, I like that too. However, I think it is a fallacy that a cleaned-up brigade is not too big to wait for pool cleanup. The brigade itself is about four pointers; and the corresponding pool cleanup is another four pointers, so that's a total of around 32 bytes, not a lot. But suppose that a brigade is created for every 8k of filtered data, Hmmm. Are there applications like that in the wild? I'd have thought that sounds like the kind of case where you definitely want to re-use one brigade, with a brigade_cleanup after each 8Kb pass_brigade. To create and destroy half a million brigades, including of course registering and removing their cleanups, seems well worth avoiding. So long as the creator is responsible (and the brigade guaranteed to be reusable), that is trivial for the programmer. -- Nick Kew
Re: RFC: Who owns a passed brigade?
On Thu, Apr 21, 2005 at 07:05:34PM -0500, Rici Lake wrote: On 21-Apr-05, at 5:51 PM, Nick Kew wrote: Rici Lake wrote: FWIW, I think the (apparent) practice, where the caller relinquishes ownership of the buckets but not the brigade itself, is more efficient since it avoids a lot of brigade construction and destruction. Agreed. And it works for any situation, as either party can do a cleanup to keep resource-use down, and a cleaned up brigade is not too big to wait for pool cleanup. Either party can do a cleanup does not give either party the responsibility. If neither party did the cleanup, that would be a problem. Lack of clear responsibility is always a problem, and not just with computer programs. The issue here is really which party can *destroy* a brigade, right? In the current code where brigades are never really destroyed the fact that apr_brigade_cleanup() is called everywhere is not really harmful, is it? It looked to me like it was going to much simpler to adapt filters shipped in httpd to match the creator-must-destroy model, than to match a consumer-must-destroy model, despite the ap_pass_brigade docstring. It's certainly easier to audit for. In creator-must-destroy it is still fine for a consumer to call apr_brigade_cleanup() on a brigade it is passed, of course; that's just equivalent to consuming all the buckets, after all. joe
Re: mod_cache caching the 301 Moved Permanently
Sander Striker wrote: [EMAIL PROTECTED] wrote: The problem seems to be, that the proxied backend server that is cached via mod_disk_cache originally delivers HTTP status 301 and the Location http://www.beach-clothing.com/where-to-buy/, but once cached mod_disk_cache delivers HTTP status 200 instead of 301 (but correctly redelivering the Location header). I have not proved this for myself so far, but this seems the problem to me. This wouldn't surprise me one bit. The 2.1 branch has seen quite a bit of churn in this area. Any chance you could give 2.1 a go and see if that works correctly? Sorry not enough time right now to check this on the living object. But I think what is missing is something like the following (untested) patch (done against 2.0.54, but code in trunk is the same at this part): --- mod_cache.c.orig2005-04-11 17:47:03.0 +0200 +++ mod_cache.c 2005-04-22 23:43:19.0 +0200 @@ -220,6 +220,8 @@ static int cache_out_filter(ap_filter_t ap_log_error(APLOG_MARK, APLOG_DEBUG, APR_SUCCESS, r-server, cache: running CACHE_OUT filter); +/* restore status of cached response */ +r-status = cache-handle-cache_obj-info.status; /* recall_headers() was called in cache_select_url() */ cache-provider-recall_body(cache-handle, r-pool, bb); I found no place in the code where the saved status code is restored to the response, so I think this could be the right place to do so. Could somebody crosscheck who is more familar with the code? Sander? Justin? Sander Regards RĂ¼diger
Re: mod_cache caching the 301 Moved Permanently
Olaf van der Spek wrote: On 4/22/05, Justin Erenkrantz [EMAIL PROTECTED] wrote: [..cut..] I don't get it. What's your problem? -- justin The 'here' link is to http://www.beach-clothing.com:8080/where-to-buy/ while he wants it do be to http://www.beach-clothing.com/where-to-buy/ His problem is a different one. If you type http://www.beach-clothing.com/where-to-buy (without the slash) you have the following headers (first my request / second server reply): GET /where-to-buy HTTP/1.1 Host: www.beach-clothing.com User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6) Gecko/20050319 Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plai n;q=0.8,image/png,*/*;q=0.5 Accept-Language: de,en;q=0.8,de-de;q=0.5,en-gb;q=0.3 Accept-Encoding: gzip,deflate Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 Keep-Alive: 300 Connection: keep-alive HTTP/1.x 200 OK Date: Fri, 22 Apr 2005 20:39:58 GMT Server: Apache Content-Type: text/html; charset=iso-8859-1 Location: http://www.beach-clothing.com/where-to-buy/ Cache-Control: max-age=432000 Expires: Tue, 26 Apr 2005 04:58:38 GMT Content-Length: 256 Age: 142880 Keep-Alive: timeout=3, max=20 Connection: Keep-Alive The problem seems to be, that the proxied backend server that is cached via mod_disk_cache originally delivers HTTP status 301 and the Location http://www.beach-clothing.com/where-to-buy/, but once cached mod_disk_cache delivers HTTP status 200 instead of 301 (but correctly redelivering the Location header). I have not proved this for myself so far, but this seems the problem to me. I think the difference between the here link and the Location header can be explained by the ProxyPassReverse directive which rewrites the Location header, but not the HTML code in the error page. Regards RĂ¼diger
Re: RFC: Name Based Virtual Host Callback
Paul Querna wrote: Instead I have tested a simple callback method, that will only run the callback for the matching virtual hosts. Attached is a prototype patch. Attached is an updated patch. If no one cares, I will commit it to trunk tonight. This does include a minor MMN bump. -Paul Index: server/vhost.c === --- server/vhost.c (revision 164089) +++ server/vhost.c (working copy) @@ -977,7 +977,57 @@ } } +/** + * For every virtual host on this connection, call func_cb. + */ +AP_DECLARE(int) ap_vhost_iterate_given_conn(conn_rec *conn, +ap_vhost_iterate_conn_cb func_cb, +void* baton) +{ +server_rec *s; +server_rec *last_s; +name_chain *src; +apr_port_t port; +int rv = 0; +if (conn-vhost_lookup_data) { +last_s = NULL; +port = conn-local_addr-port; + +for (src = conn-vhost_lookup_data; src; src = src-next) { +server_addr_rec *sar; + +/* We only consider addresses on the name_chain which have a + * matching port. + */ +sar = src-sar; +if (sar-host_port != 0 port != sar-host_port) { +continue; +} + +s = src-server; + +if (s == last_s) { +/* we've already done a callback for this vhost. */ +continue; +} + +last_s = s; + +rv = func_cb(baton, conn, s); + +if (rv != 0) { +break; +} +} +} +else { +rv = func_cb(baton, conn, conn-base_server); +} + +return rv; +} + /* Called for a new connection which has a known local_addr. Note that the * new connection is assumed to have conn-server == main server. */ Index: include/http_vhost.h === --- include/http_vhost.h(revision 164089) +++ include/http_vhost.h(working copy) @@ -53,6 +53,30 @@ const char *arg); /** + * Callback function for every Name Based Virtual Host. + * @param baton Opaque user object + * @param conn The current Connection + * @param s The current Server + * @see ap_vhost_iterate_given_conn + * @return 0 on success, any non-zero return will stop the iteration. + */ +typedef int(*ap_vhost_iterate_conn_cb)(void* baton, conn_rec* conn, server_rec* s); + +/** + * For every virtual host on this connection, call func_cb. + * @param conn The current connection + * @param func_cb Function called for every Name Based Virtual Host for this + *connection. + * @param baton Opaque object passed to func_cb. + * @return The return value from func_cb. + * @note If func_cb returns non-zero, the function will return at this point, + * and not continue iterating the virtual hosts. + */ +AP_DECLARE(int) ap_vhost_iterate_given_conn(conn_rec *conn, +ap_vhost_iterate_conn_cb func_cb, +void* baton); + +/** * given an ip address only, give our best guess as to what vhost it is * @param conn The current connection */ Index: include/ap_mmn.h === --- include/ap_mmn.h(revision 164089) +++ include/ap_mmn.h(working copy) @@ -93,6 +93,7 @@ *symbols from the public sector, and decorated real_exit_code *with ap_ in the win32 os.h. * 20050305.0 (2.1.4-dev) added pid and generation fields to worker_score + * 20050305.1 (2.1.5-dev) added ap_vhost_iterate_given_conn. */ #define MODULE_MAGIC_COOKIE 0x41503230UL /* AP20 */ @@ -100,7 +101,7 @@ #ifndef MODULE_MAGIC_NUMBER_MAJOR #define MODULE_MAGIC_NUMBER_MAJOR 20050305 #endif -#define MODULE_MAGIC_NUMBER_MINOR 0 /* 0...n */ +#define MODULE_MAGIC_NUMBER_MINOR 1 /* 0...n */ /** * Determine if the server's current MODULE_MAGIC_NUMBER is at least a
Module code review (long). Was Re: RFC: Who owns a passed brigade?
On 22-Apr-05, at 6:19 AM, Nick Kew wrote: However, I think it is a fallacy that a cleaned-up brigade is not too big to wait for pool cleanup. The brigade itself is about four pointers; and the corresponding pool cleanup is another four pointers, so that's a total of around 32 bytes, not a lot. But suppose that a brigade is created for every 8k of filtered data, Are there applications like that in the wild? I'd have thought that sounds like the kind of case where you definitely want to re-use one brigade, with a brigade_cleanup after each 8Kb pass_brigade. Indeed. But the filter does not know, in general, how much data it will be called upon to filter, nor how many times it will be invoked. Being invoked once every 8 kb should be pretty common, though, since that seems to be the recommendation on how often a filter should pass a brigade. Are there applications like that in the wild? I don't know, but it seems likely that there are. None of the following examples are definitive, but they seem illustrative. I cite these examples only because they are available; it was not an attempt to criticize anyone's code in particular, but rather to show the result of the lack of clear documentation on how to write an output filter. In summary, of five modules examined, three of them create new brigades on every invocation without destroying the previous one. One of them fails to empty the passed brigade. One of them destroys the passed brigade. Two of them do one-time initialization on every invocation of the filter. The only output filter I looked at which seems to conform to brigades-are-retained, buckets-are-passed is mod_deflate. This is the loop from mod_deflate.c, which seems to be a reasonable model: 1 static apr_status_t deflate_out_filter(ap_filter_t *f, 2 apr_bucket_brigade *bb) 3 { // ... declarations 4 if (APR_BRIGADE_EMPTY(bb)) { 5 return APR_SUCCESS; 6 } 7 if (!ctx) { // ... determine whether we should filter the content 8 /* We're cool with filtering this. */ 9 ctx = f-ctx = apr_pcalloc(r-pool, sizeof(*ctx)); 10 ctx-bb = apr_brigade_create(r-pool, f-c-bucket_alloc); // ... more initialization 11 } 12 while (!APR_BRIGADE_EMPTY(bb)) 13 { 14 e = APR_BRIGADE_FIRST(bb); // ... check if e is metadata, process data in e 15 apr_bucket_delete(e); 16 } 17 apr_brigade_cleanup(bb); 18 return APR_SUCCESS; 19 } Notes: lines 4-6 these only serve to avoid initialization in the event that the filter only receives empty brigades. I don't know if this is necessary. line 7 all initialization is performed only once, if the ctx structure has not yet been created line 12, 15 all buckets are removed from the brigade and deleted line 17 the call to cleanup is presumably redundant since there is no break in the while block. But there could be. mod_case_filter is supposedly an example of how to write filters, so it is likely that its code will be copied by naive filter writers. Here is its loop, in essence: 1 static apr_status_t CaseFilterOutFilter(ap_filter_t *f, 2 apr_bucket_brigade *pbbIn) 3 { // ... declarations 4 pbbOut=apr_brigade_create(r-pool, c-bucket_alloc); 5 for (pbktIn = APR_BRIGADE_FIRST(pbbIn); 6pbktIn != APR_BRIGADE_SENTINEL(pbbIn); 7pbktIn = APR_BUCKET_NEXT(pbktIn)) 8 { // ... check for EOS 8 apr_bucket_read(pbktIn,data,len,APR_BLOCK_READ); // ... do the transform 10 pbktOut = apr_bucket_heap_create(buf, len, apr_bucket_free, 11c-bucket_alloc); 12 APR_BRIGADE_INSERT_TAIL(pbbOut,pbktOut); 13 } 14 return ap_pass_brigade(f-next,pbbOut); 15 } Notes: line 4: a new brigade is created on every invocation. lines 5-7: the input brigade is iterated but the buckets are not removed line 14: the input brigade is not cleaned, and the output brigade is not destroyed. OK, here's mod_charset_lite, which is flagged as a learning experience (for the author), but I believe can actually be found in the wild. The filter function itself was a bit much to include here (it does seem to empty its input brigade), but the key is in the following function which is called everytime the filter decides it has output available: 1 static apr_status_t send_downstream(ap_filter_t *f, const char *tmp, 2 apr_size_t len) 3 { // ... declarations 4 bb = apr_brigade_create(r-pool, c-bucket_alloc); 5 b = apr_bucket_transient_create(tmp, len, c-bucket_alloc); 6 APR_BRIGADE_INSERT_TAIL(bb, b); 7 rv =
Re: mod_cache caching the 301 Moved Permanently
[EMAIL PROTECTED] wrote: The problem seems to be, that the proxied backend server that is cached via mod_disk_cache originally delivers HTTP status 301 and the Location http://www.beach-clothing.com/where-to-buy/, but once cached mod_disk_cache delivers HTTP status 200 instead of 301 (but correctly redelivering the Location header). I have not proved this for myself so far, but this seems the problem to me. This wouldn't surprise me one bit. The 2.1 branch has seen quite a bit of churn in this area. Any chance you could give 2.1 a go and see if that works correctly? Sander
Re: RFC: Who owns a passed brigade?
On 22-Apr-05, at 9:32 AM, Joe Orton wrote: The issue here is really which party can *destroy* a brigade, right? Or perhaps which party *must* destroy a brigade. This is much less of an issue if neither party creates a new brigade on every filter invocation. But some do. In the current code where brigades are never really destroyed the fact that apr_brigade_cleanup() is called everywhere is not really harmful, is it? No, what would be harmful would be apr_brigade_cleanup() not being called on a non-empty brigade. It looked to me like it was going to much simpler to adapt filters shipped in httpd to match the creator-must-destroy model, than to match a consumer-must-destroy model, despite the ap_pass_brigade docstring. That seems to be the case. It's certainly easier to audit for. Sure, but it applies equally to buckets. It seems that the consensus is creator-must-destroy-brigades; consumer-must-remove-buckets. That's not easy to audit for, which is why I think ap_pass_brigade should guarantee the latter condition. In creator-must-destroy it is still fine for a consumer to call apr_brigade_cleanup() on a brigade it is passed, of course; that's just equivalent to consuming all the buckets, after all. Sure. Consequently, it's ok for ap_pass_brigade to do the call. Then a consumer is free to either consume the buckets or not.