Re: mod_proxy_fcgi issues
On 2014-12-04 13:27, Eric Covener wrote: On Thu, Dec 4, 2014 at 1:11 PM, Jim Riggs apache-li...@riggs.me wrote: "This all may certainly be true, but just for clarity's sake (since it was my quote that started this new mod_proxy_fcgi thread): my mod_proxy_balancer -> mod_proxy_fcgi -> php-fpm issue is NOT an httpd issue... at least, that is not how I have treated it. It is actually a code fix I have had to make in PHP to get it to work." [...] "It doesn't seem that usable values for these things should be so unique to php-fpm." My experience has been that the PHP FPM SAPI function init_request_info() in sapi/fpm/fpm/fpm_main.c, which I think was originally copied from the CGI SAPI, is very old code that goes to great lengths to preserve old, not always standards-compliant behavior in order to avoid breaking backward compatibility. Hence, I wouldn't rule out that the values Eric refers to above really are unique to php-fpm. After struggling to get php-fpm working with mod_proxy_fcgi, I eventually rewrote the whole init_request_info() function the way I thought it should be, without any regard for backwards compatibility; this solved the problems I was having. If memory serves (it's been a few years), the main problems I was encountering were with serving the index file for directories and with correct handling of PATH_INFO. I've attached the patch I'm using (a completely new version of the init_request_info() function) in case anyone wants to either play with it or compare it to the code that PHP currently uses.
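For readers unfamiliar with the PATH_INFO problem mentioned above, here is a simplified, hypothetical sketch (not the PHP or httpd code) of what splitting a request URI into a script path and PATH_INFO involves. A real SAPI walks the filesystem with stat() to find the longest existing prefix; this sketch just assumes a ".php" extension marks the script boundary:

```c
#include <string.h>

/* Split "uri" at the first ".php/" boundary: everything up to and
 * including ".php" is the script path, the remainder is PATH_INFO
 * (which, per RFC 3875, begins with '/'). If there is no extra path
 * component, PATH_INFO is the empty string. */
static void split_path_info(const char *uri, char *script, size_t script_sz,
                            const char **path_info)
{
    const char *ext = strstr(uri, ".php/");
    size_t len;

    if (ext != NULL) {
        len = (size_t)(ext - uri) + strlen(".php");
        *path_info = uri + len;   /* e.g. "/user/42" */
    } else {
        len = strlen(uri);
        *path_info = "";          /* no extra path component */
    }
    if (len >= script_sz) {
        len = script_sz - 1;      /* truncate rather than overflow */
    }
    memcpy(script, uri, len);
    script[len] = '\0';
}
```

For example, "/www/app/index.php/user/42" splits into script "/www/app/index.php" and PATH_INFO "/user/42"; the stat()-based walk in a real SAPI handles scripts without extensions, too.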
-- Mark Montague m...@catseye.org

diff -up php-5.6.3/sapi/fpm/fpm/fastcgi.c.fpm-init-request php-5.6.3/sapi/fpm/fpm/fastcgi.c
--- php-5.6.3/sapi/fpm/fpm/fastcgi.c.fpm-init-request	2014-11-18 20:33:20.313769152 +0000
+++ php-5.6.3/sapi/fpm/fpm/fastcgi.c	2014-11-18 20:33:38.424369147 +0000
@@ -488,6 +488,7 @@ static int fcgi_get_params(fcgi_request
 				ret = 0;
 				break;
 			}
+			zlog(ZLOG_DEBUG, "fcgi_get_params: %s=%s", tmp, s);
 			zend_hash_update(req->env, tmp, eff_name_len+1, &s, sizeof(char*), NULL);
 			p += name_len + val_len;
 		}
@@ -1093,12 +1094,14 @@ char* fcgi_putenv(fcgi_request *req, cha
 {
 	if (var && req) {
 		if (val == NULL) {
+			zlog(ZLOG_DEBUG, "fcgi_putenv: %s=", var);
 			zend_hash_del(req->env, var, var_len+1);
 		} else {
 			char **ret;
 			val = estrdup(val);
 			if (zend_hash_update(req->env, var, var_len+1, &val, sizeof(char*), (void**)&ret) == SUCCESS) {
+				zlog(ZLOG_DEBUG, "fcgi_putenv: %s=%s", var, val);
 				return *ret;
 			}
 		}
diff -up php-5.6.3/sapi/fpm/fpm/fpm_main.c.fpm-init-request php-5.6.3/sapi/fpm/fpm/fpm_main.c
--- php-5.6.3/sapi/fpm/fpm/fpm_main.c.fpm-init-request	2014-11-12 13:52:21.0 +0000
+++ php-5.6.3/sapi/fpm/fpm/fpm_main.c	2014-11-18 20:33:38.425369123 +0000
@@ -1422,6 +1422,317 @@ static void init_request_info(TSRMLS_D)
 }
 /* }}} */
 
+static char *fpm_cgibin_saveenv(char *name, char *val)
+{
+	int name_len = strlen(name);
+	char *old_val = sapi_cgibin_getenv(name, name_len TSRMLS_CC);
+	char save_name[256];
+
+	if (val != NULL && old_val != NULL && strcmp(val, old_val) == 0) {
+		return old_val;
+	}
+
+	if (name_len < 256 - strlen("ORIG_") - 1) {
+		strcpy(save_name, "ORIG_");
+		strcat(save_name, name);
+	} else {
+		save_name[0] = '\0';
+	}
+
+	/* Save the old value only if one was not previously saved */
+	if (old_val && save_name[0] != '\0'
+	    && sapi_cgibin_getenv(save_name, strlen(save_name) TSRMLS_CC) == NULL) {
+		_sapi_cgibin_putenv(save_name, old_val TSRMLS_CC);
+	}
+
+	return _sapi_cgibin_putenv(name, val TSRMLS_CC);
+}
+
+static void init_request_info0(TSRMLS_D)
+{
+	char *document_root;
+	int document_root_len;
+	char *script_filename;
+	int script_filename_len;
+	char *script_filename_part = NULL;
+	char *s = NULL;
+	char *path = NULL;
+	char *path_info = NULL;
+	char *path_translated = NULL;
+	char *content_type;
+	char *content_length;
+	const char *auth;
+	char *ini;
+	int result;
+	struct stat st;
+	int add_index = 0;
+
+	zlog(ZLOG_DEBUG, "initializing request info:");
+
+	/* initialize the defaults */
+	SG(request_info).path_translated = NULL;
+	SG(request_info).request_method = NULL;
+	SG(request_info).proto_num = 1000;
+	SG(request_info).query_string = NULL;
+	SG(request_info).request_uri = NULL;
+	SG(request_info).content_type = NULL;
+	SG(request_info).content_length = 0;
+	SG(sapi_headers).http_response_code = 200;
+
+	/*
+	 * Use our document root instead of one passed to us by our invoker
Re: [RFC] enhancement: mod_cache bypass
On 2014-08-23 12:36, Graham Leggett wrote: On 23 Aug 2014, at 3:40 PM, Mark Montague m...@catseye.org wrote: "[root@sky ~]# httpd -t AH00526: Syntax error on line 148 of /etc/httpd/conf/dev.catseye.org.conf: CacheEnable cannot occur within <If> section [root@sky ~]#" The solution here is to lift the restriction above. Having a generic mechanism to handle conditional behaviour, and then having a special case to handle the same behaviour in a different way, is the wrong way to go. I've looked into allowing CacheEnable directives within <If> sections. This can be done by removing the NOT_IN_FILES flag from the call to ap_check_cmd_context() in modules/cache/mod_cache.c:add_cache_enable(). The problem is that <If> sections are currently walked only during mod_cache's normal handler phase, not during the quick handler phase. It looks easy enough to add a call to ap_if_walk() to cache_quick_handler(), but this would add significant extra processing to the quick handler phase, as all <If> expressions for the enclosing context would be evaluated, and I think that at the end we'd have to discard the results that ap_if_walk() caches in the request record so that they can be recomputed later during normal request processing, after all information about the request is available. Is this acceptable? Also, the proof-of-concept patch that I sent yesterday, which adds an expr= clause to the CacheEnable directive, records cache bypasses in the request notes (for logging), in the X-Cache and X-Cache-Detail headers, and in the cache-status subprocess environment variable, and adds a new subprocess environment variable named cache-bypass. If we enable using CacheEnable in <If> sections, conditional cache bypasses will no longer be called out explicitly to the server administrator; they will need to infer a bypass by comparing the URL path to their configuration. I do not see this as a large problem, but I thought I would mention it for consideration.
Given these things, what thoughts does the developer community have? Would a patch to allow CacheEnable within <If> sections have a better chance of being accepted than one that adds an expr= clause to the CacheEnable directive? Or should mod_cache not allow cache bypassing at all? "Use NGINX ( http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_bypass ) if you want that" or "use Varnish ( https://www.varnish-cache.org/docs/4.0/users-guide/increasing-your-hitrate.html#cookies ) if you want that" are answers I'm fine with, if there's no interest in this feature for httpd's mod_cache. -- Mark Montague m...@catseye.org
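For concreteness, if the NOT_IN_FILES restriction were lifted as discussed above, a configuration might look like the following. This is hypothetical: stock httpd 2.4 rejects it with the AH00526 error quoted earlier.

```apacheconf
# Hypothetical: only consult the disk cache when the request
# carries no cookies at all. Stock httpd 2.4 refuses this with
# "CacheEnable cannot occur within <If> section".
<If "-z %{req:Cookie}">
    CacheEnable disk /
</If>
```

The open question above is whether evaluating such expressions during the quick handler phase is an acceptable cost.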
Re: [RFC] enhancement: mod_cache bypass
On 2014-08-23 5:19, Graham Leggett wrote: On 23 Aug 2014, at 03:50, Mark Montague m...@catseye.org wrote: "I've attached a proof-of-concept patch against httpd 2.4.10 that allows mod_cache to be bypassed under conditions specified in the conf files." Does this not duplicate the functionality of the <If> directive? No, not in this case:

<If "-z %{req:Cookie}">
    CacheEnable disk /
</If>

[root@sky ~]# httpd -t
AH00526: Syntax error on line 148 of /etc/httpd/conf/dev.catseye.org.conf: CacheEnable cannot occur within <If> section
[root@sky ~]#

Also, any solution has to work within both the quick handler phase and the normal handler phase of mod_cache.

# Only serve cached data if no (login or other) cookies are present in the request:
CacheEnable disk / expr="-z %{req:Cookie}"

"As an aside, trying to single out and control just one cache using directives like this is ineffective, as other caches like ISP caches and browser caches will not be included in the configuration. Rather control the cache using the Cache-Control headers in the formal HTTP specs." The proposed enhancement is about the server deciding when to serve items from the cache. Although the client can specify a Cache-Control request header in order to bypass the server's cache, there is no good way for a web application to signal to a client when it should do this (for example, when a login cookie is set). The behavior of other caches is controlled using the Cache-Control response header. This functionality is provided by Varnish Cache: https://www.varnish-cache.org/docs/4.0/users-guide/increasing-your-hitrate.html#cookies Squid does not currently provide this functionality, but it seems there is consensus that it should: http://bugs.squid-cache.org/show_bug.cgi?id=2258 Here is a more detailed example scenario, in case it helps. There are also many other scenarios in which conditionally bypassing mod_cache is useful.
- Reverse proxy setup using mod_proxy_fcgi.
- Static resources served through the httpd front end with response header "Cache-Control: max-age=14400" so that they are cached by mod_cache, ISP caches, and browser caches.
- Back-end pages are dynamic (PHP), but very expensive to generate (1-2 seconds).
- Back end sets response header "Cache-Control: max-age=0, s-maxage=14400" so that mod_cache caches the response, but ISP caches and browser caches do not. (mod_cache removes s-maxage and does not pass it upstream.)
- When back-end content changes (e.g., an author makes an update), the back end invokes "htcacheclean /path/to/resource" to invalidate the cached page so that it is regenerated the next time a client requests it.
- Clients have multiple cookies set. Tracking cookies and cookies used by JavaScript should not cause a mod_cache miss.
- Dynamic pages that are generated when a login cookie is set should not be cached. This is accomplished by the back end setting the response header "Cache-Control: max-age=0".
- However, when a login cookie is set, dynamic pages that are currently cached should not be served to the client with the login cookie, while they should still be served to all other clients.

-- Mark Montague m...@catseye.org
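The scenario in the list above could be sketched in configuration roughly as follows. This is a sketch, not a working setup: the host, port, and path are placeholders, and the expr= clause on CacheEnable exists only in the proposed proof-of-concept patch, not in stock httpd.

```apacheconf
# Front end (hypothetical): cache on disk, ignore Set-Cookie for
# cacheability, bypass the cache when any cookie is present (the
# expr= clause is the proposed enhancement), and proxy PHP to the
# back end over FastCGI.
CacheEnable disk / expr="-z %{req:Cookie}"
CacheIgnoreHeaders Set-Cookie
ProxyPassMatch ^/(.*\.php)$ fcgi://127.0.0.1:9001/www/example/content/$1
```

The back end would then emit "Cache-Control: max-age=0, s-maxage=14400" on cacheable pages and plain "max-age=0" on logged-in ones, as described above.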
Re: [RFC] enhancement: mod_cache bypass
On 2014-08-23 12:36, Graham Leggett wrote: On 23 Aug 2014, at 3:40 PM, Mark Montague m...@catseye.org wrote: "AH00526: Syntax error on line 148 of /etc/httpd/conf/dev.catseye.org.conf: CacheEnable cannot occur within <If> section" The solution here is to lift the restriction above. Having a generic mechanism to handle conditional behaviour, and then having a special case to handle the same behaviour in a different way, is the wrong way to go. I assumed this would be OK because the Header directive has a similar expr=expression clause. But I'll look into whether the restriction on <If> could be removed. If I rewrite things to use the <If> directive, do you see bypass functionality as something worth including? I ask because, from your points below, I get the impression that the answer is no. "The proposed enhancement is about the server deciding when to serve items from the cache. Although the client can specify a Cache-Control request header in order to bypass the server's cache, there is no good way for a web application to signal to a client when it should do this (for example, when a login cookie is set). The behavior of other caches is controlled using the Cache-Control response header." There is - use "Cache-Control: private". This will tell all public caches, including mod_cache and ISP caches, not to cache content with cookies attached, while at the same time telling browser caches that they should. The problem is not whether the content should be cached: it should. The problem is, to which clients should the cached content be served? If the client's request does not contain a login cookie, that client should get the cached copy. If the client's request does contain a login cookie, the cache should be bypassed and the client should get a copy of the resource generated specifically for it. "Cache-Control: private" cannot be used in a request, only in a response, where it works as you said.
The problem is that the first request for a given resource where the client includes a login cookie gets intercepted by mod_cache and served from the cache (assuming other clients without login cookies have already requested it). There must therefore be some way to tell mod_cache that this client needs something different. One way to do this would be to have different URL paths for logged-in versus non-logged-in users, but this is awkward, user-visible, and may not be feasible with all web applications. "- Back-end sets response header Cache-Control: max-age=0, s-maxage=14400 so that mod_cache caches the response, but ISP caches and browser caches do not. (mod_cache removes s-maxage and does not pass it upstream)." "mod_cache shouldn't remove any Cache-Control headers." It apparently does, although I haven't found where in the code yet. I would be interested to see if anyone can reproduce my experience. As far as I know, I don't have any configuration that would result in this.

httpd 2.4.10 with mod_proxy_fcgi (Fedora 19 build)
PHP 5.5.5 with PHP-FPM

Relevant configuration:

CacheEnable disk /
CacheDefaultExpire 86400
CacheIgnoreHeaders Set-Cookie
CacheHeader on
CacheDetailHeader on
# We'll be paying attention to Cache-Control: s-maxage=xxx for all
# of our caching decisions. The browser will use max-age=yyy for its
# decisions. So we drop the Expires header. See the following page
# from Google, which says, "It is redundant to specify both Expires and
# Cache-Control: max-age":
# https://developers.google.com/speed/docs/best-practices/caching?hl=sv
Header unset Expires
RewriteRule ^(.*\.php)$ fcgi://127.0.0.1:9001/www/dev.catseye.org/content/$1 [P,L]

File test.php, containing:

<?php
header("Cache-Control: max-age=0, s-maxage=14400");
header("Content-type: text/html");
?>
<html><body>Hello!</body></html>

Browser transaction for https://dev.catseye.org/test.php:

GET /test.php HTTP/1.1
Host: dev.catseye.org
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:31.0) Gecko/20100101 Firefox/31.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
DNT: 1
Connection: keep-alive

HTTP/1.1 200 OK
Date: Sat, 23 Aug 2014 20:11:00 GMT
Server: Apache/2.4
Cache-Control: max-age=0
X-Cache: MISS from dev.catseye.org
X-Cache-Detail: "cache miss: attempting entity save" from dev.catseye.org
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html;charset=UTF-8

And mod_cache definitely receives s-maxage from the backend:

[root@sky cache]# cat ./J/k/WPiKG0bwW@R_H4YvSOdw.header
(binary data omitted)https://dev.catseye.org:443/test.php?
Cache-Control: max-age=0
Cache-Control: max-age=0, s-maxage=14400
Content-Security-Policy: default-src 'self'; script-src 'self' 'unsafe-inline' 'unsafe-eval'; style-src 'self' 'unsafe-inline'; img-src 'self' data: ; font-src 'self' data: ; report-uri /csp-report.php
Content-Type
Re: [RFC] enhancement: mod_cache bypass
On 2014-08-23 17:43, Mark Montague wrote: "- Back-end sets response header Cache-Control: max-age=0, s-maxage=14400 so that mod_cache caches the response, but ISP caches and browser caches do not. (mod_cache removes s-maxage and does not pass it upstream)." "mod_cache shouldn't remove any Cache-Control headers." "It apparently does, although I haven't found where in the code yet. I would be interested to see if anyone can reproduce my experience. As far as I know, I don't have any configuration that would result in this." Please ignore this part of my previous reply; I found out what was going on. When the content is first requested, mod_cache has a miss and stores the content. But when it sends it on to the client, it does so without any Cache-Control header at all:

GET /test.php HTTP/1.1
Host: dev.catseye.org

HTTP/1.1 200 OK
Date: Sat, 23 Aug 2014 22:01:26 GMT
Server: Apache/2.4
X-Cache: MISS from dev.catseye.org
X-Cache-Detail: "cache miss: attempting entity save" from dev.catseye.org
Transfer-Encoding: chunked
Content-Type: text/html;charset=UTF-8

The second time the resource was requested, it was served by mod_cache from the cache with the original Cache-Control header (I added foo=1 to the header to track this when I generated the page):

GET /test.php HTTP/1.1
Host: dev.catseye.org

HTTP/1.1 200 OK
Date: Sat, 23 Aug 2014 22:02:21 GMT
Server: Apache/2.4
Cache-Control: max-age=0, foo=1, s-maxage=14400
Content-Security-Policy: default-src 'self'; script-src 'self' 'unsafe-inline' 'unsafe-eval'; style-src 'self' 'unsafe-inline'; img-src 'self' data: ; font-src 'self' data: ; report-uri /csp-report.php
Age: 54
X-Cache: HIT from dev.catseye.org
X-Cache-Detail: "cache hit" from dev.catseye.org
Content-Length: 33
Content-Type: text/html;charset=UTF-8

What led me to assume that mod_cache was editing the header (which, I now see, it wasn't) was the following directives, which had escaped my attention when I was composing my previous reply:
ExpiresActive on
ExpiresDefault "access plus 1 week"
ExpiresByType text/html "access plus 0 seconds"

This resulted in a "Cache-control: max-age=0" header being unconditionally added to the response headers, even if another header was already there. So for a cache miss, I would see:

Cache-control: max-age=0

while for a cache hit I would see:

Cache-control: max-age=0
Cache-Control: max-age=0, foo=1, s-maxage=14400

Mystery solved. I apologize for the red herring and waste of people's time and attention. I'm still looking for a solution to the original problem: how to indicate to mod_cache that cached content for a particular URL path should be served to some clients (ones without login cookies), but not to other clients (ones with login cookies). -- Mark Montague m...@catseye.org
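One way to avoid the unconditional max-age=0 pile-up described above is to make the header conditional instead of using a blanket ExpiresByType. In httpd 2.4 the Header directive accepts an expr= condition, so something along these lines (an untested sketch, not a recommendation from the thread) would only add the header when the back end did not already set its own:

```apacheconf
# Sketch: instead of ExpiresByType text/html "access plus 0 seconds",
# add max-age=0 only when the response carries no Cache-Control yet,
# so a back-end "max-age=0, s-maxage=14400" survives untouched.
Header set Cache-Control "max-age=0" "expr=-z %{resp:Cache-Control}"
```

This keeps the browser-facing default while leaving the back end's s-maxage signaling intact for mod_cache.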
[RFC] enhancement: mod_cache bypass
I've attached a proof-of-concept patch against httpd 2.4.10 that allows mod_cache to be bypassed under conditions specified in the conf files. It adds an optional fourth argument to the CacheEnable directive:

CacheEnable cache_type [url-string] [expr=expression]

If the expression is present, data will only be served from the cache for requests for which the expression evaluates to true. This permits things such as:

# Only serve cached data if no (login or other) cookies are present in the request:
CacheEnable disk / expr="-z %{req:Cookie}"

# Do not serve cached pages to our testing network:
<Location /some/path>
    CacheEnable disk expr="! ( %{REMOTE_ADDR} -ipmatch 192.168.0.0/16 )"
</Location>

Is there interest in such an enhancement? If so, I'll make any requested changes to the implementation, port the patch forward to trunk, put in real APLOGNOs, make sure it passes the test suite, create a documentation patch, and create a bugzilla for all this.

-- Mark Montague m...@catseye.org

diff -urd httpd-2.4.10.orig/modules/cache/cache_util.c httpd-2.4.10/modules/cache/cache_util.c
--- httpd-2.4.10.orig/modules/cache/cache_util.c	2014-05-30 13:50:37.0 +0000
+++ httpd-2.4.10/modules/cache/cache_util.c	2014-08-23 01:34:25.521689874 +0000
@@ -27,6 +27,10 @@
 
 extern module AP_MODULE_DECLARE_DATA cache_module;
 
+extern int cache_run_cache_status(cache_handle_t *h, request_rec *r,
+		apr_table_t *headers, ap_cache_status_e status,
+		const char *reason);
+
 /* Determine if url matches the hostname, scheme and port and path
  * in filter. All but the path comparisons are case-insensitive.
  */
@@ -129,10 +133,30 @@
 }
 
 static cache_provider_list *get_provider(request_rec *r, struct cache_enable *ent,
-		cache_provider_list *providers)
+		cache_provider_list *providers, int *bypass)
 {
-	/* Fetch from global config and add to the list. */
 	cache_provider *provider;
+
+	/* If an expression is present, evaluate it and make sure it is true */
+	if (ent->expr != NULL) {
+		const char *err = NULL;
+		int eval = ap_expr_exec(r, ent->expr, &err);
+		if (err) {
+			ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r,
+					APLOGNO(06668) "Failed to evaluate expression (%s) - skipping CacheEnable for uri %s", err, r->uri);
+			return providers;
+		}
+		if (eval <= 0) {
+			ap_log_rerror(APLOG_MARK, APLOG_DEBUG, APR_SUCCESS, r,
+					APLOGNO(06667) "cache: CacheEnable expr at %s(%u) is FALSE for uri %s", ent->expr->filename, ent->expr->line_number, r->uri);
+			(*bypass)++;
+			return providers;
+		}
+		ap_log_rerror(APLOG_MARK, APLOG_DEBUG, APR_SUCCESS, r,
+				APLOGNO(0) "cache: CacheEnable expr at %s(%u) is TRUE for uri %s", ent->expr->filename, ent->expr->line_number, r->uri);
+	}
+
+	/* Fetch from global config and add to the list. */
 	provider = ap_lookup_provider(CACHE_PROVIDER_GROUP, ent->type, "0");
 	if (!provider) {
@@ -172,6 +196,7 @@
 {
 	cache_dir_conf *dconf = ap_get_module_config(r->per_dir_config, &cache_module);
 	cache_provider_list *providers = NULL;
+	int bypass = 0;
 	int i;
 
 	/* per directory cache disable */
@@ -193,7 +218,7 @@
 	for (i = 0; i < dconf->cacheenable->nelts; i++) {
 		struct cache_enable *ent =
 				(struct cache_enable *)dconf->cacheenable->elts;
-		providers = get_provider(r, &ent[i], providers);
+		providers = get_provider(r, &ent[i], providers, &bypass);
 	}
 
 	/* loop through all the global cacheenable entries */
@@ -201,10 +226,16 @@
 		struct cache_enable *ent =
 				(struct cache_enable *)conf->cacheenable->elts;
 		if (uri_meets_conditions(&ent[i].url, ent[i].pathlen, uri)) {
-			providers = get_provider(r, &ent[i], providers);
+			providers = get_provider(r, &ent[i], providers, &bypass);
 		}
 	}
 
+	if (providers == NULL && bypass > 0) {
+		/* we're bypassing the cache. tell everyone who cares */
+		cache_run_cache_status(NULL, r, r->headers_out, AP_CACHE_BYPASS,
+				apr_psprintf(r->pool, "cache bypass: %d conditions not satisfied", bypass));
+	}
+
 	return providers;
 }
diff -urd httpd-2.4.10.orig/modules/cache/cache_util.h httpd-2.4.10/modules/cache/cache_util.h
--- httpd-2.4.10.orig/modules/cache/cache_util.h	2014-08-20 14:50:12.251792173 +0000
+++ httpd-2.4.10/modules/cache/cache_util.h	2014-08-22 00:20:24.946556676 +0000
@@ -109,6 +109,7 @@
 	apr_uri_t url;
 	const char *type;
 	apr_size_t pathlen;
+	ap_expr_info_t *expr;
 };
 
 struct cache_disable {
diff -urd httpd-2.4.10.orig/modules/cache/mod_cache.c httpd-2.4.10/modules/cache/mod_cache.c
--- httpd-2.4.10.orig/modules/cache/mod_cache.c	2014-08-20 14:50:12.253792121
Re: TRACE still enabled by default
On March 21, 2012 15:33 , Roy T. Fielding field...@gbiv.com wrote: TRACE won't work at all if the most popular end-point doesn't support it. Why would this be a bad thing? Or, to phrase it another way, what are the situations in which it is desirable that TRACE be already-enabled on a web server as opposed to having the owner of the web server enable the TRACE method in response to a specific debugging need? -- Mark Montague m...@catseye.org
Re: TRACE still enabled by default
On March 21, 2012 16:02 , Greg Stein gst...@gmail.com wrote: TRACE won't work at all if the most popular end-point doesn't support it. Why would this be a bad thing? Or, to phrase it another way, what are the situations in which it is desirable that TRACE be already-enabled on a web server as opposed to having the owner of the web server enable the TRACE method in response to a specific debugging need? Roy means that if we don't set the precedent for TRACE being present and how it is supposed to work, then nobody else will. The Apache HTTP server is effectively the embodiment and leader of the HTTP specification. Yes, that was clear. But why would setting a precedent and leading the way for TRACE only being present when explicitly enabled by the owner of a specific web server be bad? For the sake of discussion, what real world problems -- troubleshooting, debugging, or other problems -- would such a course of action actually cause? -- Mark Montague m...@catseye.org
Re: questions about document_root
On December 8, 2011 1:48 , Rui Hu tchrb...@gmail.com wrote: 2011/12/8 Rui Hu tchrb...@gmail.com: "Is $DOCUMENT_ROOT in php-cgi determined by ap_add_common_vars() in Apache?" "It seems not to me. I commented out line 237, which assigns DOCUMENT_ROOT, and re-compiled Apache. php-cgi still works fine. It seems that $DOCUMENT_ROOT in php-cgi is not determined by this function." What you say is correct. This is an Apache HTTP Server mailing list. You said, "in apache, I cannot find any code which assign this var." and I showed you where in Apache HTTP Server the DOCUMENT_ROOT environment variable is set when running CGIs. What you are now asking should be sent to the PHP users mailing list, since it is a question about PHP. But, see the function init_request_info() in the PHP source code, in the file sapi/cgi/cgi_main.c (lines 1123-1137): http://svn.php.net/viewvc/php/php-src/branches/PHP_5_3_8/sapi/cgi/cgi_main.c?revision=315335&view=markup How PHP determines the value of $_SERVER["DOCUMENT_ROOT"] depends on all of the following: - The value of the PHP directive cgi.fix_pathinfo - The value of the PHP directive doc_root - The value of the environment variable DOCUMENT_ROOT (set by Apache HTTP Server) -- Mark Montague m...@catseye.org
Re: questions about document_root
On December 7, 2011 23:23 , Rui Hu tchrb...@gmail.com wrote: "I looked up the code of PHP and apache2, and found that PHP gets docroot from environment var $DOCUMENT_ROOT. However in apache, I cannot find any code which assign this var. I googled but got nothing. Can you please show me the detailed process generating $DOCUMENT_ROOT in $_SERVER from apache to php. Thank you very much!" If you invoke PHP as a CGI, then Apache HTTP Server sets DOCUMENT_ROOT in the function ap_add_common_vars(), which is in the file server/util_script.c. See line 237: https://svn.apache.org/viewvc/httpd/httpd/branches/2.2.x/server/util_script.c?revision=1100216&view=markup -- Mark Montague m...@catseye.org
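To see the variable that ap_add_common_vars() exports from the CGI side, a minimal, hypothetical CGI can just echo it back; run outside httpd, the variable is simply unset:

```c
#include <stdio.h>
#include <stdlib.h>

/* Emit a minimal CGI response showing the DOCUMENT_ROOT environment
 * variable that httpd's ap_add_common_vars() sets for CGI children.
 * A real CGI's main() would just call print_document_root(stdout). */
static void print_document_root(FILE *out)
{
    const char *dr = getenv("DOCUMENT_ROOT");

    fprintf(out, "Content-Type: text/plain\r\n\r\n");
    fprintf(out, "DOCUMENT_ROOT=%s\n", dr ? dr : "(unset)");
}
```

Whether PHP then uses this value for $_SERVER["DOCUMENT_ROOT"] depends on the PHP-side settings listed in the follow-up message.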
Re: Infinite data stream from a non-HTTPD external process via HTTPD
On September 20, 2011 5:13 , Henrik Strand henrik.str...@axis.com wrote: "I would like to send an infinite data stream from a non-HTTPD external process via HTTPD to the client connection. Both HTTP and HTTPS must be supported." Dw.'s solution is a good one, especially if the external process you are talking about is not a child process spawned by httpd. In the special but very common case where the external process is a child process spawned by httpd, the easiest solution is to have that process send its data to httpd (its parent process) via STDOUT -- in other words, simply output the data as you normally would in any CGI script or other active content. To explicitly answer other questions: - Yes, httpd supports sending infinite amounts of data, as long as output is sent at least as often as the value of the TimeOut directive (see https://httpd.apache.org/docs/2.2/mod/core.html#timeout ) and as long as the client (web browser, end user) does not close the connection by pressing the Stop button or by doing something else. - Anything that works via HTTP should also work via HTTPS, as far as I know, including Dw's solution. -- Mark Montague m...@catseye.org
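The STDOUT approach above can be sketched as follows. This is a hypothetical illustration, not code from the thread: a real streaming CGI would pass stdout and loop forever, pacing its output so something is written more often than the TimeOut value; here the chunk count is capped so the sketch terminates.

```c
#include <stdio.h>

/* Write a CGI-style streaming response: a header first, then n data
 * chunks, flushing after each one so httpd receives output before its
 * TimeOut expires. A real infinite stream would use for (;;) and add
 * pacing (e.g. sleep()) between chunks. */
static void stream_chunks(FILE *out, int n)
{
    int i;

    fprintf(out, "Content-Type: text/plain\r\n\r\n");
    fflush(out);

    for (i = 0; i < n; i++) {
        fprintf(out, "chunk %d\n", i);
        fflush(out);   /* push data to httpd now, resetting TimeOut */
    }
}
```

Calling stream_chunks(stdout, ...) from a CGI's main() is all that is needed; httpd handles the HTTP/HTTPS framing either way.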
Re: mod_proxy_fcgi + mod_proxy_balancer vs. php-fpm and query strings
On September 19, 2011 8:37 , Jim Riggs apache-li...@riggs.me wrote: httpd -> balancer -> fcgi balancer members -> php-fpm Issue 1: PHP-FPM does not handle the proxy:balancer prefix in SCRIPT_FILENAME. It does handle proxy:fcgi as a special case (see https://bugs.php.net/bug.php?id=54152 fix by jim). So, it seems we need to also add a proxy:balancer exception there, unless a balanced mod_proxy_fcgi member should actually be using proxy:fcgi instead. What are people's thoughts on the prefix that should be sent by httpd in this case? To address this for now, I have modified PHP (fpm_main.c alongside jim's existing changes). As the person who wrote the changes that Jim later modified and committed, this seems reasonable to me, assuming it is correct (I say "assuming" only because I have never used mod_proxy_fcgi in a balancer configuration). Issue 2: Once I got Issue 1 addressed, everything started working except in the case of a query string. I spent considerable time tracing and trying to figure out where the issue is occurring, but I am hoping one of you who is much more familiar with the code than I will be able to say, "Oh, look right here." The problem is that the query string is getting appended to SCRIPT_FILENAME if proxied through a balancer. FPM does not like this. It does not seem to happen in the case of proxying directly to fcgi://..., but once I change this to balancer://..., the query string gets added to SCRIPT_FILENAME. I believe this happened with both ProxyPass* and mod_rewrite [P]. In mod_rewrite, this should get handled in splitout_queryargs(), but somehow it is getting added back (probably in proxy_balancer_canon(), which adds the query string back to r->filename?). For right now, I have done a brute-force fix for this by adding the code below to the beginning of send_environment() in mod_proxy_fcgi.c, before the calls to ap_add_common_vars() and ap_add_cgi_vars().
I am guessing that this isn't the ultimate fix for this issue, so I am interested in others' thoughts.

+/* Remove query string from r->filename (r->args is already set and passed via QUERY_STRING) */
+q = ap_strchr_c(r->filename, '?');
+if (q != NULL) {
+    *q = '\0';
+}

This sounds like it is related to https://issues.apache.org/bugzilla/show_bug.cgi?id=51077 as well. Probably a new patch is needed to consistently and properly fix all of the cases (regular, mod_proxy_{f,s}cgi, mod_proxy_{f,s}cgi + balancer). -- Mark Montague m...@catseye.org
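The brute-force fix above boils down to the classic split that mod_rewrite's splitout_queryargs() performs. As a standalone illustration (a hypothetical helper, not the httpd code):

```c
#include <string.h>

/* Split "uri?query" in place at the first '?': the path stays in uri,
 * and the query string (without the '?') is returned, or NULL if the
 * URI carries no query string. This mirrors what splitout_queryargs()
 * does before r->args is populated. */
static char *split_query_args(char *uri)
{
    char *q = strchr(uri, '?');

    if (q == NULL) {
        return NULL;        /* no query string present */
    }
    *q = '\0';              /* terminate the filename part */
    return q + 1;           /* query string, now standalone */
}
```

The bug being discussed is that after such a split, the balancer canonicalisation path re-appends the query to the filename, so the FastCGI side sees SCRIPT_FILENAME with "?query" attached.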
Re: Developing an Authentication Module
On September 15, 2011 11:41 , Suneet Shah suneetshah2...@gmail.com wrote: "In our architecture, authentication and authorization is handled by a set of web services. I would need to have the apache module make calls to the service. I was planning on using Axis 2 for this. Are there any issues with that?" I have no experience with Axis 2, but an Apache module can certainly utilize external services. For example, mod_auth_kerb makes RPC calls to Kerberos KDCs, and mod_auth_dbd makes queries against SQL databases. "I need to be able to look at the request and see if it has a security token. If it does, then I need to validate it through the service. If it does not, then I need to redirect them to an authentication page. I thought it would be easier to handle the authentication through our java application (as we have the rest of the application) or should this be part of the module as well? If a person successfully authenticates, then the authentication app would redirect the user to the originally requested url." This sounds very much like the way cosign works. cosign is a web single-sign-on solution that includes an Apache HTTP Server module, mod_cosign. A diagram showing how cosign works is available at http://cosign.sourceforge.net/overview.shtml The actual authentication (prompting for and verifying the user's username and password) is handled by an application written in C which runs as a CGI and is not a part of the mod_cosign module itself. You may also want to study the implementation of other web single-sign-on solutions, including Pubcookie ( http://pubcookie.org/ ) and CAS ( http://www.jasig.org/cas ). CAS may be of particular interest to you because it is written in Java. "This would flow through the apache web service and mod-proxy to end up at the target location." You may not need mod_proxy unless it is key to your requirements in some way.
cosign, for example, simply redirects the user to the target location after verifying the security token (cookie) or authenticating the user and issuing a new security token. -- Mark Montague m...@catseye.org
Re: Apache janitor ?
On June 8, 2011 20:11 , Igor Galić i.ga...@brainsware.org wrote: "One of the many good suggestions they propose is to have a Patch Manager - someone who makes sure that patches submitted via Bugzilla or directly to the list don't get lost in the noise and that people get some feedback, even if it's just a one-liner like 'Thanks, we're looking into this', 'Nope, that's really not in our scope', etc..." Committers, is there anything that list/community members could do to pitch in and help? What, if anything, would be useful and accepted? For example, is there a list of things you'd like to be done before you commit a patch, and are there parts of that list that could be delegated to one or more non-committer "janitors"? I'd be willing to try and help (for example), if such help would be useful. Big thanks to Igor for his message, his suggestions, and recommending the "Open Source Projects and Poisonous People" talk. What Igor says has been bothering me for a while, too. -- Mark Montague m...@catseye.org
Patch review request
Could someone with commit access take a look at the following patches? Neither of these is high priority, but I don't want to let them get too far out of date. https://issues.apache.org/bugzilla/show_bug.cgi?id=51077 Fixes two issues with how mod_rewrite handles rules with the [P] flag: - Makes query string handling for requests destined for mod_proxy_fcgi and mod_proxy_scgi consistent with how query strings are already handled for mod_proxy_ajp and mod_proxy_http. - Makes logic for handling query strings in directory context the same as in server context. https://issues.apache.org/bugzilla/show_bug.cgi?id=50880 Prevents mod_proxy_scgi from setting PATH_INFO unless requested, for better compliance with RFC 3875. This will hopefully be an easy patch to review, since it was just submitted for consistency with a patch which was already committed for mod_proxy_fcgi, https://issues.apache.org/bugzilla/show_bug.cgi?id=50851 Thanks in advance. -- Mark Montague m...@catseye.org
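For context, the kind of configuration the first patch concerns might look like the following; this is an illustrative sketch only, with a made-up backend address and paths, showing a [P] rule handing requests to a FastCGI backend (the patch addresses how the request's query string survives this proxying):

```apache
# Illustrative only: proxy requests to a FastCGI backend via
# mod_rewrite's [P] flag. Address and paths are made-up examples.
RewriteEngine On
RewriteRule ^/php/(.*)$ fcgi://127.0.0.1:9000/var/www/php/$1 [P]
```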
Re: Need information about Apache module development.
dev@httpd.apache.org is for discussions related to development of httpd itself. Your questions below are more appropriate for the Third Party Module Authors' List. See http://httpd.apache.org/lists.html#modules-dev A rules execution engine that is able to accept the request, evaluate a set of ops-defined rules, and execute various responses. There is a preference for using Drools Fusion. Is there any functionality in Apache based on rules? Most Apache HTTP Server configuration directives can be thought of as rules. But this is probably not very helpful. Note that Drools is written in Java, while Apache HTTP Server is written in C. If you want to use Drools, you may want to consider using a web server that is written in Java. What approaches need to be taken for dynamic load balancing? For example, suppose I have 3 instances of Apache running and due to some issue one of the instances goes down. I would expect the traffic to be balanced properly by the remaining 2 instances. This will not happen unless you have a load balancer that is external to Apache HTTP Server. For load balancing, apart from mod_proxy_balancer, are any other Apache modules worth looking into? mod_proxy_balancer runs on a front-end (proxy) server to balance requests that the front-end server receives between multiple back-end servers. If one of the back-end servers goes down, the front-end server will detect this and split the traffic between the remaining back-end servers. However, this may not be what you want. As an alert broadcast engine that has the ability to distribute events to multiple end sources. Apache HTTP Server is not a broadcast engine. This is functionality that you would have to write and include in your module. As a storage layer that allows for data persistence for long-term tracking of keys and values. We have a target of good performance. Would Apache or some other web server be a good choice? Apache HTTP Server is not a storage layer.
Apache HTTP Server just processes HTTP requests. In order to process these requests, Apache HTTP Server normally serves files from some filesystem or other storage layer which is external to Apache HTTP Server itself. (Normally, this would be some local filesystem such as ext3 or NTFS; but it could also be a remote or distributed filesystem such as NFS, CIFS, or AFS. In turn, the remote filesystem could be based on iSCSI, FibreChannel, or other technologies.) Inputs would be very much appreciated. Many thanks in advance. Good luck. -- Mark Montague m...@catseye.org
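The mod_proxy_balancer setup described above (a front-end detecting a failed back-end and redistributing traffic) can be sketched as a minimal configuration; the host names and the balancer name here are made-up examples, and a BalancerMember that stops responding is automatically placed into error state and skipped:

```apache
# Front-end proxy balancing requests across two back-ends.
# Host names and the balancer name are illustrative only.
<Proxy balancer://mycluster>
    BalancerMember http://backend1.example.com:8080
    BalancerMember http://backend2.example.com:8080
</Proxy>
ProxyPass        /app balancer://mycluster/app
ProxyPassReverse /app balancer://mycluster/app
```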
Re: mod_fcgid in httpd tarball?
On March 23, 2011 7:37 , Graham Leggett minf...@sharp.fm wrote: Do we want to introduce mod_fcgid now into httpd 2.3.x for the next beta? How do we reconcile mod_fcgid with mod_proxy_fcgi? Do they need to be reconciled? Each currently has strengths the other lacks. I'd be fine with having both in future httpd 2.3.x betas and 2.4, at least until one clearly becomes redundant compared to the other. -- Mark Montague m...@catseye.org
Adding ProxyErrorOverride support to mod_proxy_fcgi
I've created a patch to add support for the ProxyErrorOverride directive to mod_proxy_fcgi: https://issues.apache.org/bugzilla/show_bug.cgi?id=50913 Could someone review this patch, please, and get back to me with feedback and/or requests for changes? It's not important to me to have this functionality in 2.4, per se, but I'd like to address any concerns while everything is still relatively fresh in my mind. Many thanks! -- Mark Montague m...@catseye.org
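Assuming the patch is applied, usage would presumably mirror the existing ProxyErrorOverride behavior for mod_proxy_http; a hedged sketch (the backend address, document root, and error page path are illustrative):

```apache
# With the patch, error responses generated by the FastCGI backend
# are replaced by the front-end server's own error documents.
ProxyErrorOverride On
ErrorDocument 404 /errors/404.html
ProxyPass /app/ fcgi://127.0.0.1:9000/var/www/app/
```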
Re: mod_fcgid in httpd tarball?
On March 18, 2011 18:07 , William A. Rowe Jr. wr...@rowe-clan.net wrote: It seems like mod_fcgid has made huge progress and is now in a much more stable bugfix epoch of its life, similar to how mod_proxy had progressed when development was kicked out of core for major http/1.1 rework, and brought back in when a vast percentage of its bugs had been addressed. Do we want to introduce mod_fcgid now into httpd 2.3.x for the next beta? For what it's worth, on the systems I'm deploying, I'm using mod_proxy_fcgi and putting in as much effort as necessary to fix any bugs, add features I need to it, etc., simply because mod_proxy_fcgi is a core module, while mod_fcgid is not. If mod_fcgid were in core, I may have wound up putting the effort there instead. (I say "may have" because I've come to think that mod_proxy_fcgi is actually a better choice for my particular needs, anyway.) -- Mark Montague m...@catseye.org
Re: Cipher suite used in default Apache
On October 28, 2010 17:30 , smu johnson smujohn...@gmail.com wrote: Unfortunately, I cannot figure out a single way for apache2ctl to tell me what ciphers apache is using. Not what it supports, but what it is currently allowing when clients use https://. You can configure httpd to log which ciphers are actually being used for each request; see: http://httpd.apache.org/docs/2.2/mod/mod_ssl.html#logformats The reason is I'm worried that it's allowing 40-bit encryption, and I would like to see actual verification from Apache whether or not my current setup is allowing it. To see if 40-bit encryption is permitted, run the following from the command line: openssl s_client -connect your-web-server.example.com:443 -cipher LOW If you get a line that looks like 140735078042748:error:14077410:SSL routines:SSL23_GET_SERVER_HELLO:sslv3 alert handshake failure:s23_clnt.c:658: then 40-bit encryption is not supported and you are safe. If, however, you get an SSL-Session section in the output, then the Cipher line will indicate which cipher was actually negotiated and used in this test. More information and additional tests and examples are available at http://idlethreat.com/site/index.php/archives/181 http://stephenventer.blogspot.com/2006/07/openssl-cipher-strength.html -- Mark Montague m...@catseye.org
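For example, the per-request cipher logging mentioned above can be done with a CustomLog format along these lines (the log file path is illustrative; the format strings are from the mod_ssl documentation linked above), and weak export/40-bit ciphers can also be refused outright at the SSLCipherSuite level:

```apache
# Log the negotiated SSL protocol and cipher for every request.
CustomLog logs/ssl_request_log \
    "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b"

# Refuse weak (export/40-bit) ciphers outright.
SSLCipherSuite HIGH:MEDIUM:!aNULL:!MD5:!LOW:!EXP
```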