[users@httpd] Re: mod_status: extended + auto (machine-readable) output

2017-02-20 Thread Raphaël
https://github.com/apache/httpd/pull/27

On Sat, Oct 08, 2016 at 09:58:57PM -0300, Raphaël wrote:
> Hi,
> 
> I've an Apache server handling various virtual hosts and I'd like to monitor
> the distinct activities of all of them without having to parse multiple
> accesslog files.
> 
> 
> Most monitoring software consumes the output of "mod_status?auto", which is
> easy to parse but does not provide the detailed information
> available in the HTML mod_status+ExtendedStatus output.

[...]

-
To unsubscribe, e-mail: users-unsubscr...@httpd.apache.org
For additional commands, e-mail: users-h...@httpd.apache.org



Re: [users@httpd] resources prioritization/scheduler (app vs assets)

2017-01-22 Thread Raphaël
use it's slow, how to avoid
the accumulation of main-script requests blocking requests directed to static
stuff?

Could be seen as the opposite of root's 5% reserved blocks on ext4:
a percentage of ThreadsPerChild which, under high load, is *not*
allocated to the main script?



Thanks!

> 
> On 10/12/2016 14:22, Raphaël wrote:
> >Hi,
> >
> >I've a question on how to prioritize traffic in order to optimize
> >the service in the case of traffic bursts:
> >
> >
> >Context:
> >* a server with finite resources (let's say 1 GB mem)
> >* a PHP application: initial page load needs 100 MB (index.php)
> >* for each page load (index.php) approx:
> >   * ~ 40 subsequent assets (static files) are needed
> >   * serving assets is, obviously, quicker than serving index.php
> >* I assume, and decide, that PHP-FPM must not use more than 700MB
> >* I want to avoid "broken" pages (missing assets/images/...) as much as 
> >possible
> >
> >
> >Thus PHP-FPM is configured to allow no more than 7 children.
> >The Apache MaxRequestWorkers (worker MPM) is set to be strictly greater than
> >7*40 (let's say 350)
> >
> >
> >Now imagine a traffic burst with 200 distinct clients simultaneously
> >hitting the main page (wow!)
> >They now occupy 57% of the Apache workers, 193 of them waiting for a
> >PHP-FPM child. ( "max" default value being ThreadsPerChild)
> >
> >... a few hundred milliseconds later...
> >
> >The first 7 clients having been served, each one now requests 40 more assets.
> >And the situation is then as follows:
> >
> >* 7 hits on index.php were already processed successfully
> >* 7 currently being processed by PHP-FPM (still occupying Apache workers)
> >* 186 Apache workers held by queued hits on /index.php, waiting for PHP-FPM/proxy-fcgi
> >* 7*40 = 280 new hits for assets (subsequent resources needed by the first 7
> >clients)
> >* 157 of them immediately get an available Apache worker and can be
> >  served (157+186+7 == 350)
> >* >>>>>>>  123 assets will NOT get an available worker  <<<<<<< PROBLEM HERE
> >
> >
> >In the "best" case these 123 requests, which should have been served
> >*now*, will end up in the ListenBackLog and wait for the first 157 assets to
> >be served and free their workers.
> >
> >The server effectively works *as if* only 350-200 = 150 workers were
> >available (150 being < 280, the number of workers typically needed
> >for 7 page loads)
> >
> >200 being the (unpredictable/variable) "intensity" of the burst, I would
> >like to know of a better way to handle such a situation.
> >
> >
> >The first idea that comes to mind is service shaping (prioritization/quotas):
> >how to make Apache accept only 1/40 of the traffic to the fcgi php-fpm proxy.
> >Sample heuristic:
> >>If all workers are used (350/350), we "compute" which proportion is
> >>dedicated to index.php. If it's greater than a given configurable
> >>threshold, then free some of the workers dedicated to this resource
> >>in order to accept asset-directed requests.
> >
> >I'm curious about possible solutions.
> >Thank you for reading.




[users@httpd] resources prioritization/scheduler (app vs assets)

2016-12-10 Thread Raphaël
Hi,

I've a question on how to prioritize traffic in order to optimize
the service in the case of traffic bursts:


Context:
* a server with finite resources (let's say 1 GB mem)
* a PHP application: initial page load needs 100 MB (index.php)
* for each page load (index.php) approx:
  * ~ 40 subsequent assets (static files) are needed 
  * serving assets is, obviously, quicker than serving index.php
* I assume, and decide, that PHP-FPM must not use more than 700MB
* I want to avoid "broken" pages (missing assets/images/...) as much as possible


Thus PHP-FPM is configured to allow no more than 7 children.
The Apache MaxRequestWorkers (worker MPM) is set to be strictly greater than
7*40 (let's say 350)


Now imagine a traffic burst with 200 distinct clients simultaneously
hitting the main page (wow!)
They now occupy 57% of the Apache workers, 193 of them waiting for a
PHP-FPM child (the proxy "max" parameter defaulting to ThreadsPerChild).

... a few hundred milliseconds later...

The first 7 clients having been served, each one now requests 40 more assets.
And the situation is then as follows:

* 7 hits on index.php were already processed successfully
* 7 currently being processed by PHP-FPM (still occupying Apache workers)
* 186 Apache workers held by queued hits on /index.php, waiting for PHP-FPM/proxy-fcgi
* 7*40 = 280 new hits for assets (subsequent resources needed by the first 7
  clients)
   * 157 of them immediately get an available Apache worker and can be
     served (157+186+7 == 350)
   * >>>  123 assets will NOT get an available worker  <<< PROBLEM HERE


In the "best" case these 123 requests, which should have been served
*now*, will end up in the ListenBackLog and wait for the first 157 assets to
be served and free their workers.

The server effectively works *as if* only 350-200 = 150 workers were
available (150 being < 280, the number of workers typically needed
for 7 page loads)

200 being the (unpredictable/variable) "intensity" of the burst, I would
like to know of a better way to handle such a situation.
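
For reference, the worker accounting above can be replayed with a few lines of Python. This is only a sketch of the arithmetic in this message, not a model of httpd itself; the function and parameter names are made up for the illustration.

```python
def burst_snapshot(max_workers=350, burst=200, php_children=7, assets_per_page=40):
    """Worker accounting just after the first batch of pages has been served."""
    served = php_children                        # pages already returned to clients
    in_php = min(php_children, burst - served)   # pages currently inside PHP-FPM
    waiting = burst - served - in_php            # workers parked, waiting for a PHP-FPM child
    asset_requests = served * assets_per_page    # follow-up requests from the served clients
    free = max_workers - waiting - in_php        # workers left over for assets
    served_assets = min(free, asset_requests)
    return {
        "waiting_for_php": waiting,
        "asset_requests": asset_requests,
        "assets_served_now": served_assets,
        "assets_backlogged": asset_requests - served_assets,
    }


print(burst_snapshot())
```

With the default values it reproduces the figures above: 186 workers waiting, 280 asset requests, 157 served at once, 123 pushed to the backlog.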


The first idea that comes to mind is service shaping (prioritization/quotas):
how to make Apache accept only 1/40 of the traffic to the fcgi php-fpm proxy.
Sample heuristic:
> If all workers are used (350/350), we "compute" which proportion is
> dedicated to index.php. If it's greater than a given configurable
> threshold, then free some of the workers dedicated to this resource
> in order to accept asset-directed requests.
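
The heuristic could be modelled roughly as below. To be clear, this is not an existing httpd directive or API, just a sketch of the admission rule I have in mind; `admit`, `app_share` and the 25% value are hypothetical.

```python
def admit(kind, busy_app, busy_total, max_workers=350, app_share=0.25):
    """Return True if a new request of the given kind may take a worker.

    kind:      "app" (index.php via PHP-FPM) or "asset" (static file)
    busy_app:  workers currently held by app requests
    app_share: hypothetical cap on the fraction of workers the app may hold
    """
    if busy_total >= max_workers:
        return False                     # no worker left at all
    if kind == "app":
        # the app is refused once it holds its share, even if workers are free
        return busy_app < int(max_workers * app_share)
    return True                          # assets only need a free worker
```

With these numbers the application can never hold more than 87 of the 350 workers, so at least 263 stay available for assets whatever the intensity of the burst.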


I'm curious about possible solutions.
Thank you for reading.




[users@httpd] mod_status: extended + auto (machine-readable) output

2016-10-08 Thread Raphaël
Hi,

I've an Apache server handling various virtual hosts and I'd like to monitor
the distinct activities of all of them without having to parse multiple
accesslog files.


Most monitoring software consumes the output of "mod_status?auto", which is
easy to parse but does not provide the detailed information
available in the HTML mod_status+ExtendedStatus output.

As a consequence, these tools can't monitor the detailed states of the
children and the virtual hosts they serve.
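
To show why "?auto" is so convenient for monitoring tools, here is a minimal sketch of a parser for its "Key: value" lines. The field names are the ones mod_status emits; the sample values and the `parse_auto` helper are made up for the illustration.

```python
def parse_auto(text):
    """Parse the 'Key: value' lines emitted by mod_status ?auto into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:                          # skip lines that are not 'Key: value'
            fields[key.strip()] = value.strip()
    return fields


# Made-up sample values, for illustration only.
SAMPLE = """Total Accesses: 1234
BusyWorkers: 3
IdleWorkers: 47
Scoreboard: _W_K__..."""

status = parse_auto(SAMPLE)
busy = int(status["BusyWorkers"])
```

Nothing comparable is possible for the per-child details (client, vhost, request), which only exist in the HTML tables.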


If it were to be done, would you consider merging a patch to
mod_status in order to provide a machine-readable detailed output?
If yes, would you have specific guidelines/advice about the
implementation?

Eg:
* support of an "?extended" parameter in order to keep the default
  and widely used output of "?auto" as-is. But ap_run_status_hook() allows
  appending anyway?

* Specific format to render the "Server Details" section (separators)

* whether or not to add the "SSL session cache" section
  (ssl_ext_status_hook) and "Proxy LoadBalancer Status" (proxy_status_hook) too?

The alternative to patching mod_status would be writing a
custom/out-of-tree module using ap_run_status_hook() in order
to append to the output. But IMHO having auto+extended fits
mod_status better.


best regards



Note: I don't know what use cases the "NoTable" output format was
  intended for, given that its HTML is neither nice to render in
  a browser nor nice to parse.

Note: from a quick look at the code, "auto" is bound to a
  "short_report" variable, implying that the machine-readable format was
  meant to stay short to begin with.







Re: [users@httpd] getting disk cache to respect removing key from request query string

2016-08-16 Thread Raphaël
On Wed, Aug 10, 2016 at 07:13:05PM +0200, Yann Ylavic wrote:
> On Wed, Aug 10, 2016 at 5:12 PM, Raphaël <raphael.d...@gmail.com> wrote:
> > On Tue, Aug 09, 2016 at 01:03:33AM -0600, Anthony Biacco wrote:
> >> Is there any way i can rewrite the query string so that only the modified
> >> query string is used to create the cache files?
> >
> > https://bz.apache.org/bugzilla/show_bug.cgi?id=21935
> 
> Patch proposed on bugzilla (link above), could you and/or Tony please test it?

Tested using patched 2.4.10 Debian Jessie's sources: it works!
(I posted a vhost sample in BZ)

Note that mod_cache verbose logging still logs the wrong (unparsed) keys.

Thank you very much Yann!
I'll be happy to see this land in the next 2.4
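
To illustrate what the original question asks for (this is not how mod_cache or the Bugzilla patch does it), here is a small sketch that drops selected parameters from a URL before using it as a cache key. `strip_params` and the `sid` parameter are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def strip_params(url, ignored):
    """Drop the named query parameters and return the remaining URL."""
    parts = urlsplit(url)
    kept = [(k, v)
            for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in ignored]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Two requests that differ only by an ignored parameter map to the same key.
key_a = strip_params("http://website:80/index.php?page=2&sid=abc", {"sid"})
key_b = strip_params("http://website:80/index.php?page=2&sid=xyz", {"sid"})
```

Both keys come out as `http://website:80/index.php?page=2`, so only one cache entry would be stored.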




Re: [users@httpd] AW: mod_cache/mod_cache_disk responses missing Content-Type header

2016-06-08 Thread Raphaël
On Mon, Dec 21, 2015 at 11:39:55AM -0500, Eric Covener wrote:
> On Mon, Dec 21, 2015 at 11:25 AM, Alexander Härtig
>  wrote:
> > AH00717: Premature end of cache headers.
> 
> The error reporting is very poor there. Are you able to patch
> mod_cache_disk and run with at least LogLevel debug?
> 
> http://svn.apache.org/viewvc/httpd/httpd/trunk/modules/cache/mod_cache_disk.c?r1=1721210=1721209=1721210


trace8 brings nothing new w.r.t debug (as sent in my previous email)

I found the following behavior interesting :
GET /  => OK  # put in cache
POST / => OK  # invalidate cache
POST / => not OK  # cache_disk:error AH00717

My guess is that it comes from the following patch from 2013:
> mod_cache: Invalidate cached entities in response to RFC2616 Section
https://mail-archives.apache.org/mod_mbox/httpd-cvs/201305.mbox/%3c20130528203005.1a55d2388...@eris.apache.org%3E

* It seems it introduced a CACHE_INVALIDATE filter (for which, sadly, I
  didn't find much documentation)
* I guess it introduced an invalid code path triggering
  invalidate_entity() => recall_headers() for an already invalidated entity.

Maybe this error is just the symptom of trying to invalidate a "removed"
entity twice.

When doing tests I was also easily able to encounter:
> AH02468: cache: Attempted to invalidate cached entity with key: 
> http://:80/index.php?
by just playing a bit with htcacheclean

(side note: htcacheclean -A  is an incorrect usage, but an unexpected 
behavior)


Going further, I guess that the second invalidation shows this issue because
the first invalidation (during the first POST) failed to output a
valid mod_cache file.

Attached: the initial cache header file (GET.txt, taken just after a GET), copied from
/var/cache/apache2/mod_cache_disk/6q/m_/*.header.vary/ou/Ye/*.header
and the same file just after a POST was issued (POST.txt; successive POSTs cause
the error but don't change this file anymore).


I hope it makes things clearer and helps bring about a full
explanation + fix + workaround.


thank you

GET.txt (leading binary bytes stripped):
http://developmentsystem:80/index.php?Vary: Cookie, Cookie
Vary: Cookie, Cookie
Last-Modified: Sun, 03 Apr 2016 16:21:51 GMT
X-Pingback: http://developmentsystem/xmlrpc.php
Cache-Control: public, max-age=300
Link: ; rel="https://api.w.org/", ; rel=shortlink
Content-Type: text/html; charset=UTF-8

Authorization: Basic 
Host: developmentsystem
User-Agent: lwp-request/6.03 libwww-perl/6.08

POST.txt (leading binary bytes stripped):
http://developmentsystem:80/index.php?


Re: [users@httpd] AW: mod_cache/mod_cache_disk responses missing Content-Type header

2016-06-05 Thread Raphaël Droz
On Mon, Dec 21, 2015 at 04:25:46PM +, Alexander Härtig wrote:
> By taking a closer look at the log-files I found a lot error entries with the 
> following message:
> AH00717: Premature end of cache headers.
> I tried to google but I couldn't find anything about what causes these 
> problems. I'm not even sure which headers are meant in this context, cache 
> entry header or request header?

I probably just stumbled upon the very same issue.
The issue arises when a POST is issued on a URL (for which a GET would be
cached).
The first POST seems fine, but the second one isn't.

See log below (Apache 2.4.10-10+deb8u4)
(I'm simply POSTing x=y)


## FIRST POST on /

> [Mon Jun 06 05:13:26.861453 2016] [cache:debug] [pid 25898:tid 
> 13984785122] mod_cache.c(440): [client x.x.x.x:49398] AH02463: 
> PUT/POST/DELETE: Adding CACHE_INVALIDATE filter for /index.php
> [Mon Jun 06 05:13:27.202134 2016] [cache:debug] [pid 25898:tid 
> 13984785122] mod_cache.c(1731): [client x.x.x.x:49398] AH00777: cache: 
> CACHE filter was added twice, or was added where the cache has been bypassed 
> and will be ignored: /
> [Mon Jun 06 05:13:27.202225 2016] [cache:debug] [pid 25898:tid 
> 13984785122] mod_cache.c(1677): [client x.x.x.x:49398] AH02467: cache: 
> Invalidating all cached entities in response to 'POST' request for /index.php
> [Mon Jun 06 05:13:27.202246 2016] [cache:debug] [pid 25898:tid 
> 13984785122] cache_storage.c(664): [client x.x.x.x:49398] AH00698: cache: 
> Key for entity /index.php?(null) is http://mywebsite.test:80/index.php?
> [Mon Jun 06 05:13:27.202356 2016] [cache_disk:debug] [pid 25898:tid 
> 13984785122] mod_cache_disk.c(572): [client x.x.x.x:49398] AH00709: 
> Recalled cached URL info header http://mywebsite.test:80/index.php?
> [Mon Jun 06 05:13:27.202368 2016] [cache_disk:debug] [pid 25898:tid 
> 13984785122] mod_cache_disk.c(885): [client x.x.x.x:49398] AH00720: 
> Recalled headers for URL http://mywebsite.test:80/index.php?
> [Mon Jun 06 05:13:27.202545 2016] [cache_disk:debug] [pid 25898:tid 
> 13984785122] mod_cache_disk.c(1350): [client x.x.x.x:49398] AH00737: 
> commit_entity: Headers and body for URL http://mywebsite.test:80/index.php? 
> cached.
> [Mon Jun 06 05:13:27.202559 2016] [cache:debug] [pid 25898:tid 
> 13984785122] cache_storage.c(752): [client x.x.x.x:49398] AH02468: cache: 
> Attempted to invalidate cached entity with key: 
> http://mywebsite.test:80/index.php?


## SECOND POST on /

> [Mon Jun 06 05:13:40.044414 2016] [cache:debug] [pid 25899:tid 
> 139847842862848] mod_cache.c(440): [client x.x.x.x:49402] AH02463: 
> PUT/POST/DELETE: Adding CACHE_INVALIDATE filter for /index.php
> [Mon Jun 06 05:13:40.283025 2016] [cache:debug] [pid 25899:tid 
> 139847842862848] mod_cache.c(1731): [client x.x.x.x:49402] AH00777: cache: 
> CACHE filter was added twice, or was added where the cache has been bypassed 
> and will be ignored: /
> [Mon Jun 06 05:13:40.283099 2016] [cache:debug] [pid 25899:tid 
> 139847842862848] mod_cache.c(1677): [client x.x.x.x:49402] AH02467: cache: 
> Invalidating all cached entities in response to 'POST' request for /index.php
> [Mon Jun 06 05:13:40.283125 2016] [cache:debug] [pid 25899:tid 
> 139847842862848] cache_storage.c(664): [client x.x.x.x:49402] AH00698: cache: 
> Key for entity /index.php?(null) is http://mywebsite.test:80/index.php?
> [Mon Jun 06 05:13:40.283183 2016] [cache_disk:debug] [pid 25899:tid 
> 139847842862848] mod_cache_disk.c(572): [client x.x.x.x:49402] AH00709: 
> Recalled cached URL info header http://mywebsite.test:80/index.php?
> [Mon Jun 06 05:13:40.283189 2016] [cache_disk:error] [pid 25899:tid 
> 139847842862848] [client x.x.x.x:49402] AH00717: Premature end of cache 
> headers.
> [Mon Jun 06 05:13:40.283201 2016] [cache_disk:error] [pid 25899:tid 
> 139847842862848] [client x.x.x.x:49402] AH00717: Premature end of cache 
> headers.
> [Mon Jun 06 05:13:40.283204 2016] [cache_disk:debug] [pid 25899:tid 
> 139847842862848] mod_cache_disk.c(885): [client x.x.x.x:49402] AH00720: 
> Recalled headers for URL http://mywebsite.test:80/index.php?
> [Mon Jun 06 05:13:40.283304 2016] [cache_disk:debug] [pid 25899:tid 
> 139847842862848] mod_cache_disk.c(1350): [client x.x.x.x:49402] AH00737: 
> commit_entity: Headers and body for URL http://mywebsite.test:80/index.php? 
> cached.
> [Mon Jun 06 05:13:40.283311 2016] [cache:debug] [pid 25899:tid 
> 139847842862848] cache_storage.c(752): [client x.x.x.x:49402] AH02468: cache: 
> Attempted to invalidate cached entity with key: 
> http://mywebsite.test:80/index.php?
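
A quick way to compare the two traces is to list the AH codes of each request and look at what only appears in the second one. A minimal sketch, assuming the two log excerpts are available as plain strings (the sample strings below only keep the codes, in the order they appear in the logs above):

```python
import re

AH_CODE = re.compile(r"\bAH\d{5}\b")


def ah_codes(log_text):
    """Return the AH message codes in order of appearance."""
    return AH_CODE.findall(log_text)


def only_in_second(first_log, second_log):
    """Codes that show up in the second log but never in the first."""
    seen = set(ah_codes(first_log))
    return sorted({code for code in ah_codes(second_log) if code not in seen})


FIRST_POST = "AH02463 AH00777 AH02467 AH00698 AH00709 AH00720 AH00737 AH02468"
SECOND_POST = ("AH02463 AH00777 AH02467 AH00698 AH00709 AH00717 AH00717 "
               "AH00720 AH00737 AH02468")

print(only_in_second(FIRST_POST, SECOND_POST))  # ['AH00717']
```

The only difference is AH00717, logged twice between AH00709 and AH00720, i.e. while the headers of the already invalidated entity are being recalled.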


Here are two sample responses:

### (cached) GET /
> 200 OK
> Cache-Control: public, max-age=300
> Connection: close
> Date: Mon, 06 Jun 2016 03:22:54 GMT
> Age: 6
> Server: Apache/2.4
> Vary: Cookie,Accept-Encoding
> Content-Length: 35658
> Content-Type: text/html; charset=UTF-8
> Last-Modified: Sun, 03 Apr 2016 16:21:51 GMT
> Client-Date: Mon, 06 Jun 2016 03:23:03 GMT
> Client-Peer: x.x.x.x:80

[users@httpd] Re: mod_cache for FallbackResource?

2015-12-22 Thread Raphaël
Any takers?

From another angle, I wanted to see if cache disk could
compete with Varnish, eg:
- Apache + mod_cache_disk + mod_ssl
could be a better stack than
- Apache + Varnish + Pound.

So far, I'm under the impression that managing a reverse-caching proxy
with mod_cache is, if even realistically possible, far more complex
and less powerful than Varnish.
That's pretty hard to believe since, being an Apache module, mod_cache
theoretically benefits from better integration with, and more "knowledge"
of, the backend HTTPd.

As an example, caching dynamic resources having different query strings
is a non-issue using Varnish (or most other reverse-proxy caches).
What makes mod_cache so specific in this regard?



On Tue, Nov 10, 2015 at 10:58:28AM -0300, Raphaël wrote:
> Hi,
> 
> using php/fcgi, I've a Content Management System whose entry-point is 
> /index.php
> On the Apache-side it makes use of FallbackResource
> 
> According to the documentation:
> > As a filter, mod_cache can be placed in front of content originating
> > from any handler, including flat files (served from a slow disk cached
> > on a fast disk), the output of a CGI script or dynamic content
> > generator, or content proxied from another server.
> 
> 
> I want to benefit from this fine grained control and configure it as:
> 
> > 
> > ServerName website
> > DocumentRoot "/var/www/website"
> > 
> > Require all granted
> > AllowOverride None
> > 
> > SetHandler "proxy:unix:/var/run/php5-fpm-website.sock|fcgi://blah"
> > 
> > FallbackResource /index.php
> > 
> > 
> > 
> > 
> > 
> > CacheHeader On
> > CacheDetailHeader On
> > CacheQuickHandler Off
> > 
> > 
> > CacheEnable disk
> > 
> > 
> > 
> > CacheDisable on
> > 
> > 
> 
> 
> A minimal sample index.php file inside /var/www/website:
> > <?php
> > $uri = trim($_SERVER['REQUEST_URI'], '/');
> > if ($uri == 'cacheit')       header('Cache-Control: max-age=30');
> > elseif ($uri == 'dontcache') header('Cache-Control: no-cache');
> 
> 
> There are multiple issues, whatever syntax/order variation is used, such as
> > `CacheEnable disk /`
> at the VirtualHost level.
> 
> 
> The first one is that the <Location> directives are *not* taken into
> account.
> A sample of mod_cache debug output, when a `GET /dontcache` is issued soon
> after `GET /cacheit` (and results in a cached output):
> > cache_storage.c(664): AH00698: cache: Key for entity /index.php?(null) is 
> > http://website:80/index.php?
> > mod_cache_disk.c(572): AH00709: Recalled cached URL info header 
> > http://website:80/index.php?
> > mod_cache_disk.c(885): AH00720: Recalled headers for URL 
> > http://website:80/index.php?
> > mod_cache.c(601): AH00761: Replacing CACHE with CACHE_OUT filter for 
> > /index.php
> > mod_cache.c(652): AH00763: cache: running CACHE_OUT filter
> > mod_cache.c(681): AH00764: cache: serving /index.php
> 
> Indeed htcacheclean -A only shows one unique version of stored, keyed 
> "index.php"
> 
> 
> I made some attempts using `CacheQuickHandler On` and was able to get
> distinct cache entries for /cacheit and /dontcache.
> That was good, but it does not solve the issue of <Location> not being
> taken into account (and the CacheEnable flag not being respected):
> 
> Eg:
> > <Location /private>
> >   CacheDisable on
> > </Location>
> is not respected (wild guess: because /private is not a "real" resource)
> 
> 
> Moreover, I'd rather stick with the normal cache handler, since I hope it would
> make it possible to insert Header/RequestHeader directives, and I also expect to
> use things like:
> > SetEnvIfNoCase Cookie admin_cookie no-cache
> that the CacheQuickHandler would not honor.
> 
> 
> Question:
> Is there any way to make CacheEnable work on a granular <Location> basis
> when a FallbackResource is used, such that the <Location> parameter is matched
> against the original un-rewritten URL?
> 
> 
> Thank you!

-- 
GPG id: 0xF41572CEBD4218F4




[users@httpd] Re: graphing Apache directives order

2015-12-22 Thread Raphaël
Am I the only one whom such a (or a similar) graphical representation would
greatly help?
To begin with, any pointer to documentation of the ordering of the 10 most
used modules' directives, and of the processing order of the 10 most used
directives, would help.



On Thu, Nov 19, 2015 at 11:23:51PM -0300, Raphaël D wrote:
> Hi,
> 
> I'm trying to figure out what can affect Apache mod_cache and when
> (= where should it be placed).
> 
> 
> I made a graph (attachment) in an attempt to start writing that down,
> but it's still (very) far from correct. Well, actually, I'm not even
> sure that's the best way to represent the issue.
> (in green: directives  *setting* the "no-cache" environment variable
>  in yellow: directives *using*   the "no-cache" environment variable)
> 
> 
> I started from (so typical?) a use case where I wanted:
> 
> - to avoid cache based on raw URL:
> > SetEnvIf Request_URI ^/content no-cache #  context
> - to avoid cache based on cookie:
> > SetEnvIf Cookie /post/ no-cache 
> - however, cache some even with cookies
> >  UnsetEnv no-cache 
> - but avoid caching some location completely, while in a 
> >  CacheDisable on 
> - but cache by default
> >  CacheEnable disk 
> - taking into account that:
> > FallbackResource /index.php #  context
> 
> With 3 contexts involving so many directives from various
> modules, I'm obviously unable to get it right without in-depth
> documentation of the ordering and merging rules.
> 
> 
> Anyway, the "no-cache" variable is a nice example, and the environment seems
> to be the one means by which an admin can make modules communicate with
> each other in the Apache configuration.
> 
> 
> The graph is based on the section-merge documentation
> (/...//); however, the Apache documentation clearly
> states that the merging policy is module-specific.
> So what could a better graphical representation look like?
> 
> 
> 
> Actually, I found that even the combination of two simple things like:
> > SetEnvIf Request_URI 
> +
> > RewriteRule . /foo
> makes Request_URI match /foo in every configuration I tried.
> 
> 
> I also suffered from the fact that the URI argument given to CacheEnable
> (CacheQuickHandler=off) is evaluated *after* FallbackResource is run.
> I sent a (still unanswered) question about this here earlier this month.
> 
> 
> It seems to imply that RewriteRule always precedes  too.
> Which I found surprising, since a FallbackResource inside a 
> will affect a -based CacheEnable directive...
> 
> 
> I expect many more "strange" things could bite hard unless the processing
> order is made clear.
> 
> 
> 
> thank you!




[users@httpd] graphing Apache directives order

2015-11-19 Thread Raphaël D
Hi,

I'm trying to figure out what can affect Apache mod_cache and when
(= where should it be placed).


I made a graph (attachment) in an attempt to start writing that down,
but it's still (very) far from correct. Well, actually, I'm not even
sure that's the best way to represent the issue.
(in green: directives  *setting* the "no-cache" environment variable
 in yellow: directives *using*   the "no-cache" environment variable)


I started from (so typical?) a use case where I wanted:

- to avoid cache based on raw URL:
> SetEnvIf Request_URI ^/content no-cache #  context
- to avoid cache based on cookie:
> SetEnvIf Cookie /post/ no-cache 
- however, cache some even with cookies
>  UnsetEnv no-cache 
- but avoid caching some location completely, while in a 
>  CacheDisable on 
- but cache by default
>  CacheEnable disk 
- taking into account that:
> FallbackResource /index.php #  context

With 3 contexts involving so many directives from various
modules, I'm obviously unable to get it right without in-depth
documentation of the ordering and merging rules.


Anyway, the "no-cache" variable is a nice example, and the environment seems
to be the one means by which an admin can make modules communicate with
each other in the Apache configuration.


The graph is based on the section-merge documentation
(/...//); however, the Apache documentation clearly
states that the merging policy is module-specific.
So what could a better graphical representation look like?



Actually, I found that even the combination of two simple things like:
> SetEnvIf Request_URI 
+
> RewriteRule . /foo
makes Request_URI match /foo in every configuration I tried.


I also suffered from the fact that the URI argument given to CacheEnable
(CacheQuickHandler=off) is evaluated *after* FallbackResource is run.
I sent a (still unanswered) question about this here earlier this month.


It seems to imply that RewriteRule always precedes  too.
Which I found surprising, since a FallbackResource inside a 
will affect a -based CacheEnable directive...


I expect many more "strange" things could bite hard unless the processing
order is made clear.



thank you!


apache-req-proc.dia
Description: Binary data


[users@httpd] mod_cache for FallbackResource?

2015-11-10 Thread Raphaël
Hi,

using php/fcgi, I've a Content Management System whose entry-point is /index.php
On the Apache-side it makes use of FallbackResource

According to the documentation:
> As a filter, mod_cache can be placed in front of content originating
> from any handler, including flat files (served from a slow disk cached
> on a fast disk), the output of a CGI script or dynamic content
> generator, or content proxied from another server.


I want to benefit from this fine grained control and configure it as:

> 
> ServerName website
> DocumentRoot "/var/www/website"
> 
>   Require all granted
>   AllowOverride None
>   
>   SetHandler "proxy:unix:/var/run/php5-fpm-website.sock|fcgi://blah"
>   
>   FallbackResource /index.php
> 
> 
> 
> 
> 
> CacheHeader On
> CacheDetailHeader On
> CacheQuickHandler Off
> 
> 
>   CacheEnable disk
> 
> 
> 
>   CacheDisable on
> 
> 


A minimal sample index.php file inside /var/www/website:
> <?php
> $uri = trim($_SERVER['REQUEST_URI'], '/');
> if ($uri == 'cacheit')       header('Cache-Control: max-age=30');
> elseif ($uri == 'dontcache') header('Cache-Control: no-cache');


There are multiple issues, whatever syntax/order variation is used, such as
> `CacheEnable disk /`
at the VirtualHost level.


The first one is that the <Location> directives are *not* taken into
account.
A sample of mod_cache debug output, when a `GET /dontcache` is issued soon
after `GET /cacheit` (and results in a cached output):
> cache_storage.c(664): AH00698: cache: Key for entity /index.php?(null) is 
> http://website:80/index.php?
> mod_cache_disk.c(572): AH00709: Recalled cached URL info header 
> http://website:80/index.php?
> mod_cache_disk.c(885): AH00720: Recalled headers for URL 
> http://website:80/index.php?
> mod_cache.c(601): AH00761: Replacing CACHE with CACHE_OUT filter for 
> /index.php
> mod_cache.c(652): AH00763: cache: running CACHE_OUT filter
> mod_cache.c(681): AH00764: cache: serving /index.php

Indeed htcacheclean -A only shows one unique version of stored, keyed 
"index.php"


I made some attempts using `CacheQuickHandler On` and was able to get
distinct cache entries for /cacheit and /dontcache.
That was good, but it does not solve the issue of <Location> not being
taken into account (and the CacheEnable flag not being respected):

Eg:
> <Location /private>
>   CacheDisable on
> </Location>
is not respected (wild guess: because /private is not a "real" resource)


Moreover, I'd rather stick with the normal cache handler, since I hope it would
make it possible to insert Header/RequestHeader directives, and I also expect to
use things like:
> SetEnvIfNoCase Cookie admin_cookie no-cache
that the CacheQuickHandler would not honor.


Question:
Is there any way to make CacheEnable work on a granular <Location> basis
when a FallbackResource is used, such that the <Location> parameter is matched
against the original un-rewritten URL?


Thank you!




[users@httpd] Re: avoid Vary: User-Agent when ?

2015-11-07 Thread Raphaël
On Mon, Nov 02, 2015 at 10:04:45PM -0300, Raphaël wrote:
> Hi,
> 
> still related to the previous post, but put simply:
> I'd just appreciate to learn how to keep Apache from automatically
> adding "User-Agent" to the "Vary" header.

This was a user error:

This line:
> 
obviously caused the `Vary:` HTTP header to contain "User-Agent".

I still have to find how to keep the "new" syntax without the cost of
the header modification.




[users@httpd] Re: avoid Vary: User-Agent when ?

2015-11-07 Thread Raphaël
On Sun, Nov 08, 2015 at 12:04:33AM -0300, Raphaël wrote:
> This line:
> > 
> obviously caused the `Vary:` HTTP header to contain "User-Agent".
> 
> I still have to find how to keep the "new" syntax without the cost of
> the header modification.

which is clearly stated in the documentation. Thus:
> 

end of the story, sorry for the noise.




[users@httpd] Re: mod_cache_disk: how to avoid Vary: User-Agent?

2015-11-02 Thread Raphaël
Hi,

still related to the previous post, but put simply:
I'd just like to learn how to keep Apache from automatically
adding "User-Agent" to the "Vary" header.


thank you in advance



On Thu, Oct 22, 2015 at 11:51:31PM -0300, Raphaël Droz wrote:
> I recently got dozens of AH00708 errors in my logs.
> Example:
> > (2)No such file or directory: AH00708: Cannot open data file 
> > /var/cache/apache2/mod_cache_disk/Y/Z/CLVvRR_4nNWabdUv_5wA.header.vary/B/z/b...@34pw3mjoqhrzda2kq.data
> 
> Said file does not exist; only the corresponding .header file does.
> The cache is 455 MB. I don't know what could have happened, and I
> enabled the debug loglevel for mod_cache_disk for the next few days in order to
> understand why the data file isn't created.
> 
> ... but, looking at the header file I found the following line rather
> strange:
> [...]
> > Content-Encoding: gzip
> > Vary: User-Agent, User-Agent, User-Agent, User-Agent
> > Accept-Ranges: byte
> [...]
> 
> - I've no occurrence of Vary in either /var/www or /etc/apache2
> - searching Google returned a couple of similar results for this Vary: string
> 
> The origin is probably the BrowserMatch directives, which are enabled
> by default in Debian's mods-available/setenvif.conf.
> (and mod_ssl depends upon mod_setenvif)
> 
> This is how I realized that the cache, coupled with setenvif in its
> default configuration, was *very* inefficient (disk space).
> `Vary: User-Agent` is something we would rather avoid given the
> ridiculously huge number of variations of this string nowadays.
> From the last quick grep I had 325 of them, so avoiding it could reduce my
> cache's size to dozens of MB and proportionally increase its efficiency.
> 
> 
> First, not really a question, but a couple of suggestions:
> - BrowserMatch should not blindly add to the Vary header but only add
>   the value if it is not already present (isn't that one of the
>   mod_headers capabilities in the Apache 2.4 series?)
> - mod_cache_disk documentation could state whether this duplication hurts or 
> not
> - the default setenvif.conf may avoid use of "exotic"/"rare" default
>   BrowserMatch directives in case of 
> 
> 
> 
> ... but then, following the objective to avoid a Vary: User-Agent, I
> found that I had two other BrowserMatch in default-ssl.conf.
> The second one about MSIE 7 to 10, in order to set ssl-unclean-shutdown.
> - I didn't find where it is documented what this variable does internally
> - I guess this variable only affects mod_ssl behavior and changes nothing
>   about the request headers/data
> If the above is right, then is it pertinent to send Vary: User-Agent
> when the content can't change according to the User-Agent?
> Maybe BrowserMatch could be more flexible about that.
> 
> Anyway, I commented out these two mod_ssl BrowserMatch directives
> (assuming I do not care about broken? MSIE behaviors), but I'm still
> Vary'ing: User-Agent.
> 
> I commented out the last SetEnvIfExpr my configuration contained
> (unrelated to browsers anyway) and ran egrep -ri '(Browser|SetEnvIf)',
> but still... Vary: User-Agent
> https://github.com/apache/httpd/search?l=c=User-Agent=Code=%E2%9C%93
> didn't bring up another clear source for this.
> 
> Any hint about where this Vary's value could come from?
> 
> 
> thank you
> 

-- 
GPG id: 0xF41572CEBD4218F4




[users@httpd] mod_cache_disk: .header and strange Vary: User-Agent

2015-10-22 Thread Raphaël Droz
I recently received dozens of AH00708 errors in my logs.
Example:
> (2)No such file or directory: AH00708: Cannot open data file 
> /var/cache/apache2/mod_cache_disk/Y/Z/CLVvRR_4nNWabdUv_5wA.header.vary/B/z/b...@34pw3mjoqhrzda2kq.data

Said file does not exist; only the corresponding .header file does.
The cache is 455 MB. I don't know what could have happened, so I
enabled the debug log level for mod_cache_disk for the next few days
in order to understand why the data file isn't created.
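
(A minimal sketch of that setting, assuming Apache 2.4 per-module log
levels; the base level "warn" is an assumption, and "cache_disk" is the
module identifier with its _module suffix dropped.)

```apache
# Keep the global level at warn (assumed) and raise only mod_cache_disk
# to debug, so the rest of the error log stays readable.
LogLevel warn cache_disk:debug
```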

... but, looking at the header file I found the following line rather
strange:
[...]
> Content-Encoding: gzip
> Vary: User-Agent, User-Agent, User-Agent, User-Agent
> Accept-Ranges: byte
[...]

- I have no occurrence of Vary in either /var/www or /etc/apache2
- searching Google returned a couple of similar reports of this Vary: string

The origin is probably the BrowserMatch directives, which are enabled by
default in Debian's mods-available/setenvif.conf
(and mod_ssl depends upon mod_setenvif).

This is how I realized that the cache coupled with setenvif in its
default configuration was *very* inefficient (disk-space wise).
`Vary: User-Agent` is something we would rather avoid given the
ridiculously huge number of combinations for this string nowadays.
From the last quick grep, I had 325 of them, so avoiding it could reduce
my cache's size to dozens of MB and proportionally increase its
efficiency.


First, not really a question, but a couple of suggestions:
- BrowserMatch should not blindly add to the Vary header but only add
  the value if it is not already present (isn't that one of the
  mod_headers capabilities in the Apache 2.4 series?)
- mod_cache_disk documentation could state whether this duplication hurts or not
- the default setenvif.conf may avoid use of "exotic"/"rare" default
  BrowserMatch directives in case of 
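
(For the "add only if absent" behavior, a sketch of what mod_headers
offers in 2.4, as far as I understand its documentation: the "merge"
action appends a value unless it is already in the comma-separated list.
Accept-Encoding is only an example value, and this says nothing about
how Vary: User-Agent got added in the first place.)

```apache
# Requires mod_headers.
# Appends "Accept-Encoding" to Vary only when it is not already listed,
# so repeating the line does not produce duplicated values.
Header merge Vary Accept-Encoding
Header merge Vary Accept-Encoding
```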



... but then, following the objective to avoid a Vary: User-Agent, I
found that I had two other BrowserMatch in default-ssl.conf.
The second one about MSIE 7 to 10, in order to set ssl-unclean-shutdown.
- I didn't find where it is documented what this variable does internally
- I guess this variable only affects mod_ssl behavior and changes nothing
  about the request headers/data
If the above is right, then is it pertinent to send Vary: User-Agent
when the content can't change according to the User-Agent?
Maybe BrowserMatch could be more flexible about that.

Anyway, I commented out these two mod_ssl BrowserMatch directives
(assuming I do not care about broken? MSIE behaviors), but I'm still
Vary'ing: User-Agent.

I commented out the last SetEnvIfExpr my configuration contained
(unrelated to browsers anyway) and ran egrep -ri '(Browser|SetEnvIf)',
but still... Vary: User-Agent
https://github.com/apache/httpd/search?l=c=User-Agent=Code=%E2%9C%93
didn't bring up another clear source for this.
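
(One way to narrow this down, sketched under assumptions: log the Vary
response header next to each request so the affected URLs stand out. The
format name "varydebug" and the log file name are hypothetical;
${APACHE_LOG_DIR} is the variable Debian's default configuration uses.)

```apache
# %{Vary}o logs the Vary response header, %{User-Agent}i the request header.
LogFormat "%h %t \"%r\" %>s vary=\"%{Vary}o\" ua=\"%{User-Agent}i\"" varydebug
CustomLog ${APACHE_LOG_DIR}/vary-debug.log varydebug
```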

Any hint about where this Vary's value could come from?


thank you

