Re: MySQL Virtual Host and Traffic Module

2009-03-19 Thread Dave Ingram

Hi Vaughan,

> Thanks for the response. I haven't thought of doing the SQL query the way
> you suggested, however I agree that it will cause unnecessary load on busy
> servers and I would like to keep this as efficient as possible.
>
> The second option sounds more reasonable. I have already used threading to
> make a function which ticks on a configurable interval so I suppose each
> child process would dump data for each of its vhosts at this interval, using
> a query similar to what you have suggested.
  
I think that's probably the most sensible approach. It does mean that
you won't have up-to-the-moment statistics, and I would guess that you'd
have to experiment with different intervals as the number of hosts grows
in order for it to scale. You may also want to stagger the updates so
they don't all happen at once. It may also be advisable to perform a
plain UPDATE rather than an INSERT ... ON DUPLICATE KEY UPDATE once your
module knows that there is a row that can be updated (i.e. after the
query has run once in the simple case, or once this day/hour/etc. in the
complex case).
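
The staggering idea can be sketched in plain C, assuming each child has
some integer id (for example its slot number or pid) and flushes every
`interval` seconds. The name `next_flush_time` and its parameters are
hypothetical and not part of any Apache or APR API; the sketch also assumes
`interval` is positive and `now` is non-negative.

```c
#include <stdint.h>

/* Hypothetical helper: pick the next flush time (in seconds) for one child.
 * Each child gets a fixed offset inside the interval, derived from its id,
 * so children with different offsets do not flush in the same second. */
int64_t next_flush_time(int64_t now, int64_t interval, uint32_t child_id)
{
    int64_t offset = (int64_t)(child_id % (uint64_t)interval);
    int64_t base = now - (now % interval);   /* start of the current window */
    int64_t candidate = base + offset;

    if (candidate <= now)
        candidate += interval;               /* slot already passed: next window */
    return candidate;
}
```

With a 30-second interval and the current time at 100, child 7 would next
flush at 127 and child 0 at 120. Children whose ids are congruent modulo the
interval still share a slot, so this only spreads the load; it does not
guarantee exclusive slots.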



> I think I might go with the second option for the time being and see how it
> goes but I am still interested to know if there is a way to store per vhost
> data across children?
  
I would be interested to know how things turn out, and I'd be interested 
to see the final module. I've been thinking about writing a custom 
bandwidth monitoring/limiting module myself, but if I don't need to 
reinvent the wheel...


I'm afraid I can't answer this question in a definite way, though. One 
module that should store per-vhost data like this is mod_cband 
http://sourceforge.net/projects/cband/, so that might be worth looking 
into.


As a side note, I'd be interested to know how you create/template the 
virtual hosts. I myself have written a database-backed templating module 
that could be used for virtual hosting 
(http://www.dmi.me.uk/code/apache/mod_sqltemplate/) and I'm curious to 
see other approaches.


Thanks,


Dave



> Thanks,
> Vaughan

-----Original Message-----
From: Dave Ingram [mailto:d...@dmi.me.uk] 
Sent: Thursday, 19 March 2009 12:28 AM

To: modules-dev@httpd.apache.org
Subject: Re: MySQL Virtual Host and Traffic Module

Vaughan,

  

> What I have so far are 2 filters which gather the inbound traffic and
> outbound traffic for each transaction. These work ok and when logging
> transactions to file all of the in/out byte amounts appear to be correct.
> The first problem however, is that each child has its own set of memory
> and therefore keeps its own totals per virtual host. This also means that
> multiple logging events occur for each transaction. I could just log this
> all to database but it would 1) be inefficient and 2) cause the size of
> the database to grow quite quickly.
  



It sounds to me like you could go two ways with this. I don't know the
format of your database table, but it should be possible to update it
atomically using something like:

INSERT INTO bandwidth (vhost_id, bw_in, bw_out)
  VALUES (42, 1124, 5023409)
  ON DUPLICATE KEY UPDATE bw_in = bw_in + 1124, bw_out = bw_out + 5023409

but that could lead to a lot of load. Another way might be for each
child to collect statistics and only flush to the database periodically,
say every 30 seconds (perhaps configurable on a per-vhost basis, so that
load-heavy sites could have larger update intervals). It would still be
possible to use the query above though.

This query could probably even be updated to split statistics on a
date/time basis, if you require more granular reporting.
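
The "collect in the child, flush periodically" idea could look roughly like
the following C sketch. The struct `vhost_counters` and both functions are
hypothetical names; the table and column names are taken from the query
above. It only formats the statement text and leaves the actual database
call out. Because every value is an integer, plain formatting is safe here,
although a real module would probably prefer a prepared statement.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical per-vhost counters a child keeps between flushes. */
typedef struct {
    unsigned int vhost_id;
    unsigned long long bw_in;
    unsigned long long bw_out;
} vhost_counters;

/* Called once per request with the byte counts seen by the filters. */
void counters_add(vhost_counters *c, unsigned long long in,
                  unsigned long long out)
{
    c->bw_in += in;
    c->bw_out += out;
}

/* Format the upsert for the accumulated totals into buf.
 * Returns the statement length, or -1 if buf is too small.
 * The counters are reset only when formatting succeeded. */
int counters_format_flush(vhost_counters *c, char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen,
        "INSERT INTO bandwidth (vhost_id, bw_in, bw_out) "
        "VALUES (%u, %llu, %llu) "
        "ON DUPLICATE KEY UPDATE bw_in = bw_in + %llu, bw_out = bw_out + %llu",
        c->vhost_id, c->bw_in, c->bw_out, c->bw_in, c->bw_out);

    if (n < 0 || (size_t)n >= buflen)
        return -1;
    c->bw_in = 0;
    c->bw_out = 0;
    return n;
}
```

Each child would call `counters_add` per transaction and
`counters_format_flush` on its timer tick, so the database sees one
statement per vhost per interval instead of one per request.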

Or have I missed/misunderstood something?


Dave

  




Re: child_init for threads?

2009-03-19 Thread Saju Pillai

Andrej van der Zee wrote:

> Hi,
>
> Thanks, that helps!
>
> Since I am developing my modules in C++, I think I should be using this one:
>
> apr_status_t apr_thread_data_set(void *data,
>                                  const char *key,
>                                  apr_status_t (*cleanup)(void *),
>                                  apr_thread_t *thread)



Note that there is a difference between the apr_thread_data_*() functions
and the apr_threadkey_private_*() functions. The apr_thread_data_*()
functions actually associate your data with the thread's pool. Since each
new APR thread has its own private pool, everything should work fine.


The apr_threadkey_private_*() functions use the underlying thread
library's TLS routines.


While apr_thread_data_*() may work for you, true TLS is only supplied by
the apr_threadkey_private_*() routines.




> If I understand correctly, I can pass my own cleanup function (calling
> delete to free memory) which is called automatically when the thread
> is destroyed by the APR framework. Would that be the way, or am I
> still misunderstanding something?


This is correct. cleanup is a pointer to a function that gets passed a
pointer to your data. You can choose to deallocate your data within this
function.
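
To illustrate the "cleanup runs when the thread goes away" pattern without
depending on APR, here is a minimal sketch using POSIX threads directly,
which is one of the thread libraries whose TLS routines are referred to
above. All names (`thread_state`, `state_cleanup`, `run_worker_once`) are
hypothetical, and the global `cleanups_run` counter exists only so the
effect can be observed.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t state_key;
static pthread_once_t state_once = PTHREAD_ONCE_INIT;
static int cleanups_run = 0;   /* demonstration only; read after join */

/* Destructor: called by the thread library with the thread's own data
 * when a thread that stored a non-NULL value exits. */
static void state_cleanup(void *data)
{
    free(data);
    cleanups_run++;
}

static void state_key_init(void)
{
    pthread_key_create(&state_key, state_cleanup);
}

/* Return this thread's private int, allocating it on first use. */
int *thread_state(void)
{
    pthread_once(&state_once, state_key_init);
    int *s = pthread_getspecific(state_key);
    if (s == NULL) {
        s = calloc(1, sizeof *s);
        if (s != NULL)
            pthread_setspecific(state_key, s);
    }
    return s;
}

static void *worker(void *arg)
{
    int *s = thread_state();
    if (s != NULL)
        *s = *(int *)arg;
    return NULL;               /* thread exit triggers state_cleanup */
}

/* Run one short-lived thread; returns the number of cleanups so far. */
int run_worker_once(int value)
{
    pthread_t t;
    if (pthread_create(&t, NULL, worker, &value) != 0)
        return -1;
    pthread_join(t, NULL);
    return cleanups_run;
}
```

In C++ the destructor would call `delete` on the object instead of `free`.
On older toolchains this may need to be built with `-pthread`.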



srp
--
http://saju.net.in


Re: child_init for threads?

2009-03-19 Thread Andrej van der Zee
Hi,


> Note that there is a difference between the apr_thread_data_*() methods and
> the apr_threadkey_private_*() methods. The apr_thread_data_*() methods
> actually associate your data with the thread-pool. Since each new apr
> thread has its own private thread-pool everything should work fine.
>
> The apr_threadkey_private_*() methods use the underlying thread library's
> TLS routines.
>
> While apr_thread_data_* may work for you, true TLS is only supplied by
> apr_threadkey_private_* routines.


Thanks, that's clear! In my case the apr_thread_data_*() approach should
suffice, I hope. But I realize now that it is difficult or impossible to
get the apr_thread_t for the current thread in an Apache module (for
threaded MPMs). Is it possible to do this somehow? For example, does
this work:

apr_os_thread_t os_thd = apr_os_thread_current();
apr_thread_t *apr_thd = NULL;
apr_status_t rv = apr_os_thread_put(&apr_thd, &os_thd, r->pool);

And what would this do in an MPM prefork?

Or maybe there is some better technique?

Thank you!
Andrej


Re: child_init for threads?

2009-03-19 Thread Andrej van der Zee
Hi,



> apr_os_thread_t os_thd = apr_os_thread_current();
> apr_thread_t *apr_thd = NULL;
> apr_status_t rv = apr_os_thread_put(&apr_thd, &os_thd, r->pool);


Okay, this clearly doesn't work. I misunderstood the documentation,
but the sources don't lie. Anyway, how can I get the apr_thread_t for
the current thread in a module for MPM worker?

Cheers,
Andrej


MaxRequestsPerChild for MPM worker threads?

2009-03-19 Thread Andrej van der Zee
Hi,

Is there a way to make an MPM worker thread exit after serving X requests? I
need to run some tests that require checking whether my custom hook
ap_hook_thread_exit() is called properly. I found a directive to have one
thread per child, which gets me in the right direction (ThreadsPerChild 1).
Now I need one that sets the maximum number of requests served by a thread,
to get a fast thread_exit(). I tried MaxRequestsPerChild but it does not seem
to do what I want. BTW, I start Apache with /usr/local/apache2/bin/httpd -X.

What option(s) could I use to force worker threads to exit after serving a
few requests?

Thank you,
Andrej


rewrite before caching?

2009-03-19 Thread Anthony J. Biacco
I posted this on the users list, but didn't get any help, so I'm hoping
the dev people here can either help or at least explain if what I'm
seeing below is how it's meant to work.

I have a URI with cachebusting in it that looks like this:
/path?killCache=x&parameter1=y&parameter2=z
where x is a Unix timestamp used to make sure the URL isn't cached by
the browser.

Now though I want to use apache disk caching on this url. Obviously with
the killCache parameters always changing though, this is futile. And I
can't ignore the query string with CacheIgnoreQueryString because
parameter1 and parameter2 are important to the output to the end-user,
so the cache would be inconsistent if I had caching ignore the query
string.

So I decided to rewrite the url to remove the killCache parameter, and
while this rule itself works fine (I'm rewriting to a path not a full
url), it seems the cache check is done before the rewrite.

My rule is:

RewriteCond %{QUERY_STRING} ^killCache=(.+)\&parameter1=(.+)\&parameter2=(.+)
RewriteRule ^/path /path?parameter1=%2&parameter2=%3 [L]
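
What the rule is meant to achieve, namely a query string that is identical
for every killCache value, can be expressed as a small standalone C
function. This is only a sketch of the normalisation itself, assuming the
parameters are separated by `&` as in an ordinary query string; it is not
how mod_rewrite or mod_cache work internally, and `strip_query_param` is a
hypothetical name.

```c
#include <string.h>
#include <stddef.h>

/* Copy `query` into `out`, leaving out every "name=value" pair (or bare
 * "name") whose name equals `name`. Returns 0 on success, -1 if `out`
 * is too small. */
int strip_query_param(const char *query, const char *name,
                      char *out, size_t outlen)
{
    size_t namelen = strlen(name);
    size_t used = 0;
    const char *p = query;

    if (outlen == 0)
        return -1;
    out[0] = '\0';

    while (*p != '\0') {
        const char *end = strchr(p, '&');
        size_t len = end ? (size_t)(end - p) : strlen(p);
        int is_target = len >= namelen
            && strncmp(p, name, namelen) == 0
            && (len == namelen || p[namelen] == '=');

        if (!is_target && len > 0) {
            size_t need = len + (used > 0 ? 1 : 0);
            if (used + need + 1 > outlen)
                return -1;
            if (used > 0)
                out[used++] = '&';      /* separator between kept pairs */
            memcpy(out + used, p, len);
            used += len;
            out[used] = '\0';
        }
        p = end ? end + 1 : p + len;
    }
    return 0;
}
```

Unlike the regular expression in the rule, this does not depend on
killCache being the first parameter, and it leaves a parameter such as
killCacheX alone.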

And in the log I see:

[Wed Mar 18 12:02:47 2009] [debug] mod_cache.c(131): Adding CACHE_SAVE filter for /path
[Wed Mar 18 12:02:47 2009] [debug] mod_cache.c(138): Adding CACHE_REMOVE_URL filter for /path
[Wed Mar 18 12:02:47 2009] [debug] mod_headers.c(740): headers: ap_headers_output_filter()
mod_cache.c(639): cache: Caching url: /path?killCache=x&parameter1=y&parameter2=z
[Wed Mar 18 12:02:47 2009] [debug] mod_cache.c(645): cache: Removing CACHE_REMOVE_URL filter.
[Wed Mar 18 12:02:47 2009] [debug] mod_disk_cache.c(962): disk_cache: Stored headers for URL http://myhost/path?killCache=x&parameter1=y&parameter2=z
[Wed Mar 18 12:02:47 2009] [debug] mod_disk_cache.c(1051): disk_cache: Body for URL http://myhost/path?killCache=x&parameter1=y&parameter2=z cached.

And I get a different cache file stored for every value of the parameter
killCache.

Does anybody know a way to get the RewriteRule processed before the cache
mechanism? I can rewrite to a full URL and the new request will then
pull from cache, but that induces an extra request and an extra log
entry, which I'd like to avoid if at all possible.

Thanx,

-Tony
---
Manager, IT Operations
Format Dynamics, Inc.
303-573-1800x27
abia...@formatdynamics.com
http://www.formatdynamics.com



RE: rewrite before caching?

2009-03-19 Thread Houser, Rick
Sounds like a badly broken application to me.  If the data is truly
cacheable, the application shouldn't be taking explicit steps to try to
prevent just that.  Depending on what the backend system is, you might
be better off using some kind of a filter to just remove that killCache
parameter in the body text before it gets to the client.  Then, you
don't have to worry about it at all.



Thanks,

Rick Houser
Auto-Owners Insurance
Systems Support
(517)703-2580






mod_substitute \n

2009-03-19 Thread Nick Gearls

I found a problem with the handling of newlines in mod_substitute.
Take the following file as an example:
<html>
<body>
</body>
</html>


1. If I use Substitute  s/\n/1/, it works almost correctly:
<html>
1<body>
1</body>
1</html>
1
Note that it does not replace the newline, but adds the replacement
after it. This is quite weird.


2. If I use Substitute s/body\n/body2/ or
   Substitute s/\nbody/body2/, the file is unchanged.


Could somebody explain how newlines are handled?
Can we use them inside a pattern?

Thanks,


Nick



Re: mod_substitute \n

2009-03-19 Thread Plüm, Rüdiger, VF-Group
Using newlines in the expression does not make sense, as
mod_substitute (similar to sed) applies the regular expression
line by line. You cannot process multiline regexes with mod_substitute.
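
The line-by-line model can be illustrated with a small C sketch. It is a
simplified stand-in, not mod_substitute's actual code: it does a literal
match instead of a regular expression, replaces only the first hit per
line, and the name `substitute_per_line` is hypothetical. The point is that
the pattern is only ever compared against the text between two newlines, so
a pattern containing "\n" can never match.

```c
#include <stdio.h>
#include <string.h>

/* Append n bytes of s to out, keeping it NUL-terminated. */
static int append(char *out, size_t outlen, size_t *used,
                  const char *s, size_t n)
{
    if (*used + n + 1 > outlen)
        return -1;
    memcpy(out + *used, s, n);
    *used += n;
    out[*used] = '\0';
    return 0;
}

/* Replace the first literal occurrence of pat with rep in each line.
 * Lines are split at '\n'; the newline itself is copied through and is
 * never part of the text the pattern is matched against.
 * Returns the number of lines changed, or -1 on error. */
int substitute_per_line(const char *in, const char *pat, const char *rep,
                        char *out, size_t outlen)
{
    size_t patlen = strlen(pat), replen = strlen(rep), used = 0;
    int changed = 0;

    if (outlen == 0 || patlen == 0)
        return -1;
    out[0] = '\0';

    while (*in != '\0') {
        const char *nl = strchr(in, '\n');
        size_t len = nl ? (size_t)(nl - in) : strlen(in);
        const char *hit = NULL;

        for (size_t i = 0; i + patlen <= len; i++) {
            if (memcmp(in + i, pat, patlen) == 0) { hit = in + i; break; }
        }
        if (hit != NULL) {
            size_t before = (size_t)(hit - in);
            if (append(out, outlen, &used, in, before) != 0 ||
                append(out, outlen, &used, rep, replen) != 0 ||
                append(out, outlen, &used, hit + patlen,
                       len - before - patlen) != 0)
                return -1;
            changed++;
        } else if (append(out, outlen, &used, in, len) != 0) {
            return -1;
        }
        if (nl != NULL) {
            if (append(out, outlen, &used, "\n", 1) != 0)
                return -1;
            in = nl + 1;
        } else {
            in += len;
        }
    }
    return changed;
}
```

In this model a pattern that fits inside one line changes that line, while a
pattern that includes a newline leaves the input unchanged.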

Regards

Rüdiger 

 


test framework/mod_authany's check user id hook vs. mod_ssl's

2009-03-19 Thread Jeff Trawick
mod_authany's check user id hook is registered to run APR_HOOK_FIRST, as is
mod_ssl's.

mod_ssl's check user id hook needs to run before anything else that *uses*
basic auth because it can create basic auth information from the
certificate, for processing by normal check user id hooks.

Like practically all check user id hooks, mod_authany's hook operates on
existing basic auth information, so it must run after mod_ssl's hook.

I don't have a crisp understanding of why mod_authany's check user id hook
should be registered to run APR_HOOK_FIRST.  Any comments on that?  I'll try
to think on that some more.

Note that while the current, single APR_HOOK_FIRST specification applies to
both check user id and auth checker hooks, in the original implementation of
the module APR_HOOK_FIRST was individually specified for both.  (changes to
framework magic, apparently to work with Apache 1.3)  So the double
application of APR_HOOK_FIRST isn't a hint.

Beyond the mod_authany question, why doesn't mod_ssl declare its check user
id hook really-first if it can generate the basic auth?  (Let the extremely
limited number of modules which generate basic auth headers fight it out via
predecessor/successor lists.)

assert(A change to the mod_ssl hook ordering could theoretically break
existing modules, so that should be for future releases only.)

assert(Whatever is done in mod_ssl, the 2.3 logic in mod_authany needs to
ensure that its check user id hook runs after mod_ssl's.)
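
The ordering question can be made concrete with a toy model in C. It is a
sketch of the general idea only (sort by priority, then let a named
predecessor override the position), not APR's actual hook sorting code. The
enum values mirror the APR_HOOK_* constants, while `hook_entry` and
`sort_hooks` are hypothetical, handle a single predecessor per hook, and do
not detect cycles.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins mirroring the APR_HOOK_* ordering constants. */
enum {
    HOOK_REALLY_FIRST = -10,
    HOOK_FIRST        = 0,
    HOOK_MIDDLE       = 10,
    HOOK_LAST         = 20,
    HOOK_REALLY_LAST  = 30
};

typedef struct {
    const char *module;       /* e.g. "mod_ssl.c" */
    int order;                /* one of the HOOK_* values */
    const char *predecessor;  /* module that must run earlier, or NULL */
} hook_entry;

void sort_hooks(hook_entry *h, size_t n)
{
    /* Step 1: stable insertion sort by priority. Hooks with equal
     * priority simply keep their registration order. */
    for (size_t i = 1; i < n; i++) {
        hook_entry cur = h[i];
        size_t j = i;
        while (j > 0 && h[j - 1].order > cur.order) {
            h[j] = h[j - 1];
            j--;
        }
        h[j] = cur;
    }

    /* Step 2: if a hook's predecessor currently sits after it, move the
     * hook to just behind that predecessor. One move per pass. */
    for (size_t pass = 0; pass < n; pass++) {
        int moved = 0;
        for (size_t i = 0; i < n && !moved; i++) {
            if (h[i].predecessor == NULL)
                continue;
            for (size_t k = i + 1; k < n; k++) {
                if (strcmp(h[k].module, h[i].predecessor) == 0) {
                    hook_entry cur = h[i];
                    memmove(&h[i], &h[i + 1], (k - i) * sizeof h[0]);
                    h[k] = cur;
                    moved = 1;
                    break;
                }
            }
        }
        if (!moved)
            break;
    }
}
```

With mod_authany and mod_ssl both at HOOK_FIRST, the priority alone leaves
their relative order to chance (here: registration order). Naming
"mod_ssl.c" as mod_authany's predecessor is what pins mod_ssl in front.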


RE: rewrite before caching?

2009-03-19 Thread Anthony J. Biacco
Sorry, no, these are in httpd.conf; I don't use any .htaccess files.

The situation is this: this particular path was going to Tomcat through
mod_jk, and initially the dev guys didn't want the data cached in
Tomcat, so they introduced the killCache parameter when referencing
content (before I joined the company). Later I decided to start caching
this data in Apache to relieve unnecessary load on our Tomcats (with the
understanding that at this point in our application's timeline, the data
could now be cached).

We changed the referencing URLs that we had access to, removing the
killCache parameter, but some references currently out of our control
still use it. So I now want to take those requests and cache them too,
until we can get the people controlling the remaining references to
change them. On the current (changed) URL, I'm now setting the
appropriate Cache-Control and Expires headers.

My current idea is to use a rewrite to set an environment variable when
killCache is present in the query string, and then use that variable to
set a no-cache header and unset the Expires header. That will make sure
the killCache request doesn't get cached to disk. The internally
rewritten URL won't be served from cache either, but I figure I can
always do an internal proxy with the [P] flag, and then the proxied
rewritten request will pull from cache. I'll still get 2 requests in the
log due to the proxy request, but I can use a SetEnvIf Remote_Host to
check for the local server IP (denoting the proxied request) and not log
the entry based on that.

It's ugly, I hate that I have to do it, and the proxy request will still
consume more resources than I'd like, but it may just be more beneficial
than sending the killCache requests through to Tomcat.

Opinions on my suggestion would be appreciated, privately if you want,
as I don't want to get too OT here.

-Tony
---
Manager, IT Operations
Format Dynamics, Inc.
303-573-1800x27
abia...@formatdynamics.com
http://www.formatdynamics.com


-----Original Message-----
From: Ray Morris [mailto:supp...@bettercgi.com] 
Sent: Thursday, March 19, 2009 2:22 PM
To: modules-...@httpd.apache.org
Subject: Re: rewrite before caching?

   If you're doing your rewrite in .htaccess,
it _MAY_ help to move that rewrite to httpd.conf.
In many cases rewrite rules in httpd.conf get
processed at an earlier stage than those in
.htaccess.  This is because Apache can't read
the .htaccess file until after it resolves the
URL into a filename, so it knows which directories
to look for .htaccess files in.  That occurs
quite late in the process, during the fixup hook.

   Other than that, I can only suggest that there
are several HTTP headers designed for the purpose
of controlling caching so that just the right things
are cached for just the right amount of time, and
only under the right conditions.  Using that carefully
designed and well established system may yield better
results than the big hammer approach of sticking
bogus junk into the URL, then trying to remove it
later for certain kinds of caching.  You've added
that to try to prevent caching, yet clearly the
content SHOULD be cached in some cases, as is
clear because you actually want to cache
it.  Rather than inventing your own caching
system, look into setting the appropriate headers
so that it's cached when appropriate and for the
appropriate amount of time.
--
Ray B. Morris
supp...@bettercgi.com

Strongbox - The next generation in site security:
http://www.bettercgi.com/strongbox/

Throttlebox - Intelligent Bandwidth Control
http://www.bettercgi.com/throttlebox/

Strongbox / Throttlebox affiliate program:
http://www.bettercgi.com/affiliates/user/register.php



change in AP_INIT_TAKE1

2009-03-19 Thread Ryan Deemer
Hi all,

I am porting my custom module from Apache 2.0 to 2.2.11 using
gcc-3.4.6 and get the following warning where AP_INIT_TAKE1 is used:

mod_reversal.c:786: warning: initialization from incompatible pointer type

I see that this has been discussed previously, but I couldn't find any
specific solution to the problem.

http://objectmix.com/apache/692158-problems-starting-apache-2-2-mod_ruby-2.html
http://tp.its.yale.edu/pipermail/cas/2005-September/001575.html
http://www.nabble.com/www-apache20-on-6.0-S-td136.html

I see that http_config.h checks whether the compiler is C99 and chooses an
appropriate definition. I tried several combinations of setting and
unsetting AP_HAVE_DESIGNATED_INITIALIZER, but that didn't help.
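
One common cause of this warning, offered here only as a guess since the
handler at line 786 is not shown, is a directive handler whose signature
differs from what the TAKE1 slot expects. When designated initializers are
in use, the handler is stored in a typed union member rather than cast, so
the compiler checks it. The C sketch below imitates that mechanism with
stand-in types; every `*_sketch` name, the `ReversalMode` directive, and
`set_reversal_mode` are hypothetical and are not the real httpd
definitions.

```c
#include <stddef.h>

typedef struct cmd_parms_sketch cmd_parms_sketch;  /* stand-in for cmd_parms */

/* Stand-in for the typed handler union: one member per directive style. */
typedef union {
    const char *(*no_args)(cmd_parms_sketch *parms, void *mconfig);
    const char *(*take1)(cmd_parms_sketch *parms, void *mconfig,
                         const char *w);
} cmd_func_sketch;

typedef struct {
    const char *name;
    cmd_func_sketch func;
} command_sketch;

/* Imitates the designated-initializer form: fn must match .take1 exactly,
 * otherwise gcc reports "initialization from incompatible pointer type". */
#define INIT_TAKE1_SKETCH(directive, fn) { directive, { .take1 = fn } }

/* Handler with exactly the expected signature: a void * config pointer
 * and a const char * argument, returning const char *. */
static const char *set_reversal_mode(cmd_parms_sketch *parms, void *mconfig,
                                     const char *arg)
{
    (void)parms;
    *(const char **)mconfig = arg;   /* store the argument in the config */
    return NULL;                     /* NULL means "no error" */
}

static const command_sketch reversal_cmds[] = {
    INIT_TAKE1_SKETCH("ReversalMode", set_reversal_mode),
    { NULL, { .take1 = NULL } }
};

/* Invoke a TAKE1 handler the way a config parser might. */
const char *run_take1(const command_sketch *cmd, void *mconfig,
                      const char *arg)
{
    return cmd->func.take1(NULL, mconfig, arg);
}
```

If the real handler takes, say, `char *arg` or a typed config pointer
instead of `void *`, changing its prototype to the exact form above and
casting inside the function is the usual way to make such a warning go
away.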

Thanks in advance.

Ryan