Re: Apache modification questions

2008-09-10 Thread Andrej van der Zee
Hi,

> A post doesn't normally have anything in the
> QUERY_STRING.  Rather a POSTed form has the stuff
> being sent in the request body, which is read from
> STDIN by a script.  QUERY_STRING is still available
> for use.  Consider this form:
>
> <form method=post action=myscript.cgi>
> <input name=emailaddress>
>
>  The "emailaddress" input would NOT appear
> in QUERY_STRING, but rather would appear to
> the script as STDIN, or to another module as
> the request body.  You can still do:
>
> <form method=post action=myscript.cgi?id=reallyunique>
> <input name=emailaddress>



Uh, yeah, you are right; I wasn't thinking.

Cheers,
Andrej


Re: Apache modification questions

2008-09-10 Thread Ray Morris
> This is a bit awkward since the QUERY_STRING in POST 
> can be anything, for example an XML-documents. 
.. 
> Though, how should I deal with, for example, an 
> XML-document in a POST requests? Where should I 
> "hide" the transaction identifier?

   A post doesn't normally have anything in the 
QUERY_STRING.  Rather a POSTed form has the stuff 
being sent in the request body, which is read from 
STDIN by a script.  QUERY_STRING is still available 
for use.  Consider this form:

<form method=post action=myscript.cgi>
<input name=emailaddress>

  The "emailaddress" input would NOT appear 
in QUERY_STRING, but rather would appear to 
the script as STDIN, or to another module as 
the request body.  You can still do:

<form method=post action=myscript.cgi?id=reallyunique>
<input name=emailaddress>
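
A minimal sketch of how that second form looks from the script's side, written in Python rather than as a real myscript.cgi. The function name read_cgi_request and the sample address are assumptions for illustration only; QUERY_STRING, CONTENT_LENGTH and CONTENT_TYPE are the standard CGI environment variables.

```python
import io
from urllib.parse import parse_qs


def read_cgi_request(environ, stdin):
    """Split a CGI-style request into query-string and body parameters.

    environ: mapping of CGI environment variables.
    stdin:   binary stream holding the request body.
    """
    # Whatever follows "?" in the form's action URL arrives here,
    # for GET and POST alike.
    query = parse_qs(environ.get("QUERY_STRING", ""))

    # The POSTed fields arrive on stdin; CONTENT_LENGTH gives the byte count.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = stdin.read(length) if length else b""

    # Only decode the body as form fields when it is declared as such.
    form = {}
    content_type = environ.get("CONTENT_TYPE", "")
    if content_type.startswith("application/x-www-form-urlencoded"):
        form = parse_qs(body.decode("utf-8"))
    return query, form


if __name__ == "__main__":
    payload = b"emailaddress=someone%40example.com"  # made-up sample value
    env = {
        "REQUEST_METHOD": "POST",
        "QUERY_STRING": "id=reallyunique",
        "CONTENT_TYPE": "application/x-www-form-urlencoded",
        "CONTENT_LENGTH": str(len(payload)),
    }
    print(read_cgi_request(env, io.BytesIO(payload)))
```

The id stays in the query string and the emailaddress field stays in the body, so the two never collide.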
--
Ray Morris



Re: Apache modification questions

2008-09-09 Thread Andrej van der Zee
Hi,


> > 1) What is the "recommended" way to carry the
> > GET/POST request identifier (inserted by the
> > developer of the web page) from the client
> > to Apache?
>
>You're not going to get the browser to send
> a custom header by any changes you make to the
> page, except possibly using Ajax.  Javascript isn't
> enabled on all browsers, so that's out for a
> public web site.  Cookies also are often turned
> off, so that's out, if it's a public site.
> That leaves the QUERY_STRING, mainly.  (What
> you're calling GET variables).  We have another
> way that's better for many sites, but there are
> patent issues with that method.  (My company
> is seeking a patent.)


Now I am curious, but I bet you are not going to tell me!


> Using the query string
> means rewriting the content of each page, so
> one could do what PHP does and use cookies if
> they are available, then failover to query string
> if cookies are not sent by the browser.
>
>
This is a bit awkward since the QUERY_STRING in POST can be anything, for
example an XML document. I wonder why it is impossible to add an HTTP
request header in JavaScript. Does anybody know? With AJAX and ActionScript
(or Flex 3) it can be done with a few lines of code. Anyway, I guess I could
write an Apache module with a hook method that attempts to get the identifier
from

1) Custom HTTP request header
2) A Cookie
3) QUERY_STRING

In this order, and return DECLINED if it is not found. Though, how should I
deal with, for example, an XML document in a POST request? Where should I
"hide" the transaction identifier?
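
The lookup order above can be sketched roughly as follows. This is a Python illustration of the logic only, not Apache module code. The header name X-Request-Id and the cookie name request_id are made-up assumptions; the query key id follows the ?id=reallyunique example elsewhere in this thread. Returning None stands in for the module declining the request.

```python
from http.cookies import CookieError, SimpleCookie
from urllib.parse import parse_qs

# Hypothetical names, chosen only for this sketch.
HEADER_NAME = "x-request-id"
COOKIE_NAME = "request_id"
QUERY_KEY = "id"


def find_request_id(headers, query_string):
    """Return the identifier from header, cookie or query string, else None."""
    # HTTP header names are case-insensitive, so compare in lower case.
    lowered = {name.lower(): value for name, value in headers.items()}

    # 1) Custom HTTP request header
    if lowered.get(HEADER_NAME):
        return lowered[HEADER_NAME]

    # 2) A cookie
    cookie = SimpleCookie()
    try:
        cookie.load(lowered.get("cookie", ""))
    except CookieError:
        cookie = SimpleCookie()  # treat a malformed Cookie header as absent
    if COOKIE_NAME in cookie:
        return cookie[COOKIE_NAME].value

    # 3) QUERY_STRING
    values = parse_qs(query_string).get(QUERY_KEY)
    if values:
        return values[0]

    # Nothing found: the caller should decline.
    return None
```

Because none of the three sources is the request body, a POST that carries an XML document can be left untouched; the identifier can still ride along in the URL's query string.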

Best regards,
Andrej


Re: Apache modification questions

2008-09-05 Thread Ray Morris
   As another poster said, a standard CGI 
script runs in a different process and you 
can't generally muck with the address space 
of that process.  To affect that script, what 
you can do is change its inputs and its 
outputs.  Your module can add, remove, or 
alter the query string, path info, translate the URL, 
etc. Then you can also filter the output from 
the script to change it before it's sent
to the browser.
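
   A rough Python illustration of that idea, not Apache code: the parent plays the server, and a child process stands in for an unmodified CGI program. The names CHILD_SOURCE and run_cgi_like and the "filtered-" prefix are assumptions made up for this sketch.

```python
import os
import subprocess
import sys

# Stand-in for an unmodified CGI program: it only reads its environment
# and writes to stdout.
CHILD_SOURCE = (
    "import os\n"
    "print('pid=%d' % os.getpid())\n"
    "print('query=' + os.environ.get('QUERY_STRING', ''))\n"
)


def run_cgi_like(query_string, extra_param):
    """Run the child in its own process, altering its input and output."""
    env = dict(os.environ)

    # Input side: change the query string before the child starts.
    if query_string:
        env["QUERY_STRING"] = query_string + "&" + extra_param
    else:
        env["QUERY_STRING"] = extra_param

    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )

    # Output side: filter what the child wrote before passing it on.
    return result.stdout.replace("query=", "filtered-query=")


if __name__ == "__main__":
    print("parent pid=%d" % os.getpid())
    print(run_cgi_like("a=1", "id=reallyunique"))
```

The child reports a different process ID from the parent, and the parent never touches the child's memory; it only shapes what goes in and what comes out.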

   I get an odd feeling here about something.
I get the feeling that you may get better 
results by going back to a higher level view 
of your problem.  You're thinking about how to 
attach to the CGI process but you don't yet have 
enough information to know if that would be a 
good way to achieve your goal, for example. 
As a matter of fact an Apache module may well 
not be what you need.  I suggest that you go 
back and look at the actual business goal you're 
trying to achieve and perhaps email me about it 
off list as I may be able to help you find the 
right general approach and then point you in 
the right direction. 
--
Ray B. Morris
[EMAIL PROTECTED]





Re: Apache modification questions

2008-09-05 Thread Ralf Mattes
On Fri, 2008-09-05 at 18:49 +0900, Andrej van der Zee wrote:
> Hi,
> 
> >
> > A CGI script is run by the code of a module, mod_perl, mod_php5, etc.
> > In their register_hooks function they register their handler, which is
> > a script interpreter basically. The handler callback is invoked in the
> > same thread that did the rest of the request processing (URL parsing,
> > authentication, fixups, etc). 

I think there is some confusion about terminology here: a
CGI script by definition runs in its own process, _not_ within the
process address space of the webserver (hence the need for a defined
"Common Gateway Interface" to pass request information to the external
application; see http://en.wikipedia.org/wiki/Common_Gateway_Interface).
Now, since forking/spawning is a rather time-consuming operation,
there are embedded interpreters for most of the common scripting
languages (Perl, Python, etc.). Those often have a compatibility
layer to run unmodified CGI code within the webserver's address/process
space (with sometimes strange side effects :-/).
 
> However, I don't know if the handler
> > callback (of mod_perl, mod_php5, etc), which can be seen as a sort of
> > third-party black box, spawns new processes/threads in which they
> > parse the script, compile, etc. I guess they do not spawn new
> > threads/processes but you have to read their docs or their sources in
> > order to be sure. If they do not spawn new threads/processes, then the
> > CGI is executed in the same thread as fixups and the rest of the
> > request processing.
> >
> 
> Thanks that makes sense.
> 
> If understood correctly, this means that I can add my own module to
> the chain of request processors that executes in the same thread as
> the hook function in mod_php5/mod_perl that executes CGI scripts.

For the embedded interpreters that's true.

> Though, if the module's hook function spawns a new process/thread for
> handling the CGI script is dependent on the module.

CGIs are handled by mod_cgi.

> Does anybody know if the hook functions of such modules usually
> spawning a new thread/process? My guess is that at least for compiled
> CGI application written in C/C++ a new process is forked in the hook
> function.

Iff those are real CGIs (applications) that's true.

 HTH Ralf Mattes

> Cheers,
> Andrej
> 
> 



Re: Apache modification questions

2008-09-05 Thread Dave Ingram
Andrej van der Zee wrote:
> Does anybody know if the hook functions of such modules usually
> spawning a new thread/process? My guess is that at least for compiled
> CGI application written in C/C++ a new process is forked in the hook
> function.
>   

This is just off the top of my head, so I have no solid proof for this,
and please correct me if I'm wrong.

I would think that the reason for having an Apache module for PHP/Perl
would be to avoid the overhead of initialising the interpreter for every
page - it initialises itself once at Apache start, and then resets its
state for each request. Or so I would hope. Using a module also gives
them access to some Apache internals that CGI applications can't reach.
CGI applications (including mod_suphp and PHP/Perl run as CGI scripts)
would be forked.


Dave


Re: Apache modification questions

2008-09-05 Thread Andrej van der Zee
Hi,

>
> A CGI script is run by the code of a module, mod_perl, mod_php5, etc.
> In their register_hooks function they register their handler, which is
> a script interpreter basically. The handler callback is invoked in the
> same thread that did the rest of the request processing (URL parsing,
> authentication, fixups, etc). However, I don't know if the handler
> callback (of mod_perl, mod_php5, etc), which can be seen as a sort of
> third-party black box, spawns new processes/threads in which they
> parse the script, compile, etc. I guess they do not spawn new
> threads/processes but you have to read their docs or their sources in
> order to be sure. If they do not spawn new threads/processes, then the
> CGI is executed in the same thread as fixups and the rest of the
> request processing.
>

Thanks, that makes sense.

If I understood correctly, this means that I can add my own module to
the chain of request processors that executes in the same thread as
the hook function in mod_php5/mod_perl that executes CGI scripts.
Though, whether the module's hook function spawns a new process/thread
for handling the CGI script depends on the module.

Does anybody know if the hook functions of such modules usually
spawn a new thread/process? My guess is that at least for compiled
CGI applications written in C/C++ a new process is forked in the hook
function.

Cheers,
Andrej


-- 
Andrej van der Zee
2-40-19 Koenji-minami
Suginami-ku, Tokyo
166-0003 JAPAN
Mobile: 0031-(0)80-65251092
Phone/Fax: 0031-(0)3-3318-3155


Re: Apache modification questions

2008-09-05 Thread Sorin Manolache
On Fri, Sep 5, 2008 at 11:19, Andrej van der Zee
<[EMAIL PROTECTED]> wrote:
> Hi,
>
> Thanks for your comments.
>
>>
>> child_init is not the appropriate hook for your purpose. Use
>> ap_hook_fixups for getting the ID and ap_hook_log_transaction for
>> logging.
>
> In ap_hook_fixups, is it possible to get the thread/process ID of the
> CGI application serving the request? Moreover, is this thread/process
> already created? Or maybe the hook function is executed in the same
> thread/process as the CGI application?

A CGI script is run by the code of a module, mod_perl, mod_php5, etc.
In their register_hooks function they register their handler, which is
a script interpreter basically. The handler callback is invoked in the
same thread that did the rest of the request processing (URL parsing,
authentication, fixups, etc). However, I don't know if the handler
callback (of mod_perl, mod_php5, etc), which can be seen as a sort of
third-party black box, spawns new processes/threads in which they
parse the script, compile, etc. I guess they do not spawn new
threads/processes but you have to read their docs or their sources in
order to be sure. If they do not spawn new threads/processes, then the
CGI is executed in the same thread as fixups and the rest of the
request processing.

S


Re: Apache modification questions

2008-09-05 Thread Andrej van der Zee
Hi,

Thanks for your comments.

>
> child_init is not the appropriate hook for your purpose. Use
> ap_hook_fixups for getting the ID and ap_hook_log_transaction for
> logging.

In ap_hook_fixups, is it possible to get the thread/process ID of the
CGI application serving the request? Moreover, is this thread/process
already created? Or maybe the hook function is executed in the same
thread/process as the CGI application?

>
> Every module has a "register_hooks" function. There, you call the two
> ap_hook functions above in order to hook your callbacks to the fixups
> and log_transaction events. Next you implement the two callbacks and
> that's it.
>

That's clear.

Cheers,
Andrej




Re: Apache modification questions

2008-09-05 Thread Andrej van der Zee
Thanks for your comments.

Until I get the book, can you tell if a module's hook function can
execute in the same thread as the CGI application that serves the
request?

Also, I am unable to find the apache2 API for building modules. I
found some documentation for developers on the apache2 website, but
the link for "Autogenerated Apache 2 code documentation" is not
working.

Thank you,
Andrej





Re: Apache modification questions

2008-09-05 Thread Sorin Manolache
On Fri, Sep 5, 2008 at 05:11, Andrej van der Zee
<[EMAIL PROTECTED]> wrote:
> 2) I do need to "attach" to the thread/process handling the request to
> extract information just after starting and just before ending. Can I
> do this in an Apache module? I found the ap_hook_child_init() function
> but no similar exit()-function. Moreover, I need to access the request
> identifier and log to a file. Can all this be done in an Apache
> module?

child_init is not the appropriate hook for your purpose. Use
ap_hook_fixups for getting the ID and ap_hook_log_transaction for
logging.

Every module has a "register_hooks" function. There, you call the two
ap_hook functions above in order to hook your callbacks to the fixups
and log_transaction events. Next you implement the two callbacks and
that's it.
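
To make the shape of this concrete, here is a toy model in Python. It is not the Apache API: HookRegistry, serve and the dict-based request are assumptions invented for the sketch. Only the phase names mirror ap_hook_fixups and ap_hook_log_transaction, and the "notes" dict loosely plays the role of per-request storage.

```python
import time


class HookRegistry:
    """Toy stand-in for the server's hook lists (not the real Apache API)."""

    def __init__(self):
        self._hooks = {"fixups": [], "log_transaction": []}

    def hook(self, phase, callback):
        self._hooks[phase].append(callback)

    def run(self, phase, request):
        for callback in self._hooks[phase]:
            callback(request)


def fixups(request):
    # Runs before the content handler: grab the ID and note the start time.
    request["notes"]["request_id"] = request["query"].get("id", "-")
    request["notes"]["start"] = time.monotonic()


def make_log_transaction(sink):
    def log_transaction(request):
        # Runs after the response: write the ID, URI and serve time.
        elapsed = time.monotonic() - request["notes"]["start"]
        sink.append("%s %s %.6f" % (
            request["notes"]["request_id"], request["uri"], elapsed))
    return log_transaction


def register_hooks(registry, sink):
    # The one place where the module ties its callbacks to the two phases.
    registry.hook("fixups", fixups)
    registry.hook("log_transaction", make_log_transaction(sink))


def serve(registry, request, handler):
    """Drive one request through fixups, the handler, then logging."""
    registry.run("fixups", request)
    response = handler(request)
    registry.run("log_transaction", request)
    return response
```

A request would then be driven as serve(registry, request, handler) after a single register_hooks(registry, log_lines) call, with the content handler itself left untouched.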

S


Re: Apache modification questions

2008-09-04 Thread Ray Morris
> Can all this be done in an Apache module?

  Yes, just about anything can be done in an 
Apache module.  Based on your other questions, 
it sounds like you need a thorough exposition 
of the Apache API and probably the HTTP protocol, 
so I'd suggest your next step is to study some 
good documentation, such as the Apache Modules book.

> 1) What is the "recommended" way to carry the 
> GET/POST request identifier (inserted by the 
> developer of the web page) from the client
> to Apache?

   You're not going to get the browser to send 
a custom header by any changes you make to the 
page, except possibly using Ajax.  Javascript isn't
enabled on all browsers, so that's out for a 
public web site.  Cookies also are often turned 
off, so that's out, if it's a public site.
That leaves the QUERY_STRING, mainly.  (What
you're calling GET variables).  We have another 
way that's better for many sites, but there are 
patent issues with that method.  (My company
is seeking a patent.)  Using the query string 
means rewriting the content of each page, so 
one could do what PHP does and use cookies if 
they are available, then fall back to the query string 
if cookies are not sent by the browser.
--
Ray B. Morris
[EMAIL PROTECTED]







Apache modification questions

2008-09-04 Thread Andrej van der Zee
Hi,

I am about to modify Apache with some custom logging for GET/POST
requests (and more). It is for the purpose of research. If possible, I
would like to get some guidance in how to implement my ideas. I will
explain...

Every GET/POST request to Apache will carry a request identifier.
Adding the identifier to the request is the responsibility of the
developer of the web page. In Apache (NOT the CGI application)  I
would like to extract the identifier from the request and write it to
a log together with timestamp, request serve time and some specific
information about the thread/process that handles the request. The CGI
application serving the request should be untouched! I have two issues
I would like to get some comments about, if possible:

1) What is the "recommended" way to carry the GET/POST request
identifier (inserted by the developer of the web page) from the client
to Apache? Add a custom HTTP header? Or should I do it in GET/POST
variables? Any other alternatives?
2) I do need to "attach" to the thread/process handling the request to
extract information just after starting and just before ending. Can I
do this in an Apache module? I found the ap_hook_child_init() function
but no similar exit()-function. Moreover, I need to access the request
identifier and log to a file. Can all this be done in an Apache
module?

Hope you can help!

Cheers,
Andrej
