Hi.

1/ thanks

2/ on/off :
For my needs, I wanted to be sure, *really* sure, that if the module is
enabled, the server will return this response for all vhosts.
So I did not want to enable/disable it per vhost.
And if it can only be on/off globally, adding/removing the module is
the way to toggle it on/off ...
On Debian, it is simple : "a2(en|dis)mod norobot" ...

3/ right. I'm not sure if the last \n is mandatory, but I added it ...
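
For what it's worth, the terminator is easy to check outside Apache. Below is a minimal standalone sketch in plain C. The helper name ends_with_newline is made up for illustration; it is not part of the module or of the Apache API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of mod_norobot or the Apache API:
   returns 1 if the text is non-empty and its last byte is '\n',
   0 otherwise (including for a NULL pointer). */
static int ends_with_newline(const char *text)
{
    size_t len;

    if (text == NULL) {
        return 0;
    }
    len = strlen(text);
    return len > 0 && text[len - 1] == '\n';
}
```

Called on the string the handler now sends, "User-agent: *\nDisallow: /\n", it reports a terminated last line; without the final \n it does not.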


Note : I saw that the default comments, coming from the apxs sample, are
wrong : you don't have to enable the handler per Location.
I updated them ...


Thanks a lot for your advice !


Regards,

Mike Baroukh
---
Cardiweb  - 29 Cite d'Antin Paris IXeme
+33 6 63 57 27 22 / +33 1 53 21 82 63 
http://www.cardiweb.com/
---


On 15/02/2012 11:07, Nick Kew wrote:
> On 15 Feb 2012, at 08:07, Mike Baroukh wrote:
>
>> Disclaimer :
>> I'm absolutely not a C or system developer.
>> I'm a Java developer.
>> And this is my first module.
>> So maybe it could be made better ...
> If you're asking for criticism, here goes:
>
> 1.  It looks fine as far as it goes.
> 2.  But would be much more generalisable if it were configurable on/off.
>     This would remove the issue of running order which you tackled with
>     APR_HOOK_FIRST.
> 3.  "Be conservative in what you send".  The last line of your
>     robots.txt is unterminated!
>
/* 
**  mod_norobot.c -- Apache sample norobot module
**  [Autogenerated via ``apxs -n norobot -g'']
**
**  To play with this sample module first compile it into a
**  DSO file and install it into Apache's modules directory 
**  by running:
**
**    $ apxs -c -i mod_norobot.c
**
**  Then activate it in Apache's apache2.conf file, for instance
**  as follows:
**
**    #   apache2.conf
**    LoadModule norobot_module modules/mod_norobot.so
**
**  Then after restarting Apache via
**
**    $ apachectl configtest
**    then
**    $ apachectl restart
**
**  you can immediately request the URL /robots.txt and watch for the
**  output of this module. This can be achieved for instance via:
**
**    $ lynx -mime_header http://localhost/robots.txt
**    if you have lynx installed or
**    $ wget -O - --save-headers http://localhost/robots.txt 2>/dev/null
**
**  The output should be similar to the following one:
**
**    HTTP/1.1 200 OK
**    Date: Tue, 31 Mar 1998 14:42:22 GMT
**    Server: Apache/1.3.4 (Unix)
**    Connection: close
**    Content-Type: text/plain
**  
**    User-agent: *
**    Disallow: /
*/ 

#include "httpd.h"
#include "http_config.h"
#include "http_protocol.h"
#include "ap_config.h"

/* the content handler. 
   do something only if uri is "/robots.txt"
*/
static int norobot_handler(request_rec *r)
{

    if (r->parsed_uri.path==NULL || strcasecmp(r->parsed_uri.path, "/robots.txt")) {
        return DECLINED;
    }

    ap_set_content_type(r, "text/plain");
    ap_rputs("User-agent: *\nDisallow: /\n", r);

    return OK;
}

static void norobot_register_hooks(apr_pool_t *p)
{
    ap_hook_handler(norobot_handler, NULL, NULL, APR_HOOK_FIRST);
}

/* Dispatch list for API hooks */
module AP_MODULE_DECLARE_DATA norobot_module = {
    STANDARD20_MODULE_STUFF, 
    NULL,                  /* create per-dir    config structures */
    NULL,                  /* merge  per-dir    config structures */
    NULL,                  /* create per-server config structures */
    NULL,                  /* merge  per-server config structures */
    NULL,                  /* table of config file commands       */
    norobot_register_hooks /* register hooks                      */
};
