Author: Ivan Mikhnevich
Email: [EMAIL PROTECTED]
Message:
Hello everybody, and especially the developers of mnoGoSearch.
I've found a bug in the implementation of the robots.txt standard. Only the path,
without the filename, is compared with the Disallow lines from robots.txt. While the standard at
http://www.robots
Author: Tim Hewitt
Email: [EMAIL PROTECTED]
Message:
Ivan,
The robots.txt standard clearly states that a disallow of
/tmp
would disallow
/tmp/nextdir/file.html
/tmp.html
/tmpjunk/myfile.html
because the Disallow value is used as a prefix that is matched against the entire URL path.
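
For illustration, the prefix rule described above can be sketched in a few lines of Python. This is only a minimal sketch and not mnoGoSearch code; the function name is_disallowed and its arguments are assumptions made up for this example.

```python
def is_disallowed(url_path, disallow_rules):
    """Return True if url_path starts with any non-empty Disallow value.

    Hypothetical helper for illustration only; an empty Disallow value
    is skipped here because it blocks nothing.
    """
    return any(rule and url_path.startswith(rule) for rule in disallow_rules)


# The three paths listed above, checked against "Disallow: /tmp".
for path in ("/tmp/nextdir/file.html", "/tmp.html", "/tmpjunk/myfile.html"):
    print(path, is_disallowed(path, ["/tmp"]))
```

Each of the three paths begins with the string /tmp, so a plain string-prefix comparison is enough to reproduce the behaviour described.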
/tmp/
would disallow
Author: Ivan Mikhnevich
Email: [EMAIL PROTECTED]
Message:
That case is not the problem.
The current mnoGoSearch implementation will never understand lines like:
Disallow: /texts/print.phtml
or
Disallow: /forum/index.pl?archive
It does not take the filename into account when checking the prefix
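
To make the report concrete, here is a rough Python sketch of the difference between comparing only the directory part and comparing the whole path. It is not the actual mnoGoSearch code: the helper names directory_only, path_with_query and blocked are invented, and the host example.com and the query value archive=2001 are made-up test data.

```python
from urllib.parse import urlsplit


def directory_only(url):
    """Hypothetical: the URL path with the filename and query dropped."""
    path = urlsplit(url).path
    return path[: path.rfind("/") + 1]


def path_with_query(url):
    """Hypothetical: the URL path including the filename and query string."""
    parts = urlsplit(url)
    return parts.path + ("?" + parts.query if parts.query else "")


def blocked(target, rules):
    """True if target starts with any non-empty Disallow value."""
    return any(rule and target.startswith(rule) for rule in rules)


rules = ["/texts/print.phtml", "/forum/index.pl?archive"]
url = "http://example.com/forum/index.pl?archive=2001"  # made-up URL

print(directory_only(url), blocked(directory_only(url), rules))
print(path_with_query(url), blocked(path_with_query(url), rules))
```

In this sketch the directory-only comparison sees just /forum/, which is shorter than either rule and so can never match, while the comparison on the full path and query does match the second rule.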
Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
> Current mnoGoSearch implementation will never understand the lines like:
> Disallow: /texts/print.phtml
> or
> Disallow: /forum/index.pl?archive
>
> It does not take into account the filename when checking prefix. For the examples
>abo