Webboard: robots.txt implementation bug & its fix

2001-04-28 Thread Ivan Mikhnevich
Author: Ivan Mikhnevich Email: [EMAIL PROTECTED] Message: Hello, everybody, and especially the developers of mnoGoSearch. I've found a bug in the implementation of the robots.txt standard. Only the path, without the filename, is compared with the Disallow lines from robots.txt. While the standard at http://www.robots

2001-04-30 Thread Tim Hewitt
Author: Tim Hewitt Email: [EMAIL PROTECTED] Message: Ivan, the robots.txt standard clearly states that a Disallow of /tmp would disallow

/tmp/nextdir/file.html
/tmp.html
/tmpjunk/myfile.html

because the Disallow directive is used as a prefix for the entire URL to be matched. /tmp/ would disallow
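
The prefix rule described in this message can be illustrated with a minimal Python sketch. The function name `is_disallowed` is hypothetical and is not taken from the mnoGoSearch sources; the sketch only assumes that a Disallow value is treated as a plain string prefix of the URL path, and that an empty Disallow value blocks nothing.

```python
def is_disallowed(url_path, disallow_prefixes):
    """Return True if url_path starts with any non-empty Disallow prefix.

    Hypothetical helper for illustration; not mnoGoSearch code.
    """
    # An empty Disallow value means "nothing is disallowed", so skip it.
    return any(p != "" and url_path.startswith(p) for p in disallow_prefixes)


if __name__ == "__main__":
    rules = ["/tmp"]
    for path in ("/tmp/nextdir/file.html", "/tmp.html",
                 "/tmpjunk/myfile.html", "/index.html"):
        print(path, is_disallowed(path, rules))
```

With the single rule `/tmp`, the three paths from the message are reported as disallowed, while `/index.html` is not.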

2001-05-02 Thread Ivan Mikhnevich
Author: Ivan Mikhnevich Email: [EMAIL PROTECTED] Message: That case was not the problem. The current mnoGoSearch implementation will never understand lines like

Disallow: /texts/print.phtml

or

Disallow: /forum/index.pl?archive

It does not take into account the filename when checking prefix
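
To make the reported behaviour concrete, here is a rough Python sketch that contrasts the two comparisons. It is a model of what the message describes, not the actual mnoGoSearch code: the helper names `directory_only` and `full_target` and the `example.com` URLs are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def directory_only(url):
    """Path up to and including the last '/', i.e. without the filename.

    Models the behaviour reported as buggy (hypothetical helper).
    """
    path = urlsplit(url).path
    return path[:path.rfind("/") + 1]


def full_target(url):
    """Path plus filename plus query string, the string a Disallow
    prefix would need to be compared against (hypothetical helper)."""
    parts = urlsplit(url)
    return parts.path + ("?" + parts.query if parts.query else "")


if __name__ == "__main__":
    cases = [
        ("http://example.com/texts/print.phtml", "/texts/print.phtml"),
        ("http://example.com/forum/index.pl?archive=2001",
         "/forum/index.pl?archive"),
    ]
    for url, rule in cases:
        # Directory-only comparison never matches a rule naming a file.
        print(rule, directory_only(url).startswith(rule),
              full_target(url).startswith(rule))
```

In both cases the directory-only string (`/texts/` or `/forum/`) is shorter than the rule and cannot start with it, whereas the full path with filename and query string does.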

2001-05-03 Thread Alexander Barkov
Author: Alexander Barkov Email: [EMAIL PROTECTED] Message:
> Current mnoGoSearch implementation will never understand the lines like:
> Disallow: /texts/print.phtml
> or
> Disallow: /forum/index.pl?archive
>
> It does not take into account the filename when checking prefix. For the examples
>abo