Hi,

Why are the Disallow commands in the config file below ignored when the
indexer crawls the web sites?  Should the Disallow commands be placed
elsewhere in the configuration file so that they are honoured?

Thanks,

Michael




DBAddr          mysql://xxx:xxx@localhost/search/
DBMode multi
VarDir /usr/local/mnogosearch/var
StopwordFile stopwords/en.huge.sl
MinWordLength 2
MaxDocSize 1548576
DeleteNoServer yes
Disallow *.b    *.sh   *.md5  *.rpm
Disallow *.arj  *.tar  *.zip  *.tgz  *.gz   *.z     *.bz2
Disallow *.lha  *.lzh  *.rar  *.zoo  *.ha   *.tar.Z
Disallow *.gif  *.jpg  *.jpeg *.bmp  *.tiff *.tif   *.xpm  *.xbm *.pcx
Disallow *.vdo  *.mpeg *.mpe  *.mpg  *.avi  *.movie *.mov  *.dat
Disallow *.mid  *.mp3  *.rm   *.ram  *.wav  *.aiff  *.ra
Disallow *.vrml *.wrl  *.png
Disallow *.exe  *.com  *.cab  *.dll  *.bin  *.class *.ex_
Disallow *.tex  *.texi *.texinfo
Disallow *.cdf  *.ps
Disallow *.ai   *.eps  *.ppt  *.hqx
Disallow *.cpt  *.bms  *.oda  *.tcl
Disallow *.o    *.a    *.la   *.so *.log *.LOG *.js
Disallow *.pat  *.pm   *.m4   *.am   *.css
Disallow *.map  *.aif  *.sit  *.sea
Disallow *.m3u  *.qt   *.mov  *.rdf
Disallow *D=A *D=D *M=A *M=D *N=A *N=D *S=A *S=D
Disallow Regex \.r[0-9][0-9]$ \.a[0-9][0-9]$ \.so\.[0-9]$
Disallow */_notes*
Disallow */login/*
Disallow */images/*
Disallow */internal/*
Disallow */forums/*
Disallow */wwwthreads/*
Disallow */ubbthreads/*
Disallow *links*
Disallow *mojo.cgi*
Disallow *_print.html
Disallow */archives/*
Disallow */phpweblog/print.php*
Disallow */phpweblog/friend.php*
Disallow */phpweblog/search.php*
Disallow */phpweblog/contrib.php
Disallow */phpweblog/profiles.php?Author=*
Disallow */daver/gen/js/d0000/*
Disallow */daver/gen/js/d0001/*
Disallow */daver/gen/js/d0002/*
Disallow */daver/gen/js/d0003/*
Disallow */daver/gen/js/d0004/*
Disallow */daver/gen/js/d0005/*
Disallow */daver/gen/js/index/*
Disallow */tmp/*
Disallow */cyberworld/map/*
Disallow */ise/*
Disallow */tank/*
HrefOnly */phpweblog/stories.php?topic=*
HrefOnly */phpweblog/stories.php?page=*
HrefOnly */phpweblog/archive.php*

AddType image/x-xpixmap *.xpm
AddType image/x-xbitmap *.xbm
AddType image/gif       *.gif
AddType text/plain                      *.txt  *.pl *.js *.h *.c *.pm *.e
AddType text/html                       *.html *.htm *.php *.php3 *.phtml *.php$
AddType text/rtf                        *.rtf
AddType application/pdf                 *.pdf
AddType application/msword              *.doc
AddType application/vnd.ms-excel        *.xls
AddType text/x-postscript               *.ps
Mime application/msword      text/plain  "/usr/home/ise/bin/bin/catdoc $1"
Mime application/pdf          text/plain  "/usr/home/ise/bin/bin/pdftotext $1 -"
Mime application/vnd.ms-excel text/plain  "/usr/home/ise/bin/bin/xls2csv $1"
Mime "text/rtf*"              text/html   "/usr/home/ise/bin/bin/rtf2html $1"

Period 10d
DefaultLang en
MaxHops 1000
MaxNetErrors 32
ReadTimeOut 90s
DocTimeOut 1m30s
NetErrorDelayTime 1d
Robots yes
DetectClones yes

Section body                    1
Section title                   2
Section description             3
Section keywords                4
Section url:file                5
Section url:path                0
Section url:host                6
Section url:proto               0
Section crosswords              7

DeleteBad yes
Index yes
Follow site

Server site http://www.social-ecology.org/
Server site http://www.eggplant.ws/
Server site http://www.infoshop.org/
Server site http://www.whitecleats.org/
Server site http://www.anarchosyndicalism.org/
Server site http://www.leftgreen.org/
Server site http://www.struggle.ws/
Server site http://www.houstonabc.org/
Server site http://www.abolishthebank.org/
Server site http://www.homedistiller.org/
Server site http://flag.blackened.net/nefac/
Server site http://flag.blackened.net/agony/
Server site http://flag.blackened.net/anarpics/
Server site http://flag.blackened.net/antinat/
Server site http://flag.blackened.net/asr/
Server site http://flag.blackened.net/aca/
Server site http://flag.blackened.net/blackflag/
Server site http://flag.blackened.net/biblioteca/
Server site http://flag.blackened.net/daver/
Server site http://flag.blackened.net/global/
Server site http://flag.blackened.net/heatwave/
Server site http://flag.blackened.net/ias/
Server site http://flag.blackened.net/kara/
Server site http://flag.blackened.net/ksl/
Server site http://flag.blackened.net/liberty/
Server site http://flag.blackened.net/library/
Server site http://flag.blackened.net/noterror/
Server site http://flag.blackened.net/nf/
Server site http://flag.blackened.net/pdg/
Server site http://flag.blackened.net/strider/
Server site http://flag.blackened.net/tolstoy/
Server site http://flag.blackened.net/wwa/
Server site http://flag.blackened.net/vrf/
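
One way to rule out the patterns themselves as the problem is to test them
against sample URLs outside the indexer. The following is only a rough
sketch: it assumes the Disallow wildcards behave like case-insensitive
shell-style globs, where '*' matches any run of characters including '/'.
That is an assumption, not something confirmed from the mnoGoSearch
documentation. The function name and the sample URL paths are hypothetical,
and only a few of the patterns above are included.

```python
from fnmatch import fnmatchcase

# Hypothetical subset of the Disallow wildcard patterns from the config above.
DISALLOW = ["*.gif", "*.zip", "*/images/*", "*/forums/*", "*_print.html"]


def first_disallow_match(url, patterns=DISALLOW):
    """Return the first pattern that matches url, or None.

    Assumed semantics (not taken from the indexer's source): '*' matches
    any run of characters, including '/', and matching ignores case.
    """
    lowered = url.lower()
    for pattern in patterns:
        if fnmatchcase(lowered, pattern.lower()):
            return pattern
    return None


if __name__ == "__main__":
    # Sample URLs are made up for illustration only.
    for url in [
        "http://www.infoshop.org/images/logo.gif",
        "http://www.infoshop.org/forums/index.php",
        "http://www.infoshop.org/news/story.html",
    ]:
        print(url, "->", first_disallow_match(url))
```

If a URL that was indexed anyway matches a pattern here, the pattern syntax
is probably not the cause, and the placement or ordering of the commands in
the file becomes the more likely suspect.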

