Author: marcio
There is no "Period" line in my config. But the spider is looping, fetching the
pages. Are you sure Period will fix this? How so?

Greenstone takes just 4 minutes to download all pages and another 4 minutes to index 
it all. It's done after 8 minutes.

mnoGoSearch was still downloading pages after 30 minutes.

Anyway, this is my conf file:

# This is a minimal sample indexer config file


MaxHops 16

# Allow some known extensions and directory index
Allow regex \.html$ \.htm$ \.shtml$ \.txt$ \/$

# Disallow everything else
#Disallow .*

#Exclude some known extensions
Disallow regex \.b$  \.sh$     \.md5$
Disallow regex \.arj$  \.tar$  \.zip$  \.tgz$  \.gz$
Disallow regex \.lha$ \.lzh$ \.tar\.Z$  \.rar$  \.zoo$
Disallow regex \.gif$  \.jpg$  \.jpeg$ \.bmp$  \.tiff$
Disallow regex \.vdo$  \.mpeg$ \.mpe$  \.mpg$  \.avi$  \.movie$
Disallow regex  \.mid$   \.mp3$   \.rm$   \.ram$  \.wav$  \.aiff$ \.ra$
Disallow regex \.vrml$ \.wrl$
Disallow regex \.exe$  \.cab$  \.dll$  \.bin$  \.class$
Disallow regex \.tex$  \.texi$ \.xls$  \.doc$  \.texinfo$
Disallow regex \.rtf$  \.pdf$  \.cdf$  \.ps$
Disallow regex \.ai$   \.eps$  \.ppt$  \.hqx$
Disallow regex \.cpt$  \.bms$  \.oda$  \.tcl$
Disallow regex \.rpm$ \.css$ \.js$ \.vbs$

#Exclude Apache directory list in different sort order
Disallow regex \?D=A$ \?D=D$ \?M=A$ \?M=D$  \?N=A$  \?N=D$ \?S=A$ \?S=D$

#Exclude ./. and ./.. from Apache and Squid directory list
Disallow regex /[.]{1,2} /\%2e /\%2f

#Exclude bookmarks to middle of a page (urls have #...)
Disallow regex #[.]*
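
To see which URLs these rules would let through, here is a minimal Python sketch. It is not mnoGoSearch code. It assumes the rules are tried in the order they appear, that the first matching rule decides, and that a URL matching no rule is allowed (since "Disallow .*" is commented out). It uses Python's `re` module, which may differ in details from the regex engine the indexer uses. The rule list is abbreviated, and the example.com URLs and the `decide` helper are made up for illustration.

```python
import re

# Rules in the order they appear in the config above (abbreviated).
# Each entry is (action, [patterns]); every pattern is searched in the URL.
RULES = [
    ("Allow", [r"\.html$", r"\.htm$", r"\.shtml$", r"\.txt$", r"\/$"]),
    ("Disallow", [r"\.gif$", r"\.jpg$", r"\.pdf$", r"\.zip$"]),
    ("Disallow", [r"\?D=A$", r"\?D=D$", r"\?M=A$", r"\?M=D$",
                  r"\?N=A$", r"\?N=D$", r"\?S=A$", r"\?S=D$"]),
    ("Disallow", [r"/[.]{1,2}", r"/\%2e", r"/\%2f"]),
    ("Disallow", [r"#[.]*"]),
]


def decide(url):
    """Return the action of the first rule that matches the URL.

    Assumption: first match wins, and the default is "Allow".
    """
    for action, patterns in RULES:
        if any(re.search(p, url) for p in patterns):
            return action
    return "Allow"  # assumed default when no rule matches


if __name__ == "__main__":
    # Hypothetical URLs, only for trying out the rules.
    for url in [
        "http://example.com/docs/index.html",
        "http://example.com/docs/?D=A",
        "http://example.com/page.html#top",
        "http://example.com/view.php?id=1",
        "http://example.com/logo.gif",
    ]:
        print(decide(url), url)
```

Under these assumptions, a dynamic URL such as the `view.php?id=1` example matches no rule and falls through to the default, so it would still be fetched.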
