Webboard: indexer loops !

2001-06-01 Thread marcio

Author: marcio
Email: 
Message:
BTW, adding this line

Period 604800


made no difference whatsoever.

marcio
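
For what it's worth, here is a minimal Python sketch of the arithmetic behind that value. It assumes Period is the re-index interval given in seconds, which is worth checking against the documentation for your version. 604800 seconds is exactly seven days, so if that already matches the built-in default, adding the line explicitly would not be expected to change anything. The `is_due` helper is hypothetical and only illustrates the idea of an expiry check; it is not mnoGoSearch code.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def period_in_days(period_seconds: int) -> float:
    """Convert a period given in seconds to days."""
    return period_seconds / SECONDS_PER_DAY


def is_due(last_indexed: float, now: float, period_seconds: int) -> bool:
    """Hypothetical expiry check: a document is fetched again only
    once its period has fully elapsed since the last visit."""
    return now - last_indexed >= period_seconds


print(period_in_days(604800))  # 7.0
```

Under this reading, Period controls when a page is revisited on a later run, not how many pages the spider discovers during a single run.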


___
If you want to unsubscribe send "unsubscribe general"
to [EMAIL PROTECTED]




Webboard: indexer loops !

2001-06-01 Thread marcio

Author: marcio
Email: 
Message:
There is no "Period" line in my indexer.conf. The looping happens in the spider, while it is
fetching pages. Are you sure Period will fix this? How so?

Greenstone takes just 4 minutes to download all the pages and another 4 minutes to index
them all. It's done after 8 minutes.

mnoGoSearch was still downloading pages after 30 minutes.

Anyway, this is my conf file:


# This is a minimal sample indexer config file

Server  http://www.foo.com/

MaxHops 16

# Allow some known extensions and directory index
Allow regex \.html$ \.htm$ \.shtml$ \.txt$ \/$

# Disallow everything else
#Disallow .*


#Exclude some known extensions
Disallow regex \.b$  \.sh$ \.md5$
Disallow regex \.arj$  \.tar$  \.zip$  \.tgz$  \.gz$
Disallow regex \.lha$ \.lzh$ \.tar\.Z$  \.rar$  \.zoo$
Disallow regex \.gif$  \.jpg$  \.jpeg$ \.bmp$  \.tiff$
Disallow regex \.vdo$  \.mpeg$ \.mpe$  \.mpg$  \.avi$  \.movie$
Disallow regex \.mid$  \.mp3$  \.rm$  \.ram$  \.wav$  \.aiff$ \.ra$
Disallow regex \.vrml$ \.wrl$
Disallow regex \.exe$  \.cab$  \.dll$  \.bin$  \.class$
Disallow regex \.tex$  \.texi$ \.xls$  \.doc$  \.texinfo$
Disallow regex \.rtf$  \.pdf$  \.cdf$  \.ps$
Disallow regex \.ai$   \.eps$  \.ppt$  \.hqx$
Disallow regex \.cpt$  \.bms$  \.oda$  \.tcl$
Disallow regex \.rpm$ \.css$ \.js$ \.vbs$

#Exclude Apache directory list in different sort order
Disallow regex \?D=A$ \?D=D$ \?M=A$ \?M=D$  \?N=A$  \?N=D$ \?S=A$ \?S=D$

#Exclude ./. and ./.. from Apache and Squid directory list
Disallow regex /[.]{1,2} /\%2e /\%2f

#Exclude bookmarks to middle of a page (urls have #...)
Disallow regex #[.]*
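
To see which URLs a rule set like this lets through, here is a rough Python sketch. It assumes the rules are tried in file order, that the first matching rule decides, and that a URL matching no rule is allowed because the catch-all `Disallow .*` is commented out. Those three points are assumptions about the indexer's behaviour, not something confirmed in this thread, and Python's `re` may differ in detail from the indexer's regex engine. Only a handful of the rules are copied, and the `decide` function and the test URLs are made up for illustration.

```python
import re

# (action, pattern) pairs in file order; a small subset of the config.
RULES = [
    ("Allow", r"\.html$"),
    ("Allow", r"\.htm$"),
    ("Allow", r"\.shtml$"),
    ("Allow", r"\.txt$"),
    ("Allow", r"/$"),
    ("Disallow", r"\.gif$"),
    ("Disallow", r"\?D=A$"),
    ("Disallow", r"#[.]*"),
]


def decide(url: str, default: str = "Allow") -> str:
    """Return the action of the first rule whose pattern matches the URL.

    Assumption: with no catch-all Disallow, unmatched URLs are allowed.
    """
    for action, pattern in RULES:
        if re.search(pattern, url):
            return action
    return default


for url in (
    "http://www.foo.com/docs/",
    "http://www.foo.com/docs/?D=A",
    "http://www.foo.com/page.html#top",
    "http://www.foo.com/cgi-bin/view?id=1",
):
    print(decide(url), url)
```

If those assumptions hold, the last URL is accepted even though no Allow pattern names it. A site that generates links with ever-changing query strings could then keep the spider busy for a long time, which would look like a loop from the outside.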





Webboard: indexer loops !

2001-06-01 Thread Alexander Barkov

Author: Alexander Barkov
Email: [EMAIL PROTECTED]
Message:
What is the Period command argument in your indexer.conf?

> The problem now is different - the indexer doesn't want to stop!!!
> 
> Does it not tag URLs it has already visited and skip them? It looks like it is
> visiting the same ones again and again and again.
> 
> I'll give SWISH a try :-)
> 
> marcio
> 





Webboard: indexer loops !

2001-05-31 Thread marcio

Author: marcio
Email: 
Message:
The problem now is different - the indexer doesn't want to stop!!!

Does it not tag URLs it has already visited and skip them? It looks like it is
visiting the same ones again and again and again.

I'll give SWISH a try :-)

marcio
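
To make the question concrete, this is the kind of bookkeeping I mean, as a minimal Python sketch. It is a generic breadth-first crawl with a visited set and a hop limit, not a description of how mnoGoSearch works internally. The `crawl` function, the `get_links` callback and the toy link graph are all made up for illustration; `max_hops` only mirrors the idea of a depth limit.

```python
from collections import deque


def crawl(start, get_links, max_hops=16):
    """Breadth-first crawl that records each URL once and never
    queues it a second time, even if the link graph has cycles."""
    visited = {start}
    queue = deque([(start, 0)])
    order = []
    while queue:
        url, hops = queue.popleft()
        order.append(url)  # stands in for "fetch and index this page"
        if hops >= max_hops:
            continue  # too deep: do not follow links from this page
        for link in get_links(url):
            if link not in visited:
                visited.add(link)
                queue.append((link, hops + 1))
    return order


# Toy link graph with a cycle: a -> b -> c -> a
GRAPH = {"a": ["b"], "b": ["c"], "c": ["a", "b"]}
print(crawl("a", lambda u: GRAPH.get(u, [])))  # ['a', 'b', 'c']
```

With a visited set like this the crawl ends as soon as no new URL turns up. It can still run for a very long time if the site keeps producing URLs that differ textually but point to the same content.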

