According to Toxik - Dann Cohen:
> Hi Gilles,
> 
> If I set max_hop_count to 0, it will only fetch the first page. I
> want it to fetch one page further, so max_hop_count needs to be 1.
> But what happens is that the fetch goes beyond the 1800 domains,
> when it is supposed to reject the domains that are not in start_url...
> 
> Any suggestions? By the way, it works fine when there are fewer
> domains, say 1500. Very strange...

Hmmm.  I imagine that the very long list in start_url, which gets
transferred to limit_urls_to by default, is overflowing the StringMatch
state table for the limits matching.  I don't know that there's an easy
fix for this.  The 3.2 code will be using regular expression handling
rather than StringMatch for the limit_urls_to attribute, but I don't know
for a fact that it too won't have problems with a huge list like this.
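
To make the two matching approaches concrete, here is a rough Python sketch. It is not ht://Dig code: the function names and example URLs are made up for illustration, and it assumes a pattern counts as a match when it appears anywhere in the URL, ignoring case. It only shows how start_url stands in for limit_urls_to when no explicit limit is given, and how the same check could be written with one combined regular expression.

```python
import re


def effective_limits(start_urls, limit_urls_to=None):
    """Return the patterns used for limit checking.

    Hypothetical helper: with no explicit limit list, the start URLs
    themselves become the limit patterns (the default described above).
    """
    return list(start_urls) if limit_urls_to is None else list(limit_urls_to)


def within_limits_substring(url, limits):
    """Accept the URL if any limit pattern occurs in it (case-insensitive)."""
    lowered = url.lower()
    return any(pattern.lower() in lowered for pattern in limits)


def build_limit_regex(limits):
    """Combine all limit patterns into one case-insensitive regex."""
    if not limits:
        # An empty alternation would match everything, so use a
        # pattern that can never match instead.
        return re.compile(r"(?!)")
    alternation = "|".join(re.escape(pattern) for pattern in limits)
    return re.compile(alternation, re.IGNORECASE)


def within_limits_regex(url, limit_regex):
    """Accept the URL if the combined regex matches anywhere in it."""
    return limit_regex.search(url) is not None


# Made-up start URLs; a real list would hold the 1800 domains.
start_urls = ["http://www.example.com/", "http://docs.example.org/"]
limits = effective_limits(start_urls)
limit_regex = build_limit_regex(limits)
```

With this sketch, a link to http://other.example.net/ found one hop away would be rejected by both checks, which is the behaviour expected from the real limit matching when its pattern table is intact.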

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
