Hi Gilles,
If I set max_hop_count to 0, it will only fetch the first page, and I want it to
fetch one page further, so max_hop_count needs to be 1. But what's happening is that the
fetch goes beyond the 1800 domains, when it's supposed to reject the domains that are
not in start_url...
Any suggestions? By the way, it works fine when there are fewer domains, say 1500.
Very strange...
Dann Cohen - Dir., Outsourcing and Information Systems
Toxik Technologies Inc. - Montreal, QC, Canada
www.toxik.com - Phone: (514) 528-6945 x 2 . Fax: (514) 221-3329
-----Original Message-----
From: Gilles Detillieux [mailto:[EMAIL PROTECTED]]
Sent: 4 janvier, 2001 12:04
To: Toxik - Dann Cohen
Cc: [EMAIL PROTECTED]
Subject: Re: [htdig3-dev] Fetching outside of domain list (not supposed to)
According to Toxik - Dann Cohen:
> I'm a newcomer to this list (a six-month user of ht://dig), and before
> saying anything I would like to say hi to everyone. Now to the good
> stuff =)
>
> I've encountered a problem with the fetching part. I have about 1800 sites
> in my "start_url" to fetch with a "max_hop_count" of 1, and it seems to
> go beyond the 1800.
>
> HTTP statistics
> ===============
> Persistent connections : Yes
> HEAD call before GET : No
> Connections opened : 14973
> Connections closed : 14973
> Changes of server : 6030
> HTTP Requests : 35357
> HTTP KBytes requested : 209216
> HTTP Average request time : 0.647679 secs
> HTTP Average speed : 9.13605 KBytes/secs
>
> As you can see, the value of "Changes of server" is higher than 1800. I can
> also see in the log that it goes beyond the domain (see below for an
> example): the domain is www.singapore-inc.com, and you can see that a
> "mailto:" link and "www.sedb.com.sg" are pushed in. The problem doesn't happen
> when I fetch them alone. Any suggestions or hints are welcome.
If you haven't already figured it out, you should be setting max_hop_count
to 0, not 1. One hop means it will attempt to follow all the valid links
in those initial 1800 documents.
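
To make the hop-count semantics concrete, here is a rough sketch in Python. It
is a toy model, not ht://Dig's actual code: the function crawl(), the links
dictionary (a stand-in for fetching pages) and the allowed_hosts parameter are
all made up for illustration. It shows that 0 hops fetches only the start
documents, that 1 hop also fetches whatever they link to, and that a host
filter is what keeps off-site links out.

```python
from collections import deque
from urllib.parse import urlparse


def crawl(start_urls, links, max_hop_count, allowed_hosts=None):
    """Breadth-first walk over a toy link graph, limited by hop count.

    links maps a URL to the URLs it links to (hypothetical stand-in for
    downloading and parsing a page). If allowed_hosts is given, any link
    whose host is not in that set is rejected.
    """
    seen = set(start_urls)
    queue = deque((url, 0) for url in start_urls)
    fetched = []
    while queue:
        url, hops = queue.popleft()
        fetched.append(url)
        if hops >= max_hop_count:
            continue  # hop budget used up: do not follow this page's links
        for target in links.get(url, []):
            host = urlparse(target).hostname  # None for mailto: links
            if allowed_hosts is not None and host not in allowed_hosts:
                continue  # off-site link rejected
            if target not in seen:
                seen.add(target)
                queue.append((target, hops + 1))
    return fetched


# Made-up link graph: one start page with an on-site and an off-site link.
links = {
    "http://a.example/": ["http://a.example/about", "http://b.example/"],
    "http://a.example/about": ["http://a.example/deep"],
}
start = ["http://a.example/"]

print(crawl(start, links, 0))                 # start page only
print(crawl(start, links, 1))                 # plus every linked page
print(crawl(start, links, 1, {"a.example"}))  # plus on-site links only
```

In this model, a hop count of 1 with no host filter reaches b.example as well,
which resembles the behaviour reported above; with the filter, only pages on
the start hosts are followed.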
--
Gilles R. Detillieux E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930