Marcelo,

The limit_urls_to: line looks a bit iffy. Is this exactly how it appears in your config file? (It might be easier if you attached the whole file.)
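To show why I am asking about that line: as far as I understand it, htdig expands ${start_url} to the full start_url value and then treats every entry in limit_urls_to as a plain pattern that must occur somewhere in a URL before it is followed. The sketch below is Python written only to illustrate that idea; it is not htdig code. The helper names (expand, allowed) are made up, and it assumes simple case-sensitive substring matching.

```python
# Illustrative sketch only: mimics substring-style URL limiting.
# This is NOT htdig's implementation; expand() and allowed() are hypothetical helpers.

# Values copied from the quoted htdig.conf excerpt.
START_URL = (
    "http://localhost/manuales2/manuales2.html "
    "http://localhost/proyectos/proyectos.html "
    "http://localhost/indices"
)
LIMIT_URLS_TO = "${start_url} http://petete/"


def expand(value: str, variables: dict[str, str]) -> list[str]:
    """Replace ${name} references, then split the result on whitespace."""
    for name, replacement in variables.items():
        value = value.replace("${" + name + "}", replacement)
    return value.split()


def allowed(url: str, patterns: list[str]) -> bool:
    """A URL passes if at least one pattern occurs in it as a substring."""
    return any(pattern in url for pattern in patterns)


PATTERNS = expand(LIMIT_URLS_TO, {"start_url": START_URL})

if __name__ == "__main__":
    # The second and fourth URLs are invented examples, not pages from the real site.
    for url in (
        "http://localhost/manuales2/manuales2.html",
        "http://localhost/manuales2/cap1.html",
        "http://localhost/indices/a.html",
        "http://petete/manuales2/cap1.html",
    ):
        print(url, "->", "follow" if allowed(url, PATTERNS) else "skip")
```

Under those assumptions, a start URL that names a single page only lets that exact page through on localhost, while the http://petete/ entry lets the whole of petete through. If the line in your real file is wrapped or written differently from what you pasted, the resulting pattern list could differ, which is why seeing the whole file would help.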
Given that the number of documents is increasing, it rather looks as though there is some limit on how far the indexing goes on each run, which each later run then builds upon.

Regards,
Mike

>-----Original Message-----
>From: [EMAIL PROTECTED]
>[mailto:[EMAIL PROTECTED] On Behalf
>Of Marcelo García
>Sent: Tuesday, March 04, 2008 3:29 PM
>To: [email protected]
>Subject: [htdig] Problem with htdig indexation
>
>Hello,
>
>I am a new user of htdig and I have a problem with indexation.
>My server is running Ubuntu 7 with htdig 3.2.0b6, and the
>documents to index are on one server only. The server name is petete.
>
>The number of documents found by htdig changes every time I run
>rundig (rundig -a -s), and because of this some documents are
>not found when we search for them.
>
>Example:
>First execution of rundig -a -s:
>htdig: localhost:80 24 documents
>
>Second:
>htdig: 2 servers seen:
>htdig: localhost:80 24 documents
>htdig: petete:80 37 documents
>
>3.
>htdig: 2 servers seen:
>htdig: localhost:80 24 documents
>htdig: petete:80 326 documents
>
>4.
>htdig: 2 servers seen:
>htdig: localhost:80 24 documents
>htdig: petete:80 409 documents
>
>5.
>htdig: 2 servers seen:
>htdig: localhost:80 24 documents
>htdig: petete:80 438 documents
>This is the maximum number of documents found. After some time
>(one day, more or less), if we repeat the process we get the
>same results.
>
>In htdig.conf we have (parameters that I think could affect
>indexing):
>start_url: http://localhost/manuales2/manuales2.html
>http://localhost/proyectos/proyectos.html http://localhost/indices
>
>limit_urls_to: ${start_url} http://petete/
>
>exclude_urls: /cgi-bin/ .cgi /images
>
>search_rewrite_rules: http://localhost/(.*) http://petete/\\1
>
>Please, can you help me with this problem?
>
>Thanks in advance.

_______________________________________________
ht://Dig general mailing list: <[email protected]>
ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-general

