> hmm, almost all sites i crawl are external ones. i am not sure but i
 > think that it doesn't depend on the content of those pages, just there is some
 > critical point where it fails.

 Ok. 

 > Maybe you can try to index more than 10000 not so small documents
 > locally, i.e. with your own documents first using my config, of course if
 > you have so many. If you still want my file of urls, let me know, and
 > i'll send them.

 I doubt very much that it's a matter of volume. We've indexed 2 million
HTML documents using htword (not with htdig, though) without problems. Could
you send your config file? I'll start crawling and wait for it to crash
(hopefully :-).

 Cheers,

-- 
                Loic Dachary

                24 av Secretan
                75019 Paris
                Tel: 33 1 42 45 09 16
                e-mail: [EMAIL PROTECTED]
                URL: http://www.senga.org/


------------------------------------
To unsubscribe from the htdig3-dev mailing list, send a message to
[EMAIL PROTECTED] 
You will receive a message to confirm this. 