> Hmm, almost all sites I crawl are external ones. I am not sure, but I
> think that it doesn't depend on the content of those pages; there is just
> some critical point where it fails.
Ok.
> Maybe you can try to index more than 10000 not-so-small documents
> locally, i.e. with your own documents, first using my config, if you
> have that many, of course. If you still want my file of URLs, let me
> know, and I'll send it.
I doubt very much that it's a matter of volume. We've indexed 2 million
HTML documents using htword (not with htdig, though) without problems. Could
you send your config file? I'll start crawling and wait for it to crash
(hopefully :-).
Cheers,
--
Loic Dachary
24 av Secretan
75019 Paris
Tel: 33 1 42 45 09 16
e-mail: [EMAIL PROTECTED]
URL: http://www.senga.org/