: to index remote HTML files. Can I use Nutch to crawl for the remote HTML
: files and use the index for the Lucene code I have already written? Or do
: I have to redo the whole thing using the Nutch API? I am using boosting
: during the indexing. I hope Nutch can boost fields, too. Any help
FYI, here is a list of open source web crawlers:
http://java-source.net/open-source/crawlers
Thanks,
Koji
I am using Lucene to index local HTML files. The requirement just changed
to index remote HTML files. Can I use Nutch to crawl for the remote HTML
files and use the index for the Lucene code I have already written? Or do
I have to redo the whole thing using the Nutch API? I am using boosting
during the indexing. I hope Nutch can boost fields, too. Any help
The Lucene wiki should be a good starting point:
http://wiki.apache.org/jakarta-lucene/FrontPage?action=show&redirect=FrontPageEN
Nader Henein
Legolas Woodland wrote:
Hi
Thank you in advance
Can someone help me find some documents about the Lucene and Nutch
architecture and "how it works"?
Also, what algorithms are used in Lucene indexing? I am interested in
everything about the underlying system of Lucene and Nutch.
Thank you