Hi,
is there a tutorial, or can anyone explain whether and how I can run the Nutch
crawler from Java instead of via the shell script?
Furthermore, I don't need to crawl, because I already have a list of URLs (PDF,
Word, Excel, ... documents) that I have to index.
-> In my case Nutch only has to create the index from the URL list.
Until now I have used a shell script that calls "bin/nutch crawl ...".
If possible, though, I would like to use Java code instead of the "bin/nutch"
crawl script.
Are there Java classes and methods to do this?
To make it clearer, this is how I imagine starting the crawl, or rather the
indexing process:
"java Crawl"
so that I can set the crawl options in the Java code and not in a
shell script.
Is this possible?
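
To make the idea concrete, here is a rough, untested sketch of what I imagine. It assumes that "bin/nutch crawl" simply runs the class org.apache.nutch.crawl.Crawl and that this class takes the same arguments as the script (URL directory, -dir, -depth, -topN). The names NutchCrawlSketch and buildCrawlArgs, the directories "urls" and "crawl", and the value 50 are made up for illustration. The class is looked up by reflection only so that the sketch compiles without the Nutch jars.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper: sets the crawl options in Java instead of a shell script.
class NutchCrawlSketch {

    // Builds the argument list that would otherwise follow "bin/nutch crawl".
    static String[] buildCrawlArgs(String urlDir, String outputDir, int depth, int topN) {
        List<String> args = new ArrayList<>();
        args.add(urlDir);                       // directory containing the seed URL list
        args.add("-dir");
        args.add(outputDir);                    // where the crawl data and index are written
        args.add("-depth");
        args.add(String.valueOf(depth));        // number of generate/fetch rounds
        args.add("-topN");
        args.add(String.valueOf(topN));         // maximum pages fetched per round
        return args.toArray(new String[0]);
    }

    public static void main(String[] unused) throws Exception {
        // Depth 1 is intended to fetch only the listed URLs; 50 is an arbitrary limit.
        String[] crawlArgs = buildCrawlArgs("urls", "crawl", 1, 50);
        try {
            // Assumption: this is the class that "bin/nutch crawl" starts.
            Class<?> crawl = Class.forName("org.apache.nutch.crawl.Crawl");
            Method crawlMain = crawl.getMethod("main", String[].class);
            crawlMain.invoke(null, (Object) crawlArgs);
        } catch (ClassNotFoundException e) {
            System.err.println("Nutch is not on the classpath; arguments would have been: "
                    + String.join(" ", crawlArgs));
        }
    }
}
```

As far as I understand, a depth of 1 should mean that only the URLs from my list are fetched and no links are followed, which is what I need.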
Thanks!
Matthias