nutch crawling with java (not shellscript)

Matthias W. Tue, 13 Jan 2009 04:18:25 -0800

Hi,
is there a tutorial, or can anyone explain whether and how I can run the Nutch
crawler from Java instead of via the shell script?
Furthermore, I don't actually need to crawl, because I already have a list of
URLs (PDF, Word, Excel, ... documents) that I have to index.
-> In my case Nutch only has to create the index from the URL list.


Until now I have used a shell script that calls "bin/nutch crawl ...".

But if possible, I want to use Java code instead of the "bin/nutch"
crawl script.

Are there Java classes and methods to do this?

To make clearer what I have in mind, I would like to start the crawl or the
indexing process with something like:
    "java Crawl"
so that I can set the crawl options in the Java code rather than in a
shell script.

Is this possible?

Thanks!
Matthias
-- 
View this message in context: 
http://www.nabble.com/nutch-crawling-with-java-%28not-shellscript%29-tp21434602p21434602.html
Sent from the Nutch - User mailing list archive at Nabble.com.
