[Nutch-general] Crawling the web and going into depth

Berlin Brown Tue, 12 Jun 2007 14:19:17 -0700

I am following the tutorial linked below (with Nutch 0.9) to crawl the
web.  I went through the steps: downloaded the dmoz data, ran the
parser, and so on.


bin/nutch inject crawl/crawldb dmoz
etc
etc.
bin/nutch fetch $s1

Once I get to this step, is there a way to actually "crawl" the sites
that are in the dmoz URL list?  It seems like we are only fetching the
URLs that come straight out of the dmoz list.  Let's say I want to
crawl those sites to a particular depth. How would I do that?

http://lucene.apache.org/nutch/tutorial8.html
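
For reference, here is a rough sketch of how the steps in that tutorial
might be repeated to reach a given depth: one generate / fetch /
updatedb round per level, so that links found in one round become
fetch candidates in the next. The function name crawl_rounds and the
variables NUTCH, CRAWLDB and SEGMENTS are made up for illustration and
are not part of Nutch; it also assumes the crawldb has already been
injected as above.

```shell
#!/bin/sh
# Sketch only: NUTCH, CRAWLDB and SEGMENTS are illustrative variables,
# not Nutch configuration settings.
NUTCH=${NUTCH:-bin/nutch}
CRAWLDB=${CRAWLDB:-crawl/crawldb}
SEGMENTS=${SEGMENTS:-crawl/segments}

# crawl_rounds DEPTH: run DEPTH rounds of generate -> fetch -> updatedb.
crawl_rounds() {
    depth=$1
    round=1
    while [ "$round" -le "$depth" ]; do
        # Select URLs that are due for fetching into a new segment.
        $NUTCH generate "$CRAWLDB" "$SEGMENTS" || return 1
        # Segment directories are timestamp-named; take the newest one.
        segment=$(ls -d "$SEGMENTS"/2* | tail -1)
        $NUTCH fetch "$segment" || return 1
        # Merge fetch results and newly discovered links into the crawldb.
        $NUTCH updatedb "$CRAWLDB" "$segment" || return 1
        round=$((round + 1))
    done
}
```

Calling "crawl_rounds 3" after the inject step would then run three
rounds. The same tutorial also describes a one-step "bin/nutch crawl"
command that takes a -depth option, which may be simpler if a fixed
depth is all that is needed.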

-- 
Berlin Brown
http://www.newspiritcompany.com - newspirit technologies

