Using nutch only as a webcrawler?

johan . sjoberg Fri, 26 Jun 2009 06:01:29 -0700

Hi,


Nutch seems like a great project, but we want to take a different approach to
indexing web content. Is it possible to use Apache Nutch only as a
web crawler that fetches *htm (or similar) content directly to disk?


E.g. fetching the URL www.addr.com/* results in physical files
www.addr.com/about.html, www.addr.com/whatever.htm, etc.
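
To make the on-disk layout I have in mind concrete, here is a minimal sketch in plain Python. It is not Nutch code and uses no Nutch API; the function name `url_to_path`, the `mirror` root directory, and the `index.html` fallback for directory-style URLs are all assumptions made up for illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def url_to_path(url, root="mirror"):
    """Map a URL to a relative file path under `root` (hypothetical layout)."""
    # urlsplit only recognises the host part when a scheme is present.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    path = parts.path
    # A directory-style URL has no file name, so store it as index.html
    # (an assumed convention, not something Nutch prescribes).
    if not path or path.endswith("/"):
        path += "index.html"
    # Join root / host / path; the query string is ignored in this sketch.
    return str(PurePosixPath(root, parts.netloc, path.lstrip("/")))


print(url_to_path("www.addr.com/about.html"))
print(url_to_path("http://www.addr.com/"))
```

The sketch only shows the URL-to-file mapping; it does not fetch anything, and it does not guard against `..` segments in the path, which a real mirror would need to handle.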


Regards,
Johan S.
