Hi,

Nutch seems like a great project, but we wish to take another approach to
indexing web content. Is it possible to use Apache Nutch only as a
web crawler to fetch *.htm (or similar) content directly to disk?


E.g. fetching the URL www.addr.com/* would result in physical files
www.addr.com/about.html, www.addr.com/whatever.htm, etc.


Regards,
Johan S.
