Hi Team, I need to crawl a website using Apache Nutch. Currently, I am using Nutch 1.x.
I have followed the steps in the tutorial at https://wiki.apache.org/nutch/NutchTutorial up to the 'invertlinks' step. Then I used the 'readseg' command to dump the segments. The dump file was created successfully. Now I have the following questions:

1. Is the segment dump file the right file to read the contents of a website? If yes, how do I read the contents from the dump file? I am unable to read it, as it looks encrypted.
2. Otherwise, how can I read the contents of a website?

Thanks,
Vijay