Hello, segment dumps are notoriously hard to comprehend. What information are
you looking for? What do you mean by reading the contents of a website?
Markus

 
 
-----Original message-----
> From: Vijay Veluchamy <vijay.veluch...@gmail.com>
> Sent: Tuesday 5th April 2016 16:22
> To: user@nutch.apache.org
> Subject: How to read segment dump?
> 
> Hi Team,
> 
> I need to crawl a website using Apache Nutch. Currently, I am using Nutch
> 1.x.
> 
> I have followed the steps provided in the following URL up to the
> 'invertlinks' step.
> 
> https://wiki.apache.org/nutch/NutchTutorial
> 
> Then I used the 'readseg' command to dump the segments. The dump file was
> created successfully.
> 
> Now, I have the following questions.
> 
> 1. Is this the right file (segment dump file) to read the contents of a
> website? If yes, how do I read the contents from the dump file? I am unable
> to read it, as it looks encrypted.
> 2. Otherwise, how can I read the contents of a website?
> 
> Thanks,
> Vijay
> 
