Hi Navin,
First crawl the sites with the crawl command [0]. Then use the readseg
command [1], [2] to dump the crawled segments to a text file.
Both steps are easy to automate with a shell script, Python, or any other
scripting language.
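
As a rough sketch of the automation step, the Python below runs
readseg -dump once for each segment of a finished crawl. It assumes a
Nutch 1.x layout where segments live under <crawl_dir>/segments and that
bin/nutch is reachable from the working directory. The function names,
the directory names and the runner parameter are hypothetical, introduced
only for illustration.

```python
import subprocess
from pathlib import Path


def build_readseg_command(nutch_bin, segment_dir, output_dir):
    """Build the argument list for `bin/nutch readseg -dump <segment> <output>`."""
    return [str(nutch_bin), "readseg", "-dump", str(segment_dir), str(output_dir)]


def dump_segments(crawl_dir, dump_root, nutch_bin="bin/nutch", runner=subprocess.run):
    """Run readseg -dump once per segment found under <crawl_dir>/segments.

    Assumption: each subdirectory of <crawl_dir>/segments is one segment.
    The output directory is not created here; readseg is left to create it.
    Returns the list of output directories, one per segment.
    """
    segments_root = Path(crawl_dir) / "segments"
    outputs = []
    for segment in sorted(p for p in segments_root.iterdir() if p.is_dir()):
        out_dir = Path(dump_root) / segment.name
        # check=True raises CalledProcessError if the nutch command fails.
        runner(build_readseg_command(nutch_bin, segment, out_dir), check=True)
        outputs.append(out_dir)
    return outputs
```

Calling dump_segments("crawl", "dump") from the Nutch install directory
would then write one dump per segment under dump/. The runner parameter
is there only so the command can be replaced with a stub when testing
without a Nutch install.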

[0] : section 3.1 in http://wiki.apache.org/nutch/NutchTutorial
[1] : http://www.marco.bianchi.name/myPortal/using-the-binnutch-readseg-command.aspx
[2] : http://wiki.apache.org/nutch/bin/nutch_readseg

Thanks,
Tejas Patil


On Tue, Dec 25, 2012 at 9:47 PM, navinkumar <navinkumar...@gmail.com> wrote:

> Hi, I'm a newbie to Nutch. I have successfully installed and configured
> Nutch to crawl the sites, and I want to get the data from the crawl.
> 1. Is there any way to get the data programmatically?
> 2. What is the command to extract the data into plain text?
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Extract-data-in-nutch-tp4029072.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
