If you are using Nutch 2.2.1, the crawled data is already stored in a NoSQL
database, e.g., Apache HBase. What you need to develop is client code that
reads data out of this database. You will probably need to understand how
fields are stored, so I recommend having a look at the mapping configuration
file, e.g., conf/gora-hbase-mapping.xml.
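
As a rough starting point, here is a small sketch that lists which HBase
column each WebPage field maps to, using only the JDK's XML parser. The class
name GoraMappingReader and the embedded SAMPLE excerpt are made up for
illustration, and the field entries in SAMPLE are only what I would expect to
see in a default install, so please check them against your own
conf/gora-hbase-mapping.xml.

```java
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class GoraMappingReader {

    // Illustrative excerpt only (an assumption); the real mapping file has more entries.
    public static final String SAMPLE =
          "<gora-orm>"
        + "<class table=\"webpage\" keyClass=\"java.lang.String\""
        + " name=\"org.apache.nutch.storage.WebPage\">"
        + "<field name=\"baseUrl\" family=\"f\" qualifier=\"bas\"/>"
        + "<field name=\"content\" family=\"f\" qualifier=\"cnt\"/>"
        + "<field name=\"title\" family=\"p\" qualifier=\"t\"/>"
        + "<field name=\"text\" family=\"p\" qualifier=\"c\"/>"
        + "<field name=\"outlinks\" family=\"ol\"/>"
        + "</class>"
        + "</gora-orm>";

    /** Returns field name -> "family:qualifier" (or just "family") for every field element. */
    public static Map<String, String> readColumns(InputStream xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
        NodeList fields = doc.getElementsByTagName("field");
        Map<String, String> columns = new LinkedHashMap<>();
        for (int i = 0; i < fields.getLength(); i++) {
            Element field = (Element) fields.item(i);
            String family = field.getAttribute("family");
            // getAttribute returns "" when the attribute is absent,
            // which is the case for fields that span a whole column family.
            String qualifier = field.getAttribute("qualifier");
            columns.put(field.getAttribute("name"),
                        qualifier.isEmpty() ? family : family + ":" + qualifier);
        }
        return columns;
    }

    public static void main(String[] args) throws Exception {
        // Pass the path to your own mapping file, or run without arguments to use SAMPLE.
        try (InputStream in = args.length > 0
                ? new FileInputStream(args[0])
                : new ByteArrayInputStream(SAMPLE.getBytes(StandardCharsets.UTF_8))) {
            for (Map.Entry<String, String> e : readColumns(in).entrySet()) {
                System.out.println(e.getKey() + " -> " + e.getValue());
            }
        }
    }
}
```

Once you know the family and qualifier of the fields you care about, your
client code can ask the database for just those columns instead of pulling
whole rows.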

On Jul 30, 2556 BE, at 4:28 PM, Weder Carlos Vieira wrote:

> Hello everyone,
> 
> I would like to say that Nutch 2.2.1 is working very well.
> I spent the last few days testing this new version and liked it a lot.
> Congratulations.
> 
> Now I would like some tips from you. I want to create a new
> website interface that reads the URLs crawled, parsed and saved in the
> database and shows their contents on the pages.
> 
> My question is: what is the best way to read this data?
> What can I use as a middle layer between the database and my application to
> make selecting data easier and get the best results?
> 
> 
> Could you share some reading material, tips, or experience? How
> can I design this structure?
> 
> 
> Thanks.
