Hello,

After the crawl is done, I would like to query
the WebDB for pages (by URL), and I would like
to access the content of these pages.

I see that there is a method
WebDBReader.getUrl(String url) which returns a Page.
Is there a way to get the recno of this Page so
that I can retrieve the Content by doing something
like this:

// code from net.nutch.protocol.Content.java
File file = new File(segment, DIR_NAME);
ArrayFile.Reader contents = new ArrayFile.Reader(file.toString());
Content content = new Content();
contents.get(recno, content);

Thanks!
Simon



_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
