Dennis Kubes wrote:
> Can somebody direct me on how to get the stored text and parse
> metadata for a given url?
From a single segment, or from a set of segments?
From a single segment: please see how SegmentReader.get() does this
(although it's a bit obscured by the fact that it uses multiple threads
to retrieve different parts of the data).
For multiple segments, it helps to know in advance which segment holds
the data associated with the URL; that is normally what the Lucene
index is for ;) - please see FetchedSegments for details.
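
To illustrate the idea only: the sketch below is NOT Nutch code and does not
use SegmentReader or FetchedSegments. It assumes plain in-memory maps as
stand-ins, with a url -> segment map playing the role of the Lucene index.
All class, method and field names (SegmentLookupSketch, Entry, add, lookup)
are made up for this example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class SegmentLookupSketch {

    /** Hypothetical stand-in for what a segment stores for one URL. */
    public static final class Entry {
        public final String parseText;
        public final Map<String, String> parseMeta;

        public Entry(String parseText, Map<String, String> parseMeta) {
            this.parseText = parseText;
            this.parseMeta = parseMeta;
        }
    }

    // segment name -> (url -> stored entry); stands in for the segments on disk
    private final Map<String, Map<String, Entry>> segments = new HashMap<>();

    // url -> segment name; stands in for the index that records
    // which segment holds the data for a URL
    private final Map<String, String> urlToSegment = new HashMap<>();

    /** Stores an entry in a segment and records where it went. */
    public void add(String segment, String url, Entry entry) {
        segments.computeIfAbsent(segment, k -> new HashMap<>()).put(url, entry);
        urlToSegment.put(url, segment);
    }

    /** Asks the "index" for the segment first, then reads only that segment. */
    public Optional<Entry> lookup(String url) {
        String segment = urlToSegment.get(url);
        if (segment == null) {
            return Optional.empty(); // URL is not known to the index
        }
        return Optional.ofNullable(segments.get(segment).get(url));
    }

    public static void main(String[] args) {
        SegmentLookupSketch store = new SegmentLookupSketch();
        store.add("segment-a", "http://example.com/",
                new Entry("Example page text",
                        Collections.singletonMap("title", "Example")));
        store.add("segment-b", "http://example.org/",
                new Entry("Another page",
                        Collections.singletonMap("title", "Other")));

        Optional<Entry> hit = store.lookup("http://example.org/");
        if (hit.isPresent()) {
            System.out.println(hit.get().parseText);
            System.out.println(hit.get().parseMeta);
        }
    }
}
```

Without the url -> segment map, every segment would have to be probed in
turn for each URL, which is the cost the index is there to avoid.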
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com