Andrzej, do you think it is possible, without too much work, to access the segment data from a Lucene client that I have made, with or without the use of Nutch?
I asked the same question late yesterday in a new mail titled 'Lucene client and nutch index', referring to the FetchedSegment class you mention.

Best regards,
Ronny

-----Original Message-----
From: Andrzej Bialecki [mailto:[EMAIL PROTECTED]
Sent: 19 June 2007 20:44
To: [EMAIL PROTECTED]
Subject: Re: SV: doubt about indexing

Naess, Ronny wrote:
> No, because the content is not stored, only indexed, in the index itself.
> As I have found out, the content is cached elsewhere, and I am trying to
> figure out how to get it from a Lucene client just now.

Lucene might store the full text, but Nutch doesn't use this (for performance reasons). Whenever the full text is needed, it's retrieved from Nutch segment data. Please see the logic in o.a.n.s.FetchedSegment for details - this process doesn't use Lucene at all; it simply retrieves records from a Hadoop MapFile, using the URL as the document ID.

-- 
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
