On 6/20/07, Naess, Ronny <[EMAIL PROTECTED]> wrote:
> I tried your tip Brian, but the property
>
>  <property>
>          <name>fetcher.store.content</name>
>          <value>true</value>
>          <description>If true, fetcher will store content.</description>
>  </property>
>
> set in nutch-site.xml does not seem to work (still no content), and I
> found exactly the same setting in nutch-default.xml anyway, where it was
> also set to true. Strange!
>
> Does it mean what we think it does, i.e. store into the index, or does it
> mean store as segment data?

If fetcher.store.content is set to true, the fetcher stores the
original version of the page (its 'content') in the <segment>/content
directory. It has nothing to do with indexing.

Note that this content is not available to the Indexer, but the parse
text is. If you want to store the parse text in the index, just change
the index-basic plugin where it adds the "content" field so that it uses
Store.YES. (In case of any confusion: the parse text is indexed as
"content".)
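
To make the stored-versus-indexed distinction concrete, here is a minimal
sketch in plain Java. It is a toy model only, not the Lucene or Nutch API:
ToyIndex, addField, search and get are hypothetical names invented for this
illustration, and the Store enum merely stands in for Lucene's
Field.Store.YES / Field.Store.NO. It shows why a field that is only indexed
can be searched but cannot be read back for summaries or highlighting.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Toy model of a search index; NOT the Lucene or Nutch API. */
public class StoredVsIndexed {

    /** Hypothetical stand-in for Lucene's Field.Store.YES / Field.Store.NO. */
    enum Store { YES, NO }

    static class ToyIndex {
        // "field:term" -> ids of documents containing that term (the indexed part)
        private final Map<String, Set<Integer>> postings = new HashMap<>();
        // per document: field name -> original text (the stored part)
        private final List<Map<String, String>> stored = new ArrayList<>();

        /** Starts a new document and returns its id. */
        int newDocument() {
            stored.add(new HashMap<>());
            return stored.size() - 1;
        }

        /** Always indexes the text; keeps the original only if store == YES. */
        void addField(int docId, String name, String text, Store store) {
            for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (!token.isEmpty()) {
                    postings.computeIfAbsent(name + ":" + token, k -> new HashSet<>())
                            .add(docId);
                }
            }
            if (store == Store.YES) {
                stored.get(docId).put(name, text);
            }
        }

        /** Returns ids of documents whose field contains the term. */
        Set<Integer> search(String name, String term) {
            String key = name + ":" + term.toLowerCase(Locale.ROOT);
            return postings.getOrDefault(key, new HashSet<>());
        }

        /** Returns the stored text, or null if the field was indexed only. */
        String get(int docId, String name) {
            return stored.get(docId).get(name);
        }
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        int a = index.newDocument();
        index.addField(a, "content", "Nutch fetches and parses pages", Store.NO);
        int b = index.newDocument();
        index.addField(b, "content", "Nutch indexes parse text", Store.YES);

        System.out.println(index.search("content", "nutch")); // both documents match
        System.out.println(index.get(a, "content"));          // indexed only: nothing to read back
        System.out.println(index.get(b, "content"));          // stored: original text is available
    }
}
```

With Store.NO a client can still find the document, but has no text to
build a summary from; with Store.YES the text comes back from the index
itself, at the cost of a larger index.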

>
> Regards,
> Ronny
>
> -----Original Message-----
> From: Brian Whitman [mailto:[EMAIL PROTECTED]
> Sent: 19 June 2007 19:52
> To: [EMAIL PROTECTED]
> Subject: Re: Lucene client and nutch index
>
>
> On Jun 19, 2007, at 1:39 PM, Naess, Ronny wrote:
>
> > I have made a small Lucene client reading my nutch index created with
> > Nutch-0.9
> >
> > This works fine. However, since 'content' is only indexed, not stored,
> > in the index, I have to find a way to access the content to create a
> > summary (and highlight the query terms).
> >
>
> You can simply set the content to be stored in the Lucene index; then
> highlighting will work normally from any Lucene client. Search the
> mailing list (there was a post just yesterday) for how to accomplish
> this; there's a single line of code to change. Do realise that storing
> content will slow down some queries and your index size will grow very
> large.
>
> -Brian
>


-- 
Doğacan Güney
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
