^5 ;)

On Mon, Mar 25, 2013 at 11:02 PM, Bushman, Lamont <[email protected]> wrote:
> Thank you very much for the help Simon. I am amazed I was able to accomplish
> what I wanted. I didn't store the body in the Index. And I used Highlighter
> to return the best fragments by parsing my original document.
> ________________________________________
> From: Simon Willnauer [[email protected]]
> Sent: Monday, March 25, 2013 4:07 AM
> To: [email protected]
> Subject: Re: Compression and Highlighter
>
> On Mon, Mar 25, 2013 at 8:13 AM, Bushman, Lamont <[email protected]> wrote:
>> I have a project where I need to index documents using Lucene 4.1.0.
>> One of the fields for the stored Document is the actual text from the
>> document(.pdf, .docx, etc.) I want to be able to highlight text from the
>> documents in the search results. I was looking at some older tutorials
>> about storing the field with TermVectors and also storing it in the index
>> with Store.COMPRESS. However, with Lucene 4.1 they have done away with
>> Store.COMPRESS. Is there still a way to compress the field?
>
> Lucene 4.1 uses a compressed stored fields format under the hoods by
> default. The compression is completely transparent and enabled by
> default. Here is some background:
> http://blog.jpountz.net/post/33247161884/efficient-compressed-stored-fields-with-lucene
>
>> I am worried about the amount of space that will be stored in the index
>> if I have to have the "body" Field stored and uncompressed.
>> Are there ways around having to store the whole Field in its original
>> form?
>> Since I am already going to be storing the actual documents on the
>> server, would it be feasible (time) to not store TermVectors or Store the
>> field at all until the user searches for a document. Then at runtime I can
>> re-index the top docs from the original documents in RAM and use Highlighter
>> to return fragments?
>
> this is what the highlighter does if you are not using the
> FastVectorHighlighter. You can just pass in the string value you wanna
> highlight no matter if you stored it in lucene or not. You just need
> to see if that works for you performance wise without storing TV.
>
> simon
>>
>> Thanks
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
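
For anyone finding this thread later, here is a rough, dependency-free sketch of the idea Simon describes: the text to highlight is re-scanned at query time, so it does not have to be stored in the index or carry term vectors. This is not Lucene's Highlighter. The class name RawTextHighlighter, the bestFragments method, its parameters, and the <B> markers are all made up for illustration, and the word splitting is a plain regex rather than a real Analyzer. With Lucene itself the corresponding call should be Highlighter.getBestFragments(analyzer, fieldName, text, maxNumFragments), where text is whatever string you extracted from the original document.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy stand-in for query-time highlighting. NOT Lucene's Highlighter. */
public class RawTextHighlighter {

    // Assumption: a "token" is a run of word characters (no real analysis chain).
    private static final Pattern WORD = Pattern.compile("\\w+");

    private static final class Fragment {
        final String text;
        final int hits;
        Fragment(String text, int hits) { this.text = text; this.hits = hits; }
    }

    /**
     * Scans the raw text, cuts it into fragments of roughly fragmentSize
     * characters at token boundaries, wraps tokens found in queryTerms
     * (expected in lower case) in <B>..</B>, and returns up to maxFragments
     * fragments, those with the most matches first.
     */
    public static List<String> bestFragments(String text, Set<String> queryTerms,
                                             int fragmentSize, int maxFragments) {
        List<Fragment> fragments = new ArrayList<>();
        StringBuilder out = new StringBuilder();
        int fragStart = 0; // offset where the current fragment began
        int last = 0;      // offset up to which text has been copied
        int hits = 0;      // matches seen in the current fragment

        Matcher m = WORD.matcher(text);
        while (m.find()) {
            // Close the current fragment once it has grown past fragmentSize.
            if (m.start() - fragStart >= fragmentSize) {
                out.append(text, last, m.start());
                if (hits > 0) {
                    fragments.add(new Fragment(out.toString().trim(), hits));
                }
                out.setLength(0);
                hits = 0;
                fragStart = m.start();
                last = m.start();
            }
            // Copy whatever sits between the previous token and this one.
            out.append(text, last, m.start());
            String token = m.group();
            if (queryTerms.contains(token.toLowerCase(Locale.ROOT))) {
                out.append("<B>").append(token).append("</B>");
                hits++;
            } else {
                out.append(token);
            }
            last = m.end();
        }
        // Flush the tail of the text as the final fragment.
        out.append(text, last, text.length());
        if (hits > 0) {
            fragments.add(new Fragment(out.toString().trim(), hits));
        }

        // List.sort is stable, so equal scores keep document order.
        fragments.sort((a, b) -> Integer.compare(b.hits, a.hits));
        List<String> result = new ArrayList<>();
        for (Fragment f : fragments) {
            if (result.size() == maxFragments) break;
            result.add(f.text);
        }
        return result;
    }
}
```

Calling RawTextHighlighter.bestFragments(bodyText, terms, 100, 3) on text freshly extracted from the .pdf or .docx returns at most three marked-up snippets. The cost is re-scanning the text of each displayed hit on every search, which is the performance trade-off mentioned above for skipping term vectors.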
