Dmitry Serebrennikov wrote:
Niraj Alok wrote:
By the way, I could be wrong, but I think the 35% figure you referenced in your first e-mail does not actually include any stored fields. The point of the 35%, I think, was to illustrate that the index data structures Lucene uses for searching are efficient. But Lucene does nothing special with stored content: no compression or anything like that. So you end up with the full size of your data plus 35% of the indexed data.
Hi PA,
Thanks for the detail! Since we are also using Lucene to store the data, I guess I would not be able to use it.
There will be a patch available by the end of this week which allows you to store binary values compressed within a Lucene index. This means you will be able to store and retrieve whole documents within Lucene in a very efficient way ;-)
regards bernhard
Cheers. Dmitry.
Regards,
Niraj
----- Original Message -----
From: "petite_abeille" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Wednesday, September 01, 2004 1:14 PM
Subject: Re: indexing size
Hi Niraj,
On Sep 01, 2004, at 06:45, Niraj Alok wrote:
If I make some of them Field.UnStored, I can see from the javadocs that it will be indexed and tokenized but not stored. If it is not stored, how can I use it while searching?
The different types of fields don't affect how you search. Searching always works the same way.
Using UnStored fields simply means that you use Lucene purely as a search index, not for storing any data.
Specifically, the assumption is that your original data lives somewhere else, outside of Lucene. If this assumption is true, then you can index everything as UnStored, with the addition of one Keyword field per document. The Keyword field holds some sort of unique identifier which allows you to retrieve the original data if necessary (e.g. a primary key, a URI, and so on).
Here is an example of this approach:
(1) For indexing, check the indexValuesWithID() method
http://cvs.sourceforge.net/viewcvs.py/zoe/ZOE/Frameworks/SZObject/SZIndex.java?view=markup
Note the addition of a Field.Keyword for each document and the use of Field.UnStored for everything else.
(2) For fetching, check objectsWithSpecificationAndHitsInStore()
http://cvs.sourceforge.net/viewcvs.py/zoe/ZOE/Frameworks/SZObject/SZFinder.java?view=markup
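If the ZOE sources are not at hand, the idea can be sketched in plain Java. This is only a toy stand-in and does not use Lucene's API: the class ExternalStoreIndexSketch, its index() and search() methods, and the sample ids are all made up for illustration. The term map plays the role of the UnStored fields (searchable, but the text cannot be read back), and the id plays the role of the Keyword field.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical toy index: keeps terms and document ids, never the original text. */
final class ExternalStoreIndexSketch {
    // term -> ids of the documents containing it (insertion-ordered)
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Tokenize and index the text under the given id; the text itself is not kept. */
    public void index(String id, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new LinkedHashSet<>()).add(id);
            }
        }
    }

    /** Return the ids of the documents containing the term (case-insensitive). */
    public List<String> search(String term) {
        Set<String> ids = postings.get(term.toLowerCase(Locale.ROOT));
        return ids == null ? new ArrayList<>() : new ArrayList<>(ids);
    }

    public static void main(String[] args) {
        // Assumption: the original data lives outside the index, keyed by a unique id.
        Map<String, String> externalStore = new HashMap<>();
        externalStore.put("doc-1", "Lucene index size and stored fields");
        externalStore.put("doc-2", "Compressed binary values in an index");

        ExternalStoreIndexSketch index = new ExternalStoreIndexSketch();
        for (Map.Entry<String, String> e : externalStore.entrySet()) {
            index.index(e.getKey(), e.getValue());
        }

        // A hit yields only an id; the text is then fetched from the external store.
        for (String id : index.search("stored")) {
            System.out.println(id + " -> " + externalStore.get(id));
        }
    }
}
```

The search itself never touches the original text. Only the last step, the lookup by id, goes back to wherever the data really lives, which is why the index stays small.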
HTH.
Cheers,
PA.
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]