Re: Storing a field as byte[]

2010-05-14 Thread Saurabh Agarwal
Thanks, Uwe. Saurabh Agarwal On Fri, May 14, 2010 at 11:09 AM, Uwe Schindler u...@thetaphi.de wrote: There is a class NumericField in the same package. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message-

Re: Error of the code

2010-05-14 Thread manjula wijewickrema
Hi Ian, Thanks for your reply. vector.size() returns the total number of indexed terms in the index. With your help I was finally able to run the program and get the results. Thanks a lot. Manjula On Thu, May 13, 2010 at 6:52 PM, Ian Lea ian@gmail.com wrote: What does vector.size()

Access indexed terms

2010-05-14 Thread manjula wijewickrema
Hi, Is it possible to put the indexed terms into an array in Lucene? For example, imagine I have indexed a single document in Lucene and now I want to access those terms in the index. Is it possible to retrieve (call) those terms as array elements? If it is possible, then how? Thanks, Manjula

Re: Access indexed terms

2010-05-14 Thread Andrzej Bialecki
On 2010-05-14 11:35, manjula wijewickrema wrote: Hi, Is it possible to put the indexed terms into an array in Lucene? For example, imagine I have indexed a single document in Lucene and now I want to access those terms in the index. Is it possible to retrieve (call) those terms as array

Re: Access indexed terms

2010-05-14 Thread manjula wijewickrema
Hi Andrzej, Thanks for the reply. But as you have mentioned, creating arrays for indexed terms seems to be a little difficult. My intention here is to find the term frequencies of the terms in an indexed document. I can find the term frequency of a particular term (given as a query) if I specify the

Re: Access indexed terms

2010-05-14 Thread Andrzej Bialecki
On 2010-05-14 14:24, manjula wijewickrema wrote: Hi Andrzej, Thanks for the reply. But as you have mentioned, creating arrays for indexed terms seems to be a little difficult. My intention here is to find the term frequencies of the terms in an indexed document. I can find the term frequency of a
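For the term-frequency question in this thread, here is a minimal plain-Java sketch of the result being asked for: the terms of one document as an array, with a parallel array of counts. It is not Lucene code and calls no Lucene API. The class name TermCounts, its fields, and the naive lowercase/split tokenization are all assumptions made for illustration, and a real analyzer would tokenize differently. As far as I know, a stored term vector in Lucene (TermFreqVector, with getTerms() and getTermFrequencies()) exposes a similar pair of parallel arrays.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not part of Lucene): term counts for one document's text. */
final class TermCounts {

    /** Distinct terms in sorted order. */
    final String[] terms;
    /** freqs[i] is the number of times terms[i] occurs in the text. */
    final int[] freqs;

    private TermCounts(String[] terms, int[] freqs) {
        this.terms = terms;
        this.freqs = freqs;
    }

    /** Counts terms using a naive tokenizer: lowercase, then split on non-word characters. */
    static TermCounts of(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (token.isEmpty()) {
                continue; // split() yields an empty first token when the text starts with punctuation
            }
            counts.merge(token, 1, Integer::sum);
        }
        // Copy the sorted map into two parallel arrays.
        String[] terms = new String[counts.size()];
        int[] freqs = new int[counts.size()];
        int i = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            terms[i] = entry.getKey();
            freqs[i] = entry.getValue();
            i++;
        }
        return new TermCounts(terms, freqs);
    }

    public static void main(String[] args) {
        TermCounts tc = TermCounts.of("Lucene indexes terms; terms have frequencies.");
        for (int i = 0; i < tc.terms.length; i++) {
            System.out.println(tc.terms[i] + " " + tc.freqs[i]);
        }
    }
}
```

Sorting the terms (via TreeMap) keeps the two arrays deterministic, so a term's count can also be found with a binary search over terms.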

Re: TermDocs

2010-05-14 Thread Grant Ingersoll
On May 12, 2010, at 7:42 PM, roy-lucene-u...@xemaps.com wrote: Hi guys, I've had this code for some time but am just now questioning if it works. I have a custom filter that I've been using from Lucene 1.4 through Lucene 2.2.0, and it essentially builds up a BitSet like so: for ( int x =

Re: Will doc ids ever change if nothing is deleted?

2010-05-14 Thread Chris Harris
Could you address your needs by assigning each document a unique identifier (maybe you have a natural key, or maybe you could generate a new GUID or something for each doc), and using those identifiers, rather than internal Lucene docids, to track documents between the search stage and the loading

Re: IndexWriter and memory usage

2010-05-14 Thread Michael McCandless
The patch looks correct. The 16 MB RAM buffer means the sum of the shared char[], byte[] and PostingList/RawPostingList memory will be kept under 16 MB. There are definitely other things that require memory beyond this -- e.g., during a segment merge, SegmentReaders are opened for each segment

Re: Will doc ids ever change if nothing is deleted?

2010-05-14 Thread Nigel
We do assign GUIDs to everything in the index for cases where longer-term identity is necessary. For this case, using GUIDs would be prohibitively expensive as we'd need to load the GUIDs for all search results. We might have tens of thousands of results and only want to load a random 100, so

Re: Will doc ids ever change if nothing is deleted?

2010-05-14 Thread Chris Lu
The doc id will change if the segments are merged. The doc id depends mostly on the order in which documents are added. Just think about it: doc ids run from 0 to N. When some documents are deleted, they are only marked as deleted in the .del file, so nothing changes there. When some

Re: Will doc ids ever change if nothing is deleted?

2010-05-14 Thread Nigel
Right, but my question was whether merging segments will renumber docs *if no documents are deleted*. Empirically, the answer is no. I've written test code that indexes documents with a field equal to each document's current id, and verified that the ids still match the field values even after
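The renumbering question in this thread can be illustrated with a toy model in plain Java. It is not Lucene code and makes strong assumptions: a doc id is simply a document's position when segments are read in order, deleted documents keep their slot until a merge, and a merge concatenates adjacent segments in order while dropping deleted documents. All names (DocIdMergeSketch, Doc, live, deleted, merge, docId) are hypothetical. Under those assumptions, ids survive a merge when nothing is deleted and shift down when a deleted slot is squeezed out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model (not Lucene code) of doc ids before and after a segment merge. */
final class DocIdMergeSketch {

    /** One document slot in a segment: an application-level key plus a deleted flag. */
    static final class Doc {
        final String key;
        final boolean deleted;

        Doc(String key, boolean deleted) {
            this.key = key;
            this.deleted = deleted;
        }
    }

    static Doc live(String key) {
        return new Doc(key, false);
    }

    static Doc deleted(String key) {
        return new Doc(key, true);
    }

    /** Merges segments in order into one segment, dropping deleted documents. */
    static List<Doc> merge(List<List<Doc>> segments) {
        List<Doc> merged = new ArrayList<>();
        for (List<Doc> segment : segments) {
            for (Doc doc : segment) {
                if (!doc.deleted) {
                    merged.add(doc);
                }
            }
        }
        return merged;
    }

    /** Returns the doc id of the live document with this key, or -1 if there is none. */
    static int docId(List<List<Doc>> segments, String key) {
        int id = 0;
        for (List<Doc> segment : segments) {
            for (Doc doc : segment) {
                if (!doc.deleted && doc.key.equals(key)) {
                    return id;
                }
                id++; // deleted documents still occupy an id until they are merged away
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<List<Doc>> noDeletes = Arrays.asList(
                Arrays.asList(live("a"), live("b")),
                Arrays.asList(live("c")));
        List<List<Doc>> withDelete = Arrays.asList(
                Arrays.asList(live("a"), deleted("b")),
                Arrays.asList(live("c")));

        System.out.println("no deletes:  " + docId(noDeletes, "c") + " -> "
                + docId(Collections.singletonList(merge(noDeletes)), "c"));
        System.out.println("with delete: " + docId(withDelete, "c") + " -> "
                + docId(Collections.singletonList(merge(withDelete)), "c"));
    }
}
```

The model says nothing about merge policies that combine non-adjacent segments, where the order assumption would no longer hold.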

Re: Access indexed terms

2010-05-14 Thread manjula wijewickrema
Dear Andrzej, Thanks for your valuable help. I also noticed this HighFreqTerms approach in the Lucene email archive and tried to use it. In order to do that I have downloaded lucene-misc-2.9.1.jar and added the org.apache.lucene.misc package to my project. Now I think I have to call this HighFreqTerms

How to call high freq. terms using HighFreqTerms class

2010-05-14 Thread manjula wijewickrema
Hi, I am struggling to use the HighFreqTerms class to find the high-frequency terms in my index. My target is to get the high-frequency terms in an indexed document (single document). To do that I have added the org.apache.lucene.misc package to my project. I think up to that point I am