I found the problem. I have a custom "query optimizer" that replaces certain TermQuery clauses within a BooleanQuery with a custom Query, and that query has its own Weight/Scorer which retrieves matching documents from an in-memory cache (not backed by Lucene). It looks like my custom HitCollectors are now wrapped in a HitCollectorWrapper, which assumes Collect() will be called separately for each segment - so it adds the segment's start offset to the doc IDs that come from my custom query implementation.

I looked at the new Collector class and it seems to work the same way (it expects SetNextReader() to be called for each segment with some doc-ID offset). How can I make my custom query work with the new API, so that my query effectively uses a single in-RAM "segment" while the other clauses in the same BooleanQuery still search across multiple Lucene segments? I realize that may not be clear, and I will try to provide more detail soon.
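In case it helps explain what I mean, here is a minimal sketch of the direction I'm considering (names like CachedDocsScorer and sortedGlobalDocs are just illustrative, not actual Lucene.Net API). Since the 2.9 IndexSearcher scores one sub-reader at a time and the Collector adds each segment's docBase back onto the IDs it receives, a scorer backed by an index-wide cache would have to emit doc IDs relative to the current segment:

using Lucene.Net.Index;
using Lucene.Net.Search;

// Sketch (hypothetical names): a scorer over an in-memory, index-wide
// cache of matching doc IDs. Because IndexSearcher in 2.9 scores one
// segment at a time and the collector adds each segment's docBase back,
// this scorer emits IDs *relative* to the current segment.
public class CachedDocsScorer : Scorer
{
    private readonly int[] sortedGlobalDocs; // cache hits, ascending order
    private readonly int docBase;            // segment's offset into the global doc-ID space
    private readonly int docEnd;             // docBase + subReader.MaxDoc()
    private int pos = -1;
    private int current = -1;

    public CachedDocsScorer(Similarity similarity, int[] sortedGlobalDocs,
                            int docBase, int docEnd)
        : base(similarity)
    {
        this.sortedGlobalDocs = sortedGlobalDocs;
        this.docBase = docBase;
        this.docEnd = docEnd;
    }

    public override int DocID()
    {
        return current;
    }

    public override int NextDoc()
    {
        while (++pos < sortedGlobalDocs.Length)
        {
            int global = sortedGlobalDocs[pos];
            if (global >= docEnd) break;      // past this segment's range
            if (global >= docBase)            // inside this segment
            {
                current = global - docBase;   // make the ID segment-local
                return current;
            }
        }
        return current = NO_MORE_DOCS;
    }

    public override int Advance(int target)
    {
        // Simple linear scan; NO_MORE_DOCS == int.MaxValue terminates the loop.
        int doc;
        do { doc = NextDoc(); } while (doc < target);
        return doc;
    }

    public override float Score()
    {
        return 1.0f; // constant score for cache hits
    }
}

The segment's docBase isn't handed to Weight.Scorer() directly, but as far as I can tell it can be recovered by walking the top-level reader's GetSequentialSubReaders() and summing MaxDoc() until you reach the sub-reader that was passed in.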
Thanks,
Bob

On Jun 9, 2011, at 1:48 PM, Digy wrote:

> Sorry, no idea. Maybe optimizing the index with 2.9.2 can help to detect
> the problem.
>
> DIGY
>
> -----Original Message-----
> From: Robert Stewart [mailto:[email protected]]
> Sent: Thursday, June 09, 2011 8:40 PM
> To: <[email protected]>
> Subject: Re: [Lucene.Net] index version compatibility (1.9 to 2.9.2)?
>
> I tried converting the index using IndexWriter as follows:
>
> Lucene.Net.Index.IndexWriter writer = new IndexWriter(TestIndexPath + "_2.9",
>     new Lucene.Net.Analysis.KeywordAnalyzer());
>
> writer.SetMaxBufferedDocs(2);
> writer.SetMaxMergeDocs(1000000);
> writer.SetMergeFactor(2);
>
> writer.AddIndexesNoOptimize(new Lucene.Net.Store.Directory[] {
>     new Lucene.Net.Store.SimpleFSDirectory(new DirectoryInfo(TestIndexPath)) });
>
> writer.Commit();
>
> That seems to work (I get what looks like a valid index directory at least).
>
> But when I run some tests using IndexSearcher, I still get the same problem
> (I get documents in Collect() which are larger than IndexReader.MaxDoc()).
> Any idea what the problem could be?
>
> BTW, this is a problem because I look up some fields (date ranges, etc.) in
> some custom collectors which filter out documents, and that code assumes I
> don't get any documents larger than maxDoc.
>
> Thanks,
> Bob
>
> On Jun 9, 2011, at 12:37 PM, Digy wrote:
>
>> One more point: some write operations using Lucene.Net 2.9.2 (add, delete,
>> optimize, etc.) automatically upgrade your index to 2.9.2.
>> But if your index is somehow corrupted (e.g., due to some bug in 1.9), this
>> may result in data loss.
>>
>> DIGY
>>
>> -----Original Message-----
>> From: Robert Stewart [mailto:[email protected]]
>> Sent: Thursday, June 09, 2011 7:06 PM
>> To: [email protected]
>> Subject: [Lucene.Net] index version compatibility (1.9 to 2.9.2)?
>>
>> I have a Lucene index created with Lucene.Net 1.9. It is a multi-segment
>> index (non-optimized). When I run Lucene.Net 2.9.2 on top of that index, I
>> get IndexOutOfRange exceptions in my collectors. It is giving me document
>> IDs that are larger than maxDoc.
>>
>> My index contains 377831 documents, and IndexReader.MaxDoc() is returning
>> 377831, but I get documents from Collect() with larger values (for instance
>> 379018). Is an index built with Lucene.Net 1.9 compatible with 2.9.2? If
>> not, is there some way I can convert it? (In production we have many indexes
>> containing about 200 million docs, so I'd rather convert existing indexes
>> than rebuild them.)
>>
>> Thanks,
>> Bob
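PS: For the archive, here is the conversion-and-optimize step Digy suggested, condensed into a self-contained snippet. The path is a placeholder for our test index, and this assumes the index was already converted with AddIndexesNoOptimize as above:

using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Index;
using Lucene.Net.Store;

class UpgradeIndex
{
    static void Main()
    {
        // Placeholder path; adjust to the converted index location.
        Lucene.Net.Store.Directory dir =
            new SimpleFSDirectory(new DirectoryInfo(@"C:\indexes\test_2.9"));

        // Opening an existing index with a 2.9.2 IndexWriter and optimizing
        // rewrites every segment in the 2.9 format, which should surface
        // any latent corruption carried over from 1.9.
        IndexWriter writer = new IndexWriter(dir, new KeywordAnalyzer(),
            IndexWriter.MaxFieldLength.UNLIMITED);
        writer.Optimize();   // merge everything down to one 2.9-format segment
        writer.Close();
    }
}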
