Re: TermsEnum.docFreq() returns 0

2013-05-14 Thread Ravikumar Govindarajan
Thanks for the help, Mike. I was quick to jump to a wrong conclusion. My codec does not implement Term-Vectors, Payloads, DocValues and Norms. It should be trivial to implement Payloads, but I am not sure about the others. Anyways, I can generate an HTML report and identify failures based on individual t

Re: TermsEnum.docFreq() returns 0

2013-05-14 Thread Michael McCandless
On Tue, May 14, 2013 at 3:03 AM, Ravikumar Govindarajan wrote:
> We ran CheckIndex and a simple test case; both pass. Actually, I had
> assumed a problem with Lucene, whereas it was an issue with our custom codec.
Phew, thanks for bringing closure!
> I do not know how to confirm whether a new

Re: TermsEnum.docFreq() returns 0

2013-05-14 Thread Ravikumar Govindarajan
We ran CheckIndex and a simple test case; both pass. Actually, I had assumed a problem with Lucene, whereas it was an issue with our custom codec. I do not know how to confirm whether a new codec works correctly. Are there any tools or existing test cases available for validation?
-- Ravi
On Mo

Re: TermsEnum.docFreq() returns 0

2013-05-13 Thread Michael McCandless
That code looks correct. But can you tie it all together into a runnable test case? I.e., add in the terms enum, calling docFreq and getting 0 when it should be 1. Also, if you run CheckIndex on the index produced by the code below, how many terms/freqs/positions does it report?
Mike McCandless
h

Re: TermsEnum.docFreq() returns 0

2013-05-13 Thread Ravikumar Govindarajan
Indexing code below. Looks very simple. Is this correct?
  IndexWriterConfig conf = new IndexWriterConfig(Version.LUCENE_42, new StandardAnalyzer(Version.LUCENE_42));
  conf.setOpenMode(OpenMode.CREATE_OR_APPEND);
  String indexPath = "";
  Directory dir=FSDi

Re: TermsEnum.docFreq() returns 0

2013-05-10 Thread Michael McCandless
It should not be 0, as long as TermsEnum.next() does not return null ... can you make a small test case? Thanks.
Mike McCandless
http://blog.mikemccandless.com
On Fri, May 10, 2013 at 8:26 AM, Ravikumar Govindarajan wrote:
> I have to add that the above code is wrong.
>
> It has to be
>
> wh

Re: TermsEnum.docFreq() returns 0

2013-05-10 Thread Ravikumar Govindarajan
I have to add that the above code is wrong. It has to be
  while((ref=tEnum.next())!=null) {
    ref = tEnum.term();
    tEnum.docFreq(); // Even here VAL=0
  }
Apologies for the mistake, but the problem remains
On F

TermsEnum.docFreq() returns 0

2013-05-10 Thread Ravikumar Govindarajan
We have the following code
  SegmentInfos segments = new SegmentInfos();
  segments.read(luceneDir);
  for(SegmentInfoPerCommit sipc: segments) {
    String name = sipc.info.name;
    SegmentReader reader = new SegmentReader(sipc, 1, new IOContext());
    Terms terms = reader.terms("content");
    TermsEnum tEnum = t