Weird discrepancy with term counts vs. terms (off by 1)

2009-08-02 Thread ohaya
Hi, I've noticed a kind of strange problem with term counts and actual terms. Some background: I wrote an app that creates an index, including a "path" field. I am now working on an app (code was in the previous thread) that, as part of what it does, needs to get a list of all of the "path"

Re: Weird discrepancy with term counts vs. terms (off by 1)

2009-08-02 Thread ohaya
Hi, BTW, my indexer app is basically the same as the demo IndexFiles.java. Here's part of the main:

    try {
      IndexWriter writer = new IndexWriter(INDEX_DIR, new StandardAnalyzer(), true, IndexWriter.MaxFieldLength.LIMITED);
      System.out.println("Indexing to directory '" + INDEX_DIR +

Re: Weird discrepancy with term counts vs. terms (off by 1)

2009-08-02 Thread Phil Whelan
Hi Jim, On Sun, Aug 2, 2009 at 1:32 AM, wrote: > I first noticed the problem that I'm seeing while working on this latter app. > Basically, what I noticed was that while I was adding 13 documents to the > index, when I listed the "path" terms, there were only 12 of them. Field text (the whole

Re: Weird discrepancy with term counts vs. terms (off by 1)

2009-08-02 Thread Phil Whelan
Hi Jim, On Sun, Aug 2, 2009 at 9:08 AM, Phil Whelan wrote: > >> So then, I reviewed the index using Luke, and what I saw with that was that >> there were indeed only 12 "path" terms (under "Term Count" on the left), >> but, when I clicked the "Show Top Terms" in Luke, there were 13 terms listed

Re: Weird discrepancy with term counts vs. terms (off by 1)

2009-08-02 Thread Andrzej Bialecki
Phil Whelan wrote: Hi Jim, On Sun, Aug 2, 2009 at 9:08 AM, Phil Whelan wrote: So then, I reviewed the index using Luke, and what I saw with that was that there were indeed only 12 "path" terms (under "Term Count" on the left), but, when I clicked the "Show Top Terms" in Luke, there were 13 te

Re: Weird discrepancy with term counts vs. terms (off by 1)

2009-08-02 Thread Phil Whelan
On Sun, Aug 2, 2009 at 10:58 AM, Andrzej Bialecki wrote: > Thank you Phil for spotting this bug - this fix will be included in the next > release of Luke. Glad to help. Thanks for building this great tool! Phil

Re: Weird discrepancy with term counts vs. terms (off by 1)

2009-08-02 Thread ohaya
Hi Phil, As for the problem with my app, it wasn't what you suggested (about the tokens, etc.). For some later things, my indexer creates both a "path" field that is analyzed (and thus tokenized, etc.) and another field, "fullpath", which is not analyzed (and thus not tokenized). The problem with my

Re: Weird discrepancy with term counts vs. terms (off by 1)

2009-08-02 Thread Phil Whelan
Hi Jim, On Sun, Aug 2, 2009 at 12:12 PM, wrote: > i.e., I was ignoring the 1st term in the TermEnum (since the .next() bumps > the TermEnum to the 2nd term, initially). Great! Glad you found the problem. I couldn't see it. Phil
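
For readers who land on this thread later, the off-by-one described above can be sketched without Lucene on the classpath. The class below is a hypothetical stand-in, not Lucene's TermEnum: it only assumes, as the message says, that the enumeration starts out already positioned on its first term, so calling next() before reading term() silently drops that term. The names PrePositionedEnum, collectSkippingFirst and collectAll are made up for illustration, and the file names are sample data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TermEnumOffByOne {

    /**
     * Hypothetical stand-in for an enumeration that is ALREADY positioned
     * on its first term when handed to the caller (assumption taken from
     * the thread; this is not the real Lucene class).
     */
    static final class PrePositionedEnum {
        private final List<String> terms;
        private int pos = 0; // starts on the first term, not before it

        PrePositionedEnum(List<String> terms) {
            this.terms = terms;
        }

        /** Returns the current term, or null once the enumeration is exhausted. */
        String term() {
            return pos < terms.size() ? terms.get(pos) : null;
        }

        /** Advances to the next term; returns false if there is none. */
        boolean next() {
            pos++;
            return pos < terms.size();
        }
    }

    /** Buggy pattern: next() runs before the first term() call, so term #1 is skipped. */
    static List<String> collectSkippingFirst(PrePositionedEnum e) {
        List<String> out = new ArrayList<>();
        while (e.next()) {
            out.add(e.term());
        }
        return out;
    }

    /** Fixed pattern: read the current term first, then advance. */
    static List<String> collectAll(PrePositionedEnum e) {
        List<String> out = new ArrayList<>();
        if (e.term() == null) {
            return out; // empty enumeration
        }
        do {
            out.add(e.term());
        } while (e.next());
        return out;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList("a.txt", "b.txt", "c.txt");
        System.out.println(collectSkippingFirst(new PrePositionedEnum(paths)).size()); // prints 2
        System.out.println(collectAll(new PrePositionedEnum(paths)).size());           // prints 3
    }
}
```

With a real index the same do/while shape applies, but the loop would normally also stop once the enumerated term no longer belongs to the field of interest; that check is left out here to keep the sketch small.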