Hi Jim,

On Sun, Aug 2, 2009 at 9:08 AM, Phil Whelan<phil...@gmail.com> wrote:
>
>> So then, I reviewed the index using Luke, and what I saw with that was that
>> there were indeed only 12 "path" terms (under "Term Count" on the left),
>> but, when I clicked the "Show Top Terms" in Luke, there were 13 terms listed
>> by Luke.
>
> Yes, I just checked this and this seems to be a bug with Luke. It
> always shows 1 less in "Term Count" than it should. Well spotted.

I was able to see why this was happening in the Luke source, and I've
submitted the following patch to Andrzej, the author of Luke.

Thanks,
Phil

--- luke.orig/src/org/getopt/luke/Luke.java	2009-03-19 22:41:34.000000000 -0700
+++ luke-src-0.9.2/src/org/getopt/luke/Luke.java	2009-08-02 09:33:24.000000000 -0700
@@ -813,23 +813,18 @@
     setString(iFields, "text", String.valueOf(idxFields.length));
     Object iTerms = find(pOver, "iTerms");
     termCounts.clear();
-    FieldTermCount ftc = new FieldTermCount();
+    FieldTermCount ftc = null;
     TermEnum te = ir.terms();
     numTerms = 0;
     while (te.next()) {
       Term currTerm = te.term();
-      if (ftc.fieldname == null) {
+      if (ftc == null || ftc.fieldname == null || ftc.fieldname != currTerm.field()) {
         // initialize
-        ftc.fieldname = currTerm.field();
-        termCounts.put(ftc.fieldname, ftc);
-      }
-      if (ftc.fieldname == currTerm.field()) {
-        ftc.termCount++;
-      } else {
         ftc = new FieldTermCount();
         ftc.fieldname = currTerm.field();
         termCounts.put(ftc.fieldname, ftc);
       }
+      ftc.termCount++;
       numTerms++;
     }
     te.close();
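
P.S. In case it helps anyone following the thread, here is a minimal standalone sketch of the counting logic before and after the patch. It does not use Lucene at all: the class name, the method names, and the list of field names standing in for the TermEnum are all made up for illustration, and it assumes the terms arrive grouped by field. It also compares field names with equals(), whereas the patch uses != on the strings, which I believe only works because Lucene interns field names.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Luke's per-field term counting; no Lucene needed.
class FieldTermCountSketch {

    // Original logic: when the field changes, the else branch starts a new
    // counter but never counts the term that triggered the change.
    static Map<String, Integer> countOriginal(List<String> fields) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        String current = null;
        for (String field : fields) {
            if (current == null) {
                current = field;
                counts.put(current, 0);
            }
            if (current.equals(field)) {
                counts.put(current, counts.get(current) + 1);
            } else {
                current = field;
                counts.put(current, 0); // first term of the new field is lost
            }
        }
        return counts;
    }

    // Patched logic: start a new counter on the first term or on a field
    // change, then always count the current term.
    static Map<String, Integer> countPatched(List<String> fields) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        String current = null;
        for (String field : fields) {
            if (current == null || !current.equals(field)) {
                current = field;
                counts.put(current, 0);
            }
            counts.put(current, counts.get(current) + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Field name of each term, in enumeration order (made-up data).
        List<String> fields =
            Arrays.asList("contents", "contents", "path", "path", "path");
        System.out.println("original: " + countOriginal(fields));
        System.out.println("patched:  " + countPatched(fields));
    }
}
```

With the made-up data above, the original logic reports 2 "path" terms and the patched logic reports 3. The first field in the enumeration is counted correctly either way, so only the second and later fields come up one short.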