I think the FLAG_NONE ("I don't need/want freqs when reading the index") and the DOCS_ONLY ("Do not index freqs") are two different cases?
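To make the distinction concrete, here is a rough, self-contained sketch. Everything in it (the Postings interface, the two implementations, the helper methods) is made up for illustration and is not the real DocsEnum or codec API. It only models the two situations: a field indexed without freqs, where freq() reports 1 so that freq-consuming code needs no special case, and a FLAG_NONE-style caller that only wants doc IDs and never calls freq() at all.

```java
import java.util.ArrayList;
import java.util.List;

public class FreqContractSketch {

    /** Hypothetical stand-in for a postings enumerator; NOT the real DocsEnum. */
    interface Postings {
        int NO_MORE_DOCS = Integer.MAX_VALUE;
        int nextDoc();
        int freq();
    }

    /** Field indexed without freqs (the DOCS_ONLY case): freq() "lies" and says 1. */
    static final class DocsOnlyPostings implements Postings {
        private final int[] docs;
        private int pos = -1;
        DocsOnlyPostings(int... docs) { this.docs = docs; }
        public int nextDoc() { return ++pos < docs.length ? docs[pos] : NO_MORE_DOCS; }
        public int freq() { return 1; }
    }

    /** Field indexed with freqs: freq() returns the stored value. */
    static final class DocsAndFreqsPostings implements Postings {
        private final int[] docs;
        private final int[] freqs;
        private int pos = -1;
        DocsAndFreqsPostings(int[] docs, int[] freqs) { this.docs = docs; this.freqs = freqs; }
        public int nextDoc() { return ++pos < docs.length ? docs[pos] : NO_MORE_DOCS; }
        public int freq() { return freqs[pos]; }
    }

    /** Freq-using consumer: one loop serves both kinds of field, no special casing. */
    static int totalTermFreq(Postings p) {
        int total = 0;
        while (p.nextDoc() != Postings.NO_MORE_DOCS) {
            total += p.freq();
        }
        return total;
    }

    /** FLAG_NONE-style consumer: collects doc IDs and never calls freq(). */
    static List<Integer> matchingDocs(Postings p) {
        List<Integer> out = new ArrayList<>();
        for (int d = p.nextDoc(); d != Postings.NO_MORE_DOCS; d = p.nextDoc()) {
            out.add(d);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] docs = {2, 5, 9};
        int[] freqs = {4, 1, 2};
        System.out.println(totalTermFreq(new DocsOnlyPostings(docs)));            // prints 3
        System.out.println(totalTermFreq(new DocsAndFreqsPostings(docs, freqs))); // prints 7
        System.out.println(matchingDocs(new DocsAndFreqsPostings(docs, freqs)));  // prints [2, 5, 9]
    }
}
```

Note that matchingDocs gives the same answer whatever freq() would have returned, which is why I doubt the second case needs any guarantee about its value.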
I think for DOCS_ONLY it makes sense that we lie (say freq=1 when we
don't know): lots of places would otherwise have to be special-cased
depending on whether they consume DOCS_ONLY or DOCS_AND_POSITIONS.

But for FLAG_NONE, when the caller passes this it means they have no
intention of using/calling freq(), right? E.g. MultiTermQueryWrapperFilter
would pass this.

For that case I'm not sure we should promise / require that codecs
always return 1? E.g. what if the index does have freqs? I think in that
case the codec shouldn't be required to go out of its way and return 1?

I'm also not sure that all codecs return 1 today if the field was
indexed with DOCS_ONLY ...

Mike McCandless

http://blog.mikemccandless.com

On Mon, Dec 17, 2012 at 11:24 AM, Shai Erera <ser...@gmail.com> wrote:
> Hi
>
> While migrating code to Lucene 4.0, I noticed that I have an assert on a
> field that is indexed with DOCS_ONLY that DocsEnum.freq() == 1. This got me
> thinking ... why?
>
> If you index w/ DOCS_ONLY, or ask for DocsEnum with FLAG_NONE, why do we
> "lie" to the consumer? Rather, we could just return 0 or -1?
>
> I personally don't mind if we continue to return 1, if there's a real reason
> to. I don't think that anyone should call freq() if he asked for DocsEnum
> with FLAG_NONE. But if we do keep the current behavior, can we at least
> document it?
>
> E.g., something like this patch:
>
> Index: lucene/core/src/java/org/apache/lucene/index/DocsEnum.java
> ===================================================================
> --- lucene/core/src/java/org/apache/lucene/index/DocsEnum.java (revision 1422804)
> +++ lucene/core/src/java/org/apache/lucene/index/DocsEnum.java (working copy)
> @@ -47,10 +47,16 @@
>    protected DocsEnum() {
>    }
>
> -  /** Returns term frequency in the current document.  Do
> -   *  not call this before {@link #nextDoc} is first called,
> -   *  nor after {@link #nextDoc} returns NO_MORE_DOCS.
> -   **/
> +  /**
> +   * Returns term frequency in the current document, or 1 if the
> +   * {@link DocsEnum} was obtained with {@link #FLAG_NONE}. Do not call this
> +   * before {@link #nextDoc} is first called, nor after {@link #nextDoc} returns
> +   * {@link DocIdSetIterator#NO_MORE_DOCS}.
> +   *
> +   * <p>
> +   * <b>NOTE:</b> if the {@link DocsEnum} was obtain with {@link #FLAG_NONE},
> +   * this method returns 1.
> +   */
>    public abstract int freq() throws IOException;
>
>    /** Returns the related attributes. */
>
> Shai