Hi
While migrating code to Lucene 4.0, I noticed that I have an assert, on a
field that is indexed with DOCS_ONLY, checking that DocsEnum.freq() == 1.
This got me thinking ... why?
If you index with DOCS_ONLY, or ask for a DocsEnum with FLAG_NONE, why do we
"lie" to the consumer? Couldn't we just return 0 or -1 instead?
I personally don't mind if we continue to return 1, if there's a real
reason to. I don't think anyone should call freq() after asking for a
DocsEnum with FLAG_NONE. But if we do keep the current behavior, can we at
least document it?
E.g., something like this patch:
Index: lucene/core/src/java/org/apache/lucene/index/DocsEnum.java
===================================================================
--- lucene/core/src/java/org/apache/lucene/index/DocsEnum.java (revision 1422804)
+++ lucene/core/src/java/org/apache/lucene/index/DocsEnum.java (working copy)
@@ -47,10 +47,16 @@
protected DocsEnum() {
}
- /** Returns term frequency in the current document. Do
- * not call this before {@link #nextDoc} is first called,
- * nor after {@link #nextDoc} returns NO_MORE_DOCS.
- **/
+ /**
+ * Returns term frequency in the current document, or 1 if the
+ * {@link DocsEnum} was obtained with {@link #FLAG_NONE}. Do not call this
+ * before {@link #nextDoc} is first called, nor after {@link #nextDoc} returns
+ * {@link DocIdSetIterator#NO_MORE_DOCS}.
+ *
+ * <p>
+ * <b>NOTE:</b> if the {@link DocsEnum} was obtained with {@link #FLAG_NONE},
+ * this method returns 1.
+ */
public abstract int freq() throws IOException;
/** Returns the related attributes. */
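
To make the concern concrete, here is a toy sketch in plain Java. It is not
Lucene code: FreqSketch, ToyDocsEnum and sumFreqs are made-up names, and a
null freqs array merely stands in for DOCS_ONLY / FLAG_NONE. It only mimics
the "return 1" behavior described above, to show that a consumer summing
freq() silently gets the document count rather than the total term frequency.

```java
/** Toy model only; all names are hypothetical and none of this is Lucene API. */
public class FreqSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Minimal postings iterator over parallel docs/freqs arrays. */
    static final class ToyDocsEnum {
        private final int[] docs;
        private final int[] freqs; // null models "frequencies not available"
        private int pos = -1;

        ToyDocsEnum(int[] docs, int[] freqs) {
            this.docs = docs;
            this.freqs = freqs;
        }

        /** Advances to the next doc, or returns NO_MORE_DOCS when exhausted. */
        int nextDoc() {
            pos++;
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }

        /** Mimics the behavior under discussion: 1 when freqs are unavailable. */
        int freq() {
            return freqs == null ? 1 : freqs[pos];
        }
    }

    /** Sums freq() over all docs, as a naive consumer might. */
    static long sumFreqs(ToyDocsEnum e) {
        long total = 0;
        while (e.nextDoc() != NO_MORE_DOCS) {
            total += e.freq();
        }
        return total;
    }

    public static void main(String[] args) {
        int[] docs = {0, 3, 7};
        int[] freqs = {2, 5, 1};
        System.out.println(sumFreqs(new ToyDocsEnum(docs, freqs))); // prints 8
        System.out.println(sumFreqs(new ToyDocsEnum(docs, null)));  // prints 3
    }
}
```

With freqs the sum is 8; without them it is 3, i.e. just the number of
matching docs, and nothing tells the caller the value is not a real
frequency. A 0 or -1 would presumably make that mistake easier to notice.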
Shai