Sorry to jump in on a side note of this thread, but it touches on something I need right now.

Side Note: It's my opinion that "type ahead" or "auto complete" style
functionality is best addressed by customized logic (most likely using
specially built fields containing all of the prefixes of the key words up
to N characters as separate tokens).

Do you mean something like the following?
<field name="autocomplete">w wo wor word</field>
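
If that reading is right, the field value could be produced along these lines. This is only a minimal sketch in plain Java: the class name `PrefixTokens`, the method `prefixes` and the `maxLen` parameter (the "N characters" limit) are hypothetical names chosen for illustration, not part of Lucene or Solr.

```java
import java.util.ArrayList;
import java.util.List;

class PrefixTokens {
    /** Returns the prefixes of a keyword, from 1 character up to maxLen characters. */
    static List<String> prefixes(String word, int maxLen) {
        List<String> out = new ArrayList<>();
        int limit = Math.min(word.length(), maxLen);
        for (int i = 1; i <= limit; i++) {
            out.add(word.substring(0, i));
        }
        return out;
    }

    public static void main(String[] args) {
        // Joined with spaces, this gives the content of the "autocomplete" field above.
        System.out.println(String.join(" ", prefixes("word", 10))); // prints "w wo wor word"
    }
}
```

With such a field, a suggestion lookup becomes an exact match on a single token rather than a prefix expansion at query time.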

simple uses of PrefixQueries are
only going to get you so far, particularly under heavy load or in an index
with a large number of unique terms.

For a bibliographic application built on Lucene, I implemented suggestions on several fields (especially "subject" terms such as topic or place) to populate a form with values already in use. I used the Lucene IndexReader to get, very quickly, a list of terms in sorted order with no duplicate values.

<http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/org/apache/lucene/index/IndexReader.html#terms(org.apache.lucene.index.Term)>
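
The idea can be sketched without Lucene at all. In the example below a `java.util.TreeSet` stands in for the sorted, duplicate-free term dictionary of one field, and `tailSet(prefix)` plays the role of seeking the enumeration to the first term at or after the prefix. The class `TermSuggest`, the method `suggest` and the sample values are assumptions made for illustration; this is not the Lucene API itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class TermSuggest {
    /** Returns up to max terms starting with prefix, in the set's sorted order. */
    static List<String> suggest(TreeSet<String> terms, String prefix, int max) {
        List<String> out = new ArrayList<>();
        // Start at the first term >= prefix; matching terms are contiguous from there.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix) || out.size() >= max) {
                break; // past the last matching term, or enough suggestions collected
            }
            out.add(term);
        }
        return out;
    }

    public static void main(String[] args) {
        // Duplicates ("Paris" twice) collapse, as they do in a term dictionary.
        TreeSet<String> subjects =
            new TreeSet<>(Arrays.asList("Paris", "Parma", "Pau", "Rome", "Paris"));
        System.out.println(suggest(subjects, "Par", 10)); // prints "[Paris, Parma]"
    }
}
```

Because the terms are already sorted, the lookup stops as soon as a term no longer starts with the prefix, which is what makes this kind of enumeration fast.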

This approach has a serious drawback: "The enumeration is ordered by Term.compareTo()", so the sort order is the native ASCII one, with uppercase before lowercase. I had to patch Lucene's Term.compareTo() for this project, which is definitely not good practice for index portability. A duplicate field, with an analyser producing a sortable ASCII version of each term, would be better.
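
To make the ordering problem and the duplicate-field idea concrete, here is a small sketch in plain Java. The `sortKey` method is a hypothetical stand-in for what such an analyser might emit: it decomposes accented characters, drops the combining marks and lowercases the result. The exact normalisation rules are an assumption; a real project may need different ones.

```java
import java.text.Normalizer;
import java.util.Locale;

class SortKey {
    /** Builds a lowercase, accent-free key suitable for plain code-point ordering. */
    static String sortKey(String term) {
        // NFD splits an accented letter into base letter + combining mark.
        String decomposed = Normalizer.normalize(term, Normalizer.Form.NFD);
        // Remove the combining marks, then fold case.
        return decomposed.replaceAll("\\p{M}", "").toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Native order: uppercase 'Z' sorts before lowercase 'a'.
        System.out.println("Zurich".compareTo("amiens") < 0);                   // prints "true"
        // With sort keys, "amiens" comes before "zurich" as a reader expects.
        System.out.println(sortKey("Zurich").compareTo(sortKey("amiens")) > 0); // prints "true"
        System.out.println(sortKey("\u00C9cole"));                              // prints "ecole"
    }
}
```

The original value would still be stored for display; only the duplicate field holding the key would be enumerated for ordering, so no patch to Term.compareTo() is needed.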

Opinions of the list on this topic would be welcome.

--
Frédéric Glorieux
École nationale des chartes
direction des nouvelles technologies et de l'informatique
