Re: Wildcards / Binary searches

Frédéric Glorieux Thu, 07 Jun 2007 07:32:24 -0700

Sorry to jump in on a "side note" of the thread, but it touches on something I need at the moment.

Side Note: It's my opinion that "type ahead" or "auto complete" style
functionality is best addressed by customized logic (most likely using
specially built fields containing all of the prefixes of the key words up
to N characters as separate tokens).


Do you mean something like the following?
<field name="autocomplete">w wo wor word</field>
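To make the question concrete, here is a minimal sketch of how such a field value could be generated before indexing. The class and method names (PrefixTokens, prefixes, fieldValue) are hypothetical and not part of Lucene; the sketch only assumes the field is later split on whitespace.

```java
import java.util.ArrayList;
import java.util.List;

class PrefixTokens {
    // Returns every prefix of `word`, from length 1 up to maxLen
    // (or up to the whole word if it is shorter than maxLen).
    static List<String> prefixes(String word, int maxLen) {
        List<String> out = new ArrayList<>();
        int limit = Math.min(word.length(), maxLen);
        for (int i = 1; i <= limit; i++) {
            out.add(word.substring(0, i));
        }
        return out;
    }

    // Joins the prefixes with spaces so that a whitespace-splitting
    // analyzer would index each prefix as a separate token.
    static String fieldValue(String word, int maxLen) {
        return String.join(" ", prefixes(word, maxLen));
    }

    public static void main(String[] args) {
        System.out.println(fieldValue("word", 10)); // prints: w wo wor word
    }
}
```

With such a field, a suggestion lookup becomes an exact term match on the typed prefix, at the cost of a larger index.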

simple uses of PrefixQueries are
only going to get you so far, particularly under heavy load or in an index
with a large number of unique terms.

For a bibliographic application built on Lucene, I implemented suggestions on several fields (especially "subject" terms, such as topic or place) to populate a form with values already in use. I used the Lucene IndexReader to get a list of terms very quickly, in sorted order and without duplicate values.


<http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/org/apache/lucene/index/IndexReader.html#terms(org.apache.lucene.index.Term)>
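The idea behind this term enumeration can be sketched without Lucene. In the sketch below a java.util.TreeSet stands in for the index's term dictionary (sorted, no duplicates); it is not the Lucene API, and the names TermSuggest, add and suggest are hypothetical. The lookup seeks to the first term at or after the typed prefix and walks forward while the prefix still matches, which is roughly what one would do with the enumeration returned by IndexReader.terms(Term).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class TermSuggest {
    // Stand-in for the term dictionary of one field: sorted and duplicate-free.
    private final TreeSet<String> terms = new TreeSet<>();

    void add(String term) {
        terms.add(term);
    }

    // Seeks to the first term >= prefix, then collects terms in order
    // until one no longer starts with the prefix or `max` is reached.
    List<String> suggest(String prefix, int max) {
        List<String> out = new ArrayList<>();
        for (String t : terms.tailSet(prefix, true)) {
            if (!t.startsWith(prefix) || out.size() >= max) {
                break;
            }
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        TermSuggest subjects = new TermSuggest();
        for (String s : new String[] {"Paris", "Parma", "Prague", "Paris", "London"}) {
            subjects.add(s);
        }
        System.out.println(subjects.suggest("Par", 10)); // prints: [Paris, Parma]
    }
}
```

Because matching terms are contiguous in the sorted order, the walk can stop at the first term that does not match.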

This approach has a serious drawback: "The enumeration is ordered by Term.compareTo()", so the sort order is natively ASCII, with uppercase before lowercase. I had to patch Lucene's Term.compareTo() for this project, which is definitely not good practice for the portability of indexes. A duplicate field with an analyzer that produces a sortable ASCII version would be better.
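As an illustration of the ordering problem and of the "sortable version" idea, here is a small sketch in plain Java. It is not a Lucene analyzer; the class SortKey and its methods are hypothetical, and the normalization chosen (strip accents, lowercase) is only one assumption about what a sortable form could look like.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

class SortKey {
    // Decomposes accented characters, drops the combining marks,
    // and lowercases the result: "\u00C9cole" becomes "ecole".
    static String sortKey(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "").toLowerCase(Locale.ROOT);
    }

    // Returns a copy of `values` ordered by their sort keys
    // instead of by raw character codes.
    static List<String> sortedByKey(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        copy.sort(Comparator.comparing(SortKey::sortKey));
        return copy;
    }

    public static void main(String[] args) {
        // Raw comparison: 'Z' (90) sorts before 'a' (97).
        System.out.println("Zurich".compareTo("apple") < 0); // prints: true
        // \u00C9 is an uppercase E with an acute accent.
        System.out.println(sortKey("\u00C9cole")); // prints: ecole
    }
}
```

Indexing the sort key in a second field would leave Term.compareTo() untouched, so the index stays readable by an unpatched Lucene, while the original value can still be stored for display.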


Opinions of the list on this topic would be welcome.

--
Frédéric Glorieux
École nationale des chartes
direction des nouvelles technologies et de l'informatique
