Hi Chris,
The expertise on this list is really stimulating. Sadly, I will
probably not be able to contribute. Solr may not be the chosen
technology for the project I'm working on, because of server
administration issues (Java). I know there is no performance
argument against it (Lucene is incredible, and Solr stays nicely close
to it), but that's real life. So I will not find time for the idea below.
> : project, definitively not a good practice for portability of indexes. A
> : duplicate field with an analyser to produce a sortable ASCII version
> : would be better.
>
> exactly ... I think conceptually the methodology for solving the problem
> is very similar to the way the SpellChecker contrib works: use a very
> custom index designed for the application (not just look at the terms in
> the main corpus) and custom logic for using that index.
Could it be a useful request handler? Given a field with a
displayable stored value and a sortable indexed one, you need the
analyser to parse the user entry, build a term from it, and very
quickly get a pointer into the internal Lucene index at exactly the
best place for w, wo, wor or word. From the iterator you can display a
suggestion list; it is also possible to get one or more of the docs
directly attached, for example to display a count. It seems interesting
for things like the topic or the author of a doc?
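To make the idea concrete, here is a minimal sketch in plain Python. It is not Lucene or Solr code: a sorted list stands in for the term index, a binary search stands in for the seek to the "best place", and the names sort_key, build_index and suggest are hypothetical. The ASCII folding is only an assumption about what the sortable analyser might do.

```python
import bisect
import unicodedata


def sort_key(text):
    """Fold text to a lowercase, accent-free key (assumed analyser behaviour)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


def build_index(entries):
    """entries: iterable of (display_value, doc_count).

    Returns a list of (key, display_value, doc_count) sorted by key,
    standing in for the sorted term dictionary of a real index.
    """
    return sorted((sort_key(display), display, count) for display, count in entries)


def suggest(index, user_input, limit=10):
    """Seek to the first key >= the folded input, then walk forward
    while the keys still start with that prefix."""
    prefix = sort_key(user_input)
    # (prefix,) sorts before any (prefix, display, count) tuple,
    # so bisect_left lands on the first candidate term.
    position = bisect.bisect_left(index, (prefix,))
    results = []
    for key, display, count in index[position:]:
        if not key.startswith(prefix) or len(results) >= limit:
            break
        results.append((display, count))
    return results


if __name__ == "__main__":
    idx = build_index([("Word", 3), ("World", 5), ("Wolf", 1), ("Éclair", 2)])
    for display, count in suggest(idx, "wo"):
        print(display, count)
```

The displayable value and the count travel with each sortable key, which is the role the stored field and the attached docs would play in a real index.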
: Do you mean something like below ?
: <field name="autocomplete">w wo wor word</field>
yeah, but there are some Tokenizers that make this trivial
(EdgeNGramTokenizer i think is the name)
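For illustration, the "w wo wor word" expansion can be sketched as follows. This is plain Python showing what an edge n-gram tokenizer emits, not the Lucene tokenizer's API; the function name edge_ngrams and the parameters min_gram and max_gram are hypothetical.

```python
def edge_ngrams(token, min_gram=1, max_gram=None):
    """Return the leading-edge prefixes of token, shortest first.

    min_gram and max_gram bound the prefix length; by default every
    prefix up to the whole token is produced.
    """
    longest = len(token) if max_gram is None else min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


if __name__ == "__main__":
    # Builds the content of the autocomplete field for one token.
    print(" ".join(edge_ngrams("word")))
```

Indexing these prefixes as terms turns a prefix lookup into an exact term match, at the cost of a larger index.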
--
Frédéric Glorieux
École nationale des chartes
direction des nouvelles technologies et de l'informatique