Hi Steve, the list of terms returned is in fact meant for user consumption.
Each term can be shown as a link that runs a search on that term and
gives access to the matching documents.
Annales
cafe
Cafè
zucche
Thanks
Federica
Steven A Rowe wrote:
On 4/7/2009 at 1:19 PM, Michael McCandless wrote:
I think the new contrib/collation package may address this use case?
It converts each term to its CollationKey, outside of Lucene.
Since AFAIK CollationKey creation is a one-way process, CollationKeyFilter may not be useful for Federica.
Federica, what use do you make of the terms returned by reader.terms()? I ask because the new CollationKeyFilter would produce terms that would not be suitable for human consumption, but might be useful for other purposes.
Steve
On Tue, Apr 7, 2009 at 7:36 AM, Federica Falini Data Management S.p.A
<[email protected]> wrote:
Good morning,
In Lucene 2.2 I modified Term.java and TermBuffer.java
(see below) so that term enumerations are sorted case-insensitively
(when a field is not tokenized):
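TermEnum terms = reader.terms(new Term("myFieldNotTokenized", ""));
try {
    // terms.term() is null once the enumeration is exhausted, so check it first
    while (terms.term() != null && "myFieldNotTokenized".equals(terms.term().field())) {
        System.out.println(" " + terms.term());
        if (!terms.next()) break;
    }
} finally {
    terms.close(); // release the enumeration
}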
For example, instead of getting this order from TermEnum:
Annales
Cafè
Zucche
cafe
I need to get this:
Annales
cafe
Cafè
Zucche
Now in Lucene 2.4 I am finding this difficult because the "index" package
has changed a lot; could I have some pointers on how to keep my sort order?
Thanks in advance
Federica
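One possible way to get the ordering described above without patching Lucene
is to collect the enumerated terms and re-sort them at display time with
java.text.Collator. The sketch below is only an illustration under that
assumption: the class TermDisplaySort and the method sortForDisplay are
hypothetical names, not part of Lucene, and the term strings are the sample
values from this message rather than the output of a real index. It also
assumes the terms of the field fit in memory.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not a Lucene class): re-sorts terms for display only.
public class TermDisplaySort {

    /** Returns a sorted copy of the terms; the input list is left unchanged. */
    public static List<String> sortForDisplay(List<String> terms, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        // TERTIARY strength: base letters decide first, then accents,
        // and case is only used as the last tie-breaker.
        collator.setStrength(Collator.TERTIARY);
        List<String> sorted = new ArrayList<String>(terms);
        Collections.sort(sorted, collator);
        return sorted;
    }

    public static void main(String[] args) {
        // Sample terms in the order the TermEnum returns them in the message above.
        List<String> indexOrder = Arrays.asList("Annales", "Cafè", "Zucche", "cafe");
        for (String term : sortForDisplay(indexOrder, Locale.ITALIAN)) {
            System.out.println(" " + term);
        }
    }
}
```

With the JDK's default Italian rules this should place "cafe" before "Cafè"
and both between "Annales" and "Zucche". The index itself stays in its
native order, so the terms remain readable and usable for term searches.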