Hi,

Is there any implementation of a Chinese collator in Lucene? I've seen that there is a Chinese analyzer which uses Hidden Markov Models, but sorting seems to be a separate issue, and all my googling hasn't led to any results yet.

I understand that this is not a trivial issue, and I've read that Chinese users tend to prefer orderings other than by name, because the sort orders are so complicated that nobody wants to use them. But we will have to sort search results by name, even when the name is Chinese (Simplified Chinese at the moment, but Traditional may also appear later). Currently, Chinese words seem to be ordered by their Unicode code point, which does not seem to be the right order.

Thanks in advance for any suggestion,
 Nils
