> Tatsuo Ishii <[EMAIL PROTECTED]> wrote:
>
> > For me the idea that a text-search configuration maps to a
> > locale/language seems to be totally wrong. IMO an encoding/charset
> > could include several languages and a text-search configuration should
> > be mapped to an encoding/charset, rather than a language.
>
> I think mapping by encoding/charset *is* totally wrong and by locale is
> reasonable. How do you treat LATIN1? It can be used in French and German,
> etc. Moreover, UTF-8 can be used in almost all languages.
>
> The tight mapping of EUC_jp <=> Japanese is a special case in the world.
What? I didn't say that an encoding/charset maps to a single language. Actually, EUC_JP includes Japanese, English (ASCII), Greek, Cyrillic and so on. So for full text search to process EUC_JP text properly, it must be able to handle multiple languages at a time.

You know that PostgreSQL allows only one locale per cluster, and the fact that text search depends on that locale prevents it from processing multilingual text. The only solution I can think of today is to create a new parser that can process EUC_JP properly (I mean one that handles not only Japanese but also English) and use it on a C locale/EUC_JP cluster. I would do this for 8.4 if I have time.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
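[Editor's note: the kind of custom parser described above could, once its C support functions exist, be wired into the text-search machinery roughly as sketched below. This is a hypothetical example; the parser name and the `eucjp_*` function names are illustrative, not an actual module.]

```sql
-- Hypothetical: register a parser whose C support functions tokenize
-- EUC_JP text, splitting Japanese and ASCII/English runs correctly.
CREATE TEXT SEARCH PARSER eucjp_parser (
    START    = eucjp_start,     -- assumed C functions, not shipped with PostgreSQL
    GETTOKEN = eucjp_gettoken,
    END      = eucjp_end,
    LEXTYPES = eucjp_lextypes
);

-- A configuration built on that parser, usable in a C locale / EUC_JP cluster.
CREATE TEXT SEARCH CONFIGURATION eucjp (PARSER = eucjp_parser);

-- ASCII words could still get English stemming, while Japanese tokens
-- would pass through a simple dictionary (mapping names depend on the
-- token types the parser actually reports).
ALTER TEXT SEARCH CONFIGURATION eucjp
    ADD MAPPING FOR asciiword WITH english_stem;
```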