Martin Povolny writes:
>
> I think that the latin-2 support should be made from the root.
>
> I mean this: the latin-2 characters are no different from the English
> ones; the only problem is that there are more of them. Therefore they
> should require no special treatment.
>
> It should be enough to enlarge data structures and change those places
> in all programs that make diacritics disappear, starting with sending
> the right accept-charset header when ht-digging.
>
> I haven't studied the sources enough yet, and it's probably gonna be
> more complicated. But e.g. MySQL has complete Czech support, including
> things like Czech sorting (ch is considered a single character which goes
> between 'h' and 'i' in the alphabet), so why not htdig?
>
> The special treatment for latin (probably latin-1) characters Marton Lorand
> suggested simply isn't what I mean. It's not enough.
>
I think you have the right vision of what's needed. htdig could indeed
support several charsets and sort orders, as MySQL does. The question is
how to do it. When Michael Widenius wrote the MySQL support for foreign
charsets, Unicode was only a standard on paper, with no powerful libraries
to support it. Nowadays we have those libraries. None of them is a de facto
standard yet, but within a year or two at most one will be established.
We basically have three options (more discussion in the archives): use
IBM ICU, use libunicode and enhance it, or use glibc.
What needs to be done is: choose a library; choose an internal charset
(UTF-8 as in Perl, or UTF-16 as in Java); convert the code so that it
distinguishes between the internal and external charsets; and use the
string (str*) functions that match the chosen internal charset instead of
strchr, strcmp and so on.
Basically it means that instead of concentrating on handling a specific
charset for a specific language, we should use a library that implements it
for all languages and knows 400 different charsets.
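
To make the internal/external distinction concrete, here is a minimal
sketch. It is in Python only because it is short; it is not htdig code,
and Python's built-in codecs merely stand in for whichever library (ICU,
libunicode, glibc) would be chosen. The function names to_internal and
to_external are hypothetical. The idea is to decode bytes once at the
input boundary, work only on the internal form, and encode once on output.

```python
# Sketch only: Python's codecs stand in for ICU / libunicode / glibc.
# to_internal and to_external are hypothetical names, not htdig functions.

def to_internal(raw: bytes, external_charset: str) -> str:
    """Decode bytes in an external charset into the internal form."""
    return raw.decode(external_charset)

def to_external(text: str, external_charset: str) -> bytes:
    """Encode the internal form into an external charset for output."""
    return text.encode(external_charset)

# The Czech word "česky" as it arrives in ISO-8859-2 (latin-2):
# 0xE8 is "č" in that charset.
raw = b"\xe8esky"

# Decode once at the boundary; every later step sees characters, not bytes.
word = to_internal(raw, "iso-8859-2")

# Character-level search on the internal form (the role strchr plays
# for plain ASCII): "č" is found as one character at index 0.
position = word.find("\u010d")

# Encode once on output. The same word is 5 bytes in latin-2 but
# 6 bytes in UTF-8, which is why byte-oriented functions cannot be
# used directly on multi-byte text.
as_utf8 = to_external(word, "utf-8")
as_latin2 = to_external(word, "iso-8859-2")
```

With this split, no code between the two boundary calls needs to know
which charset a document was fetched in.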
Cheers,
--
Loic Dachary
ECILA
100 av. du Gal Leclerc
93500 Pantin - France
Tel: 33 1 56 96 09 80, Fax: 33 1 56 96 09 61
e-mail: [EMAIL PROTECTED] URL: http://www.senga.org/