On Sun, 2 Apr 2000, Dmitry Sivachenko wrote:

> > > Maybe a command-line option is a good solution.
> > > What do people think?
> >
> > I do think that it's a good solution.  But the problem is not solved by 
> > it completely.  If you think that texindex should call setlocale based on 
> > that option, I'm not sure it's the right thing to do: setlocale affects 
> > other functions beyond case-converting ones.
> 
> Hmm, what other functions do LC_CTYPE and LC_COLLATE affect?

LC_COLLATE affects strxfrm and strcoll; LC_CTYPE affects printf,
scanf, strtol and its ilk, and the mb* functions.

> Locale was intended to simplify and unify the process of handling non-ASCII
> letters.  It is not a good thing to reject using it, IMHO.

The problem with locales in C is that they are global: there can be
only one locale in effect at any given time; you cannot have different
locales for different file streams or different classes of functions.

In other words, you cannot, for example, compare ASCII and non-ASCII
text at the same time and do both of them correctly.
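
To illustrate what the global locale forces on a program, here is a
minimal sketch of comparing two strings under a named collation
locale and then restoring whatever was in effect before.  The
function name strcoll_in_locale is hypothetical (it is not part of
texindex or of the C library); only setlocale and strcoll are
standard calls.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Compare A and B with strcoll under the LC_COLLATE locale NAME,
   then restore the previous LC_COLLATE setting.  *OK is set to 1 if
   the comparison was made, 0 if NAME is unavailable or memory ran
   out.  (Hypothetical helper, for illustration only.)  */
static int
strcoll_in_locale (const char *name, const char *a, const char *b, int *ok)
{
  const char *current = setlocale (LC_COLLATE, NULL);
  char *saved;
  int result = 0;

  *ok = 0;
  if (current == NULL)
    return 0;

  /* The string returned by setlocale may be overwritten by the next
     call, so keep a private copy of it.  */
  saved = malloc (strlen (current) + 1);
  if (saved == NULL)
    return 0;
  strcpy (saved, current);

  if (setlocale (LC_COLLATE, name) != NULL)
    {
      /* From here until the restore below, every strcoll and strxfrm
         call in the whole program sees NAME, not only this one.  */
      result = strcoll (a, b);
      *ok = 1;
      setlocale (LC_COLLATE, saved);
    }

  free (saved);
  return result;
}
```

Every comparison that needs different rules has to go through this
save, switch, and restore sequence, and any code that runs in between
sees the switched locale as well.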

In addition, in this particular case, you suggest using the locale
facilities for something they were not meant for: when texindex
processes a Russian document, it doesn't necessarily mean that the
machine which runs the program is set to use the Russian locale.  Such
abuse of library facilities may produce unexpected results (read:
bugs).

> texindex now can't produce indexes sorted in alphabetical order.
> Not all encodings satisfy the condition that 'a'<'b'==>'a' comes
> before 'b' in the alphabet.

If sorting is the only problem, it can be done without using locales.
It's not that hard to come up with collating sequences for supported
languages.  We could even have a provision for reading the lexical
order from some external file, submitted to texindex via a
command-line option, thus making texindex oblivious to what language
it is processing.

The advantage of this is that you don't subvert the locale mechanism.
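
A rough sketch of the collating-sequence idea, assuming a single-byte
encoding and an order given as a plain string of characters (which
could just as well be read from an external file).  All names here
(set_collating_sequence, collate_compare, compare_entries) are
hypothetical and are not taken from the texindex sources.

```c
#include <stdlib.h>

/* rank[c] is the position of byte C in the collating sequence.  */
static unsigned char rank[256];

/* Build the rank table from ORDER, a string that lists characters in
   the desired lexical order.  Bytes not listed sort after all listed
   ones, in plain byte order.  */
static void
set_collating_sequence (const char *order)
{
  unsigned char seen[256] = { 0 };
  const unsigned char *p;
  int next = 0;
  int c;

  for (p = (const unsigned char *) order; *p; p++)
    if (!seen[*p])
      {
        seen[*p] = 1;
        rank[*p] = (unsigned char) next++;
      }
  for (c = 0; c < 256; c++)
    if (!seen[c])
      rank[c] = (unsigned char) next++;
}

/* Compare A and B by rank instead of by byte value.  A string that
   is a prefix of another sorts first.  */
static int
collate_compare (const char *a, const char *b)
{
  const unsigned char *s = (const unsigned char *) a;
  const unsigned char *t = (const unsigned char *) b;

  while (*s && *s == *t)
    s++, t++;
  if (*s == *t)
    return 0;                   /* both strings ended */
  if (*s == 0)
    return -1;
  if (*t == 0)
    return 1;
  return (int) rank[*s] - (int) rank[*t];
}

/* qsort callback for an array of `const char *' index entries.  */
static int
compare_entries (const void *a, const void *b)
{
  return collate_compare (*(const char *const *) a,
                          *(const char *const *) b);
}
```

With the sequence set to "aAbBcC..." and so on, qsort with
compare_entries puts "apple" before "Banana" before "cherry", whereas
strcmp would put "Banana" first.  No setlocale call is involved, so
the rest of the program is unaffected.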
