On Sun, Nov 06, 2022 at 10:02:44AM +0000, Werner LEMBERG wrote:
> 
> [texindex (GNU texinfo) 6.8dev]
> [GNU Awk 4.2.1, API: 2.0]
> [openSUSE Leap 15.4]
> 
> 
> There are two bugs with texindex, making it basically unusable for
> everything except English as the main document language.  For the
> report below, here is an input file.
> 
> ```
> \input texinfo.tex
> 
> @documentencoding UTF-8
> @documentlanguage ca
> 
> @findex a
> @findex à
> @findex u
> @findex ù
> 
> @printindex fn
> 
> @bye
> ```
> 
> * The first, really severe bug is that the resulting output is
>   completely broken if `texindex` is called with `LANG=C`.  Saying
> 
>   ```
>   LANG=C texi2pdf sort-ca.texi 
>   ```
> 
>   creates the following `.fns` output
> 
>   ```
>   \initial {0xc3}
>   \entry{\code {à}}{1}
>   \entry{\code {ù}}{1}
>   \initial {A}
>   \entry{\code {a}}{1}
>   \initial {U}
>   \entry{\code {u}}{1}
>   ```
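
As an aside, the `0xc3` initial looks like the first byte of the UTF-8 encoding of these letters: both à (U+00E0) and ù (U+00F9) encode to two bytes, and both start with 0xC3. That would explain why the two entries land under the same initial. My assumption, not checked against the texindex source, is that under LANG=C the text is handled byte by byte. A small Python sketch (the helper name `first_byte` is made up for illustration) shows the byte values:

```python
def first_byte(text):
    """Return the first byte of the UTF-8 encoding of text."""
    return text.encode("utf-8")[0]

for ch in ("à", "ù", "a", "u"):
    encoded = ch.encode("utf-8")
    # Show every byte so the shared lead byte of à and ù is visible.
    print(ch, [hex(b) for b in encoded], "first byte:", hex(first_byte(ch)))
```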

A (non-ideal) workaround is to avoid non-ASCII characters as the
first character of index entries, and to use accent commands
instead:

@findex ax
@findex @'ay
@findex ux
@findex @`uy

producing in sort-ca.fn

\entry{ax}{1}{\code {ax}}
\entry{ay}{1}{\code {\'ay}}
\entry{ux}{1}{\code {ux}}
\entry{uy}{1}{\code {\`uy}}

and in sort-ca.fns

\initial {A}
\entry{\code {ax}}{1}
\entry{\code {\'ay}}{1}
\initial {U}
\entry{\code {ux}}{1}
\entry{\code {\`uy}}{1}

In this example, I added extra letters to distinguish the unaccented
and accented entries; otherwise texindex would condense them into a
single index entry, as they have the same sort key.
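
To illustrate the merging, here is a rough Python sketch. It does not reproduce how texinfo.tex or texindex actually compute sort keys; it only mimics what the .fn file above shows (the key for `\'ay` is `ay`) by dropping combining accents. The function names `sort_key` and `group_by_key` are hypothetical.

```python
import unicodedata

def sort_key(entry):
    """Hypothetical sort key: the entry with accents removed."""
    decomposed = unicodedata.normalize("NFD", entry)
    # Drop combining marks (for example U+0300, the grave accent).
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def group_by_key(entries):
    """Group entries that share a sort key, keeping first-seen order."""
    groups = {}
    for entry in entries:
        groups.setdefault(sort_key(entry), []).append(entry)
    return groups

# Without the extra letter, accented and plain entries share a key.
print(group_by_key(["a", "à", "u", "ù"]))
# With the extra letter, every entry has its own key.
print(group_by_key(["ax", "áy", "ux", "ùy"]))
```

With the one-letter entries, each accented entry falls into the same group as its plain counterpart, which corresponds to the condensing described above; with the extra letters, all four keys differ.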


