On Sun, Nov 06, 2022 at 10:02:44AM +0000, Werner LEMBERG wrote:
>
> [texindex (GNU texinfo) 6.8dev]
> [GNU Awk 4.2.1, API: 2.0]
> [openSUSE Leap 15.4]
>
>
> There are two bugs with texindex, making it basically unusable for
> everything except English as the main document language.  For the
> report below, here is an input file.
>
> ```
> \input texinfo.tex
>
> @documentencoding UTF-8
> @documentlanguage ca
>
> @findex a
> @findex à
> @findex u
> @findex ù
>
> @printindex fn
>
> @bye
> ```
>
> * The first, really severe bug is that the resulting output is
>   completely broken if `texindex` is called with `LANG=C`.  Saying
>
>   ```
>   LANG=C texi2pdf sort-ca.texi
>   ```
>
>   creates the following `.fns` output
>
>   ```
>   \initial {0xc3}
>   \entry{\code {à}}{1}
>   \entry{\code {ù}}{1}
>   \initial {A}
>   \entry{\code {a}}{1}
>   \initial {U}
>   \entry{\code {u}}{1}
>   ```

A (non-ideal) workaround is to avoid non-ASCII characters as the first
character of index entries, using accent commands instead:

    @findex ax
    @findex @'ay
    @findex ux
    @findex @`uy

producing in sort-ca.fn

    \entry{ax}{1}{\code {ax}}
    \entry{ay}{1}{\code {\'ay}}
    \entry{ux}{1}{\code {ux}}
    \entry{uy}{1}{\code {\`uy}}

and in sort-ca.fns

    \initial {A}
    \entry{\code {ax}}{1}
    \entry{\code {\'ay}}{1}
    \initial {U}
    \entry{\code {ux}}{1}
    \entry{\code {\`uy}}{1}

In this example, I added extra letters to distinguish a and à;
otherwise texindex would condense them into a single index entry, as
they have the same sort key.
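
For reference, a complete input file with the workaround applied might look like the following. This is only a sketch: it is the test file from the report with the four @findex lines swapped for the accent-command versions above, and it assumes the file is still named sort-ca.texi. I have not re-run it in this exact form.

```
\input texinfo.tex

@documentencoding UTF-8
@documentlanguage ca

@c Accent commands keep the first character of each sort key in ASCII.
@findex ax
@findex @'ay
@findex ux
@findex @`uy

@printindex fn

@bye
```

Running `LANG=C texi2pdf sort-ca.texi` on it should then give the sort-ca.fn and sort-ca.fns contents shown above, with no `\initial {0xc3}` group.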