> From: Gavin Smith <[email protected]>
> Date: Mon, 27 Apr 2026 14:09:37 +0100
> Cc: [email protected], [email protected]
>
> On Mon, Apr 27, 2026 at 02:48:23PM +0300, Eli Zaretskii wrote:
> > > From: Gavin Smith <[email protected]>
> > > Date: Sun, 26 Apr 2026 22:01:37 +0100
> > > Cc: [email protected]
> > >
> > > I remember that the convention of merging index entries with the
> > > same sort key was very old (from when I looked at this before, several
> > > years ago), but I thought we could reconsider this, as I do not actually
> > > see any advantage of merging the entries.
> > >
> > > Does anyone have an opinion on this?
> >
> > Isn't this what causes the Index to have stuff like
> >
> >   Foo bar.......................................42, 142, 442
> >
> > rather than
> >
> >   Foo bar.......................................42
> >   Foo bar.......................................142
> >   Foo bar.......................................442
> >
> > That is, if the same subject is mentioned in several places, have on
> > cumulative entry for it in the index with all the pages?  If so, I
> > see a clear advantage to merging the entries, at least for
> > non-punctuation characters.
>
> I propose that they only be merged if the index entry text is identical.
>
> Thus, the following two index entries should be merged:
>
> @cindex Foo bar
> @cindex Foo bar
>
> However, the following index entries should be distinct:
>
> @cindex Foo bar
> @cindex Foo @code{bar}
> @cindex Foo @command{bar}
> @cindex Föö bar
>
> I notice in the NEWS file for Texinfo, in the section for 6.0, there
> is:
>
> * texindex:
>   . completely new implementation as a literate program using Texinfo
>     and (portable) awk (called TexiWeb Jr.), thanks to Arnold Robbins.
>     (Requires gawk 4.0+ if .twjr source is modified.)
>   . the -o (--output) is not supported, unless we hear of someone using it.
>   . duplicated sort keys with different display texts result in one
>     merged index entry, using the first display text.
>   . better sorting and parsing in unusual cases; most notably, { and }
>     characters can appear as initials.
>
> Bullet point 3 is what I am talking about here.

I guess I didn't understand what you meant by "the same sort key".  I
thought the entire text of each index entry is "the sort key", so I
interpreted "the same key" as "the same text of index entry".  This is
your first example.

It seems now that you are talking about collation that ignores certain
secondary/tertiary weights, like accents and punctuation?  If so, then
whether such entries are merged should indeed be controllable by some
option, because it is quite possible that someone will want them
merged.
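
To make the two readings of "the same sort key" concrete, here is a
rough Python sketch.  It is not how texindex works: the functions
sort_key and merge are hypothetical, the key derivation (drop @-command
markup, accents, and case) is only an assumption about what a collation
with reduced weights might do, and page 500 is made up for illustration.

```python
import re
import unicodedata
from collections import OrderedDict


def sort_key(entry_text):
    """Hypothetical sort key: drop @cmd{...} markup, accents, and case."""
    # Replace @cmd{arg} with arg (handles one level of braces only).
    plain = re.sub(r"@\w+\{([^{}]*)\}", r"\1", entry_text)
    # Decompose accented letters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFD", plain)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


def merge(entries, key_func):
    """Group (text, page) pairs by key_func(text), keeping the first display text."""
    merged = OrderedDict()
    for text, page in entries:
        _display, pages = merged.setdefault(key_func(text), (text, []))
        pages.append(page)
    return list(merged.values())


# Entries from the examples in this thread; page 500 is invented.
entries = [
    ("Foo bar", 42),
    ("Foo bar", 142),
    ("Foo @code{bar}", 442),
    ("Föö bar", 500),
]

# Policy A: merge whenever the derived sort keys collide.
by_key = merge(entries, sort_key)

# Policy B: merge only when the index entry text is identical.
by_text = merge(entries, lambda text: text)

for display, pages in by_key:
    print("A:", display, pages)
for display, pages in by_text:
    print("B:", display, pages)
```

Under policy A all four entries collapse into a single "Foo bar" line;
under policy B only the two textually identical entries are merged and
the @code{} and accented variants stay separate.  An option would
simply select which key function is used.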
