Re: Do not merge index entries with equal sort keys?

Eli Zaretskii Mon, 27 Apr 2026 07:04:31 -0700

> From: Gavin Smith <[email protected]>
> Date: Mon, 27 Apr 2026 14:09:37 +0100
> Cc: [email protected], [email protected]
> 
> On Mon, Apr 27, 2026 at 02:48:23PM +0300, Eli Zaretskii wrote:
> > > From: Gavin Smith <[email protected]>
> > > Date: Sun, 26 Apr 2026 22:01:37 +0100
> > > Cc: [email protected]
> > > 
> > > I remember that the convention of merging index entries with the
> > > same sort key was very old (from when I looked at this before, several
> > > years ago), but I thought we could reconsider this, as I do not actually
> > > see any advantage of merging the entries.
> > > 
> > > Does anyone have an opinion on this?
> > 
> > Isn't this what causes the Index to have stuff like
> > 
> >   Foo bar.......................................42, 142, 442
> > 
> > rather than
> > 
> >   Foo bar.......................................42
> >   Foo bar.......................................142
> >   Foo bar.......................................442
> > 
> > That is, if the same subject is mentioned in several places, have one
> > cumulative entry for it in the index with all the pages?  If so, I
> > see a clear advantage to merging the entries, at least for
> > non-punctuation characters.
> 
> I propose that they only be merged if the index entry text is identical.
> 
> Thus, the following two index entries should be merged:
> 
> @cindex Foo bar
> @cindex Foo bar
> 
> However, the following index entries should be distinct:
> 
> @cindex Foo bar
> @cindex Foo @code{bar}
> @cindex Foo @command{bar}
> @cindex Föö bar
> 
> I notice in the NEWS file for Texinfo, in the section for 6.0, there
> is:
> 
> * texindex:
>   . completely new implementation as a literate program using Texinfo
>     and (portable) awk (called TexiWeb Jr.), thanks to Arnold Robbins.
>     (Requires gawk 4.0+ if .twjr source is modified.)
>   . the -o (--output) is not supported, unless we hear of someone using it.
>   . duplicated sort keys with different display texts result in one
>     merged index entry, using the first display text.
>   . better sorting and parsing in unusual cases; most notably, { and }
>     characters can appear as initials.
> 
> Bullet point 3 is what I am talking about here.


I guess I didn't understand what you meant by "the same sort key".  I
thought the entire text of each index entry is "the sort key", so I
interpreted "the same key" as "the same text of index entry".  This is
your first example.  It seems now that you are talking about collation
that ignores certain secondary/tertiary weights, like accents and
punctuation?  If so, then whether to merge such entries should indeed
be controllable by some option, because it is quite possible that
someone will want to merge them.
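
To make the distinction concrete, here is a rough sketch of the two
merging policies, using the example entries quoted above.  This is not
how texindex is implemented; the sort_key function is a hypothetical
stand-in that unwraps brace commands, strips accents, and folds case,
and the page numbers are made up for illustration.

```python
import re
import unicodedata


def sort_key(text):
    """Hypothetical sort key: unwrap @cmd{...}, drop accents, fold case."""
    # Unwrap Texinfo brace commands such as @code{bar} -> bar.
    text = re.sub(r"@\w+\{([^{}]*)\}", r"\1", text)
    # Decompose accented letters and drop the combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


def merge(entries, key):
    """Group (text, page) pairs by key(text); the first display text wins."""
    merged = {}
    for text, page in entries:
        _, pages = merged.setdefault(key(text), (text, []))
        pages.append(page)
    return list(merged.values())


# Illustrative entries; the page numbers are invented.
entries = [
    ("Foo bar", 42),
    ("Foo bar", 142),
    ("Foo @code{bar}", 442),
    ("F\u00f6\u00f6 bar", 450),
]

# Policy 1: merge on equal sort keys (the behavior described in NEWS for 6.0).
by_sort_key = merge(entries, sort_key)

# Policy 2: merge only when the entry text is identical (the proposal).
by_text = merge(entries, lambda text: text)

print(by_sort_key)
print(by_text)
```

Under the first policy all four entries collapse into a single "Foo
bar" line; under the second, only the two identical entries are merged
and the @code and accented variants stay separate.  An option could
simply select which key function is passed to merge.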
