On Sat, Jun 28, 2025 at 07:32:04PM +0200, Patrice Dumas wrote:
> On Sat, Jun 28, 2025 at 04:39:14PM +0100, Gavin Smith wrote:
> > As I remember, the marker for an index node (^@^H[index^@^H]) was not
> > added by older versions of texi2any/makeinfo.  So there are Info files
> > out there that do not have such a marker in Info nodes.
> 
> Actually, there is a second reason, I believe, to restrict to nodes
> after "Index" in their name, is that it avoids scanning all the
> nodes to gather indices. 

Good point.

> If I read the code well, scanning a node for
> indices and other markers mainly happens when the node is visited,
> and for nodes after a node with "Index" in name in
> info_indices_of_file_buffer.  Therefore, what I propose would make the
> first call of info_indices_of_file_buffer (after starting Info or after
> calling info_indices_of_file_buffer in another manual) slower than it is
> today.

If scanning the entire file is too slow, an optimisation would be to only
scan (say) the first 1024 bytes of each node, or the first 10 lines or so,
as that is where the index marker typically appears:

File: info-stnd.info,  Node: Index,  Prev: MS-DOS/Windows keybindings,  Up: Top

Appendix A Index
****************

 [index ]
* Menu:
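
A rough, untested-against-real-manuals sketch of that bounded scan, in
Python rather than the C of the actual reader.  The function name and the
1024-byte default are made up for illustration; the marker bytes are the
^@^H[index^@^H] sequence mentioned earlier in the thread:

```python
# The Info index marker: NUL, backspace, "[index", NUL, backspace, "]".
INDEX_MARKER = b"\x00\x08[index\x00\x08]"


def node_has_index_marker(node: bytes, limit: int = 1024) -> bool:
    """Return True if the marker occurs within the first `limit` bytes.

    `node` is the raw text of one node, header line included.  Both the
    function name and the default limit are hypothetical.
    """
    return INDEX_MARKER in node[:limit]
```

A marker that starts after the limit, or straddles it, is not found, so
the limit has to be generous enough to cover the node header and
the section title.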

I also don't think it is a huge restriction to have "Index" in the
node name.  Non-English Info manuals could have English node names, or
they could have the word "Index" in addition to the translation.  (I
haven't checked what happens with anchors with Index in the name but
marking with an anchor is another possibility.)

> Maybe another option could be to add to the Tag table a new tag to
> locate the beginning of the node with the first @printindex by texi2any.
> I checked that a tag that is not 'Node:' or 'Ref:' has the Info reader
> (but not the Emacs Info reader) stop going through tags immediately.
> This means that a new tag would break the Info readers from previous
> releases.  This could still be considered, but for a very long term, it
> could be added to the Info reader, but would start being emitted by
> makeinfo by default only when we think that the Info readers have all
> been modified (in 20 years from now maybe?  I would be old by then).

In that time it is likely that the speed of scanning the entire Info file
would be even more insignificant than it is today.  I doubt it is worth
attempting to change the Info file format with some compatibility schedule
to follow.

> Another possibility (could be in addition to something in the tag table),
> which, I believe, would be more backward compatible would be to add a
> 'Local Variables' with the name of the first node with indices.  It
> would be less efficient than having the information in the tag table,
> but would still be better than using "Index", and somewhat more
> efficient as the whole node name would be matched.  If not present, then
> "Index" would be used as before.
> 
> > So what I suggest (if it is not the case already), is that if there are
> > any nodes with such a marker, only the marker is used to determine which
> > nodes are index nodes.  Only if no nodes are thus marked, should the "Index"
> > string in the node name then indicate an index node.
> 
> Ok.  This would be needed for the 'scan all the nodes' case only.

We could also only check nodes with "Index" in the name, but if any such
nodes also contain an index marker, then we require all such nodes to
have an index marker for them to be considered as indices.  That way we
avoid including nodes that aren't indices but which have "Index" in the name
(such as "Indexing Commands" in the Texinfo manual).
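
To make the rule concrete, here is a small sketch in Python (not the
reader's C code).  The function name and the (name, has_marker) pair
representation are assumptions for illustration only:

```python
def select_index_nodes(nodes):
    """Pick index nodes from a list of (node_name, has_marker) pairs.

    Only nodes with "Index" in the name are candidates.  If any candidate
    carries an index marker, only marked candidates count as indices;
    otherwise every candidate does, as for older Info files.
    """
    candidates = [(name, marked) for name, marked in nodes if "Index" in name]
    if any(marked for _, marked in candidates):
        return [name for name, marked in candidates if marked]
    return [name for name, _ in candidates]
```

With a marked "Concept Index" present, "Indexing Commands" is dropped;
in a file with no markers at all, both would still be treated as
indices, which is the existing behaviour.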
