On Wed, May 17, 2017 at 05:19:56AM +0300, Eli Zaretskii wrote:
> > From: Gavin Smith <[email protected]>
> > Date: Tue, 16 May 2017 22:05:41 +0100
> > Cc: Texinfo <[email protected]>
> >
> > On 16 May 2017 at 20:06, Eli Zaretskii <[email protected]> wrote:
> > >> From: Gavin Smith <[email protected]>
> > >> Date: Tue, 16 May 2017 19:29:48 +0100
> > >>
> > >> Note: Info files with CR-LF line endings should carry on being
> > >> supported. on all operating systems.
> > >
> > > But actually, they aren't supported on any OS. That's why we changed
> > > texi2any to emit Unix-style LF-only EOLs a while back, remember?
> >
> > They shouldn't be generated but info can still read them.
>
> Info can indeed read them, but cross-references to anchors don't work
> correctly then. They land the reader at the wrong place, because byte
> counts don't match. This started happening some versions ago, because
> of some change whose details I no longer remember.

So the Info files produced by old versions of makeinfo on MS-DOS or
MS-Windows ended lines with a CR-LF sequence, but each sequence was
only counted as one byte in the tags table (which gives the byte
offsets of nodes within the file). So info stripped these sequences so
that files could be read correctly.

Eventually it appeared that there could be files with some lines
ending in CR-LF which were counted as _two_ bytes, so in an attempt to
support these, the CR bytes were only stripped after a node couldn't
be found in a file. Apparently this doesn't work completely reliably,
especially for anchors, where there is nothing at the offset to
confirm that you found the right place (unlike nodes, which have a
node separator).

Could we unconditionally strip the CR's on DOS and Windows only? This
could be done by calling the 'convert_eols' function (currently in
info/nodes.c), or else by opening the file in "text" mode in
filesys_read_info_file in info/filesys.c (currently it uses a flag
"O_BINARY" to open the file).

This would mean that Info files with CR-LF line endings could only be
read on Windows, and moreover Info on Windows could not read (the few)
Info files where the CR bytes were counted.

I believe this could result in quite a bit of simplification of the
code, as the code that conditionally calls 'convert_eols' is a bit
difficult (see find_node_from_tag in info/nodes.c), so I'd like to
make this change. The upside for Windows users would be that following
xrefs to anchors would be more reliable.

(The info/t/crs-not-counted.sh test would have to be skipped on
Windows, or we could just remove this test as there wouldn't be much
left for it to test.)

If anyone really wanted to read an Info file with CR-LF line endings
on GNU/Linux, they would have to convert the line endings themselves.
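
To make the idea of unconditional stripping concrete, here is a rough
sketch of an in-place CR-LF to LF conversion. It is only an
illustration and not the existing 'convert_eols' code; the name
strip_crlf and its signature are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not the real convert_eols from info/nodes.c):
   drop the CR of every CR-LF pair in BUF, which holds LEN bytes, in
   place, and return the new length.  A CR that is not immediately
   followed by LF is kept.  */
size_t
strip_crlf (char *buf, size_t len)
{
  size_t in, out = 0;

  assert (buf != NULL || len == 0);
  for (in = 0; in < len; in++)
    {
      if (buf[in] == '\r' && in + 1 < len && buf[in + 1] == '\n')
        continue;               /* skip the CR; the LF is copied next */
      buf[out++] = buf[in];
    }
  return out;
}
```

For the five bytes "ab\r\nc", a tags table that counted CR-LF as one
byte records 'c' at offset 3, which is where it sits after stripping.
A tags table that counted two bytes records offset 4, which would
point one byte too far in the stripped buffer; that is the case that
would no longer be readable.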
