Follow-up Comment #8, bug #62264 (group groff):

More background/context/motivation.

commit 7ffd75ebd27fed200141cc17b825a215f23ace2a
Author: G. Branden Robinson <[email protected]>
Date:   Sat May 10 08:50:37 2025 -0500

doc/*,man/*: Correct `pm` request description.

The `pm` request does not, as a matter of fact, report the "sizes" of
macros, strings, and diversions "in bytes". That claim is true only of
"pure" macros and strings that contain only character sequences, and
even then only to the extent that we ignore other properties of macros
and strings besides what is internally referred to as their "contents".
(All such properties are now visible using the `.pm name` syntax.)

The traditional description of this GNU troff macro was thus misleading
in two ways.

1. It undercounted the actual storage requirement--which is what we
   tend to measure in bytes--of even the simplest of these objects,
   such as empty macro definitions or one-character string definitions.
   (It came closest to telling the truth about the built-in `.T`
   string, which has no file name or line number data to track; but
   even it has a node list; see next item.)

2. It was wildly inaccurate for diversions in general. GNU troff
   represents a diversion internally as a "node list". A node list is
   a linked list, and even a singly-linked list stores a pointer with
   every node. A "node" is itself an object in a class hierarchy in
   GNU troff, and these classes vary widely in their own storage
   requirements, which the program does not attempt to measure, let
   alone report. Further, GNU troff always maintains both a "contents"
   datum and a "node list" for every macro, string, and diversion. The
   presence of a node is marked by a null character in the "contents"
   groff string, which is a sequence of bytes. Because macros, strings,
   and diversions are interchangeable ("punnable") in some contexts,
   all have both "contents" and a "node list", even though the node
   list of a macro or string is usually empty.

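As an aside, the "contents" plus "node list" arrangement described in item 2 can be modeled with a toy Python sketch. This is only an illustration under stated assumptions: `Entity`, `Node`, and `NODE_MARKER` are hypothetical names, not GNU troff identifiers, and the real program is written in C++ with a linked list of objects from its node class hierarchy. The sketch shows why counting elements (characters or nodes) is not the same as measuring storage in bytes.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Union

# Assumption for illustration: one null character in "contents" stands
# in for one node held in the separate node list.
NODE_MARKER = "\0"


@dataclass
class Node:
    """Hypothetical stand-in for a formatted object (not a GNU troff class)."""
    kind: str


@dataclass
class Entity:
    """Toy model of a macro, string, or diversion."""
    contents: str = ""
    nodes: List[Node] = field(default_factory=list)

    def append_char(self, ch: str) -> None:
        # Ordinary characters live only in "contents".
        self.contents += ch

    def append_node(self, node: Node) -> None:
        # A node leaves a marker in "contents" and is stored separately.
        self.contents += NODE_MARKER
        self.nodes.append(node)

    def elements(self) -> Iterator[Union[str, Node]]:
        """Yield characters and nodes in order, one per element."""
        pending = iter(self.nodes)
        for ch in self.contents:
            yield next(pending) if ch == NODE_MARKER else ch

    def __len__(self) -> int:
        # Number of elements, not the storage the nodes occupy.
        return len(self.contents)


if __name__ == "__main__":
    plain = Entity()
    for ch in "hi":
        plain.append_char(ch)

    mixed = Entity()
    mixed.append_char("a")
    mixed.append_node(Node("glyph"))
    mixed.append_char("b")

    print(len(plain), plain.nodes)
    print(len(mixed), list(mixed.elements()))
```

In this model a "pure" string has an empty node list, while an entity holding a node still counts that node as a single element no matter how much memory the node object itself needs.
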
It seems more useful to document the truth than to change how `pm`
works; I expect few groff users concern themselves with the storage
requirements of their documents' strings, macros, and diversions at the
granularity of an individual byte. What is likely of more interest is
the length of the list of objects (nodes or characters) the string,
macro, or diversion contains. The elements of this list are what get
counted by the `length` request, removed by the `chop` request, and
transformed by the `asciify` and `unformat` requests. In the future,
if/when we implement an iterator for these objects, the aforementioned
elements will be the unit of iteration.

See <https://savannah.gnu.org/bugs/?62264>.

_______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?62264>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
