[bug #55154] .tr has undocumented and inconsistent space-character restrictions

2024-07-17 Thread Dave
Follow-up Comment #16, bug #55154 (group groff):

But zooming out:

Comment #13 was my attempt to test my hypothesis of "why .char doesn't work
for this" as presented in comment #11.

This is all tangential to the purpose of this ticket, which covers the .tr
request and various types of spaces.  Everything through comment #10 touches
on this.

Comment #11 onward is where I veered off course, covering the parallel
situation with .char.  This might be more relevant to bug #65829 (which didn't
exist till comment #12).

Does this clarify the status for you?


___

Reply to this item at:

  

___
Message sent via Savannah
https://savannah.gnu.org/


signature.asc
Description: PGP signature


[bug #55154] .tr has undocumented and inconsistent space-character restrictions

2024-07-17 Thread Dave
Follow-up Comment #15, bug #55154 (group groff):

[comment #14 comment #14:]
> Try this patch:

Just tried to apply it, but it appears to have already been committed as
[http://git.savannah.gnu.org/cgit/groff.git/commit/?id=84bf4520b commit
84bf4520b].




[bug #55154] .tr has undocumented and inconsistent space-character restrictions

2024-07-11 Thread G. Branden Robinson
Update of bug #55154 (group groff):

Assigned to: None => barx

___

Follow-up Comment #14:

Try this patch:


diff --git a/src/roff/troff/node.cpp b/src/roff/troff/node.cpp
index b00f8377e..2c6cbbec7 100644
--- a/src/roff/troff/node.cpp
+++ b/src/roff/troff/node.cpp
@@ -2183,6 +2183,8 @@ void glyph_node::ascii_print(ascii_output_file *ascii)
 ascii->outs(ci->nm.contents());
 }
 
+// XXX: This and `composite_node::dump_node()` are identical.  C++
+// presumably has several different solutions for this.  Pick one.
 void glyph_node::dump_node()
 {
   unsigned char c = ci->get_ascii_code();
@@ -4279,6 +4281,7 @@ public:
   bool is_tag();
   void vertical_extent(vunits *, vunits *);
   vunits vertical_width();
+  void dump_node();
 };
 
 composite_node::composite_node(node *p, charinfo *c, tfont *t, statem *s,
@@ -4918,6 +4921,26 @@ void composite_node::tprint(troff_output_file *out)
 out->right(track_kern);
 }
 
+// XXX: This and `glyph_node::dump_node()` are identical.  C++
+// presumably has several different solutions for this.  Pick one.
+void composite_node::dump_node()
+{
+  unsigned char c = ci->get_ascii_code();
+  fprintf(stderr, "{type: %s, character: ", type());
+  if (c)
+    fprintf(stderr, "\"%c\"", c);
+  else
+    fprintf(stderr, "\"\\%s\"", ci->nm.contents());
+  fputs(", ", stderr);
+  if (push_state)
+    fprintf(stderr, "push_state, ");
+  if (state)
+    state->display_state();
+  fprintf(stderr, "diversion level: %d", div_nest_level);
+  fputs("}", stderr);
+  fflush(stderr);
+}
+
 static node *make_composite_node(charinfo *s, environment *env)
 {
   int fontno = env_definite_font(env);


Here's your exhibit.


$ ./build/test-groff -z EXPERIMENTS/55154.tr
{type: line_start_node, diversion level: 0},
{type: glyph_node, character: "a", diversion level: 0},
{type: composite_node, character: "b", diversion level: 0},
{type: glyph_node, character: "c", diversion level: 0},
{type: word_space_node, diversion level: 0},
{type: glyph_node, character: "c", diversion level: 0},
{type: composite_node, character: "b", diversion level: 0},
{type: glyph_node, character: "a", diversion level: 0}

{type: line_start_node, diversion level: 0},
{type: glyph_node, character: "a", diversion level: 0},
{type: unbreakable_space_node, diversion level: 0},
{type: glyph_node, character: "c", diversion level: 0},
{type: word_space_node, diversion level: 0},
{type: glyph_node, character: "c", diversion level: 0},
{type: unbreakable_space_node, diversion level: 0},
{type: glyph_node, character: "a", diversion level: 0}


Assigning back to you because I'm not clear where we are on this, or whether
the foregoing illuminates anything.




[bug #55154] .tr has undocumented and inconsistent space-character restrictions

2024-06-11 Thread Dave
Follow-up Comment #13, bug #55154 (group groff):

[comment #11 comment #11:]
> I presume this is due to this explanation in the Texinfo manual:
> 
>  -- Request: .char c ['"'][contents]
>  Every time C is to be output, CONTENTS is processed in a
> temporary environment and the result encapsulated in a node.
> 
> The temporary environment being unaware of the rest of the line,
> it can only turn \~ into a node that is the width of an ordinary
> unbreakable space.

The example below (which requires a recent groff build, as it uses .pline) is
consistent with this hypothesis (though not proof of it, as the precise content
of the two composite_node nodes is unknown).

$ cat 55154.tr
.char b \~
abc cba
.pline
.brp
.tm
a\~c c\~a
.pline
.brp
$ groff 55154.tr > /dev/null
{type: line_start_node, diversion level: 0},
{type: glyph_node, character: "a", diversion level: 0},
{type: composite_node, diversion level: 0},
{type: glyph_node, character: "c", diversion level: 0},
{type: word_space_node, diversion level: 0},
{type: glyph_node, character: "c", diversion level: 0},
{type: composite_node, diversion level: 0},
{type: glyph_node, character: "a", diversion level: 0}

{type: line_start_node, diversion level: 0},
{type: glyph_node, character: "a", diversion level: 0},
{type: unbreakable_space_node, diversion level: 0},
{type: glyph_node, character: "c", diversion level: 0},
{type: word_space_node, diversion level: 0},
{type: glyph_node, character: "c", diversion level: 0},
{type: unbreakable_space_node, diversion level: 0},
{type: glyph_node, character: "a", diversion level: 0}








[bug #55154] .tr has undocumented and inconsistent space-character restrictions

2024-06-02 Thread Dave
Follow-up Comment #12, bug #55154 (group groff):

[comment #11 comment #11:]
> While everything here appears to be working as designed, I'm
> tempted to open a new bug report anyway,

Succumbed to temptation: bug #65829






[bug #55154] .tr has undocumented and inconsistent space-character restrictions

2024-05-16 Thread Dave
Follow-up Comment #11, bug #55154 (group groff):

[comment #10 comment #10:]
> Bizarrely, while it accepts the second translation, it doesn't
> actually honor it.

It gets worse: even .char fails at this.

$ cat char-test
.char b \~
abc cba\p
$ nroff char-test | cat -s
a c   c a

$ 

I presume this is due to this explanation in the Texinfo manual:

 -- Request: .char c ['"'][contents]
 Every time C is to be output, CONTENTS is processed in a temporary
environment and the result encapsulated in a node.

The temporary environment being unaware of the rest of the line, it can only
turn \~ into a node that is the width of an ordinary unbreakable space.
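
The presumption above can be illustrated with a toy model of line adjustment.
This is not groff's algorithm: the Node struct, the adjust() function, the
two sample lines, and the rule of handing leftover cells to the leftmost
spaces are all assumptions made for the illustration.  The only point is that
a space sealed inside a fixed-width node cannot take a share of the extra
width, so the ordinary word space absorbs all of it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy node: a glyph (non-empty text) or a space (empty text).
struct Node {
  std::string text;  // glyph text; empty for a space
  int width;         // natural width in character cells
  bool stretchable;  // may the adjuster widen this node?
};

// Pad the line to `target` cells by widening only the stretchable spaces.
std::string adjust(const std::vector<Node> &line, int target)
{
  int natural = 0, stretchy = 0;
  for (const Node &n : line) {
    natural += n.width;
    if (n.stretchable)
      ++stretchy;
  }
  int extra = target - natural;
  if (extra < 0 || stretchy == 0)
    extra = 0;
  int share = stretchy ? extra / stretchy : 0;
  int leftover = stretchy ? extra % stretchy : 0;
  std::string out;
  for (const Node &n : line) {
    if (!n.text.empty()) {
      out += n.text;
      continue;
    }
    int w = n.width;
    if (n.stretchable) {
      w += share;
      if (leftover > 0) {  // leftmost spaces take the remainder
        ++w;
        --leftover;
      }
    }
    out.append(static_cast<std::size_t>(w), ' ');
  }
  return out;
}

// "a\~c c\~a": both \~ spaces and the word space can stretch.
std::vector<Node> direct_line()
{
  return {{"a", 1, false}, {"", 1, true}, {"c", 1, false}, {"", 1, true},
          {"c", 1, false}, {"", 1, true}, {"a", 1, false}};
}

// "abc cba" after ".char b \~": each b is modeled as a fixed-width node.
std::vector<Node> char_line()
{
  return {{"a", 1, false}, {"", 1, false}, {"c", 1, false}, {"", 1, true},
          {"c", 1, false}, {"", 1, false}, {"a", 1, false}};
}
```

In this model, adjusting direct_line() to 10 cells widens all three spaces
equally, while adjusting char_line() to 10 cells puts every extra cell into
the single word space, which is the shape of the nroff output shown above.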

This is frustrating, because it means there is no way within groff to work
around bug #62300 (fixed in 1.23.0) for UTF-8 documents that need to work
under earlier preconvs.  Another tool has to preprocess the file before
preconv gets to it.  While everything here appears to be working as designed,
I'm tempted to open a new bug report anyway, because the design thwarts such a
seemingly straightforward way to handle pre-1.23 preconv output for the input
character U+00A0.

Hopefully I'm wrong about something above, and someone wiser will set me
straight.






[bug #55154] .tr has undocumented and inconsistent space-character restrictions

2024-05-15 Thread Dave
Follow-up Comment #10, bug #55154 (group groff):

[comment #0 original submission:]
> .tr a 
> .tr b\~
> .tr c\ 
> .tr d\|
> .tr e\^
> .tr f\0
> 
> This attempts to translate six alphabetic characters to six
> different types of space characters.  What it does instead is
> accept the first two translations and reject the last four:

Bizarrely, while it accepts the second translation, it doesn't actually honor
it.

$ cat tr-test
.tr b\~
abc cba\p
$ nroff tr-test | cat -s
a c   c a

$ 

I'm not sure what to think of this.  On the one hand, comment #2 argues this
shouldn't work; even the Texinfo manual says a .tr target should be a glyph,
which \~ isn't.  On the other hand, given that fact, the translation failing
outright (as the subsequent lines of the original submission do) would make
sense, whereas silently converting a stretchable space to an unstretchable one
surely does not.

The deprecation proposed in bug #64337 would make this moot.






[bug #55154] .tr has undocumented and inconsistent space-character restrictions

2023-12-27 Thread Dave
Follow-up Comment #9, bug #55154 (group groff):

[comment #7 comment #7:]
> > I don't think \| and \^ are too much of a challenge here. 
> 
> I recall you've spoken about the font file's ability to alter
> their sizes being a feature you've never seen used in practice,
> and maybe floated the idea of deprecating this as well, but if
> there's a ticket for that proposal, I couldn't find it.

I found the discussion, buried in bug #58930 (in a paragraph beginning "The
two exceptions").  You did say you'd never seen this feature used, but stopped
short of calling for its deprecation.  So my half-memory was half correct.






[bug #55154] .tr has undocumented and inconsistent space-character restrictions

2023-06-24 Thread Dave
Follow-up Comment #8, bug #55154 (project groff):

Everything else I wanted to say (here and in bug #64337) I ended up talking
myself out of.  All I have left are two minor comments on the documentation
update.

[comment #6 comment #6:]
> > This feature replaces the odd-parity tr mapping trick 

"replaces" implies that the odd-parity trick has gone away.  Maybe "augments"
or "parallels"?

> > Translation of a character to \~ is unnecessary.

This seems unnecessarily opinionated for technical documentation.  Arguably
all translations are unnecessary.  But they're a useful tool, allowing you to
(1) simplify or make more legible your input, or (2) treat certain characters
conditionally (e.g. based on the output device).  I'm not sure it's fair to
characterize mapping to \~ as any more or less necessary than mapping to any
other character.

The goal presumably being to steer users away from a mapping that may be
deprecated, saying so directly probably serves the reader better, e.g.:
"Translation of a character to \~ may be deprecated in a future release."






[bug #55154] .tr has undocumented and inconsistent space-character restrictions

2023-06-22 Thread Dave
Follow-up Comment #7, bug #55154 (project groff):

[comment #6 comment #6:]
> Possibly, we should try deprecating both of these uses of `tr`,
> so that we can say simply that `tr` remaps *characters* (ordinary
> or special).  Full stop.

Now that this proposal has its own ticket (bug #64337), I'll respond there to
the parts of this comment specifically related to that.

> I don't think \| and \^ are too much of a challenge here. 
> They're still not _characters_, and we could introduce new
> directives to the font description file syntax to replace them.

I recall you've spoken about the font file's ability to alter their sizes
being a feature you've never seen used in practice, and maybe floated the idea
of deprecating this as well, but if there's a ticket for that proposal, I
couldn't find it.






[bug #55154] .tr has undocumented and inconsistent space-character restrictions

2023-06-22 Thread G. Branden Robinson
Follow-up Comment #6, bug #55154 (project groff):

I may be groping toward an apology (in the formal, rhetorical, not
conversational, sense) in the course of recent plus pending changes to
groff_diff(7).

I had forgotten, and thus was semi-surprised, that ".tr a\ " (note trailing
space) was not accepted.

But that surprise was only semi- because, as James Clark pointed out, ".tr a",
more often seen as ".tr ~", is a truly bizarre feature.

I _suspect_ this feature crept in very early, before the "\ " escape sequence,
which gets you the same result without having to bother with a translation.

Clark stayed precisely within this non-orthogonality when supporting ".tr
a\~".

Possibly, we should try deprecating both of these uses of `tr`, so that we can
say simply that `tr` remaps *characters* (ordinary or special).  Full stop.

I don't think \| and \^ are too much of a challenge here.  They're still not
_characters_, and we could introduce new directives to the font description
file syntax to replace them.


hair-space-width 10
thin-space-width 20


I observe that the "\| and \^ in the font description file" trick is _not_
documented or implied in CSTR #54, though it does appear in CSTR #97.  Perhaps
by 1992, Kernighan began to think better of it.

Anyway, here's what is in my working copy.

> In GNU troff, the tr request can map characters to the unbreakable
> space escape sequence \~ as a special case (tr normally operates
> only on characters).  This feature replaces the odd-parity tr
> mapping trick used in AT&T troff documents, where a character, often
> ~, was "sacrificed" by mapping it to "nothing", drafting it into use
> as an unadjustable, unbreakable space.  (This feature was gratuitous
> even in early AT&T troff, which supported the \space escape sequence
> by 1976.)  Often, it makes more sense to use GNU troff's \~ escape
> sequence instead, which has been adopted by every other active troff
> implementation except that of Illumos, as well as by the non-troff
> mandoc.  Translation of a character to \~ is unnecessary.

Thoughts?






[bug #55154] .tr has undocumented and inconsistent space-character restrictions

2023-05-24 Thread G. Branden Robinson
Follow-up Comment #5, bug #55154 (project groff):

A subset of escape sequences interpolates special characters.

\` \' \- \_ \(xx \[xxx]

No escape sequences interpolate ordinary characters.

(I'm ignoring \*(xx and \*[xxx], which interpolate strings potentially
containing characters.) 






[bug #55154] .tr has undocumented and inconsistent space-character restrictions

2023-05-24 Thread Dave
Follow-up Comment #4, bug #55154 (project groff):

I also wonder whether this statement in groff.texi should be expanded: "The
following characters can't be translated: space (with one exception, see
below), backspace, newline, leader (and '\a'), tab (and '\t')."

It is not just ordinary space characters that cannot be translated: it is all
spaces (such as \0, \|, \^):

$ echo '.tr \|a' | groff
troff::1: error: expected ordinary or special character, got a
horizontal motion

One can argue that the current statement is sufficient:

1. Comment #2 points out that these other "spaces" are instead actually
horizontal motions.
2. The docs term these "escapes" rather than "characters," and .tr is
documented to work only on characters.

But there are potential rebuttals to both of these:

1. \| and \^ fall into a gray area, as they can be defined as characters in a
font file.
2. The above-quoted list of exceptions already includes two escapes.

I have no firm conclusion yet.






[bug #55154] .tr has undocumented and inconsistent space-character restrictions

2022-07-07 Thread Dave
Follow-up Comment #3, bug #55154 (project groff):

Yes, that makes sense; the documentation changes you speak of (commit
bf6ff8f2) refute my "illogically inconsistent" claim in the initial report.






[bug #55154] .tr has undocumented and inconsistent space-character restrictions

2022-07-05 Thread G. Branden Robinson
Update of bug #55154 (project groff):

Status: None => Need Info

___

Follow-up Comment #2:

Hi Dave,

With my recent terminological revisions and clearer distinction between
"spaces" and (horizontal) "motions", this puzzling behavior becomes more
understandable, I think.

Character translation targets have to be input sequences that are constitutive
of text.

That includes spaces, which are discardable and may be breakable or not, but
not _motions_, which are never either.

If we pretend that a font's word space is one en wide, and its numerals 1 em
wide, then the following inputs are equivalent.


.tr c\ \" translate to escaped space
.tr d\|
.tr e\^
.tr f\0

.tr c\h'1n'
.tr d\h'1m/6u'
.tr e\h'1m/12u'
.tr f\h'1m'


No one attempts to translate characters to horizontal or vertical motions, or
crazier things like device control escape sequences.  Thus the acceptance and
rejection pattern you see.
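
The acceptance and rejection pattern reported in this ticket can be written
down as a small lookup.  This is a toy classifier, not troff's parser: the
TargetKind enum and both functions are hypothetical names, and only the
inputs actually discussed in this thread are modeled (anything else that
begins with a backslash is filed under Other and rejected).

```cpp
#include <string>

// Hypothetical categories for a .tr translation target.
enum class TargetKind {
  OrdinaryCharacter,
  SpecialCharacter,
  Space,             // a plain space, as in ".tr a "
  UnbreakableSpace,  // \~, accepted as a special case
  Motion,            // "\ ", \|, \^, \0, \h'...'
  Other              // escape sequences this sketch does not model
};

TargetKind classify(const std::string &t)
{
  if (t == " ")
    return TargetKind::Space;
  if (t == "\\~")
    return TargetKind::UnbreakableSpace;
  if (t == "\\ " || t == "\\|" || t == "\\^" || t == "\\0"
      || t.rfind("\\h", 0) == 0)
    return TargetKind::Motion;
  // Escape sequences that interpolate special characters (comment #5).
  if (t == "\\`" || t == "\\'" || t == "\\-" || t == "\\_"
      || t.rfind("\\(", 0) == 0 || t.rfind("\\[", 0) == 0)
    return TargetKind::SpecialCharacter;
  if (!t.empty() && t[0] == '\\')
    return TargetKind::Other;
  return TargetKind::OrdinaryCharacter;
}

// Mirrors the behavior reported in the original submission: the first two
// translations are accepted, the four motion-like ones are rejected.
bool tr_accepts(TargetKind k)
{
  return k != TargetKind::Motion && k != TargetKind::Other;
}
```

For example, tr_accepts(classify("\\~")) is true, while
tr_accepts(classify("\\|")) is false.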

Do you buy this?  If so, I can update the document to make the corresponding
clarification.






[bug #55154] .tr has undocumented and inconsistent space-character restrictions

2021-08-25 Thread Dave
Update of bug #55154 (project groff):

Summary: .tr has undocumented and inconsistennt space-character restrictions
      => .tr has undocumented and inconsistent space-character restrictions

