[bug #58930] take baby steps toward Unicode

2022-11-14 Thread Dave
Follow-up Comment #33, bug #58930 (project groff): This bug has now spawned its fifth spinoff: I opened bug #63360 to document the concern raised here in comment #9.

[bug #58930] take baby steps toward Unicode

2022-11-12 Thread Dave
Follow-up Comment #32, bug #58930 (project groff):
[comment #28 comment #28:] (concerning U+2016)
> I defined it as "\[ba]\[ba]" rather than "||", which I did
> because I had already translated | into something else.
> fallbacks.tmac might consider doing the same
Among its other concerns, bug #6

[bug #58930] take baby steps toward Unicode

2022-11-07 Thread Dave
Follow-up Comment #31, bug #58930 (project groff): Well, it also seemed cleaner to give that particular problem its own space rather than grafting any ensuing discussion about it onto a bug that already had 30 comments.

[bug #58930] take baby steps toward Unicode

2022-11-07 Thread G. Branden Robinson
Follow-up Comment #30, bug #58930 (project groff): Thanks, Dave. I hadn't forgotten, but there was a risk that I would. :)

[bug #58930] take baby steps toward Unicode

2022-11-07 Thread Dave
Follow-up Comment #29, bug #58930 (project groff):
[comment #27 comment #27:]
> This can be a new bug report if that makes more sense.
Now bug #63332.

[bug #58930] take baby steps toward Unicode

2022-10-30 Thread Dave
Follow-up Comment #28, bug #58930 (project groff):
[comment #26 comment #26:]
> My aforementioned footnote macro defined an ersatz U+2016
> character largely as you do here,
I say "largely" because I defined it as "\[ba]\[ba]" rather than "||", which I did because I had already translated | into

[bug #58930] take baby steps toward Unicode

2022-10-25 Thread Dave
Follow-up Comment #27, bug #58930 (project groff): Hmm, the fallbacks.tmac changes in the resolving commit are having an effect on ASCII output I can't (yet) explain. If I run a groff _without_ this commit:
    $ echo 'I \[dg] you' | groff-latest -Tascii -mtty-char | cat -s
    I <*> you
This is the e

[bug #58930] take baby steps toward Unicode

2022-10-25 Thread Dave
Follow-up Comment #26, bug #58930 (project groff):
[comment #24 comment #24:]
> > > +.fchar \[u2016] || \" double vertical line (matrix norm)
> >
> > This one presents a kerning issue: if two U+2016s are set next to each other, they should have a little space between them.
>
> You've studied ker

[bug #58930] take baby steps toward Unicode

2022-10-23 Thread G. Branden Robinson
Update of bug #58930 (project groff):
Status: In Progress => Fixed
Open/Closed: Open => Closed
Follow-up Comment #25: commit 132182bd714a

[bug #58930] take baby steps toward Unicode

2022-10-23 Thread G. Branden Robinson
Update of bug #58930 (project groff):
Status: Need Info => In Progress
Follow-up Comment #24:
[comment #23 comment #23:]
> A few nits about specific definitions:
>
> [comment #21 comment #2

[bug #58930] take baby steps toward Unicode

2022-10-14 Thread Dave
Follow-up Comment #23, bug #58930 (project groff): A few nits about specific definitions:
[comment #21 comment #21:]
> +.fchar \[u2000] \h'1n' \" en quad
> +.fchar \[u2001] \h'1m' \" em quad
> +.fchar \[u2002] \h'1n' \" en space
> +.fchar \[u2003] \h'1m' \" em space
As the "quad" and "space" for

[bug #58930] take baby steps toward Unicode

2022-10-14 Thread Dave
Follow-up Comment #22, bug #58930 (project groff):
* The expanded Unicode coverage is great!
* fallbacks.tmac does seem like a better place for these definitions.
* I'm interested in your thoughts on the trade-off presented in comment #11, because it has bearing on some of your other proposed defi

[bug #58930] take baby steps toward Unicode

2022-10-13 Thread G. Branden Robinson
Follow-up Comment #21, bug #58930 (project groff): Dave, what do you think of this?
diff --git a/tmac/fallbacks.tmac b/tmac/fallbacks.tmac
index 428aaa2b7..02ac998a2 100644
--- a/tmac/fallbacks.tmac
+++ b/tmac/fallbacks.tmac
@@ -142,6 +142,49 @@
 .fchar \[oe] oe
 .fchar \[:Y] \z\[ad]Y
 .
+.fchar

[bug #58930] take baby steps toward Unicode

2022-10-08 Thread Dave
Follow-up Comment #20, bug #58930 (project groff): Ah. Yes. I read that. I just failed to put two and two together. I'll try to keep up.

[bug #58930] take baby steps toward Unicode

2022-10-08 Thread G. Branden Robinson
Follow-up Comment #19, bug #58930 (project groff):
[comment #18 comment #18:]
> [comment #17 comment #17:]
> > Setting Severity to "Blocker" so that I don't forget about this.
>
> You since unBlockered it, so, did you decide to forget about it after all, or did you come to a conclusion?
Neither.

[bug #58930] take baby steps toward Unicode

2022-10-08 Thread Dave
Follow-up Comment #18, bug #58930 (project groff):
[comment #17 comment #17:]
> Setting Severity to "Blocker" so that I don't forget about this.
You since unBlockered it, so, did you decide to forget about it after all, or did you come to a conclusion?

[bug #58930] take baby steps toward Unicode

2022-09-26 Thread G. Branden Robinson
Update of bug #58930 (project groff):
Severity: 5 - Blocker => 3 - Normal
Planned Release: None => 1.23.0

[bug #58930] take baby steps toward Unicode

2022-06-01 Thread G. Branden Robinson
Update of bug #58930 (project groff):
Severity: 3 - Normal => 5 - Blocker
Follow-up Comment #17:
[comment #16 comment #16:]
> And while I have your attention:
>
> Before 1.23.0, and potentiall

[bug #58930] take baby steps toward Unicode

2022-05-29 Thread Dave
Follow-up Comment #16, bug #58930 (project groff): And while I have your attention: Before 1.23.0, and potentially before rc2, there should be a decision about the question from comment #10. Leaving the commit in place is a valid choice; I just want to make sure it _is_ a deliberate choice rathe

[bug #58930] take baby steps toward Unicode

2022-05-29 Thread Dave
Follow-up Comment #15, bug #58930 (project groff):
[comment #14 comment #14:]
> I guess what we need to do here is be more clear whether
> \[u...] escape sequences are intended to represent input
> characters, or desired output glyphs.
Fair point. I don't have an opinion either way. But the ans

[bug #58930] take baby steps toward Unicode

2022-05-28 Thread G. Branden Robinson
Follow-up Comment #14, bug #58930 (project groff): Hi Dave,
[comment #13 comment #13:]
> "The input sequence '\[u00A0]' is _syntactically_ valid...but like '\[u]' and '\[u]', it's not _meaningful_"
>
> This is true of the current implementation but less true conceptually: U+ and U+F

[bug #58930] take baby steps toward Unicode

2022-05-28 Thread Dave
Follow-up Comment #13, bug #58930 (project groff):
[comment #0 original submission:]
> But if the input is some other encoding, preconv converts
> the character into the string "\[u00A0]", which groff does
> _not_ recognize.
The resolved bug #62300 has fixed preconv to emit "\~" rather than "\[u0

[bug #58930] take baby steps toward Unicode

2020-10-12 Thread Dave
Follow-up Comment #12, bug #58930 (project groff): The test case in comment #9 is probably the best way to see the kerning deficiency. The test cases presented in the 86b99bdbf58c8dd1a4036f4004a6d8518a5b8357 commit message

[bug #58930] take baby steps toward Unicode

2020-10-12 Thread Dave
Follow-up Comment #11, bug #58930 (project groff): It's a judgement call, certainly. Is a change that allows groff to handle U+2011 on some output devices but incorrectly kerns it better or worse than having groff emit a warning and discard the character? In the latter case, the user has a clear

[bug #58930] take baby steps toward Unicode

2020-10-08 Thread G. Branden Robinson
Follow-up Comment #10, bug #58930 (project groff): Hi Dave, Something I'm not clear on. Do you think 86b99bdbf58c8dd1a4036f4004a6d8518a5b8357 should be reverted?

[bug #58930] take baby steps toward Unicode

2020-09-22 Thread Dave
Follow-up Comment #9, bug #58930 (project groff):
[comment #4 comment #4:]
> And it appears to be a one-liner fix (morally).
Turns out another reason it's not quite that simple is this caveat from the info manual (which is documented, so it can't be a bug): "Only the current font is checked for.

[bug #58930] take baby steps toward Unicode

2020-08-19 Thread Dave
Follow-up Comment #8, bug #58930 (project groff):
[comment #2 comment #2:]
> Unicode considers U+2009 THIN SPACE and U+200A HAIR SPACE breakable...
> Groff... does not offer breaking versions of these spaces, and the only
> reason to add them would be strict compliance with a Unicode property
> th

[bug #58930] take baby steps toward Unicode

2020-08-15 Thread Dave
Follow-up Comment #7, bug #58930 (project groff):
[comment #4 comment #4:]
> just lamenting the total disjunctivity of the set.
That two of the three, intended to serve different purposes, are disjunct seems more laudable than lamentable. But I'm not here to police your feelings.
> I can't thi

[bug #58930] take baby steps toward Unicode

2020-08-15 Thread G. Branden Robinson
Follow-up Comment #6, bug #58930 (project groff):
[comment #5 comment #5:]
> On further investigation, it appears in fact to be 0% accurate. See bug #58962.
groff_char(7) is _full_ of problems with accuracy. It's on my (s)hit list. I recently fixed up the introductory material but it needs a

[bug #58930] take baby steps toward Unicode

2020-08-15 Thread Dave
Follow-up Comment #5, bug #58930 (project groff):
[comment #2 comment #2:]
> groff_char(7) (which I only now thought to check) says it
> maps to \~. But that appears to be less than 100% accurate:
On further investigation, it appears in fact to be 0% accurate. See bug #58962.

[bug #58930] take baby steps toward Unicode

2020-08-14 Thread G. Branden Robinson
Follow-up Comment #4, bug #58930 (project groff):
[comment #2 comment #2:]
> "\~" and "\ " _shouldn't_ be equivalent; they're documented as behaving differently.
No, not suggesting they should, just lamenting the total disjunctivity of the set.
> > The input string "\[u00A0]" being equivalent t

[bug #58930] take baby steps toward Unicode

2020-08-14 Thread Dave
Follow-up Comment #3, bug #58930 (project groff):
[comment #1 comment #1:]
> 2. The behavior of \: when used as the RHS of a .char request
> does indeed seem a bit strange.
Now its very own bug! Bug #58958.

[bug #58930] take baby steps toward Unicode

2020-08-14 Thread Dave
Follow-up Comment #2, bug #58930 (project groff):
[comment #1 comment #1:]
> 1. "U+00A0 NO-BREAK SPACE
>
> None of these are equivalent to the others. :-/
"\~" and "\ " _shouldn't_ be equivalent; they're documented as behaving differently. The input string "\[u00A0]" being equivalent to neither

[bug #58930] take baby steps toward Unicode

2020-08-14 Thread G. Branden Robinson
Update of bug #58930 (project groff):
Status: None => Need Info
Assigned to: None => gbranden
Follow-up Comment #1: It's a little demoral

[bug #58930] take baby steps toward Unicode

2020-08-10 Thread Dave
URL:
Summary: take baby steps toward Unicode
Project: GNU troff
Submitted by: barx
Submitted on: Mon 10 Aug 2020 09:56:06 AM CDT
Category: Core
Severity: 3 - Normal