from:"James K. Lowden"

Re: [Groff] error handling if Perl is not suitable

2013-04-13 Thread James K. Lowden

On Fri, 12 Apr 2013 21:48:49 +0200 (CEST)
"Bernd Warken"  wrote:

> It would make sense to install this Perl error handling at a single
> space somewhere in the configure place.
> 
> Has anyone an idea how this could be made or where one can learn how
> to do this.

have_perl:
	@command -v perl || ( echo "need perl" >&2; exit 1 )

Add that to the Makefile and make have_perl a prerequisite for some
early target.  
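
If you'd rather try the check by hand before wiring it into make, the
same idiom can be wrapped in a shell function.  This is only a sketch;
need_cmd is a made-up name, not anything groff's build provides.

```shell
# need_cmd NAME: return 0 if NAME is found by the shell, otherwise
# print a complaint on stderr and return 1.
need_cmd() {
    command -v "$1" >/dev/null 2>&1 || { echo "need $1" >&2; return 1; }
}
```

Then "need_cmd perl" behaves like the recipe above, minus the path that
command -v would otherwise print.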

Would that do what you want?  

--jkl

Re: [Groff] encapsulated postscript from pic

2013-04-23 Thread James K. Lowden

On Tue, 23 Apr 2013 15:34:34 -0400
Doug McIlroy  wrote:

> Is there a tool or trick for getting encapsulated postscript from pic?
> What I want is that the bounding box should have origin 0 0 and
> be just big enough to cover the picture.

AFAIK not with groff alone.

Ghostscript has ps2epsi.  ISTR PSTricks has something similar.  If PDF
is acceptable, dvipdfm (http://gaspra.kettering.edu/dvipdfm/) produces
an encapsulated PDF.  

I have the same need, and do wish it was simpler.  What better way to
generate diagrams for reference by HTML?  Until, that is, the browser
recognizes troff.  

HTH.  

--jkl

[Groff] dynamic grap ticks

2013-04-26 Thread James K. Lowden

Is there a cool way to pass "sh" output to grap's ticks command?  

I'd like to generate my x-axis "ticks" programmatically, from the
data I'm bringing in with "copy".  I have a line like this:

ticks bottom out at 6.4000e+01 "\s-1%.0f", 9.05096680e+01 "%.0f", \
1.2800e+02 "%.0f", 1.81019336e+02 "%.0f", 2.5600e+02 "%.0f", \
3.62038672e+02 "%.0f", 5.1200e+02 "%.0f", 7.24077344e+02 "%.0f", \
1.0240e+03 "%.0f", 1.44815469e+03 "%.0f", 2.0480e+03 "%.0f", \
2.89630938e+03 "%.0f", 4.0960e+03 "%.0f", 5.79261875e+03 "%.0f", \
8.1920e+03 "%.0f\s+1"

which came from the first column of the data file.  I tried many
variations of 

ticks sh { awk ... }

without any luck.  (Even less luck because there doesn't seem to be a
way to escape the "$1" I would like to pass to awk.) 

While I'm in the neighborhood, the grap manual says that labels stack,
but mine (version 1.41) doesn't seem to:

$ cat t
.G1
label top "abc", "def"
.G2

$ grap t
grap: syntax error
Error near line 2, file "t"
 context is:
label top "abc" >>> , <<<  "def"

Many thanks for any assistance.  

--jkl

Re: [Groff] dynamic grap ticks

2013-04-27 Thread James K. Lowden

On Sat, 27 Apr 2013 11:19:01 +0100
Ralph Corderoy  wrote:

> > I tried many variations of 
> > 
> > ticks sh { awk ... }
> > 
> > without any luck.  
> 
> Use `sh' to produce the whole `ticks' line in a temporary file and
> then `copy' to read it in?

Ah, thanks, Ralph!  

At first I tried 

ticks at copy "t"

which didn't work, but if I put "ticks at " at the front of the t file,
a simple 

copy "t" 

where the ticks line belongs does the tick, as it were.  

As an experiment, I then replaced that with

sh { cat "t" }

and saw my file splatted up at the top of the output, above the graph.
Now I finally understand something that, to me at least, isn't terribly
obvious from the documentation: the "sh" commands are executed first,
before graphing starts, not at the point they appear in the document.  
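
For the record, a minimal sketch of the temporary-file workaround in
shell.  It assumes a whitespace-separated data file whose first column
holds the x values; the function name make_ticks and the file names
below are made up for illustration, and the \s-1 ... \s+1 size changes
from the original ticks line are left out for brevity.

```shell
# make_ticks FILE: print one grap "ticks" line built from the distinct
# values in the first column of FILE, in order of first appearance.
make_ticks() {
    awk '
        # Remember each first-column value once, skipping blank lines.
        NF && !seen[$1]++ { x[++n] = $1 }
        END {
            if (n == 0) exit 1
            printf "ticks bottom out at"
            for (i = 1; i <= n; i++)
                printf "%s %s \"%%.0f\"", (i > 1 ? "," : ""), x[i]
            printf "\n"
        }
    ' "$1"
}
```

Run it ahead of grap, e.g.

    make_ticks data.d > t

and then use copy "t" in the grap source where the ticks line belongs.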

Regards, 

--jkl

[Groff] tables side-by-side

2013-10-26 Thread James K. Lowden

I'm confused about how best to lay out tables side by side.  I'm using
the ms macro set.  The main text stretches across the page (one
column), but I want the tables adjacent so that they can be
more easily compared.  

My basic approach is

.mk
.TS
 ... table 1 ...
.TE
.rt
.DS 3i
.TS
 ... table 2 ...
.TE

If the table is too long for the current page, and lands on the next
one, that doesn't work well at all, so I have

.KS
.mk
.TS
 ... table 1 ...
.TE
.KE
.rt
.DS 3i
.TS
 ... table 2 ...
.TE

In both cases, though, the second table isn't quite lined up with the
first.  I use .sp to move it down, sometimes by 5p, sometimes by 5
lines, depending on whether or not a keep was needed.  

I'm obviously doing something wrong, because 1) the second table doesn't
line up, 2) the way it doesn't line up varies, and 3) the keep
behavior varies according to where the table falls.  

What's the right way, please?  

--jkl

Re: [Groff] tables side-by-side

2013-11-02 Thread James K. Lowden

On Sat, 26 Oct 2013 19:16:02 +0100 (BST)
(Ted Harding)  wrote:

> Follow-up: I have found the source of your problem (the one
> to do with vertical displacement). When a table is output using
> the .TS macro, a "display drop" vertical spacing is added (and
> the same happens for .EQ using eqn, and in other contexts).
> 
> You can get rid of this (in ms macros) by setting the "display
> drop" to zero. 

Many thanks, Ted.  Setting DD to zero did indeed get it closer; I also
had to set PD to zero.  

DD isn't mentioned afaik in groff_ms or in usd-17.  That leaves me with
two questions:

1.  Do you find the best way to really understand -ms is to read the
macros?  I keep hoping just to rely on the documentation. 

2.  Is there a good/accepted convention for naming user-defined
registers?  I used

.nr DD1 \n(DD
...
.nr DD \n[DD1]

on the theory that /usr/share/tmac/ms.tmac is unlikely to use register
names longer than 2 letters. 

--jkl

Re: [Groff] tables side-by-side

2013-11-02 Thread James K. Lowden

On Mon, 28 Oct 2013 17:13:16 +0100
Tadziu Hoffmann  wrote:

> > The main text stretches across the page (one column),
> > but I want the tables adjacent so that they can be more
> > easily compared.
> > What's the right way, please?  
> 
> I would do it with diversions (see attached roff source and
> resulting pdf).

Thanks, Tadziu.  I had to study your example to understand it.  It
looks very robust because, except for TS/TE, it leaves the ms macros out
of the equation.  

I like to think I can float along atop -ms, and not punch through to
troff except when no -ms mechanism exists, such as local motion. Your
example demonstrates that when I start doing something -ms isn't really
set up for, it's better to drop into straight troff until the document
resumes its stately -ms pace.  

--jkl

Re: [Groff] tables side-by-side

2013-11-06 Thread James K. Lowden

On Sun, 03 Nov 2013 22:31:40 +0100 (CET)
Werner LEMBERG  wrote:

> >> DD isn't mentioned afaik in groff_ms or in usd-17.  That leads me
> >> with two questions:
> >
> > Agreed! It should be.
> 
> `DD' *is* mentioned:
> 
>   Other settings
> 
>   Reg.    Definition                         Effective    Default
> 
>   DD      Display, table, eqn, pic spacing   next para.   0.5v
>   MINGW   Minimum width between columns      next page    2n
> 
> 
> A search for `DD' in the man page should quickly reveal its
> existence :-)

Thank you for pointing that out, Werner.  

My system is a little old because the sysadmin is lackadaisical about
upgrades.  I scanned groff_ms.7 version 1.19.2 before posting.  I see
DD has been added since.  For which I am, of course, grateful.  :-)

--jkl

Re: [Groff] Typesetting dashes

2013-11-22 Thread James K. Lowden

On Wed, 20 Nov 2013 14:59:26 -0500
Doug McIlroy  wrote:

> > I can't think of a situation where you would want to mix point sizes
> > on a line.
> 
> A fairly common case is small caps, as in acronyms. Another is mixed 
> fonts (e.g. using Courier for computer literals) with different 
> x-heights for fonts of the same nominal point size.

As it happens, I'm working on a document right now that frequently
mixes point sizes on a line.  The body is in Roman font, and the word
"xterm" is set in Helvetica to distinguish it.  If I don't reduce its
size by one point, it looks too large against the rest.   

While I'm in the neighborhood, I wonder if commas in numbers get
special treatment?  Reading over my document, the number 34,800 looked
bad; the comma was squished over by the eight.  The effect was
especially noticeable when the comma trails a 7.  To correct, 

.ds xterm \s-1\fH\&xterm\f[]\s+1
.ds comma \h'-5M',\h'7M'

38\*[comma]400 bits per second

because 1p was too much.

I tried math mode, too, but it seemed a little spacey.  Other
suggestions?  Or is it just me?  

--jkl

Re: [Groff] Typesetting dashes

2013-11-22 Thread James K. Lowden

On Fri, 22 Nov 2013 00:41:52 -0500
Peter Schaffter  wrote:

> On Thu, Nov 21, 2013, James K. Lowden wrote:
> > While I'm in the neighborhood, I wonder if commas in numbers get
> > special treatment?  Reading over my document, the number 34,800
> > looked bad; the comma was squished over by the eight.  The effect
> > was especially noticeable when the comma trails a 7.  To correct, 
> > 
> > .ds xterm \s-1\fH\&xterm\f[]\s+1
> > .ds comma \h'-5M',\h'7M'
> > 
> > 38\*[comma]400 bits per second

Hi Peter, 

> 1.  Why aren't you using the ever-so-handy '.fzoom' for your
> Helvetica?

Ah, that would seem to be because I'm using out-of-date documentation.
I have 1.21 installed on another machine, but the MANPATH still
includes 1.19. Thank you for the pointer.

> 2.  Do your numbers have to line up?  If so, make sure kerning's
> disabled.  That solves the comma problem.

By which I suppose you mean the .kern request?  It doesn't seem to
matter; please see the attached PDF output for this -ms file:

.NP
.ds comma \h'-5M',\h'7M'
.IP kerned
2,261,760 
.IP unkerned
.kern 0
2,261,760 
.kern
.IP custom
2\*[comma]261\*[comma]760 

Looking at the PDF with 400% magnification, I don't see a difference.
Looking at the ditroff (is that what to call it?) I think I know
why: the commands look identical.  Both output "t2,261,760" for the
number; see below.  

BTW, I'm not satisfied with any of them.  groff places the comma too
far east for my taste; I suppose I think the rule should be that it
land exactly halfway between the numbers, perhaps one pixel west.
Agreed the 1 and 7 are too far apart.  I didn't want to fuss with
individual digit pairs.  

In this example, I'm using groff 1.19, which is the stock install on my
yellowing OS X machine.  I normally don't pay a lot of attention to
which version I'm using, but perhaps in this case it matters?  

--jkl

x T ps
x res 72000 1 1
x init
p1
V84000
H72000
x X devtag:.col 1
x font 5 TR
f5
s1
V84000
H72000
md
DFd
tk
H76900
terned
wh2500
n12000 0
V96000
H97000
t2,261,760
n12000 0
V111600
H72000
x X devtag:.col 1
V111600
H72000
tunk
H86900
terned
wh2500
n12000 0
V123600
H97000
t2,261,760
n12000 0
V139200
H72000
x X devtag:.col 1
V139200
H72000
tcustom
wh2500
n12000 0
V151200
H97000
t2
H101500
t,
h700
t261
H119200
t,
h700
t760
n12000 0
V768000
H504000
n12000 0
x trailer
V792000
x stop

kern.pdf
Description: Adobe PDF document

Re: [Groff] Future direction of groff

2014-02-04 Thread James K. Lowden

On Tue, 4 Feb 2014 15:43:11 -0500
"Eric S. Raymond"  wrote:

> I have to say, unfortunately, that I think the entire
> presentation-centric model within which groff lives has just about run
> its course.  The future belongs to structural markup and stylesheets,
> because of the requirement for rendering in multiple output media
> including the Web.

Hmm, seems to me every document is presentation-centric, depending on
what that means.  Are you suggesting Postscript and PDF are not long
for this world?  Are we doomed to the eyesores produced by
lousy-browser ebook readers?  

> The one thing I think we could usefully salvage from the groff model
> is the notion of stacked DSLs for special formatting tasks - pic, eqn,
> grap, chem and the like.  

Yes, yes!  Say it, brother!  Rebarbative it may be, but at least troff
syntax was intended for human beings to use.  

> Cutting them loose from their groff-centric assumptions and making
> them generate more modern low-level formats like XSL-FO and SVG is a
> groff2 I could get behind 

Nothing stands in the way of another post-processor, groxslfo, right?
But then what?  What device recognizes that?  I know there's a
tremendous bunch of XMLy things, but generally they end up producing
Postscript or HTML.  The longer the chain, the more complex, and the
more difficult to promulgate formatting decisions.  

I wonder what presentation-centric -- I think you mean "paper-centric"
-- features troff is beholden to.  What is different about nonpaper
devices, that prevents troff by design from producing documents for
them?  

Margins?  Paper length?  Fonts?  I don't see how they stand in the
way.  On the contrary, the world would be a much better place if web
browsers natively rendered -ms macros (or similar).  It would make
writing documents -- and web pages! -- easier, and go some way toward
unifying the mess of formats and tools we use for technical
documentation.  

Take that one step further: suppose a version of xterm existed that
natively recognized ditroff instead of the boneheaded VT-100 standard?
Such a terminal could render printer-quality graphics in the
*terminal*, transforming assumptions (and experience) at the command
line.  

I'm sure there are some 1970's artifacts that could be dropped, and I
applaud (again) your insight that the input syntax need not be tied to
paper or PDF output.  But I'm not convinced that ditroff or its model
are obsolete.  In fact everything is a printer, just more or less.  

--jkl

Re: [Groff] Future direction of groff

2014-02-04 Thread James K. Lowden

On Tue, 4 Feb 2014 23:08:02 -0500
"Eric S. Raymond"  wrote:

> > I wonder what presentation-centric -- I think you mean
> > "paper-centric"
> > -- features troff is beholden to.  What is different about nonpaper
> > devices, that prevents troff by design from producing documents for
> > them?  
> 
> Oh, Goddess.  Page one, chapter one...go take a long look at
> doclifter.
...
> I speak with authority here, having been the guy who *solved* this
> problem after the XML-Docbook crowd told me it was impossible.
> 
> They were wrong, but they weren't far from being right.

Yes, I know.  I haven't looked at it in a while, but I digested TAOUP.
I don't mean to slight your work.  

What I'm suggesting is that your AI system to create structure from
naught was necessary only because the problem is posed that way.
Nothing inherent in the English prose or the formatting instructions
requires the semantics or the nested-object syntax model.  Those are
imposed by your target, unfortunate artifacts of the misbegotten,
unsupported belief that documents need to be structured that way.  

I didn't ask you why it was difficult to conjure structure where none
was intended.  My question is much simpler: what about the troff
*model* of presentation prevents it from generating web-digestible
artifacts?  You say "groxslfo" is unnecessary work.  In your expert
opinion, why?  

To me everything on the far side of XSL-FO is unnecessary work, as is a
fair bit on the near side.  Unless, I suppose, your goal is to plug
into that environment.  

--jkl

Re: [Groff] space width

2014-02-18 Thread James K. Lowden

On Sat, 25 Jan 2014 16:15:28 -0700
Dave Kemper  wrote:

> On 11/18/13, Tadziu Hoffmann  wrote:
> > In the original troff (according to the Troff User's Manual)
> > a space was nominally 1/3 em and a thinspace was 1/6 em,
> > thus half a normal space.  In groff's TR font, a space
> > is nominally 1/4 em, but a thinspace is still only 1/6 em.
> > Isn't that strange?
> 
> I thought of this two-month-old post when I found
> http://www.heracliteanriver.com/?p=324, a lengthy article that
> exhaustively documents that until the early to mid 20th century,
> standard typesetting practice was to put more space between sentences
> than between words.  

http://xkcd.com/1285/

Published 2013-11-01, in eerie anticipation of Tadziu's post.  

--jkl

Re: [Groff] The future redux

2014-02-25 Thread James K. Lowden

On Tue, 25 Feb 2014 11:06:09 -0500
"Eric S. Raymond"  wrote:

> More precisely, it is not the presence of presentation-level requests
> from the year zero that makes groff-as-it-is unfit to play in the
> semantic-markup world, it is the fact that macro packages presently
> *cannot disable access to the lower level*.

Yes.  If macros can be "enforced" then the input language can
reliably be sent to another processor for independent interpretation.  

> man pages don't really need expressive typography.  

Man pages are constrained by xterm.  A better display system would
invite tables, graphs, equations, and links.  

> That is, I don't see any reason why a combination of stylesheets with
> in-document processing instructions to declare local exceptions

How are "local exceptions" different in kind from dropping in a bit of
raw troff?  

> In actual fact the stylesheet-based engines are not quite that good
> yet.

Despite two decades of development.  To me, any technology that fails
to meet its own objective in that much time has demonstrated it never
will.  That doesn't make the technology useless; it means the objective
is probably not attainable.  

I guess I'm labelled a presentationalist, but I'd like to take (local)
exception to it.  

Semantic markup is fine for its purpose.  If you want to denote one
string as a filename and another as a book title, and later decide all
filenames and titles will share some presentation characteristics,
that's just great.  I miss that sort of thing in ms, and even
sometimes in mdoc.  

But do you seriously think every presentation choice can be mapped to a
semantic one?  Is there a markup language that doesn't admit
straight-up bold and italic?  Doesn't the use of a stylesheet language
inherently constrain presentation?  

Even if it doesn't, no semantics can escape the influence of the
presentation medium.  The document is affected by the medium in which
it's presented. McLuhan said as much, and it's confirmed by our
experience.  Show me your favorite example of a single source document
that is rendered equally well in HTML and PDF.  I suspect readers of
this list will find aspects that favor one or the other.  

Semantic markup wasn't invented for multiple media.  It was invented
for multiple *authors*.  More precisely, it allowed authors to express
ideas meaningfully, and non-authors to decide questions of
presentation.  In one medium.  

At the time no one thought to call them Content Providers.  

That's telling.  We have this idea today -- because of the web, because
as programmers we're trained in abstraction, because publishers baldly
treat texts as mere your-content-goes-here filler -- that documents are
just stuff to pour in some vessel.  And sure, it can be done, if not
beautifully then at least usefully.  Hesse called it "the radio music
of life": the intent survives, despite the vessel.  

So let's please keep these two ideas distinct in our minds.  Semantic
markup is great for semantic purposes.  The goal of "write once,
read anywhere" involves either compromises or significant effort toward
tailoring for each presentation medium.  

But that was never the goal in the first place.  The goal of all
writing is to be *read*.  

Thanks for listening.  

--jkl

Re: [Groff] The future redux

2014-02-26 Thread James K. Lowden

On Wed, 26 Feb 2014 11:46:32 +
Ralph Corderoy  wrote:

> > > man pages don't really need expressive typography.  
> > 
> > Man pages are constrained by xterm.  A better display system would
> > invite tables, graphs, equations, and links.  
> 
> I don't think they are.  Or they didn't used to be.  It was common to
> see man pages with `.if n' and `.if t', with the troff presenting the
> same data in better form, e.g. ASCII art versus pic(1).  man pages
> used to be commonly printed and high-quality output desired

Hi Ralph, 

Like Deri, from time to time I render a man page with -Tps, when I want
to look over it carefully or find myself referring back to it while
working on something unfamiliar or fiddly.  But I would bet 4 people in
5 who type "man foo" don't know there's a typesetter behind it.  

So many people are so accustomed to nroff output of man pages that most
web sites emulate its single worst characteristic, monospace fonts.
And the results are either comical or tragic:

https://developer.apple.com/library/mac/documentation/darwin/reference/manpages/man7/groff_char.7.html
or
http://man.cx/groff_char(7)

Apparently you can have either acceptable formatting with monospace
fonts and forgo knowing what Å looks like, or you can see the character
while imagining how the page should be formatted.  :-(  

I submit to you that if our command-line environment weren't still using
1980s technology to emulate 1970s hardware, we would have more
graphical and unified documentation.  In other words, the terminal is
the problem.  

Luckily, the terminal is also the solution.  Or, rather, a different
terminal would be.  I call it VT-roff:

http://www.schemamania.org/troff/vt-roff.pdf

Just a small matter of programming.  ;-)  

--jkl

Re: [Groff] man pages (tangential to Future Redux)

2014-03-02 Thread James K. Lowden

On Sat, 1 Mar 2014 07:55:08 -0500
"Eric S. Raymond"  wrote:

> What we have now in the Linux/Unix documentation world is a large
> pre-hypertext pile of documents with no link structure (manual pages) 
> and a smaller, weirder one (info) with a sort-of half-assed link
> structure. My goal is to level the walls around both and merge them
> into the Web.

Hmm, so SEE ALSO is not a link structure?  Because it's semantic markup
without tools to, er, render the link structure operable?  

I think I understand your affection for HTML.  The browser exists and
functions, and continues to be actively developed.  If all
documentation were in the browser, it could be cross-referenced and
viewed wherever a browser is available.  I'd like something better, and
I bet you would, too, but it's what we have.  

I don't share your enthusiasm for the browser.  I find the browser
an inconvenient UI.  I particularly dislike the DocBook-inspired
page-per-section style, where I have to click on the "next page" link
for practically every paragraph.  

Back in the terminal, while I wish for a better viewer than less(1), I
rely on its search capability -- which, unlike most browsers, uses
regular expressions -- to find sections or things I vaguely remember.
The bash reference manual thankfully reverted to a simple manpage a few
years back; now "/:-" immediately jumps to parameter expansion, and
"/^FILE" jumps to the configuration files.  

I can't fathom asciidoc as a standard.  It's demonstrably less
expressive than mdoc (or DocBook).  Once you get past a few simple
things -- titles, lists, bold and italic -- the metacharacter strings
build up and become just as arbitrary and weird as anything else.  And
still you can't draw a picture.  

Compared to a lot of people on this list, I'm a newtimer.  I arrived at
groff having resisted at every turn.  I wrote HTML, CSS, DocBook.  I
used txt2man to avoid learning mdoc.  Then one afternoon after I
couldn't get it to do what I wanted, I sat down and converted the whole
set (about 10 pages) to mdoc.  It wasn't so hard once I got over the
prejudice that it looked like Wordstar.  

What I discovered, in other words, is that in troff the writer
asymptotically approaches its capability.  The "simpler" the system, by
contrast, the more likely he will exponentially approach its
limitations.  

--jkl

Re: [Groff] Mission statement

2014-03-17 Thread James K. Lowden

On Mon, 17 Mar 2014 21:49:49 -0400
Peter Schaffter  wrote:

> Serving PDF manuals from the web has drawbacks, possibly
> serious ones: it assumes an installed PDF reader, and there's
> significant latency involved with firing one up.

In 2014, is there a single web browser remaining not configured to
display PDFs?  

Is the latency significant, really?  For an interactive application, a
second or two matters.  For documentation that takes time to read
anyway, I'm not sure it's that important.  

--jkl

Re: [Groff] Mission statement, second draft

2014-03-19 Thread James K. Lowden

On Tue, 18 Mar 2014 18:13:11 -0400
"Eric S. Raymond"  wrote:

> * Strange, irregular, archaic-seeming markup design compared to XML or
> even TeX.  Brian Kernighan called it "rebarbative" in *1979*.

Yes, and typeset "D is for Digital" with groff in 2011.  Also available
for Kindle.  

More telling is the next paragraph, 

"My first thought (a thought shared by many others) was that
this would be a glorious opportunity to replace TROFF with a new
formatting language: better designed, easier to work with, and of
course much faster. This remains a desirable goal, but, after quite a
bit of thought spread over several years, I am still not really much
closer to a better design, let alone an implementation."  

I likewise have yet to see a better syntax.  I've seen different, used
DocBook, digested the semantic-markup argument.  I've not seen better.

I do not understand why anyone would say XML is better.  It was not
intended as a syntax for manual input, and looks it.  It does its
level best to hide the prose amid the tags.  If that's modern, then
modern has much to learn from archaic-seeming.  

I apologize for going off-topic.  It's just that I wouldn't want
Kernighan's meaning to be misconstrued.  He wasn't condemning troff
syntax.  He was admitting, however objectionable it might be, that he
found it impossible to improve on.  As has everyone since, claims to the
contrary notwithstanding.  

--jkl

Re: [Groff] Letterspacing

2014-03-27 Thread James K. Lowden

On Thu, 27 Mar 2014 19:29:13 -0400
Peter Schaffter  wrote:

> Would it not make more sense to have groff, more or less as-is,
> shoulder more of the burden of what we do manually, *using the same
> strategies*, to achieve better *lines*, rather than focussing on
> the whole paragraph?

This is an excellent idea.  Your key insight is that what you do by hand
could be automated with very little impact on the overall system, to
useful effect.  I doubt it's been fully explored because the problem of
"how to set a paragraph" is considered solved, by TeX.  

It occurs to me that an algorithm that aims only at a better line -- for
some value of "better" -- is at a disadvantage versus the
paragraph-at-once approach.  In considering a line, a per-line
algorithm cannot steal letters from the previous line, the one already
set.  The best it can do is pack more letters on the current line.  

That might be OK.  It's what you do (IIUC) manually, because too much
whitespace is the major bugaboo.  

It also might be improved by a lookahead rule: rather than looking
strictly at the current line, consider the potential impact on the next
one, and perhaps "donate" a character or two to the line yet to come.  

--jkl

Re: [Groff] Formatting algorithm

2014-05-04 Thread James K. Lowden

On Sat, 3 May 2014 14:23:10 -0400
Peter Schaffter  wrote:

> > A straightforward way to pull this off would be to actualize the
> > notional copies of groff by forking. There would be one copy going
> > forward from each line break. That would evaluate the cost of
> > breaking at each word (or hyphenation point) on that line. At each
> > line break the copies would rendezvous to see which process should
> > be cloned to continue. Output of each process, both to standard
> > output and standard error, would be treasured up and only the
> > ultimate winner's output would finally be released.
> > 
> > This model is somewhat formidable.
> 
> No kidding.  And I fear it might break one of groff's greatest
> strengths, which is minimal demand on system resources.  

Are you really concerned about minimal resource requirements, or speed
of processing?  

Doug's description retains a one-pass algorithm, which is key for
speed.  A parallel approach (forking) will only work better and better
as Moore's Law continues to make our machines increasingly parallel.
Meanwhile, memory requirements remain minimal because the fundamental
size of the input -- the paragraph -- will not change.  

However formidable, fork-and-join is well suited to the problem and to
the foreseeable evolution of hardware.  

> I suspect I'm a voice crying in the wilderness here

Nah, not as long as you stay subscribed!  :-)

> but we need to consider that a greedy algorithm is almost always
> faster than a dynamic programming solution;

I wouldn't assume that, or assume that it matters.  

--jkl

[Groff] how to break a footnote

2014-05-19 Thread James K. Lowden

What do you do about long URLs in footnotes?  

.FS
http://db.lcs.mit.edu/projects/cstore/vldb.pdf
.FE

engenders a complaint in a 2C document:

dbformat.ms:983: warning [p 11, 3.6i, div `fn@div', 0.0i]: can't break line

(I suppose I should be able to read that, but I haven't taken the time
to discover what fn@div might mean!)  

man 7 groff tells me 

\:     Inserts a zero-width break point (similar to \%
       but without a soft hyphen character).

but nothing I do with it seems to matter, e.g. 

.FS
http://db.lcs.mit.edu/projects/\:cstore/vldb.pdf
.FE

changes the warning, but not the appearance.  
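
Adding \: by hand gets tedious, so here is a sketch that puts one after
every slash in the path (but not inside the "://").  The function name
url_breaks is invented for illustration, and whether more break points
actually cure the warning in a narrow column is an assumption to test,
not a given.

```shell
# url_breaks URL: print URL with the groff escape \: after each slash,
# leaving the "://" after the scheme untouched.
url_breaks() {
    printf '%s\n' "$1" |
        sed -e 's|/|/\\:|g' -e 's|:/\\:/\\:|://|'
}
```

The output can be pasted between .FS and .FE in place of the bare URL.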

What am I missing?  

--jkl

Re: [Groff] Markdown to MOM Using Pandoc

2014-07-23 Thread James K. Lowden

On Wed, 23 Jul 2014 16:14:55 -0400
Yves Cloutier  wrote:

> The Plan

As a groff fan, I find this idea attractive.  Not because I'd use it,
but because it could make Markdown more functional and groff more
approachable.  

If I might offer a suggestion, take a page out of pic and grap: think
of your program as a preprocessor, and allow lines starting with a dot
to pass through transparently.  If you take pains to look for certain
pairs such as .TS/.TE, your user could embed troff requests and
intersperse other preprocessor code within a Markdown document.  
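
A rough sketch of that pass-through rule, as an awk filter wrapped in a
shell function.  Everything here is assumed for illustration: the name
passthrough, the "MD: " prefix standing in for whatever real Markdown
conversion you do, and the particular list of preprocessor pairs (tbl,
eqn, pic, grap).

```shell
# passthrough: read a mixed Markdown/troff document on stdin.  Lines that
# start with a dot, and everything between a preprocessor start/end pair,
# are copied unchanged; all other lines are tagged for conversion.
passthrough() {
    awk '
        # Opening macro of a preprocessor block: start copying verbatim.
        /^\.(TS|EQ|PS|G1)([ \t]|$)/ { inside = 1 }

        # Inside a block, or any line starting with a dot: pass through.
        inside || /^\./ {
            print
            if ($0 ~ /^\.(TE|EN|PE|G2)([ \t]|$)/) inside = 0
            next
        }

        # Everything else is Markdown; the prefix marks where the real
        # conversion would happen.
        { print "MD: " $0 }
    '
}
```

The block check comes first so that table data, which rarely starts
with a dot, is not mistaken for Markdown.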

I'm sure you understand the advantages.  Among others, it would give
Markdown access to eqn.  AIUI the Markdown alternative for
equations is embedded LaTeX.  I'd think the last thing anyone wants is
a document processed by two typesetters!  

> Right now I have hardcoded some default values into the script that
> generates the MOM code, but the idea will be that all those settings
> that relate to HOW you want your document to look (like page size,
> margins, heading styles) would be specified in a separate file, like
> a stylesheet, which you can specify to use when the document is being
> compiled. This would allow for easily changing the look of your
> document simply by specifying a different "stylesheet".

Do you use stylesheets for documents?  I've been writing on a computer
since the days of Wordstar, and never felt the need.  I think most
people rely on the defaults and adjust them as needed, per-document.  

In any case, groff already has conditional processing and the .so
request.  A stylesheet, if that's what's wanted, can be constructed from
them.  In conjunction with the \V escape, the stylesheet could be
included based on the value of an environment variable.  

--jkl

Re: [Groff] \o and \z do not work for -Tutf8

2014-08-04 Thread James K. Lowden

On Mon, 04 Aug 2014 23:19:41 +0100
Keith Marshall  wrote:

> OTOH, when a typewriter overstrikes, the accumulation of all glyphs
> struck remains indelibly impressed on the printed page.

[OT]

The symbol set of APL took advantage of that property.  The language
defines some 60 symbols, more than fit on the special-order APL
Selectric typeball.  Some symbols were composed using the backspace.  

http://www.quadibloc.com/comp/aplint.htm
"The letters of the alphabet could be overstruck with squares, circles,
or several other symbols to increase the number of available variables."

--jkl

Re: [Groff] Automake migration proposal

2014-08-11 Thread James K. Lowden

On Mon, 11 Aug 2014 08:06:16 +0200 (CEST)
Werner LEMBERG  wrote:

> >>  extern "C" const char *Version_string = "1.22.2";
> > 
> > extern "C" const char *Version_string;
> > const char *Version_string = "1.22.2";
> > 
> > but I'm not sure if it's the proper way to solve this warning.
> 
> Does
> 
>   extern "C" {
> const char *Version_string = "1.22.2";
>   }
> 
> work better?

It should.  Stroustrup discusses the difference in 9.2.4 of The C++
Programming Language.  His examples: 

extern "C" { 
int g1; // definition
extern int g2;  // declaration
}

extern "C" int g1;  // declaration

Although, he says, it "looks odd at first glance", it's a consequence
of the fact that the first form admits other code being #included.  

In light of that, 

extern "C" const char *Version_string = "1.22.2";

is problematic because storage for the pointer hasn't been defined.  

My preference, though, would be 

extern "C" {
const char Version_string[] = "1.22.2";
}

unless there is a need to have Version_string point to something else. 

--jkl

Re: [Groff] Overview, Sept. 2014

2014-09-11 Thread James K. Lowden

On Wed, 10 Sep 2014 11:49:37 +0200
Ulrich Lauther  wrote:

> other modifications would really
> improve readability and maintainability:
> - capitalization of class names
> - a naming convention for class member variables
> - reducing the number of global variables

You want to tread lightly where style is concerned.  Whether or not
something is more "readable" depends very much on what you're used to.
There's no consensus in the C++ community at large on the above
recommendations.  

Many people are accustomed to capitalizing classes and decorating
member names.  I think the first is the convention in Java, and the
latter was popularized by Microsoft's ugly "m_varname" convention.   In
his books, Stroustrup uses capitalized class names and ordinary,
undecorated variable names.  

Stroustrup has observed that if you ask a room of experts for
suggestions on how to improve C++ and make it more accessible to the
beginner, you'll be deafened by silence.  If you want a lively
discussion, he says, ask where the curly braces should go.  

> - for each class a block of comments explaining what the
> class is all about

For this particular suggestion to appear on the groff list is a little
ironic, no? Since the epoch Unix source code has been documented with
man pages adjacent to the source in the tree, with its rich formatting
features. Surely in-line text-only documentation as comments would be a
retrograde step, and a long one at that?  Else we might as well close
up shop and rename the project Doxygen!  

--jkl

Re: [Groff] Overview, Sept. 2014

2014-09-11 Thread James K. Lowden

On Thu, 11 Sep 2014 19:57:43 +0200
Ulrich Lauther  wrote:

> As I understand it, the man-pages are directed at the user of a
> program who wants to know WHAT a program is supposed to do and how
> she can control it via options.

On Thu, 11 Sep 2014 14:48:51 -0600
Clarke Echols  wrote:

> Ulrich is correct.  I was responsible for the HP-UX manpages at HP for
> most of five years.
> 
> Manpages (I always used a combined single word) are for users,
> programmers, and system administrators.

I think there's some small misunderstanding.  I was referring to

> >>>  - for each class a block of comments explaining what the
> >>> class is all about

I understood that to refer to the reason the class exists and what it
does.  That's not significantly different from section 3 of the manual;
the user is the programmer, and the purpose is to explain the *what*,
not the how.  

If by "all about" Ulrich meant a sentence or two, sure, a comment block
is fine.  If by "all about" he meant a description of the semantics of
the public interface (which is what I thought he meant) then ISTM that
belongs in a manual, not in the source code.  N'est-ce pas?

--jkl

Re: [Groff] License for files with »ideal« parts

2014-09-16 Thread James K. Lowden

On Tue, 16 Sep 2014 18:42:23 +0200 (CEST)
Carsten Kunze  wrote:

> is there any (AT&T) ideal(1) (open :-) source code anywhere?  I
> wonder that it has been developed at AT&T but it is not in DWB or
> Plan9...

Not from AT&T.  I corresponded with Chris Van Wyk a few years ago, when
he told me ideal belongs to Alcatel-Lucent and is still in use.  

BTW, in looking for it, I stumbled on
http://www.eprg.org/papers/202paper.pdf, "Revisiting a Summer Vacation:
Digital Restoration and Typesetter Forensics", which is both
interesting and, to the right person, entertaining.  

--jkl

Re: [Groff] [Heirloom] Double word space after :

2014-11-12 Thread James K. Lowden

On Wed, 12 Nov 2014 18:08:12 +0100 (CET)
Carsten Kunze  wrote:

> by default Heirloom troff inserts a double word space if a line ends
> with ":".  Is this correct US English typography?

It is not incorrect.  Typographical convention has varied over time and
treatment of the colon along with it.  So, "correct" is hard to pin
down.  

I was taught 500 moons ago that a colon may be followed by one or two
spaces depending on purpose.  Examples:  

1.  There is only one thing to fear: fear itself.  
2.  Proceed as follows:  First, assume a can opener.  

The troff behavior would seem to support that notion.  If the
succeeding clause is independent, put it on a different line and let
troff treat it as end-of-sentence.  If it's not, leave it in the
running text and let troff treat it as end-of-word. 

--jkl

P.S. What is the German practice?

Re: [Groff] condition: OR of two string comparisons

2014-11-16 Thread James K. Lowden

On Sun, 16 Nov 2014 15:17:14 +0100
Tadziu Hoffmann  wrote:

> > [...] maybe we should bear in mind having a .elif and .else
> > too that don't need the .ifx to be .iex.
> 
> I don't think that's possible.  Since the code can't look ahead,
> it will not know whether an else is coming or not, so the "if"
> has to know whether (.ie) or not (.if) to save the test outcome
> for a following "else".  

Maybe I'm missing something, but from here it looks possible.  True, you
need a stack, but not very much, and no look-ahead.  

Every time you hit an "if", put T/F on the stack indicating the value of
"else". When you hit an else, do or don't, according to the saved
value.  When you get to "end if", pop the value (whether or not an
"else" was found).  
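
Here is that scheme as a minimal sketch in Python, just to show that
no look-ahead is needed.  It is not groff's implementation; the
instruction names ("if", "else", "endif", "text") and the function run
are invented for the example.  Each stack entry also remembers whether
the enclosing branch was live, so a nested "if" inside a skipped branch
stays skipped.

```python
def run(program):
    """Interpret ('if', cond), ('else',), ('endif',) and ('text', s) tuples.

    Returns the list of text items that end up being emitted.
    """
    out = []
    stack = []     # per open "if": (was the enclosing branch live?, run the else?)
    active = True  # are we currently emitting text?
    for op, *args in program:
        if op == "if":
            cond = bool(args[0])
            # Save the value of "else" now; nothing later needs to be peeked at.
            stack.append((active, active and not cond))
            active = active and cond
        elif op == "else":
            active = stack[-1][1]    # do or don't, according to the saved value
        elif op == "endif":
            active = stack.pop()[0]  # pop whether or not an "else" was seen
        elif op == "text" and active:
            out.append(args[0])
    return out
```

The stack holds one small entry per open "if", so its depth is the
nesting depth of the conditionals and nothing more.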

> Else we needlessly fill up the stack, which is simply bad design.

Depends on "fill" and "needlessly".  Branches don't usually nest very
deeply, and "else" is useful functionality.  

--jkl

Re: [Groff] condition: OR of two string comparisons

2014-11-26 Thread James K. Lowden

On Wed, 26 Nov 2014 17:31:44 +0100
Tadziu Hoffmann  wrote:

> The syntax of expressions and conditionals isn't what's keeping new
> users from using groff. As Peter has already pointed out, the entire
> working model of roff presents a much greater mental hurdle than the
> one given by an idiosyncratic expression syntax.  

If I could add to that, this is one user for whom expression syntax
represents no hurdle at all because as far as I know I've barely used
it. One can write fairly complex documents, leaning on ms, tbl, pic,
and eqn without ever writing a macro.  

The tide of new users is held to a trickle surely by the rebarbative
syntax.  IMO anyone willing to use find(1) or vi should cotton to groff
if they give it half a chance.  But I've never met anyone who'd
consider it for anything except man pages. 

> I use groff because it's a programmable text formatter that
> despite its peculiarities is based on a relatively simple
> conceptual model that I can understand and that I can usually
> coax into doing what I want.  

That would be me, too.  :-)

--jkl

Re: [Groff] Setting Text Along A Curve

2015-01-18 Thread James K. Lowden

On Sun, 18 Jan 2015 19:49:38 - (GMT)
(Ted Harding)  wrote:

> And the "Blue Book", i.e. the "PostScript Language Tutorial and
> Cookbook" can be downloaded (it was published in 1985) from:
> 
>   http://www-cdf.fnal.gov/offline/PostScript/BLUEBOOK.PDF
> 
> The circular text example, with its generating PostScript code, is
> given as Program 10, pages 168-171, of this version, in the Cookbook
> section. Program 11 (pages 172-175) gives the "Placing Text Along an
> Arbitrary Path" example, along with the generating code.

In musing about PostScript I came across "Mathematical Illustrations"
by Bill Casselman, http://www.math.ubc.ca/~cass/graphics/manual/ and
his example of text-on-a-path on page 3 of the preface,
http://www.math.ubc.ca/~cass/graphics/manual/pdf/preface.pdf.  The
technique is explicated in Appendix 7.  

--jkl

Re: [Groff] Building a troff parser

2015-03-05 Thread James K. Lowden

On Tue, 3 Mar 2015 01:00:35 -0500
Eric Andrew Lewis  wrote:

> In short, I'd like to make a program that does this:
> 
> $ explain "rm -rf *"
> rm -rf *
> ??? rm   remove files or directories
> ??? -r   remove directories and their contents recursively
> ??? -f   ignore nonexistent files, never prompt
> ??? *Remove (unlink) files matching this text pattern.

Apologies for the OT turn I'm about to take, but you happen to have
tickled one of the reasons I decided to learn troff in the first
place.  

30 years ago I used a shell that in many ways was superior to Unix
shells then and now.  

If you typed a command with appropriate options, it executed it without
fuss.  If you typed a command and simply hit Enter (or Help?) it
brought up a form to fill out.  Once completed, you pressed OK (or
something) and the command was then executed equivalently.  Just
imagine having a form-based interface for tar or dd.  

Over time you learned shortcuts to things you did all the time, and
could always fall back on the form for help.  The interaction was much
nicer than bringing up a man page and then remembering what to do.
That shell didn't do globbing in the form, but that would be a natural
(optional) feature of a Unix version.  

I thought I would write a "form" tool that would work a little like
your explainshell.  I thought it might generate an HTML form, invoke
lynx, and process HTTP POST.   It would rely on a database of
options gleaned from the documentation.  Anyone could either add to the
database or fix up the documentation to be understandable to the doc
parser.  

I lost interest because ISTM anyone who might use it would prefer KDE
or somesuch, and because the state of manpage documentation left a lot
to be desired for that intended (re)purpose.  But if you have similar
ideas, I thought I'd mention it because you're in the HTML neighborhood
and an active form would represent a novel improvement instead of a
re-tread of static manpages.  

We now return you to your regularly scheduled programming.  

--jkl

Re: [Groff] Cannot understand this error from prfroff

2015-10-27 Thread James K. Lowden

On Mon, 26 Oct 2015 23:04:36 +1300
Koz Ross  wrote:

> :0: macro error: diversion open while ejecting page (recovering)

It's interesting that the phrase "macro error" relies on a bit of
intuition: 

$ printf '.di foo\nhello\n.bp\n' | groff
troff: automatically ending diversion `foo' on exit

$ printf '.di foo\nhello\n.bp\n' | groff -ms
:0: macro error: diversion open while ejecting page (recovering)

$ printf '.di foo\nhello\n.bp\n' | groff -me
Line  -- Unclosed block, footnote, or other diversion (foo)

$ printf '.di foo\nhello\n.bp\n' | groff -mm
troff: automatically ending diversion `foo' on exit

Whether or not the open diversion is a "macro error" depends not on
whether a macro was in force, but on whether the ms macro set was in
use.  It's a shame the name of the open diversion is available to ms
but isn't mentioned in the message.  

> What does this mean, and how can I avoid this?

The simple answer is that if you see a diversion error, look for the
use of a macro pair -- e.g., DS/DE, KS/KE, FS/FE -- where you opened a
pair and didn't close it.  The ms documentation calls a diversion a
"keep". Unfortunately it's not rigorous in noting which macros open
diversions, and there's no mention of one in AB/AE.  

To really understand the message, you'd have to know what a troff
diversion is.  Diversions are described in Kernighan's Troff User's
Manual and (more approachably) in Dougherty's Unix Text Processing
(http://www.oreilly.com/openbook/utp/).  

HTH.  

--jkl

Re: [Groff] Typesetting Markup Language (TML) - a Superset of Groff

2016-01-16 Thread James K. Lowden

On Sat, 16 Jan 2016 17:32:38 +0100
Steffen Nurpmeso  wrote:

> I think plain SGML is still an interesting language, much
> better than what XML made of it.  

As Hoare said of Algol 60, it was an improvement "on nearly all its
successors."  

> I had a time when i liked rst, but pimping POD is possibly nicer
> given how rst looks if you start real work with programming stuff
> etc.  And then a nicely reduced ROFF (TeX, too) set of macros does
> look very clean!

I observe that my groff documents have the lowest markup/text ratio.
Less typing, more typesetting.  I suppose that's because, contrary to
XML, troff syntax was always meant to be typed by the user.  

My other observation is that all the new plaintext syntaxes run out of
gas before you get very far.  Try representing a table in markdown;
look for a way to do footnotes.  

--jkl

Re: [Groff] Typesetting Markup Language (TML) - a Superset of Groff

2016-01-24 Thread James K. Lowden

On Sat, 23 Jan 2016 12:17:35 -0500
Larry Kollar  wrote:

> IIRC, James Clark wrote the only open-source SGML parser around, and
> it didn't support the full syntax. By contrast, there are XML parsers
> in just about every language out there including awk.

AFAIK Clark's jade is the only free SGML parser.  While it's true that
it leaves certain features unsupported, I never bumped into them while
writing the user guide for FreeTDS.  

I suspect that SGML included minimization because it was at least
sometimes intended to be written by hand.  The unfulfilled hope of
XML was that tools would produce and use XML, and that we humans would
produce and use those tools.  

--jkl

Re: [Groff] pdfmark XN help

2016-01-30 Thread James K. Lowden

On Wed, 27 Jan 2016 20:53:03 +
Keith Marshall  wrote:

> it isn't abandoned, but I had to put it on hold, 

In 1962 Calvin Trillin founded a magazine, ''Beautiful Spot: A Magazine
of Parking.''[1]  Contrary to popular opinion, publication was never
cancelled, although the second issue hasn't come out yet.  


--jkl
[1]
http://www.nytimes.com/2002/02/12/books/for-trillin-parking-is-an-end-not-a-means.html

Re: [Groff] groff performance in respect to hardware platform

2016-03-24 Thread James K. Lowden

On Wed, 23 Mar 2016 23:21:37 -0400
Steve Izma  wrote:

> I'm wondering if anyone can tell me if groff benefits from running on
> multiple CPU cores and multiple CPUs.

Looking at spawn-pipe.c, the only parallelization you get in groff is
the pipeline of preprocessing, formatting, and rendering.  

ISTM that's all you *can* get because the formatting process --
determining which words go on each line -- is necessarily sequential.
The whole-paragraph formatting algorithm Doug McIlroy proposed some
time back worked in parallel, but each paragraph would still be rendered
serially.  

As others noted, rendering is expensive, and I bet as a practical
matter that's also sequential because the device holds so much state.
Maybe in theory it'd be possible to denote, say, paragraphs in the
ditroff output, render each one independently as Postscript, and knit
them all together in the output.  But the driver holds "current"
information about e.g. the font, pen, and position, at least some of
which I would think affects the rendering logic.  If it could be done
in parallel, it would surely be more complex, and it's not obvious it
would be much faster.  

--jkl

Re: [Groff] groff performance in respect to hardware platform

2016-03-25 Thread James K. Lowden

On Fri, 25 Mar 2016 11:04:55 +
Ralph Corderoy  wrote:

> It probably won't be to your taste, but try mupdf(1) to see how
> snappily a PDF can be rendered compared to Okular.  

Interesting.  A little OT for this list, but mupdf appears to me to be 
just a tad faster than xpdf.  

My usual setup is a Mac laptop using remote X over ssh logged into a
NetBSD or Linux machine via wifi.  Most applications -- emacs, xterm,
email -- work just fine that way; after all at 1 MB/s the wifi is better
than Ethernet was when X was designed, and of course computers are
orders of magnitude faster.  

PDF viewers are not in the "just fine" category.  Although xpdf and
mupdf exhibit different "painting" patterns, both take 1-2 seconds to
display a page.  My theory: client-side font-rendering means the page
is displayed a pixel at a time.  

> Unfortunately, it doesn't watch the file, needing an `r' to reload.
> A shame, as a SIGUSR1 would do.

mupdf answers to SIGHUP, which is odd for a non-daemon process,
although as a read-only process I suppose it doesn't matter much what
it does with a real HUP signal.  How would SIGUSR1 be any different?
Simplest is probably just to stat the name every second and reload when
something changes.  
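
For what it's worth, the polling approach is only a few lines.  Below
is a minimal sketch in Python, not anything taken from mupdf; the
names file_signature and watch, and the rounds parameter, are made up
for the example.  It compares modification time and size once per
interval and calls a reload function when either changes.

```python
import os
import time


def file_signature(path):
    """Return (mtime_ns, size) for path, or None if it cannot be stat'ed."""
    try:
        st = os.stat(path)
    except OSError:
        return None
    return (st.st_mtime_ns, st.st_size)


def watch(path, reload, interval=1.0, rounds=None):
    """Call reload(path) whenever the file's signature changes.

    rounds limits the number of polls (None means forever); it exists
    only so the sketch can be exercised without an endless loop.
    """
    last = file_signature(path)
    polled = 0
    while rounds is None or polled < rounds:
        time.sleep(interval)
        polled += 1
        current = file_signature(path)
        # Ignore a vanished file: it is probably being rewritten right now.
        if current is not None and current != last:
            last = current
            reload(path)
```

A viewer would pass its own redraw routine as reload.  The cost is one
stat(2) per second, which is hard to notice.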

--jkl

[Groff] left of center

2016-04-18 Thread James K. Lowden

What's the right way to control where on the x axis a picture
appears when using -ms?  

In tbl, it's controlled with the "center" option for the table, and
defaults to left-justified.  In pic it's controlled by .PS, outside pic
itself.  

I have a page of interspersed tbl and dformat elements.  It would look
better if the dformat output began where the tables do, at the left
edge.  To do that, I guess I have to insert some
troff-in-pic-in-dformat stuff.  Even if I get it to work, ISTM it would
be quite fragile because I'd be breaking assumptions about positioning
that pic and dformat might be making.  

Thanks, 

--jkl

[Groff] sidebars

2016-04-18 Thread James K. Lowden

While I'm in the groff neighborhood, I'd like to ask for advice for a
style of layout I don't see any direct support for.  

Readers of this list might have read The Annotated Alice.  The text of
Alice in Wonderland is full size, and the outer edges of the page have
copious side-notes whose vertical location coincides with the text
they're commenting on.  Reading the text, the reader can easily refer
to the running commentary along the edge.  Depending on mood and
interest, sometimes the commentary is more interesting than the text
itself.  

Like footnotes, the sidenotes are distinguished from the main text by
the font size.  Unlike footnotes, they alternate location on odd and
even pages.  

I don't see any simple way of using .2C in -ms to emulate that layout,
nor does .MCO in MOM look suitable. I suppose it requires a whole
different set of macros.  How would you approach it?  

--jkl

Re: [Groff] sidebars

2016-04-19 Thread James K. Lowden

On Mon, 18 Apr 2016 15:57:11 -0400
Peter Schaffter  wrote:

> > Check out MOM's "Margin Notes".  I've never had need of them, but
> > they sound like just what you want.

Thank you, both.  The project I have in mind is big enough to justify
learning MOM.  

> If you're more comfortable with ms than mom, you can source Werner's
> originals.  

Which would be where?  I didn't find them in contrib.  

--jkl

Re: [Groff] left of center

2016-04-19 Thread James K. Lowden

On Mon, 18 Apr 2016 21:21:46 +0200
Tadziu Hoffmann  wrote:

> 
> > What's the right way to control where on the x axis a picture
> > appears when using -ms?  
> > It would look better if the dformat output began where the
> > tables do, at the left edge.
> 
> Looking at the code it appears ms offers no choice -- it always
> centers.  Your best option is to include a copy of the PS macro
> in your document, with the indent request removed.

Thanks.  It will be a little more work than that, because some pictures
in the document look fine centered.  Now at least I know where to
begin.  

--jkl

Re: [Groff] .if !dTS - GNU extension?

2016-04-28 Thread James K. Lowden

On Thu, 28 Apr 2016 19:10:22 +0200
Ingo Schwarze  wrote:

> Take a large manual, for example ksh(1).
> With the mandoc-based implementation of man(1), type
> 
>   $ man ksh
> 
> Then inside less(1), type
> 
>   :t read
> 
> to jump straight to the description of the "read" builtin command,

That's very inviting.  Why doesn't it work for *me*?  ;-)  

$ /usr/bin/mandoc $(man -w ksh) | less -t read
No tags file

--jkl

Re: [Groff] .if !dTS - GNU extension?

2016-04-29 Thread James K. Lowden

On Fri, 29 Apr 2016 10:56:52 +0200
Ingo Schwarze  wrote:

> It can't work when you pipe by hand. 

Thanks for the detailed explanation, Ingo.  I wrote the simplest
possible test, and it turned out to be too simple!  Einstein, right? 

>  4. Let the man(1) steering program fork and exec less(1).  

I'm using NetBSD 7.0, and my /usr/bin/man doesn't work as you
describe, at least not for pdksh and a few others I tried.  

> When stdout is not a terminal ... indexing isn't attempted, and
> when called as mandoc(1) rather than man(1), it isn't at all.

I'm pretty sure my man invokes mandoc as "mandoc".  Would that be a
problem?  

--jkl

Re: [Groff] .if !dTS - GNU extension?

2016-04-29 Thread James K. Lowden

On Fri, 29 Apr 2016 17:50:52 +0200
Ingo Schwarze  wrote:

> I was talking about the CVS HEAD version of the mandoc toolbox from
> bsd.lv, sorry that i didn't say that explicitly.

Ah, I see, OK.  

> > I'm pretty sure my man invokes mandoc as "mandoc".  Would that be a
> > problem?  
> 
> That cannot work.  If man(1) invokes an external mandoc(1) via fork
> and exec, there is no :t support.  You need a version of mandoc
> where /usr/bin/man and /usr/bin/mandoc are hard links to one and
> the same executable file.  Otherwise, man cannot know the name of
> the temporary ctags file that mandoc created, so it cannot pass it
> on to less.

If I may say so, that's an unfortunate, unnecessary limitation.
Instead of replacing /usr/bin/man, why not support a command-line
option to specify the name of a ctags file, and supply that name to
less?  NetBSD's man uses /etc/man.conf, and the existing syntax could
enable that behavior.  

I don't want to replace the man utility, or groff as manpage formatter
for everything.  (groff.7 and groff_ms.7 are two examples of files
mandoc can't render.)  Access to indexing would be a step forward,
though, and I'm not above being opportunistic.  :-)  

Have you experimented with a single index for all man pages?  I wonder
how useful that would be, or how it might be subdivided.  An index that
spanned pages would be helpful for sets of routines covered by
different man pages.  (groff.7 and groff_char.7 come to mind.)  

--jkl

Re: [Groff] .if !dTS - GNU extension?

2016-05-01 Thread James K. Lowden

On Sun, 1 May 2016 15:29:19 +0200
Ingo Schwarze  wrote:

> Besides, your idea causes an unreasonable amount of work and bloat.

My idea is evidently at odds with your goal to replace the entire
manual viewing & formatting subsystem.  I don't share that goal.  

> It doesn't just add a command line option to mandoc(1), which i
> consider almost prohibitive bloat all by itself, 

You said that once already.  Any code can be pejoratively labelled
"bloat" if the speaker considers it unhelpful.  What you're really
saying is that you're uninterested in helping people use mandoc without
adopting it as the user interface.  

> but requires changing *all* the man(1) programs, and basically, each
> system has its own of these.

Isn't that exactly what you propose to do?  Or is replacing a program
not changing it?  I consider changing a configuration file to be much
less intrusive, and easier, than replacing a binary.  

NetBSD + mandoc has been a net loss in functionality for me so far.
Apropos is less convenient, see below.  And the 5% of pages that don't
format correctly unfortunately include several that I frequent.  

> (groff.7 and groff_ms.7 are two examples of files mandoc can't
> render.)
> 
> The best way to deal with that is to make your packaging system
> install manual pages that use some of the more arcane groff extensions
> not supported by mandoc, or some gory low-level roff(7) not supported
> by mandoc, as preformatted files rather than as source code.  

Best?  Defined how?  Who says the man pages in question come through
the packaging system, and not from Github?  

As far as I can tell, you're defining "best" in terms of the goal of
delivering a base OS without a troff dependency for manual display.  I
really don't understand why you consider that desirable.  I would like
to see groff used more, not less.  

Ingo, even if I felt cooperative, you know, I'm cut off at the
beginning. Mandoc offers, last I checked, no way to verify correct
formatting except by visual inspection, and does not document what
subset of troff requests and macros it supports. Instead of "use mandoc
and fall back to troff on error" as a policy, mandoc implicitly demands
all-or-nothing adoption.  

> > Have you experimented with a single index for all man pages?
> 
> No.
> 
> > I wonder how useful that would be,
> 
> Not useful at all.  

So, lacking data, your conclusion is based on your assumptions.  

The rest of your answer is a list of specious objections based on
erroneous design choices.  The very idea that an index need rely on
output format is ridiculous.  The point of using an index is *not* to
use apropos, which is a much more generalized search.  

I can believe that searching for "read" in an index would be better
than scanning the page text with less.  That's what interested me to
start with.  By the same token, searching more than one page would
bring novel functionality that has been lacking in Unix man pages since
their inception (afaik).

And, no, apropos is not the answer.  For example, let's look for socket
functions.  

$ man -k -l socket
socket(n) - Open a TCP network connection
curl_multi_socket_action(3) - reads/writes available data given an action
curl_multi_socket(3) - reads/writes available data
CURLOPT_UNIX_SOCKET_PATH(3) - set Unix domain socket
CURLOPT_OPENSOCKETFUNCTION(3) - set callback for opening sockets
CURLOPT_OPENSOCKETDATA(3) - custom pointer passed to open socket callback
CURLOPT_EGDSOCKET(3) - set EGD socket path
CURLOPT_CLOSESOCKETFUNCTION(3) - callback to socket close replacement function
CURLOPT_CLOSESOCKETDATA(3) - pointer passed to the socket close callback
CURLMOPT_SOCKETFUNCTION(3) - callback informed about what to wait for
CURLMOPT_SOCKETDATA(3) - custom pointer passed to the socket callback
BIO_s_socket(3) - socket BIO
socket(2) - create an endpoint for communication
socketpair(2) - create a pair of connected sockets
dbus-cleanup-sockets(1) - clean up leftover sockets in a directory

Do you think that's atypical?  apropos doesn't respect section
numbers.  To get what I want, 

$ man -k -l socket | grep '(2)'
socket(2) - create an endpoint for communication
socketpair(2) - create a pair of connected sockets

but of course I have to *know* that, and I have to accept the idea that
I might miss something.  Like, say, unix(4).  Or bind(2), which
apropos used to return.  How did that get lost?  

> If apropos(1) tells you which page to look at

$ man -k socket | wc -l
1398

It doesn't.  The information density of that output is near zero.  Just
casting a wider net does not yield better results.  I'm amazed anyone
thinks 1400 lines of "socket" context is useful.  

What I would like is a comprehensive index of functions relevant to
socket(2).  From what I have read, good indexes can't be generated
automatically.  That suggests to me that index-generation should be
seeded by automatic construction (from the .Nd tag, say), and then
amplified/filtered by incrementa

[Groff] colorized man pages

2016-08-18 Thread James K. Lowden

http://boredzo.org/blog/archives/2016-08-15/colorized-man-pages-understood-and-customized

https://news.ycombinator.com/item?id=12296000

It's a little far afield, but some here might be interested in the
interest in manpages.  mdoc has no color features, leaving folks to
resort to some pretty brutal techniques. 

Raw VT-100 escape sequences, in 2016.  Where will it all end?  

--jkl

Re: [Groff] colorized man pages

2016-08-23 Thread James K. Lowden

On Tue, 23 Aug 2016 01:29:07 +0200
Tadziu Hoffmann  wrote:

> > Raw VT-100 escape sequences, in 2016.  Where will it all end?
> 
> Steering wheels.  On cars.  In 2016.  Where will it all end?
> 
> Seriously:  what's wrong with escape codes?  I mean, if you're
> still working with a text terminal, I'd expect escape codes to
> be your daily bread and butter, not something to scoff at.
> (Unless I'm missing the good-natured, approving irony here?)

Yes, but who is still working with a text terminal?  

In 1980, it wouldn't have been unusual, as you know, for a VT-100 to be
the user's single interface to the computer.  Any UI feature -- font
style & variations, menus, multiple applications, etc. -- had to be
rendered on that one screen.  It's no wonder it became terrifically
complex. They developed programmable fonts, 132-column displays,
alternate screens. It's a testament to human ingenuity.  

Today most of those features have been subsumed by the GUI.  Different
applications have different windows, different fonts, graphics, all
resizable. We have a potpourri of UI gadgetry barely imagined in those
days.  Yet the emulator remains as muscular and complex as ever, just
in case someone happens across an RS-232 cable and a line driver.  

Sadly, for all the advances, documentation has hardly budged, if indeed
it's advanced at all.  Even though a good deal of it is maintained in
typeset form, the output predominately is confined to the application
with the poorest text rendering capability: the VT-100 emulator.  

Because of poverty owing to neglect -- that is, necessity being the
mother of invention -- the author of the article I linked to decided
he'd like color in his man pages.  Where did he turn?  A style sheet in
the groff framework, perhaps?  Any kind of improvement to the
semantic-display connection?  No, he reached about as far down as
possible, and tweaked the control sequences emitted to the emulator.
Because he could.  Because, in a way, he *had* to, insofar as that
strange bit of arcana gave him the most leverage.  

So, yes, he's still working with a text terminal, after a fashion.
But the programmability of that text terminal is an accident of
history, its feature set long since made obsolete -- not useless, but
out-moded -- by graphical displays and GUIs.  That he reached for that
particular tool is a measure of how far we have come, and how far we
have not.  

--jkl

Re: [Groff] colorized man pages

2016-08-26 Thread James K. Lowden

On Thu, 25 Aug 2016 12:39:44 -0400
Peter Schaffter  wrote:

> > > Sadly, for all the advances, documentation has hardly budged,
> > > if indeed it's advanced at all.  Even though a good deal of
> > > it is maintained in typeset form, the output predominately is
> > > confined to the application with the poorest text rendering
> > > capability: the VT-100 emulator.
> 
> Am I the only one who finds that text at a terminal emulator with a
> well-chosen monospaced font and good contrast is much, much easier
> to read than a graphical representation of the same text (e.g. in a
> browser or pdf viewer)?

It's well established, is it not, that proportional fonts are easier to
read?  Isn't that why they dominate in books, magazines (remember 
them?) and the like?  

I use both.  When I'm scanning a man page for a particular feature,
such as whether "groff -T" accepts "pdf" as an argument, viewing it
in a terminal is the quickest and most convenient.  When I'm reading
something longer to understand it for the first time, I much prefer
typeset text.  

Recently I had cause to consult Edward Moy's ctlseqs.ms from the xterm
distribution.  In HTML form it's clunky.  The PDF is crisp and
beautiful, easiest to read.  For scanning and double-checking, I kept
nroff output loaded in GNU less, too.  

Color and graphs can help illuminate material too.  In groff we have
pic, but it's unused in man pages.  Why?  Not because no one knows how
to use it, but because the typical man-page rendering environment
doesn't support it, despite the fact that it's a child window in a
GUI!  

--jkl

Re: [Groff] colorized man pages

2016-08-26 Thread James K. Lowden

On Thu, 25 Aug 2016 00:40:15 +0200
Tadziu Hoffmann  wrote:

> > So, yes, he's still working with a text terminal, after
> > a fashion.  But the programmability of that text terminal
> > is an accident of history, its feature set long since made
> > obsolete -- not useless, but out-moded -- by graphical
> > displays and GUIs.  That he reached for that particular
> > tool is a measure of how far we have come, and how far
> > we have not.
> 
> Well said!

(Thank you.  I for one have found the discussion interesting.) 

> (Although I have to disagree about the word "obsolete",
> which implies that better alternatives exist *and are
> available for use*.  In this case, they were not.
> Should they have been?  Because there's an urgent need
> for them, or just because it's technically possible?
> Does such a need actually exist?  Do other issues
> have a higher priority?)

To be clear, I am not arguing

* that the GUI is better than the CLI, or vice versa
* that xterm is obsolete

I do think it's a shame that development of the command-line experience
ceased 3 decades ago, and that what we use today is hardly better
(within the 4 corners of the terminal) than what was commonplace in
1986.  About the only improvement I can point to is UTF-8.  

You make a good point: while some things could perhaps be better
done in another way, if that other way isn't as convenient (for
whatever reason) as a terminal window is, then the terminal is hardly
obsolete.  

I think you would agree that many parts of the xterm feature set are
obsolete, and some others vestigial.  Printing, for example, is
practically unused, as are alternate screens, emulated boldface,
132-column mode, double-width characters, and Tektronix mode. Doubtless
there are still applications (perhaps a dozen) that rely on one of
bracketed paste mode, privacy mode, or rectangular copy.  These
features add greatly to xterm's complexity, and yet are of little use
to the present-day user.  

At the same time, xterm's adherence to serial-terminal standards
limits its functionality considerably, and needlessly.  The most
obvious limitation is the character-cell model, limiting the display to
monospace fonts.  The display can be switched to Tektronix mode --
which really *is* obsolete -- but there is no "graphics mode" to draw
charts or render proportional fonts.  Consequently, we're using nroff -- the
product of 300 bps serial lines -- instead of troff to render our man
pages.  

Whenever I mention that, it usually elicits a reminder that there are
PDF and HTML viewers, and "text is what the terminal is for!"  But
the terminal isn't *for* text; it's *of* text.  The terminal supports
the command-line environment.  There's no rule that says command-line
environments need be limited to monospaced fonts in an NxM
character-cell grid.  That just happens to be where their evolution
ceased 30 years ago.  

--jkl

Re: [Groff] groff developments - query about any interest?

2016-11-14 Thread James K. Lowden

On Tue, 15 Nov 2016 10:56:28 +1100
John Gardner  wrote:

> There're modules out there to generate manpages using Markdown or
> other intermediate formats, but what we really need is something that
> can use existing option-configs and churn out correctly-formatted
> manpages without asking anything of the author. 

While I applaud the effort, I want to point out one pitfall.  

As you approach this problem, remember that "correctly formatted"
doesn't imply "complete".  A man page is more than a list of options.
What's missing in the example you linked to between SYNOPSIS and
OPTIONS is the meat of the matter: DESCRIPTION.  

If "without asking anything of the author" is part of the remit, then I
suggest you'll want some structured way to augment the output.  It
could be a database of Descriptions by name, or a template that, if
extant, has its missing parts filled in rather than generated de
novo.  That would give people who'd like to improve the documentation a
framework to work in, and continue generating from the source as it
changes.  (You might also want some way to determine when the generated
output changes, version over version, to flag which augmented pages
might need review.)  

I did something like this for the Subversion project.  There, the
"help" input is kept in a bespoke structure to facilitate reproducing
shared text in different places.  I used Perl to convert the output to
man pages.  The framework would have taken some work to generalize and
keep up to date and, frankly, when I was done I was convinced that
going the *opposite* way would have been better: maintain proper man
pages, and generate "help" from them. Use .so to include shared parts.
Much easier and much higher quality because the formatting macros are
inserted directly, not algorithmically.  But of course that means the
team has to be willing to maintain man pages, which many regard as
impossible without a VAX handy.  

I learned the same lesson a different way using Doxygen: generating the
simple bits is pretty easy.  Getting quality documentation out of it
requires curation. 

--jkl

[Groff] sqlrpt: sqlite meets groff

2017-02-11 Thread James K. Lowden

https://github.com/jklowden/sqlrpt

Just a little utility to render SQLite data as tbl input.  

I know I'm 30 years late in observing that report writing is a natural
application of document preparation systems.  Better late than never!  

I find piping it through nroff makes for nicer output than that
produced by SQLite's command-line shell.  

Suggestions welcome.  

--jkl

Re: [Groff] [PATCH] mdate.sh: rewrite in Perl

2017-02-18 Thread James K. Lowden

On Sat, 18 Feb 2017 03:28:03 +
Colin Watson  wrote:

> This version is much shorter and easier to understand than the
> shell/awk version: we don't have to worry about convincing ls to
> produce output that we can parse, and we don't have to play games
> with the way that the same field may contain either the year or the
> time depending on how old the file is.

Attached please find fdate.c. It accepts filename arguments and prints
the files' mtimes (and, optionally, names).  Like date(1), it accepts a
strftime format argument, defaulting to yyyy-mm-dd.

If you would like to use it instead, I can provide the necessary GNU
paperwork.  

$ fdate ~/projects/3/groff/*
2016-04-21
2013-11-12
2016-04-19

$ fdate -v +'%d %B %Y' ~/projects/3/groff/*
21 April 2016 /home/jklowden/projects/3/groff/groff
12 November 2013 /home/jklowden/projects/3/groff/mkgrft
19 April 2016 /home/jklowden/projects/3/groff/sidebar
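
For readers without the attachment, here is a rough sketch of what the
core of such a utility might look like.  It is not the attached
fdate.c; the function name format_mtime and its interface are
hypothetical.  It assumes a POSIX system: stat(2) supplies the mtime
and strftime(3) formats it, with the default format taken from the
first example above.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/stat.h>
#include <time.h>

/*
 * Hypothetical helper, for illustration only.
 * Formats the modification time of `path` into `buf` using the
 * strftime format `fmt` (or "%Y-%m-%d" when fmt is NULL).
 * Returns 0 on success, -1 on any failure.
 */
int format_mtime(const char *path, const char *fmt, char *buf, size_t size)
{
    struct stat sb;
    struct tm *tm;

    if (stat(path, &sb) != 0)
        return -1;                  /* file missing or unreadable */

    tm = localtime(&sb.st_mtime);   /* broken-down local time */
    if (tm == NULL)
        return -1;

    if (fmt == NULL)
        fmt = "%Y-%m-%d";           /* assumed default format */

    /* strftime returns 0 if the result does not fit in buf */
    return strftime(buf, size, fmt, tm) > 0 ? 0 : -1;
}
```

A caller would loop over argv, call format_mtime for each name, and
print the buffer, followed by the name if a verbose flag was given.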

--jkl


Re: [Groff] Nesting font macros in man pages

2017-04-27 Thread James K. Lowden

On Wed, 26 Apr 2017 09:46:43 -0400
"G. Branden Robinson"  wrote:

> 1. The slavish devotion to two-letter names for things, which like the
>    man macro package and the oldest parts of *roff itself, make it
>    self-anti-documenting.  

Having written one user guide in DocBook, I have to disagree.  

The troff system was designed to be typed at a keyboard.  The
dot-on-the-left rule might be ugly, and the requests/macros terse,
but the benefit to the user is relatively few keystrokes above those
needed for the text.  The nearest modern cousin might be HTML, with its
single-letter tags. 

Most SGML derivatives, on the other hand, presupposed Interleaf-like
tools that would shield users from the markup syntax. That "assume a can
opener" design theory freed them to be verbose. When the tools never
materialized, users started looking for a way to avoid their verbosity
tax.  Markdown is only the latest product of that search.  

Short names are actually *easier* to use than long ones!  Why?  

Brevity rewards experience.  

If .SH or .Sx is hard to read, they're also easy to write, and easy to
remember when transferring them from the manpage to the document at
hand.  If it's hard to remember what the "next page" request is, .bp is
no harder than remembering whether it's .BreakPage or .PageBreak.  

Stroustrup has an interesting observation about language brevity based
on the evolution of C++.  Users, he says, want novel features to be
"loud" and well-understood paradigms to be terse.  Trouble is, today's
novel is tomorrow's obvious.  As a result, C++ has been losing syntax,
making implicit that which 20 years ago was explicit.  But, of course,
the new language features recently added are "loud".  

In today's world, to most programmers troff is 100% novel, and they find
its terseness inscrutable and off-putting.  Too hard to understand!  

Ah.  But so easy to use.  

--jkl

Re: [Groff] Nesting font macros in man pages

2017-04-30 Thread James K. Lowden

On Fri, 28 Apr 2017 01:15:05 -0400
"G. Branden Robinson"  wrote:

> > the benefit to the user is relatively few keystrokes above those
> > needed for the text.
> 
> Did you write your DocBook-based user guide in ed?
> 
> Nothing has come close to saving me more keystrokes than <Tab> at the
> shell prompt and CTRL-N in Vim.

At the time I used nedit.  Today I would use emacs.  There's no package
for docbook-mode that closes tags for you.  The xml modes I've used are
lame and wouldn't perform well on a large manual.  

DocBook is a complex beast.  It would be great if the editor could do
tab-completion for valid tags for the current parent and valid parents
for a selected region, and jump from beginning to end of a tag.  If
such a tool exists, that would be news to me.  

Meanwhile, what could be simpler than having the editor supply "tool
tip" text to explain .Sx or an "mdoc apropos" feature derived from
keywords in the manual?  The only reason there isn't better editor
facilitation for man and mdoc is ... lack of interest.  The problem is
technically much simpler.  

> We have ludicrously more processor horsepower and memory

Yes, and my contention is that we nevertheless have not developed a
system that is easier to use or learn.  troff macro packages are
simpler and impose less on the author than any alternative of similar
capacity.   The facilitation promised 2 decades ago to deal with the
verbosity and complexity of more recent designs never materialized,
leaving authors with a variety of lousy options. 

> They're programmers.  Some of them are difficult to coerce into
> writing documentation _at all_.

There is no evidence that programmers disinclined to write
documentation do a better job of it if it's made easier to do.  Good
documentation takes effort, and that effort is different from the
programming effort.  

The premise behind Markdown and Doxygen and the like is that good
documentation will emerge by lowering the bar.  It's simply not true.

The problem of writing documentation isn't in opening another file
(Doxygen) or understanding the markup system (Markdown).  It's in
capturing the syntax and semantics from the point of view of the user.
And, as you say, some people have the additional burden of caring
enough to bother.  

I have yet to see good documentation produced by those systems unless
great care was taken.  Having used Doxygen for a C library with 100
functions, I wouldn't use it again.  For equivalent time invested, I
get better results from mdoc, albeit without the pretty diagrams. 

>> when transferring them from the manpage

I was afraid that might not be clear.  What I meant is that as I'm
writing documentation, I'm referring to the mdoc manpage (say).  I've
never done that kind of work for a living.  I'm just another occasional
mdoc hacker, so I quite often have to relearn parts of it.  By
"transfer", I meant putting to use what I'm reading in my writing.  Two
letters aren't hard to remember, and most of the pairs carry some
mnemonic weight.  (That said, better editor facilitation would make the
work less tedious.)  

> C adopted designated initializers, a sop to those who can't recall
> what order a struct's fields come in. 

I guess you're implying that designated initializers, while verbose,
are easier to use, even for aficionados of a terse language like C.
As it happens, my current work is in C, and I adopted C11 as a
baseline, partly so I could use them.  

With today's machines, the compiler can do more than was feasible in
1978.  I'd argue we have bigger structures now, too.  Designated
initializers aren't so much a sop to the lazy as they are a gift of
clarity, because otherwise the programmer has to count structure
elements and supply any zeros preceding the elements to be
initialized.  Designated initializers also support re-initialization of
existing structures, something that otherwise required error-prone
member-by-member assignment.  

All to say that at least one fan of short macro names finds explicit
structure initialization useful.  
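
A small sketch of the difference, for concreteness.  The structure and
function names below are invented for illustration; only the syntax
(designated initializers and compound literals, both available since
C99) is standard.

```c
#include <stddef.h>

/* Hypothetical structure, invented for this example. */
struct page_opts {
    int         width;
    int         height;
    int         margin;
    const char *font;
    int         landscape;
};

/* Positional form: count the members and supply the leading zeros. */
struct page_opts positional_opts(void)
{
    struct page_opts o = { 0, 0, 0, "Times", 1 };
    return o;
}

/* Designated form: name what you set; the rest are zero-initialized. */
struct page_opts designated_opts(void)
{
    struct page_opts o = { .font = "Times", .landscape = 1 };
    return o;
}

/* Re-initialize an existing structure in one assignment. */
void reset_opts(struct page_opts *o)
{
    *o = (struct page_opts){ .width = 80 };
}
```

The two initializer functions produce the same value, but only the
second survives a reordering of the members, and reset_opts replaces
what would otherwise be five separate assignments.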

--jkl

Re: [Groff] Critique this bold-italic private macro for man pages

2017-05-03 Thread James K. Lowden

On Wed, 3 May 2017 22:06:10 +0200 (CEST)
Carsten Kunze  wrote:

> There are ways to detect the formatter but a manpage must not do
> this.  

Why not?  ISTM we'd have better manpages if they weren't constrained to
the rendering capability of a VT-100 terminal.   For example, equations
or pictures could augment the text, or replace some of it, when
"printed".  

--jkl

Re: [Groff] ASCII Minus Sign in man Pages

2017-05-03 Thread James K. Lowden

On Wed, 3 May 2017 13:42:55 -0400
Mike Bianchi  wrote:

> The  -  character exists on all keyboards.  It is not labeled minus
> or hyphen or endash.  It generates the decimal 45 (hex 0x2D, octal
> 055) character. That any *roff processor would give it a different
> meaning is most unfortunate.

IMO that is the principle that should be applied: every unadorned
character appearing in troff input should represent itself.  If you
want something other than that, groff_char(7) describes your options.  

IIUC, this debate about how to render - and \- stems from a conflict in
historical practice.  Is the following correct?  

When troff was young, terminals were ascii and the - character
was 0x2d.  Manpage guidelines encouraged the use of \- for flags because
they rendered nicely in printed documents with no harm done to nroff
output.  They did that despite the obvious fact that the manpage is
there to describe what to type, and basically no one can type the
denoted character.  

Then Unicode pronounced that 0x2d was neither fish nor fowl,
and gave us hyphen, minus, and endash characters.  groff dutifully
mapped - onto hyphen and \- onto minus.  But when terminals gained Unicode
capability, some of them lost cut-and-paste convenience.  The debate is
over how to recover that convenience.  

Oddly, my system doesn't exhibit any cut-and-paste anomaly despite
using xterm with the "-en UTF-8" option.  Searching for - in less also
works.

If it's a UI issue we're confronting, perhaps it's really up to the UI
to deal with.  The man utility can certainly impose on nroff the
requirement that - and \- both render as 0x2d.  Then it shows up
correctly in the pager.  It is visually acceptable to the user, and
DTRT regarding the UI.  (Maybe that's what Ubuntu LTS does for me; I
don't know.) 

It's not obvious to me groff should make any change at all.  At most,
reverting the mapping of - so that it outputs 0x2d again would undo a
nonobvious, subtle change in favor of simplicity.  

Possibly some degree of outreach to the UI community would be of service,
too.  

--jkl

Re: [Groff] devpdf U-fonts and Russian

2017-10-11 Thread James K. Lowden

On Sat, 07 Oct 2017 15:17:25 +0200 (CEST)
Werner LEMBERG  wrote:

> Another important issue is security.  PS, as a programming language,
> allows far too many things.

But PS, as a programming language, is under programmer control.  To
treat it as though it accepts input from untrusted anonymous sources
over the Internet is to unnecessarily bind the programmer's hands.  One
of its *strengths* is that it enables the programmer to do things its
authors did not foresee.  

The same person who uses PS normally uses a shell.  How is access to
the shell from within PS any more of a security hazard than access from
without?  

--jkl

Re: [Groff] problem with preconv and sample_docs.mom

2017-11-07 Thread James K. Lowden

On Sun, 05 Nov 2017 15:09:41 +0100
Bertrand Garrigues  wrote:

> +  /* uchardet 0.0.1 could return an empty string instead of NULL */
> +  if (charset && *charset) {
>      ret = (char *)calloc(strlen(charset) + 1, 1);
>      strcpy(ret, charset);
>    }

As a logical matter, the calloc call should have the count as the first
argument: 

ret = (char *)calloc(1, strlen(charset) + 1);
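
In context, a minimal sketch might read as follows.  The function name
copy_charset is hypothetical and this is not the preconv source; the
NULL check on the allocation is my addition.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns a heap copy of `charset`, or NULL if it is NULL, empty,
 * or if allocation fails.  The caller frees the result.
 */
char *copy_charset(const char *charset)
{
    char *ret = NULL;

    /* the detector may return an empty string instead of NULL */
    if (charset && *charset) {
        /* calloc(nmemb, size): one object of strlen + 1 bytes */
        ret = calloc(1, strlen(charset) + 1);
        if (ret)
            strcpy(ret, charset);
    }
    return ret;
}
```

Either argument order allocates the same number of zeroed bytes; the
point is only that the prototype reads calloc(count, size).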

--jkl

Re: [groff] 04/04: tmac: Move macro diagnostics away from `quotes'.

2017-11-23 Thread James K. Lowden

On Tue, 21 Nov 2017 18:13:33 +
Ralph Corderoy  wrote:

> I realised the `commonal
> garden' phrase I often heard as a child was `common or garden'

Then there's that famous hymn about a zoo animal, 

"Sadly, the cross-eyed bear".  

--jkl

Re: [groff] 04/04: tmac: Move macro diagnostics away from `quotes'.

2017-11-23 Thread James K. Lowden

On Wed, 22 Nov 2017 00:20:46 +0100
Tadziu Hoffmann  wrote:

> "English doesn't borrow from other languages.  English follows
> other languages down dark alleys, knocks them over and goes
> through their pockets for loose grammar."

That's clever, and was a bit true 5 centuries ago.  

Nowadays, English is the Borg of languages: It's always absorbing new
words, and resistance is futile.  

--jkl

Re: [groff] [UTROFF] references, summary, index

2017-12-06 Thread James K. Lowden

On Wed, 06 Dec 2017 12:24:58 +
Ralph Corderoy  wrote:

> > .   sy echo \\*[sum-list] | sed -e "s/@@@/n./g" |
> > iconv...
> 
> It may be better to use `.write', if other troffs have that, rather
> than allow the local echo(1) to perhaps interpret sum-list's content?

Else:

sy printf '%s\n' '\\*[sum-list]' | sed -e "s/@@@/n./g" 

would eliminate that issue. 

--jkl

Re: [groff] Merge conflict for "gnulib"

2018-02-03 Thread James K. Lowden

On Sat, 3 Feb 2018 15:08:41 +
Bjarni Ingi Gislason  wrote:

> CONFLICT (submodule): Merge conflict in gnulib

You can't have a conflict unless your local version contains changes
not present in the version on origin master.  If you don't care about
those changes, you can remove the file, check it out again from master,
and pull.  

--jkl

Re: [groff] groff as the basis for comprehensive documentation?

2018-04-19 Thread James K. Lowden

On Mon, 16 Apr 2018 13:19:31 -0500
Nate Bargmann  wrote:

> I'm still undecided on the Texinfo part, though it may serve as the
> portion that ties everything together.  I have man pages for utility
> programs of the project and will be writing man pages for the C
> library.  Being able to collate this nicely would be a great
> benefit.  

I went down your very same road some years ago, except I used jade and
SGML instead of XML for DocBook.  I found LaTeX too confining and
complex.  Once I bothered to learn mdoc, I wished I'd started there.  

The roff language is the only markup language in current use that was

1.  designed to be typed by humans, and 
2.  designed to produce typeset documentation.  

I think there was hope, once upon a time, that a free implementation of
something like Interleaf would become the UI for DocBook, and that mere
mortals wouldn't have to balance their tags.  Needless to say, it never
came to pass.  LyX isn't it.  

The design of the roff language, while not "modern", is minimalistic;
it has the least markup as a percentage of text.  It makes few
assumptions about how the text should appear, and those assumptions are
well documented and easily adjusted.  The groff implementation is fast
and small.  As Hoare said of Algol, it is an improvement over its
successors.  

The full current capability of groff is harder to exploit than it could
be, however.  There's still a bias toward printed output.  To create a
document like Deri's, with hyperlinks, you have to understand the
system pretty well, and piece together a few documents, some of which
are incomplete.  Cross references in mdoc, for example, do not generate
links in HTML or PDF documents.  It's possible to produce presentation
slides, too, but you have to do a little digging.  

> Ideally, if the same sort of collation could be done with HTML, that
> would be perfect.

Groff is not the ideal system for generating HTML.  You might like to
believe that eqn, tbl, and pic could be processed with grohtml and come
out lovely on the other side, but that goal remains over the horizon.
It's pretty rare just to find manpages rendered in proportional HTML
fonts.  

--jkl

Re: [groff] groff as the basis for comprehensive documentation?

2018-04-19 Thread James K. Lowden

On Fri, 20 Apr 2018 01:44:06 +1000
John Gardner  wrote:

> > You might like to believe that eqn, tbl, and pic could be processed
> > with grohtml
> 
> I've seen grohtml's complexity and was bewildered.  Hence why I
> intend to write my own. The procedures for inferring structural or
> semantic metadata from low-level intermediate output commands will be
> an entertaining challenge. =)

For lack of a better term, I think it's an abstraction mismatch.  The
ditroff language presupposes a dot-addressable canvas, onto which lines
and strings of text are drawn.  That model fits most printers (these
days) and terminals.  But it doesn't describe HTML at all.  

I discussed HTML output with Ted Faber, of grap, upon a time.  He
produces HTML from a handful of his own macros.  ISTM ms macros map
onto HTML pretty well:

.SH => <h2>
.PP => <p>
.I  => <i>
.B  => <b>

and so on.  But what, for example, is HTML to do with line
justification, and why should the browser honor the (implied) line
breaks, when it has its own line-wrapping logic and style sheet, and
the page size is dynamic?
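
To make the idea concrete, here is a minimal sketch of such a direct
mapping.  Everything in it is an assumption for illustration: the
table, the function name html_tag_for, and the tag choices are mine,
not anything Ted's macros or groff actually use.

```c
#include <stddef.h>
#include <string.h>

/* Assumed mapping, for illustration only; the tag choices are guesses. */
struct macro_map {
    const char *macro;  /* ms macro as it appears at the start of a line */
    const char *tag;    /* HTML element name to open */
};

static const struct macro_map ms_to_html[] = {
    { ".SH", "h2" },
    { ".PP", "p"  },
    { ".I",  "i"  },
    { ".B",  "b"  },
};

/* Return the HTML element for an input line, or NULL for anything else. */
const char *html_tag_for(const char *line)
{
    size_t i, n;

    for (i = 0; i < sizeof ms_to_html / sizeof ms_to_html[0]; i++) {
        n = strlen(ms_to_html[i].macro);
        /* the macro name must be followed by a space or the end of line */
        if (strncmp(line, ms_to_html[i].macro, n) == 0 &&
            (line[n] == '\0' || line[n] == ' ' || line[n] == '\n'))
            return ms_to_html[i].tag;
    }
    return NULL;
}
```

A lookup like this handles the structural macros.  It says nothing
about fill, adjustment, or line length, which is exactly the
information the browser would ignore anyway.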

Similarly, it's fairly easy, up to a point, to imagine tbl generating
HTML tables directly, easier (for me, anyway) than imagining how to
infer the table structure after it's passed through groff.  But then
there are little niggles like  and HTML's table/cell borders.  

I've said before that it's a terrible, terrible shame HTML ever was
invented.  It didn't need to be, and the cost is surely measured in
billions of man-hours.  When it was invented, the roff language was
already 10 years old and in widespread use.  It could have been adapted
to the www through the simple expediency of writing a ditroff interface
to the browser, such that the browser accepted and rendered ditroff
output, as a postprocessor.  The few novel aspects that HTML
introduced (relative to roff) -- hyperlinks, POST -- could have been
added to roff.  The escape mechanism was already there.  And, voila!
The thousands of roff documents already extant would have been
instantly accessible, and the skill of using the language extended
to a new realm.  We wouldn't be talking about how to
convert/adapt/generate, because there'd be only one language.  

Yeah, I know the history.  Yet, even now, there's no low-level
dot-addressable interface to the browser.  It has to *interpret*
everything: SVG, PNG, you name it.   Why use anything if you can
reinvent it?
.rant off

--jkl

Re: [groff] groff as the basis for comprehensive documentation?

2018-04-21 Thread James K. Lowden

On Sat, 21 Apr 2018 12:59:16 -0500
Nate Bargmann  wrote:

> But why do we focus on presentation when authoring a document?  

Because the document we see is what we're creating.  The only purpose
of the document is to be read, and appearance matters to the reader.
The reader assuredly doesn't care about semantic markup.  

Semantic markup is an abstraction that assists the author.  It can
ensure that semantically similar things -- section headers, quotations,
etc. -- are rendered consistently.  It can help -- somewhat -- in
rendering the same input text pleasantly in different output formats.
It can help indexing programs and such knit together different
documents in ways that are useful to the user (and thus to the
author).  

There's no Higher Good honored by using semantic markup.  It's a tool
to use if it gets you where you want to go.  Ultimately, though, you
want your text to be read, and it is altogether fitting and proper that
you be concerned with its appearance.  

--jkl

Re: [groff] groff as the basis for comprehensive documentation?

2018-04-21 Thread James K. Lowden

On Sat, 21 Apr 2018 08:16:36 -0500
Nate Bargmann  wrote:

> > For lack of a better term, I think it's an abstraction mismatch.
> > The ditroff language presupposes a dot-addressable canvas, onto
> > which lines and strings of text are drawn.  That model fits most
> > printers (these days) and terminals.  But it doesn't describe HTML
> > at all.  
> 
> I suppose it depends on what one expects from the generated HTML.  As
> one who reads pages more than writes them, I've been impressed with
> the presentation on the man-pages project Web site (hosted at
> kernel.org). For example, here is the rendering of groff_man(7):
> 
> http://man7.org/linux/man-pages/man7/groff_man.7.html
> 
> I couldn't find the generator being used in the Git repository and a
> lot of it may be done with CSS.  The text is rendered using the <pre>
> tag so it looks much like tty output though it is not fully justified
> yet the text blocks are indented.  Aside from the justification, the
> rest looks very familiar to me. 

I find very little to commend that version.  In fact, it's an excellent
example of the widespread dunderheaded monospace manpage rendering on
the web.  I invite you to compare it with something better: 

https://linux.die.net/man/7/groff_man

(Better url, too, btw: "man/7/groff_man" captures everything
"man-pages/man7/groff_man.7.html" does in half the space.)  

(One has to wonder, though, at the age of that document.  For
amusement, follow the link in See Also.)  

There's nothing in the manpage input text that specifies the font family
to be used.  The Postscript rendering uses proportional fonts and
italics, much more pleasant to read than nroff output in a terminal.
Why shouldn't the HTML output make the highest and best use of its
medium, instead of poorly emulating 40-year-old obsolete hardware?  

> Note that I am only working with Groff's man macro package and do
> understand that other macro packages may have greater demands on the
> HTML generator.

It's not the demands on the HTML *generator* that present the problem.
The input to grohtml is devoid of information HTML itself demands, and
is thick with information it can't use.  

HTML needs: titles, paragraphs, tables
ditroff provides: positions, text, fonts

The only reason it works at all is that the pre-grohtml preprocessor
sneaks some useful information to the postprocessor via ditroff
escapes.  That allows grohtml to generate MathML from eqn output, for
example.  

While it's possible to work that way, I see no advantage to squirting
most of the needed information to the post-processor through the
formatter, when the formatter's work is discarded by the
post-processor.  Why bother?  Just translate the macros directly.  

Which, I suppose, is what Ingo is doing already.  

--jkl

Re: [groff] Groff & tbl as a report generator

2018-07-24 Thread James K. Lowden

On Tue, 24 Jul 2018 12:26:57 -0500
Blake McBride  wrote:

> Then, a few years ago, I thought of generating groff/tbl input
> instead and then calling those tools to generate the final PDF output.

You're not the only one.  https://github.com/jklowden/sqlrpt

I wrote it last year to make my use of SQLite more convenient.  I
normally use it instead of the SQLite command-line utility to view
data.  Not only does it make wide columns of text easier to read, it
represents large numbers accurately.  (The SQLite utility will drop
digits that don't fit in the column-width).   

--jkl

[groff] grohtml shortcomings

2018-12-03 Thread James K. Lowden

I converted the Heimdal kf.1 page to html recently, and found what I
consider to be problems with both appearance and HTML style. 

https://github.com/heimdal/heimdal/blob/1d4ebc0df798cb1d8edca910b806e55c6c19bccb/appl/kf/kf.1

GNU groff version 1.22.3

Command:

groff -man -T html man1/kf.1 > html1/kf.1.html

The appearance problem is layout.  The problem might be related to the
mdoc .Oo and .Oc macros.   The SYNOPSIS reads as follows:  

SYNOPSIS

kf [

−p port |
−−port=port ] [ 
−l login |
−−login=login ] [ 
−c ccache |
−−ccache=ccache ]
[−F | −-forwardable]
[−G | −-no-forwardable]
[−h | −-help]
[−−version] host ...

Those  tags aren't desirable.  I suspect they are artifacts of the
linefeeds in the input: 

.Nd securely forward tickets
.Sh SYNOPSIS
.Nm
.Oo
.Fl p Ar port |
.Fl Fl port Ns = Ns Ar port
.Oc

My other complaint is the way style is used.  grohtml offers no way
to include alternative CSS at the top of each page, and imposes quite a
bit of style on a per-tag basis, instead of by class.  That overrides
site-provided CSS, which is pretty much never what you want.  

IMO groff should produce valid, unstyled HTML by default, with classes
assigned to divisions, spans, and other tags named to reflect the input
macros that generated them.  Optionally it would generate either a
default  or one provided by the user.  

For example, at the top is