Hi,

John Gardner wrote on Wed, Sep 16, 2020 at 03:25:54AM +1000:
> Somebody wrote:

>> this *isn't* the same to me: attributes are for metadata
>> and tag contents are for data

> That's what I mean: there isn't always an obvious distinction
> between data and metadata.

To get back a bit closer to the topic of the list: the roff(7)
language and its macro sets actually face exactly the same challenge.

In very broad strokes, mostly, metadata is supposed to be passed
as arguments to requests.  For example: .ad .bp .ce .cp .di .ft .hy
.in .it .ll .ls .mk .ne .nr .pl .po .ps .rj .rm .rt .sp .ss .ta
.ti ...  Macro sets mostly follow: .HP .MT .PD .RS .RE .TP .UR ...

Data (or text, in the case of roff) is supposed to be contained in,
well, text lines.  Just like HTML elements can nest and contain
text, so can roff requests introduce blocks of text lines, some
to be closed by explicit end requests, some to automatically end
when a certain condition is met.  There are fewer examples:
.ce (again!) .di .rj ...
Again, macro sets mostly follow: .EX .HP .MT .RS .TP .UR ...
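
To illustrate, here is a minimal made-up sketch (my own example
input, not taken from any real document) of metadata in request
arguments and data on text lines:

    .\" line length: pure metadata, passed as an argument
    .ll 60n
    .\" the argument 2 is metadata; the block of text lines it
    .\" introduces ends automatically after two input lines
    .ce 2
    These two text lines
    are centered.
    This text line is filled normally again.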

But in even fewer cases, the paradigm is violated by allowing
arbitrary content in arguments, for example to avoid awkwardly long
syntax, like in .do .if .nop ...
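
A small sketch of what I mean, again with made-up input and assuming
groff (where .nop is available):

    .\" the condition n is metadata, the rest of the line is text
    .if n This sentence is only printed in nroff mode.
    .\" everything after .nop is processed like a text line
    .nop This is ordinary text passed as request arguments.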

I'm not aware of roff(7) requests turning text data into metadata
(well, arguably excepting those where the fact that roff(7)
is not only a markup language but a Turing-complete programming
language makes it unavoidable, e.g. .while .de ... and the like).


Then man(7) macros are very weird (which is one of the factors
making them harder to use) in so far as they provide the feature of
*optional* next line scope: for .B .I .SB .SM .SH .SS ...
you can provide text data as arguments (like data as attributes
in HTML) or on the next line (like element content in HTML).
Besides, man(7) has at least one case where text data *must* be
in an argument and even *precedes* metadata: .IP
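
For example, in the following made-up fragment (my own illustration),
the first two forms should render identically, and in the .IP line
the bullet is text data while the trailing 4n is metadata:

    .\" text data as arguments
    .B bold words
    .\" the same text data with next line scope
    .B
    bold words
    .\" text data (the tag) first, then metadata (the indentation)
    .IP \(bu 4n
    Text of the list item.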


The mdoc macros are somewhat different; most of them take text data
as arguments, so the whole concept is less pronounced there.
Of course, the concept of blocks having content also exists,
but there are four different kinds of blocks (with examples):

 * multi-line blocks
    - explicit (with start- and end macro)  .Bd/.Ed, .Bl/.El, ...
    - implicit (end automatically)          .Sh .Ss .It ...
 * in-line blocks
    - explicit (with start- and end macro)  .Do/.Dc, .Oo/.Oc, ...
    - implicit (end at the end of the line) .Dq,     .Op
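
A made-up fragment (my own illustration, not from a real manual
page) showing one block of each kind might look like this:

    .\" implicit multi-line block: ends at the next .Sh
    .Sh DESCRIPTION
    .\" explicit multi-line block: .Bl up to .El;
    .\" each .It ends implicitly at the next .It or at .El
    .Bl -tag -width Ds -compact
    .It Fl v
    Be verbose.
    .El
    .\" explicit in-line block: .Oo up to .Oc
    .Oo
    .Fl o Ar file
    .Oc
    .\" implicit in-line block: ends at the end of the line
    .Op Fl o Ar file

Note that the only arguments that are metadata here are the list
type -tag and the -width and -compact arguments of .Bl.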


To summarize, it is useful to consider the distinction between metadata
and data when designing a markup language and to mostly handle it
in some systematic way, but there are very different ways to design
such a system.  Also, experience teaches it is not possible to be
strict about it, and zealously striving for rigidity in this respect
is usually counter-productive, whereas totally disregarding the
concept and assigning syntax in a completely random manner (as for
example DocBook does it) isn't good either.

I would call the roff(7) language itself unusually well-designed
in this particular respect (though it has of course other quirks).
The man(7) macros feel below average, but that is somewhat mitigated
by the extreme smallness of the language.  The mdoc(7) macros maybe
under-emphasize the concept.  That is mostly inconsequential though
because mdoc(7) generally discourages adding any metadata to the
document whatsoever (except for the macros themselves, of
course).  In mdoc(7), you rarely need arguments: occasional -type
-width -offset -compact, that's all, basically.

Yours,
  Ingo
