Re: qapi: [RFC] Doc comment convention for @arg: sections

2023-03-24 Thread Peter Maydell
On Fri, 24 Mar 2023 at 12:05, Markus Armbruster  wrote:
>
> Peter Maydell  writes:
>
> > On Thu, 23 Mar 2023 at 14:48, Markus Armbruster  wrote:
> >>
> >> The QAPI schema doc comment language provides special syntax for command
> >> and event arguments, struct and union members, alternate branches,
> >> enumeration values, and features: "sections" starting with @arg:.
> >>
> >> By convention, we format them like this:
> >>
> >> # @arg: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed
> >> #        do eiusmod tempor incididunt ut labore et dolore magna
> >> #        aliqua.
> >>
> >> Okay for names as short as "name", but we have much longer ones.  Their
> >> description gets squeezed against the right margin, like this:
> >>
> >> # @dirty-sync-missed-zero-copy: Number of times dirty RAM synchronization could
> >> #                               not avoid copying dirty pages. This is between
> >> #                               0 and @dirty-sync-count * @multifd-channels.
> >> #                               (since 7.1)
> >>
> >> The description is effectively 50 characters wide.  Easy enough to read,
> >> but awkward to write.
> >
> > The documentation also permits a second form:
> >
> >  # @argone:
> >  # This is a two line description
> >  # in the first style.
>
> Yes.  We use this in exactly one place: the guest agent's GuestOSInfo.
>
> > We tend to use that for type names, not field names, but IIRC
> > it's the same handling for both.
>
> Kind of.
>
> Definition documentation consists of "sections".
>
> The first section (called "body" in the code) starts with a line of the
> form
>
> # @NAME:
>
> Nothing may follow the colon.
>
> Ordinary text may follow.  Indentation is not stripped.

I guess this has changed since I added the rst stuff. Back
at the time (assuming my email comments at that time are
correct) this was all basically in the same code path, so
the "allow field descriptions that start on the following line"
falls out of having to handle "allow section definitions with
text that starts on the following line".

> Our current doc comment syntax has two layers:
>
> 1. The upper layer uses home-grown markup (= heading, @def: for special
>    definition lists, @ref to reference QAPI names, tag: special
>    sections).
>
> 2. The lower layer is reStructuredText.
>
> Parsing mirrors this:
>
> 1. parser.py parses the upper layer into an internal representation.
>
> 2. Sphinx extension qapidoc.py maps this internal representation to
>    Sphinx's.  It feeds its text parts to the rST parser, and splices its
>    output into the Sphinx IR.
>
> I'm wary of blurring the boundary between the two.  If we use rST syntax
> for argument sections, parser.py effectively parses a limited subset of
> rST.  Second-guessing the real rST parser doesn't feel wise to me.

I didn't mean to say "use rst syntax entirely throughout"
so much as "use the same rules for multi-line syntax that rst does,
not a subtly different set of rules". We could keep our @markup stuff.

thanks
-- PMM



Re: qapi: [RFC] Doc comment convention for @arg: sections

2023-03-24 Thread Markus Armbruster
Peter Maydell  writes:

> On Thu, 23 Mar 2023 at 14:48, Markus Armbruster  wrote:
>>
>> The QAPI schema doc comment language provides special syntax for command
>> and event arguments, struct and union members, alternate branches,
>> enumeration values, and features: "sections" starting with @arg:.
>>
>> By convention, we format them like this:
>>
>> # @arg: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed
>> #        do eiusmod tempor incididunt ut labore et dolore magna
>> #        aliqua.
>>
>> Okay for names as short as "name", but we have much longer ones.  Their
>> description gets squeezed against the right margin, like this:
>>
>> # @dirty-sync-missed-zero-copy: Number of times dirty RAM synchronization could
>> #                               not avoid copying dirty pages. This is between
>> #                               0 and @dirty-sync-count * @multifd-channels.
>> #                               (since 7.1)
>>
>> The description is effectively 50 characters wide.  Easy enough to read,
>> but awkward to write.
>
> The documentation also permits a second form:
>
>  # @argone:
>  # This is a two line description
>  # in the first style.

Yes.  We use this in exactly one place: the guest agent's GuestOSInfo.

> We tend to use that for type names, not field names, but IIRC
> it's the same handling for both.

Kind of.

Definition documentation consists of "sections".

The first section (called "body" in the code) starts with a line of the
form

# @NAME:

Nothing may follow the colon.

Ordinary text may follow.  Indentation is not stripped.

Next are "argument sections".  These start with another "# @NAME:" line,
but here text may follow the colon.  If it does, additional text needs
to be indented to the start of the text following the colon.  Required
indentation is stripped.

If text doesn't follow the colon, required indentation is zero, so
nothing is stripped.  So yes, it's the same handling as for the body
section, but it's different code that happens to behave the same in a
special case.
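
The stripping rule for argument sections described above can be sketched
as follows.  This is only an illustration of the rule as stated in this
message, not the code in parser.py; the function name
strip_arg_section() and the assumption that the leading "# " has already
been removed from every line are both hypothetical.

```python
import re


def strip_arg_section(lines):
    """Return (name, text_lines) for one argument section.

    `lines` are doc comment lines with the leading "# " already removed
    (an assumption made for this sketch).
    """
    match = re.match(r"@([^:\s]+):(\s*)(.*)$", lines[0])
    if match is None:
        raise ValueError("section must start with '@NAME:'")
    name, gap, first = match.group(1), match.group(2), match.group(3)
    # Text after the colon sets the required indentation.
    # No text after the colon means the required indentation is zero.
    indent = len("@" + name + ":" + gap) if first else 0
    text = [first] if first else []
    for line in lines[1:]:
        actual = len(line) - len(line.lstrip())
        if line.strip() and actual < indent:
            raise ValueError(f"line must be indented by {indent}: {line!r}")
        # Only the required indentation is stripped; extra indent survives.
        text.append(line[indent:])
    return name, text
```

With text after the colon, continuation lines lose exactly the required
indentation.  With nothing after the colon, the required indentation is
zero and the lines pass through unchanged, which is the special case
that happens to match the body section.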

> I'll re-mention here something I said back in 2020 when we were
> landing the rST-conversion of the qapi docs:
>
> There is rST syntax for field lists and option lists:
> https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#field-lists
> which are kind of similar to what we're doing with @foo: stuff
> markup, and which handle indentation like this:
>
> :Hello: This field has a short field name, so aligning the field
>         body with the first line is feasible.
>
> :Number-of-African-swallows-required-to-carry-a-coconut: It would
>     be very difficult to align the field body with the left edge
>     of the first line. It may even be preferable not to begin the
>     body on the same line as the marker.
>
> The differences to what we have today are:
>  * indent of lines 2+ is determined by the indent of line 2, not 1

I consider that an improvement.

>  * lines 2+ must be indented, so anything that currently uses
>    "no indent, start in column 0" would need indenting. (This would
>    be a lot of change to our current docs text.)

Actually, it's just GuestOSInfo's doc comment.

> At the time I think I basically went with "whatever requires the
> minimum amount of change to the existing doc comments and parser
> to get them into a shape that Sphinx is happy with". But if we're
> up for a wide reformatting then maybe it would be worth following
> the rST syntax?

Valid question!

Our current doc comment syntax has two layers:

1. The upper layer uses home-grown markup (= heading, @def: for special
   definition lists, @ref to reference QAPI names, tag: special
   sections).

2. The lower layer is reStructuredText.

Parsing mirrors this:

1. parser.py parses the upper layer into an internal representation.

2. Sphinx extension qapidoc.py maps this internal representation to
   Sphinx's.  It feeds its text parts to the rST parser, and splices its
   output into the Sphinx IR.

I'm wary of blurring the boundary between the two.  If we use rST syntax
for argument sections, parser.py effectively parses a limited subset of
rST.  Second-guessing the real rST parser doesn't feel wise to me.

A more radical approach would be to ditch the upper layer completely.
What would we lose?

parser.py ensures definition documentation

* adheres to a common structure,

* the things it documents exist, and

* documentation is complete[*].

Losing this feels bad to me.
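
To make the last two checks in the list concrete, here is a minimal
sketch.  check_member_docs() is a hypothetical helper written for this
message, not a function from parser.py, and it assumes member names and
documented names are already available as plain lists of strings.

```python
def check_member_docs(members, documented):
    """Compare a definition's members against its documented names.

    Returns (undocumented, nonexistent): members lacking documentation,
    and documented names that match no member.  Both lists are sorted.
    """
    members, documented = set(members), set(documented)
    undocumented = sorted(members - documented)
    nonexistent = sorted(documented - members)
    return undocumented, nonexistent
```

A non-empty first list means the documentation is incomplete; a
non-empty second list means it documents something that does not exist.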

Could we do it in qapidoc.py instead?

> PS: I see with a quick grep we also have misformatted field docs;
> check out for instance the HTML rendering of the bps_max etc
> fields of BlockDeviceInfo, which is weird because the second
> line of the field docs in the sources is wrongly indented.

Yes, that needs fixing.


[*] There is a hole marked TODO in the code.  Resolving it is trivial in
the generator code.  The problem is fixing up the schema.




Re: qapi: [RFC] Doc comment convention for @arg: sections

2023-03-24 Thread Markus Armbruster
Peter Maydell  writes:

> On Fri, 24 Mar 2023 at 12:05, Markus Armbruster  wrote:
>>
>> Peter Maydell  writes:
>>
>> > On Thu, 23 Mar 2023 at 14:48, Markus Armbruster  wrote:
>> >>
>> >> The QAPI schema doc comment language provides special syntax for command
>> >> and event arguments, struct and union members, alternate branches,
>> >> enumeration values, and features: "sections" starting with @arg:.
>> >>
>> >> By convention, we format them like this:
>> >>
>> >> # @arg: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed
>> >> #        do eiusmod tempor incididunt ut labore et dolore magna
>> >> #        aliqua.
>> >>
>> >> Okay for names as short as "name", but we have much longer ones.  Their
>> >> description gets squeezed against the right margin, like this:
>> >>
>> >> # @dirty-sync-missed-zero-copy: Number of times dirty RAM synchronization could
>> >> #                               not avoid copying dirty pages. This is between
>> >> #                               0 and @dirty-sync-count * @multifd-channels.
>> >> #                               (since 7.1)
>> >>
>> >> The description is effectively 50 characters wide.  Easy enough to read,
>> >> but awkward to write.
>> >
>> > The documentation also permits a second form:
>> >
>> >  # @argone:
>> >  # This is a two line description
>> >  # in the first style.
>>
>> Yes.  We use this in exactly one place: the guest agent's GuestOSInfo.
>>
>> > We tend to use that for type names, not field names, but IIRC
>> > it's the same handling for both.
>>
>> Kind of.
>>
>> Definition documentation consists of "sections".
>>
>> The first section (called "body" in the code) starts with a line of the
>> form
>>
>> # @NAME:
>>
>> Nothing may follow the colon.
>>
>> Ordinary text may follow.  Indentation is not stripped.
>
> I guess this has changed since I added the rst stuff. Back
> at the time (assuming my email comments at that time are
> correct) this was all basically in the same code path, so
> the "allow field descriptions that start on the following line"
> falls out of having to handle "allow section definitions with
> text that starts on the following line".

Could be the same path in qapidoc.py, but has always been separate in
parser.py, as far as I remember.

>> Our current doc comment syntax has two layers:
>>
>> 1. The upper layer uses home-grown markup (= heading, @def: for special
>>    definition lists, @ref to reference QAPI names, tag: special
>>    sections).
>>
>> 2. The lower layer is reStructuredText.
>>
>> Parsing mirrors this:
>>
>> 1. parser.py parses the upper layer into an internal representation.
>>
>> 2. Sphinx extension qapidoc.py maps this internal representation to
>>    Sphinx's.  It feeds its text parts to the rST parser, and splices its
>>    output into the Sphinx IR.
>>
>> I'm wary of blurring the boundary between the two.  If we use rST syntax
>> for argument sections, parser.py effectively parses a limited subset of
>> rST.  Second-guessing the real rST parser doesn't feel wise to me.
>
> I didn't mean to say "use rst syntax entirely throughout"
> so much as "use the same rules for multi-line syntax that rst does,
> not a subtly different set of rules". We could keep our @markup stuff.

Fair point.




Re: qapi: [RFC] Doc comment convention for @arg: sections

2023-03-23 Thread Peter Maydell
On Thu, 23 Mar 2023 at 14:48, Markus Armbruster  wrote:
>
> The QAPI schema doc comment language provides special syntax for command
> and event arguments, struct and union members, alternate branches,
> enumeration values, and features: "sections" starting with @arg:.
>
> By convention, we format them like this:
>
> # @arg: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed
> #        do eiusmod tempor incididunt ut labore et dolore magna
> #        aliqua.
>
> Okay for names as short as "name", but we have much longer ones.  Their
> description gets squeezed against the right margin, like this:
>
> # @dirty-sync-missed-zero-copy: Number of times dirty RAM synchronization could
> #                               not avoid copying dirty pages. This is between
> #                               0 and @dirty-sync-count * @multifd-channels.
> #                               (since 7.1)
>
> The description is effectively 50 characters wide.  Easy enough to read,
> but awkward to write.

The documentation also permits a second form:

 # @argone:
 # This is a two line description
 # in the first style.

We tend to use that for type names, not field names, but IIRC
it's the same handling for both.

I'll re-mention here something I said back in 2020 when we were
landing the rST-conversion of the qapi docs:

There is rST syntax for field lists and option lists:
https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#field-lists
which are kind of similar to what we're doing with @foo: stuff
markup, and which handle indentation like this:

:Hello: This field has a short field name, so aligning the field
        body with the first line is feasible.

:Number-of-African-swallows-required-to-carry-a-coconut: It would
    be very difficult to align the field body with the left edge
    of the first line. It may even be preferable not to begin the
    body on the same line as the marker.

The differences to what we have today are:
 * indent of lines 2+ is determined by the indent of line 2, not 1
 * lines 2+ must be indented, so anything that currently uses
   "no indent, start in column 0" would need indenting. (This would
   be a lot of change to our current docs text.)
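
As a rough sketch of those two rules (and only of those two rules as
stated here; this is not the docutils algorithm, and strip_field_body()
is a hypothetical name), the indent is read off the first continuation
line and then required of all the others:

```python
def strip_field_body(first, rest):
    """Strip continuation indent as set by the first continuation line.

    `first` is the text after the field marker on line 1 (may be empty).
    `rest` holds lines 2+ exactly as written, leading spaces included.
    """
    body = [first] if first else []
    nonblank = [line for line in rest if line.strip()]
    if not nonblank:
        return body
    # Line 2 decides the indent, regardless of where line 1's text starts.
    indent = len(nonblank[0]) - len(nonblank[0].lstrip())
    if indent == 0:
        raise ValueError("lines 2+ must be indented")
    for line in rest:
        actual = len(line) - len(line.lstrip())
        if line.strip() and actual < indent:
            raise ValueError(f"expected indent of {indent}: {line!r}")
        body.append(line[indent:])
    return body
```

Both the short-name and the long-name example above go through the same
code path, because the position of the text on line 1 never enters into
it.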

At the time I think I basically went with "whatever requires the
minimum amount of change to the existing doc comments and parser
to get them into a shape that Sphinx is happy with". But if we're
up for a wide reformatting then maybe it would be worth following
the rST syntax?

PS: I see with a quick grep we also have misformatted field docs;
check out for instance the HTML rendering of the bps_max etc
fields of BlockDeviceInfo, which is weird because the second
line of the field docs in the sources is wrongly indented.

-- PMM



qapi: [RFC] Doc comment convention for @arg: sections

2023-03-23 Thread Markus Armbruster
The QAPI schema doc comment language provides special syntax for command
and event arguments, struct and union members, alternate branches,
enumeration values, and features: "sections" starting with @arg:.

By convention, we format them like this:

# @arg: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed
#        do eiusmod tempor incididunt ut labore et dolore magna
#        aliqua.

Okay for names as short as "name", but we have much longer ones.  Their
description gets squeezed against the right margin, like this:

# @dirty-sync-missed-zero-copy: Number of times dirty RAM synchronization could
#                               not avoid copying dirty pages. This is between
#                               0 and @dirty-sync-count * @multifd-channels.
#                               (since 7.1)

The description is effectively 50 characters wide.  Easy enough to read,
but awkward to write.

The awkward squeeze against the right margin makes people go beyond it,
which produces two undesirables: arguments about style, and descriptions
that are unnecessarily hard to read, like this one:

# @postcopy-vcpu-blocktime: list of the postcopy blocktime per vCPU.  This is
#                           only present when the postcopy-blocktime migration capability
#                           is enabled. (Since 3.0)

Ugly, too.

I'd like to change the convention to

# @arg:
#     Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed
#     do eiusmod tempor incididunt ut labore et dolore magna aliqua.

# @dirty-sync-missed-zero-copy:
#     Number of times dirty RAM synchronization could not avoid
#     copying dirty pages.  This is between 0 and @dirty-sync-count
#     * @multifd-channels.  (since 7.1)

# @postcopy-vcpu-blocktime:
#     list of the postcopy blocktime per vCPU.  This is only present
#     when the postcopy-blocktime migration capability is
#     enabled.  (Since 3.0)
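
For a sense of what a bulk conversion to this form might involve, here
is a minimal sketch using Python's textwrap module.  reformat_arg() is a
hypothetical helper, and both the 70-column width and the four-column
continuation indent are assumptions chosen for illustration.  It also
collapses runs of whitespace, so two spaces after a period are not
preserved.

```python
import textwrap


def reformat_arg(name, description, width=70, indent=4):
    """Format one argument section with the description on its own lines."""
    prefix = "#" + " " * (indent + 1)
    body = textwrap.fill(
        " ".join(description.split()),  # collapse whitespace before wrapping
        width=width,
        initial_indent=prefix,
        subsequent_indent=prefix,
        break_on_hyphens=False,   # keep names like @dirty-sync-count whole
        break_long_words=False,
    )
    return f"# @{name}:\n{body}"
```

Since the name no longer shares a line with the text, the usable width
for the description is the same for every argument, however long its
name.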

We may want to keep short descriptions on the same line, like

# @multifd-bytes: The number of bytes sent through multifd (since 3.0)

We could instead do

# @arg: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed
#     do eiusmod tempor incididunt ut labore et dolore magna aliqua.

Another option would be

# @arg:
# Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do
# eiusmod tempor incididunt ut labore et dolore magna aliqua.

or even

# @arg: Lorem ipsum dolor sit amet, consectetur adipiscing elit,
# sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

but I find these less readable.

A bulk reformatting of doc comments will mess up git-blame, which will
be kind of painful[*].  But so is the status quo.

Thoughts?



[*] Yes, I'm aware of (and grateful for) --ignore-rev & friends.  It's
still kind of painful.