Kevin Wolf <kw...@redhat.com> writes:

> Am 31.07.2020 um 11:01 hat Markus Armbruster geschrieben:
>> Kevin Wolf <kw...@redhat.com> writes:
>> 
>> > Am 30.07.2020 um 17:11 hat Eric Blake geschrieben:
>> >> > JSON is an exceptionally poor choice for a DSL, or even a configuration
>> >> > language.
>> >> > 
>> >> > Correcting our mistake involves a flag day and a re-learn.  We need to
>> >> > weigh costs against benefits.
>> >> > 
>> >> > The QAPI schema language has two layers:
>> >> > 
>> >> > * JSON, with a lexical and a syntactical sub-layer (both in parser.py)
>> 
>> An incompatible dialect of JSON with a doc comment language, actually.
>> 
>> The need to keep doc generation working could complicate replacing the
>> lower layer.
>
> Good point, we would have to keep the comment parser either way to be
> used on top of the YAML (or whatever) parser.
>
> Whatever parser we use would have to actually make comments available
> rather than immediately filtering them out. This might exist for most
> languages, but it will probably not be the most commonly used parser
> either (or at least it will not allow using a simple interface like
> json.loads() in Python).
>
>> >> > 
>> >> > * QAPI, with a context-free and a context-dependent sub-layer (in
>> >> >    expr.py and schema.py, respectively)
>> >> > 
>> >> > Replacing the JSON layer is possible as long as the replacement is
>> >> > sufficiently expressive (not a tall order).
>> >> 
>> >> I'm open to the idea, if we want to attempt it, and agree with the
>> >> assessment that it is not a tall order.
>> 
>> Careful, "not a tall order" is meant to apply to the "sufficiently
>> expressive" requirement for a replacement syntax.
>> 
>> On actually replacing the lower layer, I wrote "we need to weigh costs
>> against benefits."
>> 
>> > I'm not so sure about that. I mean, it certainly sounds doable if need
>> > be, but getting better syntax highlighting by default in some editors
>> > feels like a pretty weak reason to switch out the complete schema
>> > language.
>> >
>> > At first I was going to say "but if you don't have anything else to do
>> > with your time...", but it's actually not only your time, but the time
>> > of everyone who has development branches or downstream repositories and
>> > will suffer rather nasty merge conflicts. So this will likely end up
>> > having a non-negligible cost.
>> 
>> Yup.
>> 
>> > So is there more to it or are we really considering doing this just
>> > because editors can tell more easily what to do with a different syntax?
>> 
>> If memory serves, the following arguments have been raised:
>> 
>> 1. A chance to improve ergonomics for developers
>> 
>>    Pain points include
>> 
>>    - Confusion
>> 
>>      It claims to be JSON, but it's not.
>> 
>>    - Need to learn another syntax
>> 
>>      Sunk cost for old hands, but it's a valid point all the same.
>
> We use a similar (same?) form of "almost JSON" for QMP which will still
> exist. So we'd be moving from having to learn one (non-standard)
> language to two languages (one still non-standard and another one that
> is hopefully more standard).

QMP is JSON (no almost).  It accepts single-quoted strings as an
extension (but does not produce them).  This is permitted by the RFC.
We can get rid of the extension if it irks us.
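
A minimal sketch of what the extension means in practice, using Python's
json module purely as a stand-in for a strict parser (it is not QMP's
parser, and the message shown is just an illustrative command): the
double-quoted spelling parses, the single-quoted one is rejected.

```python
import json

# The same message spelled two ways. Only the first is strict JSON.
strict = '{"execute": "query-status"}'
relaxed = "{'execute': 'query-status'}"


def parses_as_strict_json(text):
    """Return True if Python's strict JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(parses_as_strict_json(strict))
print(parses_as_strict_json(relaxed))
```

A client that only ever sends double-quoted strings never depends on the
extension.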

There's also the QMP-generating language in the template string of
qobject_from_jsonf_nofail() & friends.  Helps keep C code readable.
This template language is definitely not JSON (not even almost).

>>    - Poor tool support
>> 
>>      JSON tools don't work.  Python tools do, but you may have to work
>>      around the issue of true, false.
>
> This is mostly the editor question this patch was about, right? Or are
> people trying to use more sophisticated tools on it?

I occasionally paste schema bits into Python (working around the
true/false issue), for quick ad hoc hackery, where hooking into the real
frontend would be overkill.

Anything more advanced than that would be a bad idea, in my opinion.
Use the real frontend.
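
For the record, the kind of quick hack I mean looks roughly like this.
It's a sketch, not a recommendation: the snippet is made up for the
example, and eval() is only acceptable because the input is text I
pasted myself.

```python
# Hypothetical schema-like snippet, invented for this example.
snippet = """
{ 'command': 'example-cmd',
  'data': { 'name': 'str' },
  'allow-oob': true,
  'gen': false }
"""

# Work around the true/false issue: the schema dialect spells the
# literals in lowercase, so map those names to Python's booleans.
schema_literals = {"true": True, "false": False}

# eval() runs arbitrary code; use it only on trusted, hand-pasted input.
expr = eval(snippet.strip(), schema_literals)

print(expr["command"], expr["allow-oob"], expr["gen"])
```

Single-quoted strings and '#' comments happen to be valid Python, which
is why this works at all.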

>>    - Excessive quoting
>> 
>>    - Verbosity
>> 
>>      When all you have is KEY: VALUE, defining things with multiple
>>      properties becomes verbose like
>> 
>>          'status': { 'type': 'DirtyBitmapStatus',
>>                      'features': [ 'deprecated' ] }
>> 
>>      We need syntactic sugar to keep verbosity in check for the most
>>      common cases.  More complexity.
>
> I don't think this is something any of the suggested languages would
> address.

Correct.

>>    - No trailing comma in arrays and objects
>> 
>>    - No way to split long strings for legibility
>> 
>>    - The doc comment language is poorly specified
>> 
>>    - Parse error reporting could be better (JSON part) / could hardly be
>>      worse (doc comment part)
>
> Has anyone looked into what error messages are like for the suggested
> alternatives? "error reporting could be better" is something that is
> true for a lot of software.
>
> The doc comment part is not going to change unless we get rid of
> comments and actually make documentation part of the objects themselves.
> This might not be very readable.

With decent string syntax, the doc comment blocks could be turned into
strings.  But then we'd parse the strings instead, so no real change.

> Or I should rather say, making the doc comment part change is possible,
> but would require the same changes as with our current lower layer
> language and parser.
>
>> 2. Not having to maintain our own code for the lower layer
>> 
>>    I consider this argument quite weak.  parser.py has some 400 SLOC.
>>    Writing and rewriting it is sunk cost.  Keeping it working has been
>>    cheap.  Keeping the glue for some off-the-shelf parser working isn't
>>    free, either.  No big savings to be had here, sorry.
>> 
>>    Almost half of parser.py is about doc comments, and it's the hairier
>>    part by far.  Peter has patches to drag the doc comment language
>>    closer to rST.  I don't remember whether they shrink parser.py.
>> 
>> 3. Make the schema more easily consumable by other programs
>> 
>>    Use of a "standard" syntax instead of our funky dialect of JSON means
>>    other programs can use an off-the-shelf parser instead of using or
>>    reimplementing parser.py.
>> 
>>    Valid point for programs that parse the lower layer, and no more, say
>>    for basic syntax highlighting.
>> 
>>    Pretty much irrelevant for programs that need to go beyond the lower
>>    layer.  Parsing the lower layer is the easy part.  The code dealing
>>    with the upper layer is much larger (expr.py and schema.py), and it
>>    actually changes as we add features to the schema language.
>>    Duplicating it would be a Bad Idea.  Reuse the existing frontend
>>    instead.
>
> Do other programs that go beyond syntax highlighting actually get to
> parse our QAPI schema definitions? Or would they rather deal with the
> return value of query-qmp-schema?

query-qmp-schema is for introspecting QMP.  It tells you what *this*
build of QEMU's QMP can do.  The schema tells you what QMP can do in
*any* build of this version of QEMU, and more.

To introspect QMP, process output of query-qmp-schema.

To work with the QAPI schema, interface with the frontend from
scripts/qapi/.
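
A rough sketch of the first case.  The reply below is hand-written for
illustration, not captured from a real QEMU, and the member names
("name", "meta-type") are given as I recall them from the introspection
schema, so check them against the docs before relying on this.

```python
import json

# Hand-written sample shaped like a query-qmp-schema reply; NOT real
# output.  Type names in real replies are opaque, hence "0" and "1".
reply = json.loads("""
{"return": [
  {"name": "query-status", "meta-type": "command",
   "arg-type": "0", "ret-type": "1"},
  {"name": "STOP", "meta-type": "event", "arg-type": "0"},
  {"name": "0", "meta-type": "object", "members": []}
]}
""")


def names_by_meta_type(schema_info, meta_type):
    """Return the sorted names of all entries with the given meta-type."""
    return sorted(entry["name"] for entry in schema_info
                  if entry["meta-type"] == meta_type)


commands = names_by_meta_type(reply["return"], "command")
events = names_by_meta_type(reply["return"], "event")
print(commands, events)
```

This answers "what can *this* binary do", nothing more.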

> Neither the QAPI schema nor a YAML file with the same structure is a
> standard approach to describe JSON documents. So even if we replace JSON
> in the lower layer, the whole thing (and as you say, the upper layer is
> the more interesting part) still stays non-standard in a way and more
> advanced tools can't be used with it.
>
> And of course, even if we did use something more standard like JSON
> Schema or whatever exists for YAML, we would still have to massively
> extend it because the QAPI schema contains much more information than
> just what would be needed to validate some input. We control all aspects
> of generated C code with it.

Yup.  "IDL for QMP" is just one aspect of QAPI.

