Re: Notes on Generating Python signatures for QMP RPCs

Markus Armbruster Thu, 03 Feb 2022 02:07:11 -0800

John Snow <js...@redhat.com> writes:

> On Thu, Jan 27, 2022 at 9:03 AM Markus Armbruster <arm...@redhat.com> wrote:
>>
>> John Snow <js...@redhat.com> writes:
>>
>> > Hiya, I was experimenting with $subject and ran into a few points of
>> > interest. This is basically an informal status report from me. I've
>> > CC'd some of the usual suspects for people who care about SDKs and API
>> > design and such.
>> >
>> > This is just a list of some observations I had, so not everything
>> > below is a question or an action item. Just sharing some notes.
>> >
>> > (0) This experiment concerned generating signatures based on
>> > introspection data, dynamically at runtime. In this environment type
>> > hints are not required, as they are not actually used at runtime.
>> > However, I added them anyway as an exercise for dynamic documentation
>> > purposes. (i.e. `help(auto_generated_function)` showing type hints can
>> > still be useful -- especially without access to QAPI doc blocks.)
>> > Determining type information is also necessary for generating the
>> > marshaling/unmarshaling functions to communicate with the server.
>> >
>> > (1) QAPI types the return of many commands as an empty object. That's
>> > literally indeed what happens on the wire, and it makes sense in that
>> > if these commands were ever to return anything, it is a "compatible
>> > evolution" to include new fields in such an object. In Python, this
>> > does not make much sense, though; as this is somewhat hard to
>> > annotate:
>> >
>> > async def stop() -> Literal[{}]: ...
>> >
>> > The more pythonic signature is:
>> >
>> > async def stop() -> None: ...
>> >
>> > I feel like it's spiritually equivalent, but I am aware it is a
>> > distinct choice that is being made. It could theoretically interfere
>> > with a choice made in QAPI later to explicitly return Null. I don't
>> > think we'd do that, but it's still a choice of abstraction that
>> > reduces the resolution of distinct return signatures.
>>
>> Having functions take any number of arguments and return one result is
>> traditional in programming, so we hardly notice how odd the asymmetry
>> actually is.
>>
>> QAPI commands are symmetric: they take any number of (named) arguments,
>> and return any number of (named) results.  Can also be viewed as taking
>> one exactly one argument (always an object) and returning exactly one
>> value (always an object).
>>
>> Except some QAPI commands actually return return a list of objects, and
>> some return a single, unnamed value.  The former are intentional, the
>> latter are screwups from the early days.
>>
>> > (1.5) Do we have a formal definition for what we consider to be a
>> > "compatible evolution" of the schema?
>>
>> Section "Compatibility considerations" in docs/devel/qapi-code-gen.rst
>> tries.  I wouldn't call it "a formal definition".
>>
>> >                                       I've got a fairly good idea, but
>> > I am not sure how it's enforced. Is it just Markus being very
>> > thorough?
>>
>> Yup, with help from other reviewers such as Eric Blake, and enabled by
>> the relative simplicity of the rules.
>>
>> >           If we add more languages to the generator, we probably can't
>> > burden Markus with knowing how to protect the compatibility of every
>> > generator.
>>
>> qapi-code-gen.rst applies the compatibility razor to the wire protocol
>> only.  It doesn't concern itself with language bindings at all.
>>
>
> The generator code certainly does, though.


Correct.  Since the generator generates C bindings, it must care about
language bindings (one).

What it doesn't concern itself so far with is *stability* of this
binding: it's just for use within QEMU.

>                                            For example, "C name"
> collisions are guarded against by the generator -- we do not let it
> get as far as blowing up clang.

"Guards against name collisions" feels like an overly charitable
description.  "Makes an effort" is closer to the truth.

>> We've only ever had C bindings.  Designed for internal use only, where
>> compatible evolution is pretty much irrelevant.
>>
>> We've talked about bindings for external use, both long ago (C), and
>> more recently (Rust, Python, ...).
>>
>> [*] Compatible evolution of language bindings for QAPI is a complex
>> topic.  Not today.
>>
>
> OK, agreed. It's a big topic. I think it's not fully possible to cover
> all cases in all places, but I do seem to recall Marc-Andre having
> some opinions about a few things we could do to staple it down just a
> little bit, and maybe that's enough. I'd like to talk about it
> eventually. Marc-Andre could cover rust, I could cover Python, and
> Victor/Andrea could help us cover golang.

Yes, it's something we should talk about at some point.

>> >            We might need more assertions for invariants in the
>> > generator itself ... but to protect "evolution", we need points of
>> > reference to test against. Do we have anything for this? Do we need
>> > one? Should I write a test?
>>
>> So far, we don't have software to reason about QAPI schema *changes*.
>> The generator checks the current QAPI schema, it doesn't consider the
>> change from the previous version.
>>
>> A program to analyze a schema change (the difference between two schema
>> versions) could be useful.  It could also be difficult to write.
>>
>
> Yeh. Something not worth doing until we chat about what compatible
> evolution looks like for other language bindings, because otherwise we
> don't know what we're designing for. Later.
>
>> > (2) There are five commands that are exempted from returning an
>> > object. qom-get is one.
>>
>> This is pragma 'command-returns-exceptions'.  Like its buddy pragmas, it
>> exists because we started enforcing the rules only after our rule
>> breaking had calcified in the external interface.
>>
>> >                         However, what I didn't really explicitly
>> > realize is that this doesn't mean that only five commands don't return
>> > an object -- we also actually allow for a list of objects, which
>> > *many* commands use. There's no technical issue here, just an
>> > observation. It is no problem at all to annotate Python commands as
>> > "-> SomeReturnType" or "-> List[SomeDifferentReturnType]" or even "->
>> > str:" as needed.
>> >
>> > (3) Over the wire, the order of arguments to QMP commands is not
>> > specified.
>>
>> Correct.
>>
>> >            In generating commands procedurally from introspection
>> > data, I am made aware that there are several commands in which
>> > "optional" arguments precede "required" arguments. This means that
>> > function generation in Python cannot match the stated order 1:1.
>> >
>> > That's not a crisis per se. For generating functions, we can use a
>> > stable sort to bubble-up the required arguments, leaving the optional
>> > ones trailing. However, it does mean that depending on how the QAPI
>> > schema is modified in the future, the argument order may change
>> > between versions of a generative SDK. I'd like to avoid that, if I
>> > can.
>> >
>> > One trick I have available to me in Python is the ability to stipulate
>> > that all (QAPI) "optional" arguments are keyword-only. This means that
>> > Optional parameters can be re-ordered arbitrarily without any breakage
>> > in the generative python API. The only remaining concern is if the
>> > *mandatory* arguments are re-ordered.
>> >
>> > (In fact, I could stipulate that ALL arguments in Python bindings are
>> > keyword-only, but I think that's going overboard and hurts usability
>> > and readability.)
>>
>> This would match the wire protocol, which only uses keyword, never
>> positional.  But your usability argument is valid.
>>
>> > Marc-Andre has mentioned this before, but it might be nice to actually
>> > specify a canonical ordering of arguments for languages that require
>> > such things, and to make sure that we do not break this ordering
>> > without good reason.
>> >
>> > (Of course, SDK stability is not fully possible, and if this
>> > functionality is desired, then it's time to use libvirt, hint hint
>> > hint! However, we can avoid pointless churn in generated code and make
>> > it easier to use and experiment with.)
>>
>> See [*] above.
>>
>> > (4) StrOrNull is a tricky design problem.
>> >
>> > In Python, generally, omitted arguments are typed like this:
>> > async def example_command(arg: Optional[int] = None) -> None: ...
>> >
>> > Most Python programmers would recognize that signature as meaning that
>> > they can omit 'arg' and some hopefully good default will be chosen.
>> > However, in QAPI we do have the case where "omitted" is distinct from
>> > "explicitly provided null". This is ... a bit challenging to convey
>> > semantically. Python does not offer the ability to tell "what kind of
>> > None" it received; i.e. unlike our generated QMP marshalling
>> > functions, we do not have a "has_arg" boolean we can inspect.
>>
>> Yes.
>>
>> The QAPI schema language's semantics of optional are at odds with
>> Python's.
>>
>> In Python, an absent argument defaults to some value, either one the
>> programmer provided, or None.  This makes semantics of "absent" obvious
>> from the signature.
>>
>
> "None" is never an assumed default in Python, it's always explicitly
> assigned as part of the signature. It is just by principle of least
> surprise that we all agree to use "None" as that default.

You're right.

>> In QAPI, an absent value is distinct from any present value.  Semantics
>> of "absent" depend the function's *code*, not just its signature.  I
>> consider this a design mistake.
>>
>
> Because of the above, Python and C are actually similar: the true
> default depends on code, not the signature.

Do you mean QAPI and C?

>> In many (most?) cases, code treats absent just like one specific present
>> value.  We could lift this from code to the schema: permit specifying a
>> default value in the schema, default absent to this value, don't
>> generate has_FOO.
>>
>
> The problem is when the default is something dynamically determined;
> we couldn't specify a static default. Maybe you regard that as a
> feature! but it's probably a lot of work to turn back the hands of
> time there now, so...

I suspect dynamic defaults are fairly rare.

Even when you "only" have Python's "absent defaults to a value, and you
can't distinguish absent from present with that value", you can still
code up dynamic defaults.  It takes a bit more effort than a static
default, but I *like* that.  When making things more complex takes more
effort, the chances of complex getting chosen only when it's genuinely
useful improve :)

>> The remainder could then be handled as the ugly wart it is: have both
>> has_FOO and FOO in Python, just like in C.  Or handle it using one of
>> the ideas you describe next.
>
> There's no way to do that automatically that I know of... without
> resorting to type signatures that utilize **kwargs, which are
> untypable and don't reveal their arguments upon introspection. I think
> it's just not possible.
>
>>
>> > So how do we write a function signature that conveys the difference
>> > between "omitted" and "explicitly nulled" ...?
>> >
>> > One common trick in Python is to create a new sentinel singleton, and
>> > name it something like "Default" or "Unspecified" or "Undefined". Many
>> > programmers use the ellipsis `...` value for this purpose. Then, we
>> > can check if a value was omitted (`...`) or explicitly provided
>> > (`None`). It is very unlikely that these sentinels would organically
>> > collide with user-provided values (Unless they were trying to
>> > explicitly invoke default behavior.)
>> >
>> > However, `...` isn't supported as a type and using it as the default
>> > value invalidates the typing of the field. As far as I can tell, it
>> > CANNOT be typed. We could create our own sentinel, but IMO, this
>> > creates a much less readable signature:
>> >
>> > async def example_command(arg: Union[int, qmp.Default] = qmp.Default)
>> > -> None: ...
>> >
>> > This probably doesn't communicate "This parameter is actually
>> > optional" to a casual Python programmer, so I think it's a dead end.
>> >
>> > The last thing I can think of here is to instead introduce a special
>> > sentinel that represents the explicit Null instead. We could use a
>> > special Null() type that means "Explicitly send a null over the wire."
>> >
>> > This value comes up fairly infrequently, so most signatures will
>> > appear "Pythonic" and the jankiness will be confined to the few
>> > commands that require it, e.g.
>> >
>> > async def example_command(arg: Optional[Union[int, Null]] = None) -> None: 
>> > ...
>> >
>> > The above would imply an optional argument that can be omitted, can be
>> > provided with an int, or can be provided with an explicit Null. I
>> > think this is a good compromise.
>>
>> More so, I think, if we manage to substantially reduce "absent is
>> distinct from any present value".
>>
>
> It only seems to show up a handful of times so far, it's not very
> widespread as-is. Keeping it from spreading would be good, I assume.
> Sweeping the dust into this particular corner seems like the correct
> way to minimize API weirdness.
>
>> > (5) Generating functions from introspection data is difficult because
>> > all of the structures are anonymous.
>>
>> Introspection is about the wire format.  Type names are not part of it,
>> and that's why we hide them in introspection.
>>
>
> Yep. I'm not asking for that to change, necessarily. Just saying "Hey,
> I tried to autogenerate an SDK based on Introspection data and YOU
> WON'T BELIEVE WHAT HAPPENED NEXT".
>
> Except you'd believe it, because you designed it. For me, trying to do
> it and seeing the result was the fastest way to truly come to terms
> with how introspection data worked. Some of the lessons learned from
> this exercise also apply to the task of generating function signatures
> in general, too.
>
>> >                                      The base type for most objects
>> > becomes `Dict[str, Any]` but this isn't very descriptive. For Python
>> > 3.8+, we can do a little better and use `Dict[Literal["name", "node"],
>> > Any]` to help suggest what keys are valid, but we don't have access to
>> > an in-line definition that pairs key names with values.
>> >
>> > Python 3.8+ would allow us the use of TypedDict, but those have to be
>> > generated separately ... AND we still don't have a name for them, so
>> > it'd be a little hogwash to have a function like:
>> >
>> > async def some_command(arg: Anon321) -> None: ...
>> >
>> > That doesn't really tell me, the human, much of anything. The best
>> > that could perhaps be done is to create type aliases based on the name
>> > of the argument it is the data type for, like "ArgObject". It's a bit
>> > messy. For now, I've just stuck with the boring `Dict[Literal[...],
>> > Any]` definition.
>>
>> I'm afraid you're trying to press introspection into a role it's not
>> designed for.
>
> Yes.
>
>>
>> Why not generate from the QAPI schema?
>
> I was hoping you'd say that! =)
> GOOD NEWS: I have also done that!

Well played, sir!

> I tried it both ways to find the strengths and weaknesses of each
> approach so I would have an informed basis for upstreaming the code
> for either. The dynamic runtime method has its uses in that it will
> always match the host API exactly. It cannot be used for static type
> analysis of a script, though (The code doesn't exist until you connect
> to the server!). The statically generated version has a lot more
> information to work with and can be used for static type analysis, but
> includes some parameters and functions that may not exist on your
> server's instance.
>
> The two approaches aren't even necessarily mutually exclusive.
>
>>
>> > (6) Dealing with variants is hard. I didn't get a working
>>
>> Do you mean alternates?
>>
>
> No, alternates at the python signature level are easy: Union[BranchA, 
> BranchB].
>
> In my code, I stubbed out the case where 'meta-type': 'object' has a
> 'variants' key. There's no blocker here, I just didn't get around to
> un-stubbing it. I need to refactor a bit to make this work -- this
> requires some flattening in the client code, unlike "normal" objects
> which come pre-flattened over the wire. Just remarking that it appears
> to be the most complex case for digesting introspection information.
>
>> > implementation for them within one day of hacking, so I stubbed them
>> > out. There's no major blocker here, just reporting that I still have
>> > to finish this part of the experiment. I'm pretty happy that Markus
>> > simplified the union types we have, though. To my knowledge, I got
>> > everything else working perfectly.
>> >
>> > (7) I have no idea what to do about functions that "may not return".
>> > The QGA stuff in particular, I believe, is prone to some weirdness
>> > that violates the core principles of the QMP spec.
>>
>> Yes.
>>
>> docs/interop/qmp-spec.txt dictates a command sends either a success or
>> an error response.  Makes sense.
>>
>> QGA has a few commands that shut down the guest.  How could such a
>> command send a success response?  If it sends it before it initiates
>> shutdown, response transmission races with shutdown.  The easy way out
>> is violating qmp-spec.txt.  Thus, 'success-response': false.  Just for
>> QGA.
>>
>
> Oh, whoops, I already have the information we need. O:-)
> (Assuming that 'success-response' is visible in the introspection data, 
> anyway.

qapi/introspect.json:

    ##
    # @SchemaInfoCommand:
    [...]
    # TODO: @success-response (currently irrelevant, because it's QGA, not QMP)
    #
    # Since: 2.5
    ##
    { 'struct': 'SchemaInfoCommand',
      'data': { 'arg-type': 'str', 'ret-type': 'str',
                '*allow-oob': 'bool' } }

The TODO neglects to spell out "and QGA doesn't support introspection so
far".

>> >                                                    Maybe we can add a
>> > "NORETURN" feature flag to those commands in the schema so that
>> > clients can be aware of which commands may break the expectation of
>> > always getting an RPC reply?
>>
>> For a normal command, bindings marshal, send the command, receive the
>> response, unmarshal.
>>
>> For a 'success-response': false command, they only marshal and send.
>>
>> > (8) Thanks for reading. I'm still buried under my holiday inbox, but I
>> > am trying like hell to catch up on everything. I know I missed a few
>> > calls in which API design was discussed, and I apologize for that.
>> > Please send me invitations using "to: js...@redhat.com" to ensure I do
>> > not miss them. I am also frantically trying to clean up the Async QMP
>> > project I was working on to have more mental bandwidth for other
>> > tasks, but it's dragging on a bit longer than I had anticipated.
>> > Please accept my apologies for being somewhat reclusive lately.
>> >
>> > I'll (try to) send a status overview of the various projects I'm
>> > working on later to help set priority and discuss with the community
>> > what my goals are and what I'd like to do. I have an awful lot of code
>> > I've built up in local branches that I would like to share, but I'm
>> > already sending code upstream as fast as I can, so maybe I'll just do
>> > an overview at some point and point to unfinished code/experiments so
>> > it's at least not completely unwitnessed work.
>> >
>> > I hope 2022 is treating you all well,
>>
>> Happy new year to you, too!
>>
>
> How is it already February?

Crazy, isn't it?

> Thanks for taking a look at my notes!

You're welcome!

Re: Notes on Generating Python signatures for QMP RPCs

Reply via email to