schema: allow resolve_type to be used for built-in types

Markus Armbruster Mon, 22 Jan 2024 05:13:28 -0800

John Snow <js...@redhat.com> writes:

> On Tue, Jan 16, 2024 at 6:09 AM Markus Armbruster <arm...@redhat.com> wrote:
>>
>> John Snow <js...@redhat.com> writes:
>>
>> > allow resolve_type to be used for both built-in and user-specified
>> > type definitions. In the event that the type cannot be resolved, assert
>> > that 'info' and 'what' were both provided in order to create a usable
>> > QAPISemError.
>> >
>> > In practice, 'info' will only be None for built-in definitions, which
>> > *should not fail* type lookup.
>> >
>> > As a convenience, allow the 'what' and 'info' parameters to be elided
>> > entirely so that it can be used as a can-not-fail version of
>> > lookup_type.
>>
>> The convenience remains unused until the next patch.  It should be added
>> there.
>
> Okie-ducky.
>
>>
>> > Note: there are only three callsites to resolve_type at present where
>> > "info" is perceived to be possibly None:
>> >
>> >     1) QAPISchemaArrayType.check()
>> >     2) QAPISchemaObjectTypeMember.check()
>> >     3) QAPISchemaEvent.check()
>> >
>> >     Of those three, only the first actually ever passes None;
>>
>> Yes.  More below.
>
> Scary...


I know...

>> >                                                               the other two
>> >     are limited by their base class initializers which accept info=None, but
>>
>> They do?
>
> In the case of QAPISchemaObjectTypeMember, the parent class
> QAPISchemaMember allows initialization with info=None. I can't fully
> trace all of the callsites, but one of them at least is in types.py:
>
>>     enum_members = members + [QAPISchemaEnumMember('_MAX', None)]

I see.

We may want to do the _MAX thingy differently.  Not now.

> which necessitates, for now, info-less QAPISchemaEnumMember, which
> necessitates info-less QAPISchemaMember. There are others, etc.

Overriding an inherited attribute of type Optional[T] so it's
non-optional T makes mypy unhappy?

>> >     neither actually use it in practice.
>> >
>> > Signed-off-by: John Snow <js...@redhat.com>
>>
>> Hmm.
>
> Scary.
>
>>
>> We look up types by name in two ways:
>>
>> 1. Failure is a semantic error
>>
>>    Use .resolve_type(), passing real @info and @what.
>>
>>    Users:
>>
>>    * QAPISchemaArrayType.check() resolving the element type
>>
>>      Fine print: when the array type is built-in, we pass None @info and
>>      @what.  The built-in array type's element type must exist for
>>      .resolve_type() to work.  This commit changes .resolve_type() to
>>      assert it does.
>>
>>    * QAPISchemaObjectType.check() resolving the base type
>>
>>    * QAPISchemaObjectTypeMember.check() resolving the member type
>>
>>    * QAPISchemaCommand.check() resolving argument type (if named) and
>>      return type (which is always named).
>>
>>    * QAPISchemaEvent.check() resolving argument type (if named).
>>
>>    Note all users are in .check() methods.  That's where type names get
>>    resolved.
>>
>> 2. Handle failure
>>
>>    Use .lookup_type(), which returns None when the named type doesn't
>>    exist.
>>
>>    Users:
>>
>>    * QAPISchemaVariants.check(), to look up the base type containing the
>>      tag member for error reporting purposes.  Failure would be a
>>      programming error.
>>
>>    * .resolve_type(), which handles failure as semantic error
>>
>>    * ._make_array_type(), which uses it as "type exists already"
>>      predicate.
>>
>>    * QAPISchemaGenIntrospectVisitor._use_type(), to look up certain
>>      built-in types.  Failure would be a programming error.
>>
>> The next commit switches the uses where failure would be a programming
>> error from .lookup_type() to .resolve_type() without @info and @what, so
>> failure trips its assertion.  I don't like it, because it overloads
>> .resolve_type() to serve two rather different use cases:
>>
>> 1. Failure is a semantic error; pass @info and @what
>>
>> 2. Failure is a programming error; don't pass @info and @what
>>
>> The odd one out is of course QAPISchemaArrayType.check(), which wants to
>> use 1. for the user's types and 2. for built-in types.  Let's ignore it
>> for a second.
>
> "Let's ignore what motivated this patch" aww...

Just for a second, I swear!

>> I prefer to do 2. like typ = .lookup_type(); assert typ.  We can factor
>> this out into its own helper if that helps (pardon the pun).
>>
>> Back to QAPISchemaArrayType.check().  Its need to resolve built-in
>> element types, which have no info, necessitates .resolve_type() taking
>> Optional[QAPISourceInfo].  This might bother you.  It doesn't bother me,
>> unless it leads to mypy complications I can't see.
>
> Well, with this patch I allowed it to take Optional[QAPISourceInfo] -
> just keep in mind that QAPISemError *requires* an info object, even
> though the typing there is also Optional[QAPISourceInfo] ... It will
> assert that info is present in __str__.
>
> Actually, I'd love to change that too - and make it fully required -
> but since built-in types have no info, there's too many places I'd
> need to change to enforce this as a static type.
>
> Still.

Invariant: no error reports for built-in types.

Checked since forever by asserting info is not None, exploiting the fact
that info is None exactly for built-in types.

This makes info: Optional[QAPISourceInfo] by design.

Works.

Specializing it to just QAPISourceInfo moves the assertion check from
run time to compile time.  Might give a nice feeling, but I don't think
it's practical everywhere, and it doesn't really matter anyway.
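
To illustrate the invariant, here is a minimal sketch using stand-in
classes, not the actual code in scripts/qapi.  SourceInfo, SemError and
their fields are hypothetical names made up for this example; only the
idea (Optional info, assertion when the error gets formatted) is taken
from the discussion above.

```python
from typing import Optional


class SourceInfo:
    """Hypothetical stand-in for QAPISourceInfo."""

    def __init__(self, fname: str, line: int) -> None:
        self.fname = fname
        self.line = line

    def loc(self) -> str:
        return f"{self.fname}:{self.line}"


class SemError(Exception):
    """Hypothetical stand-in for QAPISemError."""

    def __init__(self, info: Optional[SourceInfo], msg: str) -> None:
        super().__init__(msg)
        self.info = info
        self.msg = msg

    def __str__(self) -> str:
        # Invariant: no error reports for built-in types, and info is
        # None exactly for built-in types.  Checked at run time.
        assert self.info is not None
        return f"{self.info.loc()}: {self.msg}"


user_err = SemError(SourceInfo("schema.json", 12), "unknown type 'Foo'")
print(user_err)
```

Formatting an error constructed with info=None trips the assertion,
which is the run-time check described above.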

Using a special value of QAPISourceInfo instead of None would also get
rid of the Optional, along with the potential of checking at compile
time.  Good trade *if* it simplifies the code.  See also the very end of
my reply.
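
For comparison, a rough sketch of the special-value idea.  BUILTIN_INFO,
SourceInfo and format_error() are hypothetical names, not anything that
exists in scripts/qapi.  Note the invariant is still checked at run time;
only the Optional goes away.

```python
class SourceInfo:
    """Hypothetical stand-in for QAPISourceInfo."""

    def __init__(self, fname: str, line: int) -> None:
        self.fname = fname
        self.line = line


# Hypothetical sentinel shared by all built-in definitions.
BUILTIN_INFO = SourceInfo("<built-in>", 0)


def format_error(info: SourceInfo, msg: str) -> str:
    # info can no longer be None, but "no error reports for built-in
    # types" still has to be asserted at run time.
    assert info is not BUILTIN_INFO
    return f"{info.fname}:{info.line}: {msg}"


print(format_error(SourceInfo("schema.json", 3), "unknown type 'Foo'"))
```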

>> We can simply leave it as is.  Adding the assertion to .resolve_type()
>> is fine.
>>
>> Or we complicate QAPISchemaArrayType.check() to simplify
>> .resolve_type()'s typing, roughly like this:
>>
>>             if self.info:
>>                 self.element_type = schema.resolve_type(
>>                     self._element_type_name,
>>                     self.info, self.info.defn_meta)
>>             else:               # built-in type
>>                 self.element_type = schema.lookup_type(
>>                     self._element_type_name)
>>                 assert self.element_type
>>
>> Not sure it's worth the trouble.  Thoughts?
>
> I suppose it's your call, ultimately. This patch exists primarily to
> help in two places:
>
> (A) QAPISchemaArrayType.check(), as you've noticed, because it uses
> the same path for both built-in and user-defined types. This is the
> only place in the code where this occurs *at the moment*, but I can't
> predict the future.
>
> (B) Calls to lookup_type in introspect.py which look up built-in types
> and must-not-fail. It was cumbersome in the old patchset, but this one
> makes it simpler.
>
> I suppose at the moment, having the assert directly in resolve_type
> just means we get to use the same helper/pathway for both user-defined
> and built-in types, which matches the infrastructure we already have,
> which doesn't differentiate between the two. (By which I mean, all of
> the Schema classes are not split into built-in and user-defined types,
> so it is invisible to the type system.)

Yes.

> I could add conditional logic to the array check, and leave the
> lookup_type calls in introspect.py being a little cumbersome - my main
> concern with that solution is that I might be leaving a nasty
> booby-trap in the future if someone wants to add a new built-in type
> or something gets refactored to share more code pathways. Maybe that's
> not fully rational, but it's why I went the way I did.

In my mind, .resolve_type() is strictly for resolving types during
semantic analysis: look up a type by name, report an error if it doesn't
exist.
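
A toy model of the two lookup flavors may make the distinction concrete.
This is only a sketch under simplifying assumptions: types and info are
plain strings, and Schema and SemanticError are stand-ins rather than
the real QAPISchema and QAPISemError.

```python
from typing import Dict, Optional


class SemanticError(Exception):
    """Hypothetical stand-in for QAPISemError."""


class Schema:
    """Toy schema mapping type names to type objects (strings here)."""

    def __init__(self) -> None:
        self._types: Dict[str, str] = {"int": "<built-in int>"}

    def lookup_type(self, name: str) -> Optional[str]:
        # Returns None when the named type doesn't exist; the caller
        # decides what failure means.
        return self._types.get(name)

    def resolve_type(self, name: str, info: Optional[str],
                     what: Optional[str]) -> str:
        typ = self.lookup_type(name)
        if typ is None:
            # Failure is a semantic error in the user's schema, so the
            # reference must come with info and what.
            assert info is not None and what is not None
            raise SemanticError(f"{info}: {what} uses unknown type '{name}'")
        return typ


schema = Schema()
print(schema.resolve_type("int", None, None))
```

Resolving a built-in element type with info=None works as long as the
type exists; a missing type with info=None trips the assertion instead
of producing an error report.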

Before this patch:

(A) QAPISchemaArrayType.check() works.  The invariant check is buried
somewhat deep, in QAPISourceError.

(B) introspect.py works.  The invariant is not checked there.

(C) QAPISchemaVariants.check() works.  A rather loosely related invariant
is checked there: the tag member's type exists.

This patch conflates two changes.

One, it adds an invariant check right to .resolve_type().  Impact:

    (A) Adds an invariant check closer to the surface.

    (B) Not touched.

    (C) Not touched.

No objection.

Two, it defaults .resolve_type()'s arguments to None.  Belongs to the
next patch.

The next patch overloads .resolve_type() to serve two use cases,
1. failure is a semantic error, and 2. failure is a programming error.
The first kind passes the arguments, the second doesn't.  Impact:

    (A) Not touched.

    (B) Adds invariant checking, in the callee.

    (C) Pushes the invariant checking into the callee.

I don't like overloading .resolve_type() this way.  Again: in my mind,
it's strictly for resolving the user's type names in semantic analysis.

If I drop this patch and the next one, mypy complains

    scripts/qapi/schema.py:1219: error: Argument 1 has incompatible type "QAPISourceInfo | None"; expected "QAPISourceInfo"  [arg-type]
    scripts/qapi/introspect.py:230: error: Incompatible types in assignment (expression has type "QAPISchemaType | None", variable has type "QAPISchemaType")  [assignment]
    scripts/qapi/introspect.py:233: error: Incompatible types in assignment (expression has type "QAPISchemaType | None", variable has type "QAPISchemaType")  [assignment]

Retaining the assertion added in this patch takes care of the first one.

To get rid of the two in introspect.py, we need to actually check the
invariant:

diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
index 67c7d89aae..4679b1bc2c 100644
--- a/scripts/qapi/introspect.py
+++ b/scripts/qapi/introspect.py
@@ -227,10 +227,14 @@ def _use_type(self, typ: QAPISchemaType) -> str:
 
         # Map the various integer types to plain int
         if typ.json_type() == 'int':
-            typ = self._schema.lookup_type('int')
+            type_int = self._schema.lookup_type('int')
+            assert type_int
+            typ = type_int
         elif (isinstance(typ, QAPISchemaArrayType) and
               typ.element_type.json_type() == 'int'):
-            typ = self._schema.lookup_type('intList')
+            type_intList = self._schema.lookup_type('intList')
+            assert type_intList
+            typ = type_intList
         # Add type to work queue if new
         if typ not in self._used_types:
             self._used_types.append(typ)

Straightforward enough, although with a bit of notational overhead.

We use t = .lookup_type(...); assert t in three places then.  Feel free
to factor it out into a new helper.
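
Such a helper could look roughly like the sketch below.  The name
lookup_type_must_exist() is hypothetical, and Schema is again a toy
stand-in with string-valued types, not the real QAPISchema.

```python
from typing import Dict, Optional


class Schema:
    """Toy stand-in for QAPISchema; types are plain strings here."""

    def __init__(self, types: Dict[str, str]) -> None:
        self._types = dict(types)

    def lookup_type(self, name: str) -> Optional[str]:
        # Returns None when the named type doesn't exist.
        return self._types.get(name)

    def lookup_type_must_exist(self, name: str) -> str:
        # Hypothetical helper: failure is a programming error, so it
        # asserts instead of reporting a semantic error.
        typ = self.lookup_type(name)
        assert typ is not None, f"type '{name}' must exist"
        return typ


schema = Schema({"int": "<int>", "intList": "<intList>"})
typ = schema.lookup_type_must_exist("intList")
print(typ)
```

The return type is non-optional, so callers such as _use_type() would no
longer need a temporary variable plus assert to satisfy mypy.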

> (P.S. I still violently want to create an info object that represents
> built-in definitions so I can just get rid of all the
> Optional[QAPISourceInfo] types from everywhere. I know I tried to do
> it before and you vetoed it, but the desire lives on in my heart.)

Once everything is properly typed, the cost and benefit of such a change
should be more clearly visible.

For now, let's try to type what we have, unless what we have complicates
typing too much.

[...]

Re: [PATCH v2 09/19] qapi/schema: allow resolve_type to be used for built-in types
