Victor Toso <victort...@redhat.com> writes: > This generator has two goals: > 1. Mechanical validation of QAPI examples > 2. Generate the examples in a JSON format to be consumed for extra > validation. > > The generator iterates over every Example section, parsing both server > and client messages. The generator prints any inconsistency found, for > example: > > | Error: Extra data: line 1 column 39 (char 38) > | Location: cancel-vcpu-dirty-limit at qapi/migration.json:2017 > | Data: {"execute": "cancel-vcpu-dirty-limit"}, > | "arguments": { "cpu-index": 1 } }
What language does it parse? Can you give a grammar? Should the parser be integrated into the doc parser, i.e. class QAPIDoc? > The generator will output other JSON file with all the examples in the Another JSON file, or other JSON files? > QAPI module that they came from. This can be used to validate the > introspection between QAPI/QMP to language bindings, for example: > > | { "examples": [ > | { > | "id": "ksuxwzfayw", > | "client": [ > | { > | "sequence-order": 1 Missing comma > | "message-type": "command", > | "message": > | { "arguments": > | { "device": "scratch", "size": 1073741824 }, > | "execute": "block_resize" > | }, > | } ], > | "server": [ > | { > | "sequence-order": 2 Another one. > | "message-type": "return", > | "message": { "return": {} }, Extra comma. > | } ] > | } > | ] } Indentation is kind of weird. The JSON's Valid structure and semantics are not documented. We've developed a way specify that, you might've heard of it, it's called "QAPI schema" ;-P Kidding aside, we've done this before. E.g. docs/interop/firmware.json specifies firmware descriptors. We have some in pc-bios/descriptors/. > Note that the order matters, as read by the Example section and > translated into "sequence-order". A language binding project can then > consume this files to Marshal and Unmarshal, comparing if the results > are what is to be expected. > > RFC discussion: > https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg04641.html > > Signed-off-by: Victor Toso <victort...@redhat.com> > --- > scripts/qapi/dumpexamples.py | 208 +++++++++++++++++++++++++++++++++++ > scripts/qapi/main.py | 3 +- > 2 files changed, 210 insertions(+), 1 deletion(-) > create mode 100644 scripts/qapi/dumpexamples.py > > diff --git a/scripts/qapi/dumpexamples.py b/scripts/qapi/dumpexamples.py > new file mode 100644 > index 0000000000..55d9f13ab7 > --- /dev/null > +++ b/scripts/qapi/dumpexamples.py Let's name this examples.py. It already does a bit more than dump (namely validate), and any future code dealing with examples will likely go into this file. > @@ -0,0 +1,208 @@ > +""" > +Dump examples for Developers > +""" > +# Copyright (c) 2023 Red Hat Inc. > +# > +# Authors: > +# Victor Toso <victort...@redhat.com> > +# > +# This work is licensed under the terms of the GNU GPL, version 2. We should've insisted on v2+ for the QAPI generator back when we had a chance. *Sigh* > +# See the COPYING file in the top-level directory. > + > +# Just for type hint on self > +from __future__ import annotations > + > +import os > +import json > +import random > +import string > + > +from typing import Dict, List, Optional > + > +from .schema import ( > + QAPISchema, > + QAPISchemaType, > + QAPISchemaVisitor, > + QAPISchemaEnumMember, > + QAPISchemaFeature, > + QAPISchemaIfCond, > + QAPISchemaObjectType, > + QAPISchemaObjectTypeMember, > + QAPISchemaVariants, pylint warns scripts/qapi/dumpexamples.py:22:0: W0611: Unused QAPISchemaEnumMember imported from schema (unused-import) scripts/qapi/dumpexamples.py:22:0: W0611: Unused QAPISchemaObjectTypeMember imported from schema (unused-import) scripts/qapi/dumpexamples.py:22:0: W0611: Unused QAPISchemaVariants imported from schema (unused-import) > +) > +from .source import QAPISourceInfo > + > + > +def gen_examples(schema: QAPISchema, > + output_dir: str, > + prefix: str) -> None: > + vis = QAPISchemaGenExamplesVisitor(prefix) > + schema.visit(vis) > + vis.write(output_dir) The other backends have this at the end of the file. Either order works, but consistency is nice. > + > + > +def get_id(random, size: int) -> str: pylint warns scripts/qapi/dumpexamples.py:44:11: W0621: Redefining name 'random' from outer scope (line 17) (redefined-outer-name) > + letters = string.ascii_lowercase > + return ''.join(random.choice(letters) for i in range(size)) > + > + > +def next_object(text, start, end, context) -> (Dict, bool): > + # Start of json object > + start = text.find("{", start) > + end = text.rfind("}", start, end+1) Single quotes, please, for consistency with quote use elsewhere. Rules of thumb: * Double quotes for english text (e.g. error messages), so we don't have to escape apostrophes. * Double quotes when they reduce the escaping * Else single quotes More elsewhere, not flagged. > + > + # try catch, pretty print issues Is this comment useful? > + try: > + ret = json.loads(text[start:end+1]) > + except Exception as e: pylint warns scripts/qapi/dumpexamples.py:57:11: W0703: Catching too general exception Exception (broad-except) Catch JSONDecodeError? > + print("Error: {}\nLocation: {}\nData: {}\n".format( > + str(e), context, text[start:end+1])) Errors need to go to stderr. Have you considered using QAPIError to report these errors? > + return {}, True > + else: > + return ret, False > + > + > +def parse_text_to_dicts(text: str, context: str) -> (List[Dict], bool): Before I review the parser, I'd like to know the (intended) language being parsed. > + examples, clients, servers = [], [], [] > + failed = False > + > + count = 1 > + c, s = text.find("->"), text.find("<-") > + while c != -1 or s != -1: > + if c == -1 or (s != -1 and s < c): > + start, target = s, servers > + else: > + start, target = c, clients > + > + # Find the client and server, if any > + if c != -1: > + c = text.find("->", start + 1) > + if s != -1: > + s = text.find("<-", start + 1) > + > + # Find the limit of current's object. > + # We first look for the next message, either client or server. If > none > + # is avaible, we set the end of the text as limit. > + if c == -1 and s != -1: > + end = s > + elif c != -1 and s == -1: > + end = c > + elif c != -1 and s != -1: > + end = (c < s) and c or s pylint advises scripts/qapi/dumpexamples.py:91:12: R1706: Consider using ternary (c if c < s else s) (consider-using-ternary) > + else: > + end = len(text) - 1 > + > + message, error = next_object(text, start, end, context) > + if error: > + failed = True > + > + if len(message) > 0: > + message_type = "return" > + if "execute" in message: > + message_type = "command" > + elif "event" in message: > + message_type = "event" > + > + target.append({ > + "sequence-order": count, > + "message-type": message_type, > + "message": message > + }) > + count += 1 > + > + examples.append({"client": clients, "server": servers}) > + return examples, failed > + > + > +def parse_examples_of(self: QAPISchemaGenExamplesVisitor, Uh, shouldn't this be a method of QAPISchemaGenExamplesVisitor? > + name: str): > + > + assert(name in self.schema._entity_dict) Makes both pycodestyle and pylint unhappy. Better: assert name in self.schema._entity_dict pylint warns scripts/qapi/dumpexamples.py:120:19: W0212: Access to a protected member _entity_dict of a client class (protected-access) here and two more times below. > + obj = self.schema._entity_dict[name] > + > + if not obj.info.pragma.doc_required: > + return > + > + assert((obj.doc is not None)) Better: assert obj.doc is not None > + module_name = obj._module.name > + > + # We initialize random with the name so that we get consistent example > + # ids over different generations. The ids of a given example might > + # change when adding/removing examples, but that's acceptable as the > + # goal is just to grep $id to find what example failed at a given test > + # with minimum chorn over regenerating. churn from? > + random.seed(name, version=2) You're reinitializing the global PRNG. Feels unclean. Create your own here? > + > + for s in obj.doc.sections: > + if s.name != "Example": docs/devel/qapi-code-gen.rst section "Definition documentation": A tagged section starts with one of the following words: "Note:"/"Notes:", "Since:", "Example"/"Examples", "Returns:", "TODO:". The section ends with the start of a new section. You're missing "Examples". docs/sphinx/qapidoc.py uses s.name.startswith('Example'). > + continue > + > + if module_name not in self.target: > + self.target[module_name] = [] > + > + context = f'''{name} at {obj.info.fname}:{obj.info.line}''' > + examples, failed = parse_text_to_dicts(s.text, context) > + if failed: > + # To warn user that docs needs fixing > + self.failed = True > + > + for example in examples: > + self.target[module_name].append({ > + "id": get_id(random, 10), > + "client": example["client"], > + "server": example["server"] > + }) > + > + > +class QAPISchemaGenExamplesVisitor(QAPISchemaVisitor): > + > + def __init__(self, prefix: str): > + super().__init__() > + self.target = {} @target maps what to what? > + self.schema = None > + self.failed = False > + > + def visit_begin(self, schema): > + self.schema = schema > + > + def visit_end(self): > + self.schema = None > + assert not self.failed, "Should fix the docs" Unless I'm misreading the code, this asserts "all the examples parse fine." Misuse of assert for reporting errors. > + > + def write(self: QAPISchemaGenExamplesVisitor, > + output_dir: str) -> None: > + for filename, content in self.target.items(): > + pathname = os.path.join(output_dir, "examples", filename) > + odir = os.path.dirname(pathname) > + os.makedirs(odir, exist_ok=True) > + result = {"examples": content} > + > + with open(pathname, "w") as outfile: pylint warns scripts/qapi/dumpexamples.py:180:17: W1514: Using open without explicitly specifying an encoding (unspecified-encoding) Recommend to pass encoding='utf=8'. > + outfile.write(json.dumps(result, indent=2, sort_keys=True)) > + > + def visit_command(self: QAPISchemaGenExamplesVisitor, > + name: str, > + info: Optional[QAPISourceInfo], > + ifcond: QAPISchemaIfCond, > + features: List[QAPISchemaFeature], > + arg_type: Optional[QAPISchemaObjectType], > + ret_type: Optional[QAPISchemaType], > + gen: bool, > + success_response: bool, > + boxed: bool, > + allow_oob: bool, > + allow_preconfig: bool, > + coroutine: bool) -> None: > + > + if gen: > + parse_examples_of(self, name) Why only if gen? > + > + def visit_event(self: QAPISchemaGenExamplesVisitor, > + name: str, > + info: Optional[QAPISourceInfo], > + ifcond: QAPISchemaIfCond, > + features: List[QAPISchemaFeature], > + arg_type: Optional[QAPISchemaObjectType], > + boxed: bool): > + > + parse_examples_of(self, name) Examples in definition comments for types are silently ignored. Should we do something with them? > diff --git a/scripts/qapi/main.py b/scripts/qapi/main.py > index 316736b6a2..9482439fa9 100644 > --- a/scripts/qapi/main.py > +++ b/scripts/qapi/main.py > @@ -13,6 +13,7 @@ > > from .commands import gen_commands > from .common import must_match > +from .dumpexamples import gen_examples > from .error import QAPIError > from .events import gen_events > from .introspect import gen_introspect > @@ -53,7 +54,7 @@ def generate(schema_file: str, > gen_commands(schema, output_dir, prefix, gen_tracing) > gen_events(schema, output_dir, prefix) > gen_introspect(schema, output_dir, prefix, unmask) > - > + gen_examples(schema, output_dir, prefix) > > def main() -> int: > """ You provide some type annotations, but mypy isn't happy: $ mypy --config-file scripts/qapi/mypy.ini scripts/qapi/dumpexamples.py scripts/qapi/parser.py:566: error: Function is missing a return type annotation scripts/qapi/parser.py:570: error: Function is missing a return type annotation scripts/qapi/dumpexamples.py:44: error: Function is missing a type annotation for one or more arguments scripts/qapi/dumpexamples.py:49: error: Function is missing a type annotation for one or more arguments scripts/qapi/dumpexamples.py:49: error: Syntax error in type annotation scripts/qapi/dumpexamples.py:49: note: Suggestion: Use Tuple[T1, ..., Tn] instead of (T1, ..., Tn) scripts/qapi/dumpexamples.py:65: error: Syntax error in type annotation scripts/qapi/dumpexamples.py:65: note: Suggestion: Use Tuple[T1, ..., Tn] instead of (T1, ..., Tn) scripts/qapi/dumpexamples.py:66: error: Need type annotation for "clients" (hint: "clients: List[<type>] = ...") scripts/qapi/dumpexamples.py:66: error: Need type annotation for "servers" (hint: "servers: List[<type>] = ...") scripts/qapi/dumpexamples.py:117: error: Function is missing a return type annotation scripts/qapi/dumpexamples.py:120: error: "None" has no attribute "_entity_dict" scripts/qapi/dumpexamples.py:121: error: "None" has no attribute "_entity_dict" scripts/qapi/dumpexamples.py:161: error: Need type annotation for "target" (hint: "target: Dict[<type>, <type>] = ...") scripts/qapi/dumpexamples.py:165: error: Function is missing a type annotation scripts/qapi/dumpexamples.py:168: error: Function is missing a return type annotation scripts/qapi/dumpexamples.py:168: note: Use "-> None" if function does not return a value scripts/qapi/dumpexamples.py:200: error: Function is missing a return type annotation Found 15 errors in 2 files (checked 1 source file) I think before I dig deeper, we should discuss my findings so far. Here's my .pylintrc, in case you want to run pylint yourself: disable= consider-using-f-string, fixme, invalid-name, missing-docstring, too-few-public-methods, too-many-arguments, too-many-branches, too-many-instance-attributes, too-many-lines, too-many-locals, too-many-statements, unused-argument, unused-wildcard-import,