Paolo Bonzini <pbonz...@redhat.com> writes:

> JSON is LL(1) and our parser indeed needs only 1 token lookahead.
> Saving the parser context is mostly unnecessary;

"Mit Kanonen auf Spatzen" (shooting cannons at sparrows).

>                                                  we can replace it
> with peeking at the next token, or remove it altogether when the
> restore only happens on errors.  The token list is destroyed anyway
> on errors.
>
> The only interesting thing is that parse_keyword always eats
> a TOKEN_KEYWORD, even if it is invalid, so it must come last in
> parse_value (otherwise, NULL is returned, parse_literal is invoked
> and it tries to peek beyond end of input).  This is caught by
> /errors/unterminated/literal, which actually checks for an unterminated
> keyword. ಠ_ಠ
>
> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
> ---
>  qobject/json-parser.c | 59 
> ++++++++++++++++++---------------------------------
>  1 file changed, 21 insertions(+), 38 deletions(-)
>
> diff --git a/qobject/json-parser.c b/qobject/json-parser.c
> index ac991ba..7a287ea 100644
> --- a/qobject/json-parser.c
> +++ b/qobject/json-parser.c
> @@ -296,23 +296,6 @@ static QObject 
> *parser_context_peek_token(JSONParserContext *ctxt)
>      return token;
>  }
>  
> -static JSONParserContext parser_context_save(JSONParserContext *ctxt)
> -{
> -    JSONParserContext saved_ctxt = {0};
> -    saved_ctxt.tokens.pos = ctxt->tokens.pos;
> -    saved_ctxt.tokens.count = ctxt->tokens.count;
> -    saved_ctxt.tokens.buf = ctxt->tokens.buf;
> -    return saved_ctxt;
> -}
> -
> -static void parser_context_restore(JSONParserContext *ctxt,
> -                                   JSONParserContext saved_ctxt)
> -{
> -    ctxt->tokens.pos = saved_ctxt.tokens.pos;
> -    ctxt->tokens.count = saved_ctxt.tokens.count;
> -    ctxt->tokens.buf = saved_ctxt.tokens.buf;
> -}
> -

This saves and restores tokens, which is an array @buf of @count tokens
with a cursor @pos.

@buf and count only ever change in parser_context_new().  Saving and
restoring them has always been pointless.

What this actually does is saving and restoring the cursor.

>  static void tokens_append_from_iter(QObject *obj, void *opaque)
>  {
>      JSONParserContext *ctxt = opaque;
> @@ -364,7 +347,6 @@ static void parser_context_free(JSONParserContext *ctxt)
>  static int parse_pair(JSONParserContext *ctxt, QDict *dict, va_list *ap)
>  {
>      QObject *key = NULL, *token = NULL, *value, *peek;
> -    JSONParserContext saved_ctxt = parser_context_save(ctxt);
>  
>      peek = parser_context_peek_token(ctxt);
>      if (peek == NULL) {
> @@ -402,7 +384,6 @@ static int parse_pair(JSONParserContext *ctxt, QDict 
> *dict, va_list *ap)
>      return 0;
>  
>  out:
> -    parser_context_restore(ctxt, saved_ctxt);
>      qobject_decref(key);
>  
>      return -1;

Before your patch: on error, we rewind the cursor to where it was on
entry.  Useful in a backtracking parser, but this shouldn't be one.

Now: we leave it wherever we error out.

Fine, because the sole caller parse_object() won't actually use it on
error.

> @@ -412,9 +393,8 @@ static QObject *parse_object(JSONParserContext *ctxt, 
> va_list *ap)
>  {
>      QDict *dict = NULL;
>      QObject *token, *peek;
> -    JSONParserContext saved_ctxt = parser_context_save(ctxt);
>  
> -    token = parser_context_pop_token(ctxt);
> +    token = parser_context_peek_token(ctxt);
>      if (token == NULL) {
>          goto out;
>      }
> @@ -425,6 +405,7 @@ static QObject *parse_object(JSONParserContext *ctxt, 
> va_list *ap)
>  
>      dict = qdict_new();
>  
> +    parser_context_pop_token(ctxt);
>      peek = parser_context_peek_token(ctxt);
>      if (peek == NULL) {
>          parse_error(ctxt, NULL, "premature EOI");
> @@ -465,7 +446,6 @@ static QObject *parse_object(JSONParserContext *ctxt, 
> va_list *ap)
>      return QOBJECT(dict);
>  
>  out:
> -    parser_context_restore(ctxt, saved_ctxt);
>      QDECREF(dict);
>      return NULL;
>  }

I'm not 100% when exactly parser_context_peek_token() returns null, but
I think can see what your patch does anyway.

Before: if we can parse an object, advance cursor behind the object and
return it, else leave the cursor alone and return null.  The sole caller
parse_value() relies on this behavior to try alternatives until one
succeeds.

After: if we can parse an object, same as before, else advance the
cursor to right before the offending token and return null.

Consider QMP input { 1 [ 2 ] }.

    json_parser_parse_err():
        parse_value() @ pos=0:
            triy parse_object() @ pos=0:
                consume '{'
                parse_pair() @ pos=1:
                    parse_value() @ pos=1:
                        try ..., consume 1, now pos=2
                    fail with "key is not a string in object"
                propagate failure
            try parse_array() @ pos=2:
                consume [ 2 ], return the QList, now pos=5
            return the QList
        throw away the error (caller json_parser_parse() passed errp=NULL)
        return the QList

Fortunately, that's not valid QMP, and we get

    {"error": {"class": "GenericError", "desc": "Expected 'object' in QMP 
input"}}

Shouldn't be hard to fix. we just have to complete the job of turning
this thing into a recursive descent parser.  I'll give it a shot.

[...]

Reply via email to