Re: [Qemu-devel] [PATCH v4 06/22] qobject: Perform %% interpolation in qobject_from_jsonf()
On 08/09/2017 04:06 AM, Markus Armbruster wrote: > Eric Blakewrites: > >> We want -Wformat to catch blatant programming errors in format >> strings that we pass to qobject_from_jsonf(). But if someone >> were to pass a JSON string "'%s'" as the format string, gcc would >> insist that it be paired with a char* argument, even though our >> lexer does not recognize % sequences inside a JSON string. You can >> bypass -Wformat checking by passing the Unicode escape \u0025 for >> %, but who wants to remember to type that? So the solution is that >> anywhere we are relying on -Wformat checking, the caller should >> pass the usual printf %% escape sequence where a literal % is >> wanted on output. >> >> +bool double_quote = *ptr++ == '"'; >> >> str = qstring_new(); >> -while (*ptr && >> - ((double_quote && *ptr != '"') || (!double_quote && *ptr != >> '\''))) { >> +while (*ptr && *ptr != "'\""[double_quote]) { > > Simpler: > >bool quote = *ptr++; > > and then > >while (*ptr && *ptr != quote) { Well, 'char quote' rather than 'bool quote', but yes, I like it. > > Have you considered splitting the patch into one to simplify, and one to > implement %%? Will split. >> @@ -455,13 +452,15 @@ static QObject *parse_escape(JSONParserContext *ctxt, >> va_list *ap) >> { >> JSONToken *token; >> >> -if (ap == NULL) { >> -return NULL; >> -} >> - >> token = parser_context_pop_token(ctxt); >> assert(token && token->type == JSON_ESCAPE); >> >> +if (ap == NULL) { >> +parse_error(ctxt, token, "escape parsing for '%s' not requested", >> +token->str); >> +return NULL; >> +} >> + > > When I manage to fat-finger a '%' into my QMP input, I now get this > error message instead of "Invalid JSON syntax". Makes me go "what is > escape parsing, and how do I request it?" Not an improvement, I'm > afraid. Pre-patch, I see: $ qemu-kvm -nodefaults -nographic -qmp stdio {"QMP": {"version": {"qemu": {"micro": 91, "minor": 9, "major": 2}, "package": "(qemu-2.10.0-0.1.rc1.fc26)"}, "capabilities": []}} {'execute':%s} {"error": {"class": "GenericError", "desc": "JSON parse error, Missing value in dict"}} {'execute':%%} {"error": {"class": "GenericError", "desc": "Invalid JSON syntax"}} {"error": {"class": "GenericError", "desc": "JSON parse error, expecting value"}} I find it odd that NOT calling parse_error() but still returning NULL changes the behavior on what error message eventually gets emitted; but I also agree that the QMP case should drive what error message (if any) is needed in parse_escape(). I'll play with it some more (the parser's error handling is weird). >> +/* In vararg form, %% must occur in strings */ >> +/* qobject_from_jsonf("%%"); */ >> +/* qobject_from_jsonf("{%%}"); */ >> + >> +/* In vararg form, strings must not use % except for %% */ >> +/* qobject_from_jsonf("'%s'", "unpaired"); */ > > Could use g_test_trap_subprocess(). Not sure it's worth the bother. I don't know - this is one case where proving we abort on invalid usage might actually be a good validation of the contract. > > I hate code in comments. Better: > >/* The following intentionally cause assertion failures */ >#if 0 >/* In vararg form, %% must occur in strings */ >qobject_from_jsonf("%%"); If I don't use the g_test_trap_subprocess() trick, then I can override checkpatch's complaints about #if 0. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v4 06/22] qobject: Perform %% interpolation in qobject_from_jsonf()
Eric Blakewrites: > We want -Wformat to catch blatant programming errors in format > strings that we pass to qobject_from_jsonf(). But if someone > were to pass a JSON string "'%s'" as the format string, gcc would > insist that it be paired with a char* argument, even though our > lexer does not recognize % sequences inside a JSON string. You can > bypass -Wformat checking by passing the Unicode escape \u0025 for > %, but who wants to remember to type that? So the solution is that > anywhere we are relying on -Wformat checking, the caller should > pass the usual printf %% escape sequence where a literal % is > wanted on output. > > Note that since % can only appear in JSON inside a string, we don't > have to teach the lexer how to parse any new % sequences, but instead > only have to add code to the parser. Likewise, the parser is still > where we reject any attempt at mid-string % interpolation other than > %% (this is only a runtime failure, rather than compile-time), but > since we already document that qobject_from_jsonf() asserts on invalid > usage, presumably anyone that is adding a new usage will have tested > that their usage doesn't always fail. > > Simplify qstring_from_escaped_str() while touching it, by using > bool, a more compact conditional, and qstring_append_chr(). > Likewise, improve the error message when parse_escape() is reached > without interpolation (for example, if a client sends garbage > rather than JSON over a QMP connection). > > The testsuite additions pass under valgrind, proving that we are > indeed passing the reference of anything given through %p to the > returned containing object, even when more than one object is > interpolated. > > Signed-off-by: Eric Blake > --- > qobject/json-lexer.c | 6 -- > qobject/json-parser.c | 49 - > qobject/qjson.c | 4 ++-- > tests/check-qjson.c | 50 ++ > 4 files changed, 80 insertions(+), 29 deletions(-) > > diff --git a/qobject/json-lexer.c b/qobject/json-lexer.c > index b846d2852d..599b7446b7 100644 > --- a/qobject/json-lexer.c > +++ b/qobject/json-lexer.c > @@ -32,9 +32,11 @@ > * Extension for vararg handling in JSON construction, when using > * qobject_from_jsonf() instead of qobject_from_json() (this lexer > * actually accepts multiple forms of PRId64, but parse_escape() later > - * filters to only parse the current platform's spelling): > + * filters to only parse the current platform's spelling; meanwhile, > + * JSON only allows % inside strings, where the parser handles %%, so > + * we do not need to lex it here): The parenthesis is becoming unwieldy. Turn it into a note... > * > - * %(PRI[du]64|(l|ll)?[ud]|[ipsf]) > + * %(PRI[du]64|(l|ll)?[ud]|[ipsf%]) > * ... here? > */ > > diff --git a/qobject/json-parser.c b/qobject/json-parser.c > index 388aa7a386..daafe77a0c 100644 > --- a/qobject/json-parser.c > +++ b/qobject/json-parser.c > @@ -120,25 +120,21 @@ static int hex2decimal(char ch) > * \n > * \r > * \t > - * \u four-hex-digits > + * \u four-hex-digits > + * > + * Additionally, if @percent is true, all % in @token must be doubled, > + * replaced by a single % will be in the result; this allows -Wformat > + * processing of qobject_from_jsonf(). > */ > static QString *qstring_from_escaped_str(JSONParserContext *ctxt, > - JSONToken *token) > + JSONToken *token, bool percent) > { > const char *ptr = token->str; > QString *str; > -int double_quote = 1; > - > -if (*ptr == '"') { > -double_quote = 1; > -} else { > -double_quote = 0; > -} > -ptr++; > +bool double_quote = *ptr++ == '"'; > > str = qstring_new(); > -while (*ptr && > - ((double_quote && *ptr != '"') || (!double_quote && *ptr != > '\''))) { > +while (*ptr && *ptr != "'\""[double_quote]) { Simpler: bool quote = *ptr++; and then while (*ptr && *ptr != quote) { Have you considered splitting the patch into one to simplify, and one to implement %%? > if (*ptr == '\\') { > ptr++; > > @@ -205,12 +201,13 @@ static QString > *qstring_from_escaped_str(JSONParserContext *ctxt, > goto out; > } > } else { > -char dummy[2]; > - > -dummy[0] = *ptr++; > -dummy[1] = 0; > - > -qstring_append(str, dummy); > +if (*ptr == '%' && percent) { > +if (*++ptr != '%') { > +parse_error(ctxt, token, "invalid %% sequence in > string"); > +goto out; > +} > +} > +qstring_append_chr(str, *ptr++); > } > } > > @@ -455,13 +452,15 @@ static QObject *parse_escape(JSONParserContext *ctxt, > va_list *ap)
[Qemu-devel] [PATCH v4 06/22] qobject: Perform %% interpolation in qobject_from_jsonf()
We want -Wformat to catch blatant programming errors in format strings that we pass to qobject_from_jsonf(). But if someone were to pass a JSON string "'%s'" as the format string, gcc would insist that it be paired with a char* argument, even though our lexer does not recognize % sequences inside a JSON string. You can bypass -Wformat checking by passing the Unicode escape \u0025 for %, but who wants to remember to type that? So the solution is that anywhere we are relying on -Wformat checking, the caller should pass the usual printf %% escape sequence where a literal % is wanted on output. Note that since % can only appear in JSON inside a string, we don't have to teach the lexer how to parse any new % sequences, but instead only have to add code to the parser. Likewise, the parser is still where we reject any attempt at mid-string % interpolation other than %% (this is only a runtime failure, rather than compile-time), but since we already document that qobject_from_jsonf() asserts on invalid usage, presumably anyone that is adding a new usage will have tested that their usage doesn't always fail. Simplify qstring_from_escaped_str() while touching it, by using bool, a more compact conditional, and qstring_append_chr(). Likewise, improve the error message when parse_escape() is reached without interpolation (for example, if a client sends garbage rather than JSON over a QMP connection). The testsuite additions pass under valgrind, proving that we are indeed passing the reference of anything given through %p to the returned containing object, even when more than one object is interpolated. Signed-off-by: Eric Blake--- qobject/json-lexer.c | 6 -- qobject/json-parser.c | 49 - qobject/qjson.c | 4 ++-- tests/check-qjson.c | 50 ++ 4 files changed, 80 insertions(+), 29 deletions(-) diff --git a/qobject/json-lexer.c b/qobject/json-lexer.c index b846d2852d..599b7446b7 100644 --- a/qobject/json-lexer.c +++ b/qobject/json-lexer.c @@ -32,9 +32,11 @@ * Extension for vararg handling in JSON construction, when using * qobject_from_jsonf() instead of qobject_from_json() (this lexer * actually accepts multiple forms of PRId64, but parse_escape() later - * filters to only parse the current platform's spelling): + * filters to only parse the current platform's spelling; meanwhile, + * JSON only allows % inside strings, where the parser handles %%, so + * we do not need to lex it here): * - * %(PRI[du]64|(l|ll)?[ud]|[ipsf]) + * %(PRI[du]64|(l|ll)?[ud]|[ipsf%]) * */ diff --git a/qobject/json-parser.c b/qobject/json-parser.c index 388aa7a386..daafe77a0c 100644 --- a/qobject/json-parser.c +++ b/qobject/json-parser.c @@ -120,25 +120,21 @@ static int hex2decimal(char ch) * \n * \r * \t - * \u four-hex-digits + * \u four-hex-digits + * + * Additionally, if @percent is true, all % in @token must be doubled, + * replaced by a single % will be in the result; this allows -Wformat + * processing of qobject_from_jsonf(). */ static QString *qstring_from_escaped_str(JSONParserContext *ctxt, - JSONToken *token) + JSONToken *token, bool percent) { const char *ptr = token->str; QString *str; -int double_quote = 1; - -if (*ptr == '"') { -double_quote = 1; -} else { -double_quote = 0; -} -ptr++; +bool double_quote = *ptr++ == '"'; str = qstring_new(); -while (*ptr && - ((double_quote && *ptr != '"') || (!double_quote && *ptr != '\''))) { +while (*ptr && *ptr != "'\""[double_quote]) { if (*ptr == '\\') { ptr++; @@ -205,12 +201,13 @@ static QString *qstring_from_escaped_str(JSONParserContext *ctxt, goto out; } } else { -char dummy[2]; - -dummy[0] = *ptr++; -dummy[1] = 0; - -qstring_append(str, dummy); +if (*ptr == '%' && percent) { +if (*++ptr != '%') { +parse_error(ctxt, token, "invalid %% sequence in string"); +goto out; +} +} +qstring_append_chr(str, *ptr++); } } @@ -455,13 +452,15 @@ static QObject *parse_escape(JSONParserContext *ctxt, va_list *ap) { JSONToken *token; -if (ap == NULL) { -return NULL; -} - token = parser_context_pop_token(ctxt); assert(token && token->type == JSON_ESCAPE); +if (ap == NULL) { +parse_error(ctxt, token, "escape parsing for '%s' not requested", +token->str); +return NULL; +} + if (!strcmp(token->str, "%p")) { return va_arg(*ap, QObject *); } else if (!strcmp(token->str, "%i")) { @@ -490,7 +489,7 @@ static QObject