Daniel P. Berrangé <berra...@redhat.com> writes:

> On Fri, Aug 02, 2024 at 02:26:05PM -0500, Eric Blake wrote:
>> My next patch needs to convert text from an untrusted input into an
>> output representation that is suitable for display on a terminal,
>> which is useful to more than just the json-writer; the text should
>> normally be UTF-8, but blindly allowing all Unicode code points
>> (including ASCII ESC) through to a terminal risks
>> remote-code-execution attacks on some terminals.  Extract the
>> existing body of json-writer's quoted_str into a new helper routine
>> mod_utf8_sanitize, and generalize it to also work on data that is
>> length-limited rather than NUL-terminated.  [I was actually
>> surprised that glib does not have such a sanitizer already -
>> Google turns up lots of examples of rolling your own string
>> sanitizer.]
>>
>> If desired in the future, we may want to tweak whether the output is
>> guaranteed to be ASCII (using lots of \u escape sequences, including
>> surrogate pairs for code points outside the BMP) or if we are okay
>> passing printable Unicode through (we still need to escape control
>> characters).  But for now, I went for minimal code churn, including
>> the fact that the resulting function allows a non-UTF-8 2-byte synonym
>> for U+0000.
>>
>> Signed-off-by: Eric Blake <ebl...@redhat.com>
>> ---
>>  include/qemu/unicode.h |  3 ++
>>  qobject/json-writer.c  | 47 +----------------------
>>  util/unicode.c         | 84 ++++++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 88 insertions(+), 46 deletions(-)
>
> I was going to ask for a unit test, but "escaped_string" in
> test-qjson.c looks like it will be covering this sufficiently

check-qjson.c, and other test cases torture it some more.

> well already, that we don't need to test it in isolation.
>
> Reviewed-by: Daniel P. Berrangé <berra...@redhat.com>
>
>
> With regards,
> Daniel
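
For readers following along without the patch at hand, here is a rough
sketch of the kind of sanitizer the commit message describes.  It is
not the mod_utf8_sanitize() from the patch: the name sanitize_ascii,
its signature and its return convention are made up for illustration,
and it takes the "output is guaranteed to be ASCII" option in its
crudest form by replacing every byte >= 0x80 with \uFFFD instead of
decoding UTF-8.  What it does show is the length-limited input (an
embedded NUL is just another byte to escape) and that ESC never
reaches the terminal unescaped.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper for illustration only, NOT the patch's
 * mod_utf8_sanitize().
 *
 * Reads exactly len bytes from buf (no NUL terminator required) and
 * writes a NUL-terminated, pure-ASCII rendering into out, which has
 * room for outsz bytes.  Printable ASCII other than backslash is copied
 * through; control characters, DEL and backslash become \u00XX; every
 * byte >= 0x80 becomes \uFFFD (a simplification: no UTF-8 decoding).
 *
 * Returns the number of bytes written, excluding the NUL, or -1 if
 * out is too small.
 */
static int sanitize_ascii(const char *buf, size_t len,
                          char *out, size_t outsz)
{
    size_t o = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];

        if (c >= 0x20 && c < 0x7F && c != '\\') {
            /* One byte of output, plus room for the final NUL. */
            if (o + 1 >= outsz) {
                return -1;
            }
            out[o++] = (char)c;
        } else {
            unsigned cp = c < 0x80 ? c : 0xFFFD;

            /* Six bytes of escape, plus room for the final NUL. */
            if (o + 6 >= outsz) {
                return -1;
            }
            snprintf(out + o, 7, "\\u%04X", cp);
            o += 6;
        }
    }

    if (o >= outsz) {
        return -1;
    }
    out[o] = '\0';
    return (int)o;
}
```

With this sketch, the six input bytes 'a', ESC, '[', '3', '1', 'm'
come out as the 11-character string a\u001B[31m, so a terminal prints
the escape sequence as text instead of acting on it.  Escaping the
backslash as well keeps the output unambiguous.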