On Fri, Aug 02, 2024 at 02:26:05PM -0500, Eric Blake wrote:
> My next patch needs to convert text from an untrusted input into an
> output representation that is suitable for display on a terminal, which is
> useful to more than just the json-writer; the text should normally be
> UTF-8, but blindly allowing all Unicode code points (including ASCII
> ESC) through to a terminal risks remote-code-execution attacks on some
> terminals.  Extract the existing body of json-writer's quoted_str into
> a new helper routine mod_utf8_sanitize, and generalize it to also work
> on data that is length-limited rather than NUL-terminated.  [I was
> actually surprised that glib does not have such a sanitizer already -
> Google turns up lots of examples of rolling your own string
> sanitizer.]
> 
> If desired in the future, we may want to tweak whether the output is
> guaranteed to be ASCII (using lots of \u escape sequences, including
> surrogate pairs for code points outside the BMP) or if we are okay
> passing printable Unicode through (we still need to escape control
> characters).  But for now, I went for minimal code churn, including
> the fact that the resulting function allows a non-UTF-8 2-byte synonym
> for U+0000.
> 
> Signed-off-by: Eric Blake <ebl...@redhat.com>
> ---
>  include/qemu/unicode.h |  3 ++
>  qobject/json-writer.c  | 47 +----------------------
>  util/unicode.c         | 84 ++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 88 insertions(+), 46 deletions(-)

I was going to ask for a unit test, but "escaped_string" in
test-qjson.c looks like it already covers this sufficiently well,
so we don't need to test it in isolation.
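
For anyone reading along, here is a minimal sketch of the behaviour the
commit message describes: length-limited input, ASCII-only output,
control characters escaped, and surrogate pairs for code points outside
the BMP. It is an illustration only, not the code in util/unicode.c.
The names sketch_decode and sketch_sanitize are hypothetical, the
escaping rules are not meant to match json-writer's, and unlike the
patch this sketch rejects the 2-byte synonym for U+0000:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not QEMU's mod_utf8_sanitize): decode one UTF-8
 * sequence from at most n bytes (n >= 1).  Returns the number of bytes
 * consumed and stores the code point, or U+FFFD for malformed input.
 */
static size_t sketch_decode(const unsigned char *p, size_t n, uint32_t *cp)
{
    static const uint32_t min_cp[] = { 0, 0, 0x80, 0x800, 0x10000 };
    size_t need, i;
    uint32_t c;

    if (p[0] < 0x80) {
        *cp = p[0];
        return 1;
    } else if ((p[0] & 0xE0) == 0xC0) {
        need = 2; c = p[0] & 0x1F;
    } else if ((p[0] & 0xF0) == 0xE0) {
        need = 3; c = p[0] & 0x0F;
    } else if ((p[0] & 0xF8) == 0xF0) {
        need = 4; c = p[0] & 0x07;
    } else {
        *cp = 0xFFFD;           /* stray continuation or invalid lead byte */
        return 1;
    }
    for (i = 1; i < need; i++) {
        if (i >= n || (p[i] & 0xC0) != 0x80) {
            *cp = 0xFFFD;       /* truncated or broken sequence */
            return i;
        }
        c = (c << 6) | (p[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates, and values past U+10FFFF. */
    if (c < min_cp[need] || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) {
        c = 0xFFFD;
    }
    *cp = c;
    return need;
}

/*
 * Write an ASCII-only, NUL-terminated rendering of buf[0..len) into out.
 * Returns 0 on success, -1 if out is too small.
 */
int sketch_sanitize(const char *buf, size_t len, char *out, size_t outsz)
{
    const unsigned char *p = (const unsigned char *)buf;
    size_t i = 0, o = 0;

    assert(buf != NULL || len == 0);
    while (i < len) {
        uint32_t cp;
        char esc[13];           /* two 6-byte escapes plus NUL */
        int n;

        i += sketch_decode(p + i, len - i, &cp);
        if (cp == '\\') {
            n = snprintf(esc, sizeof(esc), "\\\\");
        } else if (cp >= 0x20 && cp < 0x7F) {
            n = snprintf(esc, sizeof(esc), "%c", (int)cp);
        } else if (cp < 0x10000) {
            n = snprintf(esc, sizeof(esc), "\\u%04X", (unsigned)cp);
        } else {
            cp -= 0x10000;      /* split into a UTF-16 surrogate pair */
            n = snprintf(esc, sizeof(esc), "\\u%04X\\u%04X",
                         0xD800u + (unsigned)(cp >> 10),
                         0xDC00u + (unsigned)(cp & 0x3FF));
        }
        if (o + (size_t)n >= outsz) {
            return -1;          /* always keep room for the final NUL */
        }
        memcpy(out + o, esc, (size_t)n);
        o += (size_t)n;
    }
    if (o >= outsz) {
        return -1;
    }
    out[o] = '\0';
    return 0;
}
```

Because the length is passed explicitly, an embedded NUL byte is escaped
like any other control character rather than ending the input early.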

Reviewed-by: Daniel P. Berrangé <berra...@redhat.com>


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|