Markus Armbruster <arm...@redhat.com> writes: > Eric Blake <ebl...@redhat.com> writes: > >> On 02/05/2013 09:22 AM, Markus Armbruster wrote: >>> Command memchar-write takes data and size parameter. Begs the >>> question what happens when data doesn't match size. >>> >>> With format base64, qmp_memchar_write() copies the full data argument, >>> regardless of size argument. >>> >>> With format utf8, qmp_memchar_write() copies size bytes from data, >>> happily reading beyond data. Copies crap from the heap or even >>> crashes. >>> >>> Drop the size parameter, and always copy the full data argument. >>> >>> Signed-off-by: Markus Armbruster <arm...@redhat.com> >>> --- >>> hmp.c | 4 +--- >>> qapi-schema.json | 4 +--- >>> qemu-char.c | 8 +++----- >>> qmp-commands.hx | 4 +--- >>> 4 files changed, 6 insertions(+), 14 deletions(-) >> >>> if (has_format && (format == DATA_FORMAT_BASE64)) { >>> write_data = g_base64_decode(data, &write_count); >>> } else { >>> write_data = (uint8_t *)data; >>> + write_count = strlen(data); >>> } >> >> Obviously, base64 is the only way to write an embedded NUL. But what > > Consider the JSON string "this \\u0000 is fun". But our JSON parser > chokes on it, so we can ignore it until that's fixed. See my "[PATCH] > check-qjson: More thorough testing of UTF-8 in strings", specifically > the comment right at the beginning of utf8_string(). > >> happens if the user requests base64 encoding, but the utf8 string that >> got passed in through JSON is not a valid base64-encoded string? Does >> g_base64_decode report an error in that case, and should you be handling >> the error here? > > Good question. I wish it had occured to GLib developers: > http://developer.gnome.org/glib/stable/glib-Base64-Encoding.html#g-base64-decode > > Seriously, I need to find out what the heck g_base64_decode() does when > it's fed crap. If it fails in a reliable and detectable manner, I need > to handle the failure. If not, I need to replace the function. > > Moreover, I should document which characters outside the base64 alphabet > are silently ignored, if any. All whitespace? Just newlines?
As far as I can tell, it never fails, but silently ignores characters outside the alphabet [A-Za-z0-9+/], as well as unpadded suffixes. Oh, and it does something weird when padding appears in the middle. Craptastic. We can either document this behavior as a feature, or we replace the function by one that accepts only valid base64. If we do the latter, we better specify the language we want to accept now, but I guess we could choose to delay the actual checking post 1.4. There's another use of g_base64_decode() in qmp_guest_file_write(). Which means guest agent command guest-file-write is similarly ill-defined. Mike, anything to be done there? ---<test program>--- #include <glib.h> #include <stdio.h> #include <stdlib.h> int main(int argc, char *argv[]) { char **ap, *b64; unsigned char *buf; size_t sz, i; for (ap = argv + 1; *ap; ap++) { printf("in : %s\n", *ap); buf = g_base64_decode(*ap, &sz); printf("out:"); for (i = 0; i < sz; i++) { printf(" %02x", buf[i]); } putchar('\n'); b64 = g_base64_encode(buf, sz); printf("b64: %s\n", b64); free(buf); } } ---<test run>--- in : 1 out: b64: in : 1= out: b64: in : 1== out: b64: in : 1=== out: d4 b64: 1A== in : 12 out: b64: in : 123 out: b64: in : 1234 out: d7 6d f8 b64: 1234 in : =1234 out: 03 5d b7 b64: A123 in : 1=234 out: d4 0d b7 b64: 1A23 in : <>?,./watch this!@#$%^&*()_+ out: ff 06 ad 72 1b 61 b64: /watchth in : /watchthis+ out: ff 06 ad 72 1b 61 b64: /watchth