Re: [PATCH] sms_decode_text(): Sanitize 8-bit data so that it is UTF8-clean.
On Mon, 2011-09-26 at 18:29 -0400, Nathan Williams wrote:
> This keeps ModemManager from crashing deep in the DBus libraries
> when a SMS Get() or List() DBus operation finds a message that
> isn't valid UTF-8 and/or has embedded NUL characters.
>
> I'll be putting up a separate patch as a proposal for how to avoid
> this problem in the new API.

Sounds fine; though in general we know the encoding that the
message comes in with, and we know we need to convert to UTF-8 for
D-Bus (and really, everything should be UTF-8 at the boundaries, it
would be just horrid to expose any charset encoding details to
clients and I don't think we have to). So we should be able to
convert to UTF-8 without any real loss of fidelity when reading the
message from the modem, and we should be able to convert from UTF-8
to a suitable charset (whatever we've selected from CSCS) when
sending messages too.

In what cases would we want to send or receive essentially binary
data via SMS? AFAIK most of these cases show up as base64 or
hex-string SMS if they aren't intended for human consumption.

In any case, applied, thanks!

Dan

> - Nathan

___
networkmanager-list mailing list
networkmanager-list@gnome.org
http://mail.gnome.org/mailman/listinfo/networkmanager-list
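For the encodings whose character set is known, the conversion to UTF-8 described above is mechanical. Below is a minimal sketch in plain C of turning a UCS-2BE payload into UTF-8. It only illustrates the idea and is not ModemManager code; the function name ucs2be_to_utf8 is hypothetical, and rejecting U+0000 is an assumption made here because a D-Bus string cannot carry an embedded NUL.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: convert big-endian UCS-2 (BMP only) to a newly
 * allocated, NUL-terminated UTF-8 string.  Returns NULL on odd length,
 * on a surrogate code unit, on an embedded U+0000, or if malloc fails. */
char *ucs2be_to_utf8(const uint8_t *in, size_t len)
{
    char *out, *p;
    size_t i;

    if (len % 2 != 0)
        return NULL;

    /* Each 16-bit unit expands to at most 3 UTF-8 bytes. */
    out = malloc((len / 2) * 3 + 1);
    if (out == NULL)
        return NULL;
    p = out;

    for (i = 0; i < len; i += 2) {
        unsigned cu = ((unsigned)in[i] << 8) | in[i + 1];

        /* Assumption: NUL and lone surrogates are treated as errors. */
        if (cu == 0 || (cu >= 0xD800 && cu <= 0xDFFF)) {
            free(out);
            return NULL;
        }
        if (cu < 0x80) {
            *p++ = (char)cu;
        } else if (cu < 0x800) {
            *p++ = (char)(0xC0 | (cu >> 6));
            *p++ = (char)(0x80 | (cu & 0x3F));
        } else {
            *p++ = (char)(0xE0 | (cu >> 12));
            *p++ = (char)(0x80 | ((cu >> 6) & 0x3F));
            *p++ = (char)(0x80 | (cu & 0x3F));
        }
    }
    *p = '\0';
    return out;
}
```

The caller owns the returned buffer and releases it with free(). The reverse direction (UTF-8 to the charset selected with CSCS) is not covered by this sketch.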
Re: [PATCH] sms_decode_text(): Sanitize 8-bit data so that it is UTF8-clean.
On Tuesday 27 September 2011, Dan Williams wrote:
> On Mon, 2011-09-26 at 18:29 -0400, Nathan Williams wrote:
> > This keeps ModemManager from crashing deep in the DBus libraries
> > when a SMS Get() or List() DBus operation finds a message that
> > isn't valid UTF-8 and/or has embedded NUL characters.
> >
> > I'll be putting up a separate patch as a proposal for how to avoid
> > this problem in the new API.
>
> Sounds fine; though in general we know the encoding that the
> message comes in with, and we know we need to convert to UTF-8 for
> D-Bus (and really, everything should be UTF-8 at the boundaries, it
> would be just horrid to expose any charset encoding details to
> clients and I don't think we have to). So we should be able to
> convert to UTF-8 without any real loss of fidelity when reading the
> message from the modem, and we should be able to convert from UTF-8
> to a suitable charset (whatever we've selected from CSCS) when
> sending messages too.
>
> In what cases would we want to send or receive essentially binary
> data via SMS? AFAIK most of these cases show up as base64 or
> hex-string SMS if they aren't intended for human consumption.
>
> In any case, applied, thanks!
>
> Dan
>
> > - Nathan

Hi Dan,

Here's a case you might want to consider. Sometimes mobile networks
send WAP pushes to signify that an MMS is ready for retrieval, or to
send the user revised phone settings etc. We can't currently do much
with those, but they are binary and can contain embedded \x00, which
D-Bus really won't transfer in a string. It's no good just ignoring
them and not showing the client the message, though, because they
take up valuable SIM-based SMS slots; if we did, the user would not
know of their existence and eventually the SIM would silently fill.
I had one WAP push last week that spanned 4 SMS slots.

On Wader we worked around this problem by zipping and Base64-encoding
the message before sending it over D-Bus [0] [1], but I couldn't
think of an elegant solution without changing the spec. If you come
up with a better solution please let me know and we can revise what
we have.

[0]: https://forge.betavine.net/plugins/scmsvn/viewcvs.php/trunk/src/core/wader/common/mal.py?root=bcmrev=1210r1=1193r2=1210
[1]: https://forge.betavine.net/plugins/scmsvn/viewcvs.php/trunk/src/core/wader/common/encoding.py?root=bcmrev=1210r1=1194r2=1210

Andrew
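The Base64 half of that workaround can be sketched as follows. This is an illustration in plain C, not Wader's code (the linked files are Python); the compression step is left out entirely, and the function name base64_encode is hypothetical. The point is only that the output uses a 65-character ASCII alphabet, so it is always valid UTF-8 and never contains a NUL.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: standard Base64 (RFC 4648 alphabet, '=' padding).
 * Returns a newly allocated NUL-terminated string, or NULL if malloc fails. */
char *base64_encode(const uint8_t *data, size_t len)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    char *out = malloc(4 * ((len + 2) / 3) + 1);
    char *p = out;
    size_t i = 0;
    uint32_t v;

    if (out == NULL)
        return NULL;

    /* Full 3-byte groups become 4 output characters. */
    for (; i + 2 < len; i += 3) {
        v = (uint32_t)data[i] << 16 | (uint32_t)data[i + 1] << 8 | data[i + 2];
        *p++ = tbl[(v >> 18) & 0x3F];
        *p++ = tbl[(v >> 12) & 0x3F];
        *p++ = tbl[(v >> 6) & 0x3F];
        *p++ = tbl[v & 0x3F];
    }

    /* One or two trailing bytes are padded with '='. */
    if (len - i == 1) {
        v = (uint32_t)data[i] << 16;
        *p++ = tbl[(v >> 18) & 0x3F];
        *p++ = tbl[(v >> 12) & 0x3F];
        *p++ = '=';
        *p++ = '=';
    } else if (len - i == 2) {
        v = (uint32_t)data[i] << 16 | (uint32_t)data[i + 1] << 8;
        *p++ = tbl[(v >> 18) & 0x3F];
        *p++ = tbl[(v >> 12) & 0x3F];
        *p++ = tbl[(v >> 6) & 0x3F];
        *p++ = '=';
    }
    *p = '\0';
    return out;
}
```

The cost of this approach is that the client has to know the text field is encoded, which is why it amounts to a change of contract even if the D-Bus signature stays the same.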
Re: [PATCH] sms_decode_text(): Sanitize 8-bit data so that it is UTF8-clean.
On Tue, 2011-09-27 at 13:18 -0500, Dan Williams wrote:
> On Mon, 2011-09-26 at 18:29 -0400, Nathan Williams wrote:
> > This keeps ModemManager from crashing deep in the DBus libraries
> > when a SMS Get() or List() DBus operation finds a message that
> > isn't valid UTF-8 and/or has embedded NUL characters.
> >
> > I'll be putting up a separate patch as a proposal for how to avoid
> > this problem in the new API.
>
> Sounds fine; though in general we know the encoding that the
> message comes in with, and we know we need to convert to UTF-8 for
> D-Bus (and really, everything should be UTF-8 at the boundaries, it
> would be just horrid to expose any charset encoding details to
> clients and I don't think we have to). So we should be able to
> convert to UTF-8 without any real loss of fidelity when reading the
> message from the modem, and we should be able to convert from UTF-8
> to a suitable charset (whatever we've selected from CSCS) when
> sending messages too.
>
> In what cases would we want to send or receive essentially binary
> data via SMS? AFAIK most of these cases show up as base64 or
> hex-string SMS if they aren't intended for human consumption.
>
> In any case, applied, thanks!

But the patch does appear to break the testcases, could you send
along a patch to update those as well?

/test_pdu3_8bit: **
ERROR:test-sms.c:231:test_pdu3_8bit: assertion failed (g_value_get_string(value) == ("\xe8\x32\x9b\xfd\x46\x97\xd9\xec\x37\xde")): ("\\xe82\\x9b\\xfdF\\x97\\xd9\\xec7\\xde" == "\3502\233\375F\227\331\3547\336")

Thanks,
Dan
Re: [PATCH] sms_decode_text(): Sanitize 8-bit data so that it is UTF8-clean.
On Tue, Sep 27, 2011 at 2:18 PM, Dan Williams <d...@redhat.com> wrote:
> On Mon, 2011-09-26 at 18:29 -0400, Nathan Williams wrote:
> > This keeps ModemManager from crashing deep in the DBus libraries
> > when a SMS Get() or List() DBus operation finds a message that
> > isn't valid UTF-8 and/or has embedded NUL characters.
> >
> > I'll be putting up a separate patch as a proposal for how to avoid
> > this problem in the new API.
>
> Sounds fine; though in general we know the encoding that the
> message comes in with, and we know we need to convert to UTF-8 for
> D-Bus (and really, everything should be UTF-8 at the boundaries, it
> would be just horrid to expose any charset encoding details to
> clients and I don't think we have to). So we should be able to
> convert to UTF-8 without any real loss of fidelity when reading the
> message from the modem, and we should be able to convert from UTF-8
> to a suitable charset (whatever we've selected from CSCS) when
> sending messages too.
>
> In what cases would we want to send or receive essentially binary
> data via SMS? AFAIK most of these cases show up as base64 or
> hex-string SMS if they aren't intended for human consumption.

We do do that conversion to UTF-8 when we know the transmission
character set, GSM-7 or UCS2. The one fly in this ointment is that
one of the possible encodings is, in fact, 8-bit data (TP-DCS value
of 04 or f4) with no associated character set. The particular case
that brought this to my attention was a test SMS from a carrier
that was supposed to contain, I believe, a polyphonic ringtone for
some Nokia handset.

I'll see about updating the testcases.

- Nathan
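To make the TP-DCS remark concrete, here is a rough sketch in C of how the alphabet can be read out of the data coding scheme octet. It follows my reading of the coding groups in 3GPP TS 23.038 and is not taken from ModemManager; the enum and function names are hypothetical, and compressed messages are simply reported as unknown.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
typedef enum {
    SMS_ALPHABET_GSM7,
    SMS_ALPHABET_8BIT,
    SMS_ALPHABET_UCS2,
    SMS_ALPHABET_UNKNOWN
} SmsAlphabet;

SmsAlphabet sms_alphabet_from_dcs(uint8_t dcs)
{
    uint8_t group = dcs >> 4;

    /* Groups 00xx and 01xx: general data coding, alphabet in bits 3..2. */
    if ((dcs & 0x80) == 0) {
        if (dcs & 0x20)                 /* bit 5 set: compressed text */
            return SMS_ALPHABET_UNKNOWN;
        switch ((dcs >> 2) & 0x03) {
        case 0:  return SMS_ALPHABET_GSM7;
        case 1:  return SMS_ALPHABET_8BIT;
        case 2:  return SMS_ALPHABET_UCS2;
        default: return SMS_ALPHABET_UNKNOWN;   /* reserved */
        }
    }

    /* Groups 1100 and 1101: message waiting indication, GSM 7-bit. */
    if (group == 0xC || group == 0xD)
        return SMS_ALPHABET_GSM7;

    /* Group 1110: message waiting indication, UCS2. */
    if (group == 0xE)
        return SMS_ALPHABET_UCS2;

    /* Group 1111: data coding / message class, bit 2 selects 8-bit data. */
    if (group == 0xF)
        return (dcs & 0x04) ? SMS_ALPHABET_8BIT : SMS_ALPHABET_GSM7;

    return SMS_ALPHABET_UNKNOWN;        /* groups 1000..1011 are reserved */
}
```

Under this reading both values mentioned above, 0x04 and 0xf4, land in the 8-bit case, which is the one encoding for which no character set conversion to UTF-8 is defined.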
Re: [PATCH] sms_decode_text(): Sanitize 8-bit data so that it is UTF8-clean.
On Tue, 2011-09-27 at 14:55 -0400, Nathan Williams wrote:
> On Tue, Sep 27, 2011 at 2:18 PM, Dan Williams <d...@redhat.com> wrote:
> > On Mon, 2011-09-26 at 18:29 -0400, Nathan Williams wrote:
> > > This keeps ModemManager from crashing deep in the DBus libraries
> > > when a SMS Get() or List() DBus operation finds a message that
> > > isn't valid UTF-8 and/or has embedded NUL characters.
> > >
> > > I'll be putting up a separate patch as a proposal for how to avoid
> > > this problem in the new API.
> >
> > Sounds fine; though in general we know the encoding that the
> > message comes in with, and we know we need to convert to UTF-8 for
> > D-Bus (and really, everything should be UTF-8 at the boundaries, it
> > would be just horrid to expose any charset encoding details to
> > clients and I don't think we have to). So we should be able to
> > convert to UTF-8 without any real loss of fidelity when reading the
> > message from the modem, and we should be able to convert from UTF-8
> > to a suitable charset (whatever we've selected from CSCS) when
> > sending messages too.
> >
> > In what cases would we want to send or receive essentially binary
> > data via SMS? AFAIK most of these cases show up as base64 or
> > hex-string SMS if they aren't intended for human consumption.
>
> We do do that conversion to UTF-8 when we know the transmission
> character set, GSM-7 or UCS2. The one fly in this ointment is that
> one of the possible encodings is, in fact, 8-bit data (TP-DCS value
> of 04 or f4) with no associated character set. The particular case
> that brought this to my attention was a test SMS from a carrier
> that was supposed to contain, I believe, a polyphonic ringtone for
> some Nokia handset.

Ok, I suppose we could also expose the data as a byte array in the
Get() method call along with the 'text' argument, since it seems
like we can probably tell whether it's supposed to be a string or
not.

Dan
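One way to decide between the string and the byte array for a given payload is to check whether the bytes would survive as a D-Bus string at all. The following is a small sketch in C of such a check, under the assumption that "safe" means well-formed UTF-8 with no embedded NUL; the function name is_dbus_safe_text is hypothetical, and in a GLib program g_utf8_validate() would be the more natural tool.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check: true if the buffer is well-formed UTF-8 and has
 * no embedded NUL, i.e. it could be passed on as a string unchanged. */
bool is_dbus_safe_text(const uint8_t *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        uint8_t c = s[i];
        size_t n, k;          /* n = number of continuation bytes */
        uint32_t cp, min;     /* decoded code point, smallest legal value */

        if (c == 0)
            return false;
        if (c < 0x80) {
            i++;
            continue;
        }
        if ((c & 0xE0) == 0xC0)      { n = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { n = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { n = 3; cp = c & 0x07; min = 0x10000; }
        else
            return false;             /* stray continuation or invalid lead */

        if (len - i <= n)
            return false;             /* sequence truncated by end of data */

        for (k = 1; k <= n; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }

        /* Reject overlong forms, surrogates and out-of-range values. */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;

        i += n + 1;
    }
    return true;
}
```

A Get() implementation along these lines could fill in 'text' only when the check passes and always provide the raw bytes, though that is a design guess rather than anything settled in this thread.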
[PATCH] sms_decode_text(): Sanitize 8-bit data so that it is UTF8-clean.
This keeps ModemManager from crashing deep in the DBus libraries
when a SMS Get() or List() DBus operation finds a message that
isn't valid UTF-8 and/or has embedded NUL characters.

I'll be putting up a separate patch as a proposal for how to avoid
this problem in the new API.

- Nathan

From b4be9e8cfa79cfb1d63e69a151078c75f38131d9 Mon Sep 17 00:00:00 2001
From: Nathan Williams <n...@chromium.org>
Date: Fri, 23 Sep 2011 17:21:15 -0400
Subject: [PATCH] sms_decode_text(): Sanitize 8-bit data so that it is UTF8-clean.

When receiving a SMS message with raw 8-bit data, sanitize it by
replacing non-ASCII characters with \xNN escape sequences. This
prevents a problem further down the line where the body of the
message is passed into DBus as a string, and DBus chokes because the
string isn't valid UTF-8.

Once the ModemManager SMS API can support non-string message bodies,
this should be revisited.

BUG=chrome-os-partner:5953
TEST=Run network_ModemManagerSMS.py with the PDU from this bug.

Change-Id: Ic33a365f9a065c49a325e047e4c3f5e81450fa1f
Reviewed-on: http://gerrit.chromium.org/gerrit/8232
Reviewed-by: Eric Shienbrood <e...@chromium.org>
Tested-by: Nathan J. Williams <n...@chromium.org>
Commit-Ready: Nathan J. Williams <n...@chromium.org>
---
 src/mm-sms-utils.c |   21 +++--
 1 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/src/mm-sms-utils.c b/src/mm-sms-utils.c
index 3f56a64..89eae4b 100644
--- a/src/mm-sms-utils.c
+++ b/src/mm-sms-utils.c
@@ -13,6 +13,9 @@
  * Copyright (C) 2011 Red Hat, Inc.
  */
 
+#include <ctype.h>
+#include <stdio.h>
+
 #include <glib.h>
 
 #include "mm-charsets.h"
@@ -200,8 +203,22 @@ sms_decode_text (const guint8 *text, int len, SmsEncoding encoding, int bit_offs
         g_free (unpacked);
     } else if (encoding == MM_SMS_ENCODING_UCS2)
         utf8 = g_convert ((char *) text, len, "UTF8", "UCS-2BE", NULL, NULL, NULL);
-    else if (encoding == MM_SMS_ENCODING_8BIT)
-        utf8 = g_strndup ((const char *)text, len);
+    else if (encoding == MM_SMS_ENCODING_8BIT) {
+        /* DBus may choke on non-UTF8 strings, so we have some sanitizing to do */
+        char *p;
+        int i;
+        utf8 = g_malloc0 (4*len+1); /* Worst case: Every byte becomes \xFF */
+        p = utf8;
+        for (i = 0 ; i < len ; i++) {
+            if (isascii (text[i]) && text[i] != '\0')
+                *p++ = text[i];
+            else {
+                sprintf(p, "\\x%02x", text[i]);
+                p += 4;
+            }
+        }
+        *p = '\0';
+    }
     else
         utf8 = g_strdup ("");
 
-- 
1.7.3.1