Re: [PATCH] sms_decode_text(): Sanitize 8-bit data so that it is UTF8-clean.

2011-09-27 Thread Dan Williams
On Mon, 2011-09-26 at 18:29 -0400, Nathan Williams wrote:
> This keeps ModemManager from crashing deep in the DBus libraries when
> a SMS Get() or List() DBus operation finds a message that isn't valid
> UTF-8 and/or has embedded NUL characters.
>
> I'll be putting up a separate patch as a proposal for how to avoid
> this problem in the new API.

Sounds fine; though in general we know the encoding that the message
comes in with, and we know we need to convert to UTF-8 for D-Bus (and
really, everything should be UTF-8 at the boundaries, it would be just
horrid to expose any charset encoding details to clients and I don't
think we have to).  So we should be able to convert to UTF-8 without any
real loss of fidelity when reading the message from the modem, and we
should be able to convert from UTF-8 to a suitable charset (whatever
we've selected from CSCS) when sending messages too.
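
A rough sketch of the receive-side conversion described above, assuming the
payload is big-endian UCS-2 as in the MM_SMS_ENCODING_UCS2 case. The helper
name ucs2be_to_utf8 is hypothetical; ModemManager itself calls g_convert()
for this. Surrogate code units, which plain UCS-2 does not define, are
replaced with U+FFFD here, which is a choice made for this sketch only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not ModemManager API): convert big-endian UCS-2
 * to a newly allocated NUL-terminated UTF-8 string.  Returns NULL on an
 * odd byte count or allocation failure; the caller frees the result. */
static char *
ucs2be_to_utf8 (const unsigned char *in, size_t len)
{
    unsigned char *out, *p;
    size_t i;

    assert (in != NULL || len == 0);

    if (len % 2 != 0)
        return NULL;

    /* Each 16-bit code unit needs at most 3 UTF-8 bytes. */
    out = malloc ((len / 2) * 3 + 1);
    if (out == NULL)
        return NULL;

    p = out;
    for (i = 0; i < len; i += 2) {
        unsigned int cp = ((unsigned int) in[i] << 8) | in[i + 1];

        /* Surrogates are not valid UCS-2 characters; substitute U+FFFD. */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            cp = 0xFFFD;

        if (cp < 0x80) {
            *p++ = (unsigned char) cp;
        } else if (cp < 0x800) {
            *p++ = (unsigned char) (0xC0 | (cp >> 6));
            *p++ = (unsigned char) (0x80 | (cp & 0x3F));
        } else {
            *p++ = (unsigned char) (0xE0 | (cp >> 12));
            *p++ = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
            *p++ = (unsigned char) (0x80 | (cp & 0x3F));
        }
    }
    *p = '\0';
    return (char *) out;
}
```

Note that a 0x0000 code unit would still produce an embedded NUL in the
output, so a real implementation would have to decide how to handle that
before handing the string to D-Bus.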

In what cases would we want to send or receive essentially binary data
via SMS?  AFAIK most of these cases show up as base64 or hex-string SMS
if they aren't intended for human consumption.

In any case, applied, thanks!

Dan

___
networkmanager-list mailing list
networkmanager-list@gnome.org
http://mail.gnome.org/mailman/listinfo/networkmanager-list


Re: [PATCH] sms_decode_text(): Sanitize 8-bit data so that it is UTF8-clean.

2011-09-27 Thread Andrew Bird (Sphere Systems)
On Tuesday 27 September 2011, Dan Williams wrote:
> On Mon, 2011-09-26 at 18:29 -0400, Nathan Williams wrote:
> > This keeps ModemManager from crashing deep in the DBus libraries when
> > a SMS Get() or List() DBus operation finds a message that isn't valid
> > UTF-8 and/or has embedded NUL characters.
> >
> > I'll be putting up a separate patch as a proposal for how to avoid
> > this problem in the new API.
>
> Sounds fine; though in general we know the encoding that the message
> comes in with, and we know we need to convert to UTF-8 for D-Bus (and
> really, everything should be UTF-8 at the boundaries, it would be just
> horrid to expose any charset encoding details to clients and I don't
> think we have to).  So we should be able to convert to UTF-8 without any
> real loss of fidelity when reading the message from the modem, and we
> should be able to convert from UTF-8 to a suitable charset (whatever
> we've selected from CSCS) when sending messages too.
>
> In what cases would we want to send or receive essentially binary data
> via SMS?  AFAIK most of these cases show up as base64 or hex-string SMS
> if they aren't intended for human consumption.
>
> In any case, applied, thanks!
>
> Dan

Hi Dan,
Here's a case you might want to consider. Mobile networks sometimes send
WAP pushes to signal that an MMS is ready for retrieval, or to deliver
revised phone settings to the user. We can't currently do much with those,
but they are binary and can contain embedded \x00 bytes, which D-Bus won't
transfer in a string. It's no good just ignoring them and hiding them from
the client, though, because they take up valuable SIM-based SMS slots: the
user would not know they exist, and the SIM would eventually fill up
silently. I had one WAP push last week that spanned 4 SMS slots. On Wader we
worked around this problem by zipping and Base64-encoding the message before
sending it over D-Bus [0] [1], but I couldn't think of an elegant solution
that didn't involve changing the spec. If you come up with a better
solution, please let me know and we can revise what we have.

[0]: https://forge.betavine.net/plugins/scmsvn/viewcvs.php/trunk/src/core/wader/common/mal.py?root=bcm&rev=1210&r1=1193&r2=1210
[1]: https://forge.betavine.net/plugins/scmsvn/viewcvs.php/trunk/src/core/wader/common/encoding.py?root=bcm&rev=1210&r1=1194&r2=1210
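
To illustrate why the workaround above gets binary payloads across D-Bus
safely, here is a minimal sketch of the Base64 step only. It is not Wader's
code (Wader is Python), the name base64_encode is hypothetical, and the
compression step is left out because it would need an external library. The
point is that the output uses only printable ASCII, so embedded \x00 bytes
never reach the D-Bus string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: encode arbitrary bytes as standard Base64
 * (RFC 4648 alphabet, '=' padding).  Returns a newly allocated
 * NUL-terminated string, or NULL on allocation failure. */
static char *
base64_encode (const unsigned char *data, size_t len)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    char *out, *p;
    size_t i;

    assert (data != NULL || len == 0);

    /* Every group of up to 3 input bytes becomes 4 output characters. */
    out = malloc (4 * ((len + 2) / 3) + 1);
    if (out == NULL)
        return NULL;

    p = out;
    /* Full 3-byte groups. */
    for (i = 0; i + 2 < len; i += 3) {
        unsigned long v = ((unsigned long) data[i] << 16) |
                          ((unsigned long) data[i + 1] << 8) |
                          data[i + 2];
        *p++ = tbl[(v >> 18) & 0x3F];
        *p++ = tbl[(v >> 12) & 0x3F];
        *p++ = tbl[(v >> 6) & 0x3F];
        *p++ = tbl[v & 0x3F];
    }
    /* One or two trailing bytes, padded with '='. */
    if (i < len) {
        unsigned long v = (unsigned long) data[i] << 16;
        if (i + 1 < len)
            v |= (unsigned long) data[i + 1] << 8;
        *p++ = tbl[(v >> 18) & 0x3F];
        *p++ = tbl[(v >> 12) & 0x3F];
        *p++ = (i + 1 < len) ? tbl[(v >> 6) & 0x3F] : '=';
        *p++ = '=';
    }
    *p = '\0';
    return out;
}
```

The cost is that every client has to know the body is encoded and decode it
again, which is the part that doesn't fit the existing spec.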

Andrew


Re: [PATCH] sms_decode_text(): Sanitize 8-bit data so that it is UTF8-clean.

2011-09-27 Thread Dan Williams
On Tue, 2011-09-27 at 13:18 -0500, Dan Williams wrote:
> On Mon, 2011-09-26 at 18:29 -0400, Nathan Williams wrote:
> > This keeps ModemManager from crashing deep in the DBus libraries when
> > a SMS Get() or List() DBus operation finds a message that isn't valid
> > UTF-8 and/or has embedded NUL characters.
> >
> > I'll be putting up a separate patch as a proposal for how to avoid
> > this problem in the new API.
>
> Sounds fine; though in general we know the encoding that the message
> comes in with, and we know we need to convert to UTF-8 for D-Bus (and
> really, everything should be UTF-8 at the boundaries, it would be just
> horrid to expose any charset encoding details to clients and I don't
> think we have to).  So we should be able to convert to UTF-8 without any
> real loss of fidelity when reading the message from the modem, and we
> should be able to convert from UTF-8 to a suitable charset (whatever
> we've selected from CSCS) when sending messages too.
>
> In what cases would we want to send or receive essentially binary data
> via SMS?  AFAIK most of these cases show up as base64 or hex-string SMS
> if they aren't intended for human consumption.
>
> In any case, applied, thanks!

But the patch does appear to break the testcases, could you send along a
patch to update those as well?

/test_pdu3_8bit: **
ERROR:test-sms.c:231:test_pdu3_8bit: assertion failed
(g_value_get_string(value) == ("\xe8\x32\x9b\xfd\x46\x97\xd9\xec\x37\xde")):
("\\xe82\\x9b\\xfdF\\x97\\xd9\\xec7\\xde" == "\3502\233\375F\227\331\3547\336")

Thanks,
Dan


Re: [PATCH] sms_decode_text(): Sanitize 8-bit data so that it is UTF8-clean.

2011-09-27 Thread Nathan Williams
On Tue, Sep 27, 2011 at 2:18 PM, Dan Williams d...@redhat.com wrote:

> On Mon, 2011-09-26 at 18:29 -0400, Nathan Williams wrote:
> > This keeps ModemManager from crashing deep in the DBus libraries when
> > a SMS Get() or List() DBus operation finds a message that isn't valid
> > UTF-8 and/or has embedded NUL characters.
> >
> > I'll be putting up a separate patch as a proposal for how to avoid
> > this problem in the new API.
>
> Sounds fine; though in general we know the encoding that the message
> comes in with, and we know we need to convert to UTF-8 for D-Bus (and
> really, everything should be UTF-8 at the boundaries, it would be just
> horrid to expose any charset encoding details to clients and I don't
> think we have to).  So we should be able to convert to UTF-8 without any
> real loss of fidelity when reading the message from the modem, and we
> should be able to convert from UTF-8 to a suitable charset (whatever
> we've selected from CSCS) when sending messages too.
>
> In what cases would we want to send or receive essentially binary data
> via SMS?  AFAIK most of these cases show up as base64 or hex-string SMS
> if they aren't intended for human consumption.


We do do that conversion to UTF-8 when we know the transmission character
set, GSM-7 or UCS2. The one fly in this ointment is that one of the possible
encodings is, in fact, 8-bit data (TP-DCS value of 04 or f4) with no
associated character set. The particular case that brought this to my
attention was a test SMS from a carrier that was supposed to contain, I
believe, a polyphonic ringtone for some Nokia handset.
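
For reference, a small sketch of how the TP-DCS octet maps to an alphabet
for the two coding groups mentioned here (04 falls in the general group, f4
in the data coding/message class group). The enum and function names are
hypothetical and not ModemManager's, and the sketch ignores the compression
bit and every other coding group.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; ModemManager's own enum differs. */
typedef enum {
    DCS_ALPHABET_GSM7,
    DCS_ALPHABET_8BIT,
    DCS_ALPHABET_UCS2,
    DCS_ALPHABET_UNKNOWN
} DcsAlphabet;

static DcsAlphabet
dcs_alphabet (uint8_t dcs)
{
    /* General data coding group (bits 7..6 == 00):
     * bits 3..2 select the alphabet. */
    if ((dcs & 0xC0) == 0x00) {
        switch ((dcs >> 2) & 0x03) {
        case 0:  return DCS_ALPHABET_GSM7;
        case 1:  return DCS_ALPHABET_8BIT;
        case 2:  return DCS_ALPHABET_UCS2;
        default: return DCS_ALPHABET_UNKNOWN; /* reserved value */
        }
    }

    /* Data coding / message class group (bits 7..4 == 1111):
     * bit 2 chooses between the GSM 7-bit alphabet and 8-bit data. */
    if ((dcs & 0xF0) == 0xF0)
        return (dcs & 0x04) ? DCS_ALPHABET_8BIT : DCS_ALPHABET_GSM7;

    /* Other coding groups are not handled in this sketch. */
    return DCS_ALPHABET_UNKNOWN;
}
```

Both 0x04 and 0xf4 come out as DCS_ALPHABET_8BIT, which is the case where
there is no character set to convert from.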

I'll see about updating the testcases.

- Nathan

Re: [PATCH] sms_decode_text(): Sanitize 8-bit data so that it is UTF8-clean.

2011-09-27 Thread Dan Williams
On Tue, 2011-09-27 at 14:55 -0400, Nathan Williams wrote:
 
 
> On Tue, Sep 27, 2011 at 2:18 PM, Dan Williams d...@redhat.com wrote:
> > On Mon, 2011-09-26 at 18:29 -0400, Nathan Williams wrote:
> > > This keeps ModemManager from crashing deep in the DBus libraries when
> > > a SMS Get() or List() DBus operation finds a message that isn't valid
> > > UTF-8 and/or has embedded NUL characters.
> > >
> > > I'll be putting up a separate patch as a proposal for how to avoid
> > > this problem in the new API.
> >
> > Sounds fine; though in general we know the encoding that the message
> > comes in with, and we know we need to convert to UTF-8 for D-Bus (and
> > really, everything should be UTF-8 at the boundaries, it would be just
> > horrid to expose any charset encoding details to clients and I don't
> > think we have to).  So we should be able to convert to UTF-8 without any
> > real loss of fidelity when reading the message from the modem, and we
> > should be able to convert from UTF-8 to a suitable charset (whatever
> > we've selected from CSCS) when sending messages too.
> >
> > In what cases would we want to send or receive essentially binary data
> > via SMS?  AFAIK most of these cases show up as base64 or hex-string SMS
> > if they aren't intended for human consumption.
>
> We do do that conversion to UTF-8 when we know the transmission
> character set, GSM-7 or UCS2. The one fly in this ointment is that one
> of the possible encodings is, in fact, 8-bit data (TP-DCS value of
> 04 or f4) with no associated character set. The particular case that
> brought this to my attention was a test SMS from a carrier that was
> supposed to contain, I believe, a polyphonic ringtone for some Nokia
> handset.

Ok, I suppose we could also expose the data as a byte array in the Get()
method call along with the 'text' argument, since it seems like we can
probably tell whether it's supposed to be a string or not.

Dan



[PATCH] sms_decode_text(): Sanitize 8-bit data so that it is UTF8-clean.

2011-09-26 Thread Nathan Williams
This keeps ModemManager from crashing deep in the DBus libraries when
a SMS Get() or List() DBus operation finds a message that isn't valid
UTF-8 and/or has embedded NUL characters.

I'll be putting up a separate patch as a proposal for how to avoid
this problem in the new API.

- Nathan
From b4be9e8cfa79cfb1d63e69a151078c75f38131d9 Mon Sep 17 00:00:00 2001
From: Nathan Williams n...@chromium.org
Date: Fri, 23 Sep 2011 17:21:15 -0400
Subject: [PATCH] sms_decode_text(): Sanitize 8-bit data so that it is UTF8-clean.

When receiving a SMS message with raw 8-bit data, sanitize it by
replacing non-ASCII characters with \xNN escape sequences. This
prevents a problem further down the line where the body of the message
is passed into DBus as a string, and DBus chokes because the string
isn't valid UTF-8.

Once the ModemManager SMS API can support non-string message bodies,
this should be revisited.

BUG=chrome-os-partner:5953
TEST=Run network_ModemManagerSMS.py with the PDU from this bug.

Change-Id: Ic33a365f9a065c49a325e047e4c3f5e81450fa1f
Reviewed-on: http://gerrit.chromium.org/gerrit/8232
Reviewed-by: Eric Shienbrood e...@chromium.org
Tested-by: Nathan J. Williams n...@chromium.org
Commit-Ready: Nathan J. Williams n...@chromium.org
---
 src/mm-sms-utils.c |   21 +++++++++++++++++++--
 1 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/src/mm-sms-utils.c b/src/mm-sms-utils.c
index 3f56a64..89eae4b 100644
--- a/src/mm-sms-utils.c
+++ b/src/mm-sms-utils.c
@@ -13,6 +13,9 @@
  * Copyright (C) 2011 Red Hat, Inc.
  */
 
+#include <ctype.h>
+#include <stdio.h>
+
 #include <glib.h>
 
 #include "mm-charsets.h"
@@ -200,8 +203,22 @@ sms_decode_text (const guint8 *text, int len, SmsEncoding encoding, int bit_offs
         g_free (unpacked);
     } else if (encoding == MM_SMS_ENCODING_UCS2)
         utf8 = g_convert ((char *) text, len, "UTF8", "UCS-2BE", NULL, NULL, NULL);
-    else if (encoding == MM_SMS_ENCODING_8BIT)
-        utf8 = g_strndup ((const char *)text, len);
+    else if (encoding == MM_SMS_ENCODING_8BIT) {
+        /* DBus may choke on non-UTF8 strings, so we have some sanitizing to do */
+        char *p;
+        int i;
+        utf8 = g_malloc0 (4*len+1); /* Worst case: Every byte becomes \xFF */
+        p = utf8;
+        for (i = 0 ; i < len ; i++) {
+            if (isascii (text[i]) && text[i] != '\0')
+                *p++ = text[i];
+            else {
+                sprintf(p, "\\x%02x", text[i]);
+                p += 4;
+            }
+        }
+        *p = '\0';
+    }
     else
         utf8 = g_strdup ("");
 
 
-- 
1.7.3.1
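
For anyone who wants to try the escaping outside of ModemManager, here is a
standalone sketch of the same idea using only the C library. The function
name sanitize_8bit is hypothetical, and malloc()/snprintf() stand in for the
GLib calls in the patch; the escaping rule (NUL and non-ASCII bytes become a
four-character \xNN sequence) is the one the patch describes.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical standalone helper, not part of ModemManager: copy 7-bit
 * ASCII bytes through unchanged and escape NUL and non-ASCII bytes as
 * \xNN.  Returns a newly allocated NUL-terminated string, or NULL on
 * allocation failure; the caller frees it with free(). */
static char *
sanitize_8bit (const unsigned char *data, size_t len)
{
    char *out, *p;
    size_t i;

    assert (data != NULL || len == 0);

    /* Worst case: every input byte expands to 4 characters, plus the NUL. */
    out = malloc (4 * len + 1);
    if (out == NULL)
        return NULL;

    p = out;
    for (i = 0; i < len; i++) {
        if (data[i] != 0 && data[i] < 0x80) {
            *p++ = (char) data[i];
        } else {
            /* Writes exactly 4 characters followed by a terminating NUL. */
            snprintf (p, 5, "\\x%02x", data[i]);
            p += 4;
        }
    }
    *p = '\0';
    return out;
}
```

Feeding it the ten bytes e8 32 9b fd 46 97 d9 ec 37 de gives
\xe82\x9b\xfdF\x97\xd9\xec7\xde. Note that a literal backslash in the input
is passed through unescaped, so the result is safe for D-Bus but cannot
always be decoded back to the original bytes.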
