Hi,

Alexander Malysh wrote:
Hi,

On 20.07.2006 at 12:48, Peter Christensen <[EMAIL PROTECTED]> wrote:

Hi Alex,

Awesome initiative! I've been hoping for this to happen for quite a

Thanks!

while. There are a few issues though:

1. In the gwlib/latin1_to_gsm.h, <SP> (space) is replaced with <ESC> (0x1B), and <ESC> is mapped to NRP instead of just <ESC>. (If you follow me)


OK, that was a typo; I changed <SP> to 0x20. But <ESC> should be NRP because it is not representable in GSM.


I see your point. Assuming that kannel is updated if and when the GSM charset is extended further in the future, the <ESC> really should be NRP. Then again, I've come across a few gateways which required you to transmit the escape sign yourself for some reason... They probably used the iso-8859-1 charset and I needed € or whatever, and in such cases charset_utf8_to_gsm wouldn't be called anyway. My thought was primarily in case the GSM charset was changed further. (In short, I can live without the <ESC> :D)
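To make the mapping under discussion concrete, here is a minimal sketch in Python (not the actual contents of gwlib/latin1_to_gsm.h). The table is deliberately partial, the function name is made up for illustration, and the value used for NRP is an assumption; only the handling of <SP> and <ESC> follows what was agreed above.

```python
# Hypothetical placeholder for "non-representable"; the real NRP value in
# Kannel may differ, this is an assumption for the sketch.
NRP = ord("?")

# Partial, illustrative Latin-1 -> GSM 03.38 table (code point -> septet).
LATIN1_TO_GSM = {
    0x20: 0x20,  # <SP> stays a space (the typo had mapped it to 0x1B)
    0x40: 0x00,  # @
    0xA3: 0x01,  # £
    0x24: 0x02,  # $
    0x5F: 0x11,  # _
    0x1B: NRP,   # <ESC> has no GSM character of its own
}


def latin1_to_gsm_sketch(code: int) -> int:
    """Map one Latin-1 code point to a GSM septet, or NRP if unknown."""
    if code in LATIN1_TO_GSM:
        return LATIN1_TO_GSM[code]
    # ASCII letters and digits sit at the same positions in both charsets.
    if code < 0x80 and chr(code).isalnum():
        return code
    return NRP
```

With this table a space survives unchanged, while a literal <ESC> in the input is replaced rather than passed through as the GSM escape septet.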


2. For some odd reason, smsbox trims the message to 160 characters, while it is in utf-8 format... My usual charset test message which contains all GSM characters except the Greek ones (wasn't possible before now), looks like this:

Test: @£$¥èéùìòÇ
Øø
Åå_ÆæßÉ !"#¤%&'()*+,-./0123456789:;<=>?¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà^{}\[~]|€

Which in UTF-8 takes up 163 octets, but only 141 septets in GSM. When transmitting, the € is omitted, and judging from an ngrep of the data transferred from smsbox to bearerbox, it is smsbox which does the trimming. For the record, the string is exactly 160 octets long when € is omitted. Apparently it uses the size of the GSM string to determine when to split, but the trimming/splitting is done on the UTF-8 string. Obviously it is sms_split which is to blame, but why is this function used at all if splitting is done in bearerbox (according to comments in the source)? This problem is probably not directly related to the utf-8 patch.
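The figures above can be checked with a short Python sketch. It is not Kannel code: the helper names are made up, the two line breaks in the test message are assumed to be the GSM LF and CR characters, and the septet counter assumes every character is representable in the GSM default alphabet or its extension table.

```python
# Printable characters from the GSM 03.38 extension table; each one is sent
# as ESC + code and therefore costs two septets.
GSM_EXTENDED = set("^{}\\[~]|€")


def utf8_octets(text: str) -> int:
    """Length of the text once encoded as UTF-8."""
    return len(text.encode("utf-8"))


def gsm_septets(text: str) -> int:
    """Septet count, assuming every character exists in GSM (no validation)."""
    return sum(2 if ch in GSM_EXTENDED else 1 for ch in text)


def truncate_octets(text: str, limit: int) -> str:
    """Cut the UTF-8 encoding at `limit` octets, dropping a partial last char."""
    return text.encode("utf-8")[:limit].decode("utf-8", errors="ignore")


# The test message, with the line breaks assumed to be LF and CR.
MESSAGE = (
    "Test: @£$¥èéùìòÇ\nØø\rÅå_ÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?¡"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà^{}\\[~]|€"
)

print(utf8_octets(MESSAGE))    # 163
print(gsm_septets(MESSAGE))    # 141
print(truncate_octets(MESSAGE, 160) == MESSAGE[:-1])  # True: only the € is lost
```

A length check on the GSM form (141 septets) says the message fits in a single SMS, but cutting the UTF-8 form at 160 octets removes exactly the trailing €, which would match the observed symptom.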

hmm, strange... I will look in smsbox code if you don't beat me ;)
smsbox checks the maximum number of allowed messages from the config and tries to split the message with sms_split. If there are more parts than allowed, smsbox sends only the allowed count.
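As a rough illustration of that behaviour, the following Python sketch splits on GSM septets rather than UTF-8 octets and then caps the number of parts. It is not sms_split itself: the function name, the parameter names and the default limits (160 septets for a single message, 153 per part when concatenating) are assumptions made for the example.

```python
# Printable GSM 03.38 extension characters; each costs two septets.
GSM_EXTENDED = set("^{}\\[~]|€")


def split_by_septets(text, single_limit=160, part_limit=153, max_messages=3):
    """Split text into parts by GSM septet cost, keeping at most max_messages.

    Assumes every character is representable in GSM; no validation is done.
    """
    def cost(ch):
        return 2 if ch in GSM_EXTENDED else 1

    # A message that fits in one SMS is not split at all.
    if sum(cost(ch) for ch in text) <= single_limit:
        return [text]

    parts, current, used = [], [], 0
    for ch in text:
        c = cost(ch)
        # Start a new part before overflowing, so an escaped character is
        # never separated from its escape septet.
        if used + c > part_limit:
            parts.append("".join(current))
            current, used = [], 0
        current.append(ch)
        used += c
    if current:
        parts.append("".join(current))

    # Parts beyond the configured maximum are dropped, not sent.
    return parts[:max_messages]
```

Because the budget is counted in septets per character, a trailing € is either kept whole or moved to the next part, never cut off by an octet-based trim.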


Heh, beating you probably wouldn't solve anything :o)
Actually I would have looked at it myself, if it weren't for the fact that it apparently splits the message just to combine the lot again, which seemed kinda silly.

Med venlig hilsen / Best regards

Peter Christensen

Developer
------------------
Cool Systems ApS

Tel: +45 2888 1600
Mail: [EMAIL PROTECTED]
www: www.coolsystems.dk


Alexander Malysh wrote:
Hi all,
at http://www.kannel.org/~amalysh/kannel-utf8.patch is a not so huge patch that converts the internal kannel charset to UTF-8. Please note that I didn't add smsbox compatibility code, which means smsbox expects the text body to be encoded in UTF-8 by default, and MOs will also be forwarded in UTF-8. This can be worked around with the charset CGI variable.
 Please test it and send feedback/patches.
I will maintain this patch for a while as long as we don't decide to commit it to CVS.
 --Thanks,
Alex



