Hi,
Alexander Malysh wrote:
Hi,
On 20.07.2006 at 12:48, Peter Christensen <[EMAIL PROTECTED]> wrote:
Hi Alex,
Awesome initiative! I've been hoping for this to happen for quite a
while.
Thanks!
There are a few issues though:
1. In the gwlib/latin1_to_gsm.h, <SP> (space) is replaced with <ESC>
(0x1B), and <ESC> is mapped to NRP instead of just <ESC>. (If you
follow me)
OK, that was a typo; I changed <SP> to 0x20. But <ESC> should stay NRP
because it is not representable in GSM.
I see your point. Assuming that kannel is updated if and when the GSM
charset is extended further in the future, the <ESC> really should be
NRP, but then again, I've experienced a few gateways which required you
to transmit the escape sign yourself for some reason... They probably
used the iso-8859-1 charset and I needed € or whatever, and in such cases
the charset_utf8_to_gsm wouldn't be called anyway. My thought was
primarily in case the GSM charset was changed further. (In short, I can
live without the <ESC> :D)
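The mapping agreed on in item 1 can be sketched roughly as follows. This is
not the contents of gwlib/latin1_to_gsm.h: the table is a hypothetical
excerpt with only a few entries, and the value chosen for NRP ('?') is an
assumption made for illustration.

```python
# Assumption: NRP ("non-representable") is encoded as '?'.
NRP = ord("?")

# Hypothetical excerpt of a Latin-1 -> GSM 03.38 lookup table.
LATIN1_TO_GSM = {
    "@": 0x00,
    "£": 0x01,
    "$": 0x02,
    "_": 0x11,
    " ": 0x20,  # <SP> maps to 0x20, not to 0x1B
    "A": 0x41,
}


def latin1_to_gsm(ch: str) -> int:
    """Return the GSM code for ch, or NRP if ch is not in the table."""
    # <ESC> is deliberately absent from the table, so it falls back to NRP.
    return LATIN1_TO_GSM.get(ch, NRP)
```

With this table, latin1_to_gsm(" ") gives 0x20 and latin1_to_gsm("\x1b")
gives NRP.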
2. For some odd reason, smsbox trims the message to 160 characters,
while it is in utf-8 format... My usual charset test message which
contains all GSM characters except the Greek ones (wasn't possible
before now), looks like this:
Test: @£$¥èéùìòÇ
Øø
Åå_ÆæßÉ
!"#¤%&'()*+,-./0123456789:;<=>?¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑܧ¿abcdefghijklmnopqrstuvwxyzäöñüà^{}\[~]|€
Which in UTF-8 takes up 163 octets, but only 141 septets in GSM. When
transmitting, the € is omitted, and judging from an ngrep of the data
transferred from smsbox to bearerbox, it is smsbox which does the
trimming. For the record, the string is exactly 160 octets long when €
is omitted.
Apparently it uses the size of the GSM string to determine when to
split, but the trimming/splitting is done on the UTF-8 string.
Obviously it is sms_split, which is to blame, but why is this function
used at all if splitting is done in bearerbox (according to comments
in source) - this problem is probably not directly related to the
utf-8 patch.
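To illustrate the kind of mismatch described here, the sketch below
contrasts cutting a UTF-8 string at 160 octets with cutting it at 160
septets. It is not Kannel's sms_split; both function names are
hypothetical, and the input is assumed to contain only GSM-representable
characters.

```python
# Characters from the GSM extension table cost two septets each.
GSM_EXTENDED = set("^{}\\[~]|€")


def septet_cost(ch: str) -> int:
    """Septets needed for ch, assuming ch is representable in GSM."""
    return 2 if ch in GSM_EXTENDED else 1


def truncate_by_octets(text: str, limit: int = 160) -> str:
    """Naive cut: keep the first `limit` octets of the UTF-8 encoding."""
    # A multi-byte character cut in half is dropped by errors="ignore".
    return text.encode("utf-8")[:limit].decode("utf-8", errors="ignore")


def truncate_by_septets(text: str, limit: int = 160) -> str:
    """Cut on character boundaries, budgeting by GSM septets."""
    used = 0
    kept = []
    for ch in text:
        cost = septet_cost(ch)
        if used + cost > limit:
            break
        used += cost
        kept.append(ch)
    return "".join(kept)
```

A string of 100 "é" characters is 200 octets in UTF-8 but only 100
septets: truncate_by_octets keeps 80 characters, while
truncate_by_septets keeps all 100. A trailing € on a 163-octet string
would be lost the same way by the octet-based cut.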
hmm, strange... I will look in smsbox code if you don't beat me ;)
smsbox checks the maximum number of messages allowed by the config and
tries to split the message with sms_split. If there are more parts than
allowed, smsbox sends only the allowed number.
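The behaviour described above (split, then send at most the configured
number of parts) can be sketched like this. The function and parameter
names are hypothetical and the split is a plain fixed-length one; the
real sms_split does more than this.

```python
def split_and_cap(text: str, part_len: int, max_messages: int) -> list[str]:
    """Split text into fixed-length parts and keep at most max_messages."""
    parts = [text[i:i + part_len] for i in range(0, len(text), part_len)]
    # Parts beyond the configured maximum are silently dropped.
    return parts[:max_messages]
```

For instance, split_and_cap("abcdefghij", 4, 2) returns ["abcd", "efgh"],
and the trailing "ij" is never sent.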
Heh, beating you probably wouldn't solve anything :o)
Actually I would have looked at it myself, were it not that it
apparently splits the message just to combine the parts again, which
seemed kinda silly.
Med venlig hilsen / Best regards
Peter Christensen
Developer
------------------
Cool Systems ApS
Tel: +45 2888 1600
Mail: [EMAIL PROTECTED]
www: www.coolsystems.dk
Alexander Malysh wrote:
Hi all,
at http://www.kannel.org/~amalysh/kannel-utf8.patch there is a
not-so-huge patch that converts Kannel's internal charset to UTF-8.
Please note that I didn't add smsbox compatibility code, which means
smsbox expects the text body to be encoded in UTF-8 by default, and MOs
will also be forwarded in UTF-8. This can be worked around with the
charset CGI variable.
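As a rough sketch of that workaround, a client that still sends Latin-1
text could declare it through the charset variable when building the
request URL. The host, port, path, credentials and the other parameter
names below are assumptions for illustration only; just the charset
variable is taken from the note above.

```python
from urllib.parse import urlencode


def sendsms_url(base: str, to: str, text: str, charset: str) -> str:
    """Build a sendsms-style URL with the text encoded in `charset`."""
    query = {
        "username": "tester",  # hypothetical credentials
        "password": "secret",
        "to": to,
        "text": text,
        "charset": charset,    # tells smsbox how the text is encoded
    }
    # Percent-encode the values using the declared charset, not UTF-8.
    return base + "?" + urlencode(query, encoding=charset)
```

With charset set to "ISO-8859-1", the text "é" is sent as %E9 instead of
the UTF-8 sequence %C3%A9.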
Please test it and send feedback/patches.
I will maintain this patch for a while as long as we don't decide to
commit it to CVS.
--Thanks,
Alex