Folks,

I always believed that Ç was in the GSM 7-bit alphabet, but not ç (it
is stupid, but that's beside the point).

But I was recently pointed to this document:

http://www.unicode.org/Public/MAPPINGS/ETSI/GSM0338.TXT

excerpts:

#       This table contains the data the Unicode Consortium has on how
#       ETSI GSM 03.38 7-bit default alphabet characters map into Unicode.
#       This mapping is based on ETSI TS 100 900 V7.2.0 (1999-07), with
#       a correction of 0x09 to *small* c-cedilla, instead of *capital*
#       C-cedilla.
#
(...)
#
#       The ETSI GSM 03.38 specification shows an uppercase C-cedilla
#       glyph at 0x09. This may be the result of limited display
#       capabilities for handling characters with descenders. However, the
#       language coverage intent is clearly for the lowercase c-cedilla, as shown
#       in the mapping below. The mapping for uppercase C-cedilla is shown
#       in a commented line in the mapping table.

I believe this is relevant to Kannel because Kannel converts to and
from the GSM 7-bit alphabet, of course, for MO/MT transmissions. In
the Kannel implementation, seemingly relevant excerpts from
gateway-1.4.3/gwlib/latin1_to_gsm.h include:

/* 0xc7 */ 0x09, /* pc: NON PRINTABLE */    (Ç)

and

/* 0xe7 */ NRP, /* pc: NON PRINTABLE */     (ç)

What do you think? Should both of these chars map to 0x09 instead?
Have you ever seen a phone display ç for 0x09 in a GSM 7-bit
message? (I never have.)
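
To make the proposal concrete, here is a minimal sketch of what mapping
both characters to 0x09 would mean. It is not Kannel code: the function
names and the SKETCH_NO_MAPPING marker are hypothetical, and only the two
cedilla characters are handled. The reverse direction follows the Unicode
table quoted above (0x09 read as lowercase ç).

```c
#include <assert.h>

/* Hypothetical "no equivalent" marker for this sketch only.
   Kannel's header uses NRP; its value is not shown in this mail. */
#define SKETCH_NO_MAPPING (-1)

/* Latin-1 -> GSM 7-bit, restricted to the two cedilla characters.
   Both cases fold onto the single GSM code point 0x09. */
int latin1_cedilla_to_gsm(unsigned char latin1)
{
    switch (latin1) {
    case 0xc7: /* Ç, already mapped to 0x09 in the excerpt above */
    case 0xe7: /* ç, currently NRP in the excerpt above */
        return 0x09;
    default:
        return SKETCH_NO_MAPPING;
    }
}

/* GSM 7-bit -> Latin-1, following GSM0338.TXT: 0x09 is lowercase ç. */
int gsm_cedilla_to_latin1(unsigned char gsm)
{
    return gsm == 0x09 ? 0xe7 : SKETCH_NO_MAPPING;
}

/* The round trip is lossy by construction: Ç goes out as 0x09
   and comes back as ç. */
void cedilla_round_trip_demo(void)
{
    int gsm = latin1_cedilla_to_gsm(0xc7);
    assert(gsm == 0x09);
    assert(gsm_cedilla_to_latin1((unsigned char)gsm) == 0xe7);
}
```

Note the trade-off: with this mapping an MT message containing Ç is
still sent as 0x09, but what the handset shows depends on which glyph
it draws for that code point.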

-- 
Guillaume Cottenceau
