RE: charset question (Greek)

Jörg Pommnitz Wed, 13 Mar 2002 05:55:04 -0800

> Is a Greek Unicode text now to be sent as Unicode or as GSM alphabet?


Common infrastructure: provide conversion from Unicode to GSM 7-bit
(including Greek). Drivers can override the common function to use
whatever their corresponding SMSC requires.
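
As a rough illustration of the common conversion, here is a minimal sketch
in C. The function names are made up for this example and are not existing
Kannel code. The table is deliberately partial: it covers only the Greek
capitals that have their own position in the GSM 03.38 default alphabet,
plus space, digits, unaccented Latin letters and three symbols. Greek
capitals that merely look like Latin letters (Alpha, Beta, ...) are not
handled here; a real table would have to decide what to do with them.

```c
#include <stddef.h>
#include <stdint.h>

/* Map one Unicode code point to a GSM 03.38 default-alphabet septet.
 * Returns -1 if the code point is not covered by this partial table. */
int unicode_to_gsm7(uint32_t cp)
{
    switch (cp) {
    case 0x0394: return 0x10; /* GREEK CAPITAL LETTER DELTA */
    case 0x03A6: return 0x12; /* PHI */
    case 0x0393: return 0x13; /* GAMMA */
    case 0x039B: return 0x14; /* LAMDA */
    case 0x03A9: return 0x15; /* OMEGA */
    case 0x03A0: return 0x16; /* PI */
    case 0x03A8: return 0x17; /* PSI */
    case 0x03A3: return 0x18; /* SIGMA */
    case 0x0398: return 0x19; /* THETA */
    case 0x039E: return 0x1A; /* XI */
    case '@':    return 0x00; /* these three differ from ASCII */
    case '$':    return 0x02;
    case '_':    return 0x11;
    default:     break;
    }
    /* Space, digits and unaccented Latin letters keep their ASCII value. */
    if (cp == ' ' || (cp >= '0' && cp <= '9') ||
        (cp >= 'A' && cp <= 'Z') || (cp >= 'a' && cp <= 'z'))
        return (int)cp;
    return -1;
}

/* Convert up to n code points into out (one septet per byte, unpacked).
 * Stops at the first unmappable code point and returns how many
 * code points were converted. */
size_t unicode_to_gsm7_buf(const uint32_t *in, size_t n, uint8_t *out)
{
    size_t i;

    for (i = 0; i < n; i++) {
        int septet = unicode_to_gsm7(in[i]);
        if (septet < 0)
            break;
        out[i] = (uint8_t)septet;
    }
    return i;
}
```

A driver whose SMSC wants something else (say, UCS-2 as is) would simply
not call this and install its own function instead.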

> is a text suitable for ISO8859-1? 

This is something we cannot decide. Where would you place the limit?
Is a single character that is out of range enough? Or 25%?
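
To make the question concrete: any automatic decision would boil down to
a threshold check like the sketch below. The function names and the
percentage parameter are assumptions for illustration only. The point is
that somebody has to pick max_percent, and there is no obviously right
value.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the code points that cannot be represented in ISO 8859-1,
 * i.e. everything above U+00FF. */
size_t count_non_latin1(const uint32_t *cps, size_t n)
{
    size_t i, outside = 0;

    for (i = 0; i < n; i++)
        if (cps[i] > 0xFF)
            outside++;
    return outside;
}

/* Return 1 if at most max_percent percent of the code points fall
 * outside ISO 8859-1, 0 otherwise. An empty text always fits. */
int fits_latin1(const uint32_t *cps, size_t n, unsigned max_percent)
{
    if (n == 0)
        return 1;
    return count_non_latin1(cps, n) * 100 <= (size_t)max_percent * n;
}
```

With max_percent set to 0 a single out-of-range character rejects the
text; with 25 a four-character text containing one Greek letter would
still pass.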

> how about characters which exist in multiple unicode character tables?

I don't understand this. Where is the problem? (I know that the ASCII
characters appear multiple times in Unicode as part of the ISO8859
encodings).

> how to do pattern matching?

We could use UTF-8 to encode Unicode and use the normal ANSI-C string 
operations.
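
A minimal sketch of what I mean, assuming both strings are valid,
NUL-terminated UTF-8 (utf8_find is a made-up name). Because a UTF-8
encoded character can never start in the middle of another one, a plain
byte-wise strstr() cannot produce a false match across character
boundaries.

```c
#include <string.h>

/* Find needle in haystack using the ordinary C library search.
 * Returns the byte offset of the first match, or -1 if there is none.
 * Both arguments are assumed to be valid UTF-8. */
long utf8_find(const char *haystack, const char *needle)
{
    const char *p = strstr(haystack, needle);

    return p != NULL ? (long)(p - haystack) : -1;
}
```

For example, searching "ab\xCE\xA9" "cd" (that is "ab", GREEK CAPITAL
LETTER OMEGA, "cd") for "\xCE\xA9" finds the match at byte offset 2. Note
that the offset is in bytes, not characters, and that this kind of
matching is exact: no case folding and no Unicode normalization.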

> ... and is an incoming SMS now binary or unicode?

This is a task for the SMSC driver. In the common case this is what the
DCS (data coding scheme) of the message tells you.
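
For drivers that get the raw DCS octet, the check could look roughly like
this. It is a sketch, not existing Kannel code: the enum and function
names are invented here, it only handles the general data coding group
and the 1111 "data coding/message class" group of GSM 03.38, and it
ignores the compression bit. Everything else is reported as unknown.

```c
enum sms_coding {
    SMS_CODING_7BIT,    /* GSM default alphabet */
    SMS_CODING_8BIT,    /* binary data */
    SMS_CODING_UCS2,    /* Unicode */
    SMS_CODING_UNKNOWN
};

/* Derive the message coding from a GSM 03.38 data coding scheme octet. */
enum sms_coding dcs_to_coding(unsigned char dcs)
{
    if ((dcs & 0xC0) == 0x00) {
        /* General data coding group: bits 3..2 select the alphabet. */
        switch ((dcs >> 2) & 0x03) {
        case 0:  return SMS_CODING_7BIT;
        case 1:  return SMS_CODING_8BIT;
        case 2:  return SMS_CODING_UCS2;
        default: return SMS_CODING_UNKNOWN; /* reserved value */
        }
    }
    if ((dcs & 0xF0) == 0xF0) {
        /* Data coding/message class group: bit 2 selects 8-bit data. */
        return (dcs & 0x04) ? SMS_CODING_8BIT : SMS_CODING_7BIT;
    }
    /* Message waiting indication groups etc. are not handled here. */
    return SMS_CODING_UNKNOWN;
}
```

So a DCS of 0x08 means Unicode (UCS-2), while 0x04 and 0xF5 mean binary.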

> what if we have an SMSC which supports ascii text but not binary or 
> unicode?

Don't use it for Unicode or binary messages. We cannot prevent stupidity
in Kannel.

> how do we decide at routing if input=unicode or input=binary
> but containing clean 7 bit text?

We know the original character set (e.g. from the content-type for POST
and the charset CGI parameter for GET).
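
For the POST case, pulling the character set out of the content-type
could be sketched as below. content_type_charset is a hypothetical
helper, not an existing function, and it takes shortcuts: it expects the
parameter name in lowercase ("charset=") and does no real header
parsing.

```c
#include <stddef.h>
#include <string.h>

/* Copy the charset parameter of a Content-Type header value into out.
 * Returns 1 on success, 0 if there is no charset parameter or the
 * value does not fit into outlen bytes (including the NUL). */
int content_type_charset(const char *ctype, char *out, size_t outlen)
{
    const char *p = strstr(ctype, "charset=");
    size_t n;

    if (p == NULL || outlen == 0)
        return 0;
    p += strlen("charset=");
    if (*p == '"')              /* value may be a quoted string */
        p++;
    n = strcspn(p, "\"; \t");   /* value ends at quote, ';' or blank */
    if (n == 0 || n >= outlen)
        return 0;
    memcpy(out, p, n);
    out[n] = '\0';
    return 1;
}
```

Given "text/plain; charset=UTF-8" this yields "UTF-8"; given plain
"text/plain" it returns 0 and the caller has to fall back to a default.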

What I envision is something like this:

struct smsc_msg_ops {
  int (*unicode_to_smsc)(Msg *);
  int (*split_message)(Msg *);
};

Then we could do:

...at initialization...
SMSCenter *smsc;
struct smsc_msg_ops ops;

ops = default_ops();
smsc = smscenter_construct(ops);

...at runtime...

smsc->msg_ops->unicode_to_smsc(msg);
smsc->msg_ops->split_message(msg);

smsc->delivermsg (msg);

This is a very rough draft, but you get the idea.
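
Fleshed out a little so that it compiles, the idea could look like the
sketch below. Everything except the smsc_msg_ops layout is an assumption
made for the example: Msg and SMSCenter are stand-in structs, not the
real Kannel types, and smscenter_init, smscenter_prepare and
ucs2_passthrough are invented names.

```c
#include <stddef.h>

/* Stand-in for the real message type; the fields only record what
 * the operations did. */
typedef struct Msg {
    int converted;  /* 0 = untouched, 1 = GSM 7-bit, 2 = left as UCS-2 */
    int parts;      /* number of parts after splitting */
} Msg;

struct smsc_msg_ops {
    int (*unicode_to_smsc)(Msg *);
    int (*split_message)(Msg *);
};

/* Stand-in for the real SMSCenter; the ops table must outlive it. */
typedef struct SMSCenter {
    const struct smsc_msg_ops *msg_ops;
} SMSCenter;

/* Common infrastructure: the default behaviour. */
static int default_unicode_to_smsc(Msg *msg) { msg->converted = 1; return 0; }
static int default_split_message(Msg *msg)   { msg->parts = 1; return 0; }

struct smsc_msg_ops default_ops(void)
{
    struct smsc_msg_ops ops = { default_unicode_to_smsc,
                                default_split_message };
    return ops;
}

/* Example driver override for an SMSC that accepts UCS-2 as is. */
int ucs2_passthrough(Msg *msg) { msg->converted = 2; return 0; }

void smscenter_init(SMSCenter *smsc, const struct smsc_msg_ops *ops)
{
    smsc->msg_ops = ops;
}

/* Run conversion and splitting through the driver's ops table.
 * Returns 0 on success, -1 if either step fails. */
int smscenter_prepare(SMSCenter *smsc, Msg *msg)
{
    if (smsc->msg_ops->unicode_to_smsc(msg) != 0)
        return -1;
    return smsc->msg_ops->split_message(msg) != 0 ? -1 : 0;
}
```

A driver starts from default_ops(), replaces only the entries it cares
about (for instance ops.unicode_to_smsc = ucs2_passthrough) and hands the
table to smscenter_init(). The common code never needs to know which
variant it is calling.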

Regards
  Jörg
