You might take a look at what we have in ICU for doing transliteration. It
is rule-based, and each rule can take the context of the surrounding
letters into account.

For information, see
http://oss.software.ibm.com/icu/userguide/Transform.html
http://oss.software.ibm.com/icu/userguide/TransformRule.html
You can try out the rules with an interactive demo at
http://oss.software.ibm.com/cgi-bin/icu/tr
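
To give a rough idea of what "context of surrounding letters" means in a
rule-based scheme, here is a minimal sketch in Python. It is not ICU's API
or rule syntax; the names RULES and transliterate are made up for this
illustration, and the rules are a toy Latin-to-Greek mapping chosen only
because Greek final sigma is a familiar context-dependent case. A real
rule set for a given language would be far larger.

```python
import re

# Toy rule table (an assumption for illustration, not an ICU rule set).
# Each entry is (compiled pattern, replacement). A lookahead in the
# pattern expresses "only when followed by ..." without consuming text.
RULES = [
    (re.compile(r"s(?=\w)"), "\u03c3"),  # s before a letter -> sigma
    (re.compile(r"s"), "\u03c2"),        # any other s -> final sigma
    (re.compile(r"a"), "\u03b1"),        # a -> alpha
    (re.compile(r"o"), "\u03bf"),        # o -> omicron
    (re.compile(r"m"), "\u03bc"),        # m -> mu
]


def transliterate(text, rules=RULES):
    """Scan left to right; at each position apply the first matching rule."""
    out = []
    i = 0
    while i < len(text):
        for pattern, replacement in rules:
            m = pattern.match(text, i)
            if m and m.end() > i:
                out.append(replacement)
                i = m.end()
                break
        else:
            # No rule applies here: copy the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(transliterate("sos"))
```

Rule order matters in this sketch: the context-restricted rule for "s" is
listed before the unrestricted one, so the general rule only fires when the
more specific context does not match.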

Mark
----- Original Message ----- 
From: "Donald Z. Osborn" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Friday, July 02, 2004 21:52
Subject: Hausa: Boko<->Ajami? (RE: Looking for transcription or
transliteration standards latin- >arabic)


> I've read selected messages in this thread (on Unicode list) and some
> messages bring to mind the thought of developing routines or standards
> to permit toggling back and forth between standard Latin and Arabic
> transcriptions for the same language, such as between the Boko and Ajami
> writing of Hausa. (Same applies to any two or three transcription
> systems used for particular languages.)
>
> One of the benefits of ICT is, theoretically anyway, that one can have
> text both (all) ways. Which would mean that the user has options, people
> using alternative systems are not excluded, and the society does not
> have to debate a decision of which writing system to use, etc.
>
> Because there is generally not a 1-to-1 character correspondence in
> spellings in different transcriptions, I wonder if you don't end up
> having to consider something that operates a bit like machine
> translation, analyzing the context of words in cases where transcription
> of a word in one system could be transliterated into something
> misspelled or taken as more than one word in the other system.
> Necessarily, I think, such routines would have to be language-specific.
>
> Any feedback would be appreciated. TIA...
>
> Don Osborn
> Bisharat.net