You might take a look at what we have in ICU for doing transliteration. It is rule-based, where each of the rules can take the context of surrounding letters into account.
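
To make "rules that take surrounding letters into account" concrete, here is a minimal sketch in Python. It is not ICU's rule syntax or its engine; the `Rule` type, the `transliterate` function and the sample rules are all hypothetical and describe no real orthography. Each rule has an optional "before" context, a source string, an optional "after" context, and a target. For simplicity the contexts are checked against the original input text, and the first rule that matches at a position wins.

```python
import re
from typing import List, NamedTuple


class Rule(NamedTuple):
    before: str  # regex that must match just before the source ("" = no constraint)
    source: str  # literal text to be replaced
    after: str   # regex that must match just after the source ("" = no constraint)
    target: str  # replacement text


def transliterate(text: str, rules: List[Rule]) -> str:
    """Apply the first matching rule at each position, left to right."""
    out = []
    i = 0
    while i < len(text):
        for rule in rules:
            if not text.startswith(rule.source, i):
                continue
            end = i + len(rule.source)
            # The "before" context must match at the end of the preceding text.
            if rule.before and not re.search(r"(?:%s)\Z" % rule.before, text[:i]):
                continue
            # The "after" context must match at the start of the following text.
            if rule.after and not re.match(rule.after, text[end:]):
                continue
            out.append(rule.target)
            i = end
            break
        else:
            # No rule applies here: copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


# Toy rules, invented for illustration only.
TOY_RULES = [
    Rule("", "sh", "", "š"),               # digraph first, so "s" rules never split it
    Rule("[aeiou]", "s", "[aeiou]", "z"),  # "s" between vowels
    Rule("", "c", "[ei]", "s"),            # "c" before a front vowel
    Rule("", "c", "", "k"),                # "c" everywhere else
]

print(transliterate("cash cell rosa", TOY_RULES))
```

Rule order matters in this sketch: the more specific rule has to come before the general one, which is why the "c before e or i" rule precedes the plain "c" rule.
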

For information, see
http://oss.software.ibm.com/icu/userguide/Transform.html
http://oss.software.ibm.com/icu/userguide/TransformRule.html

You can try out the rules with an interactive demo at
http://oss.software.ibm.com/cgi-bin/icu/tr

Mark

----- Original Message -----
From: "Donald Z. Osborn" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Friday, July 02, 2004 21:52
Subject: Hausa: Boko<->Ajami? (RE: Looking for transcription or transliteration standards latin->arabic)

> I've read selected messages in this thread (on Unicode list) and some messages
> bring to mind the thought of developing routines or standards to permit
> toggling back and forth between standard Latin and Arabic transcriptions for
> the same language, such as between the Boko and Ajami writing of Hausa. (Same
> applies to any two or three transcription systems used for particular
> languages.)
>
> One of the benefits of ICT is, theoretically anyway, that one can have text both
> (all) ways. Which would mean that the user has options, people using
> alternative systems are not excluded, and the society does not have to debate a
> decision of which writing system to use, etc.
>
> Because there is generally not a 1-to-1 character correspondence in spellings in
> different transcriptions, I wonder if you don't end up having to consider
> something that operates a bit like machine translation, analyzing the context
> of words in cases where transcription of a word in one system could be
> transliterated into something misspelled or taken as more than one word in the
> other system. Necessarily, I think, such routines would have to be
> language-specific.
>
> Any feedback would be appreciated. TIA...
>
> Don Osborn
> Bisharat.net