Idris,

I know a bit of Perl and would love to help. However, I fear that sending us your stuff by mail will be difficult, because the UTF-8 characters get transformed into gibberish on the way. Could you send the hexadecimal codes of the characters you want to convert? Or I could simply give you the syntax and you'll know what to do. So here comes a Perl script that works for my Greek stuff; I don't see why it shouldn't work with Arabic:
==================================cut here
#!/usr/bin/perl -w
use strict;
use open ':utf8';

# open the file the result is written to
open(NEW, ">new.tex") or die "Cannot open new.tex: $!";

# read the file(s) named on the command line, one line at a time
while (<>) {
    # this is the actual conversion; fill in the hexadecimal code points
    # (no /e modifier here: it would evaluate the replacement as Perl code)
    s/\x{HEXADECIMAL_VALUE_OF_CHARACTER}/\x{HEXADECIMAL_VALUE_OF_NEW_CHARACTER}/sg;
    print NEW $_;    # and this writes the result into file "new.tex"
}
close(NEW);
==================================and here

Make the script executable and call it with the name of a file as an argument.

HTH
Thomas

On Sat, 2004-06-05 at 21:32, Idris Samawi Hamid wrote:
> Hi gang,
>
> For Arabic we use a Latin transcription in Aleph/(e-)Omega (or even
> ArabTeX) unless one of the encoding filters like utf-8 is used. Even for
> utf-8 files, however, it would be very useful to be able to convert a
> utf-8 file to Latin transcription for further processing by
> Aleph/(e-)Omega. For example, adding diacritics is much easier to do in
> Latin than in an Arabic script editor because Latin transcription is
> one-dimensional and adding diacritics to Arabic is a two-dimensional
> affair.
>
> The best thing would be a Perl script, but I don't know Perl at all
> (except to run some precreated scripts). If someone out of the kindness
> of their heart could write a short and simple script for just seven
> characters, I could do the rest myself and present it back here.
>
> Now all of the Arabic characters in utf-8 can be represented by extended
> ascii. I need something like this, that converts every extended ascii
> representation of Arabic utf-8 into a Latin transcription:
>
> Ø§ => A
>
> Ø¨ => b
>
> Ø¬ => j
>
> Ø¯ => d
>
> Ù‡ => h
>
> Ùˆ => w
>
> Ø² => z
>
> If someone could write a Perl script that can accomplish the above
> conversion, I can manually fill in the rest of the script. Basically I use
> a modified version of the ArabTeX transcription.
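For the seven letters requested above, a table-driven variant of the same substitution might look like the sketch below. It assumes the input has already been decoded as UTF-8, so that each Arabic letter is one character, and it assumes the letters meant are alef, beh, jeem, dal, heh, waw and zain (U+0627, U+0628, U+062C, U+062F, U+0647, U+0648, U+0632). The names %latin_for and transliterate are made up for illustration and are not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical table: Unicode character => Latin transcription.
# The code points are an assumption about which letters are meant.
my %latin_for = (
    "\x{0627}" => 'A',    # ARABIC LETTER ALEF
    "\x{0628}" => 'b',    # ARABIC LETTER BEH
    "\x{062C}" => 'j',    # ARABIC LETTER JEEM
    "\x{062F}" => 'd',    # ARABIC LETTER DAL
    "\x{0647}" => 'h',    # ARABIC LETTER HEH
    "\x{0648}" => 'w',    # ARABIC LETTER WAW
    "\x{0632}" => 'z',    # ARABIC LETTER ZAIN
);

# Build one pattern that matches any character listed in the table.
my $alternatives = join '|', map { quotemeta } sort keys %latin_for;
my $pattern      = qr/($alternatives)/;

# Replace every listed character with its transcription and
# leave everything else untouched.
sub transliterate {
    my ($text) = @_;
    $text =~ s/$pattern/$latin_for{$1}/g;
    return $text;
}

# Small in-memory demonstration (no file I/O needed).
my $sample = "\x{0627}\x{0628}\x{062C}\x{062F} \x{0647}\x{0648}\x{0632}";
print transliterate($sample), "\n";    # prints "Abjd hwz"
```

To run this over a file, transliterate($_) could be called inside the while (<>) loop of the script above in place of the single s/// line; further letters then only need a new entry in the table.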
>
> Here is a "gift" in return: a sample utf-8 Arabic file that can be
> processed by Aleph/(e-)Omega in ConTeXt (you will probably need to dvips
> this, though some dvi-viewers can do the postscript/16-bit thing):
>
> ==============================================
> \hoffset=0pt % for Omega bug: has this been fixed?
>
> \def\ArabicUTF{\ocp\UTFArUni=inutf8 %% in88596
> %\ocp\UTFArUni=in88596
> \ocp\UniCUni=uni2cuni
> \ocp\CUniArab=cuni2oar
> \ocplist\UTFArOCP=
> \addbeforeocplist 1 \UTFArUni
> \addbeforeocplist 1 \UniCUni
> \addbeforeocplist 1 \CUniArab
> \nullocplist
> \pushocplist\UTFArOCP}
>
> \input m-gamma.tex
> \input type-omg.tex
> \switchtobodyfont[omarb,12pt] %
>
> \textdir TRT%
> \pardir TRT%
> \ArabicUTF
>
> \starttext
>
> [Arabic sample text, unreadable in this copy]
>
> \blank[big]
>
> %[Arabic sample text, unreadable in this copy]
>
> [Arabic sample text, unreadable in this copy]
>
> \blank[big]
>
> [Arabic sample text, unreadable in this copy]
>
> \stoptext
>
> ==============================================
>
> Best
> Idris
_______________________________________________
ntg-context mailing list
[EMAIL PROTECTED]
http://www.ntg.nl/mailman/listinfo/ntg-context