On Tue, 2007-08-14 at 07:56 -0400, Dinbandhu wrote:
[...]
> One further question: in Baraha, there is a facility for interconverting
> text between languages. For example, a particular text which is written
> using Devanagari script can, with a single command, be converted into
> Bengali script. Would there be such a facility in SCIM as well?
[...]

If by conversion you mean simple transliteration, i.e., a character at
a certain position in the Devanagari Unicode block getting converted to
the character at the same position in the Unicode block for the other
script, this is possible through a Perl script that I wrote for
transliterating the keymaps. Thus, for example, the Devanagari letter
"ka" (U+0915, at offset 0x15, i.e., 21, counting from U+0900, the start
of the Devanagari block) would get transliterated to the Bengali letter
"ka" (U+0995, at the same offset from the start of the Bengali block at
U+0980). This works in a crude sense, but runs into obvious problems
when a character in one script has no equivalent in the other.
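
As a rough illustration of the offset arithmetic described above, here
is a minimal sketch in Python. It is not the code of the Perl script
itself; the function name shift_block and the constants are made up for
this example, and only the block start values come from the Unicode
charts.

```python
# Start of each 128-code-point Unicode block (from the Unicode charts).
DEVANAGARI_START = 0x0900
BENGALI_START = 0x0980
BLOCK_SIZE = 0x80


def shift_block(text, src_start=DEVANAGARI_START, dst_start=BENGALI_START):
    """Move each character of the source block to the same offset in the
    destination block; pass every other character through unchanged."""
    out = []
    for ch in text:
        offset = ord(ch) - src_start
        if 0 <= offset < BLOCK_SIZE:
            out.append(chr(dst_start + offset))
        else:
            out.append(ch)
    return "".join(out)


# Devanagari "ka" (U+0915) sits at offset 0x15, so it lands on
# Bengali "ka" (U+0995).
print(shift_block("\u0915") == "\u0995")
```

Note that this sketch maps every position blindly, which is exactly
where the problems with characters lacking an equivalent come from.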

You will need to check out the baraha-maps distribution via CVS from
code.indlinux.net. Install cvs with "sudo apt-get install cvs" if you
do not already have it, and then do the following:
1. Check out the code (in case your mail client wraps long lines, each
   command below should be typed in a terminal on a single line):
     cvs -d :pserver:[EMAIL PROTECTED]:/cvsroot/baraha-maps login
   When prompted for a password, just hit return.
     cvs -d :pserver:[EMAIL PROTECTED]:/cvsroot/baraha-maps checkout baraha-maps
   This will create a sub-directory baraha-maps, and start checking out
   the code, giving you some messages. The server occasionally has
   problems, so if you have trouble connecting, please retry a few
   times. Likewise, the connection might not be closed properly at the
   end, so if it downloads a bunch of files and then seems to hang,
   just hit Control-C to return to the prompt. The checkout should be
   complete if it has sat idle for, say, 5 minutes.
2. The code will include the Baraha keymap for Hindi, from which the
   keymaps for other languages will be generated. Compile these with
     cd baraha-maps
     make
     sudo make install
   You will need to have Perl installed, but it should be there by
   default. This will install the various xx-baraha.mim keymaps in
   /usr/share/m17n, some Perl modules in /usr/local/lib/site_perl, and
   a script called remap_lang in /usr/local/bin. Thus, you will need to
   have /usr/local/bin in your path, or call the script with the full
   pathname, i.e., /usr/local/bin/remap_lang. Should you wish to
   uninstall things, do
     sudo make uninstall
3. Here are some examples of using remap_lang:
     remap_lang -i Devanagari -o Bengali < infile > outfile
   transliterates Devanagari text in "infile" to Bengali text in
   "outfile". Non-Devanagari text in "infile" is passed through
   unchanged. Any Indian script in Unicode can be used as input or
   output. Try
     remap_lang -i help
   for a list of known scripts.

   Normally, a check is made that both the input character, and the
   output character are assigned in Unicode, and unassigned characters
   are silently dropped. You can force this check not to be done, and
   all characters to be transliterated with
     remap_lang -i Devanagari -o Bengali -c 0 < infile > outfile

   "remap_lang -h" gives a short usage message, and "remap_lang -m" a
   detailed manual.
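
To make the effect of the assignment check concrete, here is a small
Python sketch of the same idea. It is an assumption-laden illustration,
not the logic of remap_lang itself: the function names are hypothetical,
the "check" argument only loosely mirrors what "-c 0" turns off, and
"unassigned" is taken to mean Unicode general category Cn as reported by
Python's unicodedata module.

```python
import unicodedata


def is_assigned(ch):
    """True if the code point is assigned (general category is not Cn)."""
    return unicodedata.category(ch) != "Cn"


def shift_block_checked(text, src_start=0x0900, dst_start=0x0980, check=True):
    """Shift characters between two 128-code-point blocks. With check=True,
    drop a character when it or its counterpart is unassigned."""
    out = []
    for ch in text:
        offset = ord(ch) - src_start
        if not 0 <= offset < 0x80:
            out.append(ch)  # outside the source block: pass through
            continue
        mapped = chr(dst_start + offset)
        if check and not (is_assigned(ch) and is_assigned(mapped)):
            continue  # silently drop, as in the default behaviour
        out.append(mapped)
    return "".join(out)


# Devanagari U+0904 has no Bengali counterpart: U+0984 is unassigned.
print(repr(shift_block_checked("\u0904")))
print(repr(shift_block_checked("\u0904", check=False)))
```

With the check enabled the character disappears from the output; with
it disabled, the output contains a code point that most fonts will not
be able to display.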

I had not considered such transliteration an important issue, so
feedback on this script will be appreciated.

Regards,
Gora


-- 
ubuntu-in mailing list
ubuntu-in@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-in
