Re: converting devanagari to mangal unicode

2002-12-17 Thread Peter_Constable

On 12/16/2002 05:09:04 PM Eric Muller wrote:

>Maybe Sunil is just asking for a conversion of data, presumably from
>ISCII to Unicode.

Or perhaps from one of a variety of non-standard Devanagari encodings.



- Peter


---
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485







RE: converting devanagari to mangal unicode

2002-12-17 Thread Marco Cimarosti
Bob Hallissy wrote:
> NB: One of the complexities you may run into, and which will limit your
> options, is that your encoding may store text in a different order than
> Unicode requires. If this is the case, TECkit can do the rearrangement for
> you but I'm not sure ICU will easily do that. Certainly the current
> standard for XML-based descriptions of encoding mappings as given in
> Unicode Technical Report 22 (see
> http://www.unicode.org/unicode/reports/tr22/ ) cannot express such
> mappings.

Someone pointed out to me recently that UTR#22 can indeed express Indic
visual-to-logical mappings, provided that one chooses the whole Indic
"syllable" as the mapping unit. E.g.:




Of course, this requires very big tables, which could be avoided by using a
smarter mechanism. Moreover, it only works with well-formed sequences in an
anticipated set of languages, and fails with misspellings or new
orthographies.
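
To make the idea concrete, here is a minimal sketch in Python (not UTR#22
XML) of a table whose mapping unit is the whole syllable. The byte values
0x6B, 0x69 and 0x61 are invented for illustration and do not correspond to
any real font encoding; the table and the function name are hypothetical too.

```python
# Hypothetical legacy byte values; real font encodings differ.
KA, I_MATRA, AA_MATRA = 0x6B, 0x69, 0x61

# Each key is a whole syllable in visual (glyph) order; each value is the
# same syllable in Unicode logical order.
SYLLABLE_TABLE = {
    bytes([KA]): "\u0915",                  # ka
    bytes([KA, AA_MATRA]): "\u0915\u093E",  # kaa: same order in both
    bytes([I_MATRA, KA]): "\u0915\u093F",   # ki: the matra is stored first
}

MAX_KEY = max(len(key) for key in SYLLABLE_TABLE)


def convert(data: bytes) -> str:
    """Convert legacy bytes to Unicode, longest table entry first."""
    out = []
    pos = 0
    while pos < len(data):
        for length in range(min(MAX_KEY, len(data) - pos), 0, -1):
            chunk = data[pos:pos + length]
            if chunk in SYLLABLE_TABLE:
                out.append(SYLLABLE_TABLE[chunk])
                pos += length
                break
        else:
            # Sequences the table did not anticipate cannot be converted.
            raise ValueError(f"no table entry at offset {pos}")
    return "".join(out)
```

Every syllable needs its own entry, which is why the tables get big, and
any sequence missing from the table raises an error instead of converting.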

_ Marco




Re: converting devanagari to mangal unicode

2002-12-17 Thread Bob_Hallissy

On 16/12/2002 22:02:36 "Magda Danish (Unicode)" wrote:

>> I have a data in devanagri true type font i want to convert
>> this data into mangal unicode.

Sunil,

For Windows or Mac use: If you want to convert data from one encoding to
Unicode, one option is to look at the free TECkit package.  There are many
non-Unicode encodings of Devanagari, so I'm unable to guess how your data
is currently encoded. TECkit is table-driven, i.e., you find or prepare a
description of the mapping between your encoding and Unicode, and then
TECkit uses that description to convert data. You may even be able to find
a mapping description already prepared, as TECkit can use the XML mapping
definitions from ICU (see
http://oss.software.ibm.com/cvs/icu/charset/data/xml/). For more
information about TECkit or to download it, see
http://www.sil.org/nrsi/teckit/

Depending on the characteristics of your encoding and your desire to do a
bit of programming, you may also be able to incorporate the ICU
(International Components for Unicode) library into your own program to do
the conversion you need. See
http://oss.software.ibm.com/developerworks/opensource/icu/project/ for more
information.

NB: One of the complexities you may run into, and which will limit your
options, is that your encoding may store text in a different order than
Unicode requires. If this is the case, TECkit can do the rearrangement for
you but I'm not sure ICU will easily do that. Certainly the current
standard for XML-based descriptions of encoding mappings as given in
Unicode Technical Report 22 (see
http://www.unicode.org/unicode/reports/tr22/ ) cannot express such
mappings.
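
As a rough illustration of the rearrangement involved, the Python sketch
below moves the vowel sign I (U+093F), which is drawn to the left of its
consonant, from before a consonant cluster to after it. It assumes the
legacy bytes have already been mapped one-to-one to Unicode characters but
are still in visual order. It is not TECkit or ICU syntax, the function
names are hypothetical, and a real encoding would need more rules (reph,
half-forms and so on).

```python
VOWEL_SIGN_I = "\u093F"  # drawn left of the consonant, stored after it
VIRAMA = "\u094D"
NUKTA = "\u093C"


def is_consonant(ch: str) -> bool:
    # Devanagari consonants KA..HA occupy U+0915..U+0939.
    return "\u0915" <= ch <= "\u0939"


def visual_to_logical(text: str) -> str:
    """Move a leading vowel sign I to follow its consonant cluster."""
    out = []
    i = 0
    n = len(text)
    while i < n:
        if text[i] == VOWEL_SIGN_I and i + 1 < n and is_consonant(text[i + 1]):
            j = i + 1
            # Consume consonant (+ nukta), then any virama + consonant links.
            while j < n and is_consonant(text[j]):
                j += 1
                if j < n and text[j] == NUKTA:
                    j += 1
                if j + 1 < n and text[j] == VIRAMA and is_consonant(text[j + 1]):
                    j += 1
                else:
                    break
            out.append(text[i + 1:j])  # the cluster first
            out.append(VOWEL_SIGN_I)   # then the vowel sign
            i = j
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

Unlike a per-syllable table, one rule of this kind covers every consonant
cluster, including clusters nobody listed in advance.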

Bob








RE: converting devanagari to mangal unicode

2002-12-17 Thread Marco Cimarosti
John Hudson wrote:
> At 03:09 PM 12/16/2002, Eric Muller wrote:
>
> >>In order to convert any Devanagari font to be rendered in the same way,
> >
> >Maybe Sunil is just asking for a conversion of data, presumably from
> >ISCII to Unicode.
>
> Ah, yes, this is possible. I'm so used to people asking the other question
> that I assumed from the slightly mixed up references in the question that
> this was what Sunil intended.

OK, this is my interpretation of Sunil's question: He has text data encoded
in a so-called "font encoding" (e.g. "Shusha"), and he needs to convert it
to Unicode.

The Linux Technology Development for Indian Languages
(http://www.cse.iitk.ac.in/users/isciig/) has two ongoing projects for
similar conversions:

- iconverter
(http://www.cse.iitk.ac.in/users/isciig/iconverter/main.html)
- ISCIIlib
(http://www.cse.iitk.ac.in/users/isciig/isciilib/main.html)

_ Marco




Re: converting devanagari to mangal unicode

2002-12-16 Thread John Hudson
At 03:09 PM 12/16/2002, Eric Muller wrote:

>>In order to convert any Devanagari font to be rendered in the same way,
>
>Maybe Sunil is just asking for a conversion of data, presumably from
>ISCII to Unicode.

Ah, yes, this is possible. I'm so used to people asking the other question 
that I assumed from the slightly mixed up references in the question that 
this was what Sunil intended.

John Hudson

Tiro Typeworks		www.tiro.com
Vancouver, BC		[EMAIL PROTECTED]

A book is a visitor whose visits may be rare,
or frequent, or so continual that it haunts you
like your shadow and becomes a part of you.
   - al-Jahiz, The Book of Animals




Re: converting devanagari to mangal unicode

2002-12-16 Thread Eric Muller
>In order to convert any Devanagari font to be rendered in the same way,

Maybe Sunil is just asking for a conversion of data, presumably from
ISCII to Unicode.

Eric.





Re: converting devanagari to mangal unicode

2002-12-16 Thread John Hudson


> I am Gis/Website developer my query is
>
> I have a data in devanagri true type font i want to convert
> this data into mangal unicode.
>
> I want to know whether any converter is available for
> converting devanagari to mangal unicode.


This is, excuse the pun, a bit of a mangled question. Mangal is Microsoft's 
Hindi UI font; it is an OpenType font that uses glyph substitution and 
positioning to correctly display the Devanagari script on top of a standard 
Unicode text string. In order to convert any Devanagari font to be rendered 
in the same way, two steps are necessary:

1. Make sure that the font has a Unicode cmap table and that the base forms 
of Devanagari characters are encoded in it in accordance with the Unicode 
standard.

2. Use Microsoft's free VOLT tool to add OpenType Layout tables for glyph 
substitution and positioning.

There is no automated way to do such a conversion, although various 
sub-stages could be automated within particular tools (e.g. defining 
Unicode cmap mappings from glyph names in FontLab). The nature of the 
OpenType Layout lookups required will depend on the glyph repertoire of the 
individual font.
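
One sub-stage that is easy to script is checking step 1, i.e. which
Devanagari code points a font's Unicode cmap does not yet cover. The Python
sketch below assumes the cmap is available as a plain dictionary from code
point to glyph name; with the fontTools library such a dictionary can be
obtained from TTFont(path).getBestCmap(), but the function name and the
sample data here are hypothetical.

```python
import unicodedata
from typing import Dict, List

# The Devanagari block is U+0900..U+097F.
DEVANAGARI_BLOCK = range(0x0900, 0x0980)


def missing_devanagari(cmap: Dict[int, str]) -> List[int]:
    """Return assigned Devanagari code points that the cmap lacks."""
    missing = []
    for cp in DEVANAGARI_BLOCK:
        # Skip code points unassigned in this Python's Unicode database.
        if unicodedata.name(chr(cp), "") and cp not in cmap:
            missing.append(cp)
    return missing


# Hypothetical cmap covering a single character, for illustration only.
sample_cmap = {0x0915: "ka"}
for cp in missing_devanagari(sample_cmap)[:3]:
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}")
```

This only reports gaps in the cmap; it says nothing about the OpenType
Layout tables needed in step 2.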

See http://www.microsoft.com/typography/specs/default.htm for more 
information about making OpenType fonts for Devanagari and other scripts.

John Hudson

Tiro Typeworks		www.tiro.com
Vancouver, BC		[EMAIL PROTECTED]