Printing and Displaying Dependent Vowels

2004-03-25 Thread Avarangal



We are in the process of updating Tamil keyboard 
drivers and one of the requirements by educational establishments is the ability 
to print and display dependent vowels without dotted circles.
 
Can anyone provide information on the sequences 
used for displaying and printing dependent vowels as standalone characters?
 
Srivas
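
One sequence often suggested for this is to place the dependent vowel sign on an invisible base. The Unicode Standard describes U+00A0 NO-BREAK SPACE as a base for showing a combining mark in isolation. The sketch below only builds such sequences; the helper name `standalone` is hypothetical, and whether the dotted circle actually disappears depends on the font and rendering engine, which is not verified here.

```python
import unicodedata

NBSP = "\u00A0"           # NO-BREAK SPACE, used as an invisible base
DOTTED_CIRCLE = "\u25CC"  # explicit placeholder base, if one is wanted


def standalone(sign: str, base: str = NBSP) -> str:
    """Return `base` followed by a dependent vowel sign (hypothetical helper)."""
    if not unicodedata.category(sign).startswith("M"):
        raise ValueError(f"{sign!r} is not a combining mark")
    return base + sign


# Tamil dependent vowel signs lie in U+0BBE..U+0BCC; unassigned gaps are skipped.
for cp in range(0x0BBE, 0x0BCD):
    ch = chr(cp)
    if unicodedata.category(ch).startswith("M"):
        print(f"U+{cp:04X} {unicodedata.name(ch)}: {standalone(ch)}")
```

Passing `DOTTED_CIRCLE` as the base gives the conventional chart form instead, which may be one of the formats a teaching document needs.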


printing dependent vowels

2004-03-22 Thread Avarangal



I'm looking for advice on how to print and display 
trailing vowels (dependent vowels) as standalone characters in three 
different formats, as described at
http://www.araichchi.com/kanini/misal/print-trailing.pdf
 
Srivas


tick, tick box, cross, cross box

2004-03-21 Thread Avarangal



We are in need of a tick, tick box, 
cross and cross box, preferably as symbols with code points.

Any advice on this is appreciated.
 
Srivas
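
Candidates for these symbols already exist in the Dingbats and Miscellaneous Symbols blocks. The short sketch below lists them with their character names as reported by Python's `unicodedata` module; the labels on the left are informal, and the choice of these five code points (rather than, say, the heavy variants U+2714 and U+2718) is an assumption about what is wanted.

```python
import unicodedata

# Informal label -> candidate character.
CANDIDATES = {
    "tick": "\u2713",
    "cross": "\u2717",
    "empty box": "\u2610",
    "tick box": "\u2611",
    "cross box": "\u2612",
}


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for label, ch in CANDIDATES.items():
    print(f"{label:10} {ch}  {describe(ch)}")
```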


Tamil Unicode in Win 95 and 98.

2003-03-10 Thread Avarangal




Tamil Unicode in Windows 95 and 98.
 
Now we can type, copy and paste within Windows 95 and 
98.  Go to:
http://www.jaffnalibrary.com/tools/Tsc.htm
Click the button marked Unicode. Click inside the top 
box of the two boxes and start typing. 

Copy the Tamil text in the lower box. Paste into any 
application such as Word, Outlook Express, etc. 

Away you go.
Thanks to Suratha
 
 
Sinnathurai Srivas
Avarangal


Re: A case for Tamil-X (k sh)

2003-01-08 Thread Avarangal
As Michka wrote, the matter (x, ksh) is being discussed elsewhere at
present.

Sorry, it was a typo: it should be ng in English (not en) and ng in penguin.

I can't get it out of my head: I keep saying rendering instead of complex
rendering. I'll try.

Sinnathurai


- Original Message -
From: "Doug Ewell" <[EMAIL PROTECTED]>
To: "Unicode Mailing List" <[EMAIL PROTECTED]>
Cc: "Avarangal" <[EMAIL PROTECTED]>
Sent: Wednesday, January 08, 2003 4:40 PM
Subject: Re: A case for Tamil-X (k sh)


> Sinnathurai Srivas  wrote:
>
> > ie, with rendering enabled one can not have ksh, but only x.
> > without rendering only ksh is possible and not x.
>
> "Without rendering," neither is possible.  As I tried to explain last
> July 22, the term "rendering" refers to the general process of mapping
> characters to glyphs.  The process you are talking about is "complex
> rendering."
>
> > An analogy is
> >
> > en in English is a single consonant (though written as en), but
> > en in penguin is two independent consonants.
>
> How can "en" in "English" be considered a single consonant?  It's
> pronounced [ɪŋ], a vowel (U+026A) followed by a consonant (U+014B).  The
> g is pronounced separately: [ˈɪŋɡlɪʃ].
>
> A better analogy would be:
>
> sh in hogwash is a single consonant (though written as sh), but
> sh in hogshead is two independent consonants.
>
> There may be merit in adding this new "x" character (or perhaps the
> problem could be solved with ZWNJ or ZWJ), but Michael is correct:
> although it's a good idea to discuss it on the list first, nothing will
> be considered for addition unless a proper proposal is written and
> submitted.
>
> -Doug Ewell
>  Fullerton, California
>
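
The ZWNJ suggestion in the quoted message can be sketched as two code point sequences. This assumes the usual Indic convention that U+200C ZERO WIDTH NON-JOINER after the virama asks the renderer not to form the conjunct; whether a particular font and shaping engine honour it for Tamil is not verified here, and the helper `code_points` is purely illustrative.

```python
KA = "\u0B95"      # TAMIL LETTER KA
SSA = "\u0BB7"     # TAMIL LETTER SSA
VIRAMA = "\u0BCD"  # TAMIL SIGN VIRAMA (pulli)
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER

# Usually shaped as the single "x" (ksh) ligature when complex rendering is on.
ligated = KA + VIRAMA + SSA

# Requests KA with a visible pulli followed by a separate SSA.
separate = KA + VIRAMA + ZWNJ + SSA


def code_points(s: str) -> str:
    """Show a string as a space-separated list of U+XXXX values."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


print(code_points(ligated))
print(code_points(separate))
```

If this works in practice, both spellings remain available in plain text without a new character, though the two strings compare as different and searching would need to account for that.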




A case for Tamil-X (k sh)

2003-01-06 Thread Avarangal



At INFITT WG02, a discussion is going on about encoding 
x as a character and not as a ligature.

x is not always ksh.

So a rendering problem occurs: when one is chosen, the 
other is not possible.

i.e., with rendering enabled one cannot have ksh, but 
only x.
Without rendering, only ksh is possible and not 
x.

Luxmi
riksha 
are typical examples of the different ways of writing and 
pronouncing.
i.e., the rendering system breaks.

An analogy is

en in English is a single consonant (though written as 
en), but 
en in penguin is two independent 
consonants.

When a ligaturing technique is used, and 
assuming the ng has a single symbol, representing both e+n and en is not 
possible.  

Tamil has only one of these unusual characters, x. 
Others are written as a+b or ab depending on where they occur in a word. But x has 
its own form and a k sh form independently.

Hence there is a need to add a new Unicode character X 
for Tamil.


Documenting in Tamil Computing

2002-12-15 Thread Avarangal



fwd:
fyi: Below is a copy of a mail I circulated on the 
subject of 
Documenting in Tamil Computing 
<<
From: "sisrivas <[EMAIL PROTECTED]>" <[EMAIL PROTECTED]>
Date: Sun Dec 15, 2002 11:24pm
Subject: Documenting in Tamil Computing
 
We need to be clear as to the direction that Tamil is going with regard 
to Tamil computing. I'm writing this again and again as there is some 
misunderstanding about what font encodings are doing to Tamil computing. (TSC is 
temporary. TAB is temporary. OldType (alias Bamini) is temporary.)

1/ If you are preparing a Tamil document intended for long-term use, you 
must use Unicode encoding. Any other approach you take can be considered a 
waste of time if your content is intended for long-term use. So do yourself 
and others a favour: prepare your documents using Tamil Unicode.

See item 7 at the URL http://www.gbizg.com/Tamilfonts/ekalappai.htm 
on how to get Unicode keyboard drivers.

Unfortunately Windows 95 and Windows 98 can only read Unicode pages. You 
can write in Unicode using Windows NT, 2000, XP and Linux.

So what can you do if you only have Windows 95, 98 or 3.1? Well, sorry, 
you need to use TSC or TAB or even OldType (alias Bamini) encoding. You can 
assume that the documents you make this way will not be usable in the near 
future.

Are you going to write a book, are you going to publish some research 
materials, etc.? Do yourself a favour: use Unicode and nothing else. 

DO NOT WASTE YOUR TIME. TIME IS PRECIOUS.

2/ Catch 22

You know we all use Tamil eMail, and for that we cannot use Unicode. For 
Tamil eMail we use an 8-bit encoding called TSCII. I'm sorry to say that you 
still need to use this 8-bit encoding (which is not Unicode), because 
Unicode is not mature enough to be used in multilingual email yet.

You just have to make do with the 8-bit TSCII encoding for Tamil 
eMail.

For more info: http://www.geocities.com/avarangal/

Sinnathurai Srivas
 


Converting Pages to Unicode

2002-07-13 Thread Avarangal



Forwarding a question on converting to 
Unicode.
 
<<<
. Now I have a 
Question:
 
 As we can see, each web-site has a different 
encoding and font. But I need to convert all encodings into Unicode and 
display the content in IE by just specifying the encoding, which is Unicode 
.. So for my research, I need to convert all the files in all 
these web-sites into a single Unicode encoding.

Additional questions: When I went through the 
web-sites I found something like "x-user-defined". Can this user-defined be 
Unicode and 8-bit?

One more silly doubt.. is UTF-8 the same as Unicode, and 
can everything (8-bit) be converted to UTF-8 (if it was easier)?

... then please suggest a method to convert all 
the different pages from different web-sites into a single encoding and read them using 
a single font.
 
 Thanks a lot, Kalairaja
 
fwd: End
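
On the forwarded questions: UTF-8 is one of the encoding forms of Unicode, so text that has been mapped to Unicode code points can always be written out as UTF-8. The general method is a per-encoding lookup table from legacy byte values to Unicode characters. The sketch below assumes a simple one-byte-to-one-character mapping; the function name and the table are hypothetical, and the byte values in the table are invented for illustration and are NOT real TSCII, TAB or Bamini values. Real font encodings may also need reordering of vowel signs, which is not handled here.

```python
from typing import Mapping


def legacy_to_unicode(data: bytes, table: Mapping[int, str]) -> str:
    """Map 8-bit legacy bytes to Unicode text using a lookup table."""
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))                  # ASCII range passes through
        else:
            out.append(table.get(b, "\uFFFD"))  # unknown byte -> REPLACEMENT CHARACTER
    return "".join(out)


# Invented byte values, for illustration only.
HYPOTHETICAL_TABLE = {
    0xAB: "\u0B95",  # TAMIL LETTER KA
    0xAC: "\u0B99",  # TAMIL LETTER NGA
}

text = legacy_to_unicode(b"\xab \xac", HYPOTHETICAL_TABLE)
utf8 = text.encode("utf-8")  # bytes suitable for a page declared as charset=utf-8
print(utf8)
```

With one such table per source encoding, every page ends up in the same encoding and can be read with any single Unicode Tamil font.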


Re: Hexadecimal characters.

2002-06-23 Thread Avarangal

One more question:
Is a font that replaces a-z and A-Z with language-dependent glyphs (floating)
for HEX glyphs legal and multilingual?


- Original Message -
From: "Alistair Vining" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Saturday, June 22, 2002 12:44 AM
Subject: RE: Hexadecimal characters.


> Roozbeh Pournader wrote:
> >
> > There is also another fact which may be interesting. My father, when he
> > was a high school student, had some advanced mathematic course, where they
> > also studied computation in base more than ten. They used Persian digits
> > for zero to nine, and lowercase greek letters alpha, beta, ... for digits
> > more than nine.
>
> So what we really need are combining number characters: one for the digit
> value and one for the base.
>
> But oh no! only integer bases up to (number of code points allotted).  How
> will we cope.
>
> Al.
>
>
>




Re: How to punch in Tamil Unicode characters?

2002-04-22 Thread Avarangal

Visit
http://www.geocities.com/avarangal/index.html

I'll be contacting you after Wednesday about the keyboard input program that I
have.



- Original Message -
From: "suresh ." <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, April 22, 2002 9:16 AM
Subject: How to punch in Tamil Unicode characters?


> Hello everybody,
> I wish to use Unicode Tamil characters to develop a
> website. Can somebody tell me how to punch in Unicode
> Tamil characters into my webpage? Is it sufficient
> for me to mention encoding=utf-16 as my charset in the
> webpage?
> Please let me know how to key in as well as to display
> Unicode Tamil characters.
> Thanks a lot
> Lenasuresh
>
>
>
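
For the question above, two common ways to get Tamil into a web page without a Tamil keyboard driver are numeric character references, which work whatever charset the page declares, and saving the page as UTF-8 with a matching charset declaration. The sketch below shows both; the helper name `to_ncr` and the sample markup are illustrative assumptions, not a recommendation from this thread.

```python
def to_ncr(text: str) -> str:
    """Replace every non-ASCII character with a decimal HTML character reference."""
    return "".join(c if ord(c) < 128 else f"&#{ord(c)};" for c in text)


word = "\u0BA4\u0BAE\u0BBF\u0BB4\u0BCD"  # the word "Tamil" in Tamil script

# Option 1: pure-ASCII markup using character references.
print(to_ncr(word))

# Option 2: real characters, with the file saved as UTF-8 and declared as such.
page = (
    '<html><head>'
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
    '</head><body>' + word + '</body></html>'
).encode("utf-8")
print(len(page), "bytes")
```

The first option is handy when the editor or mail path cannot be trusted to preserve UTF-8; the second keeps the source readable.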




Re: Inherent "a"

2002-04-11 Thread Avarangal

Dear Doug Ewell, William Overington, James E. Agenbroad, and Maurice
Bauhahn,

Thank you all for the replies.

May I assume U+0B85 as official?

Some explanations of the need for a visible "a".
In Tamil,

a/
Dependent "ai" and "au" have ligatures. In fact "au" and "ou" at present
utilise the same ligature.
(Additionally, the use of "ai" and "au" is expected to be introduced, not as
ligatures.)

b/
(Diphthongs?) such as ae, ao need to be linearly represented (without
ligatures).

d/
 The use of a visible "a" with consonants for educational purposes is a
necessity.

e/
 A design plan needs to be implemented, anticipating the possible use of a
visible "a" instead of the inherent "a" in the distant future.



Regards
Sinnathurai Srivas

>>
Avarangal

I'm not sure why you want a character for an "inherent a" which, in Indic
scripts, "exists" in any consonant unmarked by a vowel sign or virama -
perhaps you could describe your application. You could use 17B4 in the
Khmer block. Since Khmer is also an Indic script, this character essentially
has the properties you're looking for - though it's in a different block. -
Another idea might be to use 0B85 (TAMIL LETTER A) + 093C (NUKTA)
or maybe 0B85 + 0F9D (VIRAMA); - I don't know Tamil but I think these
combinations would not normally occur.

There was once a proposal to encode an "inherent a" or "root marker" at 0F70
in the Tibetan block as some people thought this was necessary as Tibetan
syllables often contain silent prefixes and suffixes - but the primary
collation is on the main letter in a syllable (which may be the second or
third character in the string). In Tibetan the first consonant (or consonant
stack) marked by a vowel sign is the root of a syllable - but where there is
no vowel sign (i.e. an "inherent a") there is no "flag" to indicate the root
consonant so some thought it would simplify processing to have one.
A problem with this is that there would be no visible glyph for such a
character and if the consonant marked by such an invisible character
was deleted the inherent a character might get left behind and
consequently flag an adjoining character where it might not be wanted.

Also, since such a character is not necessary to display Tibetan properly,
chances are you'd wind up with some people/applications making use of this
character and others not using it - so you'd get two different strings for
the same word.
In the case of Tibetan, the root consonant in a syllable or word can be
determined by rules or a lookup - and in the end it was thought better to
leave it to applications to determine unmarked root consonants when they
needed to rather than having an inherent a character to mark them (which in
any case would require a rule based system or lookup to insert reliably -
unless you leave it to users to type in). IMO in general use such a character
would probably cause more problems than it solved - though it might sometimes
be useful in private data.

- Chris

>>>
>
> While we're waiting for someone with better knowledge of Indic scripts
> to reply...
>
> 1.  An *inherent* A wouldn't have its own code point, would it?  I don't
> think of it as having an existence outside of the consonant it goes
> with.  Tamil KA is U+0B95, which represents K plus the inherent A.  If
> you wanted to represent only the K, you would use U+0B95 plus the Tamil
> virama, U+0BCD, to kill the A.  But how could you represent an inherent
> vowel by itself?
>
> 2.  Assuming you have an answer to #1 above, the only way "you" could
> allocate a Unicode code point for this character would be to use the
> Private Use Area.  You could choose any code point from U+E000 to U+F8FF
> for this purpose.  (There are unofficial assignments for some of these,
> but you are perfectly free to ignore them.)  Do *not* assign a code
> point in the Tamil block, or anywhere else except the Private Use Area,
> even if it's only for temporary and internal use.  To do so would be
> very non-conformant.
>
> -Doug Ewell
>  Fullerton, California
>
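
The two points in the quoted message can be sketched in a few lines: a bare consonant is the letter followed by the virama, and a privately agreed code point for research use should come from the Private Use Area U+E000..U+F8FF. The choice of U+E000 and the names `INHERENT_A` and `is_bmp_private_use` below are hypothetical; any PUA value agreed among the researchers would serve equally well.

```python
import unicodedata

KA = "\u0B95"        # TAMIL LETTER KA: K plus the inherent A
VIRAMA = "\u0BCD"    # TAMIL SIGN VIRAMA: removes the inherent A
LETTER_A = "\u0B85"  # TAMIL LETTER A: the independent vowel

bare_k = KA + VIRAMA  # K with no vowel


def is_bmp_private_use(cp: int) -> bool:
    """True if cp lies in the Private Use Area U+E000..U+F8FF."""
    return 0xE000 <= cp <= 0xF8FF


# Hypothetical private assignment for a visible "inherent a" marker.
INHERENT_A = "\uE000"

assert is_bmp_private_use(ord(INHERENT_A))
print(unicodedata.category(INHERENT_A))  # general category of a private-use character
print(unicodedata.name(LETTER_A))
```

Text using such a private assignment is only meaningful to software and fonts that share the same agreement, so it would need converting before public interchange.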
Monday, April 1, 2002
There is always 0B85 for this vowel when it is not "inhering" to a
consonant.

 Regards,
  Jim Agenbroad ( [EMAIL PROTECTED] )
 "It is not true that people stop pursuing their dreams because they
grow old, they grow old because they stop pursuing their dreams." Adapted
from a letter by Gabriel Garcia Marquez.
 The above are purely personal opinions, not necessarily the official
views of any government or any agency of any.
 Addresses: Office: Phone: 202 707-9612; Fax: 202 707-0955; US
mail: I.T.S. Sys.Dev.Gp.4, Library of Congress, 101 Independence Ave. SE,
Washington

Inherent "a"

2002-03-29 Thread Avarangal

I need to allocate a Unicode code point for inherent "a", to be used for Tamil
research. Can anyone suggest a temporary location, or is it possible to find
such a code point within the existing code points for Tamil?