In Indic scripts, certain contexts require a vowel sign for the typography to make sense; you can’t use a vowel letter in its place. For example, the middle “ku” in my name has to be written as ક+ુ (which is rendered as કુ), even though it is phonetically equivalent to ક+્+ઉ. Also, the “halant” (્) is not a letter!
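To make the point concrete, here is a minimal sketch that lists which code points in the two Gujarati sequences above fall outside Unicode category L. The helper name nonLetters is made up for illustration; it relies only on the standard unicode package.

```go
package main

import (
	"fmt"
	"unicode"
)

// nonLetters (a hypothetical helper) returns the code points in s that are
// outside Unicode category L.
func nonLetters(s string) []rune {
	var out []rune
	for _, r := range s {
		if !unicode.IsLetter(r) {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	// First string: letter ka + vowel sign u.
	// Second string: letter ka + virama (halant) + letter u.
	for _, s := range []string{"\u0a95\u0ac1", "\u0a95\u0acd\u0a89"} {
		fmt.Printf("%+q: non-letters %U\n", s, nonLetters(s))
	}
}
```

Both spellings contain one code point that is a nonspacing mark rather than a letter (the vowel sign in the first, the halant in the second), so neither would pass a letters-only rule.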

I would strongly urge Nikhilesh and other people wanting to use any Indic script to *avoid* it (even if Go implements TR31 as in Swift) and instead use the lossless transliteration scheme of IAST if the program calls for an Indian word as a Go object name. https://en.wikipedia.org/wiki/International_Alphabet_of_Sanskrit_Transliteration


On Nov 6, 2022, at 4:02 AM, Rob Pike <r...@golang.org> wrote:



% unicode -d పే
U+0C2A 'ప' telugu letter pa
U+0C47 'ే' telugu vowel sign ee
% unicode -U C2A C47
U+0C2A 'ప' TELUGU LETTER PA
category: Lo
canonical combining classes: 0
bidirectional category: L
mirrored: N
U+0C47 'ే' TELUGU VOWEL SIGN EE
category: Mn
canonical combining classes: 0
bidirectional category: NSM
mirrored: N
%


The problem is the second code point, U+0C47, Telugu vowel sign EE. It is not in the letter class. If I change your program to use just the first code point, it works: https://play.golang.com/p/eNvuZH33s65
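The same check can be reproduced without the compiler. The sketch below approximates the identifier grammar from the Go spec (a letter or underscore, followed by letters or digits) using the standard unicode package. The function name isGoIdentifier is hypothetical, and it does not reject keywords.

```go
package main

import (
	"fmt"
	"unicode"
)

// isGoIdentifier (a hypothetical helper) reports whether s matches the
// identifier grammar in the Go spec: a letter (Unicode category L, or '_')
// followed by zero or more letters or decimal digits (category Nd).
// It does not check whether s is a reserved keyword.
func isGoIdentifier(s string) bool {
	if s == "" {
		return false
	}
	for i, r := range s {
		if unicode.IsLetter(r) || r == '_' {
			continue
		}
		// Digits are allowed anywhere except the first position.
		if i > 0 && unicode.IsDigit(r) {
			continue
		}
		return false
	}
	return true
}

func main() {
	// U+0C2A alone, then U+0C2A followed by the vowel sign U+0C47.
	for _, s := range []string{"\u0c2a", "\u0c2a\u0c47"} {
		fmt.Printf("%+q identifier=%v\n", s, isGoIdentifier(s))
		for _, r := range s {
			fmt.Printf("  %U letter=%v nonspacing mark=%v\n",
				r, unicode.IsLetter(r), unicode.Is(unicode.Mn, r))
		}
	}
}
```

The first string passes and the second fails, because U+0C47 is in category Mn, which the current rules do not admit.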


The rules for identifiers in Go were chosen because they are easy to implement, but they do have the problem that they do not treat all languages equally. They may expand one day, but at the moment this is the situation.


There are a number of open issues around this. Start with https://github.com/golang/go/issues/20706 if you want to read more.


-rob




On Sun, Nov 6, 2022 at 9:52 PM Konstantin Khomoutov <kos...@bswap.ru> wrote:
On Sun, Nov 06, 2022 at 01:45:53PM +0530, Nikhilesh Susarla wrote:

>> Per the Go spec[1], an identifier consists of a Unicode letter followed by
>> zero or more Unicode letters or digits. The character పే is in the Unicode
>> category nonspacing mark rather than the category letter.
[...]
> So, if the unicode letters are there in the nonspacing mark as you
> mentioned they can't be used right ?

I sense the source of your misunderstanding might be rooted in your lack of
certain basics about Unicode. You seem to call "a letter" anything which may
appear in a text document (a Go source code file is a text document) but this
is not true. Maybe that's just a terminological problem, but still the fact
is, the Unicode standard calls "letters" a very particular group of things
among those the Unicode standard describes. To give a very simplified example,
in the text string "foo bar" there are six letters (five distinct) and one
space character which is not a letter. The character being discussed is not a
letter in Unicode, either.
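
The "foo bar" count can be checked with a short sketch, assuming that "letter" means Unicode category L as reported by unicode.IsLetter. The function name countLetters is made up for illustration.

```go
package main

import (
	"fmt"
	"unicode"
)

// countLetters (a hypothetical helper) returns how many code points in s are
// in Unicode category L, and how many of those are distinct.
func countLetters(s string) (total, distinct int) {
	seen := make(map[rune]bool)
	for _, r := range s {
		if !unicode.IsLetter(r) {
			continue // spaces, marks, digits and so on are not letters
		}
		total++
		if !seen[r] {
			seen[r] = true
			distinct++
		}
	}
	return total, distinct
}

func main() {
	total, distinct := countLetters("foo bar")
	fmt.Println(total, distinct) // prints: 6 5
}
```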

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/20221106105154.xkoemtt6tx25flam%40carbon.
