RE: Errata in language/script list: xUSSR languages
Kairat,

I found this link regarding a new Bashkir Tatar Latin alphabet.

http://rferl.org/bd/tb/tatar/TATAR/abs.html

Carl

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Kairat A. Rakhim
Sent: Friday, August 03, 2001 4:16 AM
To: James Kass
Cc: Unicode List
Subject: Re: Errata in language/script list: xUSSR languages

I've uploaded tables of Latin-based alphabets of Abazin, Awar, Adyge, Balkar, Bashkir, Buryat, Dargwa to http://www.pmicro.kz/~library/unicode/index.html

Regards,
Kairat
RE: Errata in language/script list: xUSSR languages
At 09:05 7/31/2001 -0500, Hohberger, Clive wrote:

Tundra Nenets, together with Forest Nenets, forms the Nenets group of languages, which belongs to the Samoyed branch of the Finno-Ugrian (Uralic) language family. Nenets was formerly known as Yurak or Yurak Samoyed, both now obsolete.

Last year, or perhaps in 1999, I was approached by a Finnish academic who was working with a poet/publisher in Russia who wanted a font for printing Forest Nenets. Nothing ever came of the project, but I did record that the orthography being used for Forest Nenets was slightly different from that used for Tundra Nenets. One letter in the Forest Nenets orthography -- EL with spike -- was not encoded in Unicode last I checked, but I was never able to get a clear understanding of whether this orthography was invented by the publisher, who was very keen to encourage the development of Forest Nenets as a literary language distinct from Tundra Nenets, or whether it had an established user community.

John Hudson

Tiro Typeworks www.tiro.com
Vancouver, BC [EMAIL PROTECTED]

There are sheep in the field. 'I know what they are,' she says, 'but I don't know what they are called.' Thus Wittgenstein is routed by my mother. (Alan Bennett, Diaries 1983)
Re: Errata in language/script list: xUSSR languages
James Kass wrote,

Kairat A. Rakhim wrote,

I have notes about the languages of the former USSR included in the list. In the 1930s almost all of them were written in the Latin script known as the 'Unified New Turkic Alphabet', or in its derivatives (the Common Northern Alphabet etc.). It should be emphasized that these Latin-based alphabets contain a number of characters which are not yet encoded in Unicode.

Is it possible to see some examples?

I shall upload examples as soon as possible, in a day or two. For now I have uploaded the Kazakh alphabet based on Arabic script to http://www.pmicro.kz/~library/unicode/kazakh.html. I haven't finished this work yet, so there is an image only, without any comments.

...What is 'Netets'?

If you're asking about 'Nenets'

Sorry. I meant two different entries in the list, 'Nenets' and the unknown 'Netets'.

Best regards,
Kairat A. Rakhim
[EMAIL PROTECTED]
Public Library of Karaganda, Kazakhstan
Re: Errata in language/script list: xUSSR languages
James Kass wrote:

Peter Constable thought maybe a couple and you illustrate no additional characters required. I'll split the difference and say one.

With the lower case... it's a couple, isn't it? I meant the upper/lower of what I think Marco proposed as 413+321, but I'm not sure these should be represented using 0321. Ken indicated that he thought it should not be represented that way and was, indeed, missing.

- Peter

---
Peter Constable
Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: [EMAIL PROTECTED]
Errata in language/script list: xUSSR languages
Hello,

I have notes about the languages of the former USSR included in the list.

In the 1930s almost all of them were written in the Latin script known as the 'Unified New Turkic Alphabet', or in its derivatives (the Common Northern Alphabet etc.). It should be emphasized that these Latin-based alphabets contain a number of characters which are not yet encoded in Unicode. Later these facts were thoroughly expurgated from publications. For example, you can't find full information about the Kazakh (Latin) alphabet even in the Kazakh Encyclopedia (14 volumes, printed in the 1970s).

I have got two rare books containing alphabet tables:

- M.I.Isaev. Yazykovoe stroitel'stvo v SSSR = Linguistic building in the USSR. - Moscow, 1979,
- R.S.Gilyarevsky, V.S.Grivnin. Opredelitel' yazykov mira po pis'mennostyam = Handbook for recognition of world languages by script. - 3rd edition. - Moscow, 1965.

The two books give different dates for the alphabet transitions, so I have skipped the dates (I hope to verify them and publish them later with the alphabet tables). But the chronological order is the same in both:

Language                     Scripts in chronological order
Abaza                        Latin, Cyrillic
Abkhaz                       Cyrillic, Latin, Georgian, Cyrillic
Adygei                       Arabic, Latin, Cyrillic
Altai                        Cyrillic, Latin, Cyrillic
Avar                         Arabic, Latin, Cyrillic
Azerbaijani (Azeri)          Arabic, Latin, Cyrillic, Latin
Balkar                       Arabic, Latin, Cyrillic
Bashkir                      Arabic, Latin, Cyrillic
Buryat                       Mongolian, Latin, Cyrillic
Chechen                      Arabic, Latin, Cyrillic
Chukchi                      Latin, Cyrillic
Crimean Tatar                Arabic, Latin, Cyrillic, Latin
Dargwa                       Arabic, Latin, Cyrillic
Dungan                       Arabic, Cyrillic
Evenki                       Latin, Cyrillic
Ingush                       Arabic, Latin, Cyrillic
Kabardian and Cherkessian    Cyrillic, Arabic, Latin, Cyrillic
Kalmyk                       Mongolian, Cyrillic, Latin, Cyrillic
Karachay                     Arabic, Latin, Cyrillic
Karakalpak                   Arabic, Latin, Cyrillic
Karelian                     Cyrillic
Kazakh                       Arabic, Latin, Cyrillic
Khakass                      Cyrillic, Latin, Cyrillic
Khanty                       Latin, Cyrillic
Kirghiz                      Arabic, Latin, Cyrillic
Komi                         Cyrillic, Latin, Cyrillic
Koryak                       Latin, Cyrillic
Kumyk                        Arabic, Latin, Cyrillic
Kurdish                      Arabic, Armenian, Latin, Cyrillic
Lak                          Arabic, Latin, Cyrillic
Lezghian                     Arabic, Latin, Cyrillic
Mansi                        Latin, Cyrillic
Nanai                        Latin, Cyrillic
Nenets                       Latin, Cyrillic
Nivkh                        Latin, Cyrillic
Nogai                        Arabic, Latin, Cyrillic
Ossetic                      Georgian, Cyrillic, Armenian, Latin, Cyrillic
Sami                         Cyrillic, Latin
Selkup                       Latin, Cyrillic
Shor                         Cyrillic, Latin, Cyrillic
Tabasaran                    Latin, Cyrillic
Tajik                        Arabic, Latin, Cyrillic
Tat                          Cyrillic
Tatar                        Arabic, Latin, Cyrillic, Latin
Turkmen                      Arabic, Latin, Cyrillic, Latin
Tuva                         Mongolian, Latin, Cyrillic
Udekhe                       Latin, Cyrillic
Uighur                       Uighur, Arabic, Latin, Cyrillic
Uzbek                        Arabic, Latin, Cyrillic, Latin
Yakut                        Cyrillic, Latin, Cyrillic

Cherkessian, Crimean Tatar, Kumyk and Nivkh are not yet represented in the list. Azerbaijani and Azeri are the same language. What is 'Netets'?

One note about the [3] abbreviation: Kazakh, Kirghiz and Uighur in China and Iran are written in Arabic script nowadays.

Best regards,
Kairat A. Rakhim
[EMAIL PROTECTED]
Public Library of Karaganda, Kazakhstan
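A list like the one above is easy to query once it is held as data. The following is only a sketch in Python, using a handful of rows copied from the table; the dictionary layout and the helper names `returned_to` and `transitions` are assumptions made for illustration, not part of any existing language/script list.

```python
# A few rows copied from the table above. The dict-of-lists layout is an
# assumption chosen for this sketch; each list is in chronological order.
SCRIPT_HISTORY = {
    "Azerbaijani": ["Arabic", "Latin", "Cyrillic", "Latin"],
    "Kazakh": ["Arabic", "Latin", "Cyrillic"],
    "Nenets": ["Latin", "Cyrillic"],
    "Tatar": ["Arabic", "Latin", "Cyrillic", "Latin"],
    "Uzbek": ["Arabic", "Latin", "Cyrillic", "Latin"],
    "Yakut": ["Cyrillic", "Latin", "Cyrillic"],
}

def returned_to(script, history):
    """Languages whose latest script is `script` and which also used it earlier."""
    return sorted(
        lang
        for lang, scripts in history.items()
        if scripts[-1] == script and script in scripts[:-1]
    )

def transitions(scripts):
    """Consecutive (old, new) script pairs for one language."""
    return list(zip(scripts, scripts[1:]))

print(returned_to("Latin", SCRIPT_HISTORY))
print(returned_to("Cyrillic", SCRIPT_HISTORY))
print(transitions(SCRIPT_HISTORY["Kazakh"]))
```

Keeping each history as an ordered list preserves the one thing both source books agree on, the order of the transitions, without committing to the dates on which they disagree.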
Re: Errata in language/script list: xUSSR languages
On Tue, Jul 31, 2001 at 17:58:57 +0700, Kairat A. Rakhim wrote:

Nenets Latin, Cyrillic

What is 'Netets'?

http://directory.google.com/Top/Regional/Europe/Russia/Society_and_Culture/Nationalities/Arctic_and_Siberian/Nenets/
http://directory.google.com/Top/Science/Social_Sciences/Language_and_Linguistics/Natural_Languages/Finno-Ugric_Languages/Nenets/

SY, Uwe
--
[EMAIL PROTECTED] | Zu Grunde kommen
http://www.ptc.spbu.ru/~uwe/ | Ist zu Grunde gehen
Re: Errata in language/script list: xUSSR languages
On 07/31/2001 05:58:57 AM Kairat A. Rakhim wrote:

Cherkessian, Crimean Tatar, Kumyk and Nivkh are not yet represented in the list.

It's my understanding that the Nivkh Cyrillic writing system requires a couple of characters that are not yet in Unicode. These same characters are also required for Yupik (Central Siberian Yupik, I think -- maybe other varieties as well).

- Peter

---
Peter Constable
Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: [EMAIL PROTECTED]
Re: Errata in language/script list: xUSSR languages
Kairat A. Rakhim wrote,

I have notes about the languages of the former USSR included in the list. In the 1930s almost all of them were written in the Latin script known as the 'Unified New Turkic Alphabet', or in its derivatives (the Common Northern Alphabet etc.). It should be emphasized that these Latin-based alphabets contain a number of characters which are not yet encoded in Unicode.

Is it possible to see some examples?

...What is 'Netets'?

If you're asking about 'Nenets', according to The Languages of the World (Katzner, 1975):

Nenets, formerly known as Yurak, is spoken in the northernmost part of the Soviet Union, in an area extending from the White Sea on the west to the Yenisei River on the east, a distance of about 1,500 miles. Its speakers, who are known as Nentsy, number about 25,000. Most of them live in the Yamal-Nenets National District, with its capital at Salekhard; about 3,500 live in the Nenets National District, whose capital is Naryan Mar. Nenets is the most widely spoken of the Samoyed languages, one of the two branches of the Uralic family.

Best regards,
James Kass.
RE: Errata in language/script list: xUSSR languages
Tundra Nenets, together with Forest Nenets, forms the Nenets group of languages, which belongs to the Samoyed branch of the Finno-Ugrian (Uralic) language family. Nenets was formerly known as Yurak or Yurak Samoyed, both now obsolete.

Clive

-----Original Message-----
From: Valeriy E. Ushakov [SMTP:[EMAIL PROTECTED]]
Sent: Tuesday, July 31, 2001 7:48 AM
To: [EMAIL PROTECTED]
Subject: Re: Errata in language/script list: xUSSR languages

On Tue, Jul 31, 2001 at 17:58:57 +0700, Kairat A. Rakhim wrote:

Nenets Latin, Cyrillic

What is 'Netets'?

http://directory.google.com/Top/Regional/Europe/Russia/Society_and_Culture/Nationalities/Arctic_and_Siberian/Nenets/
http://directory.google.com/Top/Science/Social_Sciences/Language_and_Linguistics/Natural_Languages/Finno-Ugric_Languages/Nenets/

SY, Uwe
--
[EMAIL PROTECTED] | Zu Grunde kommen
http://www.ptc.spbu.ru/~uwe/ | Ist zu Grunde gehen
Re: Errata in language/script list: xUSSR languages
Peter Constable wrote,

It's my understanding that the Nivkh Cyrillic writing system requires a couple of characters that are not yet in Unicode. These same characters are also required for Yupik (Central Siberian Yupik, I think -- maybe other varieties as well).

For a nice illustration of the Nivkh alphabet:
http://odur.let.rug.nl/~bergmann/russia/alphabets/nivkh.htm

Best regards,
James Kass.
RE: Errata in language/script list: xUSSR languages
James Kass wrote:

Peter Constable wrote,

It's my understanding that the Nivkh Cyrillic writing system requires a couple of characters that are not yet in Unicode. These same characters are also required for Yupik (Central Siberian Yupik, I think -- maybe other varieties as well).

For a nice illustration of the Nivkh alphabet:
http://odur.let.rug.nl/~bergmann/russia/alphabets/nivkh.htm

Seems to me that, using composing diacritics, all letters can be encoded:

410 411 412 413 492 413+321 414 415 401 416 417 418 419 41A 41A+31B 49A 49A+31B 41B 41C 41D 4C7 41E 41F 41F+31B 420 420+30C 421 422 422+31B 423 424 425 4B2 425+335 426 427 428 429 42A 42B 42C 42D 42E 42F

(I maintained the same layout as the beautiful chart on the above web site, and I removed the leading zero from codes to keep lines short.)

Notice that the combination 413+321 probably requires an ad-hoc glyph or a special kerning between the base letter and the diacritic.

_ Marco
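The notation in Marco's list (hex code points without the leading zero, with '+' joining a base letter and a combining mark) can be decoded mechanically. The following is only a sketch in Python using the standard `unicodedata` module; the helper names `decode` and `describe` are made up for illustration, and only a few of the entries are included.

```python
import unicodedata

# A few entries in the notation used in the message: hex code points with the
# leading zero removed, '+' between a base letter and a combining mark.
ENTRIES = ["410", "492", "413+321", "41A+31B", "49A", "4C7",
           "420+30C", "4B2", "425+335"]

def decode(entry):
    """Turn an entry such as '41A+31B' into the string U+041A U+031B."""
    return "".join(chr(int(part, 16)) for part in entry.split("+"))

def describe(entry):
    """Return the Unicode character name of each code point in the entry."""
    return [unicodedata.name(ch) for ch in decode(entry)]

for entry in ENTRIES:
    print(entry, decode(entry), describe(entry))
```

Printing the character names next to each sequence makes it easier to see which entries are single precomposed letters and which rely on a combining mark, which is the point the rest of the thread turns on.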
Re: Errata in language/script list: xUSSR languages
On 07/31/2001 05:58:57 AM Kairat A. Rakhim wrote:

Cherkessian, Crimean Tatar, Kumyk and Nivkh are not yet represented in the list.

Peter C responded:

It's my understanding that the Nivkh Cyrillic writing system requires a couple of characters that are not yet in Unicode.

Can someone propose them?

Rick
Nivkh (was: RE: Errata in language/script list: xUSSR languages)
Marco said:

James Kass wrote:

Peter Constable wrote,

It's my understanding that the Nivkh Cyrillic writing system requires a couple of characters that are not yet in Unicode. These same characters are also required for Yupik (Central Siberian Yupik, I think -- maybe other varieties as well).

For a nice illustration of the Nivkh alphabet:
http://odur.let.rug.nl/~bergmann/russia/alphabets/nivkh.htm

See: http://www.eki.ee/letter/chardata.cgi?lang=_nivkhscript=cyrillic for an analysis of the missing letters.

Seems to me that, using composing diacritics, all letters can be encoded:

410 411 412 413 492 413+321

This is probably a Cyrillic descender, and not a palatalization hook. So this character is missing (and is one of the ones Peter mentioned for Siberian Yupik).

414 415 401 416 417 418 419 41A 41A+31B 49A 49A+31B 41B 41C 41D 4C7 41E

I doubt that these involve 031B (combining horn). More likely 0315 or 02BC.

41F 41F+31B 420 420+30C 421 422 422+31B 423 424 425 4B2 425+335 426 427 428 429

The ha-bar should probably just be encoded as a separate character. It is shown as missing in the eki.ee database.

42A 42B 42C 42D 42E 42F

Rick said:

Can someone propose them?

Indeed. Nothing gets done without a proposal and someone to carry it forward.

--Ken
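The marks Ken weighs against each other differ in kind as well as in shape, and that can be checked from the character database. The following is a rough sketch in Python with the standard `unicodedata` module; the helper names `mark_info` and `has_precomposed` are hypothetical, and the results reflect the Unicode data bundled with the Python interpreter in use, not the state of the standard in 2001.

```python
import unicodedata

# Code points named in the discussion of the Nivkh letters.
CANDIDATES = [0x0321, 0x031B, 0x0315, 0x02BC, 0x0335]

def mark_info(cp):
    """Return (label, Unicode name, general category) for one code point."""
    ch = chr(cp)
    return (f"U+{cp:04X}", unicodedata.name(ch), unicodedata.category(ch))

def has_precomposed(sequence):
    """True if NFC normalization folds the sequence into fewer code points."""
    return len(unicodedata.normalize("NFC", sequence)) < len(sequence)

for cp in CANDIDATES:
    print(*mark_info(cp))

# Base letter + mark pairs from Marco's list, plus one control case:
# U+0415 U+0308 is known to compose to U+0401 (the letter IO).
for seq in ["\u0413\u0321", "\u041A\u031B", "\u0425\u0335", "\u0415\u0308"]:
    print([f"U+{ord(c):04X}" for c in seq], has_precomposed(seq))
```

The category column separates U+02BC, a spacing modifier letter (Lm), from the nonspacing combining marks (Mn). The normalization check only says whether a canonically equivalent single character exists for a sequence; it says nothing about whether a sequence is the right representation for the letter, which is the question a proposal would have to settle.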
Re: Errata in language/script list: xUSSR languages
Marco Cimarosti wrote,

For a nice illustration of the Nivkh alphabet:
http://odur.let.rug.nl/~bergmann/russia/alphabets/nivkh.htm

Seems to me that, using composing diacritics, all letters can be encoded:

410 411 412 413 492 413+321 414 415 401 416 417 418 419 41A 41A+31B 49A 49A+31B 41B 41C 41D 4C7 41E 41F 41F+31B 420 420+30C 421 422 422+31B 423 424 425 4B2 425+335 426 427 428 429 42A 42B 42C 42D 42E 42F

(I maintained the same layout as the beautiful chart on the above web site, and I removed the leading zero from codes to keep lines short.)

Tried a similar approach with Unipad...

А Б В Г Ғ Ҕ Д Е Ё Ж З И Й К Кʼ Ӄ Ӄʼ Л М Н Ӈ О П Пʼ Р Р̌ С Т Тʼ У Ф Х Х̡ Х̵ Ц Ч Ш Щ Ъ Ы Ь Э Ю Я

Peter Constable thought maybe a couple and you illustrate no additional characters required. I'll split the difference and say one.

The kh (Х̡) with hook should typographically match the (Ӄ). Since Unicode's Cyrillic block encodes both (Ӄ) and (Қ), it seems the modified (Х) should get the same treatment.

Best regards,
James Kass.