RE: Errata in language/script list: xUSSR languages

2001-08-03 Thread Carl W. Brown

Kairat,

I found this link regarding a new Bashkir and Tatar Latin alphabet.

http://rferl.org/bd/tb/tatar/TATAR/abs.html

Carl

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
 Behalf Of Kairat A. Rakhim
 Sent: Friday, August 03, 2001 4:16 AM
 To: James Kass
 Cc: Unicode List
 Subject: Re: Errata in language/script list: xUSSR languages
 
 
 I've uploaded tables of the Latin-based alphabets of Abazin, Awar, Adyge,
 Balkar, Bashkir, Buryat, and Dargwa to
 http://www.pmicro.kz/~library/unicode/index.html
 
 Regards,
 
 Kairat
 
 
 
 




RE: Errata in language/script list: xUSSR languages

2001-08-01 Thread John Hudson

At 09:05 7/31/2001 -0500, Hohberger, Clive wrote:

Tundra Nenets, together with Forest Nenets, forms the Nenets group of
languages, which belongs to the Samoyed branch of the Finno-Ugrian (Uralic)
language family. Nenets was formerly known as Yurak or Yurak Samoyed, both
now obsolete.

Last year, or perhaps 1999, I was approached by a Finnish academic who was 
working with a poet/publisher in Russia who wanted a font for printing 
Forest Nenets. Nothing ever came of the project, but I did record that the 
orthography being used for Forest Nenets was slightly different from that 
used for Tundra Nenets. One letter in the Forest Nenets orthography -- EL 
with spike -- was not encoded in Unicode last I checked, but I was never 
able to get a clear understanding of whether this orthography was invented 
by the publisher, who was very keen to encourage the development of Forest Nenets as a 
literary language distinct from Tundra Nenets, or if it had an established 
user community.

John Hudson

Tiro Typeworks  www.tiro.com
Vancouver, BC   [EMAIL PROTECTED]

There are sheep in the field. 'I know what they are,' she says,
'but I don't know what they are called.' Thus Wittgenstein is
routed by my mother.  (Alan Bennett, Diaries 1983)





Re: Errata in language/script list: xUSSR languages

2001-08-01 Thread Kairat A. Rakhim



James Kass wrote,

  Kairat A. Rakhim wrote,

  I have notes about languages of former USSR included in the
  list. In the 1930s almost all of them have been written in Latin
  script known as 'Unified New Turkic Alphabet', or in its
  derivatives (Common Northern Alphabet etc). It should be
  emphasized that these Latin-based alphabets contain a number
  of characters which are not yet encoded in Unicode.

 Is it possible to see some examples?

I shall upload examples as soon as possible, in a day or two. I have now
uploaded the Kazakh alphabet based on Arabic script to
http://www.pmicro.kz/~library/unicode/kazakh.html.
I haven't finished this work yet, so there is an image only, without any
comments.

  ...What is 'Netets'?

 If you're asking about 'Nenets'

Sorry. I meant two different entries in the list, 'Nenets' and the unknown
'Netets'.


Best regards,

Kairat A. Rakhim
[EMAIL PROTECTED]

Public Library of Karaganda,
Kazakhstan


Re: Errata in language/script list: xUSSR languages

2001-08-01 Thread Peter_Constable


Peter Constable thought maybe a couple and you illustrate
no additional characters required.

I'll split the difference and say one.

With the lower case... it's a couple, isn't it?

I meant the upper / lower of what I think Marco proposed as 413+321, but
I'm not sure these should be represented using 0321. Ken indicated that he
thought it should not be represented that way and was, indeed, missing.



- Peter


---
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: [EMAIL PROTECTED]

  




Errata in language/script list: xUSSR languages

2001-07-31 Thread Kairat A. Rakhim



Hello,

I have notes about the languages of the former USSR included in the list.

In the 1930s almost all of them were written in a Latin script known as the
'Unified New Turkic Alphabet', or in its derivatives (Common Northern
Alphabet etc). It should be emphasized that these Latin-based alphabets
contain a number of characters which are not yet encoded in Unicode.

Later these facts were clearly expurgated from publications. For example,
you can't find full information about the Kazakh (Latin) alphabet even in
the Kazakh Encyclopedia (14 volumes, printed in the 1970s). I have got two
rare books containing alphabet tables:

- M.I.Isaev. Yazykovoe stroitel'stvo v SSSR = Linguistic building in the
  USSR. - Moscow, 1979.
- R.S.Gilyarevsky, V.S.Grivnin. Opredelitel' yazykov mira po pis'mennostyam
  = Handbook for recognition of world languages by script. - 3rd edition. -
  Moscow, 1965.

The dates of the alphabet transitions differ between the two books, so I
have skipped them (I hope to verify them and publish them later with the
alphabet tables). But the chronological orders are the same:




  
  

Language                   Scripts in chronological order
-------------------------  ---------------------------------------------
Abaza                      Latin, Cyrillic
Abkhaz                     Cyrillic, Latin, Georgian, Cyrillic
Adygei                     Arabic, Latin, Cyrillic
Altai                      Cyrillic, Latin, Cyrillic
Avar                       Arabic, Latin, Cyrillic
Azerbaijani (Azeri)        Arabic, Latin, Cyrillic, Latin
Balkar                     Arabic, Latin, Cyrillic
Bashkir                    Arabic, Latin, Cyrillic
Buryat                     Mongolian, Latin, Cyrillic
Chechen                    Arabic, Latin, Cyrillic
Chukchi                    Latin, Cyrillic
Crimean Tatar              Arabic, Latin, Cyrillic, Latin
Dargwa                     Arabic, Latin, Cyrillic
Dungan                     Arabic, Cyrillic
Evenki                     Latin, Cyrillic
Ingush                     Arabic, Latin, Cyrillic
Kabardian and Cherkessian  Cyrillic, Arabic, Latin, Cyrillic
Kalmyk                     Mongolian, Cyrillic, Latin, Cyrillic
Karachay                   Arabic, Latin, Cyrillic
Karakalpak                 Arabic, Latin, Cyrillic
Karelian                   Cyrillic
Kazakh                     Arabic, Latin, Cyrillic
Khakass                    Cyrillic, Latin, Cyrillic
Khanty                     Latin, Cyrillic
Kirghiz                    Arabic, Latin, Cyrillic
Komi                       Cyrillic, Latin, Cyrillic
Koryak                     Latin, Cyrillic
Kumyk                      Arabic, Latin, Cyrillic
Kurdish                    Arabic, Armenian, Latin, Cyrillic
Lak                        Arabic, Latin, Cyrillic
Lezghian                   Arabic, Latin, Cyrillic
Mansi                      Latin, Cyrillic
Nanai                      Latin, Cyrillic
Nenets                     Latin, Cyrillic
Nivkh                      Latin, Cyrillic
Nogai                      Arabic, Latin, Cyrillic
Ossetic                    Georgian, Cyrillic, Armenian, Latin, Cyrillic
Sami                       Cyrillic, Latin
Selkup                     Latin, Cyrillic
Shor                       Cyrillic, Latin, Cyrillic
Tabasaran                  Latin, Cyrillic
Tajik                      Arabic, Latin, Cyrillic
Tat                        Cyrillic
Tatar                      Arabic, Latin, Cyrillic, Latin
Turkmen                    Arabic, Latin, Cyrillic, Latin
Tuva                       Mongolian, Latin, Cyrillic
Udekhe                     Latin, Cyrillic
Uighur                     Uighur, Arabic, Latin, Cyrillic
Uzbek                      Arabic, Latin, Cyrillic, Latin
Yakut                      Cyrillic, Latin, Cyrillic
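
For anyone who wants to work with this list programmatically, here is a
minimal sketch in Python. The names SCRIPT_HISTORY, latest_script and
returned_to are hypothetical and not part of any existing tool, and only a
few rows are copied in; the rest would follow the same pattern. Each
language maps to its scripts in chronological order, so the last element is
the script in use at the end of the sequence.

```python
# A few rows copied from the language/script table; illustrative subset only.
SCRIPT_HISTORY = {
    "Abkhaz": ["Cyrillic", "Latin", "Georgian", "Cyrillic"],
    "Azerbaijani (Azeri)": ["Arabic", "Latin", "Cyrillic", "Latin"],
    "Kazakh": ["Arabic", "Latin", "Cyrillic"],
    "Nivkh": ["Latin", "Cyrillic"],
    "Tatar": ["Arabic", "Latin", "Cyrillic", "Latin"],
    "Uzbek": ["Arabic", "Latin", "Cyrillic", "Latin"],
}


def latest_script(language):
    """Return the last script listed for a language."""
    return SCRIPT_HISTORY[language][-1]


def returned_to(script):
    """Languages whose last script was also used at an earlier stage."""
    return sorted(
        language
        for language, scripts in SCRIPT_HISTORY.items()
        if scripts[-1] == script and script in scripts[:-1]
    )


if __name__ == "__main__":
    print(latest_script("Kazakh"))
    print(returned_to("Latin"))
```

Keeping the order explicit in a list makes it easy to add transition dates
later, once the two books have been reconciled.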



Cherkessian, Crimean Tatar, Kumyk and Nivkh are not yet present in the list.
Azerbaijani and Azeri are the same language.

What is 'Netets'?

One note about the [3] abbreviation: Kazakh, Kirghiz and Uighur in China and
Iran are written in Arabic script nowadays.


Best regards,

Kairat A. Rakhim
[EMAIL PROTECTED]

Public Library of Karaganda,
Kazakhstan



Re: Errata in language/script list: xUSSR languages

2001-07-31 Thread Valeriy E. Ushakov

On Tue, Jul 31, 2001 at 17:58:57 +0700, Kairat A. Rakhim wrote:

   Nenets
  Latin, Cyrillic

 What is 'Netets'?

http://directory.google.com/Top/Regional/Europe/Russia/Society_and_Culture/Nationalities/Arctic_and_Siberian/Nenets/

http://directory.google.com/Top/Science/Social_Sciences/Language_and_Linguistics/Natural_Languages/Finno-Ugric_Languages/Nenets/

SY, Uwe
-- 
[EMAIL PROTECTED] |   Zu Grunde kommen
http://www.ptc.spbu.ru/~uwe/|   Ist zu Grunde gehen




Re: Errata in language/script list: xUSSR languages

2001-07-31 Thread Peter_Constable


On 07/31/2001 05:58:57 AM Kairat A. Rakhim wrote:

Cherkessian, Crimean Tatar, Kumyk, Nivkh are  not yet presented in the
list.

It's my understanding that the Nivkh Cyrillic writing system requires a
couple of characters that are not yet in Unicode. These same characters are
also required for Yupik (Central Siberian Yupik, I think -- maybe other
varieties as well).



- Peter


---
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: [EMAIL PROTECTED]

  




Re: Errata in language/script list: xUSSR languages

2001-07-31 Thread James Kass

Kairat A. Rakhim wrote,

 I have notes about languages of former USSR included in the 
 list.  In the 1930s almost all of them have been written in Latin 
 script known as 'Unified New Turkic Alphabet', or in its 
 derivatives (Common Northern Alphabet etc). It should be 
 emphasized that these Latin-based alphabets contain a number 
 of characters which are not yet encoded in Unicode.

Is it possible to see some examples?

 ...What is 'Netets'?

If you're asking about 'Nenets', according to The Languages of
the World (Katzner,1975):

Nenets, formerly known as Yurak, is spoken in the northernmost
part of the Soviet Union, in an area extending from the White Sea
on the west to the Yenisei River on the east, a distance of about
1,500 miles.  Its speakers, who are known as Nentsy, number about
25,000.  Most of them live in the Yamal-Nenets National District,
with its capital at Salekhard; about 3,500 live in the Nenets
National District, whose capital is Naryan Mar.  Nenets is the most
widely spoken of the Samoyed languages, one of the two branches
of the Uralic family.

Best regards,

James Kass.







RE: Errata in language/script list: xUSSR languages

2001-07-31 Thread Hohberger, Clive

Tundra Nenets, together with Forest Nenets, forms the Nenets group of
languages, which belongs to the Samoyed branch of the Finno-Ugrian (Uralic)
language family. Nenets was formerly known as Yurak or Yurak Samoyed, both
now obsolete. 

Clive



 -Original Message-
 From: Valeriy E. Ushakov [SMTP:[EMAIL PROTECTED]]
 Sent: Tuesday, July 31, 2001 7:48 AM
 To:   [EMAIL PROTECTED]
 Subject:  Re: Errata in language/script list: xUSSR languages
 
 On Tue, Jul 31, 2001 at 17:58:57 +0700, Kairat A. Rakhim wrote:
 
Nenets
   Latin, Cyrillic
 
  What is 'Netets'?
 
 http://directory.google.com/Top/Regional/Europe/Russia/Society_and_Culture/Nationalities/Arctic_and_Siberian/Nenets/
 
 http://directory.google.com/Top/Science/Social_Sciences/Language_and_Linguistics/Natural_Languages/Finno-Ugric_Languages/Nenets/
 
 SY, Uwe
 -- 
 [EMAIL PROTECTED] |   Zu Grunde kommen
 http://www.ptc.spbu.ru/~uwe/|   Ist zu Grunde gehen




Re: Errata in language/script list: xUSSR languages

2001-07-31 Thread James Kass


Peter Constable wrote,

 It's my understanding that the Nivkh Cyrillic writing 
 system requires a couple of characters that are not yet 
 in Unicode. These same characters are also required for 
 Yupik (Central Siberian Yupik, I think -- maybe other
 varieties as well).

For a nice illustration of the Nivkh alphabet:
http://odur.let.rug.nl/~bergmann/russia/alphabets/nivkh.htm

Best regards,

James Kass.






RE: Errata in language/script list: xUSSR languages

2001-07-31 Thread Marco Cimarosti

James Kass wrote:
 Peter Constable wrote,
 
  It's my understanding that the Nivkh Cyrillic writing 
  system requires a couple of characters that are not yet 
  in Unicode. These same characters are also required for 
  Yupik (Central Siberian Yupik, I think -- maybe other
  varieties as well).
 
 For a nice illustration of the Nivkh alphabet:
 http://odur.let.rug.nl/~bergmann/russia/alphabets/nivkh.htm

Seems to me that, using composing diacritics, all letters can be encoded:

410 411 412 413 492 413+321
414 415 401 416 417 418 419 41A
41A+31B 49A 49A+31B 41B 41C 41D 4C7 41E
41F 41F+31B 420 420+30C 421 422 422+31B 423
424 425 4B2 425+335 426 427 428 429
42A 42B 42C 42D 42E 42F

(I maintained the same layout as the beautiful chart on the above web site,
and I removed the leading zero from codes to keep lines short.)

Notice that the combination 413+321 probably requires an ad-hoc glyph or a
special kerning between the base letter and the diacritic.
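
To view the chart as actual characters, the shorthand above can be expanded
mechanically. The following is a rough sketch in Python, not anything used
in the original exchange: decode_cell and decode_chart are hypothetical
helper names, and the only assumption about the notation is the one stated
above (hex code points with the leading zero dropped, '+' joining a base
letter to a combining mark).

```python
import unicodedata

# The chart copied verbatim: hex code points without leading zeros,
# '+' joining a base letter and a combining diacritic.
CHART = """\
410 411 412 413 492 413+321
414 415 401 416 417 418 419 41A
41A+31B 49A 49A+31B 41B 41C 41D 4C7 41E
41F 41F+31B 420 420+30C 421 422 422+31B 423
424 425 4B2 425+335 426 427 428 429
42A 42B 42C 42D 42E 42F
"""


def decode_cell(cell):
    """Turn one cell such as '413+321' into the corresponding string."""
    return "".join(chr(int(code, 16)) for code in cell.split("+"))


def decode_chart(chart):
    """Return the chart as a list of rows, each a list of decoded strings."""
    return [[decode_cell(cell) for cell in line.split()]
            for line in chart.splitlines()]


if __name__ == "__main__":
    for row in decode_chart(CHART):
        print(" ".join(row))
    # Name the diacritics used, so the choice of marks is easy to review.
    for mark in ("\u0321", "\u031B", "\u030C", "\u0335"):
        print(f"U+{ord(mark):04X}", unicodedata.name(mark))
```

How the combining sequences display depends entirely on the font and
rendering engine, which is the same caveat as the kerning remark above.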

_ Marco




Re: Errata in language/script list: xUSSR languages

2001-07-31 Thread Rick McGowan


 On 07/31/2001 05:58:57 AM Kairat A. Rakhim wrote:

 Cherkessian, Crimean Tatar, Kumyk, Nivkh are  not yet presented in the
 list.

Peter C responded:

 It's my understanding that the Nivkh Cyrillic writing system requires a
 couple of characters that are not yet in Unicode.

Can someone propose them?

Rick




Nivkh ( was: RE: Errata in language/script list: xUSSR languages)

2001-07-31 Thread Kenneth Whistler

Marco said:

 James Kass wrote:
  Peter Constable wrote,
  
   It's my understanding that the Nivkh Cyrillic writing 
   system requires a couple of characters that are not yet 
   in Unicode. These same characters are also required for 
   Yupik (Central Siberian Yupik, I think -- maybe other
   varieties as well).
  
  For a nice illustration of the Nivkh alphabet:
  http://odur.let.rug.nl/~bergmann/russia/alphabets/nivkh.htm

See:

http://www.eki.ee/letter/chardata.cgi?lang=_nivkhscript=cyrillic

for an analysis of the missing letters.

 
 Seems to me that, using composing diacritics, all letters can be encoded:
 
 410 411 412 413 492 413+321

This is probably a Cyrillic descender, and not a palatalization hook.
So this character is missing (and is one of the ones Peter mentioned
for Siberian Yupik).

 414 415 401 416 417 418 419 41A
 41A+31B 49A 49A+31B 41B 41C 41D 4C7 41E

I doubt that these involve 031B (combining horn).  More likely
0315 or 02BC.

 41F 41F+31B 420 420+30C 421 422 422+31B 423
 424 425 4B2 425+335 426 427 428 429

The ha-bar should probably just be encoded as a separate character.
It is shown as missing in the eki.ee database.

 42A 42B 42C 42D 42E 42F
 

Rick said:

 Can someone propose them?

Indeed. Nothing gets done without a proposal and someone to carry
it forward.
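
For readers who do not have the code charts at hand, the marks discussed
above can be looked up with Python's standard unicodedata module. This is
only an illustrative sketch; describe is a hypothetical helper name, and
the list contains just the code points mentioned in this message.

```python
import unicodedata

# Code points mentioned above: the horn (031B), the two alternatives
# suggested for it (0315, 02BC), and the palatalized hook (0321).
CANDIDATES = [0x031B, 0x0315, 0x02BC, 0x0321]


def describe(code_point):
    """Return (U+XXXX label, character name, general category)."""
    char = chr(code_point)
    return (f"U+{code_point:04X}",
            unicodedata.name(char),
            unicodedata.category(char))


if __name__ == "__main__":
    for code_point in CANDIDATES:
        print(*describe(code_point))
```

One practical difference the output makes visible: 031B, 0315 and 0321 are
combining marks (category Mn), while 02BC is a spacing modifier letter
(category Lm), so it follows the base letter instead of attaching to it.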

--Ken




Re: Errata in language/script list: xUSSR languages

2001-07-31 Thread James Kass


Marco Cimarosti wrote,

  For a nice illustration of the Nivkh alphabet:
  http://odur.let.rug.nl/~bergmann/russia/alphabets/nivkh.htm

 Seems to me that, using composing diacritics, all letters can be encoded:

 410 411 412 413 492 413+321
 414 415 401 416 417 418 419 41A
 41A+31B 49A 49A+31B 41B 41C 41D 4C7 41E
 41F 41F+31B 420 420+30C 421 422 422+31B 423
 424 425 4B2 425+335 426 427 428 429
 42A 42B 42C 42D 42E 42F

 (I maintained the same layout as the beautiful chart on the above web site,
 and I removed the leading zero from codes to keep lines short.)


Tried a similar approach with Unipad...

   А  Б  В  Г  Ғ  Ҕ
Д  Е  Ё  Ж  З  И  Й  К
Кʼ Ӄ  Ӄʼ Л  М  Н  Ӈ  О
П  Пʼ Р  Р̌  С  Т  Тʼ У
Ф  Х  Х̡  Х̵  Ц  Ч  Ш  Щ
   Ъ  Ы  Ь  Э  Ю  Я

Peter Constable thought maybe a couple and you illustrate
no additional characters required.

I'll split the difference and say one.  The kh (Х̡) with hook
should typographically match the (Ӄ).  Since the Cyrillic
Unicode encodes both (Ӄ) and (Қ), seems like the modified
(Х) should get the same treatment.
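
A quick way to see the asymmetry is to ask whether a base-plus-diacritic
sequence has a canonically equivalent single character. The sketch below
(Python; has_precomposed_form is a hypothetical helper, and the sequence
U+0425 U+0321 is assumed to be what the chart above uses for the hooked kh)
relies on NFC normalization for that check.

```python
import unicodedata

KA_WITH_HOOK = "\u04C3"        # encoded as a single character
KA_WITH_DESCENDER = "\u049A"   # encoded as a single character
KH_PLUS_HOOK = "\u0425\u0321"  # assumed sequence: HA + palatalized hook below


def has_precomposed_form(sequence):
    """True if NFC collapses the sequence into one code point."""
    return len(unicodedata.normalize("NFC", sequence)) == 1


if __name__ == "__main__":
    for char in (KA_WITH_HOOK, KA_WITH_DESCENDER):
        print(f"U+{ord(char):04X}", unicodedata.name(char))
    print("kh + hook composes:", has_precomposed_form(KH_PLUS_HOOK))
    # For comparison, I + breve does compose (to SHORT I).
    print("i + breve composes:", has_precomposed_form("\u0418\u0306"))
```

This is a sanity check, not proof that a letter is missing: Cyrillic
letters with hooks and descenders such as U+04C3 have no canonical
decomposition, so NFC would never map a combining sequence onto them even
where an atomic character exists.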

Best regards,

James Kass.