Forming Coptic Numbers in Unicode
Greetings, To compose Coptic numerals under Unicode I've combined the appropriate lowercase letters in the Greek-Coptic range with elements from the Combining Diacritical Marks block: U+0304, U+0331 and U+0347. I had no basis for choosing these diacritical symbols other than that they seemed to get the job done visually. What are the "approved" symbols for composing Coptic numbers portably? Next question: what does one do for character codes to show values over a billion? If there is no officially sanctioned solution, it occurred to me that the diacritic symbols could simply accumulate following the lowercase char for interchange and then be presented graphically. Some recommendation added to "The Unicode Standard" reference would be a good service here; sorry if I've missed it. thanx, /Daniel
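To make the "accumulate following the lowercase char" idea concrete, here is a minimal Python sketch. It only builds and inspects the code point sequences; the helper names (`compose`, `accumulate`, `describe`) are made up for illustration, and no numeric value is assigned to any mark, since the message does not establish which mark means what.

```python
import unicodedata

# The three marks named in the message. What each one signifies numerically
# is NOT specified there, so no values are assigned here.
MACRON = "\u0304"
MACRON_BELOW = "\u0331"
EQUALS_BELOW = "\u0347"


def compose(letter: str, *marks: str) -> str:
    """Return the base letter followed by its combining marks, in order."""
    for mark in marks:
        if unicodedata.combining(mark) == 0:
            raise ValueError(f"U+{ord(mark):04X} is not a combining mark")
    return letter + "".join(marks)


def accumulate(letter: str, mark: str, count: int) -> str:
    """Hypothetical scheme: repeat one mark `count` times after the letter."""
    return compose(letter, *([mark] * count))


def describe(sequence: str) -> list[str]:
    """List each code point in the sequence with its Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in sequence]


if __name__ == "__main__":
    alpha = "\u03b1"  # GREEK SMALL LETTER ALPHA
    for line in describe(accumulate(alpha, MACRON, 2)):
        print(line)
```

Whether a renderer stacks repeated marks legibly is a font question that this sketch does not address.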
Re: Chromatic font research
> In the handwritten form, could you please say whether the adding of the red
> increases the width of the area needed to represent the character?

Yes, absolutely, at least by the width of two dots.

> Also, when handwritten, does the scribe have a black pen in one hand and a
> red pen in the other so that colouring takes place on a character by
> character basis as writing proceeds, or does the scribe put down one pen and
> pick up another, and, if so, is that on a character by character basis or is
> that on the basis of producing a number of characters in black and then
> adding the red afterwards. This would seem to be possibly significant due
> to the possible need to allow for the greater width of the area used for a
> character that is later to receive red flourishes.

My oh my, these are wonderfully interesting questions :) I would think the use of tools would be highly sensitive to the experience, training, and learned habits of the writer. I haven't witnessed a great enough number to sensibly say what a norm would be. I certainly haven't seen a person hold two pens at once though. The scribes I've seen (maybe 4 that I watched closely) were pragmatic in their writing: when a red word occurred they would put down the black brush, pick up the red and write the word. While the utensil was still in hand they would go back and add red dots or strokes where they thought they were needed. If no red words occurred (usually there is one every sentence or two, depending upon the material) they would continue writing in black until the end of a sentence or section and stop there to change pens and go back to update punctuation or tonal marks. Again, I wouldn't draw any significant conclusions from this. I don't believe extra space is considered for adding red marks later; the red is allowed to bleed over the black. In trying to reproduce the practice with fonts, though, I have used an enlarged version of U+1362 because the result looked much clearer. The original intention was lost when keeping the original proportions. My thought at the time was that it was just a natural adjustment that one makes when going from ink and paper to computer typography, the goal being that we try to improve upon what the hand can do without losing the essence of it. /Daniel
Re: Chromatic font research
> of the uses have a cultural and sometimes religious significance I felt that
> it would be respectful to those situations to use a purely ornamental

In the Ethiopic case the most common form is U+1362 (four dots like ::) interlaced with 5 red dots in the sign of the cross. This is 9 dots altogether and at a glance looks like a colorful paragraph separator. Any punctuation or numeral may receive extra flourishes of red (U+1364 receives red strokes about as often); there is no semantic impact on the character. It is a practice relegated by and large to religious works. Scribes themselves have told me that they have no rhyme nor reason for why they've made one character or word red in one sentence and not the next, save for possible subliminal divine inspiration at that particular instant :) The capability to do the same electronically would be well received. /Daniel
UTF-8@Hotmail
Greetings, I just noticed that UTF-8 encoding is finally working at Hotmail. UTF-8 works in the subject as well as the body of a letter. Late last year I saw that UTF-8 would not display properly at Hotmail, even when the letter body was HTML with the encoding set correctly. Does anyone here know how long this functionality has been available and to what extent? I'm not quite brave enough to start using a Unicode password ;) thanks, /Daniel
RFC: Extended Ethiopic
Greetings, The Quality and Standards Authority of Ethiopia is now requesting public input on the proposed national syllabary. Note that the closing date for comment submissions is the 2nd of March: http://www.qsae.org/web_en/Standards_info/Drafts.htm thank you, /Daniel
Re: Additional Ethiopic characters?
> Daniel Yacob was to get me samples of the characters in use, so we
> could update the proposal. That hasn't happened yet.

"All good things come to those who wait." ..and lots of good things are coming, however slowly ;)
Q: "smEthiopic" in Apple Localization Codes
Greetings, I'm hoping Apple developers might be able to clear up what is happening with the "smEthiopic" script identifier in the reference: http://developer.apple.com/techpubs/quicktime/qtdevdocs/APIREF/SOURCESIV/localizationcodes.htm The granularity of script names is not quite in step with what we see at the Unicode "Code Charts" page, which might be why we then find: langInuktitut= 143 // Inuit using smEthiopic script Taken literally, "Inuit using smEthiopic script" is not a high probability scenario. Not that I wouldn't recommend Ethiopic to the Inuit ;) Is this merely a bug in the comments, or does "smEthiopic" span from the Ethiopic range thru the Unified Canadian Aboriginal Syllabics region? The absence of "smCherokee" and "smCanadian" leads me towards the latter. thanks, /Daniel
Planning a "Unicode Only" Week
Greetings, A number of the Ethiopian language news services have tentatively planned for a "Unicode Only" week during the first week of January. Service in all other encoding systems would be suspended during the week. The intention is to give users a gentle push to download and install a Unicode font. Users will be warned weeks ahead of time of the impending legacy encoding outage. The choice of the first week of January was mostly arbitrary. If other web sites would like to join us for a larger "Unicode Only" week please contact me offline. We could reschedule to coincide with the week of IUC-20 if there is enough interest. cheers, /Daniel
Feera: Ancient Script of the Afar
I've stumbled into Feera this evening and have added it to my list of things to be grateful for this Thanksgiving. Now if only I could read French: http://www.arhotaba.com/feera.htm Certainly elements of the script are familiar, but it is distinct from neighboring writing systems like Osmanya or Ethiopian cursive, which share a few common glyphs. Does anyone here know more about it? /Daniel
Grand Unified Syllabary Project Opens
The Grand Unified Syllabary project has the primary objective of mapping the natural (non-composition based) syllabaries of Unicode onto a common linguistic frame of reference. The target frame of reference is a CVCT table (consonant-vowel-consonant-tone) applying IPA rules for the phonemic mapping of the symbols. A table defining the component properties of syllables would, it is assumed, serve as a reference for:

* syllabic character classes
* regular expression languages
* transliteration between syllabaries and other writing systems
* phonetic based and script independent input methods

GUS furthers the development of the "Syllables.txt" data file introduced with Perl 5.6. Orthography experts are still greatly needed for the Yi, Canadian Aboriginal, Cherokee, Katakana and Hiragana syllabaries. More information and a development email list can be found on the project homepage: http://syllabary.sourceforge.net/ /Daniel
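As a rough illustration of what a consonant/vowel table entry could look like, the Python sketch below splits Ethiopic syllables into components using only their Unicode character names. This is an assumption made for illustration and is not the project's method: the name romanizations are not IPA, tone and a final consonant are not covered, and `decompose` is a hypothetical helper that simply returns `None` for names it cannot split (for example the "GLOTTAL" and "PHARYNGEAL" names).

```python
import re
import unicodedata

# Matches simple names such as "ETHIOPIC SYLLABLE LU": a run of consonant
# letters followed by a run of vowel letters.
_NAME = re.compile(r"^ETHIOPIC SYLLABLE ([B-DF-HJ-NP-TV-Z]*)([AEIOU]+)$")


def decompose(syllable: str):
    """Return (consonant, vowel) taken from the character name, or None."""
    if len(syllable) != 1:
        return None
    try:
        name = unicodedata.name(syllable)
    except ValueError:  # unassigned or unnamed code point
        return None
    match = _NAME.match(name)
    if match is None:
        return None
    return match.group(1), match.group(2)


def build_table(first: int, last: int) -> dict:
    """Map each decomposable character in [first, last] to its components."""
    table = {}
    for cp in range(first, last + 1):
        parts = decompose(chr(cp))
        if parts is not None:
            table[chr(cp)] = parts
    return table


if __name__ == "__main__":
    # U+1208..U+120F is one consonant row of the Ethiopic block.
    for char, (consonant, vowel) in build_table(0x1208, 0x120F).items():
        print(f"U+{ord(char):04X} {consonant!r} {vowel!r}")
```

A table of this shape could then be inverted to build syllabic character classes, for instance "all syllables sharing consonant L".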
Status of Unicode on Wireless Devices?
Greetings, I've recently had to work on a headline pusher that would send either transliterated or UTF-8 "alerts" to instant messengers, cell phones, pagers and any other devices accessible thru an email gateway. Unfortunately, the extent to which I can test the service is highly limited. So I was wondering if there might be a survey page lurking somewhere on the net that indicates the degree of Unicode support in the most common wireless devices? If you'd like to try out the service and test your own device, the URL is here: http://www.ethiozena.net/mobile/ Be aware that the software is still in a beta stage of development. I would appreciate any insightful feedback. The alerts are generally sent once a day, six days a week. /Daniel
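For readers wondering what a UTF-8 alert looks like on the wire, here is a minimal Python sketch using the standard `email` package. It is not the service's code: `build_alert` and the example.org addresses are invented for illustration, and actual delivery through a gateway is left out. The point is only that a non-ASCII subject is carried as an RFC 2047 encoded word, while the body is labelled with a UTF-8 charset.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_alert(sender: str, recipient: str, headline: str, body: str) -> EmailMessage:
    """Build a plain-text alert; non-ASCII text is declared as UTF-8."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = headline  # serialized as an RFC 2047 encoded word if non-ASCII
    msg.set_content(body)      # a non-ASCII str body gets charset="utf-8"
    return msg


if __name__ == "__main__":
    # Hypothetical addresses; a real gateway address would go in `recipient`.
    alert = build_alert(
        "alerts@example.org",
        "device@example.org",
        "\u12dc\u1293",                # headline in Ethiopic
        "\u1230\u120b\u121d\n",        # body in Ethiopic
    )
    raw = alert.as_bytes()
    print(raw.decode("utf-8"))

    # Round trip: a conforming reader recovers the original subject.
    parsed = message_from_bytes(raw, policy=policy.default)
    print(parsed["Subject"] == "\u12dc\u1293")
```

Whether a given pager or phone decodes the encoded subject is exactly the open question in the message; the transliterated fallback avoids it by staying in ASCII.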
Re: A UTF-8 based News Service
[EMAIL PROTECTED] wrote:
>
> As a test, I downloaded the first article on the page:
>
> http://unicode.ethiozena.net/Gazettas/Kibrit/Archives/1993/Hamle/05/Kibrit.051193.sera.html
>
> The article, dated 1993-05-11, has the formidable title:
>

Yesterday in the Ethiopian calendar :)

> «p-t negaso gidada wedeTalyan kobelelu teblo yeteseraCew zegeba f`Sum Heset
> new» yeTalyan Embasi
>

Titles (in markup) remain transliterated since a number of browsers that support UTF-8 viewing in the page display area do not support it in the "title" area of the browser's application window. Transliterated Ethiopic actually fares better than UTF-8 since consonants can be a single byte, syllables 2 bytes and diphthongs 3. On average a document might "compress" with transliteration down to 53%. Not so easy on the eyes, but useful as a last resort.

>
> Encoded in UTF-8, the file was 1891 bytes long. Converted into SCSU, it
> dropped to 1121 bytes, which is 40% shorter than the UTF-8 version, better
> than UTF-16, and probably better than any existing legacy encoding for
> Ethiopic. SCSU is a Good Thing.

Sounds promising! How well does SCSU gzip? /Daniel
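The size comparison discussed above is easy to repeat for any text. The following Python sketch measures UTF-8 and UTF-16 sizes before and after gzip. It does not cover SCSU, because the Python standard library has no SCSU codec, so the "how well does SCSU gzip" question stays open here. `encoded_sizes` is a hypothetical helper and the sample text is an arbitrary placeholder, not the article from the thread.

```python
import gzip


def encoded_sizes(text: str) -> dict:
    """Return byte counts for a text in several encodings, raw and gzipped."""
    utf8 = text.encode("utf-8")
    utf16 = text.encode("utf-16-be")  # big-endian, no byte order mark
    return {
        "utf-8": len(utf8),
        "utf-16": len(utf16),
        "utf-8 + gzip": len(gzip.compress(utf8)),
        "utf-16 + gzip": len(gzip.compress(utf16)),
    }


if __name__ == "__main__":
    # Placeholder sample: three Ethiopic syllables and a space, repeated.
    sample = "\u1230\u120b\u121d " * 200
    for label, size in encoded_sizes(sample).items():
        print(f"{label:>14}: {size} bytes")
```

Highly repetitive placeholder text exaggerates the gzip gain, so real articles should be measured before drawing conclusions.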
A UTF-8 based News Service
Greetings, I thought this would be of interest to people here who might be involved in multilingual news services: The Ethiopian News Headlines has relocated to a new server at http://www.ethiozena.net/ and is making it easier than ever to read news headlines in Unicode. A companion Unicode-only server has been launched at http://unicode.ethiozena.net/ which serves articles in UTF-8 encoding only. Other new features include localization in three languages, and daily article links are packaged in XML for other news services to link to (see http://www.ethiozena.net/zena.xml and a demonstration parsing script in Perl http://www.ethiozena.net/zena.pl.txt). As someone involved in the service I often wish there was some form of "compressed" Unicode encoding. The 3-byte penalty that Ethiopic bears under UTF-8 turns into higher bandwidth, which web hosting services meter and charge for by the megabyte. For a popular site this soon makes UTF-8 a costly option to support. A system analogous to iso-8859-x, whereby Ethiopic and other scripts in the 3-byte range could be shifted back into the 2-byte range, might help (generally only English and Ethiopic are desired together). Fortunately there is mod_gzip for Apache. I would appreciate any information about other options. thanks, /Daniel
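To show what "shifting Ethiopic back into the 2-byte range" could mean in practice, here is a toy Python sketch of such a scheme. It is entirely hypothetical: it is not iso-8859-x, not SCSU, and not any registered encoding, and it handles only ASCII plus the Ethiopic block U+1200 to U+137F. ASCII stays one byte; each Ethiopic character becomes two bytes with the high bit set.

```python
ETHIOPIC_START = 0x1200
ETHIOPIC_END = 0x137F  # 384 code points, offsets 0..383


def encode(text: str) -> bytes:
    """Toy encoding: ASCII as 1 byte, Ethiopic as 2 high-bit bytes."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(cp)
        elif ETHIOPIC_START <= cp <= ETHIOPIC_END:
            offset = cp - ETHIOPIC_START
            out.append(0x80 | (offset >> 7))    # lead byte: 0x80..0x82
            out.append(0x80 | (offset & 0x7F))  # trail byte: 0x80..0xFF
        else:
            raise ValueError(f"U+{cp:04X} is outside this toy encoding")
    return bytes(out)


def decode(data: bytes) -> str:
    """Inverse of encode(); rejects truncated or out-of-range input."""
    chars = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            chars.append(chr(lead))
            i += 1
            continue
        if i + 1 >= len(data) or data[i + 1] < 0x80:
            raise ValueError("truncated or malformed 2-byte sequence")
        offset = ((lead & 0x7F) << 7) | (data[i + 1] & 0x7F)
        if offset > ETHIOPIC_END - ETHIOPIC_START:
            raise ValueError("offset outside the Ethiopic block")
        chars.append(chr(ETHIOPIC_START + offset))
        i += 2
    return "".join(chars)


if __name__ == "__main__":
    text = "News: \u1230\u120b\u121d"
    packed = encode(text)
    print(len(text.encode("utf-8")), "bytes as UTF-8")
    print(len(packed), "bytes in the toy encoding")
    print(decode(packed) == text)
```

A private encoding like this saves bytes only if both ends agree on it, which is the usual argument for a standard scheme or for transport compression such as mod_gzip.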
Ethiopian Time Locale Demonstrator
Greetings, I discovered the wonderful "FreeType" tools this last weekend that convert TT strings to images (PNG in this case) on the fly. I didn't expect it to work with UTF-8 but lo and behold it does! I've applied it to the LibEth Perl bindings to demonstrate time formatting options under Ethiopian norms: http://www.geez.org/date-config.html It is fairly rudimentary now, but as time allows in the coming weeks I'll be adding more languages, typeface options, etc. A newfangled web page hit counter is also on my mind.. cheers, /Daniel