Forming Coptic Numbers in Unicode
Greetings, To compose Coptic numerals under Unicode I've combined the appropriate lowercase letters in the Greek-Coptic range with marks from the Combining Diacritical Marks block: U+0304, U+0331 and U+0347. I had no basis for choosing these diacritics other than that they seemed to get the job done visually. What are the approved symbols for composing Coptic numbers portably? Next question: what does one use for character codes to show values over a billion? If there is no officially sanctioned solution, it occurred to me that the diacritics could simply accumulate after the lowercase character for interchange and then be presented graphically. Some recommendation added to The Unicode Standard reference would be a good service here, sorry if I've missed it. thanks, /Daniel
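A minimal sketch of the accumulation idea in the message above: a base letter followed by any number of the three combining marks, kept in order for interchange. The helper names are hypothetical, and GREEK SMALL LETTER ALPHA is only an assumed stand-in for the letter with value 1; which mark should carry which magnitude is exactly the open question, so the sketch assigns no numeric meaning to them.

```python
import unicodedata

# The three combining marks named in the message.
MARKS = {
    "overline": "\u0304",          # COMBINING MACRON
    "underline": "\u0331",         # COMBINING MACRON BELOW
    "double_underline": "\u0347",  # COMBINING EQUALS SIGN BELOW
}


def compose_numeral(base: str, marks: list[str]) -> str:
    """Return the base letter followed by the named marks, in the order given.

    Hypothetical helper: it only concatenates code points and leaves the
    stacking of the marks to the font and rendering engine.
    """
    return base + "".join(MARKS[name] for name in marks)


def describe(text: str) -> list[str]:
    """List each code point with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# Assumption: alpha stands in for the numeral letter; not a sanctioned mapping.
one = compose_numeral("\u03b1", ["overline"])
stacked = compose_numeral("\u03b1", ["overline", "underline", "double_underline"])

for line in describe(stacked):
    print(line)
```

Because the marks simply follow the base character, a receiver can count or inspect them without knowing how they will be drawn.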
Re: Chromatic font research
> In the handwritten form, could you please say whether the adding of the red increases the width of the area needed to represent the character?

Yes, absolutely, at least by the width of two dots.

> Also, when handwritten, does the scribe have a black pen in one hand and a red pen in the other so that colouring takes place on a character by character basis as writing proceeds, or does the scribe put down one pen and pick up another, and, if so, is that on a character by character basis or is that on the basis of producing a number of characters in black and then adding the red afterwards. This would seem to be possibly significant due to the possible need to allow for the greater width of the area used for a character that is later to receive red flourishes.

My oh my, these are wonderfully interesting questions :) I would think the use of tools would be highly sensitive to the experience, training, and learned habits of the writer. I haven't witnessed a great enough number to sensibly say what a norm would be. I certainly haven't seen a person hold two pens at once though. The scribes I've seen (maybe 4 I watched closely) were pragmatic in their writing: when a red word occurred they would put down the black brush, pick up the red and write the word. While the utensil was still in hand they would go back and add red dots or strokes where they thought they were needed. If no red words occurred (usually there is one every sentence or two, depending upon the material) they would continue writing in black until the end of a sentence or section and stop there to change pens and go back to update punctuation or tonal marks. Again, I wouldn't draw any significant conclusions from this.

I don't believe extra space is considered for adding red marks later; the red is allowed to bleed over the black. Trying to reproduce the practice with fonts, though, I have used an enlarged version of 1362 because the result looked much clearer. The original intention was lost when keeping the original proportions. My thought at the time was that it was just a natural adjustment that one makes when going from ink and paper to computer typography, the goal being that we try to improve upon what the hand can do without losing the essence of it. /Daniel
UTF-8@Hotmail
Greetings, I just noticed that UTF-8 encoding is finally working at Hotmail. UTF-8 works in the subject as well as the body of a letter. Late last year I saw that UTF-8 would not display properly at Hotmail, even when the letter body was HTML with the encoding set correctly. Does anyone here know for how long this functionality has been available and to what extent? I'm not quite brave enough to start using a Unicode password ;) thanks, /Daniel
Re: Additional Ethiopic characters?
> Daniel Yacob was to get me samples of the characters in use, so we could update the proposal. That hasn't happened yet. All good things come to those who wait. /Hannibal

..and lots of good things are coming, however slowly ;)
Q: smEthiopic in Apple Localization Codes
Greetings, I'm hoping Apple developers might be able to clear up what is happening with the smEthiopic script identifier in the reference: http://developer.apple.com/techpubs/quicktime/qtdevdocs/APIREF/SOURCESIV/localizationcodes.htm

The granularity of script names is not quite in step with what we see at the Unicode Code Charts page, which might be why we then find:

    langInuktitut= 143 // Inuit using smEthiopic script

Taken literally, Inuit using the smEthiopic script is not a high probability scenario. Not that I wouldn't recommend Ethiopic to the Inuit ;) Is this merely a bug in the comments, or does smEthiopic span from the Ethiopic range through the Unified Canadian Aboriginal Syllabics region? The absence of smCherokee and smCanadian leads me towards the latter. thanks, /Daniel
Grand Unified Syllabary Project Opens
The Grand Unified Syllabary project has the primary objective of mapping the natural (non-composition based) syllabaries of Unicode onto a common linguistic frame of reference. The target frame of reference is a CVCT table (consonant-vowel-consonant-tone) applying IPA rules for the phonemic mapping of the symbols. A table defining the component properties of syllables would, it is assumed, serve as a reference for:

* syllabic character classes
* regular expression languages
* transliteration between syllabaries and other writing systems
* phonetic based and script independent input methods

GUS furthers the development of the Syllables.txt data file introduced with Perl 5.6. Orthography experts are still greatly needed for the Yi, Canadian Aboriginal, Cherokee, Katakana and Hiragana syllabaries. More information and a development email list can be found on the project homepage: http://syllabary.sourceforge.net/ /Daniel
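To make the CVCT idea concrete, here is a rough sketch of how such a table could back character classes and transliteration. Everything in it is an assumption made for illustration: the Syllable record, the handful of Hiragana and Ethiopic entries, and the helper names are hypothetical and are not taken from Syllables.txt or the GUS project data.

```python
import re
from typing import NamedTuple, Optional


class Syllable(NamedTuple):
    onset: str           # C: initial consonant (IPA)
    vowel: str           # V: vowel (IPA)
    coda: Optional[str]  # C: final consonant, if any
    tone: Optional[str]  # T: tone, if any


# Illustrative entries only (assumed values, not project data).
TABLE = {
    "\u304b": Syllable("k", "a", None, None),  # HIRAGANA LETTER KA
    "\u304d": Syllable("k", "i", None, None),  # HIRAGANA LETTER KI
    "\u304f": Syllable("k", "u", None, None),  # HIRAGANA LETTER KU
    "\u3055": Syllable("s", "a", None, None),  # HIRAGANA LETTER SA
    "\u12a9": Syllable("k", "u", None, None),  # ETHIOPIC SYLLABLE KU
}


def char_class(onset: Optional[str] = None, vowel: Optional[str] = None) -> str:
    """Build a regex character class of syllables matching the given parts."""
    chars = sorted(
        ch for ch, s in TABLE.items()
        if (onset is None or s.onset == onset)
        and (vowel is None or s.vowel == vowel)
    )
    return "[" + "".join(chars) + "]"


def transliterate(text: str) -> str:
    """Spell out known syllables as onset + vowel; pass other characters through."""
    return "".join(
        TABLE[ch].onset + TABLE[ch].vowel if ch in TABLE else ch for ch in text
    )


# Find every syllable with the vowel "u", whatever its script.
print(re.findall(char_class(vowel="u"), "\u304b\u304f\u12a9\u3055"))
print(transliterate("\u304b\u3055"))
```

The same lookup serves a script independent query (all "u" syllables) and a transliteration, which is the kind of reuse the bullet list describes.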
Status of Unicode on Wireless Devices?
Greetings, I've recently had to work on a headline pusher that sends either transliterated or UTF-8 alerts to instant messengers, cell phones, pagers and any other devices accessible through an email gateway. Unfortunately, the extent to which I can test the service is highly limited. So I was wondering if there might be a survey page lurking somewhere on the net that indicates the degree of Unicode support in the most common wireless devices? If you'd like to try out the service and test your own device, the URL is here: http://www.ethiozena.net/mobile/ Be aware that the software is still in a beta stage of development. I would appreciate any insightful feedback. The alerts are generally sent once a day, six days a week. /Daniel
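For readers who want to try something similar, below is a small sketch of building a UTF-8 alert for an email gateway using Python's standard email package. It is not the service's actual code: build_alert, the addresses and the sample text are all made up for illustration, and the message is only constructed, not sent.

```python
from email.message import EmailMessage


def build_alert(sender: str, recipient: str, headline: str, body: str) -> EmailMessage:
    """Build a plain-text alert; non-ASCII text is declared as UTF-8.

    Hypothetical helper: a real pusher would also hand the message to an
    SMTP server, and might fall back to transliteration per device.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = headline   # encoded as an RFC 2047 word when serialized
    msg.set_content(body)       # a str body defaults to charset utf-8
    return msg


# Placeholder addresses; the headline is an Ethiopic sample word.
alert = build_alert(
    "alerts@example.org",
    "5551234@gateway.example",
    "\u1230\u120b\u121d",
    "\u1230\u120b\u121d \u12d3\u1208\u121d",
)
print(alert.get_content_charset())
```

Whether a given phone or pager displays the result correctly is a separate matter, which is what the survey question above is asking.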
Re: A UTF-8 based News Service
[EMAIL PROTECTED] wrote:

> As a test, I downloaded the first article on the page: http://unicode.ethiozena.net/Gazettas/Kibrit/Archives/1993/Hamle/05/Kibrit.051 193.sera.html The article, dated 1993-05-11, has the formidable title:

Yesterday in the Ethiopian calendar :) insert favorite Y2K joke here

> «p-t negaso gidada wedeTalyan kobelelu teblo yeteseraCew zegeba f`Sum Heset new» yeTalyan Embasi

Titles (in title markup) remain transliterated, since a number of browsers that support UTF-8 in the page display area do not support it in the title bar of the application window. Transliterated Ethiopic actually fares better than UTF-8 for size, since consonants can be a single byte, syllables 2 bytes and diphthongs 3. On average a document might shrink with transliteration to about 53% of its UTF-8 size. It is not so easy on the eyes, but useful as a last resort.

> Encoded in UTF-8, the file was 1891 bytes long. Converted into SCSU, it dropped to 1121 bytes, which is 40% shorter than the UTF-8 version, better than UTF-16, and probably better than any existing legacy encoding for Ethiopic. SCSU is a Good Thing.

Sounds promising! How well does SCSU gzip? /Daniel
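As a quick check of the figures quoted above, this sketch works out the saving from the two byte counts in the message. The function name is hypothetical; only the numbers 1891 and 1121 come from the thread.

```python
def percent_saved(original_bytes: int, encoded_bytes: int) -> float:
    """Percentage by which the encoded form is shorter than the original."""
    return 100.0 * (original_bytes - encoded_bytes) / original_bytes


utf8_size = 1891  # bytes, as reported for the UTF-8 file
scsu_size = 1121  # bytes, as reported after conversion to SCSU

print(f"SCSU saves {percent_saved(utf8_size, scsu_size):.1f}% over UTF-8")
```

The same function puts the transliteration estimate on the same scale: a document reduced to 53% of its size corresponds to a 47% saving.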
A UTF-8 based News Service
Greetings, I thought this would be of interest to people here who might be involved in multilingual news services: The Ethiopian News Headlines has relocated to a new server at http://www.ethiozena.net/ and is making it easier than ever to read news headlines in Unicode. A companion Unicode-only server has been launched at http://unicode.ethiozena.net/ which serves articles in UTF-8 encoding only. Other new features include localization in three languages, and daily article links are packaged in XML for other news services to link to (see http://www.ethiozena.net/zena.xml and a demonstration parsing script in Perl, http://www.ethiozena.net/zena.pl.txt). As someone involved in the service I often wish there were some form of compressed Unicode encoding. The 3-byte penalty that Ethiopic bears under UTF-8 turns into higher bandwidth, which web hosting services meter and charge for by the megabyte. For a popular site this soon makes UTF-8 a costly option to support. A system analogous to iso-8859-x, whereby Ethiopic and other scripts in the 3-byte range could be shifted back into the 2-byte range, might help (generally only English and Ethiopic are desired together). Fortunately there is mod_gzip for Apache. I would appreciate any information about other options. thanks, /Daniel
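The 3-byte cost and the effect of gzip can be seen with a short sketch like the one below. The sample word and the repeated "page" are made up for illustration, so the compression it shows is far better than real articles would give; the helper name is hypothetical.

```python
import gzip


def encoded_sizes(text: str) -> dict[str, int]:
    """Return the character count and the byte length in UTF-8 and UTF-16."""
    return {
        "chars": len(text),
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),  # no byte order mark
    }


sample = "\u1230\u120b\u121d"  # three Ethiopic syllables
sizes = encoded_sizes(sample)
print(sizes)

# Assumption: a highly repetitive stand-in for a page, which flatters gzip.
page = (sample + " ") * 200
raw = page.encode("utf-8")
packed = gzip.compress(raw)
print(len(raw), "bytes as UTF-8,", len(packed), "bytes gzipped")
```

Each Ethiopic character takes 3 bytes in UTF-8 against 2 in UTF-16, and general purpose compression of the kind mod_gzip applies recovers much of that overhead on the wire.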
Ethiopian Time Locale Demonstrator
Greetings, I discovered the wonderful "FreeType" tools this last weekend, which render strings in TrueType fonts as images (PNG in this case) on the fly. I didn't expect it to work with UTF-8 but lo and behold it does! I've applied it to the LibEth Perl bindings to demonstrate time formatting options under Ethiopian norms: http://www.geez.org/date-config.html It is fairly rudimentary now, but as time allows in the coming weeks I'll be adding more languages, typeface options, etc. A newfangled web page hit counter is also on my mind.. cheers, /Daniel