Forming Coptic Numbers in Unicode

2002-10-10 Thread Daniel Yacob

Greetings,

To compose Coptic numerals under Unicode I've combined the appropriate
lowercase letters in the Greek-Coptic range with elements from the
Combining Diacritical Marks block: U+0304, U+0331 and U+0347.  I had no
basis for choosing these diacritical marks other than that they seemed
to get the job done visually.  What are the approved symbols for
composing Coptic numbers portably?

Next question: what does one do for character codes to show values
over a billion?  If there is no officially sanctioned solution, it
occurred to me that the diacritic symbols could simply accumulate
following the lowercase char for interchange and then be presented
graphically.
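
A minimal sketch of the composition described above, in Python.  It only
shows how a base letter followed by one or more of the three combining
marks looks as a code point sequence.  The choice of U+03B1 as the base
letter is an assumption for illustration, and no numeric value is
assigned to any mark, since which mark should carry which multiplier is
exactly the open question.

```python
import unicodedata

# The three combining marks named in the message.
MACRON_ABOVE = "\u0304"   # COMBINING MACRON
MACRON_BELOW = "\u0331"   # COMBINING MACRON BELOW
EQUALS_BELOW = "\u0347"   # COMBINING EQUALS SIGN BELOW


def compose(base: str, *marks: str) -> str:
    """Return the base letter followed by the marks, in the order given."""
    for mark in marks:
        if unicodedata.combining(mark) == 0:
            raise ValueError(f"U+{ord(mark):04X} is not a combining mark")
    return base + "".join(marks)


def describe(text: str) -> list[str]:
    """List each code point with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


ALPHA = "\u03b1"  # assumed base letter, chosen only for illustration

single = compose(ALPHA, MACRON_ABOVE)
# Marks simply accumulate after the base letter for interchange.
stacked = compose(ALPHA, MACRON_ABOVE, EQUALS_BELOW, EQUALS_BELOW)

for line in describe(stacked):
    print(line)
```

Whether a renderer stacks the accumulated marks legibly depends on the
font and the layout engine, so this says nothing about presentation.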

A recommendation added to The Unicode Standard reference would be a
good service here; sorry if I've missed it.


thanx,

/Daniel




Re: Chromatic font research

2002-06-27 Thread Daniel Yacob

 In the handwritten form, could you please say whether the adding of the red
 increases the width of the area needed to represent the character?

yes, absolutely, at least by the width of two dots.


 Also, when handwritten, does the scribe have a black pen in one hand and a
 red pen in the other so that colouring takes place on a character by
 character basis as writing proceeds, or does the scribe put down one pen and
 pick up another, and, if so, is that on a character by character basis or is
 that on the basis of producing a number of characters in black and then
 adding the red afterwards.  This would seem to be possibly significant due
 to the possible need to allow for the greater width of the area used for a
 character that is later to receive red flourishes.

my oh my, these are wonderfully interesting questions :)  I would think the
use of tools would be highly sensitive to the experience, training, and
learned habits of the writer.  I haven't witnessed a great enough number to
sensibly say what a norm would be.  I certainly haven't seen a person hold
two pens at once though.  The scribes I've seen (maybe 4 I watched closely)
were pragmatic in their writing, when a red word occurred they would put down
the black brush and pick up the red and write the word.  While the utensil
was still in hand they would go back and add red dots or strokes where they
thought it was needed.  If no red words occurred (usually one every sentence
or two depending upon the material) they would continue writing in black
until the end of a sentence or section and stop there to change pens to go
back and update punctuation or tonal marks.  Again, I wouldn't draw any
significant conclusions from this.

I don't believe extra space is considered for adding red marks later, the
red is allowed to bleed over the black.  Trying to reproduce the practice
with fonts though I have used an enlarged version of 1362 because the result
looked much clearer.  The original intention was lost when keeping the original
proportions.  My thought at the time was that it was just a natural adjustment
that one makes when going from ink and paper to computer typography, the
goal being that we try to improve upon what the hand can do without losing
the essence of it.

/Daniel




UTF-8@Hotmail

2002-02-19 Thread Daniel Yacob

Greetings,

I just noticed that UTF-8 encoding is finally working at Hotmail.
UTF-8 works in the subject as well as the body of a letter.  Late
last year I saw that UTF-8 would not display properly at Hotmail,
even when the letter body was HTML with the encoding set correctly.
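
For reference, a small sketch of a message that declares UTF-8 in both
the subject and the body, using Python's standard email package.  The
addresses and the sample text are placeholders, and this says nothing
about how Hotmail itself handles such a message.

```python
from email.message import EmailMessage


def build_message(subject: str, body: str, sender: str, recipient: str) -> EmailMessage:
    """Build a plain-text message whose body is labelled as UTF-8."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject                 # non-ASCII is RFC 2047 encoded on output
    msg.set_content(body, charset="utf-8")   # sets the Content-Type charset
    return msg


sample = "\u1230\u120b\u121d"  # three Ethiopic syllables as sample text
msg = build_message(sample, sample, "sender@example.org", "reader@example.org")

wire = msg.as_bytes()          # what would actually be handed to a mail server
print(msg["Content-Type"])
```

The subject goes out as an encoded word and the body carries its own
charset label, so a client has to get both right to display the letter.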

Anyone here know for how long this functionality has been available
and to what extent?  I'm not quite brave enough to start using a
Unicode password ;)

thanks,

/Daniel




Re: Additional Ethiopic characters?

2002-01-27 Thread Daniel Yacob

 Daniel Yacob was to get me samples of the characters in use, so we 
 could update the proposal. That hasn't happened yet.


All good things come to those who wait./Hannibal

..and lots of good things are coming, however slowly ;)




Q: smEthiopic in Apple Localization Codes

2002-01-11 Thread Daniel Yacob

Greetings,

I'm hoping Apple developers might be able to clear up what is happening with
the smEthiopic script identifier in the reference:


http://developer.apple.com/techpubs/quicktime/qtdevdocs/APIREF/SOURCESIV/localizationcodes.htm

The granularity of script names is not quite in step with what we see at
the Unicode Code Charts page,  which might be why we then find:


langInuktitut= 143   // Inuit using smEthiopic script


Taken literally, "Inuit using smEthiopic script" is not a high probability
scenario.  Not that I wouldn't recommend Ethiopic to the Inuit ;)

Is this merely a bug in the comments or does smEthiopic span from the
Ethiopic range thru the Unified Canadian Aboriginal Syllabics region?  The
absence of smCherokee and smCanadian leads me towards the latter.

thanks,

/Daniel




Grand Unified Syllabary Project Opens

2001-09-06 Thread Daniel Yacob


The Grand Unified Syllabary project has the primary objective
to map the natural (non-composition based) syllabaries of
Unicode onto a common linguistic frame of reference. The target
frame of reference is a CVCT table (consonant-vowel-consonant-tone)
applying IPA rules for the phonemic mapping of the symbols.

Such a table, defining the component properties of syllables,
is expected to serve as a reference for:

  *  syllabic character classes
  *  regular expression languages
  *  transliteration between syllabaries and other writing systems
  *  phonetic based and script independent input methods
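
As a rough illustration of the idea, a CVCT table can be sketched in
Python as a mapping from syllable characters to their components.  The
handful of entries and the phoneme values below are simplified
assumptions for illustration only; they are not taken from Syllables.txt
or from the project's data.

```python
from typing import NamedTuple


class CVCT(NamedTuple):
    onset: str   # leading consonant
    vowel: str
    coda: str    # trailing consonant, empty if none
    tone: str    # empty if the script does not mark tone


# character -> (script, components); a tiny hand-made sample
TABLE = {
    "\u30ab": ("Katakana", CVCT("k", "a", "", "")),  # KATAKANA LETTER KA
    "\u30ad": ("Katakana", CVCT("k", "i", "", "")),  # KATAKANA LETTER KI
    "\u304b": ("Hiragana", CVCT("k", "a", "", "")),  # HIRAGANA LETTER KA
    "\u12ab": ("Ethiopic", CVCT("k", "a", "", "")),  # ETHIOPIC SYLLABLE KAA
    "\u12aa": ("Ethiopic", CVCT("k", "i", "", "")),  # ETHIOPIC SYLLABLE KI
    "\u13a7": ("Cherokee", CVCT("k", "a", "", "")),  # CHEROKEE LETTER KA
}


def char_class(**wanted: str) -> set[str]:
    """Syllables whose components match, e.g. char_class(vowel="i")."""
    return {
        ch for ch, (_, parts) in TABLE.items()
        if all(getattr(parts, name) == value for name, value in wanted.items())
    }


def transliterate(text: str, target_script: str) -> str:
    """Swap each known syllable for the target script's syllable with the
    same components; anything without a match is passed through."""
    by_parts = {parts: ch for ch, (script, parts) in TABLE.items()
                if script == target_script}
    out = []
    for ch in text:
        entry = TABLE.get(ch)
        out.append(by_parts.get(entry[1], ch) if entry else ch)
    return "".join(out)


print(transliterate("\u30ab\u30ad", "Ethiopic"))
```

The same table serves as a character class source (all syllables with a
given vowel) and as a pivot for transliteration, which are two of the
uses listed above.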

GUS furthers the development of the Syllables.txt data file
introduced with Perl 5.6.  Orthography experts are still greatly
needed for the Yi, Canadian Aboriginal, Cherokee, Katakana and
Hiragana syllabaries.

More information and a development email list can be found on
the project homepage:

  http://syllabary.sourceforge.net/


/Daniel




Status of Unicode on Wireless Devices?

2001-09-03 Thread Daniel Yacob


Greetings,

I've recently had to work on a headline pusher that would send either
transliterated or utf-8 alerts to instant messengers, cell phones, pagers
and any other devices accessible thru an email gateway.

Unfortunately, the extent to which I can test the service is highly limited.
So I was wondering if there might be a survey page lurking somewhere on the
net that indicates the degree of Unicode support in the most common wireless
devices?

If you'd like to try out the service and test your own device, the url is
here:

  http://www.ethiozena.net/mobile/

Be aware that the software is still in a beta stage of development.  I would
appreciate any insightful feedback.  The alerts are generally sent once a
day, six days a week.

/Daniel




Re: A UTF-8 based News Service

2001-07-13 Thread Daniel Yacob

[EMAIL PROTECTED] wrote:
 
 As a test, I downloaded the first article on the page:
 
 http://unicode.ethiozena.net/Gazettas/Kibrit/Archives/1993/Hamle/05/Kibrit.051193.sera.html
 
 The article, dated 1993-05-11, has the formidable title:
 

Yesterday in the Ethiopian calendar :) insert favorite Y2K joke here



 «p-t negaso gidada wedeTalyan kobelelu teblo yeteseraCew zegeba f`Sum Heset
 new» yeTalyan Embasi
 

Titles (in title markup) remain transliterated since a number of browsers
that support UTF-8 viewing in the page display area do not in the title area
of the browser's application window.  Transliterated Ethiopic actually fares
better than UTF-8 since consonants can be a single byte, syllables 2 bytes
and diphthongs 3.  On average a document might compress with transliteration
down to 53% of its UTF-8 size.  Not so easy on the eyes, but useful as a
last resort.
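
The byte arithmetic can be checked with a few lines of Python.  The
sample word and its romanization are hand-written assumptions for
illustration, and a single word cannot confirm the 53% average quoted
above; it only shows where the saving comes from.

```python
def utf8_len(text: str) -> int:
    """Number of bytes the text occupies in UTF-8."""
    return len(text.encode("utf-8"))


# Three Ethiopic syllables; each code point in U+1200..U+137F takes 3 bytes.
ethiopic = "\u1230\u120b\u121d"
# Assumed romanization: two syllables at 2 bytes each, one bare consonant.
transliterated = "selam"

ratio = utf8_len(transliterated) / utf8_len(ethiopic)
print(utf8_len(ethiopic), utf8_len(transliterated), f"{ratio:.0%}")
```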



 Encoded in UTF-8, the file was 1891 bytes long.  Converted into SCSU, it
 dropped to 1121 bytes, which is 40% shorter than the UTF-8 version, better
 than UTF-16, and probably better than any existing legacy encoding for
 Ethiopic.  SCSU is a Good Thing.


Sounds promising!  How well does SCSU gzip?

/Daniel




A UTF-8 based News Service

2001-07-12 Thread Daniel Yacob

Greetings,

I thought this would be of interest to people here who might be
involved in multilingual news services:


The Ethiopian News Headlines has relocated to a new server at
http://www.ethiozena.net/ and is making it easier than ever to
read news headlines in Unicode.  A companion Unicode only server
is launched at http://unicode.ethiozena.net/ which serves
articles in UTF-8 encoding only.

Other new features include localization in three languages and daily
article links are packaged in XML for other news services to link to
(see http://www.ethiozena.net/zena.xml and a demonstration parsing
script in Perl http://www.ethiozena.net/zena.pl.txt).


As someone involved in the service I often wish there was some
form of compressed Unicode encoding.  The 3-byte penalty that
Ethiopic bears under UTF-8 turns into higher bandwidth that web
hosting services meter and charge for by the megabyte.  For a
popular site this soon makes UTF-8 a costly option to support.

A system analogous to iso-8859-x whereby Ethiopic and other scripts
in the 3 byte range could be shifted back into the 2 byte range
might help (generally only English and Ethiopic are desired together).
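
To make the idea concrete, here is a toy sketch in Python of such a
shifted encoding: ASCII stays at one byte and the Ethiopic block
(U+1200..U+137F) is packed into two bytes.  The byte layout is invented
for this illustration and does not correspond to any existing or
proposed standard.

```python
ETHIOPIC_START = 0x1200
ETHIOPIC_END = 0x137F   # 384 code points, so the offset fits in 9 bits


def encode(text: str) -> bytes:
    """ASCII as one byte; Ethiopic as two bytes, both with the high bit set."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(cp)
        elif ETHIOPIC_START <= cp <= ETHIOPIC_END:
            offset = cp - ETHIOPIC_START
            out.append(0x80 | (offset >> 7))     # lead byte: 0x80..0x82
            out.append(0x80 | (offset & 0x7F))   # trail byte: 0x80..0xFF
        else:
            raise ValueError(f"U+{cp:04X} is outside this toy encoding")
    return bytes(out)


def decode(data: bytes) -> str:
    """Inverse of encode()."""
    chars = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            chars.append(chr(lead))
            i += 1
            continue
        if i + 1 >= len(data) or data[i + 1] < 0x80:
            raise ValueError("truncated or malformed two-byte sequence")
        cp = ETHIOPIC_START + (((lead & 0x7F) << 7) | (data[i + 1] & 0x7F))
        if cp > ETHIOPIC_END:
            raise ValueError("offset outside the Ethiopic block")
        chars.append(chr(cp))
        i += 2
    return "".join(chars)


sample = "News: \u1230\u120b\u121d"
packed = encode(sample)
print(len(sample.encode("utf-8")), len(packed))
```

A scheme like this only helps when the text really is limited to the
two scripts, which is the case described here.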

Fortunately there is mod_gzip for Apache.  I would appreciate any
information about other options.
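
For a quick feel of what gzip does to UTF-8 Ethiopic, the sizes can be
compared with Python's standard gzip module.  The sample below is a
short phrase repeated many times, which compresses far better than real
articles would, so the numbers are only indicative; a fair comparison
would use actual article files.

```python
import gzip

# Two Ethiopic words (3 + 5 syllables), each followed by a space,
# repeated 200 times to get a buffer worth compressing.
phrase = "\u1230\u120b\u121d \u12a2\u1275\u12ee\u1335\u12eb "
raw = (phrase * 200).encode("utf-8")

compressed = gzip.compress(raw)

print(f"UTF-8: {len(raw)} bytes, gzip: {len(compressed)} bytes")
```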

thanks,

/Daniel




Ethiopian Time Locale Demonstrator

2000-11-20 Thread Daniel Yacob

Greetings,

I discovered the wonderful "FreeType" tools this last weekend that
convert TT strings to images (PNG in this case) on the fly.  I didn't
expect it to work with UTF-8 but lo and behold it does!  I've applied
it to the LibEth Perl bindings to demonstrate time formatting options
under Ethiopian norms:

  http://www.geez.org/date-config.html

It is fairly rudimentary now but as time allows in the coming weeks
I'll be adding more languages and typeface options, etc.  A new fangled
web page hit counter is also on my mind..

cheers,

/Daniel