Re: Euro

2000-07-29 Thread Asmus Freytag

At 12:13 PM 7/28/00 -0800, Roozbeh Pournader wrote:
I was not talking about the shape. I think all of us have seen it, and
many have also read the documents which define its exact shape using a
ruler and a compass. I was talking about the origin of the shape.

In some sense, except for purists, this discussion is rapidly becoming 
moot. The 'euro glyphs' have been out in the wild, on shop displays, in 
newsprint etc. for well over a year now.

If you will, the 'common man's' idea of what a proper Euro glyph is, is 
fast becoming influenced by what he sees on a daily basis, not by the 
origin of the glyph or by the logo (which is prescribed only for its 
appearance on the currency itself).

Given the name, I'm sure even the 'non-European' font designers that Werner 
likes to blame aren't suggesting that the logo for the 'e'uro is based on a 
'c'. However, when you try to put the thing together with the serifs used 
in many of the common type faces, the result can indeed look a bit like a 
'c'. This seems particularly true for monospaced fonts.

A./




Re: Euro

2000-07-29 Thread 11digitboy

Yeah, how WOULD you make a serifed, rounded E that
doesn't look silly and doesn't look like a C with
an extra line? Well, maybe you can, I dunno. Anyone
who can do that, I'd like to see it. 

--
Robert Lozyniak
Accusplit pedometer manufacturers can go suck eggs
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 





Re: Euro

2000-07-29 Thread Roozbeh Pournader


I found it! Everybody's invited to take a look at:
http://www.tug.org/TUGboat/Articles/tb19-2/tb59inn.pdf

On Sat, 29 Jul 2000, Asmus Freytag wrote:

 If you will, the 'common man's' idea of what a proper Euro glyph is, is 
 fast becoming influenced by what he sees on a daily basis, not by the 
 origin of the glyph or by the logo (which is prescribed only for its 
 appearance on the currency itself).

Ok, but I only want to know about the historical origins.

 Given the name, I'm sure even the 'non-European' font designers that Werner 
 likes to blame aren't suggesting that the logo for the 'e'uro is based on a 
 'c'. However, when you try to put the thing together with the serifs used 
 in many of the common type faces, the result can indeed look a bit like a 
 'c'. This seems particularly true for monospaced fonts.

Take a look at the referenced article.




Encoding of non-characters

2000-07-29 Thread Doug Ewell

Did I read recently (in a message that I shortsightedly deleted)
something to the effect that a character encoding scheme (CES) or
transfer encoding syntax (TES) needs to be able to encode the
non-characters U+D800 through U+DFFF, and presumably U+xxFFFE and
U+xxFFFF as well?

I've been playing around with a TES (or maybe it's a CES; I'm still
having a little trouble knowing exactly where to draw the line).  Don't
worry, I'm not going to propose it anywhere as Yet Another UTF.  I'm
just playing around with Unicode, and hopefully teaching myself
something along the way.

Anyway, my scheme encodes non-BMP characters not *as* surrogates, but
using the surrogate mechanism in a slightly modified way.  Like UTF-16,
this makes it impossible to encode the BMP non-characters in the range
U+D800 through U+DFFF.  Normally I wouldn't think this was a problem,
but I thought someone (Davis?) just said recently that it should be
possible to round-trip these thingies, for some reason.
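
For reference, the ordinary UTF-16 surrogate computation can be sketched as follows. This illustrates plain UTF-16 only, not the modified scheme mentioned above (whose details are not given in this message); the function names are invented for the example.

```python
def to_surrogates(cp):
    """Split a supplementary code point (U+10000..U+10FFFF) into a
    (high, low) UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = cp - 0x10000             # 20-bit value
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return high, low


def from_surrogates(high, low):
    """Recombine a (high, low) surrogate pair into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
```

Because every 16-bit value in D800..DFFF is consumed as half of a pair, a lone code point in that range has no representation of its own, which is the limitation described above.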

The situation would be different in the case of U+xxFFFE and U+xxFFFF,
because while the surrogates occupy entire ranges that can be utilized
in a special way, you kind of have to *deliberately* exclude the FFFx
characters.  Nonetheless, the same question applies:  Must these bogus
code points be representable in a CES or TES, or can they be handled
conformantly by raising an error or mapping them to U+FFFD?
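
The two policies asked about (raise an error, or map to U+FFFD) might look like this in a decoder's output stage. This is a minimal sketch: the helper names and the `strict` flag are hypothetical, and it takes no position on which policy, if either, is conformant.

```python
REPLACEMENT = 0xFFFD


def is_surrogate(cp):
    """True for code points in the surrogate range U+D800..U+DFFF."""
    return 0xD800 <= cp <= 0xDFFF


def is_fffe_ffff(cp):
    """True for U+xxFFFE and U+xxFFFF in any plane (xx = 00..10)."""
    return 0 <= cp <= 0x10FFFF and (cp & 0xFFFE) == 0xFFFE


def sanitize(code_points, strict=False):
    """Return a list of code points with the problematic ones handled.

    strict=True  -> raise ValueError on the first bad code point
    strict=False -> substitute U+FFFD for each bad code point
    """
    out = []
    for cp in code_points:
        if is_surrogate(cp) or is_fffe_ffff(cp):
            if strict:
                raise ValueError("cannot represent U+%04X" % cp)
            out.append(REPLACEMENT)
        else:
            out.append(cp)
    return out
```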

-Doug Ewell
 Fullerton, California



Re: Display Persian characters under Linux

2000-07-29 Thread Edward Cherlin

Darya Said-Akbari wrote:

  Hi,

This is my first email to the Unicode email list. There is a lot I 
want to learn from you all. So even if my questions are sometimes 
stupid, I would still like to read your answers on all issues.

The only stupid question is the one you didn't ask.

My goal in turning my interest to Unicode is to get Persian letters on 
my monitor and into a database, let's say Oracle8i.
The operating system will be Linux. So, what I have done until now 
is to buy the Unicode Standard 3.0. But that is not enough, and 
therefore I need your help.

The short answer is that Oracle supports Arabic script data entry 
from files or keyboard, and you should check your manuals, on-line 
resources at Oracle.com and Oracle user groups, and tech support (in 
that order) for information on importing files and setting data input 
modes. You should also check Linux.org and the Linux user group sites 
for information on font and keyboard mapping file availability and 
installation procedures. For example, send the message

subscribe linux-utf8

to the e-mail address

[EMAIL PROTECTED]

to join a mailing list discussion entirely devoted to issues of 
Unicode on Linux.

What steps do I have to take to make my dream real? Yes, I have 
several character sets on my machine, but they are all European ones. 
And honestly I am a little bit afraid to touch them,

Yes, leave them alone, since you need them for other purposes.

since I don't know the difference between a character set and Unicode.

UNIX systems have a font mapping file (name, please, somebody?) 
containing character set information, including the mapping between 
PostScript font names and UNIX font names. The UNIX names include the 
standard ID for the covered character set. Most of what you have will 
be identified as 8859-1, which is ASCII plus Latin-1. The current 
Arabic script font standard (which covers Farsi) is ISO 8859-6. You 
may see ECMA-114 or ASMO 449 mentioned in some places.

Yes, you need to find Farsi fonts encoded in either 8859-6 or 
Unicode. Any search engine can find a number of sites for you.

Reading the first pages of the book makes me more confused. There 
is something said about rendering. It seems that when I use the Arabic 
letters I have to be concerned with rendering.

Reading the Unicode Standard 3.0 through from the beginning is 
definitely not recommended. Skip to page 189, where the description 
of Arabic script rendering begins, and be sure to look at the code 
chart and notes for U+0600 through U+06FF, Arabic, pp. 389-395. Also skim 
through the Bidirectional Behavior section starting on p. 55. Bidi 
only matters to you if your Farsi data is sometimes mixed with 
material in other writing systems.

Is there anybody who can give me a quick start to clear up my 
confusion about charsets, Unicode, rendering etc.? I know I have to put 
more time into this issue and I am prepared for this. But a little 
success would really motivate me.

Yes. You can prepare test data files for convenient import into 
Oracle in any Farsi-capable word processor or spreadsheet program 
that reads and writes ISO-8859-6 and UTF-8. You don't have to wait 
until you have the fonts and keyboard layouts installed on your Linux 
system. You can also get Oracle to generate Farsi output files to be 
viewed on a different system.
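
As one way to prepare such a test file without a word processor, a short script can write a few Persian strings as UTF-8. This is only a sketch: the file layout (a two-column CSV), the sample words, and the function names are assumptions made for illustration, and the Oracle import step itself is not shown.

```python
import csv

# Sample rows: (id, Persian text). Escapes are used so the source
# file itself stays pure ASCII.
ROWS = [
    (1, "\u0633\u0644\u0627\u0645"),        # SEEN LAM ALEF MEEM
    (2, "\u067e\u0627\u0631\u0633\u06cc"),  # PEH ALEF REH SEEN FARSI YEH
]


def write_test_file(path, rows):
    """Write rows to a UTF-8 encoded CSV file."""
    with open(path, "w", encoding="utf-8", newline="") as f:
        csv.writer(f).writerows(rows)


def read_test_file(path):
    """Read the file back, to confirm the text survived the round trip."""
    with open(path, encoding="utf-8", newline="") as f:
        return [(int(num), text) for num, text in csv.reader(f)]
```

Reading the file back and comparing it with `ROWS` is a cheap check that nothing was lost before the data goes anywhere near the database.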

best regards
Darya Said-Akbari
-- 

Edward Cherlin
Generalist
"A knot!" exclaimed Alice. "Oh, do let me help to undo it."
Alice in Wonderland



Re: Bangla(Bengali) letter Missing

2000-07-29 Thread Md Ziaur Rahman

  Now I have come to the conclusion that there is a way to represent
  khando-ta in the Standard, and that is quite satisfactory.
 
  However, some indications are confusing, so I am writing out my
  understanding:
 
  Ta + Virama + ZWNJ = ta with explicit virama
  Ta + Virama + consonant = Conjunct (ta + consonant)
  Ta + Virama = Khando-ta (when it occurs in final position)
  Ta + Virama + ZWJ = Khando-ta (explicit half-consonant)


 This was my suggestion:
 [Ta] [virama] - [khando-ta] (when final)
 [Ta] [virama] [ZWNJ] - [khando-ta]
 [Ta] [virama] [] - [appropriate conjunct form]
 [Ta] [Virama] [ZWJ] - [Ta Virama]


The only difference is in ZWNJ versus ZWJ after the virama. I think you should
try the guidelines of the Unicode 3.0 standard. My opinion follows those
guidelines. Since all Indic languages are derived from Sanskrit, I think the
guideline for Devanagari is not absolutely useless for Bangla.
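
To make the sequences under discussion concrete, here is a small sketch that spells out the stored code points. It shows only what is encoded; how a given font or rendering engine displays each sequence is exactly what is being debated above. The constant and function names are invented for the example.

```python
import unicodedata

TA = "\u09a4"      # BENGALI LETTER TA
VIRAMA = "\u09cd"  # BENGALI SIGN VIRAMA
ZWNJ = "\u200c"    # ZERO WIDTH NON-JOINER
ZWJ = "\u200d"     # ZERO WIDTH JOINER


def describe(sequence):
    """List each character of a string as 'U+XXXX NAME'."""
    return ["U+%04X %s" % (ord(ch), unicodedata.name(ch)) for ch in sequence]


# The candidate encodings compared in this thread.
CANDIDATES = {
    "ta + virama": TA + VIRAMA,
    "ta + virama + ZWNJ": TA + VIRAMA + ZWNJ,
    "ta + virama + ZWJ": TA + VIRAMA + ZWJ,
}
```

Calling `describe(CANDIDATES["ta + virama + ZWJ"])` lists the three characters in storage order, which can help when checking what an editor or input method actually produced.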

 of the 'Bengali Script' and *not* the 'Bangladeshi language'. Assami and
 monipuri writers *do* make the distinction, but they have the luxury of
 being able to use Assami Va (U+09F1) as well as Ba (U+09AC) to produce the
 two forms shown in my gif.
 Speakers of Bangla make the distinction between the two forms depending only
 on context, e.g. svaamii is spelt sbami and pronounced shami, not sbami, by
 Bangladeshis, whilst in Assamiya it is spelt svami (with U+09F1), not sbami.
 My question is, should speakers of Bangla be restricted to forming only the
 common forms, or should there be a way for us to produce both forms shown?
 Or perhaps do you expect us (Bangladeshis) to use the Assami Va?

In the grammar book by Munir Chowdhury, Mofazzal Haider Ch. and Ibrahim
Khalil (textbook for S.S.C), vba is omitted from the Bangla character set.
It is confusing for common people. So I think the decision is wise.

Regards,

Zia






Re: Display Persian characters under Linux

2000-07-29 Thread Roozbeh Pournader



On Sat, 29 Jul 2000, Edward Cherlin wrote:

 The current Arabic script font standard (which covers Farsi) is
 ISO 8859-6. You may see ECMA-114 or ASMO 449 mentioned in some places.

ISO 8859-6 does not cover Farsi. There are at least six missing important
letters, PEH, TCHEH, JEH, KEHEH, GAF, and FARSI YEH.
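
This is easy to check mechanically. The sketch below uses Python's built-in `iso8859_6` codec and tries to encode each of the six letters; the helper name is invented for the example.

```python
import unicodedata

# PEH, TCHEH, JEH, KEHEH, GAF, FARSI YEH
PERSIAN_EXTRAS = "\u067e\u0686\u0698\u06a9\u06af\u06cc"


def missing_from(charset, text):
    """Return the Unicode names of characters in text that the
    given charset cannot encode."""
    missing = []
    for ch in text:
        try:
            ch.encode(charset)
        except UnicodeEncodeError:
            missing.append(unicodedata.name(ch))
    return missing
```

`missing_from("iso8859_6", PERSIAN_EXTRAS)` reports all six letters, while the same call with `"utf-8"` reports none.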

 Yes, you need to find Farsi fonts encoded in either 8859-6 or 
 Unicode. Any search engine can find a number of sites for you.

There's no Farsi font encoded in 8859-6 because of the stated reason.

 Bidi only matters to you if your Farsi data is sometimes mixed with 
 material in other writing systems.

Or you have alphabetic data mixed with numerical data. Bidi is needed even
for pure Persian text, since the numbers are written left-to-right.
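
The character properties behind this can be inspected directly. A small sketch using Python's `unicodedata` module (the function name is invented for the example): Arabic-script letters carry the bidirectional class AL, while digits carry EN or AN, so a line of Persian text with a number in it already mixes directions.

```python
import unicodedata


def bidi_classes(text):
    """Return the Unicode bidirectional class of each character."""
    return [unicodedata.bidirectional(ch) for ch in text]


# A Persian word followed by a number in Persian digits:
# SEEN ALEF LAM, space, digits one three seven nine.
SAMPLE = "\u0633\u0627\u0644 \u06f1\u06f3\u06f7\u06f9"
```

`bidi_classes(SAMPLE)` gives AL for the letters, WS for the space, and EN for the Persian (Extended Arabic-Indic) digits; the Arabic-Indic digits U+0660 through U+0669 are classed AN instead.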

--roozbeh





Re: FW: Oracle and Surrogate Pairs

2000-07-29 Thread Edward Cherlin

At 2:41 AM -0800 7/25/2000, [EMAIL PROTECTED] wrote:
Hi all,
 I have been developing/converting software to support multiple
languages, especially Japanese, Korean and later on French etc.
I have increased all the required fields by a factor of 3. Keeping "True,
but within a year or so, there *will* be surrogates assigned in Unicode."
in mind, what do you think the value of this "factor" should be?

I think you should upgrade to software that handles variable length 
fields, or to programming practices that make use of them.
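
To put numbers on the "factor", here is a sketch that assumes the fields are sized in bytes and the data is stored as UTF-8; the original post does not say which encoding is in use, so treat that as an assumption. The function name is invented for the example.

```python
def utf8_len(ch):
    """Number of bytes one character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


SAMPLES = {
    "LATIN CAPITAL A": "A",               # 1 byte
    "LATIN SMALL E ACUTE": "\u00e9",      # 2 bytes
    "CJK IDEOGRAPH U+65E5": "\u65e5",     # 3 bytes
    "HANGUL SYLLABLE U+D55C": "\ud55c",   # 3 bytes
    "SUPPLEMENTARY U+20000": "\U00020000" # 4 bytes
}
```

Under that assumption a factor of 3 covers every character in the BMP, but a character encoded with surrogates needs 4 bytes, which is one reason a fixed multiplier is fragile compared with variable-length fields.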

Thanks & Regards,
Samir Mehrotra,
i-flex Solutions Limited,
a CitiCorp venture capital company
at SEI-CMM level 5.
[EMAIL PROTECTED]

  -Original Message-
  From:   John H. Jenkins [SMTP:[EMAIL PROTECTED]]
  Sent:   Tuesday, July 25, 2000 8:12 AM
  To: Unicode List
  Subject:Re: Oracle and Surrogate Pairs

   Does the field in question need to support literally any possible
  character
   in Unicode 3.0 and beyond (since 3.0 does not have any surrogates
   assigned!)?
   
   

  True, but within a year or so, there *will* be surrogates assigned in
  Unicode.  One cannot be premature in supporting them at this point.


  =
  John H. Jenkins
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
  http://www.blueneptune.com/~tseng

-- 

Edward Cherlin
Generalist
"A knot!" exclaimed Alice. "Oh, do let me help to undo it."
Alice in Wonderland



Re: Display problems

2000-07-29 Thread Mark Davis

The best I've got so far is:

 Java

 To allow Java applets (and/or programs) to draw Unicode characters in
 the fonts you have available, you will need to hand-edit the
 font.properties files that the Java runtime uses. Since you may have
 several Java runtimes installed on your machine (for different
 browsers, development environments, etc), you will need to search for
 all the files containing the letters "font.properties". These files
 may also be in .jar files, depending on your configuration.

 Once you have found the files, Sun provides instructions
 [http://java.sun.com/products/jdk/1.1/docs/guide/intl/fontprop.html]
 on how to edit them to add new fonts. (This may take some patience:
 the description is not exactly straightforward.)
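
The search step could be scripted roughly as follows. This is a sketch under stated assumptions: it only locates candidate files (on disk and inside .jar archives) and does not edit anything, and the function name and the `path!member` notation for archive hits are invented for the example.

```python
import os
import zipfile


def find_font_properties(root):
    """Walk root and return paths of files whose name contains
    'font.properties', including members of .jar archives."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if "font.properties" in name:
                hits.append(full)
            elif name.endswith(".jar") and zipfile.is_zipfile(full):
                with zipfile.ZipFile(full) as jar:
                    for member in jar.namelist():
                        # Archive paths always use '/' separators.
                        if "font.properties" in member.rsplit("/", 1)[-1]:
                            hits.append(full + "!" + member)
    return hits
```

Running it over each Java runtime's installation directory gives the list of files that would need the hand edits described above.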

Edward Cherlin wrote:

 At 6:41 AM -0800 7/25/2000, Mark Davis wrote:
 The issue of how to get Java to display Unicode characters comes up
 periodically. Since the instructions on how to do it are fairly arcane
 (hand-editing the font.properties files), I'd like to see someone
 compose a short description to add to the material we have on
 http://www.unicode.org/help/display_problems.html. If there is already
 a good description in a persistent page, we could just provide a link
 to it.
 
 Any volunteers?
 
 Mark

 Since I want to display Unicode in Java, and haven't yet found out
 the gory details, I volunteer to accept all available information,
 figure out what it means, test it on Windows, Mac, and Linux (Red Hat
 and Yellow Dog), and write it up properly. If someone else comes up
 with a complete procedure, I still volunteer to test it, and to edit
 it with illustrations (screen shots, diagrams, and code). I can
 produce multilingual documents in MSWord, FrameMaker, PDF, HTML, and
 TeX.

 Does anybody know whether this process could be reduced to an
 installation utility written in Java?
 --

 Edward Cherlin
 Generalist
 "A knot!" exclaimed Alice. "Oh, do let me help to undo it."
 Alice in Wonderland




Re: Display problems

2000-07-29 Thread Katsuhiko Momoi

Hi,

Some time ago I wrote up how-to-edit font.properties file for
Communicator users. In it I illustrate how to set up multilingual
display with Cyberbit font. The file has been available from the
International Users page under Communicator's HELP menu. I hope to
update this document slightly in the near future.

http://home.netscape.com/eng/intl/jdkfontinfo.html

- Kat


-- 
Katsuhiko Momoi
Netscape International Client Products Group
[EMAIL PROTECTED]

What is expressed here is my personal opinion and does not reflect 
official Netscape views.



RE: FW: Oracle and Surrogate Pairs

2000-07-29 Thread Michael Kung

Or you can use SQL Server without upgrading your existing databases.  SQL
Server 7.0 supports two kinds of character data: CHAR for the legacy
character set and NCHAR for Unicode (UCS-2).  SQL Server is surrogate safe:
you can store surrogates in an NCHAR data column.
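
When sizing such a column it helps to count 16-bit code units rather than characters. The sketch below assumes the declared length of an NCHAR column counts 16-bit units, which is an assumption to verify against the SQL Server documentation; the function name is invented for the example.

```python
def utf16_units(text):
    """Number of 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2
```

A BMP character takes one unit and a character encoded with a surrogate pair takes two, so `utf16_units("A\U00020000")` is 3 even though the string holds two characters.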

Michael




Re: Bangla(Bengali) letter Missing

2000-07-29 Thread Abdul Malik

  My question is, should speakers of Bangla be restricted to be able to form
  only the common forms, or should there be a way for us to produce both forms
  shown? Or perhaps do you expect us (Bangladeshis) to use the assami Va?

 In the grammar book by Munir chowdhury, Mofazzal Haider Ch. and Ibrahim
 Khalil (Text book for S.S.C),  vba is omitted from the bangla character set.
 It is confusing for common people. So I think the decision is wise.

 Regards,

 Zia


Okay, I think by this you mean that for bangla (language) only the common
forms should be displayed?

If that is so, it only leaves the problem of how to display Unicode-encoded
text where the language is not known. I mean, should 'da virama ba' be
displayed as 'dba' or 'dva'? A Bangladeshi would expect 'dva' (the common
form), but an Assamiya reader would expect 'dba' or 'da virama ba' to be
displayed ('dva' being displayed only for da virama va).

Hmm. I think, then, that the default behaviour of an application should be to
render the common forms, and only display the other forms, or a visible
virama, when the language is known not to be Bangla. (But I'm biased; I
doubt that someone from Assam would agree with that.)

(Another solution would be to insist that Assami writers always insert a
ZWNJ / ZWJ before their Ba's so that we don't confuse them with Va's ;-))

That only leaves the problem of how to deal with Assamiya text quoted within
Bangla text.
Oh well, I think I'll leave that for another day.

In any case, I think distinguishing between Ba and Va is only going to be a
problem in rare circumstances.

Best regards

Abdul