Re: Euro Currency for UK

2003-10-10 Thread Philippe Verdy

- Original Message - 
From: Martin JD Green [EMAIL PROTECTED]
 This means that if I generate a new system (XP Professional) with all the
 latest updates but use UK as the standard locale and then try to switch to
 FRENCH/FRANCE I still get Francs! To get the locale to use euros I have to
 download this tool and run it while switched into the FRENCH/FRANCE
locale!

Microsoft has already provided the administrative Euro settings tool that
allows users to see the Euro in their corresponding preferred locale instead
of the legacy unit: with this tool, users can switch from one locale to the
other and have the Euro enabled/disabled.

In fact, I think that although the UK still does not use the Euro, this
is specific to the UK locale and should not apply to other locales,
even on a British localized system. So the locale defaults should be
date-based, given that, e.g., French/France is now Euro-enabled even in
the UK: on such a British system, a fr/FR user locale should show EUR
first, before FRF.

The limit exists on Windows 9x/ME where the automatic switch is not
available for standalone systems, as Win9x/ME only supports one locale at a
time (but it exists in domain-level group policies, when Win 9x/ME is used
as a workstation in a NT/2K/XP domain).

Note that this problem is absolutely not related to Unicode or to the
support of the Euro symbol in system fonts and for text codepage
conversions. If one applies a Windows-1252 update in a British system, it
should still assign and use the Euro symbol at position 0x80, no matter the
current user locale, provided that the user locale or any application uses
the Windows-1252 codepage, or the system uses this codepage to map its
ANSI codepage. And even in that case it should map to U+20AC for
conversion with Unicode (the UTF-16 transform used in all Win32 Unicode API,
or in UTF-8 and other related encoding schemes used on the web).
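The mapping described above can be checked with a minimal sketch. It assumes Python's bundled cp1252 codec (not the Windows system tables themselves), which follows the Euro-updated Windows-1252 table:

```python
# Byte 0x80 in Windows-1252 is the Euro sign, whatever the user locale is.
euro = b"\x80".decode("cp1252")

# The same character in two Unicode encoding forms.
utf16 = euro.encode("utf-16-le")   # UTF-16, little-endian, no BOM
utf8 = euro.encode("utf-8")

print(f"U+{ord(euro):04X}", utf16.hex(), utf8.hex())
```

Nothing in this conversion consults a locale: the code page table alone decides the result.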

This assignment is not necessary for the DOS/OEM codepage which is most
often the CP-850 in Western Europe and does not contain the Euro character
(that cannot be seen in a command-line window, and will be displayed as a
? if entered on the keyboard).

IBM (not Microsoft) has defined some variants of the traditional DOS/OEM
codepages to allow display and input of the Euro symbol and character. They
are not used on Windows, except in text converter tools for NT/2K/XP (the
c_*.nls system files) such as in MIME-compliant applications like Internet
Explorer, Outlook, Outlook Express, or Office programs like Word that need
to convert correctly all texts (entered in non-Windows systems) to a Windows
codepage or to Unicode.
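As an illustration of the OEM code page point, here is a small sketch using Python's bundled codecs. CP858 is named here as an assumption: it is, as far as I know, one of the IBM variants meant above (CP850 with the Euro replacing one character), but the message does not name it.

```python
euro = "\u20ac"

# CP850 has no Euro, so a lossy conversion falls back to "?".
oem_850 = euro.encode("cp850", errors="replace")

# CP858 is a CP850 variant that does carry the Euro.
oem_858 = euro.encode("cp858")

print(oem_850, oem_858)
```

The "?" fallback matches what the message describes for a command-line window using CP850.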




Re: Euro Currency for UK

2003-10-09 Thread Peter Kirk
On 08/10/2003 16:52, Jain, Pankaj (MED, TCS) wrote:

Hi,
I have a requirement to display the Euro currency symbol for the en_GB 
locale. I know that if we use en_GB as CurrencLocale, then it defaults 
to Pound. Is there any way I can set it to Euro?
 
Thanks
Pankaj
Our default currency in the UK is still the pound sterling. It will take 
more than you changing some settings to change it to the Euro! :-)

The Euro symbol is available, and should be displayed correctly if you 
have a suitable font, in CP1252 and ISO-8859-1 which are the usual 
legacy encodings used in the UK - and of course in Unicode. I assume you 
are not using a system from before about 1998 when the Euro was added to 
systems and fonts. Anything beyond that depends on what system you are 
referring to, and so is probably not really a matter for this list.

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/




Re: Euro Currency for UK

2003-10-09 Thread jon
 The Euro symbol is available, and should be displayed correctly if you 
 have a suitable font, in CP1252 and ISO-8859-1 which are the usual 
 legacy encodings used in the UK - and of course in Unicode.

The Euro symbol is not in ISO 8859-1, it is however in ISO 8859-15 and ISO 8859-16. It 
was added to CP1252 after the initial specification of CP1252 and hence some systems 
may not render it correctly (especially since the update may have seemed a pointless 
install to some outside of the jurisdictions in which the Euro is legal tender).
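A quick way to see the difference between these character sets is to try encoding the Euro in each. This is only a sketch against Python's bundled codecs; `euro_byte` is a helper name made up for the example.

```python
def euro_byte(codec):
    """Return the encoded Euro sign for a codec, or None if it has no Euro."""
    try:
        return "\u20ac".encode(codec)
    except UnicodeEncodeError:
        return None

for codec in ("iso8859-1", "iso8859-15", "iso8859-16", "cp1252"):
    print(codec, euro_byte(codec))
```

ISO 8859-1 reports no mapping, while the other three each give a single byte.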

I think the question though is how to get some particular locale system to use that 
symbol as the default currency character.







Re: Euro Currency for UK

2003-10-09 Thread Stefan Persson
[EMAIL PROTECTED] wrote:

The Euro symbol is not in ISO 8859-1, it is however in ISO 8859-15 and ISO 8859-16. It was added to CP1252 after the initial specification of CP1252 and hence some systems may not render it correctly (especially since the update may have seemed a pointless install to some outside of the jurisdictions in which the Euro is legal tender).
Isn't Euro support added to all CP1252 versions of Windows 98 and later, 
and in Windows 95 if people manually visit some Microsoft web page and 
download an update for this?  My copy of iconv for Linux supports € in 
CP1252, and all of my other CP1252-compatible programs (e.g. Mozilla) 
also seem to support it.

Stefan




Re: Euro Currency for UK

2003-10-09 Thread Addison Phillips [wM]
Hmm.. this isn't really a Unicode question. You might want to post this 
question over on the i18n programming list '[EMAIL PROTECTED]' 
or on the locales list at '[EMAIL PROTECTED]'.

You don't say what your programming or operating environments are. There 
are two possibilities here.

If you want to use your existing software to display currencies as the 
Euro instead of pounds, you can generally either set the display 
settings (Windows Regional options control panel) for currency to look 
like the Euro. Or you can set (on Unix systems) the LC_MONETARY locale 
variable to some locale that uses the Euro with English-like formatting. 
A few systems actually provide a specialized variant locale for 
[EMAIL PROTECTED] for this purpose. A few provide an [EMAIL PROTECTED], which won't be 
helpful to you because of differences in the separators used in the two 
locales.

You can also compile your own locale tables on Unix. Read the man pages 
on locale.

If you are writing your own software, then it really isn't that hard. 
Some programming environments, such as Java, provide a separate 
Currency class with the ability to create specific display-time formats 
that take both the currency and the display locale into account. Others 
require you to create a formatter to convert the value into a string for 
display.

In fact, when working with currency it is important to associate which 
currency you mean with the value. You may experience problems if you 
create a data field for value and format it according to the machine's 
runtime locale. The runtime locale can imply a certain default currency, 
as you note, but default does not mean only. Consider:

<value>123.45</value>

Not right:

en_GB: £123.45
en_US: $123.45
de_DE: 123,45 €
ja_JP: ¥123
Most commonly the ISO4217 currency code is associated with a value to 
create a data structure that is specific:

<value>
  <amount>123.45</amount>
  <currency>EUR</currency>
</value>
en_GB: €123.45
en_US: €123.45
de_DE: 123,45 €
ja_JP: €123.45
Getting the formatting right is a matter of accessing the formatting 
functions of your programming API correctly. Most programming 
environments provide a way to format a value using separate locale rules 
(for grouping and decimal separators) and currency.
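To make the idea concrete, here is a minimal sketch of keeping the ISO 4217 code with the amount and formatting it under separate locale rules. The `Money` class, `format_money`, and both lookup tables are hypothetical and hand-written for illustration; real code would take the separators and symbol placement from a locale library rather than from a table like this, and would also handle grouping and per-currency decimal digits.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical, simplified rules: decimal separator and symbol placement only.
LOCALE_RULES = {
    "en_GB": {"decimal": ".", "symbol_first": True},
    "en_US": {"decimal": ".", "symbol_first": True},
    "de_DE": {"decimal": ",", "symbol_first": False},
}
CURRENCY_SYMBOLS = {"EUR": "\u20ac", "GBP": "\u00a3", "USD": "$"}


@dataclass(frozen=True)
class Money:
    amount: Decimal
    currency: str  # ISO 4217 code, stored with the value


def format_money(value, locale_name):
    """Format using the locale's separators but the value's own currency."""
    rules = LOCALE_RULES[locale_name]
    symbol = CURRENCY_SYMBOLS.get(value.currency, value.currency)
    digits = f"{value.amount:.2f}".replace(".", rules["decimal"])
    if rules["symbol_first"]:
        return symbol + digits
    return digits + " " + symbol


price = Money(Decimal("123.45"), "EUR")
for name in LOCALE_RULES:
    print(name, format_money(price, name))
```

The currency never changes with the display locale here; only the separators and the symbol position do.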

More information about what you're trying to do would help in 
recommending a solution.

Best Regards,

Addison

--
Addison P. Phillips
Director, Globalization Architecture
webMethods, Inc.
+1 408.962.5487  mailto:[EMAIL PROTECTED]
---
Internationalization is an architecture. It is not a feature.
Chair, W3C I18N WG Web Services Task Force
http://www.w3.org/International/ws




Re: Euro Currency for UK

2003-10-09 Thread jon
 Isn't Euro support added to all CP1252 versions of Windows 98 and later, 
 and in Windows 95 if people manually visit some Microsoft web page and 
 download an update for this?

Yes (well, I'm not sure of the exact versions, but that's a minor matter). At this 
point most people who would have needed to update have done, but it's possible that 
users in countries that don't use the Euro haven't done so. Given that we are talking 
about the use of the symbol with a locale that is otherwise focused on people in 
Britain it's worth considering.







Re: Euro Currency for UK

2003-10-09 Thread Martin JD Green

- Original Message - 
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Thursday, October 09, 2003 5:20 PM
Subject: Re: Euro Currency for UK


  Isn't Euro support added to all CP1252 versions of Windows 98 and later,
  and in Windows 95 if people manually visit some Microsoft web page and
  download an update for this?

 Yes (well, I'm not sure of the exact versions, but that's a minor matter).
At this point most people who would have needed to update have done, but
it's possible that users in countries that don't use the Euro haven't done
so. Given that we are talking about the use of the symbol with a locale that
is otherwise focused on people in Britain it's worth considering.


The euro character was added to CP1252 back in 1999 and most systems have
the character. However, the locales which should be using the euro were not
updated and no replacement locales for Windows are directly available from
Microsoft. They do have a tool available to add the euro as the default
currency symbol to those locales which need it but that tool ONLY works if
you have that locale as the default locale.

This means that if I generate a new system (XP Professional) with all the
latest updates but use UK as the standard locale and then try to switch to
FRENCH/FRANCE I still get Francs! To get the locale to use euros I have to
download this tool and run it while switched into the FRENCH/FRANCE locale!

I'm not sure why you want to set the euro as the standard currency for UK as
(at present) we have not switched to that currency!?

Martin Green





Re: Euro Currency for UK

2003-10-09 Thread Markus Scherer
I think Addison is on the right track here.

I would like to point to ICU sample code for this kind of thing: 
http://oss.software.ibm.com/cvs/icu/~checkout~/icu/source/samples/numfmt/main.cpp

See the code there from setNumberFormatCurrency_2_6 on down (the preceding code is for older ICU 
versions and general number formatting API usage).

ICU homepage: http://oss.software.ibm.com/icu/

Best regards,
markus



Re: Euro Currency for UK

2003-10-08 Thread Chris Jacobs



U+20AC should display as € EURO SIGN, _regardless_ of what the 
locale is.
If it does not then your system is broken.

If it does, but your keyboard layout does not have a key for 
u+20AC then get e.g. UniPad at www.unipad.org

  - Original Message - 
  From: 
  Jain, 
  Pankaj (MED, TCS) 
  To: '[EMAIL PROTECTED]' 
  Sent: Thursday, October 09, 2003 1:52 
  AM
  Subject: Euro Currency for UK
  
  Hi,
   
  I have a requirement to display the Euro currency symbol for the en_GB locale. I know 
  that if we use en_GB as CurrencLocale, then it defaults to Pound. Is there any 
  way I can set it to Euro?
  
  Thanks
  Pankaj


RE: Euro in Windows double-byte code pages

2002-03-27 Thread Cathy Wissink

932 does not have a Euro in it, but all other Windows DBCS (936, 950, 949) codepages 
do.  

For more information, please see 
http://www.microsoft.com/globaldev/DrIntl/017/default.asp#q3 as well as 
http://www.microsoft.com/globaldev/articles/euro.asp

Cathy

-Original Message-
From: Doug Ewell [mailto:[EMAIL PROTECTED]]

Ken Krugler [EMAIL PROTECTED] is wondering:

 I'm wondering if anybody (say, for example, Kenneth) has information
 on the addition of the Euro to the various Windows double-byte code
 pages (CP950, CP932, CP936, CP949). In particular, was the Euro
 added to CP932 for Windows 2000-J (at location 0x80), or is this just
 a wild rumor?

Unicode maintains a set of Windows code page mapping tables at
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/.  (There
are lots of non-Windows tables too; start from MAPPINGS in the above
URL.)

Oddly, of the four Windows DBCS character sets, only CP932 has not been
updated since 1998-04-15, and that table does not have a mapping for
U+20AC EURO SIGN.  Code position 0x80 is undefined, however, so MS may
have added it there.

The other three tables are all dated 2000-01-07, and show the EURO SIGN
at the following positions:

936    0x80
949    0xA2E6
950    0xA3E1

Hope this helps,

-Doug Ewell
 Fullerton, California







Re: Euro in Windows double-byte code pages

2002-03-26 Thread Doug Ewell

Ken Krugler [EMAIL PROTECTED] is wondering:

 I'm wondering if anybody (say, for example, Kenneth) has information
 on the addition of the Euro to the various Windows double-byte code
 pages (CP950, CP932, CP936, CP949). In particular, was the Euro
 added to CP932 for Windows 2000-J (at location 0x80), or is this just
 a wild rumor?

Unicode maintains a set of Windows code page mapping tables at
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/.  (There
are lots of non-Windows tables too; start from MAPPINGS in the above
URL.)

Oddly, of the four Windows DBCS character sets, only CP932 has not been
updated since 1998-04-15, and that table does not have a mapping for
U+20AC EURO SIGN.  Code position 0x80 is undefined, however, so MS may
have added it there.

The other three tables are all dated 2000-01-07, and show the EURO SIGN
at the following positions:

936    0x80
949    0xA2E6
950    0xA3E1

Hope this helps,

-Doug Ewell
 Fullerton, California






Re: Euro in Windows double-byte code pages

2002-03-26 Thread Jungshik Shin

On Tue, 26 Mar 2002, Ken Krugler wrote:

 I'm wondering if anybody (say, for example, Kenneth) has information 
 on the addition of the Euro to the various Windows double-byte code 
 pages (CP950, CP932, CP936, CP949). In particular, was the Euro 

  I'm not sure about CP950/932/936, but CP949 (hmm, even for this one
I don't have a definitive answer) may have it added at 0xA2E6, because KS
X 1001 was revised in Dec. 1998 to add two characters: Euro Sign (U+20AC)
at row 2, column 70 (0xA2E6 when put in GR) and Registered Sign (U+00AE)
at row 2, column 71 (0xA2E7 when put in GR).

 added to CP932 for Windows 2000-J (at location 0x80), or is this just 
 a wild rumor?

 Putting it at 0x80 could break some existing programs that do
their own encoding conversion because 0x80 has its MSB set, couldn't it?
Anyway, I think this can be 'experimentally' :-) determined. After
updating your Win2k with the latest update, write a simple program to
invoke WideCharToMultiByte(??) with U+20AC under codepages of your concern
(932,936,949, 950) and see what it returns.
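The same kind of experiment can be sketched without Win32. The snippet below asks Python's bundled codecs, which is an assumption on my part: those tables are not the Windows system tables, so a result here does not prove what a given Windows version does. `probe_euro` is a name invented for the example.

```python
def probe_euro(codec_names):
    """Map each codec name to its encoding of U+20AC, or None if unmapped."""
    results = {}
    for name in codec_names:
        try:
            results[name] = "\u20ac".encode(name)
        except UnicodeEncodeError:
            results[name] = None
    return results


for name, encoded in probe_euro(["cp932", "cp936", "cp949", "cp950"]).items():
    print(name, encoded.hex() if encoded else "no mapping")
```

On an actual Windows system, calling the Win32 conversion routine mentioned above remains the authoritative test.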

  Jungshik Shin





Re: Euro in Email headers, misinformation?

2001-12-26 Thread Richard Čepas

On Wed Dec 26 00:17:53 2001 +0330 Roozbeh Pournader wrote:


On Tue, 25 Dec 2001, Dr. International wrote:

 In fact, I didn't know about RFC 2047. Unfortunately my contacts in the
 Exchange team did not mention that this might be the reason for the lost
 Euro in the subject line. A quick search on the internet leads me to
 believe that mail servers, which implement RFC 2047 will keep the Euro
 (for that matter any character) in the subject intact, if the client can
 understand it. Is that right? I have to admit that mail servers are not
 one of my areas of expertise.

Sorry for getting angry, and thanks for the clarification.

But to get to the problem, no, it's not related to mail servers as far as
I know. Mail servers don't need to implement RFC 2047, they should support
it automatically, since it's ASCII compatible. It's the mail readers that
should support it. So this is a realm for Outlook and Outlook Express, and
not Exchange. (I do not know how these two implement RFC 2047, I have not
tested them thoroughly yet.)

But to get back to what has happened with some of your failed tests, I
guess the problem is 8-bit headers. Some mail composers send the subject
line as 8-bit, without doing any RFC 2047 conversion (Outlook/Outlook
Express may be among them, I don't know). Then some mail servers may strip
any eighth bit, remaining conformant to RFC 2822.

...

Plain 8-bit headers are not the default Outlook/Outlook Express setting for
mail. But people tend to use them, since Outlook/OE (unfortunately) allows
this, without knowing the consequences.  I have heard rumours that Exchange
Server may just discard a message if it finds these non-standard 8-bit bytes
in the From: header.





Re: Euro in Email headers, misinformation?

2001-12-26 Thread Michael \(michka\) Kaplan

From: Richard ÄOepas [EMAIL PROTECTED]

 Plain 8bit headers are not default Outlook/Outlook setting
 for mail. But people tend to use it as Outlook/OE allows
 this (unfortunately) without knowing consequences.

Piece of MS Misinformation (tm) #1: Outlook Express usually happens to work,
*except* in the list of e-mails. It is only Outlook that never seems to work
here (i.e. looks fine when you are about to send but looks bad once sent, in
all views).

 I heard rumours that Exchange server may just discard a
 message if it founds these non-standard eight bit bytes in
 From: header.

Piece of MS Misinformation (tm) #2: Messages are not discarded by Exchange
Server in the From:, To:, CC:, BCC:, or Subject lines for the awful crime of
non-ASCII headers. You might just see question marks on the other end,
though.

We could argue over which behavior is worse (always converting the header
using the clients and the server code page vs. just sending the bytes on and
mis-displaying them in the list of e-mails but having them look fine when
you open the e-mail).

But let's not make the situation worse than it actually is -- MS bashing is
much more effective when it's accurate, don't you think? :-)


MichKa

Michael Kaplan
Trigeminal Software, Inc.  -- http://www.trigeminal.com/





Re: Euro in Email headers, misinformation?

2001-12-26 Thread Michael \(michka\) Kaplan

From: Roozbeh Pournader [EMAIL PROTECTED]
To: Michael (michka) Kaplan [EMAIL PROTECTED]

 And there is MS basher bashing (TM), which is a rare art practiced by
 Michael ;-)

Hey, I often get into trouble for actually bashing myself (i.e. biting the
hand that feeds me since I do contract work for them? g). I just prefer
accurate bashing, since:

(a) truth is always more interesting;
(b) perhaps they might fix a well-described problem;
(c) they are more likely to listen to you if you are not the one who wasted
their time with non-existent issues the day before;
(d) all of the *really* smart folks at MS can see past this stuff if they
know these are real bugs and that there is no evil intent.

I would do the same with IBM, Apple, Oracle, et al., if I knew more about
their problems.

;-)


MichKa

Michael Kaplan
Trigeminal Software, Inc.  -- http://www.trigeminal.com/





Re: Euro in Email headers, misinformation?

2001-12-26 Thread Roozbeh Pournader

On Wed, 26 Dec 2001, Michael (michka) Kaplan wrote:

 But let's not make the situation worse than it actually is -- MS bashing is
 much more effective when it's accurate, don't you think? :-)

And there is MS basher bashing (TM), which is a rare art practiced by 
Michael ;-)

BTW, I wish to restate again that I was not trying to bash MS when I sent
the original email. I am very happy with MS as one of the biggest
evangelizers of Unicode. I simply got angry, thinking it was intentional.
And it is clear for me now that there was no mischief.

roozbeh





Re: Euro in Email headers, misinformation?

2001-12-26 Thread James Kass


On this system, responding to the letter generates:

  From: Richard ÄŒepas [EMAIL PROTECTED]

Michael Kaplan's response arrived as:

 From: Richard ÄOepas [EMAIL PROTECTED]

In the 'folder e-mail list window', the name appears as
'Richard Cepas'.

Perhaps the correct name is Richard Čepas?

When Αλέξανδρος Διαμαντίδης writes, the text of the e-mail
displays his name just fine, but it appears in the 'folder e-mail
list window' as a string of diacriticized vowels and in the
address book as 'a?d??? ??aµa?t?d??'

Same kind of thing happens with てんどうりゅうじ while in the case
of Séamas Ó Brógáin there doesn't seem to be a problem.

These automatic conversions performed by the system are
undesirable and unnecessary.  Changing a Č into a C makes
the name wrong.

This system (Win M.E., M.S.O.E.) apparently uses Unicode as the
actual encoding of the name strings stored in the address book.
The application should be expected to convert incoming messages
from the old code pages into Unicode at the time the address is
added to the list.  The display of addresses and lists of e-mail
messages should also use Unicode as default in order for everything
to display correctly.

For Richard's last name, what has been stored in the address book
(*.WAB) here is U+00C4, U+0152, U+0065, U+0070, U+0061, U+0073.
(It's stored in Little Endian.)
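One reading consistent with these code points (an assumption, since the message does not say how the name was mangled) is that the UTF-8 bytes of "Čepas" were interpreted as Windows-1252. A minimal sketch:

```python
name = "\u010cepas"                     # "Čepas", with C WITH CARON

utf8_bytes = name.encode("utf-8")       # the first letter becomes C4 8C
misread = utf8_bytes.decode("cp1252")   # each byte read as one CP1252 char

code_points = [f"U+{ord(ch):04X}" for ch in misread]
print(misread, code_points)

# The damage is reversible as long as no byte was dropped along the way.
repaired = misread.encode("cp1252").decode("utf-8")
print(repaired)
```

The resulting code points match the list stored in the address book.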

Best regards,

James Kass.










RE: Euro in Email headers, misinformation?

2001-12-25 Thread Roozbeh Pournader


On Tue, 25 Dec 2001, Dr. International wrote:

 In fact, I didn't know about RFC 2047. Unfortunately my contacts in the
 Exchange team did not mention that this might be the reason for the lost
 Euro in the subject line. A quick search on the internet leads me to
 believe that mail servers, which implement RFC 2047 will keep the Euro
 (for that matter any character) in the subject intact, if the client can
 understand it. Is that right? I have to admit that mail servers are not
 one of my areas of expertise.

Sorry for getting angry, and thanks for the clarification.

But to get to the problem, no, it's not related to mail servers as far as
I know. Mail servers don't need to implement RFC 2047, they should support
it automatically, since it's ASCII compatible. It's the mail readers that
should support it. So this is a realm for Outlook and Outlook Express, and
not Exchange. (I do not know how these two implement RFC 2047, I have not
tested them thoroughly yet.)

But to get back to what has happened with some of your failed tests, I
guess the problem is 8-bit headers. Some mail composers send the subject
line as 8-bit, without doing any RFC 2047 conversion (Outlook/Outlook
Express may be among them, I don't know). Then some mail servers may strip
any eighth bit, remaining conformant to RFC 2822.

The other possible source of problem is the mail reader not respecting RFC
2047, showing the subject line as gibberish.
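For reference, an RFC 2047 encoded subject can be produced and read back with Python's standard email package. This is only a sketch of what a conforming composer and reader do; the sample subject text is made up.

```python
from email.header import Header, decode_header, make_header

subject = "Price: 10 \u20ac"

# Composer side: wrap the non-ASCII text in an RFC 2047 encoded-word.
encoded = Header(subject, charset="utf-8").encode()

# Reader side: decode the encoded-word back into text.
decoded = str(make_header(decode_header(encoded)))

print(encoded)
print(decoded)
```

The encoded form is pure ASCII, which is why servers in between can pass it along untouched, and only the mail reader needs to understand it.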

It would be great if you update the column. I am available for any kind of 
help I may be able to give.

best,
roozbeh





Re: Euro sign in HTML

2001-07-16 Thread Misha . Wolf

I think the decimal Numeric Character Reference &#8364; 
would work in more browsers.
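The decimal reference, the hexadecimal reference and the named entity all denote the same character, which a short sketch with Python's standard html module can confirm (the sample strings are made up):

```python
import html

decimal_ref = html.unescape("10,23 &#8364;")   # decimal reference
hex_ref = html.unescape("10,23 &#x20AC;")      # hexadecimal reference
named_ref = html.unescape("10,23 &euro;")      # HTML 4.0 named entity

print(decimal_ref, hex_ref, named_ref, 0x20AC)
```

The difference in practice is only in which forms an older browser happens to recognize.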

Misha


On 16/07/2001 15:09:52 unicode-bounce wrote:
 Hello,
 
 further to the concerns expressed in the Eudora thread,
 I'd like to point to an easy solution: exploit the euro
 entity defined in the last line of
 http://www.w3.org/TR/REC-html40/sgml/entities.html
 which was already defined in HTML 4.0 as of 1998-04-24.
 
 Every decent browser should handle &euro; legibly, even
 if no Euro symbol is available in the fonts at hand. E. g.,
 Netscape communicator 4.7 displays 10,23 EUR, in my Irix
 system, where Netscape Communicator 4.75 has 10,23 €,
 on my PC. (Note that this message is tagged as ISO 8859-15
 and the previous sentence contains a true Euro symbol).
 
 Even 10,23 &euro;, as displayed by pre-4.0 browsers, is
 legible to some extent.
 
 For an example of this technique, go to
 http://www.rz.uni-konstanz.de/Antivirus/TVD/index.html#B-Lizenz;
 you may wish to try this with your own browser.
 
 Best wishes,
 Otto Stolz
 
 
 






Re: Euro

2000-08-09 Thread Antoine Leca

Asmus Freytag wrote:
 
 The problem with the commission design of the euro glyph is that it only
 works as long as you use their aspect ratio and uniform stroke width. As
 long as you have these, the eye will complete them to a lower case 'e' form
 and you will see an 'e'uro. As soon as you attempt to actually match the
 type face you are working with, you end up with a glyph that's taller than
 wide, has different stroke width (usually thinner) for the cross bars
 and/or variable stroke width and serifs for the main loop.

I disagree. That may be true for transitional types (« réales » in French),
like Times, and more broadly all the highly contrasted types (think about
the euro glyph in didones in general and Bodoni in particular...)

But on the other hand, the EU Commission design fits moderately well as is
with all the antiques (sans serif). Here in Europe we have a number of
examples, particularly ads, which show the official glyph along with
some digits in some sans serif font. The real problem is that a bold
version is officially prohibited (AFAIK), and the oblique version has to
be the same as the roman one (this is only a little problem, but it
confuses things).

In fact, I can even think about a réale version of the glyph with:
 - no serif, or very light ones
 - a lighter contrast, perhaps only half of the one for C; perhaps
  taking Optima's C as model
 - crossbars of the same weight as the main arms, and oblique ends

And the result may fit well with Times... but only for the thin or
light faces; the regular face and bolder will not leave sufficient
space to draw both crossbars (that's the main defect of the official
glyph, IMHO); and having only one crossbar (as it happens sometimes
for £, ¥ or $) is not at all recognized, at least for the moment
(wait for laymen to write the symbol by hand: trust me, that is
quite difficult to respect the proportions...)


Another point: for the moment, the euro glyph is made as capital
height (or perhaps sometimes figure height), not x-height; perhaps
this has a connection with the fact that most people, and among them
the (English-based) typographers, expect "Euro" rather than "euro"
for the name of the currency, even in Romance languages. As a result,
it is a bit more difficult to 'see' the 'e': Dr. Freytag himself uses
'Curo' instead of 'curo'!


 The eye immediately responds and sees an adorned capital 'C', so
 you get the 'C'uro, whether you want to or not.

I believe, because all designs to date, for obvious reasons, were
adaptations of this very capital 'C'...

 
 Familiarity with use will 'train' enough people to accept the 'C'uro for a
 'e'uro, so their knowledge of the full name will override their visual
 processing long enough that they will see the 'e' shape in the euro. I
 suspect that process is in full swing already, since the various
 typographers' attempt to design euro glyphs for printers, PC's, bill
 printing and sign making have already been going on for over a year now.

I agree with your analysis, however current usage is more limited to
academic uses (like govt announcements), and therefore we see a lot of the
official glyph, and less of the adorned C's.


Antoine



RE: Euro

2000-08-07 Thread Marco . Cimarosti

Asmus Freytag wrote:
 The problem with the commission design of the euro glyph is 
 that it only 
 works as long as you use their aspect ratio and uniform 
 stroke width. As 
 long as you have these, the eye will complete them to a lower 
 case 'e' form [...]

Visual perception is indeed a funny thing!

I never realized before this similarity between the euro sign and a
lowercase "e". But as soon as I read your words I started visualizing it.

I had always seen it as an ugly uppercase "E" with a certain "Cyrillic"
flavor (think of U+0404 or U+042D).

I think that this "Cyrillicity" in the euro logo is perceived by many other
people, and this probably contributed to the popular joke that the "EUR"
currency Id means "EUropean Ruble".

_ Marco





Re: Euro

2000-08-07 Thread Doug Ewell

Asmus Freytag [EMAIL PROTECTED] wrote:

 Familiarity with use will 'train' enough people to accept the 'C'uro
 for a 'e'uro, so their knowledge of the full name will override their
 visual processing long enough that they will see the 'e' shape in the
 euro.

There is no 'S' in 'dollar' nor 'L' in 'pound,' yet the symbols '$' and
'£' are readily accepted in North America and the British Isles.  The
origins of the symbols and words are something we learn after the fact.
First we learn the symbols and the words, we notice that there is no
obvious connection, then we shrug our shoulders and accept it.

The same thing will likely happen with the euro, especially as most
people (who are not experts on characters and typography like the
members of this list) will learn the euro symbol as a separate entity
and will not be bothered for long by any similarities between it and
ordinary letters such as 'C' or 'E'.

-Doug Ewell
 Fullerton, California



Re: Euro

2000-08-06 Thread Asmus Freytag

At 07:46 PM 7/30/00 -0800, John Cowan wrote:
  Yeah, how WOULD you make a serifed, rounded E that
  doesn't look silly and doesn't look like a C with
  an extra line? Well, maybe you can, I dunno. Anyone
  who can do that, I'd like to see it.

http://www.egt.ie/standards/iso10646/euro/euroglyph.html

The problem with the commission design of the euro glyph is that it only 
works as long as you use their aspect ratio and uniform stroke width. As 
long as you have these, the eye will complete them to a lower case 'e' form 
and you will see an 'e'uro. As soon as you attempt to actually match the 
type face you are working with, you end up with a glyph that's taller than 
wide, has different stroke width (usually thinner) for the cross bars 
and/or variable stroke width and serifs for the main loop. The eye 
immediately responds and sees an adorned capital 'C', so you get the 
'C'uro, whether you want to or not.

Familiarity with use will 'train' enough people to accept the 'C'uro for a 
'e'uro, so their knowledge of the full name will override their visual 
processing long enough that they will see the 'e' shape in the euro. I 
suspect that process is in full swing already, since the various 
typographers' attempt to design euro glyphs for printers, PC's, bill 
printing and sign making have already been going on for over a year now.

A./



Re: Euro

2000-07-31 Thread Roozbeh Pournader



On Sun, 30 Jul 2000, John Cowan wrote:

 http://www.egt.ie/standards/iso10646/euro/euroglyph.html

So you're taking it as a "C"?




Re: Euro

2000-07-31 Thread John Cowan

Roozbeh Pournader wrote:
 
 On Sun, 30 Jul 2000, John Cowan wrote:
 
  http://www.egt.ie/standards/iso10646/euro/euroglyph.html
 
  So you're taking it as a "C"?

I am realizing that people think the "I" of this page is me!  It is not;
I am not a font designer.  Send kudos or criticism to
Michael Everson [EMAIL PROTECTED].

-- 

Schlingt dreifach einen Kreis um dies! || John Cowan [EMAIL PROTECTED]
Schliesst euer Aug vor heiliger Schau,  || http://www.reutershealth.com
Denn er genoss vom Honig-Tau,   || http://www.ccil.org/~cowan
Und trank die Milch vom Paradies.-- Coleridge (tr. Politzer)



Re: Euro

2000-07-31 Thread Roozbeh Pournader



On Mon, 31 Jul 2000, John Cowan wrote:

 I am realizing that people think the "I" of this page is me!  It is not;
 I am not a font designer.  Send kudos or criticism to
 Michael Everson [EMAIL PROTECTED].

Sorry, I didn't even look at the sender's name! I thought it should have
been Michael himself. It's the trained eye, you know. ;)

--roozbeh




Re: Euro

2000-07-29 Thread Asmus Freytag

At 12:13 PM 7/28/00 -0800, Roozbeh Pournader wrote:
I was not talking about the shape. I think all of us have seen it, and
many have also read the documents which define its exact shape using a
ruler and a compass. I was talking about the origin of the shape.

In some sense, except for purists, this discussion is rapidly becoming 
moot. The 'euro glyphs' have been out in the wild, on shop displays, in 
newsprint etc. for well over a year now.

If you will, the 'common man's' idea of what a proper Euro glyph is, is 
fast becoming influenced by what he sees on a daily basis, not by the 
origin of the glyph or by the logo (which is prescribed only for its 
appearance on the currency itself).

Given the name, I'm sure even the 'non-European' font designers that Werner 
likes to blame aren't suggesting that the logo for the 'e'uro is based on a 
'c'. However, when you try to put the thing together with the serifs used 
in many of the common type faces, the result can indeed look a bit like a 
'c'. This seems particularly true for monospaced fonts.

A./




Re: Euro

2000-07-29 Thread 11digitboy

Yeah, how WOULD you make a serifed, rounded E that
doesn't look silly and doesn't look like a C with
an extra line? Well, maybe you can, I dunno. Anyone
who can do that, I'd like to see it. 

--
Robert Lozyniak
Accusplit pedometer manufactures can go suck eggs
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 Asmus Freytag [EMAIL PROTECTED] wrote:
 At 12:13 PM 7/28/00 -0800, Roozbeh Pournader wrote:
 I was not talking about the shape. I think all
 of us have seen it, and
 many have also read the documents which define
 its exact shape using a
 ruler and a compass. I was talking about the origin
 of the shape.
 
 In some sense, except for purists, this discussion
 is rapidly becoming 
 moot. The 'euro glyphs' have been out in the wild,
 on shop displays, in 
 newsprint etc. for well over a year now.
 
 If you will, the 'common man's' idea of what a
 proper Euro glyph is, is 
 fast becoming influenced by what he sees on a daily
 basis, not by the 
 origin of the glyph or by the logo (which is prescribed
 only for its 
 appearance on the currency itself).
 
 Given the name, I'm sure even the 'non-European'
 font designers that Werner 
 likes to blame aren't suggesting that the logo
 for the 'e'uro is based on a 
 'c'. However, when you try to put the thing together
 with the serifs used 
 in many of the common type faces, the result can
 indeed look a bit like a 
 'c'. This seems particularly true for monospaced
 fonts.
 
 A./
 
 





Re: Euro

2000-07-29 Thread Roozbeh Pournader


I found it! Everybody's invited to take a look at:
http://www.tug.org/TUGboat/Articles/tb19-2/tb59inn.pdf

On Sat, 29 Jul 2000, Asmus Freytag wrote:

 If you will, the 'common man's' idea of what a proper Euro glyph is, is 
 fast becoming influenced by what he sees on a daily basis, not by the 
 origin of the glyph or by the logo (which is prescribed only for its 
 appearance on the currency itself).

Ok, but I only want to know about the historical origins.

 Given the name, I'm sure even the 'non-European' font designers that Werner 
 likes to blame aren't suggesting that the logo for the 'e'uro is based on a 
 'c'. However, when you try to put the thing together with the serifs used 
 in many of the common type faces, the result can indeed look a bit like a 
 'c'. This seems particularly true for monospaced fonts.

Take a look at the referenced article.




Re: Euro

2000-07-28 Thread Roozbeh Pournader



On Wed, 26 Jul 2000, Werner LEMBERG wrote:

 [...] the origin of the Euro glyph is a Greek small epsilon.

Any reference for this? I once heard that this is a curved E.
It was Peter Flynn in TUGboat I think...





Re: Euro

2000-07-28 Thread 11digitboy

Maybe this is how to settle the Euro glyph thing:
Instead of talking about it, write the Euro glyph
on a piece of paper with a pen, scan it in, and post
it. If we STILL can't agree on what it looks like,
very politely request that one of the people in CHARGE
of this Euro thing please write the Euro glyph with
a pen, and then scan in what she wrote and post it.
Historically, aren't typeface glyphs based on handwritten
glyphs? Why should some typeface designer decide
what the glyph should look like?!

--
Robert Lozyniak
Accusplit pedometer manufactures can go suck eggs
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 Roozbeh Pournader [EMAIL PROTECTED]
wrote:
 
 
 On Wed, 26 Jul 2000, Werner LEMBERG wrote:
 
  [...] the origin of the Euro glyph is a Greek
 small epsilon.
 
 Any reference for this? I once heard that this
 is a curved E.
 It was Peter Flynn in TUGboat I think...
 
 
 





Re: C1 controls and terminals (was: Re: Euro character in ISO)

2000-07-13 Thread Erik van der Poel

Frank da Cruz wrote:
 
 Doug Ewell wrote:
  
  That last paragraph echoes what Frank said about "reversing the layers,"
  performing the UTF-8 conversion first and then looking for escape
  sequences.  True UTF-8 support, in terminal emulators and in other
  software as well, really should depend on UTF-8 conversion being
  performed first.
 
 The irony is, when using ISO 2022 character-set designation and invocation,
 you have to handle the escape sequences first to know if you're in UTF-8.
 Therefore, this pushes the burden onto the end-user to preconfigure their
 emulator for UTF-8 if that is what is being used, when ideally this should
 happen automatically and transparently.

I may be misunderstanding the above, but ISO 2022 says:

  ESC 2/5 F shall mean that the other coding system uses
  ESC 2/5 4/0 to return;

  ESC 2/5 2/15 F shall mean that the other coding system
  does not use ESC 2/5 4/0 to return (it may have an alternative
  means to return or none at all).

Registration number 196 is for UTF-8 without implementation level, and
its escape sequence is ESC 2/5 4/7. I believe that ISO 2022 was designed
that way so that a decoder that does not know UTF-8 (or any other coding
system invoked by ESC 2/5 F) could simply "skip" the octets in that
encoding until it gets to the octets ESC 2/5 4/0.

This means that it does not need to decode UTF-8 just to find the escape
sequence ESC 2/5 4/0. UTF-8 does not do anything special with characters
below U+0080 anyway (they're just single-byte ASCII), so it works, no?

Of course, if you wanted to include any C1 controls inside the UTF-8
segment, they would have to be encoded in UTF-8, but ESC 2/5 4/0 is
entirely in the ASCII range (less than 128), so those octets are encoded
as is.

Erik
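
The "skip until ESC 2/5 4/0" idea described above can be sketched roughly as follows. This is only an illustration, not code from any real terminal or decoder: the function name split_segments and the segment labels are made up for the example, and it handles only the ESC 2/5 4/7 announcer mentioned in the message. It relies on the point made there, that the three octets of ESC 2/5 4/0 are all below 0x80 and so can never occur inside a multi-byte UTF-8 sequence.

```python
ESC = 0x1B
ENTER_UTF8 = bytes([ESC, 0x25, 0x47])   # ESC 2/5 4/7, i.e. ESC % G
RETURN_2022 = bytes([ESC, 0x25, 0x40])  # ESC 2/5 4/0, i.e. ESC % @


def split_segments(stream: bytes):
    """Split a byte stream into ('iso2022', data) and ('other', data) pieces.

    The 'other' pieces are returned undecoded: a decoder that does not
    know UTF-8 can simply skip them (hypothetical helper for illustration).
    """
    segments = []
    pos = 0
    while pos < len(stream):
        start = stream.find(ENTER_UTF8, pos)
        if start == -1:
            # No further switch: the rest is ordinary ISO 2022 data.
            segments.append(("iso2022", stream[pos:]))
            break
        if start > pos:
            segments.append(("iso2022", stream[pos:start]))
        body = start + len(ENTER_UTF8)
        end = stream.find(RETURN_2022, body)
        if end == -1:
            # The other coding system never returns.
            segments.append(("other", stream[body:]))
            break
        segments.append(("other", stream[body:end]))
        pos = end + len(RETURN_2022)
    return segments


sample = b"abc" + ENTER_UTF8 + "\u20ac".encode("utf-8") + RETURN_2022 + b"def"
parts = split_segments(sample)
```

Note that the sketch never calls a UTF-8 decoder; it only searches for the two escape sequences as raw octets.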



Re: C1 controls and terminals (was: Re: Euro character in ISO)

2000-07-13 Thread Frank da Cruz

Erik van der Poel wrote:
 Frank da Cruz wrote:
  The irony is, when using ISO 2022 character-set designation and invocation,
  you have to handle the escape sequences first to know if you're in UTF-8.
  Therefore, this pushes the burden onto the end-user to preconfigure their
  emulator for UTF-8 if that is what is being used, when ideally this should
  happen automatically and transparently.
 
 I may be misunderstanding the above, but ISO 2022 says:
 
   ESC 2/5 F shall mean that the other coding system uses
   ESC 2/5 4/0 to return;
 
   ESC 2/5 2/15 F shall mean that the other coding system
   does not use ESC 2/5 4/0 to return (it may have an alternative
   means to return or none at all).
 
 Registration number 196 is for UTF-8 without implementation level, and
 its escape sequence is ESC 2/5 4/7. I believe that ISO 2022 was designed
 that way so that a decoder that does not know UTF-8 (or any other coding
 system invoked by ESC 2/5 F) could simply "skip" the octets in that
 encoding until it gets to the octets ESC 2/5 4/0.
 
 This means that it does not need to decode UTF-8 just to find the escape
 sequence ESC 2/5 4/0. UTF-8 does not do anything special with characters
 below U+0080 anyway (they're just single-byte ASCII), so it works, no?
 
Yes, but I was thinking more about the ISO 2022 invocation features than the
designation ones:  LS2, LS3, LS1R, LS2R, LS3R, SS2, and SS3 are C1 controls.
The situation *could* arise where these would be used prior to announcing
(or switching to) UTF-8.  In this case, the end-user would have to configure
the software in advance to know whether the incoming byte stream is UTF-8.

Not a big deal; just an illustration of what happens when we can't use the
normal layering.

- Frank




Re: C1 controls and terminals (was: Re: Euro character in ISO)

2000-07-13 Thread Erik van der Poel

Frank da Cruz wrote:
 
 Yes, but I was thinking more about the ISO 2022 invocation features than the
 designation ones:  LS2, LS3, LS1R, LS2R, LS3R, SS2, and SS3 are C1 controls.
 The situation *could* arise where these would be used prior to announcing
 (or switching to) UTF-8.  In this case, the end-user would have to configure
 the software in advance to know whether the incoming byte stream is UTF-8.

Shouldn't the UTF-8 segment switch back to ISO 2022 before invoking any
of those C1 controls? This way, the decoder wouldn't have to know UTF-8,
and could skip over it reliably.

Erik



Re: Euro character in ISO

2000-07-12 Thread Roozbeh Pournader



On Tue, 11 Jul 2000, Asmus Freytag wrote:

 The only safe way to encode a Euro in HTML appears to be to use Unicode - 
 e.g. by using 8859-1 together with the numeric character reference (NCR) of 
 &#x20AC;

&euro; is much safer. Netscape 4 doesn't recognize hexadecimal character
references.

--roozbeh
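
For reference, the named entity, the decimal reference and the hexadecimal reference discussed here all denote the same character, U+20AC. A small check using Python's standard html module (a modern parser, so it says nothing about what Netscape 4 accepted) might look like this:

```python
import html

# Three ways of writing the euro sign in HTML source.
references = ["&euro;", "&#8364;", "&#x20AC;"]

# html.unescape resolves named and numeric character references.
decoded = [html.unescape(ref) for ref in references]

# All three should come out as the single character U+20AC.
code_points = [ord(ch) for ch in decoded]
```

Which form a given browser understands is a separate question; the point is only that decimal 8364 and hexadecimal 20AC are the same number.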




Re: Euro character in ISO

2000-07-12 Thread Michael Everson

Ar 15:30 -0800 2000-07-11, scríobh Asmus Freytag:
At 01:25 PM 7/11/00 -0800, Leon Spencer wrote:
Has ISO addressed the Euro character?

Yes. It's at 0x20AC in ISO/IEC 10646-1.

This is not a standard notation. Please use U+20AC or one of the other
standard notations to refer to UCS code positions.

ME





Re: Euro character in ISO

2000-07-12 Thread Michael Everson

Ar 18:19 -0800 2000-07-11, scríobh Robert A. Rosenberg:

The problem would go away if the ISO would get their heads out of
their a$$ and drop the C1 junk from the NEW 'TOUCHED UP" 8859s and
put the CP125x codes there.

Excuse me, but that is not appropriate. The ISO/IEC 8859 series is
conformant with ISO/IEC 2022, and protocols which adhere to that standard
should not be compromised by what you suggest.

Then when you said you used 8859-21 you'd get CP-1252 and Windows
would no longer need to lie (or tell the truth by admitting it is
CP-1252).

The problem is that some companies do/did not correctly identify their code
pages. The world can live with Latin-1 and CP-1252. It shouldn't have to
live with CP-1252 being identified as Latin-1.

Michael Everson  **  Everson Gunn Teoranta  **   http://www.egt.ie
15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland
Vox +353 1 478 2597 ** Fax +353 1 478 2597 ** Mob +353 86 807 9169
27 Páirc an Fhéithlinn;  Baile an Bhóthair;  Co. Átha Cliath; Éire





Re: Euro character in ISO

2000-07-12 Thread Antoine Leca

Robert A. Rosenberg wrote:
 
 At 15:30 -0800 on 07/11/00, Asmus Freytag wrote about Re: Euro
 character in ISO:
 
 There has been an attempt to create a series of 'touched up' 8859
 standards. The problem with these is that you get all the issues of
 character set confusion that abound today with e.g. Windows CP 1252
 mistaken for 8859-1 with a vengeance:
 
 The problem would go away if the ISO would get their heads out of
 their a$$ and drop the C1 junk from the NEW 'TOUCHED UP" 8859s and
 put the CP125x codes there.

Sorry. It may work for CP1252/iso-8859-1, and CP1254/iso-8859-9,
but won't for the others. Since Windows starts with the same letter as
Word --or is the reason that they both come from the same company.
No! I cannot believe that-- there are a couple of requirements
that effectively make the "other" codepages slightly incompatible,
such as the necessary presence of · at position B7 (because this
is the character Word uses when you ask it to "display" the spaces,
and this is hard-coded in the product).


 Then when you said you used 8859-21 you'd get CP-1252 and Windows
 would no longer need to lie (or tell the truth by admitting it is
 CP-1252).

Even if 8859-21 is defined to be exactly the same as some stage of
CP1252, and everyone in the standardization community accepts it
as such, habits are so entrenched, and love for Microsoft
so rare in the Unix world, that you can bet such a
standard will never gain wide acceptance.

Furthermore, this is completely unnecessary, as such a standard
already exists nowadays: it is called 'charset=windows-1252'...

The real problem is that:
- Windows browsers/MAs did not know that until 1999 (as it seems)
- Windows HTML-tools/MAs are reluctant to add the test for presence
of non-Latin1 characters to either tag as iso-8859-1 or
windows-1252. Apparently they are too lazy (because they already
did such a test for ASCII).
Well, I am angry, because probably nowadays browsers do the job correctly.


Antoine



RE: Euro symbol in HTML (was: Euro character in ISO)

2000-07-12 Thread Alan Wood

 Otto Stolz wrote:
 
  Hence, the only safe way to encode the Euro symbol seems to be:
  - Use the &euro; entity
  This will cause Netscape 4.7 to display "EUR" if the Euro glyph
  is not available (at least the version on my Unix box does so).

  The following two ways are safe, if the Euro glyph is available in
  the fonts specified by the user:
  - use UTF-8 together with the decimal NCR "&#8364;";
  - use UTF-8 together with the UTF-8 encoding 'E2 82 AC' (in hex).

  In all cases, do not forget to declare your HTML source as either
  HTML 4.0 or HTML 4.01,

I can confirm that &euro; and &#8364; also work with Netscape 4.73 under
Windows 95.

However, the euro symbol seems to be the exception.  In my index of HTML 4
named character entities at:

http://www.hclrss.demon.co.uk/demos/ent4_frame.html

Netscape 4.73 does not recognise any of the other named character entities
that correspond to decimal numbers greater than 255.  (With View > Character
Set set to Unicode (UTF-8), and using Arial Unicode MS.)

Alan Wood
(Documentation Writer / Web Master)
Context Limited
(Electronic publishers of UK and EU legal and official documents)
mailto:[EMAIL PROTECTED]
http://www.context.co.uk/
http://www.alanwood.net/ (Unicode, special characters, pesticide names)




Re: Euro character in ISO

2000-07-12 Thread brendan_murray



Robert A. Rosenberg wrote:
Then when you said you used 8859-21 you'd get CP-1252 and Windows
would no longer need to lie (or tell the truth by admitting it is
CP-1252).
And because certain companies had (and still have) bugs in their comms products, incorrectly identifying CP1252 data as ISO 8859-1, ISO standards should reject ISO-2022 and populate C1 with graphic characters?

I suppose other inconsiderate incompatibilities such as the incorrect encoding of half-pitch kana in ISO-2022-JP are the fault of ISO too?

Perhaps those companies that have these major bugs in their software, all of which have been repeatedly pointed out, should fix the problems there. The rest of the industry bends over backwards to accommodate these corrupt data, so a little effort on the part of the guilty would help a lot, and might prevent misguided postings like the above.

B=


Re: C1 controls and terminals (was: Re: Euro character in ISO)

2000-07-12 Thread Frank da Cruz

 Frank da Cruz [EMAIL PROTECTED] wrote:
 
  . If you send a code in the 0x80-0x9F range to such a terminal or 
emulator, it properly treats it as a control code.  If it was
intended as a graphic character ("smart quote" or somesuch) the
result is a fractured screen, sometimes even a frozen session.
 
 This is the widely reported compatibility problem between UTF-8 and
 terminals.  I know I read somewhere, possibly on Markus Kuhn's Unicode
 page, possibly somewhere else, that ISO 2022 codes exist to switch out
 of "ISO 2022 mode" and into "UTF-8 mode" and to either allow or prevent
 switching back to 2022.  Is there any progress on implementing this so
 terminals and emulators can live with UTF-8?
 
Maybe Markus can clarify.  I would be surprised if there's anything in
ISO 2022 about UTF-8, except that it does provide a way to switch out of
and back into ISO 2022 mode, allowing the use of character sets that do
not comply with ISO 2022 and 4873.  That's what the designating escape
sequences "with standard return" and "without standard return" are for.

But that's not quite the same thing.  There is no good reason why UTF-8
couldn't be used by (say) a VT320 emulator without switching out of the
ISO 2022 regime, except that UTF-8 contains C1 control codes as data.
This was discussed here a while back and "the other Markus" showed how
a C1-safe form of UTF-8 could have been designed:

  http://www.mindspring.com/~markus.scherer/utf-8c1.html

But, as they say, "it's too late now".  Therefore, those of us who want
to make use of UTF-8 within the ISO 2022 regime must reverse the layers.
First decode the UTF-8, then parse for escape sequences.  Of course your
emulator can get into awful trouble that way if the data stream isn't
really UTF-8.  But overall it's not that bad; we can live with it, and
in fact have done it this way in practice in our own emulator.

- Frank
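
To make the layering point concrete: the UTF-8 form of the euro sign is E2 82 AC, and the middle octet 0x82 falls in the C1 range, so a parser that looks at raw bytes first will see a control code that is not there. The sketch below contrasts the two orders. The function names are invented for this example and it is not taken from any real emulator.

```python
def c1_bytes(raw: bytes):
    """Octets a byte-oriented ISO 2022 parser would treat as C1 controls."""
    return [b for b in raw if 0x80 <= b <= 0x9F]


def c1_controls_after_decoding(raw: bytes):
    """Reverse the layers: decode UTF-8 first, then look for C1 controls.

    Raises UnicodeDecodeError if the stream is not really UTF-8, which is
    the "awful trouble" case mentioned above.
    """
    text = raw.decode("utf-8")
    return [ch for ch in text if 0x80 <= ord(ch) <= 0x9F]


euro = "\u20ac".encode("utf-8")   # E2 82 AC: plain data, no control intended
csi = "\u009b".encode("utf-8")    # C2 9B: a genuine C1 control (CSI)

false_alarm = c1_bytes(euro)                    # the 0x82 octet looks like C1
real_after_decode = c1_controls_after_decoding(euro)
real_csi = c1_controls_after_decoding(csi)
```

Decoding first removes the false alarm for the euro while still finding the real CSI.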




Re: Euro character in ISO

2000-07-12 Thread Robert A. Rosenberg

At 04:27 AM 07/12/2000 -0800, Michael Everson wrote:
Ar 18:19 -0800 2000-07-11, scríobh Robert A. Rosenberg:

 The problem would go away if the ISO would get their heads out of
 their a$$ and drop the C1 junk from the NEW 'TOUCHED UP" 8859s and
 put the CP125x codes there.

Excuse me, but that is not appropriate. The ISO/IEC 8859 series is
conformant with ISO/IEC 2022, and protocols which adhere to that standard
should not be compromised by what you suggest.

 Then when you said you used 8859-21 you'd get CP-1252 and Windows
 would no longer need to lie (or tell the truth by admitting it is
 CP-1252).

The problem is that some companies do/did not correctly identify their code
pages. The world can live with Latin-1 and CP-1252. It shouldn't have to
live with CP-1252 being identified as Latin-1.

Which is what I am saying when I talk about admitting that you are using
CP-1252, not ISO-8859-1 (in your MIME/HTML headers), at least in the case
where there are glyphs in the x80-x9F range in use. If a system can claim
US-ASCII if no codes in the x80-xFF range appear and ISO-8859-1 otherwise
(as many MUAs do), it should have the smarts to claim CP-1252 if in its
scan it found a x80-x9F glyph.
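
The scan described in this paragraph could be sketched like this. It assumes the message body is text produced on a Windows system using CP-1252 as its 8-bit code page; the function name pick_charset_label is hypothetical, and no real mail client is claimed to work this way.

```python
def pick_charset_label(body: bytes) -> str:
    """Choose the narrowest honest charset label for an 8-bit message body.

    Assumption: the bytes were produced under Windows code page 1252.
    """
    if all(b < 0x80 for b in body):
        return "us-ascii"        # nothing outside 7-bit ASCII
    if any(0x80 <= b <= 0x9F for b in body):
        return "windows-1252"    # uses the x80-x9F glyphs (euro, smart quotes, ...)
    return "iso-8859-1"          # only xA0-xFF, where the two sets agree


labels = [
    pick_charset_label(b"price: 5 EUR"),
    pick_charset_label(b"caf\xe9"),
    pick_charset_label(b"price: \x80 5"),
]
```

The extra test costs one more pass over bytes that the sender is already scanning to choose between US-ASCII and ISO-8859-1.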


Michael Everson  **  Everson Gunn Teoranta  **   http://www.egt.ie
15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland
Vox +353 1 478 2597 ** Fax +353 1 478 2597 ** Mob +353 86 807 9169
27 Páirc an Fhéithlinn;  Baile an Bhóthair;  Co. Átha Cliath; Éire




Re: Euro character in ISO

2000-07-12 Thread Robert A. Rosenberg

At 08:56 PM 07/11/2000 -0800, Geoffrey Waigh wrote:
On Tue, 11 Jul 2000, Robert A. Rosenberg wrote:

  At 15:30 -0800 on 07/11/00, Asmus Freytag wrote about Re: Euro
  character in ISO:
 
  There has been an attempt to create a series of 'touched up' 8859
  standards. The problem with these is that you get all the issues of
  character set confusion that abound today with e.g. Windows CP 1252
  mistaken for 8859-1 with a vengeance:
 
  The problem would go away if the ISO would get their heads out of
  their a$$ and drop the C1 junk from the NEW 'TOUCHED UP" 8859s and
  put the CP125x codes there.

Except that would break all the systems that understand that C1 "junk,"
and a number of systems do so because they are adhering to other
ISO standards.  If you are going to force someone to change their
datastreams to something new, they might as well go to some flavour
of Unicode anyways.

Who is going to get broken if I say on my MIME header (or HTML) that my 
CHARSET is (example) ISO-8859-21? You are talking about uses where the 
computer is talking to a device and needs the C1 range to tell it what to 
do not another computer (where it is just passing a text stream). The C1 
codes are DEVICE CONTROL and have no purpose (except to occupy slots that 
are better used for extra GLYPHS) in EMAIL or HTML transfer. I am NOT 
asking for anyone to change their mode of operation - only for ISO-8859-x 
codes that are designed for transfer of printable data. UNICODE is not a 
viable option since all we are talking about is the ability to select from 
a number of 256 codepoint 8-bit tables not go over to UTF-8 or UTF-16 
(which would require changes to the program code).


Geoffrey
"tilting at terminal emulators, err windmills."




Re: Euro character in ISO

2000-07-12 Thread Frank da Cruz

On Wed, 12 Jul 2000 10:43:59 -0800, Robert A. Rosenberg wrote:
 At 08:56 PM 07/11/2000 -0800, Geoffrey Waigh wrote:
 On Tue, 11 Jul 2000, Robert A. Rosenberg wrote:
   At 15:30 -0800 on 07/11/00, Asmus Freytag wrote:
   There has been an attempt to create a series of 'touched up' 8859
   standards. The problem with these is that you get all the issues of
   character set confusion that abound today with e.g. Windows CP 1252
   mistaken for 8859-1 with a vengeance:
  
   The problem would go away if the ISO would get their heads out of
   their a$$ and drop the C1 junk from the NEW 'TOUCHED UP" 8859s and
   put the CP125x codes there.
 
 Except that would break all the systems that understand that C1 "junk,"
 and a number of systems do so because they are adhering to other
 ISO standards.  If you are going to force someone to change their
 datastreams to something new, they might as well go to some flavour
 of Unicode anyways.
 
 Who is going to get broken if I say on my MIME header (or HTML) that my 
 CHARSET is (example) ISO-8859-21?

We go through this exercise about twice a year.  First, let's recognize
that ISO is not about to revoke Standards 4873 and 2022, so there's not
much point in suggesting it.  Second, think of a terminal that complies
with these standards.  A physical terminal such as a VT320.  I am using it
to access my mail host in text mode, and I'm reading mail with (say) Unix
'mail'.  The terminal does not interpret the MIME headers.  It doesn't
parse HTML.  It implements a very straightforward finite state automaton
that implements the ISO 2022 based terminal.  Unix 'mail' sends to my
terminal the bytes of the message, period.

Perhaps you're suggesting the Unix 'mail' should become a translation
agent between the character set of the mail and that of the user's
terminal?  I hope not, since given that practically any character set
anybody can dream up is "MIME-compliant" as long as it's tagged, then
every mail program must know how to convert from every character set in
existence to every other one.  Or is it the mail transfer agent?  Or both?
It's really quite a mess; let's not go out of our way to make it worse.

To understand the implications of using 8-bit character sets that contain
graphic characters in the C1 area FOR INTERCHANGE, imagine trying to do
the same thing to the C0 area.

- Frank
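
As a rough picture of what such a terminal sees, the 8-bit code table of ISO 2022 and ISO 4873 divides into two control areas and two graphic areas. The sketch below classifies single octets accordingly; the function name region is made up for this example, and SP and DEL are simply counted with GL here.

```python
def region(byte: int) -> str:
    """Name the ISO 2022 code-table area that an octet falls into."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("not an octet")
    if byte <= 0x1F:
        return "C0"   # control codes such as ESC, CR, LF
    if byte <= 0x7F:
        return "GL"   # left graphic area (ASCII)
    if byte <= 0x9F:
        return "C1"   # second control area
    return "GR"       # right graphic area (e.g. Latin-1 letters)


# CP-1252 puts graphic characters where a conforming terminal expects controls.
smart_quote = "\u201c".encode("cp1252")   # left double quotation mark
euro_1252 = "\u20ac".encode("cp1252")
e_acute = "\u00e9".encode("iso-8859-1")

areas = [region(smart_quote[0]), region(euro_1252[0]), region(e_acute[0])]
```

The terminal never sees the charset label, only the octets, so the first two land in the control area.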




Re: Euro character in ISO

2000-07-12 Thread Frank da Cruz

 On Wed, 12 Jul 2000, Frank da Cruz wrote:
 
  Perhaps you're suggesting the Unix 'mail' should become a translation
  agent between the character set of the mail and that of the user's
  terminal?  I hope not, since given that practically any character set
  anybody can dream up is "MIME-compliant" as long as it's tagged, then
  every mail program must know how to convert from every character set in
  existence to every other one.
 
 Yes, it damn well should. And this is easy, as there is a standard Unix
 function that knows how to do this. (it's called iconv).
 
I'm logged into unix right now:

  $ iconv
  bash: iconv: command not found
  $

How standard can it be?  And what about VMS, VM/CMS, VOS, OS/390, OS/400,
Tandem, and all the others?

How does the mail client know what character set my terminal has?

Anyway, between you and me, there are potentially lots of places where
character-set conversion can occur.  Your mail client, your MTA, my MTA,
my mail client, my Telnet server, my Telnet client, my terminal emulator.
Let's think carefully about this before we have random combinations of
these clients, agents, and servers stepping on each others' toes.

- Frank
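
For readers without iconv at hand, the kind of conversion being argued about can be sketched with Python's built-in codecs, used here purely as a stand-in and not as the iconv library itself. The helper name transcode is hypothetical. The example also shows why conversion is not just a table lookup: a character in the source set may have no counterpart in the target set.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert bytes from one charset to another by way of Unicode.

    Raises UnicodeEncodeError when a character does not exist in the
    target charset.
    """
    return data.decode(source).encode(target)


# The euro exists in both sets, but at different positions.
euro_latin9 = transcode(b"\x80", "cp1252", "iso-8859-15")

# The trademark sign (0x99 in CP-1252) has no place in ISO 8859-15.
try:
    transcode(b"\x99", "cp1252", "iso-8859-15")
    trademark_converted = True
except UnicodeEncodeError:
    trademark_converted = False
```

Whichever agent in the chain performs the conversion has to decide what to do in the second case, which is part of the coordination problem raised above.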




Re: Euro character in ISO

2000-07-12 Thread Rick McGowan

 There are lots of Unixes:
   http://www.columbia.edu/kermit/unix.html
 How many of them have an iconv function?

rangda 47: man iconv
man: no entry for iconv in the manual.
rangda 48: cat /etc/motd
Welcome to Darwin!
rangda 49: well, hmmm...
zsh: command not found: well,
rangda 50: 



RE: Euro character in ISO

2000-07-12 Thread Chris Wendt

The trick is HTML4.

Since you sent the message in HTML format, the Euro is encoded as numeric
character reference. Exchange knows how to decode HTML and generate RTF,
depending on what your email client needs.

If you had sent plain text, the Euro would have turned into ?. As is the
case in the plain text part of the multipart message.
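
The difference between the two parts can be reproduced with Python's codec error handlers. This only illustrates the effect; it is not a description of how Outlook Express or Exchange is implemented.

```python
text = "abcdef \u20ac abcdef"

# Plain text part: the euro has no position in ISO 8859-1, so a
# substituting encoder falls back to "?".
plain_part = text.encode("iso-8859-1", errors="replace")

# HTML part: the same character can be kept as a numeric character
# reference, which stays within ASCII.
html_part = text.encode("iso-8859-1", errors="xmlcharrefreplace")
```

Both results are valid iso-8859-1, but only the HTML part still carries the euro.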

This is the case for Outlook Express 5. Older versions of OE treated
Windows-1252 and iso-8859-1 the same.

Here is the source of the message from my Outlook Express Sent Mail folder.
(To see the source, open message and press Ctrl-F3).

From: "Chris Wendt" [EMAIL PROTECTED]
To: "Chris Wendt" [EMAIL PROTECTED]
Subject: Euro test
Date: Wed, 12 Jul 2000 15:17:49 -0700
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary="=_NextPart_000_0005_01BFEC14.57202A10"
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4133.2400
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400

This is a multi-part message in MIME format.

--=_NextPart_000_0005_01BFEC14.57202A10
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

abcdef ? abcdef

--=_NextPart_000_0005_01BFEC14.57202A10
Content-Type: text/html;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META content=3D"text/html; charset=3Diso-8859-1" =
http-equiv=3DContent-Type>
<META content=3D"MSHTML 5.00.3103.1000" name=3DGENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY bgColor=3D#ff>
<DIV><FONT color=3D#008000 face=3DVerdana size=3D2>abcdef &#8364;=20
abcdef</FONT></DIV></BODY></HTML>

--=_NextPart_000_0005_01BFEC14.57202A10--
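
A note on reading the source above: both parts use the quoted-printable transfer encoding, so "=3D" stands for "=", "=20" for a space, and a lone "=" at the end of a line is a soft line break. A short sketch with Python's standard quopri module, applied to fragments copied from the message:

```python
import quopri

# A fragment of the META line, including its soft line break.
meta_encoded = b'content=3D"text/html; charset=3Diso-8859-1" =\nhttp-equiv=3DContent-Type'
meta_decoded = quopri.decodestring(meta_encoded)

# The text line: "=20" protects a trailing space before the hard line break.
body_encoded = b"abcdef &#8364;=20\nabcdef"
body_decoded = quopri.decodestring(body_encoded)
```

After decoding, the euro is still present only as the numeric character reference, which is what lets it survive the iso-8859-1 label.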


-Original Message-
From: Leon Spencer [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, July 12, 2000 2:38 PM
To: Unicode List
Subject: RE: Euro character in ISO


Is Microsoft playing tricks in MS Outlook or IE?
If I send text from Outlook Express to my exchange
account, with charset set to iso-8859-1 but containing
the Trademark symbol ((tm)) in the body, it shows up
okay. The body of the message is in text/html.

Is it possible that MS Outlook's HTML ActiveX control
(which I'm assuming to be the same used for IE) is
defaulting to Cp1252/Windows-1252 when it sees iso-8859-1?

Leon

BTW, The body also contains the Euro!



RE: Euro character in ISO

2000-07-11 Thread Leon Spencer


Does anyone know where I can easily download the
latest ISO-8859-X specs? The ones at ftp.unicode.org
seem to be dated 1996.

Also, does anyone know which ISO-8859-X contains 
the Euro?

Thanks.
  Leon

 -Original Message-
 From: Murray Sargent [mailto:[EMAIL PROTECTED]]
 Sent: Tuesday, July 11, 2000 2:44 PM
 To: Unicode List
 Cc: Unicode List
 Subject: RE: Euro character in ISO
 
 
 The two statements are correct. ISO has addressed the problem 
 by adding more
 ISO-8859-x standards, since changing 8859-1 would cause 
 problems.  The best
 thing to do is to use Unicode and avoid the codepage confusion :-)
 
 Murray
 
  -Original Message-
  From:   Leon Spencer [SMTP:[EMAIL PROTECTED]]
  Sent:   Tuesday, July 11, 2000 2:26 PM
  To: Unicode List
  Subject:Euro character in ISO
  
  The Euro does not exist in iso-8859-1. It
  is in Cp1252 (WinLatin1) - Microsoft's code page
  superset of iso-8859-1. 
  
  Is this correct? Has ISO addressed the Euro character?
  If so, is the issue more of vendors implementing it?
  
  Leon
 



Re: Euro character in ISO

2000-07-11 Thread Asmus Freytag

At 01:25 PM 7/11/00 -0800, Leon Spencer wrote:
Has ISO addressed the Euro character?

Yes. It's at 0x20AC in ISO/IEC 10646-1.

There has been an attempt to create a series of 'touched up' 8859 
standards. The problem with these is that you get all the issues of 
character set confusion that abound today with e.g. Windows CP 1252 
mistaken for 8859-1 with a vengeance: Not only is 8859-15 slightly 
different from 8859-1, but the difference involves codes that are perfectly 
valid in 8859-1.

Because for 99% of all text, there is no difference, people are almost 
certain to mix them up, mislabel HTML files etc. etc.

The only safe way to encode a Euro in HTML appears to be to use Unicode - 
e.g. by using 8859-1 together with the numeric character reference (NCR) of 
&#x20AC;

A./
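
How slight the difference between 8859-1 and 8859-15 is can be checked mechanically. The sketch below uses Python's bundled codec tables to list the byte values that decode differently in the two sets; the helper name differing_positions is made up for this example.

```python
def differing_positions(charset_a: str, charset_b: str):
    """Return the byte values that decode to different characters."""
    return [
        byte
        for byte in range(256)
        if bytes([byte]).decode(charset_a) != bytes([byte]).decode(charset_b)
    ]


diff = differing_positions("iso-8859-1", "iso-8859-15")

# What the first differing byte means under each label.
as_latin1 = bytes([diff[0]]).decode("iso-8859-1")
as_latin9 = bytes([diff[0]]).decode("iso-8859-15")
```

Only eight positions differ, and every one of them is a valid graphic character in 8859-1, so a mislabelled file decodes without any error and the mistake goes unnoticed.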



Re: Euro character in ISO

2000-07-11 Thread Robert A. Rosenberg

At 15:30 -0800 on 07/11/00, Asmus Freytag wrote about Re: Euro 
character in ISO:

There has been an attempt to create a series of 'touched up' 8859
standards. The problem with these is that you get all the issues of
character set confusion that abound today with e.g. Windows CP 1252
mistaken for 8859-1 with a vengeance:

The problem would go away if the ISO would get their heads out of 
their a$$ and drop the C1 junk from the NEW "TOUCHED UP" 8859s and 
put the CP125x codes there.
Then when you said you used 8859-21 you'd get CP-1252 and Windows 
would no longer need to lie (or tell the truth by admitting it is 
CP-1252).



Re: Euro character in ISO

2000-07-11 Thread Michael \(michka\) Kaplan

Robert,

I am a big fan of the Windows code pages, they often make my life easier.
However, there is a disadvantage to the fact that even over the course of a
few service packs (let alone a few operating systems!) the code pages have
changed, and there is simply no good documentation that will tell you when
(for example) Farsi characters U+06A9 and U+06AF were added to Windows
CP1256 (Arabic). All that one knows for certain is that it was before
Windows 98 SE and before NT4 SP5 (although it did not ship with NT4).

When you cannot figure out why an application works on one platform and not
another, it can make you pine for a more stationary standard! :-)

My ISP moved to Windows 2000 so I do not have to worry about making them
install things like newer code page files on the web server, but for a long
time these differences plagued me heavily.

michka


- Original Message -
From: "Robert A. Rosenberg" [EMAIL PROTECTED]
To: "Unicode List" [EMAIL PROTECTED]
Cc: "Unicode List" [EMAIL PROTECTED]
Sent: Tuesday, July 11, 2000 7:19 PM
Subject: Re: Euro character in ISO


 At 15:30 -0800 on 07/11/00, Asmus Freytag wrote about Re: Euro
 character in ISO:

 There has been an attempt to create a series of 'touched up' 8859
 standards. The problem with these is that you get all the issues of
 character set confusion that abound today with e.g. Windows CP 1252
 mistaken for 8859-1 with a vengeance:

 The problem would go away if the ISO would get their heads out of
 their a$$ and drop the C1 junk from the NEW 'TOUCHED UP" 8859s and
 put the CP125x codes there.
 Then when you said you used 8859-21 you'd get CP-1252 and Windows
 would no longer need to lie (or tell the truth by admitting it is
 CP-1252).