Markus Scherer <[EMAIL PROTECTED]> wrote:
> On 2002-apr-09, Shlomi Tal and Doug Ewell discussed on this list
> a UTF-7 signature byte sequence of +/v8- (which was news to me).
I don't remember ever reading a recommendation, or even a suggestion, to
use +/v8- as a signature for UTF-7. But that w
At 18:40 4/11/2002,
=?iso-2022-jp?B?GyRCJG0hOyE7ITshOxsoQiAbJEIkbSE7ITshOxsoQg==?= wrote:
>Why does the printed word get so much more respect than the written word?
>
>It would be like saying that for a spoken language to be accepted into a
>registry, one must make a speech synthesizer for the
At 17:23 4/11/2002, David Starner wrote:
>pfaedit's a free font editor for Unix. Or one could write out a
>PostScript font by hand - it's not completely unreasonable, especially
>if you're doing something like a few math characters.
I love the fact that there are still people out there who would
This thread seems just about ended, and I don't want to be the person
to revive it, but there have been numerous related topics in the past
six months, and nothing in them answers the question that has been
nagging me.
The question is
"Considering the difficulty of actually getting access to
>This is a barrier erected for three reasons:
>
> 1. If a proposed character can't pass the font test -- i.e., nobody can
> come up with a usable font that contains it -- then it may be of
> rather marginal usefulness, since apparently people *aren't* using it.
> Of course, histo
pfaedit's a free font editor for Unix. Or one could write out a
PostScript font by hand - it's not completely unreasonable, especially
if you're doing something like a few math characters.
--
David Starner - [EMAIL PROTECTED]
"It's not a habit; it's cool; I feel alive.
If you don't have it you'
> "Stefan" == Stefan Persson <[EMAIL PROTECTED]> writes:
Stefan> Is there some free font program out there that can be used for
Stefan> this purpose?
There is pfaedit at:
http://pfaedit.sf.net/
and for bdf bitmap fonts xmbdfed at:
http://crl.nmsu.edu/~mleisher/xmbdfed.html
Pfaedi
At 15:49 4/11/2002, Kenneth Whistler wrote:
> > Is there some free font program out there that can be used for this
> > purpose?
>
>I'll let somebody else on the list who knows about font tools answer
>that one.
I'm not aware of any free tools that I would trust to do the job. The
cheapest optio
Juuitchan donned sackcloth and ashes and wailed:
> >It seems that I have to make a font containing any characters that I want to
> >propose for inclusion.
> >
>
> Oy gevalt. So I can't propose anything. Fabulous. Just fabulous.
Well, get serious.
The Unicode Standard is serious business. (E
Stefan asked:
> It seems that I have to make a font containing any characters that I want to
> propose for inclusion.
Or provide a font already made by someone else containing them, or get
someone else who has the relevant tools to produce it.
This is a barrier erected for three reasons:
1.
> From [EMAIL PROTECTED] Thu Apr 11 13:45:37 2002
> X-Originating-IP: [62.30.112.2]
> To: <[EMAIL PROTECTED]>
> Subject: Re: Inherent "a"
Sinnathurai Srivas wrote:
> May I assume u+0b85 as official?
No.
That is U+0B85 TAMIL LETTER A -- just the ordinary, standalone
letter /a/.
You are, of cour
>From: "Stefan Persson" <[EMAIL PROTECTED]>
>To: "Unicode-listan" <[EMAIL PROTECTED]>
>Subject: Concerning proposals
>Date: Thu, 11 Apr 2002 23:57:55 +0200
>
>It seems that I have to make a font containing any characters that I want to
>propose for inclusion.
>
Oy gevalt. So I can't propose a
Mark:
A suggestion: On slide 5, I would be inclined not to differentiate
surrogates from non-characters. That only confuses people, I think,
regarding the relationships between codepoints and the various encoding
forms. Even if they are formally still distinguished in the Std, I contend
that the
ICU 2.1 will have an API for this, uchar.h/u_charAge().
markus
Kenneth Whistler wrote:
> Frank asked:
>>Given a Unicode encoding value U+ (or whatever for non-BMP), how can
>>I find out the version of the Unicode standard in which this character
>>first appeared?
>
> http://www.unicode.org
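For readers without ICU at hand, the same lookup can be sketched against the Unicode Character Database file DerivedAge.txt, which maps code point ranges to the version in which they were first assigned. The sample text below is a hand-written excerpt in that file's layout, not the real file, and the names parse_ages and char_age are hypothetical helpers rather than any library's API.

```python
from typing import List, Optional, Tuple

# Hand-written excerpt in the DerivedAge.txt layout (NOT the real file).
SAMPLE = """\
# range       ; age # comment
0041..005A    ; 1.1 # LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
20AC          ; 2.1 # EURO SIGN
2060..2063    ; 3.2 # WORD JOINER..INVISIBLE SEPARATOR
"""

def parse_ages(text: str) -> List[Tuple[int, int, str]]:
    """Return (first, last, age) triples from DerivedAge-style text."""
    ranges = []
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not data:
            continue
        span, age = (part.strip() for part in data.split(";"))
        first, _, last = span.partition("..")
        ranges.append((int(first, 16), int(last or first, 16), age))
    return ranges

def char_age(cp: int, ranges: List[Tuple[int, int, str]]) -> Optional[str]:
    """Return the version string for a code point, or None if not listed."""
    for first, last, age in ranges:
        if first <= cp <= last:
            return age
    return None

ages = parse_ages(SAMPLE)
print(char_age(0x20AC, ages))
```

With the real DerivedAge.txt read from disk, the same two functions should apply unchanged; a linear scan is used here only for brevity.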
It seems that I have to make a font containing any characters that I want to
propose for inclusion.
Do the characters have to be encoded to the correct code points, or can they
be encoded to just about any code point?
Is there some free font program out there that can be used for this purpose?
- Original Message -
From: "Tom Gewecke" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: den 11 april 2002 22:56
Subject: Re: Vietnamese Nom Text
> >see:
> >
> > http://www.columbia.edu/kermit/utf8.html
> >
> >which has an interesting new entry: Vietnamese Nôm, the first entry
> >con
>see:
>
> http://www.columbia.edu/kermit/utf8.html
>
>which has an interesting new entry: Vietnamese Nôm, the first entry
>containing non-BMP characters (probably will not be entirely visible to
>most people)
Can *anyone* see it properly? Last I checked no browser could read UTF-8
beyond the
Avarangal wrote:
> Dear Doug Ewell, William Overington, James E. Agenbroad, and Maurice
> Bauhahn,
>
> Thank you all for the reply.
>
> May I assume u+0b85 as official?
Whoa, hang on here! Official WHAT? u+0b85 is definitely in Unicode:
U+0B85 TAMIL LETTER A
It is _NOT_ an "inherent a"
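The character name quoted above can be checked from any copy of the Unicode Character Database. A minimal sketch using Python's standard unicodedata module, which ships its own copy of the UCD (the comparison with U+0BBE is added here for illustration only):

```python
import unicodedata

letter_a = "\u0b85"   # the code point discussed above
vowel_sign = "\u0bbe" # a dependent vowel sign, for contrast

# An independent letter is category Lo; a dependent vowel sign is a mark (Mc).
for ch in (letter_a, vowel_sign):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
```

The general category is what separates a standalone letter from a combining vowel sign; neither entry represents an "inherent a".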
Dear Doug Ewell, William Overington, James E. Agenbroad, and Maurice
Bauhahn,
Thank you all for the reply.
May I assume u+0b85 as official?
Some explanations for the need for a visible "a".
In Tamil,
a/
dependent "ai" and "au" have ligatures. In fact, "au" and "ou" at present
utilise the same li
Ken answered:
> Frank asked:
> > From [EMAIL PROTECTED] Thu Apr 11 12:12:33 2002
> > Date: Thu, 11 Apr 2002 14:58:48 EDT
> > Given a Unicode encoding value U+ (or whatever for non-BMP), how can
> > I find out the version of the Unicode standard in which this character
> > first appeared?
>
>
Frank asked:
> From [EMAIL PROTECTED] Thu Apr 11 12:12:33 2002
> Date: Thu, 11 Apr 2002 14:58:48 EDT
> Given a Unicode encoding value U+ (or whatever for non-BMP), how can
> I find out the version of the Unicode standard in which this character
> first appeared?
At last, a question for whic
Given a Unicode encoding value U+ (or whatever for non-BMP), how can
I find out the version of the Unicode standard in which this character
first appeared?
- Frank
Shlomi Tal wrote:
> UTF-7, it shocked me how Greek "Sokrates" and "S o k r a t e s" (with
> spaces between each Greek letter in the latter) would have different
> encodings for the same Unicode characters.
That is not unusual for stateful encodings.
It's the same with BOCU-1 (not in this part
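The effect Shlomi describes can be reproduced with a small sketch. It assumes the UTF-7 layout of RFC 2152 (a shifted run is "+", then base64 of the UTF-16BE code units without padding, then an optional "-"); shifted_run is a hypothetical helper, not part of any codec library, and it always writes the closing "-".

```python
import base64

def shifted_run(text: str) -> str:
    """Encode text as a single UTF-7 shifted run, always closed with '-'."""
    raw = base64.b64encode(text.encode("utf-16-be")).decode("ascii")
    return "+" + raw.rstrip("=") + "-"

sigma, omega = "\u03a3", "\u03c9"

together = shifted_run(sigma + omega)                   # both letters in one run
apart = shifted_run(sigma) + " " + shifted_run(omega)   # one run per letter

print(together)
print(apart)
# Both decode to the expected text with Python's own UTF-7 codec.
print(together.encode("ascii").decode("utf-7"))
print(apart.encode("ascii").decode("utf-7"))
```

The omega contributes different ASCII characters in the two outputs because its 16 bits begin at a different offset within the base64 sextets, which is the state the decoder has to carry.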
Markus Scherer wrote:
>+/v8 is the encoding of U+FEFF as the first code point in a text. So far,
>so good.
>The '-' as the next byte switches UTF-7 back to direct-encoding of a subset
>of US-ASCII.
>
>What if there is no '-' there? What if a non-ASCII code point immediately
>follows the U+FEFF
On 2002-apr-09, Shlomi Tal and Doug Ewell discussed on this list a UTF-7 signature
byte sequence of +/v8- (which was news to me).
(Subject "MS/Unix BOM FAQ again (small fix)")
I "meditated" some over this -
+/v8 is the encoding of U+FEFF as the first code point in a text. So far, so good.
The '
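The arithmetic behind +/v8 can be sketched as follows, again assuming the RFC 2152 layout (base64 of UTF-16BE code units, padding dropped). utf7_shifted is a hypothetical helper that emits only the "+" and the base64 characters, leaving out the optional "-" terminator so the effect of a following code point is visible.

```python
import base64

def utf7_shifted(text: str) -> bytes:
    """'+' followed by unpadded base64 of the UTF-16BE form of text."""
    return b"+" + base64.b64encode(text.encode("utf-16-be")).rstrip(b"=")

# U+FEFF alone: 16 bits fill two sextets plus 4 bits, padded with zeros.
print(utf7_shifted("\ufeff"))
# Followed by a Greek letter: the top two bits of U+03A3 are 00.
print(utf7_shifted("\ufeff\u03a3"))
# Followed by a CJK ideograph: the top two bits of U+4E2D are 01.
print(utf7_shifted("\ufeff\u4e2d"))
```

Because the third base64 character borrows two bits from whatever code unit follows U+FEFF, a detector that looks only for the four bytes +/v8 will miss text in which a non-ASCII character follows the signature directly.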
> Mark Davis <[EMAIL PROTECTED]> wrote:
>
> > - when one of the BOM-allowing UTFs starts with a BOM, you know the
> > encoding*, and you strip off the BOM when you get the content.
> >
> > *assuming that no UTF-16 file has U+ as the first character.
>
> In the real world, this is a pretty go
I thought some of the choices in the following were amusing:
http://m-w.com/cgi-bin/dictionary/?va=Unicode
Mark
—
Γνῶθι σαυτόν — Θαλῆς
[For transliteration, see http://oss.software.ibm.com/cgi-bin/icu/tr]
http://www.macchiato.com
It is a pretty good assumption; but if BOMs are used on smaller fields
the probability goes up. And to be perfectly reliable, you can't
assume it.
That is one reason that the WORD JOINER was encoded, so that
eventually we can use FEFF solely as a BOM.
Mark
—
Γνῶθι σαυτόν — Θαλῆς
[For transl
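The rule quoted in this exchange (a leading BOM identifies the encoding and is stripped from the content) might be sketched like this. strip_bom is a hypothetical helper built on the BOM constants in Python's codecs module; the UTF-8 fallback for unmarked data is an assumption of this sketch, not something stated in the thread.

```python
import codecs
from typing import Tuple

# UTF-32 must be tested before UTF-16: FF FE 00 00 also starts with FF FE.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def strip_bom(data: bytes, default: str = "utf-8") -> Tuple[str, str]:
    """Return (encoding name, decoded text with any leading BOM removed)."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, data[len(bom):].decode(name)
    return default, data.decode(default)  # assumed fallback for unmarked data

print(strip_bom(codecs.BOM_UTF8 + b"hi"))
# Caveat from the thread: an intended leading U+FEFF is indistinguishable
# from a BOM, so it is stripped here as well.
print(strip_bom("\ufeffx".encode("utf-16-be")))
```

The second call shows why the assumption is only "pretty good": the sniffer cannot tell a signature from a zero-width no-break space that the author meant as content.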
Doug Ewell wrote:
> As Shlomi points out, Microsoft products do not treat UTF-7
> specially, except that IE recognizes the UTF-7 BOM and sets its encoding
> accordingly (but this is true for any UTF-7 sequence, not just the BOM;
> try loading a text file containing only the 11 ASCII characters
>
Hello, Doug!
I)
AT> http://www.unicode.org/unicode/uni2book/ch03.pdf
AT>
1.
AT> - A single abstract character may correspond to more than one code
AT> value -
for example, U+00C5 ... LATIN CAPITAL LETTER A WITH RING and
U+212B ... ANGSTROM SIGN
2.
AT> - Multiple code values may be
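The U+00C5 / U+212B pair in point 1 can be checked with the normalization functions in Python's standard unicodedata module. A brief sketch:

```python
import unicodedata

a_ring = "\u00c5"    # LATIN CAPITAL LETTER A WITH RING ABOVE
angstrom = "\u212b"  # ANGSTROM SIGN

# The two code points differ, but they are canonically equivalent:
# both normalize to the same sequence in NFC and in NFD.
nfc_pair = [unicodedata.normalize("NFC", ch) for ch in (a_ring, angstrom)]
nfd_pair = [unicodedata.normalize("NFD", ch) for ch in (a_ring, angstrom)]

print([f"U+{ord(c):04X}" for s in nfc_pair for c in s])
print([f"U+{ord(c):04X}" for s in nfd_pair for c in s])
```

The NFD form also illustrates the opposite case, one abstract character spelled with more than one code value (the base letter followed by a combining ring).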