Mark E. Shoulson scripsit:
> Heh... I've occasionally caught myself almost wishing for this kind of
> setup, ridiculous though it be. It would be nice to be able to get just
> the *content* of the text without having to bother with all that mucking
> about with HTML rendering engines and whatn
Given that U+3001 IDEOGRAPHIC COMMA
and U+FE50 SMALL COMMA
are both of Line Break class CL, wouldn't it make sense for
U+FE51 SMALL IDEOGRAPHIC COMMA
to also be of class CL instead of class ID?
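For anyone wanting to compare the classes in question, here is a minimal sketch. It assumes data in the `code;class # comment` layout used by the UCD file LineBreak.txt. The three sample lines only restate the classes claimed in the message above and are not read from a real data file, and `parse_line_break` is a hypothetical helper name.

```python
import unicodedata

# Sample lines in LineBreak.txt layout. The class values restate the claims
# made in the message above; they are NOT read from a real data file.
SAMPLE = """\
3001;CL # IDEOGRAPHIC COMMA
FE50;CL # SMALL COMMA
FE51;ID # SMALL IDEOGRAPHIC COMMA
"""

def parse_line_break(text):
    """Map each code point to its line break class (hypothetical helper)."""
    classes = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop the trailing comment
        if not line:
            continue
        codes, lb_class = (field.strip() for field in line.split(";"))
        first, _, last = codes.partition("..")  # "XXXX" or "XXXX..YYYY"
        for cp in range(int(first, 16), int(last or first, 16) + 1):
            classes[cp] = lb_class
    return classes

classes = parse_line_break(SAMPLE)
for cp in (0x3001, 0xFE50, 0xFE51):
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}: {classes[cp]}")
```

Python's `unicodedata` module does not expose the line break property itself, which is why the sketch parses the data file layout instead.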
[EMAIL PROTECTED] wrote:
XML has become the de facto standard for fancy text. It is therefore
useful to explore ways and means of bringing XML into plain text,
since obviously plain text is simpler than, and superior to, fancy text.
The current method involving & and < and > and / and who knows w
Philippe Verdy wrote:
This seems highly excessive. We already have plenty of PUA space. All that we
need is a standard way (file format? protocol?) to transport PUA character
properties, and possibly encode a reference (URI?) to the definition file or
service. If Unicode does not want to do this j
[Original Message]
From: Kenneth Whistler <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Scenario: The UTC listens to you and defines some section of the PUA
as strong right-to-left by default for use in PUA-defined bidirectional
scripts. Somebody else is *already* using that section of the PUA
Peter Kirk wrote:
On 31/03/2004 14:25, [EMAIL PROTECTED] wrote:
Peter Kirk scripsit:
But, as Ken has just clarified, with NBSP Louis' neck may be
stretched rather uncomfortably, if not cut completely. Here is what I
don't want to see (fixed width font required):
Louis XVI was
guillotined
comments below.
Mark
__
http://www.macchiato.com
- Original Message -
From: "Peter Kirk" <[EMAIL PROTECTED]>
To: "Mark Davis" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Wed, 2004 Mar 31 19:15
Subject: Re: What is the princi
> > * Changed: bidi class of several characters
> Won't these fixes break applications out there? I.e., won't they turn
> previously conformant applications into non conformant ones?
And the other thing to understand about this particular change
is that it is the outcome of a years-long deba
On 31/03/2004 15:32, Ernest Cline wrote:
[Original Message]
From: Peter Kirk <[EMAIL PROTECTED]>
Ernest, I support your general ideas here. But I am concerned about the
implications of defining PUA characters with combining classes other
than zero. I can see this causing some confusion with n
On 31/03/2004 14:27, Mark Davis wrote:
While I disagree with most of what you've said on this list, it is not an
unreasonable proposal to change the default properties for some ranges of the
private use blocks. I don't think that this would, in practice, really disturb
any applications, because of
Well, I've decided to start what is probably a quixotic quest for
a better set of private use characters. Such a proposal will need
to be complete, but it had best be as simple as possible. That
leads me to my first question. Where is the Arabic Shaping
Class property normally taken care of? I
Marco Cimarosti scripsit:
> So far, my understanding was that the normative properties of existing code
> points were "carved in stone".
Not all normative properties are immutable. A normative property is
simply one which you have to get right if you claim conformance to
that part of Unicode:
> > Here is what I do want:
> >
> > Louis XVI was
> > guillotined in
> > 1793.
>
> Louis\ XVI was guillotined in 1793. If you aren't using TeX,
> and you're doing this type of justification in small columns,
> your program ought to provide a way to do this.
Other possible appro
On 31/03/2004 13:30, Kenneth Whistler wrote:
...
I think you're spitting into the wind if you think you can
force, through the character standardization process, the
major platform vendors to support the kind of PUA functionality
you are after, when they could do so *today* via much more
extensib
On 31/03/2004 14:25, [EMAIL PROTECTED] wrote:
Peter Kirk scripsit:
But, as Ken has just clarified, with NBSP Louis' neck may be stretched
rather uncomfortably, if not cut completely. Here is what I don't want
to see (fixed width font required):
Louis XVI was
guillotined in
1793.
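The no-break behaviour under discussion can be sketched roughly as follows. This uses Python's `textwrap`, which breaks only at ASCII whitespace, as a stand-in for a line breaker that honours NO-BREAK SPACE. The column widths and the `wrap` helper are arbitrary choices for illustration, and the sketch does not model justification, so it says nothing about how far a space may stretch.

```python
import textwrap

NBSP = "\u00A0"  # NO-BREAK SPACE

plain = "Louis XVI was guillotined in 1793."
glued = f"Louis{NBSP}XVI was guillotined in 1793."

def wrap(text, width):
    """Wrap without splitting inside words (hypothetical helper)."""
    # textwrap treats only ASCII whitespace as a break opportunity, so NBSP
    # keeps its neighbours on the same line.
    return textwrap.wrap(text, width=width, break_long_words=False)

for line in wrap(glued, 14):
    print(line)
```

In a narrow column, `wrap(plain, 8)` lets "XVI" fall to the second line, while `wrap(glued, 8)` keeps "Louis XVI" together at the cost of overflowing the column.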
> [Original Message]
> From: Peter Kirk <[EMAIL PROTECTED]>
>
> Ernest, I support your general ideas here. But I am concerned about the
> implications of defining PUA characters with combining classes other
> than zero. I can see this causing some confusion with normalisation etc.
> And it do
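As background to the concern about combining classes, here is a small sketch of the default properties that Python's `unicodedata` module reports for private use code points. The helper names are hypothetical, and the values reflect whatever Unicode version the Python build ships with.

```python
import unicodedata

# The three private use ranges (inclusive): BMP PUA, plane 15, plane 16.
PUA_RANGES = [(0xE000, 0xF8FF), (0xF0000, 0xFFFFD), (0x100000, 0x10FFFD)]

def is_private_use(ch):
    """True if ch lies in one of the private use ranges (hypothetical helper)."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)

def default_properties(ch):
    """Collect the properties discussed in this thread (hypothetical helper)."""
    return {
        "category": unicodedata.category(ch),
        "bidi": unicodedata.bidirectional(ch),
        "combining": unicodedata.combining(ch),
    }

sample = "\uE000"
props = default_properties(sample)
print(props)

# With combining class 0 the PUA character is a starter, so a following
# combining acute accent can neither reorder past it nor compose with it.
text = "a" + sample + "\u0301"
print(unicodedata.normalize("NFC", text) == text)
```

A private agreement that assigned a non-zero combining class to such a character would disagree with what a stock normalizer does with it, which is the confusion the message anticipates.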
On 31/03/2004 13:13, Peter Constable wrote:
...
E.g. SIL's Graphite technology can deal with RTL PUA characters, but then it isn't relying on system-supplied services to do complex-script shaping of text.
I am glad to hear this, as it at least offers some hope to those of us
who see the need
> Surely Unicode didn't waste two planes for something that
> no one can practically use.
Plane 15 and Plane 16 private use characters weren't the
invention of the UTC, by the way. They derive from the
original specification of ISO/IEC 10646-1. From
ISO/IEC 10646-1: 1993:
"The code positions
Didn't you send this out a few hours too early? :-)
--
Elliotte Rusty Harold
[EMAIL PROTECTED]
Effective XML (Addison-Wesley, 2003)
http://www.cafeconleche.org/books/effectivexml
http://www.amazon.com/exec/obidos/ISBN%3D0321150406/ref%3Dnosim/cafeaulaitA
Peter Kirk wrote:
Louis XVI was
guillotined in
1793.
Here is what I do want:
Louis XVI was
guillotined in
1793.
Louis\ XVI was guillotined in 1793. If you aren't using TeX,
and you're doing this type of justification in small columns,
your program ought to provide a way
From: <[EMAIL PROTECTED]>
> XML has become the de facto standard for fancy text. It is therefore
> useful to explore ways and means of bringing XML into plain text,
> since obviously plain text is simpler than, and superior to, fancy text.
> The current method involving & and < and > and / and w
While I disagree with most of what you've said on this list, it is not an
unreasonable proposal to change the default properties for some ranges of the
private use blocks. I don't think that this would, in practice, really disturb
any applications, because of #1 below.
I have, however, a few obser
Peter Kirk scripsit:
> But, as Ken has just clarified, with NBSP Louis' neck may be stretched
> rather uncomfortably, if not cut completely. Here is what I don't want
> to see (fixed width font required):
>
> Louis XVI was
> guillotined in
> 1793.
This, however, is a matter of presentat
"Mike Ayers" <[EMAIL PROTECTED]> writes:
> Support? ROFL! Call up one of those companies and tell them that
> you are having trouble displaying PUA fonts, eastern or otherwise. I'd like
> to snoop on that call.
Apple seemed pretty concerned about displaying PUA fonts on Mac OS X
Oops.
Well...
*That* was a day early.
Rick
On 31/03/2004 12:28, Ernest Cline wrote:
...
This is the kind of stuff the UTC refuses to start up by trying
to provide some subdivision of semantics in the PUA. *That* is
the principle, by the way, which guides the UTC position on
the PUA: Use at your own risk, by private agreement.
Whic
Peter Kirk wrote...
> I am undecided yet whether to make a formal proposal.
> Ken seems to suggest that this would be a waste of time -
Yes. I also think it would be a waste of time, but...
> although I can see some advantages in obtaining a formal rejection.
... I can also see some value in a
XML has become the de facto standard for fancy text. It is therefore
useful to explore ways and means of bringing XML into plain text,
since obviously plain text is simpler than, and superior to, fancy text.
The current method involving & and < and > and / and who knows what else
is obviously much
From: "Ernest Cline" <[EMAIL PROTECTED]>
> I'd have to take the time to list them, but a quick glance convinces
> me that there are at most several hundred combinations that would
> need to be supported if we limit things to just those combinations
> already in use. (it might take more, if for exa
Ernest suggested:
> There are currently some 10 totally unused planes, with not even any
> tentative plans for them. Allocating one or two of those into additional
> Private Use Areas with a variety of default characteristics instead of
> the monotonous default characteristics of the existing Privat
> No. The *only* way to maintain compatibility between your applications
> and the system software is to ensure that your applications only do things
> that are supported by the system software.
If what is meant here by "your applications" is any applications running on your
system, then that
The NBSP issue was extensively discussed a couple of years ago, I don't
remember in which list. In short, it was wrongly used by early web users as
a fixed width space, and there is such a vast legacy it cannot be changed.
However, there are other applications that use the intended meaning - see
IS
On 31/03/2004 12:27, fantasai wrote:
Peter Kirk wrote:
Louis XVI may have lost his head, but we don't
want his number also to fall off on to the next line, or even to
become too far separated from his name. We need to know what kind of
space to use to resist the guillotine!
NBSP
You should n
On 31/03/2004 12:40, Rick McGowan wrote:
Peter Kirk wrote...
... I have a real requirement. The UTC has the power to meet my requirement,
and to do so rather simply. I am asking them to meet it.
Actually, you are not asking UTC anything. You are discussing the PUA on a
public-access mai
On 31/03/2004 10:44, Mike Ayers wrote:
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
> Behalf Of Peter Kirk
> Sent: Wednesday, March 31, 2004 9:12 AM
> On 30/03/2004 16:30, Kenneth Whistler wrote:
> But
> what if users of certain other scripts e.g. RTL scripts want just a
> handful of PUA c
On 31/03/2004 10:44, Mike Ayers wrote:
...
>
> Well, I don't quite see why it is business sense for software companies
> to support the huge PUAs for variant CJK characters, outside
Support? ROFL! Call up one of those companies and tell them
that you are having trouble displaying PUA
Peter continued:
> Thanks for the clarification. I should say that the behaviour of NBSP
> suddenly reverted to what it had been in previous versions of the
> standard, although a perhaps inadvertent change was made in 4.0.0.
Even that is not correct.
The *Introduction* to UAX #14 was expanded
Peter Kirk wrote...
> ... I have a real requirement. The UTC has the power to meet my requirement,
> and to do so rather simply. I am asking them to meet it.
Actually, you are not asking UTC anything. You are discussing the PUA on a
public-access mail list. There's a big difference. This *is* t
> [Original Message]
> From: Kenneth Whistler <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
>
> Peter Kirk continued:
>
> > >You can do it privately. See above. But attempting to do such things
> > >in terms of formally specified usages of the PUA is an invitation
> > >to failure of interoperabi
On 31/03/2004 11:57, Kenneth Whistler wrote:
... To most people, a space is a space. To rather more, there
is a second kind of space which they expect to be non-breaking and often
also expect to be fixed width. (Those who had the latter expectation
have had a nasty surprise today because with t
Peter Kirk wrote:
Louis XVI may have lost his head, but we don't want
his number also to fall off on to the next line, or even to become too
far separated from his name. We need to know what kind of space to use
to resist the guillotine!
NBSP
You should not rely on fixed-width spaces to approxim
Language Analysis Systems, Inc. Unicode list reader scripsit:
> It sorta seems like the need to keep phrases like "Louis XIV" together
> is a valid one that deserves a solution, but it also seems fairly
> esoteric-- typesetters and people who give a lot of thought to the
> presentation of their tex
On 31/03/2004 08:49, Language Analysis Systems, Inc. Unicode list reader
wrote:
So perhaps the best thing to do in cases like Ernest's and mine, where a
fixed width non-breaking space is required, is to use FIGURE SPACE,
which I understand is non-breaking. But then perhaps this is too w
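A quick way to see which space characters carry a no-break marker is sketched below. It relies on the `<noBreak>` tag in the compatibility decomposition as reported by Python's `unicodedata`. The candidate list and the `no_break_spaces` helper are illustrative assumptions, and the actual line breaking behaviour is governed by the line break property in UAX #14, which this module does not expose.

```python
import unicodedata

# SPACE, NO-BREAK SPACE, FIGURE SPACE, NARROW NO-BREAK SPACE
CANDIDATES = ["\u0020", "\u00A0", "\u2007", "\u202F"]

def no_break_spaces(chars):
    """Keep characters whose decomposition is tagged <noBreak> (hypothetical helper)."""
    return [ch for ch in chars
            if unicodedata.decomposition(ch).startswith("<noBreak>")]

for ch in CANDIDATES:
    mapping = unicodedata.decomposition(ch) or "(none)"
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {mapping}")

# Compatibility normalization replaces FIGURE SPACE with a plain SPACE,
# so the no-break intent does not survive NFKC.
flattened = unicodedata.normalize("NFKC", "Louis\u2007XVI")
```

The width of FIGURE SPACE depends on the font, so whether it is too wide in a given context cannot be determined from character properties alone.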
Title: RE: What is the principle?
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
> Behalf Of Peter Kirk
> Sent: Wednesday, March 31, 2004 9:12 AM
> On 30/03/2004 16:30, Kenneth Whistler wrote:
> But
> what if users of certain other scripts e.g. RTL scripts want just a
> handful
On 30/03/2004 16:30, Kenneth Whistler wrote:
...
Uh, sorry, Peter, but the implications here are so much b, err, ...
baloney.
The majority of the world's scripts are left-to-right. They also
happen to be non-Western. There are more *Indic* scripts encoded
in the Unicode Standard than *Western
On: 2004-03-31 06:43:38 -0800 Peter Kirk scribed:
The only alternative I see is to rewrite from scratch the display
routines of my favourite OS. I think banging my head against walls is
likely to be faster. After all, even the hardest wall cracks eventually,
and my head is quite hard.
Bang on, O
>So perhaps the best thing to do in cases like Ernest's and mine, where a
>fixed width non-breaking space is required, is to use FIGURE SPACE,
>which I understand is non-breaking. But then perhaps this is too wide in
>some circumstances - in many fonts it is twice the regular width of SPACE.
Go
On 31/03/2004 08:08, Doug Ewell wrote:
...
The perception that no-one has yet implemented custom PUA properties
does not mean that doing so is prohibited or unworkable, any more than
the shortage of widely available rendering engines for the Tibetan and
Khmer encoding models implies that those mo
From: "Antoine Leca" <[EMAIL PROTECTED]>
> The French guides of styles (after all, we can use Unicode to write French
> as well as English, can't we?) generally say that NBSP should not be
> expanded on justification. I do not know right now (I miss access to
> definitive references) if this is gen
Peter Kirk wrote:
>> Which I assume means: "it's wrong for Unicode to make ANY property
>> pronouncements for ANY PUA characters, since that defines them, and
>> removes the P from the Use."
>
> This is of course a principle which they have already broken, as they
> have defined "default" propert
Kent:
Your doc says,
And Ó should be ordered as Ò followed by í (**which is the logical sequence, despite
the Unicode compatibility decomposition**).
What do you mean here by "logical sequence"? That that's how it should be interpreted
phonologically and for sorting purposes, or that that
On 30/03/2004 18:01, fantasai wrote:
Ernest Cline wrote:
The main usage is with compound words such as "ice cream" or
"Louis XIV" or commercial phrases such as "Camry SE" where for
esthetic reasons an author would prefer that the space not expand
upon justification,
Given wide enough measures,
[EMAIL PROTECTED] wrote:
> Thai (and Lao, whose encoding closely parallels that of Thai) are
> encoded in Unicode on unique principles: by a straight left-to-right
> typewriter-style encoding. This was done for compatibility with the
> pervasive Thai 8-bit standard. It also means that for colla
On Tuesday, March 30, 2004 11:42 PM, Ernest Cline va escriure:
> The main usage is with compound words such as "ice cream" or
> "Louis XIV" or commercial phrases such as "Camry SE" where for
> esthetic reasons an author would prefer that the space not expand
> upon justification,
Well, as one tha
On 30/03/2004 16:46, Kenneth Whistler wrote:
...
Work it out. Any proposal to assign property ranges into the PUA
would run up on the rocks of all the details. And *then* it would
meet a stonewall in the UTC. And *then* it would meet another stonewall
in SC2.
Quit banging your head against the wa
On 30/03/2004 17:32, Michael Everson wrote:
At 17:02 -0800 2004-03-30, Mike Ayers wrote:
I feel obligated to take this one step further - these folks are
forgetting that "P" stands for "private". Their use of this space is
their own problem, in all senses. It does not seem reasonable to me
t
From: "Michael Everson" <[EMAIL PROTECTED]>
> At 17:02 -0800 2004-03-30, Mike Ayers wrote:
> >I feel obligated to take this one step further - these folks are
> >forgetting that "P" stands for "private". Their use of this space
> >is their own problem, in all senses. It does not seem reasonable t
Rick McGowan wrote:
> Unicode 4.0.1 has been released! [...]
> The main new features in Unicode 4.0.1 are the following:
> [...]
> 3. Unicode Character Database:
> [...]
> * Changed: general category of U+200B ZERO WIDTH SPACE
> * Changed: bidi class of several characters
(If I am aski
From: "Kenneth Whistler" <[EMAIL PROTECTED]>
> Consider another example. The normalization algorithm has to work
> for *all* Unicode code points, assigned or not, because it guarantees
> stability into the future when characters are encoded at code points
> which were previously unencoded. It also,
From: "Doug Ewell" <[EMAIL PROTECTED]>
To: "Unicode Mailing List" <[EMAIL PROTECTED]>
Cc: "Kenneth Whistler" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Wednesday, March 31, 2004 8:38 AM
Subject: PUA properties, default or otherwise (was: Re: What is the principle?)
> This discussion has focus