Rick McGowan writes:
> John Cowan suggested...
> > We will never come close to exceeding this limit. Essentially all new
> > combining characters are either class 0 or fall into one of the
> > 200-range positional classes.
>
> Or 9, for viramas.
Or 1, for overlays. Don't forget them...
Or 7, f
Of course, as usual, this is my opinion. UTC hasn't actually made any
proclamations about what will or won't be done in terms of the classes or
what kinds of classes might be assigned in the future.
Rick
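
The class numbers mentioned above can be looked up directly. What follows is a minimal sketch, not anything posted in the thread: it uses Python's standard unicodedata module, and the sample characters and the helper name combining_classes are chosen here purely for illustration. The values printed depend on the Unicode version bundled with the interpreter.

```python
import unicodedata

# Illustrative sample marks (an assumption of this sketch, not from the thread).
SAMPLES = {
    "U+0338 COMBINING LONG SOLIDUS OVERLAY": "\u0338",
    "U+093C DEVANAGARI SIGN NUKTA": "\u093C",
    "U+094D DEVANAGARI SIGN VIRAMA": "\u094D",
    "U+0323 COMBINING DOT BELOW": "\u0323",
    "U+0301 COMBINING ACUTE ACCENT": "\u0301",
}


def combining_classes(samples):
    """Map each label to the canonical combining class of its character."""
    return {label: unicodedata.combining(ch) for label, ch in samples.items()}


if __name__ == "__main__":
    for label, ccc in combining_classes(SAMPLES).items():
        print(f"{ccc:3d}  {label}")
```

In the Unicode Character Database the overlay has class 1, the nukta class 7, the virama class 9, and the below and above marks fall in the 200-range positional classes (220 and 230).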
> John Cowan suggested...
>
> > We will never come close to exceeding this limit.
On 25/11/2003 08:55, Doug Ewell wrote:
Normalization may or may not have an effect on compression. It has
definitely been shown to have an effect on Hebrew combining marks.
I must ask, however, that we try to keep these issues separate in
discussion, and not let the compression topic, if there is
John Cowan suggested...
> We will never come close to exceeding this limit. Essentially all new
> combining characters are either class 0 or fall into one of the 200-range
> positional classes.
Or 9, for viramas.
One take-home point is that there won't be any more "fixed position"
classes add
Sent: Tue, 2003 Nov 25 11:18
Subject: Re: Normalisation stability, was: Compression through normalization
> Philippe Verdy wrote:
>
> > I'm not convinced that there's a significant improvement when only
> > checking for normalization but not performing it. It req
On 25/11/2003 11:15, John Cowan wrote:
Peter Kirk scripsit:
If receivers are expected to check for normalisation, they are
presumably expected also to normalise
Not so. An alternative behavior, which is preferred in certain circumstances,
is to reject the input, or at least to advise h
Peter Kirk scripsit:
> If receivers are expected to check for normalisation, they are
> presumably expected also to normalise
Not so. An alternative behavior, which is preferred in certain circumstances,
is to reject the input, or at least to advise higher layers that the input
may be invalid.
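
The "check, then reject" behavior described above might look roughly like the following. This is a sketch under stated assumptions: the choice of NFC, the function name accept_if_nfc and the exception NotNormalizedError are hypothetical, and unicodedata.is_normalized requires Python 3.8 or later.

```python
import unicodedata


class NotNormalizedError(ValueError):
    """Hypothetical error used to tell higher layers the input may be invalid."""


def accept_if_nfc(text: str) -> str:
    """Return text unchanged if it is already in NFC; otherwise reject it.

    The receiver only checks for normalization and never normalizes itself.
    """
    if not unicodedata.is_normalized("NFC", text):
        raise NotNormalizedError("input is not in Normalization Form C")
    return text


if __name__ == "__main__":
    print(accept_if_nfc("caf\u00e9"))   # precomposed e-acute: accepted
    try:
        accept_if_nfc("cafe\u0301")     # e + combining acute: rejected
    except NotNormalizedError as err:
        print("rejected:", err)
```

Whether to raise, log, or pass a flag upward is a design choice for the receiving application; the sketch raises only to keep the example short.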
Philippe Verdy wrote:
> I'm not convinced that there's a significant improvement when only
> checking for normalization but not performing it. It requires at least
> a list of the characters that are acceptable in a normalization form,
> as well as their combining classes.
UAX #15 begs to differ. S
On 25/11/2003 10:03, John Cowan wrote:
... And as for
canonical equivalence, the most efficient way to compare strings for
it is to normalize both of them in some way and then do a raw
binary compare. Since it adds efficiency to normalize only once,
it is worthwhile to define a few normalization
Peter Kirk wrote:
> Well, Doug, I see your point; different topics should be kept
> separate. But I changed the subject line precisely because the thread
> has shifted from discussion of compression to a general discussion of
> normalisation stability.
That's true; most people would probably not
John Cowan writes:
> Since it adds efficiency to normalize only once,
> it is worthwhile to define a few normalization forms and urge
> people to produce text in one of them, so that receivers need not
> normalize but need only check for normalization, typically much cheaper.
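
As a rough illustration of the quoted argument, the sketch below compares two strings for canonical equivalence by normalizing both and then comparing the results directly, and contrasts that with a receiver that only checks that its inputs are already normalized. The function names and the choice of NFD and NFC are assumptions of this sketch; it uses Python's standard unicodedata module (is_normalized needs Python 3.8 or later) and makes no claim about relative cost in any particular implementation.

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """Normalize both strings to NFD, then do a plain code point comparison."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def compare_prenormalized(a: str, b: str) -> bool:
    """Compare strings that the sender promised to deliver in NFC.

    The receiver only verifies the promise; it does not normalize.
    """
    for s in (a, b):
        if not unicodedata.is_normalized("NFC", s):
            raise ValueError("input is not in NFC")
    return a == b


if __name__ == "__main__":
    # Precomposed versus decomposed e-acute.
    print(canonically_equivalent("\u00e9", "e\u0301"))
    # Dot below (class 220) and acute (class 230) in either order.
    print(canonically_equivalent("a\u0323\u0301", "a\u0301\u0323"))
```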
I'm not convinced tha
Philippe Verdy scripsit:
> I just wonder however why it was "crucial" (as Unicode says in its
> Definitions chapter) to expect a relative order of distinct non-zero
> combining classes. For me these combining classes are arbitrary not only in
> their absolute values as they are now, but even their
From: Peter Kirk [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, 25 November 2003 17:06
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: Re: Normalisation stability, was: Compression through
> normalization
>
>
> On 25/11/2003 07:22, Philippe Verdy wrote:
>
Normalization may or may not have an effect on compression. It has
definitely been shown to have an effect on Hebrew combining marks.
I must ask, however, that we try to keep these issues separate in
discussion, and not let the compression topic, if there is to be any,
degenerate into a wing of t
On 25/11/2003 07:22, Philippe Verdy wrote:
...
Composition exclusions, like the relative order of combining classes, have a
lower impact: they don't affect canonical equivalence of strings, and thus
won't affect applications based on the Unicode C10 definition; they
are important only to
> >So it's the absence of stability which would make this rearrangement
> >of normalization forms impossible...
>
> Canonical equivalence is unaffected if combining classes are rearranged,
> though not if they are split or joined. It is only the normalised forms
> of strings which may be changed
On 24/11/2003 16:56, Philippe Verdy wrote:
Peter Kirk writes:
If conformance clause C10 is taken to be operable at all levels, this
makes a nonsense of the concept of normalisation stability within
databases etc.
I don't think that the stability of normalization influences this: as long a