RE: Normalisation stability, was: Compression through normalization

2003-11-25 Thread Philippe Verdy
Rick McGowan writes: > John Cowan suggested... > > We will never come close to exceeding this limit. Essentially all new > > combining characters are either class 0 or fall into one of the > > 200-range positional classes. > > Or 9, for viramas. Or 1, for overlays. Don't forget them... Or 7, f

Re: Normalisation stability, was: Compression through normalization

2003-11-25 Thread Rick McGowan
Of course, as usual, this is my opinion. UTC hasn't actually made any proclamations about what will or won't be done in terms of the classes or what kinds of classes might be assigned in the future. Rick > John Cowan suggested... > > > We will never come close to exceeding this limit.

Re: Normalisation stability, was: Compression through normalization

2003-11-25 Thread Peter Kirk
On 25/11/2003 08:55, Doug Ewell wrote: Normalization may or may not have an effect on compression. It has definitely been shown to have an effect on Hebrew combining marks. I must ask, however, that we try to keep these issues separate in discussion, and not let the compression topic, if there is

Re: Normalisation stability, was: Compression through normalization

2003-11-25 Thread Rick McGowan
John Cowan suggested... > We will never come close to exceeding this limit. Essentially all new > combining characters are either class 0 or fall into one of the 200-range > positional classes. Or 9, for viramas. One take-home point is that there won't be any more "fixed position" classes add
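The class values named above (0 for non-combining characters, 1 for overlays, 9 for viramas, and the 200-range positional classes) can be inspected directly. The following is only an illustrative sketch using Python's standard `unicodedata` module; the sample characters are chosen here for illustration and are not taken from the thread.

```python
import unicodedata

# Sample characters (an assumption for illustration, not from the thread).
SAMPLES = {
    "LATIN SMALL LETTER A": "\u0061",
    "COMBINING LONG SOLIDUS OVERLAY": "\u0338",
    "DEVANAGARI SIGN VIRAMA": "\u094d",
    "COMBINING DOT BELOW": "\u0323",
    "COMBINING ACUTE ACCENT": "\u0301",
}

def combining_classes(samples):
    """Map each sample name to its canonical combining class."""
    return {name: unicodedata.combining(ch) for name, ch in samples.items()}

if __name__ == "__main__":
    for name, ccc in combining_classes(SAMPLES).items():
        print(f"{ccc:3d}  {name}")
```

The values reported depend on the Unicode version bundled with the Python interpreter in use.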

Re: Normalisation stability, was: Compression through normalization

2003-11-25 Thread Mark Davis
Sent: Tue, 2003 Nov 25 11:18 Subject: Re: Normalisation stability, was: Compression through normalization > Philippe Verdy wrote: > > > I'm not convinced that there's a significant improvement when only > > checking for normalization but not performing it. It req

Re: Normalisation stability, was: Compression through normalization

2003-11-25 Thread Peter Kirk
On 25/11/2003 11:15, John Cowan wrote: Peter Kirk scripsit: If receivers are expected to check for normalisation, they are presumably expected also to normalise Not so. An alternative behavior, which is preferred in certain circumstances, is to reject the input, or at least to advise h

Re: Normalisation stability, was: Compression through normalization

2003-11-25 Thread John Cowan
Peter Kirk scripsit: > If receivers are expected to check for normalisation, they are > presumably expected also to normalise Not so. An alternative behavior, which is preferred in certain circumstances, is to reject the input, or at least to advise higher layers that the input may be invalid.
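The "check and reject" behavior described above might look roughly like the sketch below. It assumes Python 3.8 or later, where `unicodedata.is_normalized` is available; the names `NotNormalizedError` and `accept_nfc` are hypothetical and exist only for this illustration.

```python
import unicodedata

class NotNormalizedError(ValueError):
    """Hypothetical error a receiver raises instead of silently normalizing."""

def accept_nfc(text: str) -> str:
    """Return text unchanged if it is already in NFC; otherwise reject it.

    The receiver only checks for normalization; it never rewrites the input.
    """
    if not unicodedata.is_normalized("NFC", text):
        raise NotNormalizedError("input is not in NFC")
    return text

if __name__ == "__main__":
    accept_nfc("\u00e9")          # precomposed e-acute: accepted
    try:
        accept_nfc("e\u0301")     # e + combining acute: rejected
    except NotNormalizedError as exc:
        print("rejected:", exc)
```

A higher layer can catch the error and decide whether to normalize, refuse, or flag the input, which is the alternative to normalizing on receipt.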

Re: Normalisation stability, was: Compression through normalization

2003-11-25 Thread Doug Ewell
Philippe Verdy wrote: > I'm not convinced that there's a significant improvement when only > checking for normalization but not performing it. It requires at least > a list of the characters that are acceptable in a normalization form, and > as well their combining classes. UAX #15 begs to differ. S

Re: Normalisation stability, was: Compression through normalization

2003-11-25 Thread Peter Kirk
On 25/11/2003 10:03, John Cowan wrote: ... And as for canonical equivalence, the most efficient way to compare strings for it is to normalize both of them in some way and then do a raw binary compare. Since it adds efficiency to normalize only once, it is worthwhile to define a few normalization
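The comparison strategy quoted above (normalize both strings the same way, then compare them code point by code point) can be sketched as follows. This is a minimal illustration using Python's standard `unicodedata` module; the function name `canonically_equivalent` and the choice of NFD as the default form are assumptions made here, not something stated in the thread.

```python
import unicodedata

def canonically_equivalent(a: str, b: str, form: str = "NFD") -> bool:
    """Compare two strings for canonical equivalence.

    Both strings are brought to the same canonical normalization form
    (NFD or NFC), after which a plain equality test is sufficient.
    """
    if form not in ("NFC", "NFD"):
        raise ValueError("use a canonical form: NFC or NFD")
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

if __name__ == "__main__":
    precomposed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE
    decomposed = "e\u0301"      # e + COMBINING ACUTE ACCENT
    print(precomposed == decomposed)
    print(canonically_equivalent(precomposed, decomposed))
```

If stored text is already known to be in one form, only the incoming string needs to be normalized, which is the efficiency argument for agreeing on a form in advance.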

Re: Normalisation stability, was: Compression through normalization

2003-11-25 Thread Doug Ewell
Peter Kirk wrote: > Well, Doug, I see your point; different topics should be kept > separate. But I changed the subject line precisely because the thread > has shifted from discussion of compression to a general discussion of > normalisation stability. That's true; most people would probably not

RE: Normalisation stability, was: Compression through normalization

2003-11-25 Thread Philippe Verdy
John Cowan writes: > Since it adds efficiency to normalize only once, > it is worthwhile to define a few normalization forms and urge > people to produce text in one of them, so that receivers need not > normalize but need only check for normalization, typically much cheaper. I'm not convinced tha

Re: Normalisation stability, was: Compression through normalization

2003-11-25 Thread John Cowan
Philippe Verdy scripsit: > I just wonder however why it was "crucial" (as Unicode says in its > Definitions chapter) to expect a relative order of distinct non-zero > combining classes. For me these combining classes are arbitrary not only on > their absolute value as they are now, but even their

RE: Normalisation stability, was: Compression through normalization

2003-11-25 Thread Philippe Verdy
From: Peter Kirk [mailto:[EMAIL PROTECTED] > Sent: Tuesday 25 November 2003 17:06 > To: [EMAIL PROTECTED] > Cc: [EMAIL PROTECTED] > Subject: Re: Normalisation stability, was: Compression through > normalization > > > On 25/11/2003 07:22, Philippe Verdy wrote: >

Re: Normalisation stability, was: Compression through normalization

2003-11-25 Thread Doug Ewell
Normalization may or may not have an effect on compression. It has definitely been shown to have an effect on Hebrew combining marks. I must ask, however, that we try to keep these issues separate in discussion, and not let the compression topic, if there is to be any, degenerate into a wing of t

Re: Normalisation stability, was: Compression through normalization

2003-11-25 Thread Peter Kirk
On 25/11/2003 07:22, Philippe Verdy wrote: ... Composition exclusions have a lower impact as well as the relative orders of canonical classes, as they don't affect canonical equivalence of strings, and thus won't affect applications based on the Unicode C10 definition; they are important only to

RE: Normalisation stability, was: Compression through normalization

2003-11-25 Thread Philippe Verdy
> >So it's the absence of stability which would make impossible this > >rearrangement of normalization forms... > > Canonical equivalence is unaffected if combining classes are rearranged, > though not if they are split or joined. It is only the normalised forms > of strings which may be changed

Normalisation stability, was: Compression through normalization

2003-11-25 Thread Peter Kirk
On 24/11/2003 16:56, Philippe Verdy wrote: Peter Kirk writes: If conformance clause C10 is taken to be operable at all levels, this makes a nonsense of the concept of normalisation stability within databases etc. I don't think that the stability of normalization influences this: as long a