Re: Code point vs. scalar value

2013-09-17 Thread Stephan Stiller
[AF:] It is the wording in your posts that adds to the confusion. My fundamental point is, has been, and continues to be that whenever people use the more general word "code point" instead of the more appropriate "scalar value", that will "add to the confusion". If you make the presupposition

Re: Code point vs. scalar value

2013-09-17 Thread Asmus Freytag
On 9/17/2013 2:55 PM, Stephan Stiller wrote: [AF:] It is the wording in your posts that adds to the confusion. My fundamental point is, has been, and continues to be that whenever people use the more general word "code point" instead of the more appropriate "scalar value", that will "add to th

Re: Code point vs. scalar value

2013-09-17 Thread Stephan Stiller
On 9/17/2013 5:27 PM, Asmus Freytag wrote: On 9/17/2013 2:55 PM, Stephan Stiller wrote: [AF:] It is the wording in your posts that adds to the confusion. My fundamental point is, has been, and continues to be that whenever people use the more general word "code point" instead of the more appr

Re: Code point vs. scalar value

2013-09-17 Thread Stephan Stiller
In what way does UTF-16 "use" surrogate code /points/? An encoding form is a mapping. Let's look at this mapping: * One _inputs_ scalar values (not surrogate code points). * The encoding form will _output_ a short sequence of encoding form–specific code units. (Various voices on this list h
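
The mapping described in this message can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name utf16_encode_scalar is hypothetical (it is not from the thread or from any library), and the sketch takes the position argued here, namely that the input domain is scalar values and that surrogate code points are rejected rather than encoded.

```python
from typing import List


def utf16_encode_scalar(sv: int) -> List[int]:
    """Map one Unicode scalar value to its UTF-16 code unit sequence.

    Hypothetical helper for illustration only.
    """
    # Scalar values are the code points 0..10FFFF minus the surrogate
    # range D800..DFFF; anything else is outside the input domain.
    if not (0 <= sv <= 0x10FFFF) or 0xD800 <= sv <= 0xDFFF:
        raise ValueError(f"U+{sv:04X} is not a Unicode scalar value")

    # BMP scalar values map to a single 16-bit code unit.
    if sv < 0x10000:
        return [sv]

    # Supplementary scalar values map to a pair of surrogate code units:
    # the top 10 bits of the offset go into the high (lead) unit and the
    # bottom 10 bits into the low (trail) unit.
    offset = sv - 0x10000
    return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]


if __name__ == "__main__":
    for sv in (0x41, 0x1F4A2, 0x10FFFF):
        units = utf16_encode_scalar(sv)
        print(f"U+{sv:04X} ->", " ".join(f"{u:04X}" for u in units))
```

Note that the values D800..DFFF appear only on the output side, as code units. On the input side they are rejected, which is the distinction this message draws.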

Re: Code point vs. scalar value

2013-09-17 Thread Philippe Verdy
2013/9/17 Stephan Stiller > [AF:] Once you add the UTF-prefix, you are, by force, speaking of code > units. > > So the high-low distinction for "surrogate" code points is misleading, and > the "surrogate" attribute for "code point" shouldn't be there, because, as > I've in fact written in a much

Re: Code point vs. scalar value

2013-09-17 Thread Philippe Verdy
2013/9/18 Stephan Stiller > In what way does UTF-16 "use" surrogate code *points*? An encoding form > is a mapping. Let's look at this mapping: > >- One *inputs* scalar values (not surrogate code points). > > In fact the input is one code point. Then only if that code point has a scalar val

Re: Code point vs. scalar value

2013-09-17 Thread Asmus Freytag
On 9/17/2013 8:40 PM, Philippe Verdy wrote: In what way does UTF-16 "use" surrogate code /points/? An encoding form is a mapping. Let's look at this mapping: * One _inputs_ scalar values (not surrogate code points). In fact the input is one code point. Then only if that code poi

Re: Code point vs. scalar value

2013-09-18 Thread Stephan Stiller
On 9/17/2013 10:54 PM, Asmus Freytag wrote: On 9/17/2013 8:40 PM, Philippe Verdy wrote: In what way does UTF-16 "use" surrogate code /points/? An encoding form is a mapping. Let's look at this mapping: * One _inputs_ scalar values (not surrogate code points). In fact the input i

Re: Code point vs. scalar value

2013-09-18 Thread Stephan Stiller
On 9/18/2013 12:02 AM, Stephan Stiller wrote: That still doesn't mean surrogates are "used by UTF-16" => 'That still doesn't mean surrogate _code point_s are "used by UTF-16"'

Re: Code point vs. scalar value

2013-09-18 Thread Philippe Verdy
Yes, because surrogate "code units" are those used by UTF-16 for which a standard behavior is formally defined. But there are still many other encodings than standard UTF-16, which uses those code points (don't forget that not all abstract characters are encoded in the UCS, and surrogates are cons

Re: Code point vs. scalar value

2013-09-18 Thread Asmus Freytag
On 9/18/2013 2:42 AM, Philippe Verdy wrote: There are "scalar values" used in so many other unrelated domains (notably in mathematics, where a scalar value is an identifiable object that remains constant in relation with some operations and independent of its context, unlike functions, differen

Re: Code point vs. scalar value

2013-09-18 Thread Stephan Stiller
On 9/18/2013 2:42 AM, Philippe Verdy wrote: There are "scalar values" used in so many other unrelated domains [...] There is no risk for confusion with vectors or complex numbers or reals or whatnot. On 9/18/2013 8:34 AM, Asmus Freytag wrote: I concur. Codepoint is the accepted way of referrin

Re: Code point vs. scalar value

2013-09-18 Thread Philippe Verdy
2013/9/18 Stephan Stiller > On 9/18/2013 2:42 AM, Philippe Verdy wrote: > > There are "scalar values" used in so many other unrelated domains [...] > > There is no risk for confusion with vectors or complex numbers or reals or > whatnot. > Yes there are such risks. I gave a meaningful example w

Re: Code point vs. scalar value

2013-09-18 Thread Asmus Freytag
On 9/18/2013 3:14 PM, Philippe Verdy wrote: I would propose exactly the opposite of what you want: avoid using "scalar value" alone. But only speak about "Unicode scalar value character property". If it is a property, it would be a code point property... Still, I support your general point.

Re: Code point vs. scalar value

2013-09-18 Thread Philippe Verdy
2013/9/19 Asmus Freytag > On 9/18/2013 3:14 PM, Philippe Verdy wrote: > > I would propose exactly the opposite of what you want: avoid using "scalar > value" alone. But only speak about "Unicode scalar value character > property". > > > If it is a property, it would be a code point property... >

Re: Code point vs. scalar value

2013-09-18 Thread Markus Scherer
On Wed, Sep 18, 2013 at 3:52 PM, Philippe Verdy wrote: > But the UCD and contents of the standard text are listing... oh well... > only the so-called "character properties" > Untrue. There are definitely code point properties, and surrogates have non-trivial property values for Block, Derived_Ag

Re: Code point vs. scalar value

2013-09-18 Thread Philippe Verdy
The UCD is the "Unicode Character Database", not the "Unicode Codepoints Database", and we have very frequently used the term "character properties" (the expression is also found outside TUS, in the names of many APIs, even if their input is a code point, or a "character" in the meaning of the

Re: Code point vs. scalar value

2013-09-18 Thread Stephan Stiller
Instead of selectively agreeing with Philippe's writing, it would be good to tell us why the Glossary claims that surrogate code points are "[r]eserved for use by UTF-16" and why there are similar statements in the Unicode book if [AF:] [o]nce you add the UTF-prefix, you are, by force, speaking of

Re: Code point vs. scalar value

2013-09-19 Thread Hans Aberg
On 18 Sep 2013, at 04:57, Stephan Stiller wrote: > In what way does UTF-16 "use" surrogate code points? An encoding form is a > mapping. Let's look at this mapping: > • One inputs scalar values (not surrogate code points). > • The encoding form will output a short sequence of encodin

Re: Code point vs. scalar value

2013-09-19 Thread Asmus Freytag
On 9/19/2013 6:32 AM, Hans Aberg wrote: On 18 Sep 2013, at 04:57, Stephan Stiller wrote: In what way does UTF-16 "use" surrogate code points? An encoding form is a mapping. Let's look at this mapping: • One inputs scalar values (not surrogate code points). • The encoding form

Re: Code point vs. scalar value

2013-09-19 Thread Philippe Verdy
2013/9/19 Asmus Freytag > The legacy difference was the existence of UCS-2 in parallel with UTF-16. > Correct. But UCS-2 is still not extinct, even if it is no longer used for exchanging interoperable plain-text. UCS-2 remains widely used for storing arbitrary data in "strings", without any one o

RE: Code point vs. scalar value

2013-09-19 Thread Whistler, Ken
Stephan Stiller seems unconvinced by the various attempts to explain the situation. Perhaps an authoritative explanation of the textual history might assist. Stephan demands an answer: I want to know why the Glossary claims that surrogate code points are "[r]eserved for use by UTF-16". Reason

Re: Code point vs. scalar value

2013-09-20 Thread Mark Davis ☕
Nicely stated. Mark — Il meglio è l’inimico del bene — On Thu, Sep 19, 2013 at 11:21 PM, Whistler, Ken wrote: > Stephan Stiller seems unconvinced by the various attempts to explain the > situation. Perhaps an authoritative explanation o

Code point vs. scalar value (was: RE: Origin of Ellipsis (was: RE: Empty set))

2013-09-16 Thread Doug Ewell
Oh, for heaven's sake: Code Point. (1) Any value in the Unicode codespace; that is, the range of integers from 0 to 10FFFF₁₆. (See definition D10 in Section 3.4, Characters and Encoding.) Not all code points are assigned to encoded characters. See code point type. (2) A value, or position, for a
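
The distinction the subject line draws can be stated as three small predicates. This is a sketch, not text from the Glossary: the function names are hypothetical, and the ranges are the codespace 0 to 10FFFF (hexadecimal) and the surrogate range D800 to DFFF.

```python
CODESPACE_MAX = 0x10FFFF      # last code point in the Unicode codespace
SURROGATE_FIRST = 0xD800      # first surrogate code point
SURROGATE_LAST = 0xDFFF       # last surrogate code point


def is_code_point(n: int) -> bool:
    """True for any integer in the Unicode codespace."""
    return 0 <= n <= CODESPACE_MAX


def is_surrogate_code_point(n: int) -> bool:
    """True for code points in the surrogate range."""
    return SURROGATE_FIRST <= n <= SURROGATE_LAST


def is_scalar_value(n: int) -> bool:
    """True for code points that are not surrogate code points."""
    return is_code_point(n) and not is_surrogate_code_point(n)


if __name__ == "__main__":
    total = CODESPACE_MAX + 1
    surrogates = SURROGATE_LAST - SURROGATE_FIRST + 1
    print("code points:  ", total)
    print("surrogates:   ", surrogates)
    print("scalar values:", total - surrogates)
```

Every scalar value is a code point, but not the other way around: the 2,048 surrogate code points are the only difference between the two sets.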

Re: Code point vs. scalar value (was: RE: Origin of Ellipsis (was: RE: Empty set))

2013-09-16 Thread Asmus Freytag
On 9/16/2013 1:41 PM, Doug Ewell wrote: This has nothing to do with UTF-Anything or Normalization Form Anything. But all with keeping the discussion alive for any reason, however insignificant :) A./

RE: Code point vs. scalar value (was: RE: Origin of Ellipsis (was: RE: Empty set))

2013-09-16 Thread Doug Ewell
Asmus Freytag wrote: > On 9/16/2013 1:41 PM, Doug Ewell wrote: > >> This has nothing to do with UTF-Anything or Normalization Form >> Anything. > > But all with keeping the discussion alive for any reason, however > insignificant :) I guess it was too soon to try to come back to the list. 💢 --

Re: Code point vs. scalar value (was: RE: Origin of Ellipsis (was: RE: Empty set))

2013-09-16 Thread Asmus Freytag
On 9/16/2013 2:18 PM, Doug Ewell wrote: Asmus Freytag wrote: On 9/16/2013 1:41 PM, Doug Ewell wrote: This has nothing to do with UTF-Anything or Normalization Form Anything. But all with keeping the discussion alive for any reason, however insignificant :) I guess it was too soon to try to