Nicely stated.
Mark https://plus.google.com/114199149796022210033
*Il meglio è l’inimico del bene* (the best is the enemy of the good)
On Thu, Sep 19, 2013 at 11:21 PM, Whistler, Ken ken.whist...@sap.com wrote:
Stephan Stiller seems unconvinced by the various attempts to explain the
situation. Perhaps an
2013/9/19 Asmus Freytag asm...@ix.netcom.com
The legacy difference was the existence of UCS-2 in parallel with UTF-16.
Correct. But UCS-2 is still not extinct, even if it is no longer used for
exchanging interoperable plain-text.
UCS-2 remains widely used for storing arbitrary data in strings,
Stephan Stiller seems unconvinced by the various attempts to explain the
situation. Perhaps an authoritative explanation of the textual history might
assist.
Stephan demands an answer:
I want to know why the Glossary claims that surrogate code points are
[r]eserved for use by UTF-16.
Reason
On 9/17/2013 8:40 PM, Philippe Verdy wrote:
In what way does UTF-16 use surrogate code /points/? An encoding
form is a mapping. Let's look at this mapping:
* One _inputs_ scalar values (not surrogate code points).
In fact the input is one code point.
Then only if that code
On 9/17/2013 10:54 PM, Asmus Freytag wrote:
On 9/17/2013 8:40 PM, Philippe Verdy wrote:
In what way does UTF-16 use surrogate code /points/? An
encoding form is a mapping. Let's look at this mapping:
* One _inputs_ scalar values (not surrogate code points).
In fact the input is
On 9/18/2013 12:02 AM, Stephan Stiller wrote:
That still doesn't mean surrogates are used by UTF-16
= 'That still doesn't mean surrogate_code point_s are used by UTF-16'
On 9/18/2013 2:42 AM, Philippe Verdy wrote:
There are scalar values used in so many other unrelated domains
(notably in mathematics, where a scalar value is an identifiable
object that remains constant in relation with some operations and
independent of its context, unlike functions,
On 9/18/2013 2:42 AM, Philippe Verdy wrote:
There are scalar values used in so many other unrelated domains [...]
There is no risk for confusion with vectors or complex numbers or reals
or whatnot.
On 9/18/2013 8:34 AM, Asmus Freytag wrote:
I concur. Codepoint is the accepted way of referring
2013/9/18 Stephan Stiller stephan.stil...@gmail.com
On 9/18/2013 2:42 AM, Philippe Verdy wrote:
There are scalar values used in so many other unrelated domains [...]
There is no risk for confusion with vectors or complex numbers or reals or
whatnot.
Yes there are such risks. I gave a
On 9/18/2013 3:14 PM, Philippe Verdy wrote:
I would propose exactly the opposite of what you want: avoid using
scalar value alone. But only speak about 'Unicode scalar value
character property.
If it is a property, it would be a code point property...
Still, I support your general point.
2013/9/19 Asmus Freytag asm...@ix.netcom.com
On 9/18/2013 3:14 PM, Philippe Verdy wrote:
I would propose exactly the opposite of what you want: avoid using scalar
value alone. But only speak about 'Unicode scalar value character
property.
If it is a property, it would be a code point
On Wed, Sep 18, 2013 at 3:52 PM, Philippe Verdy verd...@wanadoo.fr wrote:
But the UCD and contents of the standard text are listing... oh well...
only the so-called character properties
Untrue. There are definitely code point properties, and surrogates have
non-trivial property values for
The UCD is the Unicode Character Database, not the Unicode Codepoints
Database, and we have very frequently used the term character
properties (the expression is also found outside TUS, in the names of many
APIs, even if their input is a code point, or a character in the meaning
of the
Instead of selectively agreeing with Philippe's writing, it would be
good to tell us why the Glossary claims that surrogate code points are
[r]eserved for use by UTF-16 and why there are similar statements in
the Unicode book if
[AF:] [o]nce you add the UTF-prefix, you are, by force, speaking of
[AF:]
It is the wording in your posts that adds to the confusion.
My fundamental point is, has been, and continues to be that whenever
people use the more general word code point instead of the more
appropriate scalar value, that will add to the confusion. If you
make the presupposition
On 9/17/2013 2:55 PM, Stephan Stiller wrote:
[AF:]
It is the wording in your posts that adds to the confusion.
My fundamental point is, has been, and continues to be that whenever
people use the more general word code point instead of the more
appropriate scalar value, that will add to the
On 9/17/2013 5:27 PM, Asmus Freytag wrote:
On 9/17/2013 2:55 PM, Stephan Stiller wrote:
[AF:]
It is the wording in your posts that adds to the confusion.
My fundamental point is, has been, and continues to be that whenever
people use the more general word code point instead of the more
In what way does UTF-16 use surrogate code /points/? An encoding form
is a mapping. Let's look at this mapping:
* One _inputs_ scalar values (not surrogate code points).
* The encoding form will _output_ a short sequence of encoding
form–specific code units. (Various voices on this list
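
For readers following the mapping argument above, here is a minimal sketch of the UTF-16 encoding form written as a function from one Unicode scalar value to one or two 16-bit code units. The function name utf16_encode_scalar is hypothetical and is not taken from any message in this thread. The sketch rejects surrogate code points (U+D800..U+DFFF) as input, because they are not scalar values, while its output for supplementary values consists of code units drawn from that same numeric range.

```python
from typing import List


def utf16_encode_scalar(cp: int) -> List[int]:
    """Map one Unicode scalar value to its UTF-16 code units.

    Hypothetical helper for illustration only.
    """
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a code point: outside U+0000..U+10FFFF")
    if 0xD800 <= cp <= 0xDFFF:
        # Surrogate code points are code points but not scalar values,
        # so they are not valid input to the encoding form.
        raise ValueError("surrogate code point is not a scalar value")
    if cp < 0x10000:
        # BMP scalar values map to a single code unit of the same value.
        return [cp]
    # Supplementary scalar values map to a pair of surrogate code units:
    # 20 payload bits are split into a high and a low 10-bit half.
    v = cp - 0x10000
    high = 0xD800 | (v >> 10)
    low = 0xDC00 | (v & 0x3FF)
    return [high, low]


if __name__ == "__main__":
    for cp in (0x41, 0x10000, 0x1F600):
        units = utf16_encode_scalar(cp)
        print("U+%04X ->" % cp, " ".join("%04X" % u for u in units))
```

The values D800..DFFF appear here only on the output side, as code units. Whether that justifies calling the corresponding code points "reserved for use by UTF-16" is the wording question debated in this thread, and the sketch takes no position on it.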
2013/9/17 Stephan Stiller stephan.stil...@gmail.com
[AF:] Once you add the UTF-prefix, you are, by force, speaking of code
units.
So the high-low distinction for surrogate code points is misleading, and
the surrogate attribute for code point shouldn't be there, because, as
I've in fact
2013/9/18 Stephan Stiller stephan.stil...@gmail.com
In what way does UTF-16 use surrogate code *points*? An encoding form
is a mapping. Let's look at this mapping:
- One *inputs* scalar values (not surrogate code points).
In fact the input is one code point.
Then only if that code
On 9/16/2013 1:41 PM, Doug Ewell wrote:
This has nothing to do with UTF-Anything or Normalization Form Anything.
But all with keeping the discussion alive for any reason, however
insignificant :)
A./
Asmus Freytag asmusf at ix dot netcom dot com wrote:
On 9/16/2013 1:41 PM, Doug Ewell wrote:
This has nothing to do with UTF-Anything or Normalization Form
Anything.
But all with keeping the discussion alive for any reason, however
insignificant :)
I guess it was too soon to try to come
On 9/16/2013 2:18 PM, Doug Ewell wrote:
Asmus Freytag asmusf at ix dot netcom dot com wrote:
On 9/16/2013 1:41 PM, Doug Ewell wrote:
This has nothing to do with UTF-Anything or Normalization Form
Anything.
But all with keeping the discussion alive for any reason, however
insignificant :)
I