Re: Origin of Ellipsis (was: RE: Empty set)

2013-09-16 Thread Philippe Verdy
Nah!!! STRICTLY NOBODY counts scalar values. Everyone counts either (a) code units (most often 8-bit bytes, more rarely 16-bit units, e.g. in basic JavaScript code), or (b) code points (independently of the code units used in the storage or communication message format). The application *may*
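
[Editorial illustration, not part of the original message.] The distinction drawn here between counting code units and counting code points can be sketched in Python. The helper name `count_units` and the sample string are assumptions made for this example; the sketch relies on Python's `str` being indexed by code point.

```python
def count_units(text: str) -> dict:
    """Count the same text three ways (illustrative helper, not from the thread)."""
    return {
        # UTF-8 code units are 8-bit bytes.
        "utf8_code_units": len(text.encode("utf-8")),
        # UTF-16 code units are 16 bits wide; "utf-16-le" writes no BOM.
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        # Python's len() on str counts code points.
        "code_points": len(text),
    }


# 'a', U+2026 HORIZONTAL ELLIPSIS, and U+1F600 (outside the BMP).
sample = "a\u2026\U0001F600"
counts = count_units(sample)
print(counts)
```

The three counts differ because U+2026 takes three bytes in UTF-8 but one code unit in UTF-16, while U+1F600 takes four bytes in UTF-8 and a surrogate pair in UTF-16.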

Re: Code point vs. scalar value (was: RE: Origin of Ellipsis (was: RE: Empty set))

2013-09-16 Thread Asmus Freytag
On 9/16/2013 1:41 PM, Doug Ewell wrote: This has nothing to do with UTF-Anything or Normalization Form Anything. But all with keeping the discussion alive for any reason, however insignificant :) A./

Code point vs. scalar value (was: RE: Origin of Ellipsis (was: RE: Empty set))

2013-09-16 Thread Doug Ewell
Oh, for heaven's sake: Code Point. (1) Any value in the Unicode codespace; that is, the range of integers from 0 to 10FFFF₁₆. (See definition D10 in Section 3.4, Characters and Encoding.) Not all code points are assigned to encoded characters. See code point type. (2) A value, or position, for a

RE: Code point vs. scalar value (was: RE: Origin of Ellipsis (was: RE: Empty set))

2013-09-16 Thread Doug Ewell
Asmus Freytag asmusf at ix dot netcom dot com wrote: On 9/16/2013 1:41 PM, Doug Ewell wrote: This has nothing to do with UTF-Anything or Normalization Form Anything. But all with keeping the discussion alive for any reason, however insignificant :) I guess it was too soon to try to come

Re: Code point vs. scalar value (was: RE: Origin of Ellipsis (was: RE: Empty set))

2013-09-16 Thread Asmus Freytag
On 9/16/2013 2:18 PM, Doug Ewell wrote: Asmus Freytag asmusf at ix dot netcom dot com wrote: On 9/16/2013 1:41 PM, Doug Ewell wrote: This has nothing to do with UTF-Anything or Normalization Form Anything. But all with keeping the discussion alive for any reason, however insignificant :) I

Re: Origin of Ellipsis (was: RE: Empty set)

2013-09-15 Thread Andre Schappo
On 13 Sep 2013, at 20:02, Whistler, Ken wrote: The *interesting* question, in my opinion, is why folks feel impelled to use U+2026 to render a baseline ellipsis in Latin typography at all, rather than just using U+002E ad libitum... --Ken U+2026 is useful for microblogs when one is looking to

Re: Origin of Ellipsis (was: RE: Empty set)

2013-09-15 Thread Philippe Verdy
Do you mean saving two characters for posting to Twitter? Well, maybe, but Twitter clearly does not promote correct typography and not even correct orthography. It is clearly not a good model for publishing. But given the history of this character, I just wonder why it was not mapped along with

Re: Origin of Ellipsis (was: RE: Empty set)

2013-09-15 Thread Doug Ewell
Andre Schappo wrote: U+2026 is useful for microblogs when one is looking to save characters Not if the microblog is in UTF-8, as almost all are. -- Doug Ewell | Thornton, CO, USA http://ewellic.org | @DougEwell

Re: Origin of Ellipsis (was: RE: Empty set)

2013-09-15 Thread Phillips, Addison
Not if the limit is counted in characters and not in bytes. Twitter, for example, counts code points in the NFC representation of a tweet. Doug Ewell d...@ewellic.org wrote: Andre Schappo wrote: U+2026 is useful for microblogs when one is looking to save characters Not if the microblog is in
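
[Editorial illustration, not part of the original message.] A minimal sketch of the two ways of measuring that are being contrasted: UTF-8 byte length versus the number of code points after NFC normalization. The function names are hypothetical, and this is not Twitter's actual implementation; it only assumes the counting rule as described in the message above.

```python
import unicodedata


def utf8_byte_length(text: str) -> int:
    """Length in UTF-8 code units (bytes)."""
    return len(text.encode("utf-8"))


def nfc_code_point_length(text: str) -> int:
    """Number of code points after normalizing to NFC (assumed counting rule)."""
    return len(unicodedata.normalize("NFC", text))


single = "\u2026"  # U+2026 HORIZONTAL ELLIPSIS
three = "..."      # three times U+002E FULL STOP

for label, s in (("U+2026", single), ("3 x U+002E", three)):
    print(label, utf8_byte_length(s), nfc_code_point_length(s))
```

Measured in UTF-8 bytes the two spellings cost the same (three bytes each), which is Doug Ewell's point. Measured in NFC code points, U+2026 counts as one and the three periods as three, because NFC does not apply the compatibility decomposition of U+2026 (NFKC would).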

Re: Origin of Ellipsis (was: RE: Empty set)

2013-09-15 Thread Stephan Stiller
On 9/15/2013 3:07 PM, Phillips, Addison wrote: Not if the limit is counted in characters and not in bytes. Twitter, for example, counts code points in the NFC representation of a tweet. character, code point – these are confusing words :-) From the link it isn't entirely clear whether they (a)

Re: Origin of Ellipsis (was: RE: Empty set)

2013-09-15 Thread Phillips, Addison
Actually, that's my bad: I meant to type scalar value. Stephan Stiller stephan.stil...@gmail.com wrote: On 9/15/2013 3:07 PM, Phillips, Addison wrote: Not if the limit is counted in characters and not in bytes. Twitter, for example, counts code points in the NFC representation of a tweet.

Re: Origin of Ellipsis (was: RE: Empty set)

2013-09-15 Thread Doug Ewell
Addison Phillips wrote: Not if the limit is counted in characters and not in bytes. Twitter, for example, counts code points in the NFC representation of a tweet. You're right. I take that back, about Twitter at least. Stephan Stiller wrote: From the link it isn't entirely clear whether

Re: Origin of Ellipsis (was: RE: Empty set)

2013-09-15 Thread Ilya Zakharevich
On Sun, Sep 15, 2013 at 09:21:47PM +0200, Philippe Verdy wrote: If there's something to do now (given it is no longer used in CJK contexts), it's to strongly recommend that fonts map them to exactly the same glyph as the one obtained by aligning three periods in a row without any additional

Re: Origin of Ellipsis (was: RE: Empty set)

2013-09-15 Thread Stephan Stiller
Stephan Stiller wrote: From the link it isn't entirely clear whether they (a) count scalar values of NFC or (b) count code points of NFC. Are they not the same thing, except for surrogates? Conceptually no, but numerically yes – you are right in that regard, and I wasn't precise in my
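
[Editorial illustration, not part of the original message.] The "same thing, except for surrogates" remark can be made concrete. In the Unicode Standard a scalar value is any code point outside the surrogate range D800..DFFF. The helper names below are assumptions for this sketch, which uses the fact that a Python `str` may hold a lone surrogate code point.

```python
def is_scalar_value(cp: int) -> bool:
    """True if cp is a Unicode scalar value: 0..10FFFF minus D800..DFFF."""
    return 0 <= cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF)


def count_code_points(text: str) -> int:
    """Python's str is a sequence of code points, so len() suffices."""
    return len(text)


def count_scalar_values(text: str) -> int:
    """Count only the code points that are scalar values."""
    return sum(1 for ch in text if is_scalar_value(ord(ch)))


well_formed = "a\u2026\U0001F600"
with_lone_surrogate = "A\ud800"  # contains an unpaired surrogate code point

print(count_code_points(well_formed), count_scalar_values(well_formed))
print(count_code_points(with_lone_surrogate), count_scalar_values(with_lone_surrogate))
```

For text without surrogate code points the two counts agree, which matches the "numerically yes" in the reply; they diverge only when surrogate code points appear in the sequence being counted.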

Re: Origin of Ellipsis (was: RE: Empty set)

2013-09-15 Thread Stephan Stiller
Doug wrote me: You're not confusing code point with code unit, are you? Thanks for the note. I think what you say is that I thought (or meant to write) by first representing the sequence of scalar values in an encoding form and then counting [code points typecast from] code _units_. I think

Re: Origin of Ellipsis (was: RE: Empty set)

2013-09-13 Thread Asmus Freytag
On 9/13/2013 10:54 AM, Whistler, Ken wrote: Stephan Stiller noted: Maybe ... and the origin of the single-glyph ellipsis remains a mystery to me. As Philippe surmised, it is a compatibility character, originally included in the Unicode 1.0 repertoire for cross-mapping to existing legacy

Origin of Ellipsis (was: RE: Empty set)

2013-09-13 Thread Whistler, Ken
Stephan Stiller noted: Maybe ... and the origin of the single-glyph ellipsis remains a mystery to me. As Philippe surmised, it is a compatibility character, originally included in the Unicode 1.0 repertoire for cross-mapping to existing legacy encodings: Code Page 932: 0x81 0x64 Code Page

RE: Origin of Ellipsis (was: RE: Empty set)

2013-09-13 Thread Whistler, Ken
I wrote: As Philippe surmised, it is a compatibility character, originally included in the Unicode 1.0 repertoire for cross-mapping to existing legacy encodings: Code Page 932: 0x81 0x64 Code Page 949: 0xA1 0xA6 Asmus responded: which just pushes that question forward in time...

Re: Origin of Ellipsis (was: RE: Empty set)

2013-09-13 Thread Jukka K. Korpela
2013-09-13 22:02, Whistler, Ken wrote: The *interesting* question, in my opinion, is why folks feel impelled to use U+2026 to render a baseline ellipsis in Latin typography at all, rather than just using U+002E ad libitum... In traditional typography, an ellipsis usually has dots set apart

Re: Origin of Ellipsis (was: RE: Empty set)

2013-09-13 Thread Stephan Stiller
Exactly my thoughts: In fonts commonly used for word processing and desktop publishing, HORIZONTAL ELLIPSIS is usually not that well designed. To me the dots appear too close in plenty of fonts. But I think that the most common cause of the appearance of HORIZONTAL ELLIPSIS is that Microsoft

Re: Origin of Ellipsis (was: RE: Empty set)

2013-09-13 Thread Philippe Verdy
2013/9/13 Jukka K. Korpela jkorp...@cs.tut.fi 2013-09-13 22:02, Whistler, Ken wrote: The *interesting* question, in my opinion, is why folks feel impelled to use U+2026 to render a baseline ellipsis in Latin typography at all, rather than just using U+002E ad libitum... In traditional