Re: Standaridized variation sequences for the Desert alphabet?

2017-03-27 Thread Martin J. Dürst
On 2017/03/28 01:03, Michael Everson wrote: On 27 Mar 2017, at 16:56, John H. Jenkins wrote: The 1857 St Louis punches definitely included both the 1855 EW 𐐧 and the 1859 OI <𐐃𐐆>. Ken Beesley shows them in smoke proofs in his 2004 paper on Metafont. Good to have some actual examples. Howev

Re: different version of common/annotations/ja.xml

2017-03-27 Thread Mark Davis ☕️
Ah, yes. Sorry for my confusion. One main purpose for the short names is for TTS, and for that I think people felt that the reading was more useful. However, it would probably be better for the keywords to have the normal spelling. You might consider filing a ticket at http://unicode.org/cldr/trac

Re: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Mark Davis ☕️
(I'm sure you know this, Philippe, but a reminder for others: as far as the Unicode projects go, discussions on this list have no effect unless they are turned into a submission (UTC or Emoji proposal, CLDR or ICU ticket).) If you see any problems in the CLDR data, please file a ticket at http://u

Re: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Mark Davis ☕️
To add to what Ken and Markus said: like many other identifiers, there are a number of different categories. 1. *Ill-formed:* "$1" 2. *Well-formed, but not valid:* "usx". Is *syntactic* according to http://unicode.org/reports/tr51/proposed.html#def_emoji_tag_sequence, but is not *valid
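
The first two categories in the message above can be illustrated with a small sketch. It classifies a candidate subdivision identifier only, not a full emoji tag sequence. Two things here are assumptions rather than anything stated in the thread: the pattern follows the LDML shape of a subdivision id (a two-letter or three-digit region subtag followed by one to four alphanumerics), and `SAMPLE_VALID_IDS` is a hypothetical stand-in for the CLDR validity data, holding only a few sample entries. Whether a valid identifier also gives a valid or recommended flag sequence is a separate question that the thread itself debates.

```python
import re

# Assumed syntax of a subdivision id: region subtag (2 letters or 3 digits)
# followed by a suffix of 1 to 4 alphanumerics, all lowercase.
SUBDIVISION_ID = re.compile(r"(?:[a-z]{2}|[0-9]{3})[a-z0-9]{1,4}")

# Hypothetical stand-in for the CLDR validity list; a real check would
# load the full data instead of this handful of sample entries.
SAMPLE_VALID_IDS = frozenset({"gbeng", "gbsct", "gbwls", "plpm"})


def classify(candidate, valid_ids=SAMPLE_VALID_IDS):
    """Return the category of a candidate subdivision identifier."""
    if SUBDIVISION_ID.fullmatch(candidate) is None:
        return "ill-formed"
    if candidate not in valid_ids:
        return "well-formed, but not valid"
    return "valid"


if __name__ == "__main__":
    for candidate in ("$1", "usx", "plpm"):
        print(candidate, "->", classify(candidate))
```

With the sample data above, "$1" fails the syntax check, "usx" passes the syntax check but is not in the list, and "plpm" is found in the list.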

Re: different version of common/annotations/ja.xml

2017-03-27 Thread Takao Fujiwara
It would be combinations of Hiragana, Katakana, Kanji. On 03/28/17 02:25, Koji Ishii-san wrote: I think he meant Kanji/Han ideographic by "committed string". 2017-03-27 19:04 GMT+09:00 Takao Fujiwara <tfuji...@redhat.com>: On 03/27/17 18:48, Mark Davis ☕️-san wrote: By "com

Re: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Philippe Verdy
I try to summarize the situation for France, There are some missing codes France métropolitaine (deprecated: [fx]): Départements métropolitains: [fr01~19 fr2a~b fr21~68 fr70-95] (unchanged) [fr6d] Rhône (département) (missing, included in [fr69]?) Statuts par

Re: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Markus Scherer
On Mon, Mar 27, 2017 at 5:09 PM, Philippe Verdy wrote: > I followed the links. Check your links, you are referencing the proposal, > and this contradicts the published version 4.0 of TR51. Where is stability ? > Of course I am pointing to the proposal. The version of TR 51 under review adds a me

Re: Encoding of old compatibility characters

2017-03-27 Thread Mark E. Shoulson
On 03/27/2017 05:46 PM, Frédéric Grosshans wrote: An example of a legacy character successfully encoded recently is ⏨ U+23E8 DECIMAL EXPONENT SYMBOL, encoded in Unicode 5.2. It came from the Soviet standard GOST 10859-64 and the German standard ALCOR. And was proposed by Leo Broukhis in this pr

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-27 Thread Michael Everson
I’ll look into whatever you’re on about the other ‘minor’ script, but with regard to what you’ve said below, I’m fairly sure I encoded the missing characters there. I believe it was A7AE and A7B0, capital letters turned K and T used in that orthography. There is a problem with turned P and p in

Re: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Philippe Verdy
I followed the links. Check your links, you are referencing the proposal, and this contradicts the published version 4.0 of TR51. Where is stability ? 2017-03-28 2:06 GMT+02:00 Markus Scherer : > On Mon, Mar 27, 2017 at 4:58 PM, Philippe Verdy > wrote: > >> This only describes the sequences enco

Re: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Philippe Verdy
Also these yellow statements from the initial proposal are contradicting what is now published in TR51: "UN" and "EU" are accepted even if they are "macroregions", not satisfying the quoted condition 2 in the proposed update. 2017-03-28 1:58 GMT+02:00 Philippe Verdy : > This only describes the se

Re: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Markus Scherer
On Mon, Mar 27, 2017 at 4:58 PM, Philippe Verdy wrote: > This only describes the sequences encoded with 2 characters, not the newer > longer sequences for flags of subnational regions. the > unicode_region_subtag data does not contain anything about the flags for > the first 3 regions in GB. > P

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-27 Thread David Starner
On Mon, Mar 27, 2017 at 1:34 AM Martin J. Dürst wrote: > The qualification 'minor' is less important for an alphabet. In general, > the more established and well-known an alphabet is, the wider the > variations of glyph shapes that may be tolerated. > My problem with that is that a new script is

Re: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Philippe Verdy
This only describes the sequences encoded with 2 characters, not the newer longer sequences for flags of subnational regions. the unicode_region_subtag data does not contain anything about the flags for the first 3 regions in GB. 2017-03-28 1:35 GMT+02:00 Markus Scherer : > On Mon, Mar 27, 2017 a

Re: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Markus Scherer
On Mon, Mar 27, 2017 at 1:34 PM, Ken Whistler wrote: > Anybody could *attempt* to convey a flag of Pomerania (a rather handsome > black gryphon on a yellow background, btw) with an emoji tag sequence right > now, I suppose. I suppose not. Since it's bound to ISO 3166 subdivision codes (possibly

Re: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Markus Scherer
On Mon, Mar 27, 2017 at 1:39 PM, Philippe Verdy wrote: > Note also that ISO3166-2 is far from being stable, and this could > contradict Unicode encoding stability: it would then be required to ensure > this stability by only allowing sequences that are effectively registered > in http://www.unico

RE: Encoding of old compatibility characters

2017-03-27 Thread Jonathan Rosenne
GROUP MARK Best Regards, Jonathan Rosenne -Original Message- From: Unicode [mailto:unicode-boun...@unicode.org] On Behalf Of Frédéric Grosshans Sent: Tuesday, March 28, 2017 1:05 AM To: unicode Subject: Re: Encoding of old compatibility characters Another example, about to be encoded, i

RE: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Doug Ewell
Philippe Verdy wrote: > So it's up to the UTC to create this encoding: this new release is a > start for a new vexillology registry (within encoded sequences) which > creates a new standard for them. Fine. If you think you can persuade UTC that this is within their scope, go ahead. Let us know how

Re: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Doug Ewell
Ken Whistler wrote: > By the way, if anybody is looking, Pomerania is there: "plpm" among > the 4925 other valid unicode_subdivision_id values. So: > > Flag of Pomerania = 1F3F4 E0070 E006C E0070 E006D E007F > > But alas, that is not a *valid* emoji tag sequence (yet), so no soup > for you! This
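
The code point list quoted above follows a simple pattern, which can be sketched as below: U+1F3F4 as the base, one tag character per letter of the identifier (the tag characters mirror ASCII at an offset of U+E0000), and U+E007F as the terminator. The function and helper names are illustrative only, and restricting the input to lowercase ASCII letters and digits is an assumption of this sketch. It shows only how such a sequence is spelled, and says nothing about whether the result is valid or recommended for interchange.

```python
TAG_BASE = 0x1F3F4    # WAVING BLACK FLAG, the base of the sequence
TAG_OFFSET = 0xE0000  # tag characters sit at U+E0000 + the ASCII code
CANCEL_TAG = 0xE007F  # CANCEL TAG, which terminates the sequence


def flag_tag_sequence(subdivision_id):
    """Spell a subdivision id such as "plpm" as a flag tag sequence."""
    ok = subdivision_id and all(
        c.isascii() and (c.islower() or c.isdigit()) for c in subdivision_id
    )
    if not ok:
        raise ValueError(f"unsupported identifier: {subdivision_id!r}")
    tags = "".join(chr(TAG_OFFSET + ord(c)) for c in subdivision_id)
    return chr(TAG_BASE) + tags + chr(CANCEL_TAG)


def code_points(text):
    """Format a string as space-separated hexadecimal code points."""
    return " ".join(f"{ord(c):04X}" for c in text)


if __name__ == "__main__":
    # Reproduces the list quoted in the message above.
    print(code_points(flag_tag_sequence("plpm")))
    # 1F3F4 E0070 E006C E0070 E006D E007F
```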

Re: Encoding of old compatibility characters

2017-03-27 Thread Frédéric Grosshans
Another example, about to be encoded, it the GOUP MARK, used on old IBM computers (proposal: ML threads: http://www.unicode.org/mail-arch/unicode-ml/y2015-m01/0040.html , and http://unicode.org/mail-arch/unicode-ml/y2007-m05/0367.html ) Le 27/03/2017 à 23:46, Frédéric Grosshans a écrit : An ex

Re: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Philippe Verdy
So it's up to the UTC to create this encoding: this new release is a start for a new vexillology registry (within encoded sequences) which creates a new standard for them. 2017-03-27 23:50 GMT+02:00 Doug Ewell : > Philippe Verdy wrote: > > > We still lack an encoding standard for vexillologists. A

Re: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Peter Edberg
(this time from the correct account) Philippe and others, http://www.unicode.org/reports/tr51/tr51-11.html#valid-emoji-tag-sequences refers to CLDR data for the list of valid subregion sequences, see http://unicode.org/

Re: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Philippe Verdy
And the new region of Normandie still has no formal code, but it reuses a flag that was used by one of the two former regions. Technically I don't see that as a problem except that people may want to display that flag using the code for the former region and semantically this is different (and also

RE: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Doug Ewell
Philippe Verdy wrote: > We still lack an encoding standard for vexillologists. And for now > only "Flags of the World" proposes some encoding (not based strictly > and only on ISO3166). I think that the UTC should try contacting > authors of Flags of the World and seek for advice there: we are > s

Re: Encoding of old compatibility characters

2017-03-27 Thread Frédéric Grosshans
An example of a legacy character successfully encoded recently is ⏨ U+23E8 DECIMAL EXPONENT SYMBOL, encoded in Unicode 5.2. It came from the Soviet standard GOST 10859-64 and the German standard ALCOR. And was proposed by Leo Broukhis in this proposal http://www.unicode.org/L2/L2008/08030r-subs

RE: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Doug Ewell
Ken Whistler wrote: > As for how "users" are supposed to know the difference. Well, they > don't. What matters is that the data file that the "implementers" will > use has these 3 emoji tag sequences in it, so that is quite likely > what everybody will see added to their phones. The "users" will j

Re: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Richard Wordingham
On Mon, 27 Mar 2017 13:34:09 -0700 Ken Whistler wrote: > And if a flag of > California (or Pomerania or ...) then gets added to the list of emoji > tag sequences in a future version of the data, there is a good chance > that the "users" will then see the difference, because that flag will > appea

Re: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Ken Whistler
On 3/27/2017 1:39 PM, Philippe Verdy wrote: Note also that ISO3166-2 is far from being stable, and this could contradict Unicode encoding stability: it would then be required to ensure this stability by only allowing sequences that are effectively registered in http://www.unicode.org/Public/

Re: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Philippe Verdy
Note also that ISO3166-2 is far from being stable, and this could contradict Unicode encoding stability: it would then be required to ensure this stability by only allowing sequences that are effectively registered in http://www.unicode.org/Public/emoji/5.0/emoji-sequences.txt (independently of the

Re: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Ken Whistler
On 3/27/2017 12:17 PM, Doug Ewell wrote: announcements at Unicode dot org wrote: — and new regional flags for England, Scotland, and Wales. It's not clear from this text, nor from the table in Section C.1.1 of the draft, what the status is of flag emoji tag sequences other than the three abov

Re: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Philippe Verdy
2017-03-27 21:17 GMT+02:00 Doug Ewell : > announcements at Unicode dot org wrote: > > > — and new regional flags for England, Scotland, and Wales. > > It's not clear from this text, nor from the table in Section C.1.1 of > the draft, what the status is of flag emoji tag sequences other than the >

RE: Unicode Emoji 5.0 characters now final

2017-03-27 Thread Doug Ewell
announcements at Unicode dot org wrote: > — and new regional flags for England, Scotland, and Wales. It's not clear from this text, nor from the table in Section C.1.1 of the draft, what the status is of flag emoji tag sequences other than the three above. I read the relevant section a couple of

Re: Encoding of old compatibility characters

2017-03-27 Thread Philippe Verdy
TI calculators are not antique tools, and when I see how most calculators for Android or Windows 10 are now, they are not as usable as the scientific calculators we had in the past. I know at least one excellent calculator that works with Android and Windows and finally has the real look and feel o

Re: different version of common/annotations/ja.xml

2017-03-27 Thread Koji Ishii
I think he meant Kanji/Han ideographic by "committed string". 2017-03-27 19:04 GMT+09:00 Takao Fujiwara : > On 03/27/17 18:48, Mark Davis ☕️-san wrote: > >> By "committed strings", you mean the hiragana phonetic reading? >> > > Hiragana is used to the raw text of the phonetic reading by the Japan

Re: Encoding of old compatibility characters

2017-03-27 Thread Michael Everson
On 27 Mar 2017, at 17:49, Markus Scherer wrote: > > I think the interest has been low because very few documents survive in these > encodings, and even fewer documents using not-already-encoded symbols. That doesn’t mean that the few people who may need the characters now or in the centuries t

Re: Encoding of old compatibility characters

2017-03-27 Thread Ken Whistler
On 3/27/2017 7:44 AM, Charlotte Buff wrote: Now, one of Unicode’s declared goals is to enable round-trip compatibility with legacy encodings. We’ve accumulated a lot of weird stuff over the years in the pursuit of this goal. So it would be natural to assume that the unencoded characters from t

Re: Encoding of old compatibility characters

2017-03-27 Thread Michael Everson
On 27 Mar 2017, at 18:08, Garth Wallace wrote: > > Apple IIs also had inverse-video letters, and some had "MouseText" > pseudographics used to simulate a Mac-like GUI in text mode. > > I know that a couple of fonts from Kreative put these in the PUA and > Nishiki-Teki follows their lead. I th

Re: Encoding of old compatibility characters

2017-03-27 Thread Garth Wallace
Apple IIs also had inverse-video letters, and some had "MouseText" pseudographics used to simulate a Mac-like GUI in text mode. I know that a couple of fonts from Kreative put these in the PUA and Nishiki-Teki follows their lead. On Mon, Mar 27, 2017 at 9:25 AM Charlotte Buff < irgendeinbenutzern

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-27 Thread Michael Everson
On 27 Mar 2017, at 15:04, Alastair Houghton wrote: > 1. Unicode has to be usable *today*; it’s no good designing for some kind of > hyper-intelligent AI-based font technology a thousand years hence, because we > don’t have that now. If it isn’t usable today for any given purpose, people > wo

Re: Encoding of old compatibility characters

2017-03-27 Thread Markus Scherer
I think the interest has been low because very few documents survive in these encodings, and even fewer documents using not-already-encoded symbols. In my opinion, this is a good use of the Private Use Area among a very small group of people. See also https://en.wikipedia.org/wiki/ConScript_Unicod

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-27 Thread Michael Everson
On 27 Mar 2017, at 17:07, John H. Jenkins wrote: > This should teach me to double-check before posting. The research is a lot of fun. Can’t wait till I get Ken’s book next week. > Apparently, the earlier typeface *did* include all forty letters; it just > didn't use these two. I don't know wha

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-27 Thread John H. Jenkins
> On Mar 27, 2017, at 9:56 AM, John H. Jenkins wrote: > > >> On Mar 27, 2017, at 2:04 AM, James Kass > > wrote: >> >>> >>> If we have any historic metal types, are there >>> examples where a font contains both ligature >>> variants? >> >> Apparently not. >> >>

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-27 Thread Michael Everson
On 27 Mar 2017, at 16:56, John H. Jenkins wrote: >> John H. Jenkins mentioned early in this thread that these ligatures weren't >> used in printed materials and were not part of the official Deseret set. >> They were only used in manuscript. > > This is correct. Neither of the nineteenth cent

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-27 Thread John H. Jenkins
> On Mar 27, 2017, at 2:04 AM, James Kass wrote: > >> >> If we have any historic metal types, are there >> examples where a font contains both ligature >> variants? > > Apparently not. > > John H. Jenkins mentioned early in this thread that these ligatures > weren't used in printed materials

Re: Encoding of old compatibility characters

2017-03-27 Thread Charlotte Buff
> It’s hard to say without knowing what the characters are. For the ZX80, the missing characters include five block elements (top and bottom halfs of MEDIUM SHADE, as well as their inverse counterparts), and inverse/negative squared variants of European digits and the following symbols: " £ $ : ?

Re: Encoding of old compatibility characters

2017-03-27 Thread Michael Everson
On 27 Mar 2017, at 15:44, Charlotte Buff wrote: > > I’ve recently developed an interest in old legacy text encodings and noticed > that there are various characters in several sets that don’t have a Unicode > equivalent. I had already started research into these encodings to eventually > prep

Encoding of old compatibility characters

2017-03-27 Thread Charlotte Buff
I’ve recently developed an interest in old legacy text encodings and noticed that there are various characters in several sets that don’t have a Unicode equivalent. I had already started research into these encodings to eventually prepare a proposal until I realised I should probably ask on the mai

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-27 Thread Alastair Houghton
On 27 Mar 2017, at 14:49, Michael Everson wrote: >> 3) Font features (e.g. 1855 vs. 1859) to select shapes in the same font > > Font trickery. Not portable. Not supported by most apps. I wouldn’t describe it as “trickery” or “not portable”. Features like stylistic alternates are part of the

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-27 Thread Alastair Houghton
On 27 Mar 2017, at 10:14, Julian Bradfield wrote: > > I contend, therefore, that no decision about Unicode should take into > account any ephemeral considerations such as this year's electronic > font technology, and that therefore it's not even useful to mention > them. I’d disagree with that,

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-27 Thread Michael Everson
On 27 Mar 2017, at 09:29, Martin J. Dürst wrote: >> He is. He transcribes texts into Deseret. I’ve published three of them >> (Alice, Looking-Glass, and Snark). > > Great to know. Given that, I'd assume that you'd take his input a bit more > serious. I’m discussing it now, offline, with him a

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-27 Thread Michael Everson
On 27 Mar 2017, at 09:04, James Kass wrote: > John H. Jenkins mentioned early in this thread that these ligatures weren't > used in printed materials and were not part of the official Deseret set. > They were only used in manuscript. Not quite true. Such detail will be for the proposal. Mich

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-27 Thread Michael Everson
On 27 Mar 2017, at 08:05, Martin J. Dürst wrote: >> Consider 2EBC ⺼ CJK RADICAL MEAT and 2E9D ⺝ CJK RADICAL MOON which are >> apparently really supposed to have identical glyphs, though we use an >> old-fashioned style in the charts for the former. (Yes, I am of course aware >> that there are

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-27 Thread Michael Everson
On 27 Mar 2017, at 06:42, Martin J. Dürst wrote: >> The default position is NOT “everything is encoded unified until disunified”. > > Neither it's "everything is encoded separately unless it's unified”. These Deseret letters aren’t encoded. For my part I wasn’t made aware of them in 2004 when

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-27 Thread Michael Everson
On 27 Mar 2017, at 05:58, James Kass wrote: > > Asmus Freytag wrote, > >> In the current case, you have the opposite, to wit, the text elements are >> unchanged, but you would like to add alternate code elements >> to represent what are, ultimately, the same text elements. That's not >> disuni

Re: different version of common/annotations/ja.xml

2017-03-27 Thread Takao Fujiwara
On 03/27/17 18:48, Mark Davis ☕️-san wrote: By "committed strings", you mean the hiragana phonetic reading? Hiragana is used to the raw text of the phonetic reading by the Japanese input method before the conversion. After users select one of the converted strings, the converted strings are c

Re: different version of common/annotations/ja.xml

2017-03-27 Thread Mark Davis ☕️
By "committed strings", you mean the hiragana phonetic reading? Mark On Mon, Mar 27, 2017 at 11:00 AM, Takao Fujiwara wrote: > Hi, > > Do you have any chances to create a different version of ja.xml of the > Japanese emoji annotation? > http://unicode.org/cldr/trac/browser/tags/latest/common/an

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-27 Thread Julian Bradfield
While I hesitate to dive in to this argument, Martin makes one comment where I think a point of principle arises: On 2017-03-27, Martin J. Dürst wrote: [Michael wrote] >> You know, Martin, I *have* been doing this for the last two decades. I’m >> well aware of what a font is and

different version of common/annotations/ja.xml

2017-03-27 Thread Takao Fujiwara
Hi, Do you have any chances to create a different version of ja.xml of the Japanese emoji annotation? http://unicode.org/cldr/trac/browser/tags/latest/common/annotations/ja.xml That file includes Hiragana only but I'd need another file which has the committed strings, likes ja_convert.xml. E.g

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-27 Thread Martin J. Dürst
On 2017/03/24 23:37, Michael Everson wrote: On 24 Mar 2017, at 11:34, Martin J. Dürst wrote: On 2017/03/23 22:48, Michael Everson wrote: Indeed I would say to John Jenkins and Ken Beesley that the richness of the history of the Deseret alphabet would be impoverished by treating the 1859 le

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-27 Thread James Kass
Martin J. Dürst responded to Michael Everson, > Unfortunately, much of what you wrote gave me the > impression that you may think that historical origin > is the only criterion, or a criterion that trumps all > others. If you don't think so, it would be good if you > could confirm this. If you thi

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-27 Thread James Kass
Martin J. Dürst responded to Michael Everson, > Overall, we may have up to four variants, of which > three are currently explicitly supported in Unicode. Yes. > Are all of these used as spelling variants? Is there another possible use? > Is the choice of variant up to the author (for which > v

Re: Standaridized variation sequences for the Desert alphabet?

2017-03-27 Thread Martin J. Dürst
On 2017/03/27 01:20, Michael Everson wrote: On 26 Mar 2017, at 16:45, Asmus Freytag wrote: Consider 2EBC ⺼ CJK RADICAL MEAT and 2E9D ⺝ CJK RADICAL MOON which are apparently really supposed to have identical glyphs, though we use an old-fashioned style in the charts for the former. (Yes, I a