On 2/21/2020 7:53 AM, Costello, Roger L. via Unicode wrote:
Text files may indeed contain binary (i.e., bytes that are not
interpretable as characters). Namely, text files may contain newlines,
tabs, and some other invisible things.
Question: "characters" are defined as only the visible thi
Well, no, in this case "strange" means strange, as Ken Lunde notes. I'm
just pointing to his list, because it pulls together quite a few Han
characters that *also* have dubious cases for encoding.
Or you could turn the argument around, I suppose, and note that just
because the hieroglyph for "
You want "dubious"?!
You should see the hundreds of strange characters already encoded in the
CJK *Unified* Ideographs blocks, as recently documented in great detail
by Ken Lunde:
https://www.unicode.org/L2/L2020/20059-unihan-kstrange-update.pdf
Compared to many of those, a hieroglyph of a m
Richard,
What it comes down to is avoidance of conundrums involving canonical
reordering for normalization. The effect of variation selectors is
defined in terms of an immediate adjacency. If you allowed variation
selectors to be defined for combining marks of ccc!=0, then
normalization of se
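
The reordering concern can be illustrated with Python's standard `unicodedata` module. This is only a sketch of the general mechanism, not the specific case the message goes on to discuss; the example characters (acute, dot below, VS1) are chosen here for illustration.

```python
import unicodedata

ACUTE = "\u0301"      # COMBINING ACUTE ACCENT, ccc=230
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW, ccc=220
VS1 = "\ufe00"        # VARIATION SELECTOR-1, ccc=0


def ccc(ch):
    """Return the canonical combining class of a single character."""
    return unicodedata.combining(ch)


# Two marks with different non-zero ccc values are sorted by canonical
# reordering, so the mark that was typed second ends up first.
plain = "a" + ACUTE + DOT_BELOW
reordered = unicodedata.normalize("NFD", plain)

# A variation selector has ccc=0, so it is never moved itself. If it
# followed a mark with ccc!=0, it would pin that mark in place and the
# sequence would no longer reorder the way the plain one does.
with_vs = "a" + ACUTE + VS1 + DOT_BELOW
blocked = unicodedata.normalize("NFD", with_vs)

print([f"U+{ord(c):04X}" for c in reordered])
print([f"U+{ord(c):04X}" for c in blocked])
```

In the first sequence the dot below moves next to the base, so "immediately adjacent to the base" is not a stable notion for marks with ccc!=0.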
Richard,
Given that those particular two variation selectors have already given
very specific semantics for emoji sequences, and would now be expected
to occur *only* in emoji sequences:
https://www.unicode.org/reports/tr51/#def_text_presentation_selector
usurping them to do something unrela
Shriramana,
That category is used to track character(s) in process that may have
been approved by WG2 but are not yet in ballot, or are in contention,
and may have just been dropped from ballot, but which still have
sufficient visibility to be tracked.
The process is a bit rough around the e
Shriramana,
On 12/20/2019 6:29 PM, Shriramana Sharma via Unicode wrote:
I was looking at the pipeline for something else, and for the first
time I see a character category: “not accepted by the UTC but in ISO
ballot” and two characters in it.
Those two characters changed status as of December 4,
On 12/20/2019 7:17 AM, wjgo_10...@btinternet.com via Unicode wrote:
It is indeed interesting that the Notice of Non-Approval itself uses
italics for emphasis in two places.
That text, at the present time, cannot be expressed in Unicode plain
text with the emphasis that the Notice of Non-Appro
On 10/30/2019 10:41 AM, wjgo_10...@btinternet.com via Unicode wrote:
At present I have a question to which I cannot find the answer.
Is the QID emoji format, if approved by the Unicode Technical
Committee going to be sent to the ISO/IEC 10646 committee for
consideration by that committee?
On 10/12/2019 3:15 AM, Fred Brennan via Unicode wrote:
There seems to be no conscionable reason for such a long delay after the
approval.
If that's just how things are done, fine, I certainly can't change the whole
system. But imagine if you had to wait two years to even have a chance of
using
Sorry about the typo there. I meant "the published Version 13.0 next March"
--Ken
On 10/11/2019 10:17 AM, Ken Whistler wrote:
then eventually in the published Version 13.0 next month:
Short answer is no.
The characters in the pipeline section labeled "Characters Accepted for
Version 13.0" are what will be in the beta review for 13.0 (look for
that sometime next month), and then eventually in the published Version
13.0 next month:
https://www.unicode.org/alloc/Pipeline.htm
Fred,
2 hours and 33 minutes from now (today). But you don't need to try to
synch a proposal like this to a particular script ad hoc meeting. That
group meets roughly once a month, and any new proposal coming in right
now wouldn't be on the Unicode 13.0 train, even if the UTC immediately
agre
On 9/26/2019 4:21 AM, Fred Brennan via Unicode wrote:
There is a clear demand for a SQUARE TB. In the font SMotoya Sinkai W55 W3,
which is ©2008 株式会社 モトヤ, the glyph is unencoded and accessed via the
Discretionary Ligatures (`dlig`) OpenType feature. It has name `T_B.dlig`.
Aye, there's the ru
On 8/14/2019 4:32 PM, James Kass via Unicode wrote:
If a character gets deprecated, can its decomposition type be changed
from canonical to compatibility?
Simple answer: No.
--Ken
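
For readers unsure what "decomposition type" refers to, here is a small sketch using Python's standard `unicodedata` module. The helper name `decomposition_kind` is made up for this illustration; the test characters are chosen here and are not from the thread.

```python
import unicodedata


def decomposition_kind(ch):
    """Classify a character's decomposition mapping.

    unicodedata.decomposition() returns the raw mapping field. A tag in
    angle brackets (for example "<compat>") marks a compatibility
    decomposition; a mapping with no tag is canonical.
    """
    raw = unicodedata.decomposition(ch)
    if not raw:
        return "none"
    return "compatibility" if raw.startswith("<") else "canonical"


for ch in ("\u00c5", "\u212b", "\ufb01", "a"):
    print(f"U+{ord(ch):04X}", decomposition_kind(ch))
```

Normalization forms are defined in terms of these mappings, which is why the type of an existing mapping is not something that gets changed after the fact.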
Your helpful suggestions will be passed along to the people working on
the new site.
In the meantime, please note that the link to the "Unicode Technical
Site" has been added to the left column of quick links in the page
bottom banner, so it is easily available now from any page on the new sit
See the entry for "Magar Akkha" on:
http://linguistics.berkeley.edu/sei/scripts-not-encoded.html
Anshuman Pandey did preliminary research on this in 2011.
http://www.unicode.org/L2/L2011/11144-magar-akkha.pdf
It would be premature to assign an ISO 15924 script code, pending the
research to de
On 7/18/2019 11:50 AM, Steffen Nurpmeso via Unicode wrote:
I also decided to enter /L2 directly from now on.
For folks wishing to access the UTC document register, Unicode
Consortium standards, and so forth, all of those links will be
permanently stable. They are not impacted by the rollout
On 7/17/2019 4:54 PM, Philippe Verdy via Unicode wrote:
then the Unicode version (age) used for Hieroglyphs should also be
assigned to Hieratic.
It is already.
In fact the ligatures system for the "cursive" Egyptian Hieratic is so
complex (and may also have its own variants showing its progr
On 7/3/2019 10:47 AM, Sławomir Osipiuk via Unicode wrote:
Is my idea impossible, useless, or contradictory? Not at all.
What you are proposing is in the realm of higher-level protocols.
You could develop such a protocol, and then write processes that honored
it, or try to convince others to
On 4/30/2019 12:45 AM, Julian Bradfield via Unicode wrote:
What is its appropriate Unicode representation?
A macron.
--Ken
On 3/13/2019 2:42 AM, Janusz S. Bień via Unicode wrote:
Hi!
On Mon, Jul 16 2018 at 7:07 +02, Janusz S. Bień via Unicode wrote:
FAQ (http://unicode.org/faq/vs.html) states:
For historic scripts, the variation sequence provides a useful tool,
because it can show mistaken or nonce gl
Egmont,
On 2/9/2019 11:48 AM, Egmont Koblinger via Unicode wrote:
Are there any (non-CJK) scripts for which crossword puzzles don't exist?
There are crossword puzzles for Hindi (in the Devanagari script). Just
do an image search for "Hindi crossword puzzle".
But the conventions for these br
Richard,
On 2/1/2019 1:30 PM, Richard Wordingham via Unicode wrote:
Language tagging is already available in Unicode, via the tag characters
in the deprecated plane.
Recte:
1. Plane 14 is not a "deprecated plane".
2. The tag characters in Tag Character block (U+E0020..U+E007F) are not
depr
On 1/31/2019 1:41 AM, Egmont Koblinger via Unicode wrote:
I mean, for
example we can introduce control characters that specify the language.
That is a complete non-starter for the Unicode Standard. And if the
terminal implementation introduces such as one-off hacks, they will fail
completel
James,
On 1/8/2019 1:11 PM, James Kass via Unicode wrote:
But we're still using typewriter kludges to represent stress in Latin
script because there is no Unicode plain text solution.
O.k., that one needs a response.
We are still using kludges to represent stress in the Latin script
because
Michael,
On 11/21/2018 9:38 AM, Michael Everson via Unicode wrote:
What really annoys me about this is that there is no flag for Northern Ireland.
The folks at CLDR did not think to ask either the UK or the Irish
representatives to SC2 about this.
Neither CLDR-TC nor SC2 has any jurisdiction
On 11/21/2018 8:00 AM, William_J_G Overington via Unicode wrote:
Yet the interoperability does not derive from an International Standard.
The interoperability that enabled your mail to be delivered to me derives in
part from the MIME standard (RFC 2045 et seq.) which is not an International
On 11/20/2018 12:57 PM, William_J_G Overington via Unicode wrote:
quote
A Unicode Technical Standard (UTS) is an independent specification. Conformance
to the Unicode Standard does not imply conformance to any UTS.
end quote
My questions are as follows please.
Is that encoding for the Wels
On 11/2/2018 10:02 AM, Philippe Verdy via Unicode wrote:
I was replying not about the notational representation of the DUCET
data table (using [....] unnecessarily) but about the text of
UTR#10 itself. Which remains highly confusing, and contains completely
unnecessary steps, and just compli
On 10/31/2018 11:27 AM, Asmus Freytag via Unicode wrote:
but we don't have an agreement that reproducing all variations in
manuscripts is in scope.
In fact, I would say that in the UTC, at least, we have an agreement
that that clearly is out of scope!
Trying to represent all aspects of text
On 10/30/2018 2:32 PM, James Kass via Unicode wrote:
but we can't seem to agree on how to encode its abbreviation.
For what it's worth, "mgr" seems to be the usual abbreviation in Polish
for it.
--Ken
On 10/29/2018 8:06 PM, James Kass via Unicode wrote:
could be typed on old-style mechanical typewriters. Quintessential
plain-text, that.
Nope. Typewriters were regularly used for underscoring and for
strikethrough, both of which are *styling* of text, and not plain text.
The mere fact tha
Martin,
On 10/9/2018 12:47 AM, Martin J. Dürst via Unicode wrote:
- Using the 'capitalize' method to (try to) get the titlecase
property of a MTAVRULI character. (There's no other way
currently in Ruby to get the titlecase property.)
There may be others. If you have some ideas, I'd apprecia
On 10/2/2018 12:45 AM, Martin J. Dürst via Unicode wrote:
capitalize: uppercase (or title-case) the first character of the
string, lowercase the rest
When I say "cause problems", I mean producing mixed-case output. I
originally thought that 'capitalize' would be fine. It is fine for
lowerc
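
The mixed-case risk with Mtavruli can be sketched in Python rather than Ruby (a swapped-in language, so method behavior is not identical to Ruby's `capitalize`). The sketch assumes a Python build whose Unicode data includes Georgian Mtavruli (Unicode 11.0 or later, i.e. Python 3.7+).

```python
AN = "\u10d0"           # GEORGIAN LETTER AN (Mkhedruli)
MTAVRULI_AN = "\u1c90"  # GEORGIAN MTAVRULI CAPITAL LETTER AN

# The uppercase mapping of a Mkhedruli letter is the Mtavruli letter...
upper = AN.upper()

# ...but its titlecase mapping is the letter itself, so title-casing a
# Georgian word leaves it in Mkhedruli and produces no mixed case.
title = AN.title()

# Lowercasing Mtavruli goes back to Mkhedruli.
lower = MTAVRULI_AN.lower()

# Upper-casing only the first letter, by contrast, yields mixed case.
word = "\u10d0\u10d1"   # AN + BAN
mixed = word[0].upper() + word[1:]

print(f"upper U+{ord(upper):04X}, title U+{ord(title):04X}, lower U+{ord(lower):04X}")
print([f"U+{ord(c):04X}" for c in mixed])
```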
On 8/31/2018 1:36 AM, Manuel Strehl via Unicode wrote:
For codepoints.net I use that data to stuff everything in a MySQL
database.
Well, for some sense of "everything", anyway. ;-)
People having this discussion should keep in mind a few significant points.
First, the UCD proper isn't "ever
On 8/21/2018 7:56 AM, Adam Borowski via Unicode wrote:
On Mon, Aug 20, 2018 at 05:17:21PM -0700, Ken Whistler via Unicode wrote:
On 8/20/2018 5:04 PM, Mark E. Shoulson via Unicode wrote:
Is there a block of RTL PUA also?
No.
Perhaps there should be?
This is a periodic suggestion that
On 8/20/2018 5:04 PM, Mark E. Shoulson via Unicode wrote:
Is there a block of RTL PUA also?
No.
--Ken
Steffen noted:
On 8/20/2018 3:22 PM, Steffen Nurpmeso via Unicode wrote:
It was just that i have read on one of the mailing-lists i am
subscribed to a cite of a Unicode statement that i have never read
of anything on the Unicode mailing-list. It is very awkward, but
i _again_ cannot find what
Steffen,
Are you looking for the Unicode list email archives?
https://www.unicode.org/mail-arch/
Those contain list content going back all the way to 1994.
--Ken
On 8/20/2018 6:08 AM, Steffen Nurpmeso via Unicode wrote:
I have the impression that many things which have been posted here
some
On 7/19/2018 12:38 AM, Shai Berger via Unicode wrote:
If I cannot trust that
people I communicate with make the same choices I make, plain text
cannot be used.
Here is a counterexample. The following is a chunk of plain text output
from the bidi reference implementation:
Trace: Entering br
On 7/18/2018 6:43 AM, philip chastney via Unicode wrote:
there are also contexts where "Hello World!" can be read as
the function "Hello", applied to the factorial value of "World"
even though such a move wouldn't necessarily remove all ambiguity,
the easiest solution is to declare that formal
On 7/16/2018 3:51 PM, Shai Berger via Unicode wrote:
And I should add, in response to the other points raised in this
thread, from the same page in the core standard: "If the same plain text
sequence is given to disparate rendering processes, there is no
expectation that rendered text in each i
On 5/29/2018 12:49 AM, Richard Wordingham via Unicode wrote:
How would one know that they are misapplied? And what if the author of
the text has broken your rules? Are such texts never to be transcribed
to pukka Unicode?
Applying Tamil -ii (0BC0, Script=Tamil) to the Latin letter a (0061,
On 5/28/2018 9:44 PM, Asmus Freytag via Unicode wrote:
One of the general principles is that combining marks inherit the
property of their base character.
Normally, "inherited" should be the only property value for combining
marks.
There have been some deviations from this over the years,
On 5/28/2018 9:23 PM, Martin J. Dürst via Unicode wrote:
Hello Sundar,
On 2018/05/28 04:27, SundaraRaman R via Unicode wrote:
Hi,
In languages like Ruby or Java
(https://docs.oracle.com/javase/7/docs/api/java/lang/Character.html#isAlphabetic(int)),
functions to check if a character is alp
On 5/23/2018 8:53 AM, Abe Voelker via Unicode wrote:
As a user I find it troublesome because previous messages I've sent
using this character on these platforms may now be interpreted
differently due to the changed representation. That aspect has me
wondering if this change is in line with Uni
On 5/15/2018 2:46 PM, Markus Scherer via Unicode wrote:
I am proposing the addition of 2 new characters to the Musical
Symbols table:
- the half-flat sign (lowers a note by a quarter tone)
- the half-sharp sign (raises a note by a quarter tone)
In an actual proposal, I would
Henri,
There is no formal concept of a public "Editor's Draft" for the Unicode
core specification. This is mostly the result of the tools used for
editing the core specification, which is still structured more like a
book than the usual online internet specification.
Currently the Unicode ed
On 4/2/2018 7:02 PM, Philippe Verdy via Unicode wrote:
We're missing the definition of "ymojis", a safer alternative to
"umojis" (unknown), but that "you" can create yourself for use by
yourself
Not to mention "əmojis", as in "Uh, Moe! Jeez, why are we still talking
about this?!"
--Ken
On 3/9/2018 9:29 AM, via Unicode wrote:
Documented increase such as scientific terms for new elements, flora
and fauna, would seem to be not more than one or two dozen a year.
Indeed. Of the "urgently needed characters" added to the unified CJK
ideographs for Unicode 11.0, two were obscure place
On 3/9/2018 6:58 AM, Marcel Schneider via Unicode wrote:
As for translating the Core spec as a whole, why did two recent attempts crash
even before the maintenance stage, while the 3.1 project succeeded?
Essentially because both the Japanese and the Chinese attempts were
conceived of as comm
On 3/7/2018 1:12 PM, Philippe Verdy via Unicode wrote:
Shouldn't we create a variant of IDS, using combining joiners between
Han base glyphs (then possibly augmented by variant selectors if there
are significant differences on the simplification of rendered strokes
for each component) ? What
On 3/5/2018 9:03 AM, suzuki toshiya via Unicode wrote:
I have a question; if some people try to make a
translated version of Unicode
And to add to Asmus' response, folks on the list should understand that
even with the best of effort, the concept of a "translated version of
Unicode" is a nea
John,
I think this may be giving the list a somewhat misleading picture of the
actual statistics for encoding of CJK unified ideographs. The "500
characters a year" or "1000 characters a year" limits are administrative
limits set by the IRG for national bodies (and others) submitting
repertoi
David,
On 2/22/2018 7:21 PM, David Corbett via Unicode wrote:
My confusion stems from Unicode’s online bidi utility.
That bidi utility has known defects in it. It is not yet conformant with
changes to UBA 6.3, let alone later changes to UBA. And the mapping of
memory position to display pos
On 2/22/2018 11:39 AM, David Corbett via Unicode wrote:
For example, after a right-to-left override, the Hangul string 보기
(“bogi”) becomes 기보 (“gibo”) in visual order. However, its NFD form is
reordered by jamo instead of by syllable; that is, it looks like “igob”.
Nope. *tilt* The UBA reor
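
To make the NFD form of the quoted string concrete, here is a short sketch with Python's standard `unicodedata` module. It shows only the decomposition into jamo; it does not implement or model the UBA.

```python
import unicodedata

text = "\ubcf4\uae30"  # the two Hangul syllables quoted above ("bogi")

# NFD decomposes each precomposed syllable into its conjoining jamo.
nfd = unicodedata.normalize("NFD", text)

for ch in nfd:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# The decomposition is canonical, so NFC restores the original string.
roundtrip = unicodedata.normalize("NFC", nfd)
```

Each syllable here has a leading consonant and a vowel but no trailing consonant, so the two syllables become four jamo.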
On 2/16/2018 11:00 AM, Asmus Freytag via Unicode wrote:
On 2/16/2018 8:00 AM, Richard Wordingham via Unicode wrote:
That doesn't square well with, "An implementation *may* render a valid
Ideographic Description Sequence either by rendering the individual
characters separately or by parsing the
On 2/16/2018 8:22 AM, Ken Whistler wrote:
The Egyptian quadrat controls, on the other hand, are full-fledged
Unicode format controls.
One more point of distinction: The (gc=So) IDC's follow a syntax that
uses Polish notation order for the descriptive operators (inherited from
the int
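
The Polish (prefix) notation order can be sketched with a tiny recursive parser. This is a minimal illustration, assuming only the original twelve Ideographic Description Characters U+2FF0..U+2FFB (two of them ternary, the rest binary); the function name `parse_ids` and the tuple tree shape are invented here, and real IDS validation has more rules than this.

```python
TERNARY = {"\u2ff2", "\u2ff3"}
BINARY = {chr(cp) for cp in range(0x2FF0, 0x2FFC)} - TERNARY


def parse_ids(s):
    """Parse a prefix-notation IDS into nested tuples (operator, args...)."""

    def parse(i):
        if i >= len(s):
            raise ValueError("sequence ends before all operands are given")
        ch = s[i]
        if ch in TERNARY:
            arity = 3
        elif ch in BINARY:
            arity = 2
        else:
            return ch, i + 1  # a plain component, not an operator
        i += 1
        args = []
        for _ in range(arity):
            node, i = parse(i)
            args.append(node)
        return (ch, *args), i

    tree, end = parse(0)
    if end != len(s):
        raise ValueError("trailing characters after a complete sequence")
    return tree


print(parse_ids("\u2ff0\u6c35\u6bcf"))
print(parse_ids("\u2ff1\u8279\u2ff0\u6c35\u6bcf"))
```

Because the operator comes first and its arity is fixed, no brackets are needed to recover the nesting.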
On 2/16/2018 8:00 AM, Richard Wordingham via Unicode wrote:
A more portable solution for ideographs is to render an Ideographic
Description Sequences (IDS) as approximations to the characters they
describe. The Unicode Standard carefully does not prohibit so doing,
and a similar scheme is being
On 2/15/2018 2:24 PM, Philippe Verdy via Unicode wrote:
And it's in the mission of Unicode, IMHO, to promote litteracy
Um, no. And not even literacy, either. ;-)
https://en.wikipedia.org/wiki/Category:Organizations_promoting_literacy
--Ken
On 2/14/2018 12:49 PM, Philippe Verdy via Unicode wrote:
RCLLTHTWHNLPHBTSWRFRSTNVNTDPPLWRTTXTLKTHS !
[ ... lots to say about the history of writing ... ]
And the use (or abuse) of emojis is returning us to the prehistory
when people drew animals on walls of caverns: this was a very slow
On 2/14/2018 12:53 AM, Erik Pedersen via Unicode wrote:
Unlike text composed of the world’s traditional alphabetic, syllabic, abugida
or CJK characters, emoji convey no utilitarian and unambiguous information
content.
I think this represents a misunderstanding of the function of emoji in
wr
Gentlemen,
On 12/14/2017 6:53 AM, Mark Davis ☕️ via Unicode wrote:
Thus I would like people who are both knowledgeable about hieroglyphs
/and/ Unicode properties to weigh in. I know that people like Andrew
Glass are on this list, who satisfy both criteria.
And what constitutes a cluster?
Asmus,
On 12/5/2017 12:35 PM, Asmus Freytag via Unicode wrote:
I don't know the history of this particular "unification"
Here are some clues to guide further research on the history.
The annotation in question was added to a draft of the NamesList.txt
file for Unicode 4.1 on October 7, 2003
On 9/27/2017 2:19 PM, Markus Scherer via Unicode wrote:
On Wed, Sep 27, 2017 at 1:49 PM, James Tauber via Unicode
<unicode@unicode.org> wrote:
I recently updated pyuca[1], my pure Python implementation of the
Unicode Collation Algorithm to work with 8.0.0, 9.0.0, and 10.0.0
Ken,
On 9/27/2017 11:10 AM, Ken Shirriff via Unicode wrote:
The IBM type catalog might be of interest. It describes in great
detail the character sets of the IBM typewriters and line printers and
the custom characters that can be ordered for printer chains and
Selectric type balls. Link:
htt
Asmus,
On 9/27/2017 10:02 AM, Asmus Freytag via Unicode wrote:
In that context it's worth remembering that while you could say
for most typewriters that "the typewriter is the font", there were
noted exceptions. The IBM Selectric, for example, had exchangeable
type balls which allowed
Leo,
On 9/26/2017 9:00 PM, Leo Broukhis via Unicode wrote:
The next time I'm at the Mountain View CHM, I'll try to ask. However,
assuming it was an overstrike of an X and an I, then where does the
"Eris"-like glyph come from? Was there ever an IBM font with a
double-semicircular X like )( ?
Philippe,
Those aren't negative digits, per se. The usage in the manual is with an
overline (or macron) to indicate the flag bit. It does occur over a
zero, and in explanation in the text of floating point operations, it is
also shown over letters (X, M, E) representing digits of the exponent
Leo,
Yeah, I know. My point was that by examining the physical typewriter
keys (the striking head on the typebar, not the images on the keypads),
one could see what could be generated *by* overstriking. I think
Philippe's suggestion that it was simply an overstrike of "X" with an
"I" is proba
The 1620 manual accessed from the Wiki page shows the same information
but with a different glyph (which looks more like the capital zhe, and
is presumably the source of the glyph cited in the Wiki page itself). See:
http://www.bitsavers.org/pdf/ibm/1620/A26-5706-3_IBM_1620_CPU_Model_1_Jul65.pd
Albrecht,
See TUS, Section 18.3, Bopomofo, p. 707:
http://www.unicode.org/versions/Unicode10.0.0/ch18.pdf#G22553
--Ken
On 8/24/2017 12:19 AM, Dreiheller, Albrecht via Unicode wrote:
Hello Chinese experts,
The Letter I in the Bopomofo alphabet (U+3127) has two rendering
variants, a vertic
Manuel,
I suspect that such a link may already be in the works for the
/Public/emoji/ data directory. But if you want to make sure your
suggestion is reviewed by the UTC, you should submit it via the contact
form:
http://www.unicode.org/reporting.html
--Ken
On 7/5/2017 12:37 PM, Manuel Str
On 7/5/2017 10:01 AM, Daniel Bünzli via Unicode wrote:
I know the emoji properties [1] are not formally part of the UCD (not sure
exactly why though),
Because they are maintained as part of an independent standard now (UTS
#51), which is still on track to have a faster turnaround -- and hence
I wonder IF 9 times suffice,
But IF more are required,
I'll tweet ILY, tweet it twice --
Since spelling's been retired.
On 6/21/2017 8:37 AM, William_J_G Overington via Unicode wrote:
Here is a mnemonic poem, that I wrote on Monday 20 February 2017, now published
as U+1F91F is now officially i
On 6/1/2017 8:32 PM, Richard Wordingham via Unicode wrote:
TUS Section 3 is like the Augean Stables. It is a complete mess as a
standards document,
That is a matter of editorial taste, I suppose.
imputing mental states to computing processes.
That, however, is false. The rhetorical turn i
On 6/1/2017 6:21 PM, Richard Wordingham via Unicode wrote:
By definition D39b, either sequence of bytes, if encountered by a
conformant UTF-8 conversion process, would be interpreted as a
sequence of 6 maximal subparts of an ill-formed subsequence.
("D39b" is a typo for "D93b".)
Sorry about
On 6/1/2017 2:39 PM, Richard Wordingham via Unicode wrote:
You were implicitly invited to argue that there was no need to handle
5 and 6 byte invalid sequences.
Well, working from the *current* specification:
FC 80 80 80 80 80
and
FF FF FF FF FF FF
are equal trash, uninterpretable as *anyth
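
How one decoder treats these two byte sequences can be checked directly. The sketch below uses CPython's built-in UTF-8 codec with `errors="replace"`; that is one implementation's behavior, not a statement of what the standard requires.

```python
SIX_BYTE_STYLE = bytes.fromhex("FC8080808080")
ALL_FF = bytes.fromhex("FFFFFFFFFFFF")


def replaced(data):
    """Decode as UTF-8, substituting U+FFFD for each ill-formed piece."""
    return data.decode("utf-8", errors="replace")


def is_well_formed(data):
    """Return True only if strict UTF-8 decoding succeeds."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


for data in (SIX_BYTE_STYLE, ALL_FF):
    out = replaced(data)
    print(data.hex(" "), "->", len(out), "replacement characters")
```

Neither 0xFC nor 0xFF can start a well-formed sequence, and a lone 0x80 cannot either, so every byte is replaced on its own and both inputs come out the same.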
On 5/26/2017 10:28 AM, Karl Williamson via Unicode wrote:
The link provided about the PRI doesn't lead to the comments.
PRI #121 (August, 2008) pre-dated the practice of keeping all the
feedback comments together with the PRI itself in a numbered directory
with the name "feedback.html". But
Richard
On 5/23/2017 1:48 PM, Richard Wordingham via Unicode wrote:
The object is to generate code *now* that, up to say Unicode Version 23.0,
can work out, from the UCD files DerivedAge.txt and
PropertyValueAliases.txt, whether an arbitrary code point was included
by some Unicode version ident
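
A rough sketch of the DerivedAge.txt half of that task follows. The function names are invented here, the sample lines are hand-written in the file's `range ; age # comment` layout rather than copied from a real release, and PropertyValueAliases.txt is not handled at all.

```python
SAMPLE = """\
# Hand-written sample in the DerivedAge.txt layout (not a real excerpt)
0041..005A    ; 1.1 #  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
20AC          ; 2.1 #       EURO SIGN
1F91F         ; 10.0 #      I LOVE YOU HAND SIGN
"""


def parse_derived_age(lines):
    """Return a list of (first, last, (major, minor)) tuples."""
    ranges = []
    for line in lines:
        data = line.split("#", 1)[0].strip()  # drop comments
        if not data:
            continue
        cps, age = (field.strip() for field in data.split(";"))
        first, _, last = cps.partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start
        # Keep versions as integer tuples so that 10.0 sorts after 9.0.
        version = tuple(int(part) for part in age.split("."))
        ranges.append((start, end, version))
    return ranges


def age_of(cp, ranges):
    """Return the version that first assigned cp, or None if unlisted."""
    for start, end, version in ranges:
        if start <= cp <= end:
            return version
    return None


def included_by(cp, version, ranges):
    """True if cp was already assigned as of the given (major, minor)."""
    age = age_of(cp, ranges)
    return age is not None and age <= version


RANGES = parse_derived_age(SAMPLE.splitlines())
print(included_by(0x1F91F, (9, 0), RANGES), included_by(0x1F91F, (10, 0), RANGES))
```

Comparing versions as tuples of integers, not as strings or floats, is the design choice that keeps such code working for version numbers it has not seen yet.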
On 5/3/2017 3:20 AM, William_J_G Overington via Unicode wrote:
Surely a single code point could be found. Single code points are being found
for various emoji items on a continuing basis. Why pull up the ladder on
encoding some flags each with a single code point?
Yes, a single code point for
On 3/29/2017 1:12 PM, Doug Ewell wrote:
Is that common practice in Unicode, that if something doesn't gain
significant traction in the comparatively short term, it becomes a
candidate for deprecation?
If a mechanism was dodgy in the first place and was dubious as a part of
plain text, then ye
On 3/29/2017 1:12 PM, Doug Ewell wrote:
I would think vendors could make their own business decisions about what
flags to support. "Hmm, yeah, definitely Texas, maybe Lombardy, not so
sure about Colorado, probably not Guna Yala." I don't see why they had
to be essentially told what to support an
On 3/27/2017 1:39 PM, Philippe Verdy wrote:
Note also that ISO3166-2 is far from being stable, and this could
contradict Unicode encoding stability: it would then be required to
ensure this stability by only allowing sequences that are effectively
registered in
http://www.unicode.org/Public/
On 3/27/2017 12:17 PM, Doug Ewell wrote:
announcements at Unicode dot org wrote:
— and new regional flags for England, Scotland, and Wales.
It's not clear from this text, nor from the table in Section C.1.1 of
the draft, what the status is of flag emoji tag sequences other than the
three abov
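
For reference, the shape of a flag emoji tag sequence as defined in UTS #51 can be sketched as below: a WAVING BLACK FLAG base, tag characters spelling the subdivision code, and a CANCEL TAG terminator. The helper name `subdivision_flag` is invented here, and building a sequence says nothing about whether it is recommended or supported by any vendor.

```python
BLACK_FLAG = "\U0001F3F4"   # WAVING BLACK FLAG
CANCEL_TAG = "\U000E007F"   # CANCEL TAG
TAG_OFFSET = 0xE0000        # tag character = offset + ASCII code


def subdivision_flag(code):
    """Build the tag sequence for a subdivision code such as 'gbeng'."""
    code = code.lower()
    if not (code.isascii() and code.isalnum()):
        raise ValueError("expected ASCII letters and digits only")
    tags = "".join(chr(TAG_OFFSET + ord(c)) for c in code)
    return BLACK_FLAG + tags + CANCEL_TAG


for code in ("gbeng", "gbsct", "gbwls"):
    seq = subdivision_flag(code)
    print(code, " ".join(f"U+{ord(c):04X}" for c in seq))
```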
On 3/27/2017 7:44 AM, Charlotte Buff wrote:
Now, one of Unicode’s declared goals is to enable round-trip
compatibility with legacy encodings. We’ve accumulated a lot of weird
stuff over the years in the pursuit of this goal. So it would be
natural to assume that the unencoded characters from t
On 3/17/2017 10:27 AM, Julian Bradfield wrote:
If you're working in a situation where you don't have either markup
control or the facility to use plain monospaced text, then just use
normal breves and acutes.
It's not clear to me that laying out aligned text (for which there are
many other appli
On 3/6/2017 2:48 PM, Simon Cozens wrote:
A few years back, there was a set of questions to the UTC (L2/12-133)
asking for direction on encoding Stokoe notation. Did these ever get an
answer, and is there anything currently happening with Stokoe encoding?
The short answer is no.
Stokoe notati
The UN Group of Experts on Geographical Names (UNGEGN) is also relevant:
https://unstats.un.org/unsd/geoinfo/ungegn/default.html
They keep up a list of searchable geographical names databases in a wide
variety of languages:
https://unstats.un.org/unsd/geoinfo/ungegn/geonames.html
--Ken
On
On 2/13/2017 1:26 PM, Christoph Päper wrote:
Ken Whistler :
On 2/13/2017 1:39 AM, Christoph Päper wrote:
- music/rest – is that what 〽️ or 〰️ means?
The first of those is presumably U+303D PART ALTERNATION MARK, and the second
is probably the notorious U+3030 WAVY DASH. So not emoji at all
I can't speak to the missing emoji mappings, but...
On 2/13/2017 1:39 AM, Christoph Päper wrote:
- music/rest – is that what 〽️ or 〰️ means?
The first of those is presumably U+303D PART ALTERNATION MARK, and the
second is probably the notorious U+3030 WAVY DASH. So not emoji at all.
--Ken
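
The identification is easy to confirm with Python's standard `unicodedata` module. In this sketch the two strings are written with escapes and assume that each quoted symbol is the base character followed by U+FE0F, as it appears in the quoted text.

```python
import unicodedata


def names(s):
    """Return the Unicode character name of every code point in s."""
    return [unicodedata.name(ch) for ch in s]


first = "\u303d\ufe0f"
second = "\u3030\ufe0f"

for s in (first, second):
    for ch, name in zip(s, names(s)):
        print(f"U+{ord(ch):04X} {name}")
```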
Richard,
On 2/3/2017 2:35 PM, Richard Wordingham wrote:
Except that the added annotation "also used distinctly as a gemination
mark which can occur with vowels" also applies to U+103A MYANMAR SIGN
ASAT. TUS 9.0 Section 16.3 Myanmar calls the base 'double-acting'
rather than 'geminate', but it'
This is a character under ballot for Amendment 1 to the 5th edition. It
isn't part of the repertoire planned for publication as part of Unicode
10.0 in June.
So if you want to have any impact on the subhead used in the charts for
A7AF, the correct mechanism now is to get a national body commen
Manish,
On 12/22/2016 10:35 AM, Manish Goregaokar wrote:
The property table should include all role and gender modifiers as GAZ.
Could this be updated?
Property values cannot be updated for *published* versions of the
standard. What you should do is submit your feedback as part of the
pub
On 12/20/2016 10:33 AM, Markus Scherer wrote:
Yes. However, some of the discussion in this thread is due to details
that were not spelled out in the PRI. There is basically a 2a and a
2b, while the examples in PRI #121 work the same in both variants.
I wasn't intending to argue the case one
Doug,
On 12/19/2016 6:08 PM, Doug Ewell wrote:
I thought there was a corrigendum or other, comparatively recent
addition to the Standard that spelled out how replacement characters
are supposed to be substituted for invalid code unit sequences --
something about detecting maximally long seque
Forwarded Message
Subject: Re: Should unassigned code points in blocks reserved for
combining marks, etc be GCB extended?
Date: Mon, 12 Dec 2016 08:26:45 -0800
From: Ken Whistler
To: Karl Williamson
On 12/12/2016 6:59 AM, Karl Williamson wrote:
These are
On 12/8/2016 6:41 PM, Fabian Giesen wrote:
1. BidiReferenceJava supports Unicode 6.3.0, but has not been updated
for later versions.
We have an updated version of BidiReferenceJava about ready to deploy
into the PROGRAMS directory.
About the bug you note in BidiReferenceC, I'll investigate
On 11/25/2016 10:20 PM, Janusz S. Bień wrote:
Now there is a follow-up question: why the character was included in
Unicode 1.1.0?
Well, it was included in Unicode 1.1 because it was published in Unicode
1.0 already. So that is the proximate reason.
That inevitably will raise the question, "