On 2/12/2020 3:26 PM, Shawn Steele via
Unicode wrote:
From the point of view of Unicode, it is simpler: If the character is in use or has had use, it should be included somehow.
That bar, to me, seems too low. Many things are only used brie
On 2/2/2020 5:22 PM, Richard Wordingham
via Unicode wrote:
On Sun, 2 Feb 2020 16:20:07 -0800
Eric Muller via Unicode wrote:
That would imply some coordination among variations sequences on
different code points, right?
E.g. <0B48> ≡ <0B47, 0B56>,
https://thehill.com/policy/technology/476086-social-media-users-call-out-twitter-over-kwanzaa-emoji
On 12/17/2019 5:49 PM, James Kass via
Unicode wrote:
Asmus Freytag wrote,
> And any recommendation that is not compatible with what the
overwhelming
> majority of software has been doing should be ignored (or
only
On 12/17/2019 11:31 AM, James Kass via
Unicode wrote:
So it
follows that any justification operation should treat NO-BREAK
SPACE and SPACE identically.
And any recommendation that is not
compatible with what the overwhelming majority of software has
On 12/17/2019 2:41 AM, Shriramana
Sharma via Unicode wrote:
On Tue 17 Dec, 2019, 16:09
QSJN 4 UKR via Unicode,
wrote:
Agree.
By the way,
On 11/19/2019 3:00 PM, Mark E. Shoulson
via Unicode wrote:
It says "foundation", not "sum total,
all there is." I don't think this is much overreach. MAYBE it
counts as "enthusiastic", but not misleading.
Why so concerned
On 11/19/2019 12:04 PM, Michael Everson
via Unicode wrote:
Of course it’s not “misleading”. Human language is best conveyed by text.
One could insert the language in [ ] to make the claim sound less
like an overreach.
It doesn't even impede the f
On 11/12/2019 12:32 PM,
wjgo_10...@btinternet.com via Unicode wrote:
>
Just because you can write something that is a very detailed
specification doesn't mean that it is, or ever should be, a
standard.
Yes, but that does not mean that
On 11/12/2019 8:41 AM,
wjgo_10...@btinternet.com via Unicode wrote:
Asmus
Freytag wrote as follows.
While I have a certain understanding for
the underlying concerns, it still is the case that this proposal
promises to be a bad exam
On 11/9/2019 3:18 PM, Peter Constable
via Unicode wrote:
Neither Unicode Inc. nor ISO/IEC 10646 would _implement_ QID emoji. Unicode would provide a specification for QID emoji that software vendors could implement, while ISO/IEC 10646 would not define that specifica
On 10/13/2019 6:38 PM, Richard
Wordingham via Unicode wrote:
On Sun, 13 Oct 2019 17:13:28 -0700
Asmus Freytag via Unicode wrote:
On 10/13/2019 2:54 PM, Richard Wordingham via Unicode wrote:
Besides invalidating complexity metrics, the issue was
On 10/13/2019 2:54 PM, Richard
Wordingham via Unicode wrote:
Besides invalidating complexity metrics, the issue was what \p{Lu}
should match. For example, with PCRE syntax, GNU grep Version 2.25
\p{Lu} matches U+0100 but not . When I'm respecting
canonical equival
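A small Python sketch of the kind of mismatch described here, using the standard `unicodedata` module rather than grep or PCRE. The decomposed sequence <U+0041, U+0304> is an assumption chosen for illustration, since the second operand in the message did not survive in this excerpt. It only shows that a per-code-point General_Category test sees the two canonically equivalent spellings differently.

```python
import unicodedata

# U+0100 LATIN CAPITAL LETTER A WITH MACRON and its canonical decomposition.
precomposed = "\u0100"
decomposed = unicodedata.normalize("NFD", precomposed)


def categories(text):
    """Return the General_Category of each code point in text."""
    return [unicodedata.category(ch) for ch in text]


# The precomposed form is a single Lu code point; the decomposed form is an
# Lu base letter followed by a nonspacing mark (Mn).
print(categories(precomposed))
print(categories(decomposed))
```

A matcher that tests one code point at a time would therefore match the whole precomposed character but only the base letter of the decomposed one, unless it normalizes first or treats the base plus its marks as a unit.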
On 10/12/2019 1:16 AM, Daniel Bünzli
via Unicode wrote:
With all due respect for the work that has been done on the new website I think that the new structure significantly decreased the usability of the website for technical users.
^^^ This (unfortunatel
Sidebar looks same as on other pages
for me. Don't like the design, but that's a different issue.
Now, stuff on the bottom: the line with
the "terms of use" is at least one font size too small. Esp. if
the terms of use are supposed to be a clickable link.
On 10/6/2019 10:59 PM, David Starner
via Unicode wrote:
I still see the encoding of the original ellipsis as a mistake,
probably for compatibility with some older standard that included it
because the system wasn't smart enough to intelligently handle "..."
as ellip
"typographically incorrect" comma
ellipsis :)
A./
On Sun, Oct 6, 2019 at 5:02
PM Asmus Freytag via Unicode <unicode@unicode.org>
wrote:
On 10/6/2019 4:05 PM, Tex v
On 10/6/2019 4:05 PM, Tex via Unicode
wrote:
Now that comma ellipses (,,,) are a thing
(at least on social media) do we need a character proposal?
Asking for a friend,,, :)
tex
On 9/30/2019 1:01 AM, Andre Schappo via
Unicode wrote:
On Sep 27, 1 Reiwa, at 08:17, Julian Bradfield via Unicode wrote:
Or one could allow IDS to have leaf components that are any
characters, not just ideographic characters, and then one could ha
On 9/29/2019 7:42 AM, Andre Schappo via
Unicode wrote:
Or one could allow IDS to have leaf components that are any
characters, not just ideographic characters, and then one could have
all sorts of fun.
I do like that idea
André Schap
On 9/13/2019 10:50 AM, Richard
Wordingham via Unicode wrote:
On Fri, 13 Sep 2019 08:56:02 +0300
Henri Sivonen via Unicode wrote:
On Thu, Sep 12, 2019, 15:53 Christoph Päper via Unicode
wrote:
ISHY/SIHY is especially useful for
On 9/12/2019 5:53 AM, Christoph Päper
via Unicode wrote:
ISHY/SIHY is especially useful for encoding (German) noun compounds in wrapped titles, e.g. on product labeling, where hyphens are often suppressed for stylistic reasons, e.g. orthographically correct _Spargel
On 8/14/2019 7:49 PM, James Kass via
Unicode wrote:
On 2019-08-15 12:25 AM, Asmus Freytag via Unicode wrote:
Empirically, it has been observed that
some distinctions that are claimed by
users, standards developers or
On 8/14/2019 2:05 AM, James Kass via
Unicode wrote:
This
presumes that the premise of user communities feeling strongly
about the unacceptable aspect of the variants is valid. Since it
has been reported and nothing seems to be happening, perhaps the
On 8/8/2019 1:06 AM, Richard Wordingham
via Unicode wrote:
This is not compliant with Unicode, but
neither is deliberately treating canonically equivalent forms
differently.
That.
A./
rime.
A./
From: Unicode
On Behalf Of Asmus Freytag via Unicode
Sent: 07 August 2019 14:19
To: unicode@unicode.org
Subject: Re: What is the time frame for USE
sha
What about text that must exist
normalized for other purposes?
Domain names must be normalized to NFC,
for example. Will such strings display correctly if passed to USE?
A./
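For readers unfamiliar with what "normalized to NFC" means in practice, here is a minimal Python sketch using the standard `unicodedata` module. The helper name `to_nfc` and the sample string are illustrative assumptions; this is not IDNA processing and says nothing about how USE renders the result.

```python
import unicodedata


def to_nfc(label: str) -> str:
    """Return the NFC (canonically composed) form of a string."""
    return unicodedata.normalize("NFC", label)


# "e" followed by U+0301 COMBINING ACUTE ACCENT composes to U+00E9 under NFC.
decomposed = "cafe\u0301"
composed = to_nfc(decomposed)

print(len(decomposed), len(composed))
print(composed == "caf\u00e9")
```

The two strings are canonically equivalent, so a renderer that respects canonical equivalence should display them identically; the question in the message is whether that holds for strings passed to USE.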
On 8/7/2019 1:39 PM, Andrew Glass via
Unicode wro
On 7/22/2019 10:00 AM, Ken Whistler via
Unicode wrote:
Your
helpful suggestions will be passed along to the people working on
the new site.
In the meantime, please note that the link to the "Unicode
Technical Site" has been added to th
There's really no inherent need for
many spacing combining marks to have a base character. At least
the ones that do not reorder and that don't overhang the base
character's glyph.
As far as I can tell, it's largely a
convention that originally
On 7/17/2019 6:03 PM, Richard
Wordingham via Unicode wrote:
On Thu, 18 Jul 2019 01:54:52 +0200
Philippe Verdy via Unicode wrote:
In fact the ligatures system for the "cursive" Egyptian Hieratic is so
complex (and may also have its own variants show
A question has come up in another
context:
Is there any linguistic term for
describing the process of removing accents and diacritics from a
word to create its “base form”, e.g. São Tomé to Sao Tome?
The linguistic term "string normalization" appear
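Whatever the right term turns out to be, the operation in the question is commonly implemented by decomposing the text and dropping the combining marks. A minimal Python sketch, assuming that approach (the function name `strip_marks` is hypothetical, and letters with no canonical decomposition, such as "ø", are left unchanged):

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Decompose to NFD, then drop nonspacing combining marks (category Mn)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


print(strip_marks("São Tomé"))
```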
On 5/31/2019 7:12 AM, Michael Everson
via Unicode wrote:
No, thank you.
Not so fast. I think we need to hear from the telemedicine
community first.
A./
On 31 May 2019, at 11:18, bristol_poo via Unicode wrote:
Gre
On 5/30/2019 1:07 AM, Andre Schappo via
Unicode wrote:
This tweet made me laugh twitter.com/padolsey/status/1133835770773626881 😀🤯
André Schappo
On 5/15/2019 4:22 AM, Costello, Roger
L. via Unicode wrote:
Hello Unicode experts!
Which is correct:
(a) The input file contains a string. The string is encoded using UTF-8.
(b) The input file contains a string. The string is encoded with UTF-8.
(c) The input fi
On 5/2/2019 8:44 AM, J Andrew Lipscomb
via Unicode wrote:
Why not just use U+25E4 and U+25E2 for the triangles, and U+2215 for the diagonal?
Why not wait for evidence of that scheme
being used in text. Then we know.
A./
On 5/1/2019 3:23 AM, Shriramana Sharma
via Unicode wrote:
http://www.unicode.org/L2/L-curdoc.htm
The number of emoji-related proposals seems to be increasing compared
to the number of script-related ones.
Have we reached a plateau re scripts encoding?
Somehow thi
On 4/19/2019 6:57 PM, Shriramana Sharma
via Unicode wrote:
I don't know many modern fonts that display 007C
as a broken glyph. In fact I haven't seen a broken line pipe
glyph since the MS-DOS days. Nowadays we have 00A6 for that.
I
suspect that this work would be jibber-jabber to any non-English
speaker unfamiliar with the original Haggadah. No matter how
otherwise fluent they might be in emoji communication.
You can't escape fundamental theses:
There
On 2/22/2019 7:29 AM, Richard
Wordingham via Unicode wrote:
On Fri, 22 Feb 2019 09:07:06 +
Richard Wordingham via Unicode wrote:
My best hypothesis (not thoroughly tested) is that Windows currently
has InSc=Consonant_Killer, but can I look his
On 2/13/2019 5:19 PM, Mark E. Shoulson
via Unicode wrote:
And
again, all this is before we even consider other issues; I can't
shake the feeling that there are security nightmares lurking inside
this idea.
Default ignorables are bad juju.
A./
On 2/9/2019 12:07 PM, Egmont Koblinger
via Unicode wrote:
On Sat, Feb 9, 2019 at 9:01 PM Eli Zaretskii wrote:
then what you say is that some scripts
can never be supported by text terminals.
I'm not familiar at all with all the scrip
On quick reading this appears to be a
strong argument why such emulators will
never be able to be used for certain
scripts. Effectively, the model described works
well with any scripts where characters
are laid out (or can be laid out) in fixed
width cells t
On 2/8/2019 5:42 PM, James Kass via
Unicode wrote:
William,
Rather than having the user insert the VS14 after every character,
the editor might allow the user to select a span of text for
italicization. Then it would be up to
On 2/8/2019 2:08 PM, Richard Wordingham
via Unicode wrote:
On Fri, 8 Feb 2019 17:16:09 + (GMT)
"wjgo_10...@btinternet.com via Unicode" wrote:
Andrew West wrote:
Just reminding you that "The initial char
On 2/4/2019 1:00 PM, Richard Wordingham
via Unicode wrote:
To me, 'visual order' means in the dominant order of the script.
Visual order is a term of art, meaning the characters are ordered
in memory in the same order as they are displayed on the scr
On 2/4/2019 11:21 AM, Costello, Roger
L. via Unicode wrote:
Hello Unicode Experts!
As I understand it, endian-ness applies to multi-byte words.
Endian-ness does not apply to ASCII characters because each character is a single byte.
Endian-ness does apply to UTF-1
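The distinction in the question can be seen directly by encoding a character with Python's built-in codecs. This is only an illustrative sketch; the sample characters are assumptions, not taken from the thread.

```python
text = "A"

# ASCII: one byte per character, so there is no byte order to choose.
ascii_bytes = text.encode("ascii")

# UTF-16: each code unit is two bytes, so the byte order matters.
utf16_be = text.encode("utf-16-be")  # most significant byte first
utf16_le = text.encode("utf-16-le")  # least significant byte first

# UTF-8: a character may take several bytes, but the code unit is a single
# byte and the sequence order is fixed, so no byte-order variants exist.
utf8_bytes = "\u00e9".encode("utf-8")

print(ascii_bytes, utf16_be, utf16_le, utf8_bytes)
```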
On 1/31/2019 12:55 AM, Tex via Unicode
wrote:
As with the many problems with walls not being effective, you choose to ignore the legitimate issues pointed out on the list with the lack of italic standardization for Chinese braille, text to voice readers, etc.
The ch
On 1/30/2019 7:46 PM, David Starner via
Unicode wrote:
On Sun, Jan 27, 2019 at 12:04 PM James Kass via Unicode
wrote:
A new beta of BabelPad has been released which enables input, storing,
and display of italics, bold, strikethrough, and underline i
On 1/30/2019 4:38 PM, Kent Karlsson via
Unicode wrote:
I did say "multiple" and "for instance". But since you ask:
ITU T.416/ISO/IEC 8613-6 defines general RGB & CMY(K) colour control
sequences, which are deferred in ECMA-48/ISO 6429. (The RGB one
is implemented in
Arabic terminals and terminal emulators
existed at the time of Unicode 1.0. If you are trying to emulate
those services, for example so that older software can run, you
would need to look at how these programs expected to be fed their
data.
I see lit
On 1/26/2019 10:08 PM, Richard
Wordingham via Unicode wrote:
On Sat, 26 Jan 2019 21:11:36 -0800
Asmus Freytag via Unicode wrote:
On 1/26/2019 5:43 PM, Richard Wordingham via Unicode wrote:
That appears to
On 1/26/2019 7:53 PM, Richard
Wordingham via Unicode wrote:
On Sun, 27 Jan 2019 01:55:29 +
James Kass via Unicode wrote:
Richard Wordingham replied to Asmus Freytag,
>> To make matters worse, users for languages that "should" use
>> U+02BC a
On 1/26/2019 6:25 PM, Michael Everson
via Unicode wrote:
On 27 Jan 2019, at 01:37, Richard Wordingham via Unicode wrote:
I’ll be publishing a translation of Alice into Ancient Greek in due
course. I will absolutely only use U+20
On 1/26/2019 5:43 PM, Richard
Wordingham via Unicode wrote:
On Sat, 26 Jan 2019 17:11:49 -0800
Asmus Freytag via Unicode wrote:
To make matters worse, users for languages that "should" use U+02BC
aren't actually consistent; much d
On Fri, Jan 25, 2019 at 11:07 PM Asmus Freytag
via Unicode <unicode@unicode
On 1/25/2019 10:05 AM, James Kass via
Unicode wrote:
For U+2019, there's a note saying 'this is the preferred character
to use for apostrophe'.
Mark Davis wrote,
> When it is between letters it doesn't cause a wor
On 1/25/2019 9:39 AM, James Tauber via
Unicode wrote:
Thank you, although the word break does still
affect things like double-clicking to select.
And people do seem to want to use U+02BC for this reason
(and I'm try
On 1/24/2019 9:44 PM, Garth Wallace via
Unicode wrote:
But the root problem isn't the kludge, it's the lack of
functionality in these systems: if Twitter etc. simply
implemented some styling on their own, the whole thing would be
a moot point
On 1/20/2019 2:55 PM, James Kass via
Unicode wrote:
On 2019-01-20 10:49 PM, Garth Wallace wrote:
I think the real solution is for Twitter
to just implement basic styling and make this a moot point.
At which ti
On 1/20/2019 2:49 PM, Garth Wallace via
Unicode wrote:
I think the real solution is for Twitter to just
implement basic styling and make this a moot point.
Twitter FB and CO should implement a common "MarkDown" sch
On 1/19/2019 3:53 AM, James Kass via
Unicode wrote:
Marcel Schneider wrote,
> When you ask for knowing the foundations and that knowledge
is persistently refused,
> you end up believing that those foundations just can’t
On 1/19/2019 12:34 PM, James Kass via
Unicode wrote:
On 2019-01-19 6:19 PM, wjgo_10...@btinternet.com wrote:
> It seems to me that it would be useful to have some codes
that are
> ordinary characters in some contexts yet
On 1/18/2019 11:34 PM, Marcel Schneider
via Unicode wrote:
Current
practice in electronic publishing was to use a non-breakable
thin space, Philippe Verdy reports. Did that information come
in somehow?
==> prob
On 1/18/2019 2:46 PM, Shawn Steele via
Unicode wrote:
>> That
should not impact all other users out there interested in a
civilized layout.
I’m not sure
that the choice of the word “civilized” adds value to the
conversation. We
On 1/18/2019 2:05 PM, Marcel Schneider
via Unicode wrote:
On 18/01/2019 20:09, Asmus Freytag
via Unicode wrote:
Marcel,
about your many detailed *technical* questions about the
history of character
I would fully agree and I think Mark puts it really well in the
message below why some of the proposals brandished here are no
longer plain text but "not-so-plain" text.
I think we are better served with a solution that provides some
form of "light" rich text, for ba
Marcel,
about your many detailed *technical* questions about the history
of character properties, I am afraid I have no specific
recollection.
French is not the only language that uses a space to group
figures. In fact, I grew up with thousands separators being
On 1/18/2019 7:27 AM, Marcel Schneider
via Unicode wrote:
Covering existing
character sets (National, International and Industry)
was an (not "the") important goal at
the time: such cov
On 1/18/2019 7:27 AM, Marcel Schneider
via Unicode wrote:
I understand only better
why a significant majority of UTC is hating French.
Francophobia is also palpable in Canada, beyond any
technical reasons, especially in the IT indus
On 1/17/2019 9:35 AM, Marcel Schneider
via Unicode wrote:
[quoted mail]
But the French "espace fine insécable" was requested
long long before Mongolian was discussed for encoding in
On 1/16/2019 7:38 PM, James Kass via
Unicode wrote:
Computer
text tradition aside, nobody seems to offer any legitimate reason
why such information isn't worthy of being preservable in
plain-text. Perhaps there isn't one.
By introducing s
On 1/16/2019 6:33 AM, Marcel Schneider
via Unicode wrote:
So to
date, Unicode has only made half its way, and for every single
script in the
Standard there is another script out there that remains still
unsupported.
First things first.
On 1/14/2019 5:41 PM, Mark E. Shoulson
via Unicode wrote:
On 1/14/19 5:08 AM, Tex via Unicode
wrote:
This thread has gone on for a bit and
I question if there is any more light th
From:
Unicode [mailto:unicode-boun...@unicode.org] On
Behalf Of Asmus Freytag via Unicode
Sent: Monday, January 14, 2019 1:21 PM
To: unicode@unicode.org
Subjec
On 1/14/2019 2:43 PM, James Kass via
Unicode wrote:
Hans Åberg wrote,
> How about using U+0301 COMBINING ACUTE ACCENT: 𝑝𝑎𝑠𝑠𝑒́
Thought about using a combining accent. Figured it would just
display with a dotted ci
On 1/14/2019 3:37 PM, Richard
Wordingham via Unicode wrote:
On Tue, 15 Jan 2019 00:02:49 +0100
Hans Åberg via Unicode wrote:
On 14 Jan 2019, at 23:43, James Kass via Unicode
wrote:
Hans Åberg wrote,
How about
On 1/14/2019 2:58 PM, David Starner via
Unicode wrote:
Source code is an example of plain text, and yet adding italics into
comments would require but a trivial change to editors. If the user
audience cared, it would have been done. In fact, I suspect there
exist ed
On 1/14/2019 2:08 AM, Tex via Unicode
wrote:
Perhaps the question should be put to
twitter, messaging apps, text-to-voice vendors, and others
whether it will be useful or not.
If the discussion continues I would like
to see more of a co
On 1/12/2019 5:22 AM, Richard
Wordingham via Unicode wrote:
On Sat, 12 Jan 2019 10:57:26 + (GMT)
Julian Bradfield via Unicode wrote:
It's also fundamentally misguided. When I _italicize_ a word, I am
writing a word composed of (plain old) lette
On 1/9/2019 4:41 PM, Mark E. Shoulson
via Unicode wrote:
On 1/9/19 2:30 AM, Asmus Freytag via
Unicode wrote:
English use of italics on isolated words
to disambiguate the reading of some sentences is a
On 1/9/2019 1:37 AM, Tex via Unicode
wrote:
James Kass wrote:
If a text is published in all italics, that’s style/font
choice. If a text is published using italics and roman
contrasti
On 1/9/2019 1:06 AM, James Kass via
Unicode wrote:
Asmus Freytag wrote,
> Still, not supported in plain text (unless you abuse the
> math alphabets for things they were not intended for).
The unintended usa
On 1/8/2019 10:58 PM, James Kass via
Unicode wrote:
If a text is published in all italics, that’s style/font choice.
If a text is published using italics and roman contrastively and
consistently, and everybody else is doing it pretty much the same
On 1/8/2019 1:11 PM, James Kass via
Unicode wrote:
Asmus Freytag wrote,
> ...
> (for an extreme example there's an orthography
> out there that uses @ as a letter -- we know that
> won't work well wit
On 1/7/2019 10:40 PM, Marcel Schneider
via Unicode wrote:
The
pitch is that if some languages are still considered “needing”
rich text where others are correctly represented in plain text
(stress, abbreviations), the Standard needs to be updated in a way
On 1/7/2019 7:46 PM, James Kass via
Unicode wrote:
Making
recommendations for the post processing of strings containing the
combining low line strikes me as being outside the scope of
Unicode, though.
Agreed.
Those kinds of things are effe
On 11/22/2018 11:58 AM, Carl via
Unicode wrote:
(It looks like my HTML email got scrubbed, sorry for the double post)
Hi,
In Chapter 3 Section 13, the Unicode spec defines D146:
"A string X is a compatibility caseless match for a string Y if and only if: NFKD(t
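The quoted definition is cut off in this excerpt. As a rough illustration only, the sketch below assumes the form of D146 as it appears in the published standard, NFKD(toCasefold(NFKD(toCasefold(NFD(X))))) compared against the same expression for Y, and uses Python's `str.casefold` as a stand-in for toCasefold. The function names are hypothetical.

```python
import unicodedata


def _compat_fold(text: str) -> str:
    """Apply NFKD(casefold(NFKD(casefold(NFD(text)))))."""
    nfd = unicodedata.normalize("NFD", text)
    first = unicodedata.normalize("NFKD", nfd.casefold())
    return unicodedata.normalize("NFKD", first.casefold())


def compat_caseless_match(x: str, y: str) -> bool:
    """True if x and y are equal after compatibility case folding."""
    return _compat_fold(x) == _compat_fold(y)


print(compat_caseless_match("\ufb01", "FI"))         # U+FB01 ligature vs. "FI"
print(compat_caseless_match("Stra\u00dfe", "STRASSE"))
print(compat_caseless_match("a", "b"))
```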
Precisely. Not in the context of character coding so much as just
in terms of learning about writing systems. For example, is it
something that was absolutely common with "standardized"
conventions, or more of an ad-hoc thing?
A./
On 11/11/2018 4:20 PM, Mark E. Shoulson
via Unicode wrote:
On
11/11/18 4:16 PM, Asmus Freytag via Unicode wrote:
On 11/11/2018 12:32 PM, Hans Åberg via
Unicode wrote:
Wir sind uns dessen bewusst, dass von
On 11/11/2018 12:32 PM, Hans Åberg via
Unicode wrote:
On 11 Nov 2018, at 07:03, Beth Myre via Unicode wrote:
Hi Mark,
This is a really cool find, and it's interesting that you might have a relative mentioned in it. After looking at it more, I'm
On 11/10/2018 10:03 PM, Beth Myre via
Unicode wrote:
Hi Mark,
I (re-)transliterated it, and it reads:
Wir sind uns dessen bewusst, dass von
Seite der
Gege
On 11/2/2018 4:31 AM, James Kass via
Unicode wrote:
Suppose someone found a hundred year old form from Poland which
included a section for "sign your name" and "print your name"
which had been filled out by a man with the typically Polish name
On 11/1/2018 7:59 PM, James Kass via
Unicode wrote:
Alphabetic script users write things the way they are spelled and
spell things the way they are written. The abbreviation in
question as written consists of three recognizable symbols. An
On 11/1/2018 10:23 AM, Janusz S. Bień
via Unicode wrote:
On Thu, Nov 01 2018 at 8:43 -0700, Asmus Freytag via Unicode wrote:
On 11/1/2018 12:33 AM, Janusz S. Bień via Unicode wrote:
On Wed, Oct 31 2018 at 12:14 -0700, Ken Whistler via Unicode
On 11/1/2018 12:33 AM, Janusz S. Bień
via Unicode wrote:
On Wed, Oct 31 2018 at 12:14 -0700, Ken Whistler via Unicode wrote:
On 10/31/2018 11:27 AM, Asmus Freytag via Unicode wrote:
but we don't have an agreement
On 11/1/2018 12:52 AM, Richard
Wordingham via Unicode wrote:
On Wed, 31 Oct 2018 11:35:19 -0700
Asmus Freytag via Unicode wrote:
On the other hand, I'm a firm believer in applying certain styling
attributes to things like e-mail or discu
Organic chemistry would need sub/sup
alpha, beta and gamma (perhaps others).
A./
On 10/31/2018 3:35 PM, Piotr Karocki
via Unicode wrote:
We don't know whether the abbreviation "Mr", spelled exactly this way,
already existe
On 10/31/2018 3:37 PM, Marcel Schneider
via Unicode wrote:
On 31/10/2018 19:42, Asmus Freytag via Unicode wrote:
On 10/31/2018 11:10 AM, Marcel Schneider via Unicode wrote:
which, if my understanding of
On 10/31/2018 9:03 AM, Khaled Hosny via
Unicode wrote:
A while ago I was localizing some application to Arabic and the developer
“helpfully” used m² for square meter, but that does not work for Arabic
because there is no superscript ٢ in Unicode, so I had to contact the
On 10/31/2018 10:18 AM, Marcel
Schneider via Unicode wrote:
On 31/10/2018 at 17:03, Khaled Hosny wrote:
A while ago I was localizing some application to Arabic and the developer
“helpfully” used m² for square meter, but that does not work for Arabic
bec