Re: Unicode "no-op" Character?

2019-06-22 Thread Rebecca T via Unicode
Perhaps a code point from a private use area, plus another processing step to
add/remove it, would work for you?
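
A minimal sketch of that idea in Python, assuming both ends of the pipeline
have agreed that U+E000 means "padding" (the choice of U+E000 and the helper
names are my own illustration, not anything Unicode defines):

```python
# Hypothetical convention: every cooperating process treats U+E000
# (a Private Use Area code point) as padding with no meaning.
PAD = "\uE000"


def add_padding(text, width):
    """Append PAD until the string is at least `width` code points long."""
    return text + PAD * max(0, width - len(text))


def strip_padding(text):
    """Remove every PAD character, wherever it appears."""
    return text.replace(PAD, "")


# E + COMBINING ACUTE ACCENT, padded to 8 code points and recovered.
padded = add_padding("E\u0301", 8)
print(len(padded), strip_padding(padded) == "E\u0301")

# Padding that lands between a base letter and its combining mark is
# harmless only after the agreed stripping step has run.
print(strip_padding("E" + PAD + "\u0301") == "E\u0301")
```

This only works between processes that know the convention; anything else
will treat U+E000 as an ordinary (and probably unrenderable) character.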

On Sat, Jun 22, 2019, 1:39 AM Mark Davis ☕️ via Unicode 
wrote:

> There is nothing like what you are describing. Examples:
>
>1. Display — There are a few of the Default Ignorables that are always
>treated as invisible, and have little effect on other characters. However,
>even those will generally interfere with the display of sequences (e.g.
>between 'q' and U+0308 ( q̈ ); within emoji sequences, within
>ligatures, etc), line break, etc.
>2. Interpretation — There is no character that would always be ignored
>by all processes. Some processes may ignore some characters (eg a search
>indexer may ignore most default ignorables), but there is nothing that all
>processes will ignore.
>
> The only exception would be if some cooperating processes had agreed
> beforehand to strip some particular character.
>
> Mark
>
>
> On Sat, Jun 22, 2019 at 6:49 AM Sławomir Osipiuk via Unicode <
> unicode@unicode.org> wrote:
>
>> Does Unicode include a character that does nothing at all? I’m talking
>> about something that can be used for padding data without affecting
>> interpretation of other characters, including combining chars and
>> ligatures. I.e. a character that could hypothetically be inserted between a
>> latin E and a combining acute and still produce É. The historical
>> description of U+0016 SYNCHRONOUS IDLE seems like pretty much exactly what
>> I want. It only has one slight disadvantage: it doesn’t work. All software
>> I’ve tried displays it as an unknown character and it definitely breaks up
>> combinations. And U+0000 NULL seems even worse.
>>
>>
>>
>> I can imagine the answer is that this thing I’m looking for isn’t a
>> character at all and so should be the business of “a higher-level protocol”
>> and not what Unicode was made for… but Unicode does include some odd things
>> so I wonder if there is something like that regardless. Can anyone offer
>> any suggestions?
>>
>>
>>
>> Sławomir Osipiuk
>>
>


Re: Correct way to express in English that a string is encoded ... using UTF-8 ... with UTF-8 ... in UTF-8?

2019-05-15 Thread Rebecca T via Unicode
I think that colloquially “the file contains a UTF-8 string” is best, but
perhaps not in more formal communications.

On Wed, May 15, 2019, 7:24 AM Costello, Roger L. via Unicode <
unicode@unicode.org> wrote:

> Hello Unicode experts!
>
> Which is correct:
>
> (a) The input file contains a string. The string is encoded using UTF-8.
>
> (b) The input file contains a string. The string is encoded with UTF-8.
>
> (c) The input file contains a string. The string is encoded in UTF-8.
>
> (d) Something else (what?)
>
> /Roger
>
>


Re: Reminder Ribbon U+1F397

2018-06-23 Thread Rebecca T via Unicode
>
> Perhaps a larger set of ribbon characters, with defined colors for
> each, is called for?


[image: crying pointing gun.jpg]

But uh, seriously, the ribbon was encoded in the Great Wingdings and
Webdings Migration of 2011 (see L2/12-368 p.21) and I would imagine that
future ribbon characters would be rejected precisely *because* they
“have different associations when used in various campaigns and movements.”

On Fri, Jun 22, 2018 at 2:08 AM Daniel R. Tobias via Unicode <
unicode@unicode.org> wrote:

> The Unicode standard for the Reminder Ribbon character (U+1F397) does
> not appear to specify or suggest a color for the ribbon (the glyph
> shown in the code chart is black, like other characters there).
> Platforms that support this character among the other emojis do
> however assign a color to it, as seen in character pick lists as well
> as where the character is shown in sent or received messages. This,
> however, is not done with any consistency; different platforms have
> used yellow, blue, and red ribbons, as shown here:
>
> https://emojipedia.org/reminder-ribbon/
>
> Different colors have different associations when used in various
> campaigns and movements; some are listed here:
>
> https://en.wikipedia.org/wiki/List_of_awareness_ribbons
>
> This can produce confusion when somebody uses the character (e.g., in
> a tweet or text message) in association with a campaign that uses the
> color that happens to match that used in the sender's platform (for
> instance, yellow ribbons have been in current use to call for release
> of Catalan prisoners held by Spain) but a reader of the message on a
> different platform sees it differently, with a color that might have
> different associations.
>
> Perhaps a larger set of ribbon characters, with defined colors for
> each, is called for? Or is this better done by creating composite
> characters with the existing ribbon character combined with a
> color-specifying code point?
>
> --
> == Dan ==
> Dan's Mail Format Site: http://mailformat.dan.info/
> Dan's Web Tips: http://webtips.dan.info/
> Dan's Domain Site: http://domains.dan.info/
>
>
>


Re: Can NFKC turn valid UAX 31 identifiers into non-identifiers?

2018-06-04 Thread Rebecca T via Unicode
I think that the benefits of inclusion from allowing non-ASCII identifiers
far outweigh any corner cases this might cause. (Although ironing out and
analyzing those is of course important, I don’t think they should be
obstacles for implementing this kind of thing.)

Something I’d love to see is translated keywords; shouldn’t be hard with a
line in the cargo.toml for a rudimentary lookup. Again, I’m of the opinion
that an imperfect implementation is better than no attempt. I remember
reading an article about a professor who translated the keywords in...
maybe it was Python? And found their students were much more engaged with
the material. Anecdotal, of course, but it’s stuck with me.

On Mon, Jun 4, 2018 at 3:53 PM Manish Goregaokar via Unicode <
unicode@unicode.org> wrote:

> Hi,
>
> The Rust community is considering adding non-ASCII identifiers, which
> follow UAX #31 (XID_Start XID_Continue*, with tweaks). The proposal also
> asks for identifiers to be treated as equivalent under NFKC.
>
> Are there any cases where this will lead to inconsistencies? I.e. can the
> NFKC of a valid UAX 31 ident be invalid UAX 31?
>
> (In general, are there other problems folks see with this proposal?)
>
>
> Thanks,
> -Manish
>
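
One way to probe the question empirically is sketched below. It leans on
Python's `str.isidentifier()`, which is only roughly UAX #31 (it is based on
XID_Start / XID_Continue but also accepts a leading underscore), so treat it
as an approximation rather than a check against Rust's proposed rules. The
function names are my own.

```python
import unicodedata


def nfkc_keeps_identifier(name):
    """True if `name` is an identifier and its NFKC form is one too."""
    normalized = unicodedata.normalize("NFKC", name)
    return name.isidentifier() and normalized.isidentifier()


def find_nfkc_breakers(prefix="a"):
    """Brute-force scan: code points that are valid after `prefix` but whose
    NFKC form turns the whole string into a non-identifier."""
    breakers = []
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # skip surrogate code points
        candidate = prefix + chr(cp)
        if candidate.isidentifier() and not nfkc_keeps_identifier(candidate):
            breakers.append(cp)
    return breakers


# U+FB01 LATIN SMALL LIGATURE FI normalizes to "fi" under NFKC.
print(unicodedata.normalize("NFKC", "\ufb01le"))
print(nfkc_keeps_identifier("\ufb01le"))
```

Calling `find_nfkc_breakers()` with an empty prefix tests code points in
start position instead; the results depend on the Unicode version bundled
with the interpreter.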


Re: Unicode education in UK Schools

2017-07-08 Thread Rebecca T via Unicode
> That might be a good thing.

Yeah. Very seriously, it’s very important to introduce Unicode early on in CS
education, even in a “hey, it’s not OK to exclude people who don’t speak
English or people whose names have diacritics from using the programs you
create” sort of way.

Ignorance of and apathy toward the world’s citizens are terrible things, and I
hope that every year brings more access to tech, Unicode-enabled and ready, to
more of humanity.


On Fri, Jul 7, 2017 at 3:55 PM, Doug Ewell via Unicode 
wrote:

> Asmus Freytag wrote:
>
> > I've not (yet) located any assignments that try to address any of the
> > "tricky" issues in the use of Unicode.
>
> That might be a good thing. Many introductory lessons or chapters or
> talks about Unicode dive almost immediately into the complexities and
> weirdnesses, much more so than with other technical topics. This scares
> newbies and they walk away thinking every aspect of Unicode is complex
> and weird.
>
> --
> Doug Ewell | Thornton, CO, US | ewellic.org
>
>


Re: unihan-etl: create exports of UNIHAN db to csv, json and yaml

2017-05-30 Thread Rebecca T via Unicode
Oh, thank god. I’ve wanted something like this for ages, but I’ve been too
lazy to invest the time to create a serious tool — I’ve used a lot of messy
one-time regular expressions. Will definitely be starring your repo!


Re: Proposal to add standardized variation sequences for chess notation

2017-04-07 Thread Rebecca T
> while evidently there are users who need to send BROCCOLI to one another,
> nobody but nobody needs to send an 8 x 8 chessboard matrix in a tweet. Get
> it?

I simply must disagree; sending a textual chessboard sounds awesome! A Twitter
bot that plays chess with you and shows you a graphical representation of the
board would be great!

Don’t interpret this as advocacy for making chess pieces emoji, however;
although that might be interesting, I’ll leave the actual decision making to
those more experienced in that particular domain — I’m simply saying that I
think there are lots of rad potential applications for putting chessboards in
tweets. Oh! We could host tournaments on Twitter and merge the discussion into
the actual tournament! That would be super cool!


Re: PETSCII mapping?

2017-04-06 Thread Rebecca T
Count me in!

I’m partial to one large unified proposal, FWIW.

On Thu, Apr 6, 2017 at 2:24 PM, Rebecca Bettencourt 
wrote:

> On Thu, Apr 6, 2017 at 10:43 AM, Doug Ewell  wrote:
>
>> Michael Everson wrote:
>>
>> > Everybody interested, raise your hand…
>>
>> I'm in. 
>
>
> I'm in as well of course.
>
>
>> Rebecca Bettencourt wrote:
>>
>> > The question is, do we want to add these missing graphics characters
>> > incrementally, platform by platform, or put together a larger proposal
>> > for, say, one big Block Elements Extended block?
>>
>> I would guess the latter. There's no tremendous rush; there should be
>> time to do a proper analysis of target platforms, evaluate which
>> proposed characters should be unified with existing or other proposed
>> characters, and so forth.
>>
>> Of course there's no guarantee this will be the last request ever for
>> 8-bit computer compatibility characters, but there doesn't seem much
>> point in intentionally dragging the process out, platform by platform.
>>
>
> You make a good point. I'm in either way. :)
>
>


Re: PETSCII mapping?

2017-04-06 Thread Rebecca T
Here’s a copy of the Teletext character set [1]; it includes box-drawing
characters for all combinations of a 2×3 grid of cells. 2⁶ = 64 characters, so
we might need a new block.

[1]: http://www.galax.xyz/TELETEXT/CHARSET.HTM
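
To make the 2⁶ = 64 count concrete, here is a small sketch that enumerates
every fill pattern of a 2×3 grid as a 6-bit mask. The bit ordering (bit 0 =
top-left, bit 5 = bottom-right) is an assumption of mine for illustration,
not Teletext's actual code assignment.

```python
def sextant_rows(mask):
    """Render a 6-bit mask as three rows of two cells.

    '#' is a filled cell and '.' an empty one. Bit 0 is the top-left cell,
    bit 1 the top-right, and so on down to bit 5 at the bottom-right
    (an assumed ordering).
    """
    cells = ["#" if (mask >> bit) & 1 else "." for bit in range(6)]
    return ["".join(cells[row * 2 : row * 2 + 2]) for row in range(3)]


all_patterns = [sextant_rows(mask) for mask in range(2 ** 6)]
print(len(all_patterns))

# Show one pattern: top-left and bottom-right cells filled.
for row in sextant_rows(0b100001):
    print(row)
```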


Re: PETSCII mapping?

2017-04-05 Thread Rebecca T
The Wikipedia page for PETSCII [1] only marks 20 characters as not having
Unicode equivalents: 2px (light) and 3px (heavy) horizontal and vertical bars
at various non-center positions, diagonal shading characters, and corner
characters.

I’ve done some processing to the table on [1] to filter out the missing
characters — their exact codepoints and descriptions can be found in [2].

These characters are highlighted in red in the attached image (green
characters are also missing but are duplicates of other characters in the
chart), and marked by U+FFFD � in the compact table [3].

The box-drawing characters seem to semantically represent lines (boxes) and
the block elements seem to represent shapes and shades; this makes $7c, $7f,
$a7, $a8, $a9, $b6, $b7, and $b8 block elements and the rest box-drawing
characters.

[1]: https://en.m.wikipedia.org/wiki/PETSCII
[2]: https://github.com/years/Unicode-PETSCII/blob/master/new.txt
[3]: https://github.com/years/Unicode-PETSCII/blob/master/graphic-table.txt
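
For anyone who wants to reproduce the compact-table idea, a rough sketch
follows. The three-entry mapping is a hand-picked illustration (not the
real table in [2] or [3]), and the function names are hypothetical; `None`
stands for a PETSCII code that had no Unicode equivalent when this was
written.

```python
REPLACEMENT = "\uFFFD"  # U+FFFD REPLACEMENT CHARACTER

# Illustrative subset only: $41 is 'A' in the unshifted set, while $7c and
# $7f are among the codes listed above as lacking Unicode equivalents.
sample_mapping = {
    0x41: "A",
    0x7C: None,
    0x7F: None,
}


def compact_row(codes, mapping):
    """Render the given PETSCII codes, using U+FFFD for unmapped ones."""
    return "".join(mapping.get(code) or REPLACEMENT for code in codes)


def missing_codes(mapping):
    """List the PETSCII codes that have no Unicode equivalent, in order."""
    return sorted(code for code, char in mapping.items() if char is None)


print(compact_row([0x41, 0x7C, 0x7F], sample_mapping))
print(["$%02x" % code for code in missing_codes(sample_mapping)])
```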

[image: Inline image 1]

On Wed, Apr 5, 2017 at 11:32 PM, Rebecca Bettencourt 
wrote:

> On 6 April 2017 at 09:44, James Kass  wrote:
>
>> Rebecca Bettencourt wrote,
>>
>> > I can put together a unified chart, with mappings to Unicode where
>> > they exist. In fact I think I'll do that. :)
>>
>> I hope you do.  That would be a good starting point.
>>
>
> I'm working on it!
>
> On Wed, Apr 5, 2017 at 7:40 PM, Elias Mårtenson  wrote:
>
>> Do we also have to create an example font that includes these symbols?
>> That seems to be what Michael Everson did for his chess notation proposal
>> that I read recently.
>>
>
> We do have to provide Unicode with fonts, I believe. We can use an
> existing C64 font, such as Pet Me. Or, we can create a new font with
> vectorized versions of the characters.
>
>
>> Then there is the issue of what to do with the text colour and style
>> selectors. PETSCII has characters that indicate a colour change as well as
>> reverse video. At least the reverse video one is important, as it's being
>> used to construct new characters. For example, PETSCII only has a single
>> character "half block" (top part filled). The way you represent a half
>> block with the bottom part filled is to use the reverse video together with
>> the former.
>>
>> It would probably make more sense to represent the reversed symbols as
>> separate code points?
>>
>
> I would actually leave the color-change and reverse-video characters to a
> higher-level protocol.
>
>
>>
>> Regards,
>> Elias
>>
>
>


Re: Coloured Punctuation and Annotation

2017-04-05 Thread Rebecca T
> The hieroglyphs don't have the emoji property

What does the emoji property mean, semantically? That the codepoint represents
a pictograph or that vendors have “permission” to give it a colored, stylized
representation? If we go with the first, then hieroglyphs should certainly be
emoji. Although it seems unlikely (to put it lightly) that hieroglyph emoji
would be deployed due to the burden on vendors, it does seem logically
appropriate that we treat all pictographs equally, and aside from usage I see
no difference between U+1F989 OWL 🦉 and U+13153 EGYPTIAN HIEROGLYPH G017 𓅓.


Re: PETSCII mapping?

2017-04-05 Thread Rebecca T
> If there's a credible need to convert files between Unicode-based systems
> and those using PETSCII

There is! It’s called “sharing textual information” and it’s how our society
functions. Can we afford to blithely abandon data from the best selling
computer in history [1] because nobody cared to standardize its character set?

> A similar scenario might exist if C64 emulators run on Unicode-based systems
> were a widespread phenomenon

They do! Even last month, there was a PETSCII directory-art contest. [2]

A bit off-topic, but:

As time goes on, “not in widespread use” will become a flimsier and flimsier
argument against inclusion — why isn’t there a larger community of PETSCII
enthusiasts? Partially because the only way to share PETSCII is through images!
The consortium (passively or actively) prevents communication through exclusion
and then uses the lack of communication as a justification against inclusion —
it’s a poor, circular argument, and it won’t serve the consortium long-term.

Simply put, we need new criteria for inclusion — as the vast majority of the
world’s systems (from written communication in text messages to the manuscripts
of all new books) are already Unicode-based, we can no longer rely on a
character’s existing presence outside of Unicode as a signal to warrant
inclusion; we must weigh a character’s merits and usability on its own. (Does
it fill a gap in communication? Will it be used?)

[1]: http://www.cnn.com/2011/TECH/gaming.gadgets/05/09/commodore.64.reborn/
[2]: http://csdb.dk/event/?id=2558


Re: Proposal to add standardized variation sequences for chess notation

2017-04-01 Thread Rebecca T
> No chess symbols, encoded or proposed, are emoji, nor should they be.

Except on Samsung.


Re: New tool unidump

2017-03-19 Thread Rebecca T
I maintain a list of various Unicode tools and resources at
unicode.yea.rs and always welcome new additions!

On Sat, Mar 18, 2017 at 1:42 AM, Janusz S. Bien  wrote:

> Quote/Cytat - Manuel Strehl  (Fri 17 Mar 2017
> 09:44:15 PM CET):
>
> Hi,
>>
>> for my work on codepoints.net and Emojipedia I found myself repeatedly
>> in a place, where I needed some tool like hexdump to inspect the content
>> of a string. However, instead of raw bytes I am more interested in the
>> code points that the string is composed of. So I wrote this tool.
>>
>
> Is somebody maintaining a list of such utilities?
>
> There is a page
>
> http://www.unicode.org/resources/online-tools.html
>
> but I remember that the site used to have a page with links to the
> programs mentioned in 2012 "Tool to convert characters to character names",
> in particular to Bill Poser's uniutils
> (http://billposer.org/Software/unidesc.html) and the orphaned unihist by a
> student of mine (https://bitbucket.org/jsbien/unihistext). I'm unable to
> find them now.
>
> Best regards
>
> Janusz
>
> --
> Prof. dr hab. Janusz S. Bień -  Uniwersytet Warszawski (Katedra
> Lingwistyki Formalnej)
> Prof. Janusz S. Bień - University of Warsaw (Formal Linguistics Department)
> jsb...@uw.edu.pl, jsb...@mimuw.edu.pl, http://fleksem.klf.uw.edu.pl/~jsbien/
>
>
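
The "hexdump, but for code points" idea described in the quoted message can
be sketched in a few lines of Python. This is an independent illustration
using the standard `unicodedata` module, not how unidump itself is
implemented, and `dump_code_points` is a name I made up.

```python
import unicodedata


def dump_code_points(text):
    """Return one 'U+XXXX NAME' line per code point in `text`."""
    lines = []
    for ch in text:
        # Unnamed code points (controls, unassigned) get a placeholder.
        name = unicodedata.name(ch, "<unnamed>")
        lines.append(f"U+{ord(ch):04X} {name}")
    return lines


# 'q' followed by a combining diaeresis: two code points, one visible unit.
for line in dump_code_points("q\u0308"):
    print(line)
```

Unlike a byte-level hexdump, the output does not depend on whether the
string was stored as UTF-8 or UTF-16.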


Combining solidus above for transcription of poetic meter

2017-03-17 Thread Rebecca T
When transcribing poetic meter (scansion), it is common to use two symbols
above the line (usually a breve [U+0306 ̆] for unstressed syllables and a
solidus / slash [U+002F /] for stressed syllables) to indicate stress
patterns. Ex:

 ̆/   ̆  /  ̆   /̆  / ̆/
When I consider how my light is spent

(John Milton, On His Blindness)

Other symbols used in place of the breve are a cross / x (U+00D7 × or U+0078 x)
or bullet (U+00B7 · or U+2022 •).

This approach, however, is problematic; the lack of a combining slash above
character means that two lines of text must be used, and any non-monospaced
font (or any platform where multiple consecutive spaces are truncated into one
by default, such as HTML) makes keeping the annotations properly aligned with
the text difficult or impossible — depending on your email client, the above
example may be entirely misaligned. Being able to use combining diacritics for
scansion would make these problems obsolete and enable a semantic
transcription of meter.

Would a proposal to add a combining solidus above (and possibly a combining
reversed solidus above to support Hamer, Wright, and Trager-Smith notations)
be supported?
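
To show what single-line scansion with combining marks could look like, here
is a rough sketch. Since no combining solidus above exists, it substitutes
U+0301 COMBINING ACUTE ACCENT for the stressed mark; that substitution, the
naive "first vowel" rule, and the function names are all my own assumptions.
As in the example line above, the breve goes on the first syllable of each
iamb.

```python
BREVE = "\u0306"  # COMBINING BREVE: marks an unstressed syllable
ACUTE = "\u0301"  # COMBINING ACUTE ACCENT: stand-in for a combining solidus
VOWELS = "aeiouyAEIOUY"


def mark_syllable(syllable, stressed):
    """Attach a combining mark to the first vowel of the syllable."""
    mark = ACUTE if stressed else BREVE
    for i, ch in enumerate(syllable):
        if ch in VOWELS:
            return syllable[: i + 1] + mark + syllable[i + 1 :]
    return syllable + mark  # no vowel found: mark the end of the syllable


def scan(syllables, pattern):
    """`pattern` holds one symbol per syllable: 'u' unstressed, '/' stressed."""
    if len(syllables) != len(pattern):
        raise ValueError("need exactly one pattern symbol per syllable")
    return "".join(mark_syllable(s, p == "/") for s, p in zip(syllables, pattern))


line = ["When ", "I ", "con", "si", "der ", "how ", "my ", "light ", "is ", "spent"]
print(scan(line, "u/u/u/u/u/"))
```

Because the marks travel with the letters, the result survives proportional
fonts and whitespace collapsing, which is the point of the proposal.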


Re: Emoji Packs

2016-12-22 Thread Rebecca T
Yes, by running process.py in a directory containing
http://unicode.org/emoji/charts/full-emoji-list.html (I re-committed and
renamed data.html to full-emoji-list.html for clarity), the script will
generate images in an /img/ sub-directory. (Make sure that /img/ already
exists! Strange things might go wrong if it doesn’t.)

Also note that I’m running Python 3.5 on Windows — I’m fairly confident the
differences between platforms are minor, but a certain degree of zaniness may
occur.
So yes: the directory should look like (at a minimum):

.../repository-directory (DIRECTORY)
├─▷ full-emoji-list.html (FILE)
├─▷ process.py   (FILE)
└─▷ /img (DIRECTORY)
 │
 └─▷ ... (OUTPUT FILES)

I hope that’s clear enough! Tell me if any of that doesn’t make sense.
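
If it helps to see the general technique, below is a simplified sketch of
pulling inline base64 images out of an HTML page. It is not the actual
process.py: the regular expression, the numbered file names, and the
`extract_images` function are assumptions for illustration, and unlike the
real script it creates the output directory when it is missing.

```python
import base64
import re
from pathlib import Path

# Matches <img src="data:image/png;base64,...."> style attributes.
DATA_URI = re.compile(r'src="data:image/(png|gif|jpeg);base64,([A-Za-z0-9+/=]+)"')


def extract_images(html, out_dir):
    """Decode every inline base64 image in `html` into `out_dir`."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)  # avoid the missing-/img/ problem
    written = []
    for index, match in enumerate(DATA_URI.finditer(html)):
        extension, payload = match.groups()
        path = out / f"{index:05d}.{extension}"
        path.write_bytes(base64.b64decode(payload))
        written.append(path)
    return written
```

Usage would be something like
`extract_images(Path("full-emoji-list.html").read_text(encoding="utf-8"), "img")`.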



On Wed, Dec 21, 2016 at 10:20 PM, Chris Monteleone <cjm2...@gmail.com>
wrote:

> Thank you so much Rebecca, this is really above and beyond.
>
> I'm messing around with the script a bit to control how it's all
> organized/named. So here's the dumb question: How do I run the script to
> get it to pull the images from the website?
>
> First of all, when I downloaded everything from github I didn't get the
> data.html, I got '.gitignore'. Is the index page you mentioned found at
> http://unicode.org/emoji/charts/index.html? or is it one of those pages
> that lists all of the objects on a website?
>
> Once I have that, do I just run process.py?
>
> I'm so sorry for being dumb, but thanks again!
>
> Chris
>
> On Wed, Dec 21, 2016 at 12:07 PM, Rebecca T <637...@gmail.com> wrote:
>
>> Okay, I threw something together.
>>
>> github.com/years/emoji has all 18,615 images from the charts, and
>> the generating script is there as process.py
>> <https://github.com/years/emoji/blob/master/process.py> as well!
>>
>> All the images are thrown together in one directory, but if there’s a
>> better way to organize them, please let me know!
>>
>> On Wed, Dec 21, 2016 at 10:28 AM, Chris Monteleone <cjm2...@gmail.com>
>> wrote:
>>
>>> "Unfamiliar" would be an understatement. If you feel like putting that
>>> together it would be appreciated, but no pressure.
>>>
>>> Thank you!
>>>
>>> On Tue, Dec 20, 2016 at 11:01 PM, Rebecca T <637...@gmail.com> wrote:
>>>
>>>> The charts include the images as inline base64, right? Parsing them out
>>>> with Python might not be a bad idea, depending on how well-organized the
>>>> HTML is. If you’re unfamiliar with it, I might be able to throw together a
>>>> quick script later.
>>>>
>>>> On Tue, Dec 20, 2016 at 10:59 PM Chris Monteleone <cjm2...@gmail.com>
>>>> wrote:
>>>>
>>>>> Sir, you are a scholar and a gentleman. Your swift actions of charity
>>>>> are much appreciated.
>>>>>
>>>>> Now if you happen to know where I can find the same thing for Samsung,
>>>>> Facebook, and Windows that would be everything I need.
>>>>>
>>>>> Thanks!
>>>>> Chris
>>>>>
>>>>> PS
>>>>> I have spent a fair amount of time looking for these, I'm not just
>>>>> delegating my tedious work to the internets.
>>>>>
>>>>> On Tue, Dec 20, 2016 at 8:09 PM, Christoph Päper <
>>>>> christoph.pae...@crissov.de> wrote:
>>>>>
>>>>> Chris Monteleone <cjm2...@gmail.com>:
>>>>>
>>>>> > I would like to download emoji from every available vendor into a
>>>>> > well organized folder with the code points as file names.
>>>>>
>>>>> I assume you’re looking for <https://github.com/iamcal/emoji-data/>.
>>>>>
>>>
>>
>


Re: Emoji Packs

2016-12-21 Thread Rebecca T
Okay, I threw something together.

github.com/years/emoji has all 18,615 images from the charts, and the
generating script is there as process.py
<https://github.com/years/emoji/blob/master/process.py> as well!

All the images are thrown together in one directory, but if there’s a
better way to organize them, please let me know!

On Wed, Dec 21, 2016 at 10:28 AM, Chris Monteleone <cjm2...@gmail.com>
wrote:

> "Unfamiliar" would be an understatement. If you feel like putting that
> together it would be appreciated, but no pressure.
>
> Thank you!
>
> On Tue, Dec 20, 2016 at 11:01 PM, Rebecca T <637...@gmail.com> wrote:
>
>> The charts include the images as inline base64, right? Parsing them out
>> with Python might not be a bad idea, depending on how well-organized the
>> HTML is. If you’re unfamiliar with it, I might be able to throw together a
>> quick script later.
>>
>> On Tue, Dec 20, 2016 at 10:59 PM Chris Monteleone <cjm2...@gmail.com>
>> wrote:
>>
>>> Sir, you are a scholar and a gentleman. Your swift actions of charity
>>> are much appreciated.
>>>
>>> Now if you happen to know where I can find the same thing for Samsung,
>>> Facebook, and Windows that would be everything I need.
>>>
>>> Thanks!
>>> Chris
>>>
>>> PS
>>> I have spent a fair amount of time looking for these, I'm not just
>>> delegating my tedious work to the internets.
>>>
>>> On Tue, Dec 20, 2016 at 8:09 PM, Christoph Päper <
>>> christoph.pae...@crissov.de> wrote:
>>>
>>> Chris Monteleone <cjm2...@gmail.com>:
>>>
>>> > I would like to download emoji from every available vendor into a well
>>> > organized folder with the code points as file names.
>>>
>>> I assume you’re looking for <https://github.com/iamcal/emoji-data/>.
>>>
>


Re: a character for an unknown character

2016-12-21 Thread Rebecca T
U+FFFD REPLACEMENT CHARACTER �
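
For reference, this is the character most decoders already emit when they hit
input they cannot interpret. A minimal Python illustration (the
`decode_lossy` name is just for this example):

```python
def decode_lossy(raw):
    """Decode bytes as UTF-8, substituting U+FFFD for undecodable input."""
    return raw.decode("utf-8", errors="replace")


# b"caf\xe9" is 'café' in Latin-1, which is not valid UTF-8.
text = decode_lossy(b"caf\xe9")
print(text == "caf\ufffd")

# The same character can be typed deliberately to mark an unknown letter.
print("l\ufffdcuna")
```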

On Wed, Dec 21, 2016 at 3:05 AM Philippe Verdy  wrote:

> there's a "replacement" control, whose rendering is undefined. It may
> represent any missing part covering more than one character, such as parts
> that have been burned, or overstruck. This Unicode character can act as a
> substitute but its rendering is purposely undefined. An application may
> show some greyed box there, but it should not be the tofu box used for
> characters not mapped in the specified fonts.
> Older encodings used the ASCII control "SUB" for representing this
> function. Some terminals displayed it as a filled box. Other documents have
> used the ASCII DEL control for the same purpose. However for Unicode
> encodings ASCII controls should be avoided.
>
> This is not an Emoji, as Emojis have a clear visual representation and
> semantics (and often specific colors). But you're right, it should be a
> symbol in Unicode (like Emojis, but unlike ASCII controls)
>
> 2016-12-21 3:29 GMT+01:00 Martin Mueller :
>
> I’m new to this list. Please excuse my technical incompetence.
>
> Is there a Unicode character that says “I represent an alphanumerical
> character, but I don’t know which”.  This is a very common problem in the
> transcription of historical texts where you have lacunas. Often, the extent
> of the lacuna is known, and the alphabet is known as well. The EEBO TCP
> transcriptions of English texts before 1700 are good examples. They are SGML
> transcriptions, where missing stuff is represented by elements with
> attributes about this or that. This is efficient when it comes to pages,
> very inefficient when it comes to individual characters.
>
> There is a Web character—a diamond with a question mark inside it—which
> means “I may know what this character represents, but I can’t display it”.
> Which is a very different message. On the other hand, if you extended the
> use of that character, it probably wouldn’t create much ambiguity.
>
> In the TCP project, various code points from the Geometric Shapes block were
> used to represent lacunae. The black circle (\u25cf) has been used as the
> character for a missing character. This is OK and unambiguous in its
> context.   But it would be nice to have a special character for just that
> purpose, and given the number of emoji, this doesn’t seem to be a
> particularly frivolous request.  Which alphabet, you might ask. But that
> doesn’t really matter. There is a very high probability that the missing
> character comes from the character set of the surrounding words. And if that
> isn’t the case, the transcriber wouldn’t know it. S/he sees that there is
> something, perhaps even that there is just one of it, but doesn’t know which
>
> Martin Mueller
>
> Professor emeritus of English and Classics
>
> Northwestern University
>


Re: Emoji end goal

2016-10-12 Thread Rebecca T
Agreed. I think a good response to “that’d _double_ the codepoints, so we
should just add a ligature” is “if it would be such a burden to implement
that you don’t want to use space in the charts for what are, fundamentally,
hundreds of *semantically different* ideographs, why are we dumping that
burden onto vendors?”

On Wed, Oct 12, 2016 at 1:09 PM, Philippe Verdy  wrote:

> I think that emojis at the minimum should all be displayable in isolation,
> without being required to form pseudo-ligatures or to use colors. Skin
> colors can still be displayed with a patchwork-like rectangle after it and
> could still use monochromatic pattern fills. The number of combinations is
> exploding and most of them are in fact not evident at all (or are highly
> culturally oriented).
>
> Emojis should remain simple, showing basic shapes, but I don't see why it
> could not differentiate a man or a woman, independently of the ligatures
> that may be created with them (using a completely invented ad hoc
> "orthography" that actually follows no standard at all and does not match
> cultural differences or the way we perceive the associations), which are
> more and more limiting their semantic interpretation in too restrictive a
> way.
>
> We certainly don't have enough history in using emojis for creating and
> standardizing such pseudo-orthography. Emojis remain a new pseudo-language,
> but it reuses a typography based on visible symbols that have a long
> cultural tradition with other cultural meanings and many unexpected
> semantics that don't work with the current associations created.
>
> So in fact I only support very few associations:
> - associating two "Flag" pseudo-letters (but a rendering should still be
> OK if the emojis just show the actual letters within a left or right part
> of a frame for a flag, without attempting to combine them into an actual
> colored flag, which will need to evolve with time).
> - associating skin color emojis after an emoji for a real human person or
> person face (no need for this in fiction characters or for coloring other
> parts such as hands, fingers, eyes, hair, nose...)
>
> In all cases, colors should always remain an option. Please keep emojis
> simple and always usable in isolation, leaving their interpretation and
> associations only to reading humans according to their local culture and
> social interactions. The way they are used now is in fact abusing the
> initial goal of Unicode encoding, which is to not encode according to
> specific languages or culture, and not break their basic semantics by
> mixing them into something that is not clearly separable and does not carry
> the same amount of semantics.
>
> 2016-10-12 18:31 GMT+02:00 Doug Ewell :
>
>> Leonardo Boiko wrote:
>>
>> 
>>
>> Gosh, even I wouldn't have gone that far.
>>
>> --
>> Doug Ewell | Thornton, CO, US | ewellic.org
>>
>>
>
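
For readers unfamiliar with the "two Flag pseudo-letters" mentioned in the
quoted message: a flag is encoded as a pair of regional indicator symbols,
one per letter of a two-letter region code. A small sketch (the
`flag_sequence` helper is hypothetical) showing the mapping:

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def flag_sequence(region):
    """Map a two-letter region code such as 'FR' to its pair of
    regional indicator symbols."""
    region = region.upper()
    if len(region) != 2 or not region.isascii() or not region.isalpha():
        raise ValueError("expected a two-letter ASCII region code")
    return "".join(
        chr(REGIONAL_INDICATOR_A + ord(letter) - ord("A")) for letter in region
    )


sequence = flag_sequence("FR")
print([f"U+{ord(ch):04X}" for ch in sequence])
```

A renderer without flag glyphs can fall back to showing the two letters
side by side, which is the degraded display the message argues should
remain acceptable.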


Re: Emoji end goal

2016-10-12 Thread Rebecca T
Sure, and kanji have romanisations but that doesn’t make the latin alphabet
language neutral. And yes, emoji were supposed to be language neutral but
all the implementers made them default to male. I think you have an
*argument* with skin-tone neutrality but I think you’d be hard-pressed to
find any POC who think the Fitzpatrick modifiers were a mistake.

Also, the “what if my skin was blue” argument is a red herring — nobody has
blue skin, so it’s a moot point.
However, if you do find yourself drinking silver, I suggest U+1F922 🤢
Nauseated Face.

On Wednesday, October 12, 2016, zelpa  wrote:

> >"all ethnicities deserve equal representation in media" or "all
> combinations of genders and professions should be considered equally"
> I wasn't aware that bald yellow people were a race, sorry. If anything,
> adding the skintone modifiers has made me feel LESS included, what if I
> don't fit in to one of the 5 categories? What if I drank too much colloidal
> silver and have blue skin? Sure would be nice to be able to express an
> emotion without also expressing my gender and race. What a wacky world
> would that be. And as for the professions? As I've said on the mailing list
> in the past, the current proposal makes it IMPOSSIBLE to display certain
> professions as gender-neutral. Is that really a step forward? Can we not
> just have gender-neutral, race-neutral emoji? Is that really too much to
> ask?
>
>
> On Thu, Oct 13, 2016 at 12:47 AM, Leonardo Boiko 
> wrote:
>
>> Yes, the end goal of the Unicode Consortium is media attention by way of
>> virtue signaling. For every online article about emoji modifiers, each
>> individual member of the Consortium earns a fifty-Euro bonus from our
>> masters, the global feminist cultural-Marxist Jewish conspiracy, for our
>> support in propagating political correctness and ultimately implementing
>> ONU's One World Government. In fact, the end goal for emoji (as originally
>> planned by Gramsci and Adorno in UAX #1922) is to be the mandatory
>> Newspeak-style writing system of the NWO, so as to brainwash citizens away
>> from scientific truths like race realism or the sociobiology of gender. As
>> soon as WOMAN+ ZWJ+President Hillary finish assassinating the last
>> remaining ASCII reactionaries, full emoji deployment will be in order, and
>> we'll indoctrinate every child to internalize standard Communist dogma such
>> as "all ethnicities deserve equal representation in media" or "all
>> combinations of genders and professions should be considered equally
>> valid". The lead experiments at Tumblr and Instagram were very successful,
>> proving that emoji have great potential as tools of indoctrination.
>>
>> 2016/10/12 10:02 "zelpa" :
>>
>>> So what exactly is the end goal for emoji? First we had the fitzpatrick
>>> skin modifiers, now there's the proposal for gendered emoji sequences using
>>> ZWJ. There was even the proposal for the hair colour modifier in TR 53. So
>>> what is the true end goal? Will we one day be able to display our Fallout 4
>>> character with a single emoji and 60 modifiers? And honestly, who is asking
>>> for these additions? Does anybody WANT a hair colour modifier? Seems to me
>>> like the consortium might just be pandering to a few silly requests (by
>>> people who have no actual idea what unicode is) to get media attention.
>>>
>>
>


Re: Emoji end goal

2016-10-12 Thread Rebecca T
Well, I think it’s definitely important to have representation and
expression for people of all skin tones and genders even in things like
emoji.

I think we’re rapidly reaching a limit for variation sequences, and I’m
personally not begging for hair color modifiers (although I would welcome
them).

I do worry a bit about the burden of supporting emoji on new systems.
Drawing thousands of glyphs (not that anyone can even count how many emoji
there are) is a significant burden on developers creating new systems, and the
alternative (tofu) isn’t appealing. There is Symbola (which leaves
something to be desired, to say the least) and the graphical solutions,
like Apple’s image-based or Microsoft’s layered-vector approach, have
non-trivial implementations (stuff I wouldn’t want to take care of if I were
creating a new system).

I guess what I’m saying is: does anyone want to extend Unifont into the
astral planes?

On Wednesday, October 12, 2016, zelpa  wrote:

> So what exactly is the end goal for emoji? First we had the fitzpatrick
> skin modifiers, now there's the proposal for gendered emoji sequences using
> ZWJ. There was even the proposal for the hair colour modifier in TR 53. So
> what is the true end goal? Will we one day be able to display our Fallout 4
> character with a single emoji and 60 modifiers? And honestly, who is asking
> for these additions? Does anybody WANT a hair colour modifier? Seems to me
> like the consortium might just be pandering to a few silly requests (by
> people who have no actual idea what unicode is) to get media attention.
>