Re: metric for block coverage

2018-02-18 Thread Leonardo Boiko via Unicode
The most useful feature for me (Debian user, linguist) would be a search system where I can provide a string and filter fonts to those that include glyphs for all of its characters; ideally I could also combine it with other search criteria, like OTF features (true small caps, etc.). I often write aca
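
The string-based filter requested above amounts to a subset test: a font qualifies if the set of code points in the query string is contained in the set the font covers. A minimal sketch, assuming each font's coverage has already been extracted into a set of code points; the function name, the font names, and the coverage sets below are hypothetical illustration data, not read from real font files:

```python
def fonts_covering(text, coverage_by_font):
    """Return the sorted names of fonts that cover every character in text."""
    needed = {ord(ch) for ch in text}
    return sorted(
        name
        for name, covered in coverage_by_font.items()
        if needed <= covered  # subset test: all needed code points are covered
    )


# Hypothetical coverage data: printable ASCII, plus two IPA letters for one font.
coverage_by_font = {
    "ExampleSerif": set(range(0x20, 0x7F)) | {0x0283, 0x0294},
    "ExampleSans": set(range(0x20, 0x7F)),
}

# Query string: "a", U+0283 LATIN SMALL LETTER ESH, U+0294 LATIN LETTER GLOTTAL STOP.
matches = fonts_covering("a\u0283\u0294", coverage_by_font)
print(matches)
```

Combining this with other criteria (such as OTF features) would just mean adding further predicates to the same filter.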

Text rendering of emojis (was: Re: First bonafide use (≠ mention) of emoji by an academic publisher?)

2017-07-24 Thread Leonardo Boiko via Unicode
per via Unicode : > Leonardo Boiko: > > > > It would just be more > > satisfying for me if the blue books were encoded in the font as U+1F4D8s, > > rather than U+F02Ds. Or, if the colors are done at a CSS level, as 📕 > > U+1F4D5 CLOSED BOOKs or the like. Same goes

Re: First bonafide use (≠ mention) of emoji by an academic publisher?

2017-07-24 Thread Leonardo Boiko via Unicode
help this sort of interoperability, while causing no problems for anyone (it's, after all, just a matter of choosing which numbers you give to which icons; calling it #128213 is as easy as calling it #61485). 2017-07-24 1:45 GMT+02:00 Doug Ewell via Unicode : > Leonardo Boiko wro

Re: Curly Lips Code Point Proposal

2017-01-24 Thread Leonardo Boiko
t means, do we have a code point for it already? >> >> If we do, maybe that'd be already enough. >> >> There are indeed already many emoji misused here and there due different >> visual meaning in different cultures (the triumph face, as example, the one >>

Re: Curly Lips Code Point Proposal

2017-01-24 Thread Leonardo Boiko
I find it curious that this community defines the ":3" emoji as "" or "om nom nom". In my circles it's quite the frequent emoticon/emoji, but I've never seen it used this way. Instead, they usually employ it as "cat mouth" or "cat face", implying the mood of cuteness, perkiness or mischievou

Re: On the upcoming LATIN LETTER SMALL CAPITAL Q

2016-12-26 Thread Leonardo Boiko
2016-12-26 13:45 GMT-02:00 Yifán Wáng <747.neut...@gmail.com>: > You may be under impression that the letter has something to do with > morphology, but my argument is that the original "Letter for > representation of morpheme in Japanese" is a misnomer and this letter > is totally unrelated to mor

Re: On the upcoming LATIN LETTER SMALL CAPITAL Q

2016-12-26 Thread Leonardo Boiko
n with its fellows /ɴ/, /ʀ/, /ʜ/ etc. Some books make all of them capitals, but others all small capitals. Making into small capitals avoids possible confusions with variables like /C/ or /V/. 2016-12-26 5:03 GMT+09:00 Leonardo Boiko : > Agreed with Yifán Wáng... But I wonder about the need for

Re: On the upcoming LATIN LETTER SMALL CAPITAL Q

2016-12-25 Thread Leonardo Boiko
Agreed with Yifán Wáng... But I wonder about the need for the character in the first place. Are we going to add a full small-caps set, too, given its use in morphological glosses? Isn't it enough to use a regular 'Q' in plain-text, and style to small caps in rich text? I can see the rationale for

Re: Manatee emoji?

2016-11-23 Thread Leonardo Boiko
I support the creation of manatee emoji, but only if it’s accompanied by a new modifier for emoji size, coming in the varieties: TINY, SMALL, LARGE, HUGE. This would allow us to say "oh, the [HUGE MANATEE]" in emoji. 2016-11-23 13:15 GMT-02:00 James Kass : > http://patch.com/florida/southtampa/pe

Re: Emoji end goal

2016-10-12 Thread Leonardo Boiko
Yes, the end goal of the Unicode Consortium is media attention by way of virtue signaling. For every online article about emoji modifiers, each individual member of the Consortium earns a fifty-Euro bonus from our masters, the global feminist cultural-Marxist Jewish conspiracy, for our support in p

Re: Noto unified font

2016-10-08 Thread Leonardo Boiko
That's not "his" definition of non-free. Restrictions on selling copies commercially violate the Free Software Foundation's definition of non-free: https://www.gnu.org/philosophy/free-sw.html https://www.gnu.org/licenses/license-list.html#NonFreeSoftwareLicenses And also the Open Source Initiativ

Re: What happened to Unicode CLDR's site?

2016-10-04 Thread Leonardo Boiko
The Google error message felt a bit too harsh for a webhosting client who merely exceeded their allotted bandwidth. It made it sound like the website was hosting something illegal. 2016-10-04 13:00 GMT-03:00 Philippe Verdy : > It looks that an automated bot run by Google detected an excessive us

Re: Why incomplete subscript/superscript alphabet ?

2016-10-03 Thread Leonardo Boiko
2016-10-03 14:51 GMT-03:00 Jukka K. Korpela : > They are not control or formatting characters. They are markup used at > higher protocol levels – in different markup systems > > That's exactly the point, yes.

Re: Why incomplete subscript/superscript alphabet ?

2016-10-03 Thread Leonardo Boiko
Besides, there are already control/formatting characters for such purposes – several ones, even. They look like this: , ^{}, \textsuperscript{}, \*{ \*} … What's more, these powerful control/formatting characters allow one to apply not only super/subscript and blackletter, but many more features

Re: Why incomplete subscript/superscript alphabet ?

2016-09-30 Thread Leonardo Boiko
The Unicode codepoints are not intended as a place to store typographically variant glyphs (much like the Unicode "italic" characters aren't designed as a way of encoding italic faces). The correct thing here is that the markup and the font-rendering systems *should* automatically work together to

Emoji semantic drift

2016-09-02 Thread Leonardo Boiko
This isn't news, but I find it interesting how some emoji are being used in ways that differ from their Unicode names, reflecting alternative interpretations of common glyphs. I'll compare data from the Unicode chart with interpretations taken from Emojipedia, which I think do reflect real-world us

Re: I'm excited about the proposal to add a brontosaurus emoji codepoint

2016-08-29 Thread Leonardo Boiko
We obviously need an emoji for every species name listed within The Official Registry of Zoological Nomenclature. I propose a new set of Basic Latin characters, the Zoological Nomenclature Indicator Symbols, to be used for spelling scientific names, which are then rendered as cutesy colorful icons

Re: Whitespace characters in Unicode

2016-08-04 Thread Leonardo Boiko
entifiers", which are, and suggests various compromises between them. 2016-08-04 17:44 GMT-03:00 Sean Leonard : > I read through TR18...it mainly says that == \s == \p{Whitespace} > == property White_Space is true. Does it say anything else or more > significant than that, that I&

Re: Whitespace characters in Unicode

2016-08-04 Thread Leonardo Boiko
What Mark Davis said; also, depending on what you need, consider taking a look at the definitions used by Unicode regexes, at http://unicode.org/reports/tr18/ . 2016-08-04 16:37 GMT-03:00 Sean Leonard : > Hi Unicode Folks: > > I am trying to come up with a sensible set of characters that are >

Re: Implementation of ideographic description characters

2016-08-04 Thread Leonardo Boiko
Hi, the IDS provide too little information for rendering kanji properly. Take a look into https://en.m.wikipedia.org/wiki/Chinese_character_description_languages . Hello, As I read that it is possible for an implementation of Unicode that can render those ideographic description characters into r

Re: Re: Adding half-star to Unicode?

2016-06-24 Thread Leonardo Boiko
> My bet is that they'll prefer using whatever code they want, hacking fonts as necessary to overtake another political symbol when they'll want. They could liberate a code point from the private use area. 2016-06-24 14:10 GMT-03:00 Philippe Verdy : > My bet is that they'll prefer using whatev

Re: non-breaking snakes

2016-05-04 Thread Leonardo Boiko
2016-05-04 4:14 GMT-03:00 Shriramana Sharma : > Isn't there some Japanese orthography feature that already does > something like this? Japanese (and Chinese) vertical calligraphy can do arbitrary-length stretching of lines (like the Arabic kashida under discussion, and like most cursive scripts in

Re: Joined "ti" coded as "Ɵ" in PDF

2016-03-19 Thread Leonardo Boiko
Yeah, I've stumbled upon this a lot in academic Japanese/Chinese texts. I try to copy some Chinese character, only to find out that it's really a string of random ASCII characters. Is there only one of those crap PDF pseudo-encodings? If so, I'll use a converter next time... 2016-03-17 14:57 GMT

Re: Joined "ti" coded as "Ɵ" in PDF

2016-03-19 Thread Leonardo Boiko
The PDF *displays* correctly. But try copying the string 'ti' from the text into another application outside of your PDF viewer, and you'll see that the thing that *displays* as 'ti' is *coded* as Ɵ, as Don Osborn said. 2016-03-17 14:26 GMT-03:00 Pierpaolo Bernardi : > That document displays correctl

Re: Purpose of and rationale behind Go Markers U+2686 to U+2689

2016-03-10 Thread Leonardo Boiko
Isn't it better to use some sort of COMBINING ENCLOSING CIRCLE? 2016/03/10 8:30 "Andrew West" : > On 10 March 2016 at 07:00, Martin J. Dürst wrote: > > > > because these numbers can go up to the 200s, it doesn't make sense to > > register them all as characters (one would need over 500!). > > I d

Re: Girl, 12, charged for threatening her school with emojis

2016-03-01 Thread Leonardo Boiko
Ah but that is a "majority" by a dictionary/type count. Due to Zipf's Law, in language matters we should always distinguish dictionary counts from actual usage. E.g. Twitter is very popular in Japan, and I think we'll all agree that the top used kanji are predominantly modal: http://emojitracker.

Re: Girl, 12, charged for threatening her school with emojis

2016-02-29 Thread Leonardo Boiko
It's a picture-character, sure; but I'd think that, like kaomoji before them, they've been used since the beginning to express the attitude of the writer, a kind of "emotion" (in linguistic terms, the "mood" of the utterance). For example, consider the ubiquitous ♥ sign, which also predates cellph

Re: Hentaigana proposal

2015-12-16 Thread Leonardo Boiko
I like the more descriptive names, but I'd like to have this data available in some supplementary table available anyway, regardless of the naming scheme. 2015-12-16 16:17 GMT-02:00 Garth Wallace : > On Wed, Dec 9, 2015 at 7:55 AM, Nicolas Tranter > wrote: > > I comment as a western Japanologist

Re: Stationary vs. waving flags (was: Re: Adding RAINBOW FLAG to Unicode)

2015-07-06 Thread Leonardo Boiko
2015-07-06 17:11 GMT-03:00 Doug Ewell : > Is it your belief that users who wish to display an emoji flag care > whether the flag is shown stationary versus flapping in the wind? I think a waving white flag is an emoji symbol for "truce/surrender/come in peace", whereas a white rectangle doesn't ea

Re: "Bunny hill" symbol, used in America for signaling ski pistes for novices

2015-05-28 Thread Leonardo Boiko
Serious question: Has someone discussed a generic combining mechanism? I mean, characters with an effect like "combine the last two". Say, '!' + '?' + COMBINING OVERLAY = '‽'. '!' + '!' + COMBINING SIDE BY SIDE = '‼', and so on. Similar in spirit to the Ideographic Description Characters, but me

Re: "Bunny hill" symbol, used in America for signaling ski pistes for novices

2015-05-28 Thread Leonardo Boiko
You could use U+1F407 RABBIT combined with U+20E4 COMBINING ENCLOSING UPWARD POINTING TRIANGLE, and pretend the triangle is a hill. 🐇 ⃤ If only we had a combining rabbit, we could add rabbits to U+1F3D4 SNOW CAPPED MOUNTAIN. Or anything else. 2015-05-28 16:46 GMT-03:00 Philippe Verdy : > Is t

Re: (R), (c) and ™

2014-12-18 Thread Leonardo Boiko
For the record, the emoji selection issue is also affecting the Google Talk/Hangouts web client, where U+2122 (trademark, ™), U+00AE (registered, ®), U+00A9 (copyright, ©), and U+2194 (left right arrow, ↔) seem to be treated as emoji and displayed in funky blue: http://namakajiri.net/pics/screensh

Re: The rapid ... erosion of definition ability

2014-11-17 Thread Leonardo Boiko
2014-11-17 10:15 GMT-02:00 Mark Davis ☕️ : > I agree (except for the derivation of "emoji"). > Oh, you're totally right: *e-* “drawing” plus *-moji* “character”. My mistake! 😖

Re: The rapid … erosion of definition ability

2014-11-17 Thread Leonardo Boiko
2014-11-17 9:10 GMT-02:00 Andreas Stötzner : > [sign] in its generality it is just perfect. […] At least, we should (in English) speak of Emoticons and not Emoji. […] if precise terming is tricky I find it better to generalize These are your opinions. I find them to be perfectly valid (exactly as

Re: The rapid … erosion of definition ability

2014-11-17 Thread Leonardo Boiko
2014-11-17 9:08 GMT-02:00 Magnus Bodin ☀ : > Just to clarify. The transcribed form "ji" in the Japanese emoji word > 絵文字 is probably not from Mandarin, since 字 is pronounced "zi" in Mandarin. > Is it pronounced "ji" in another Chinese language? > Japanese doesn't usually borrow from Mandarin. R

Re: The rapid … erosion of definition ability

2014-11-17 Thread Leonardo Boiko
"Sign" is too general. The word has no less than 12 meanings, and can refer e.g. to many Unicode characters that are not emojis ("the sharp sign", "the less-than sign").[1] It's useful to have a specialized word referring specifically to the new pictograms used to color electronic messages with

Re: Quasiquotation marks

2014-06-10 Thread Leonardo Boiko
What about using U+0331 "combining macron below" or U+0320 "combining minus below"? Here are some samples: U+0331 "̱test"̱ “̱test”̱ U+0320 "̠test"̠ “̠test”̠ 2014-06-10 9:39 GMT-03:00 Philippe Verdy : > (overstriking with or in HTML) Modern HTML phased out , and has semantic meanings inna
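
The samples above can be reproduced by placing the combining mark directly after each quotation mark. A minimal sketch, assuming Python's standard `unicodedata` module; the helper name `mark_quotes` is made up for illustration, and how the result renders depends entirely on the font:

```python
import unicodedata

MACRON_BELOW = "\u0331"  # the first mark suggested in the message
MINUS_BELOW = "\u0320"   # the second mark suggested in the message


def mark_quotes(text, mark, open_quote="\u201c", close_quote="\u201d"):
    """Wrap text in quotation marks, each followed by the given combining mark."""
    return f"{open_quote}{mark}{text}{close_quote}{mark}"


for mark in (MACRON_BELOW, MINUS_BELOW):
    sample = mark_quotes("test", mark)
    # Print the code point, its formal Unicode name, and the marked-up sample.
    print(f"U+{ord(mark):04X} {unicodedata.name(mark)}: {sample}")
```

Because a combining mark attaches to the character before it, the mark has to follow each quotation mark rather than precede it.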

Re: Swift

2014-06-04 Thread Leonardo Boiko
Even Ruby could do it for years, despite having notoriously bad Unicode string support back then: irb> 日本語 = 'むらさき' => "むらさき" irb> íslenska = 'fjólublár' => "fjólublár" irb> 日本語 + ' ' + íslenska => "むらさき fjólublár" I don't think this feature saw much use, since programme

Re: CJK stroke order data: kRSUnicode v. kRSKangXi

2014-03-09 Thread Leonardo Boiko
I don't know about the points you raise, but I wish it was easier to help proofread Unihan data. Back in 2012 I compared kKangXi to kIRGKangXI and found 252 conflicts, besides the cases where a character only has one or the other. I even put together a simple tool to help fixing this, with links

Re: problem with combining diacritcs in HTML5

2012-10-07 Thread Leonardo Boiko
On 7 October 2012 04:37, Jukka K. Korpela wrote: > Inspecting the Courier New font, version 5.11, I noticed that the advance > width of the glyph for U+0332 (glyph uni0331) is 1129 units. I think this > explains it all. The advance width should be 0. > > And other fonts have the same problem, at l

kKangXi and kIRGKangXi fields in Unihan

2012-05-23 Thread Leonardo Boiko
plain text, and it should be simple to compare several tries for double-checking. I don’t know if there’s interest in such a thing at the moment, but if so, there you go. All values apply to Unihan data downloaded a week ago or so. -- Leonardo Boiko http://namakajiri.net/nikki References: [1] http

Japanese shinjitai/kyuujitai mappings in Unihan

2011-08-17 Thread Leonardo Boiko
Hi, There are lots of variant fields mapping characters in the Unihan database. Do any of the fields (or a combination) reproduce the Japanese kyūjitai↔shinjitai (“old character forms”/“new character forms”) mappings, as present in the standard Jōyō Kanji-hyō table[1]? I looked around but faile

Re: Best smart phones & apps for diverse scripts?

2010-10-29 Thread Leonardo Boiko
you’re writing Chinese or Japanese, but if you’re writing, say, Spanish, or English with a single symbol requiring you to engage Unicode mode, you’re back to telegram age. I don’t know in your countries, but here the price per SMS really bites… -- Leonardo Boiko

Re: Creative people on Twitter

2010-10-12 Thread Leonardo Boiko
I guess it’s only a matter of 𝐭𝐢𝐦𝐞 before people start doing things like 𝖙𝖍𝖎𝖘 (notice this email is plain-text). -- Leonardo Boiko
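
The bold word in the message above is plain text because it is spelled with characters from the Mathematical Alphanumeric Symbols block rather than with styled ASCII letters. A minimal sketch of that substitution, covering only the bold Latin letters; the function name `to_math_bold` is made up for illustration:

```python
import unicodedata

BOLD_CAPITAL_A = 0x1D400  # MATHEMATICAL BOLD CAPITAL A
BOLD_SMALL_A = 0x1D41A    # MATHEMATICAL BOLD SMALL A


def to_math_bold(text):
    """Map ASCII letters to Mathematical Bold letters; leave everything else alone."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr(BOLD_CAPITAL_A + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(BOLD_SMALL_A + ord(ch) - ord("a")))
        else:
            out.append(ch)
    return "".join(out)


bold = to_math_bold("time")
print(bold)
# Compatibility normalization folds the styled letters back to plain ASCII.
print(unicodedata.normalize("NFKC", bold))
```

The round trip through NFKC shows that these characters are compatibility variants of ordinary letters, so search and comparison generally need normalization to match them.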

Re: ,,semi-virgula''

2010-08-31 Thread Leonardo Boiko
ay it’s the word for “comma” in Portuguese. We also call the semicolon a “ponto e vírgula” – period and comma, dot-and-comma. http://www.etymonline.com/index.php?search=virgula -- Leonardo Boiko

Re: TeX: insert Unicode character

2010-08-24 Thread Leonardo Boiko
spec) that has its glyph. -- Leonardo Boiko

Re: Accessing alternate glyphs from plain text

2010-08-10 Thread Leonardo Boiko
of “true meaning”. Plain text is to me simply yet another attempt to represent language, and like all similar tools, has its strengths and weaknesses—in particular, like all language representation tools, it can encode some kinds of “meanings” and not others. -- Leonardo Boiko

Re: Draft Proposal to add Variation Sequences for Latin and Cyrillic letters (was Re: long s (was: Draft Proposal to add Variation Sequences for Latin and Cyrillic letters))

2010-08-04 Thread Leonardo Boiko
email William, my mistake.) -- Leonardo Boiko

Re: Most complete (free) Chinese font?

2010-08-02 Thread Leonardo Boiko
:57, Leonardo Boiko wrote: > >> Emphasis on “the only font _I know_”.  I didn’t know Andron nor Everson >> Mono.  Besides, while quality, both seem to be non-free, which is something >> I’m not interested in as a Debian user (nothing against it, it just isn’t my >> thing

Re: Most complete (free) Chinese font?

2010-08-02 Thread Leonardo Boiko
Aug 2010, at 08:52, Andreas Stötzner wrote: > >> Am 01.08.2010 um 13:03 schrieb Leonardo Boiko: >> >>> And it’s the only font I know with U+2E19 PALM BRANCH ⸙ >> >> It is not. Andron has it. > > As does Everson Mono. > > Michael Everson * thttp

Re: Most complete (free) Chinese font?

2010-08-01 Thread Leonardo Boiko
Also I think the developers themselves declare it to be "ugly, but > complete", if I remember correctly. > > /jan > > Leonardo Boiko wrote: >> >> Unifont is not ugly for its intended purpose: a bitmapped, fixed-width >> 16-pixel font.  It’s great for terminals

Re: Most complete (free) Chinese font?

2010-07-30 Thread Leonardo Boiko
font (the hànzì in Unifont are actually based on it, IIRC). The website is http://wenq.org/enindex.cgi , but it’s pre-packaged for all major distros. -- Leonardo Boiko http://namakajiri.net