Re: About cultural/languages communities flags

2015-02-10 Thread Joan Montané
2015-02-10 17:16 GMT+01:00 Doug Ewell :

>
> In order to make a system like this work with an arbitrary number of
> symbols, a terminating symbol would have to be defined. Finding the
> longest match between a string of symbols and a TLD wouldn't work;
> someone might really want to encode "Brazil, United States, Sweden,
> Lesotho" consecutively, and would not want this converted to "Brussels."
>
> And as Ken pointed out, TLDs are TLDs; they are not a general-purpose
> geographic coding system. They don't include every sub-national region
> or separatist group, only the ones that Donuts and similar companies
> chose to register. There's no TLD for Abkhazia, for example, or for
> ISIS.
>
>
Well, my proposal to use GeoTLDs is an answer to the question "where do
you draw the line?"

I agree a terminating symbol would help in expanding the RIS system.


> > IMHO keeping it tied to 2-alpha codes is a poor choice for users. Maybe
> > industry manufacturers could find a better approach.
>
> Let's hope that industry manufacturers adhere to the standard instead of
> going off on their own. I thought that was the idea when all these
> cell-phone symbols were added to Unicode in the first place.
>
>
I fully agree. Manufacturers must follow standards. I support the
standard, but IMHO the RIS design is very strict.

Unicode doesn't define flags.
Unicode doesn't define country flags.
Unicode defines a mechanism to represent ISO country (and dependent
territory) flags.

But manufacturers don't follow the ISO country codes 100%; for instance,
dependent territory codes are usually mapped to the country flag [1]. This
is a choice made by industry manufacturers; it's not in ISO.

Another choice made by industry is using a private code, like XK for
Kosovo. That's good!

The issue with Scotland, Wales, Catalonia and similar flags is a
chicken-and-egg situation. If a manufacturer wants to add such flags, the
standard doesn't allow it! (The PUA can be used, of course.) And Unicode
doesn't expand RIS because manufacturers don't use these flags.

IMHO the RIS mechanism should be expanded to be more flexible, beyond
2-character RIS. Unicode doesn't define flags, it defines a mechanism.
Manufacturers will choose the supported flags, just as they are doing now!

So, the real question here is: where do you draw the line?

Currently it's drawn at ISO 3166-1 plus some customizations made by
industry, but it's always tied to 2-character RIS. IMHO this is too limited
to cover real-world use and requests.

I suggested using the current ISO country codes + cultural/language TLDs.
Maybe there is a better approach.

Best regards,
Joan Montané



[1] https://github.com/googlei18n/region-flags/blob/master/ALIASES
___
Unicode mailing list
Unicode@unicode.org
http://unicode.org/mailman/listinfo/unicode


Re: Emoji (was: Re: Unicode block for programming related symbols and codepoints?)

2015-02-10 Thread Shervin Afshar
This thread gets more absurd with every email! I apologize to people
on the list who have to tolerate this; it might be noisy and annoying, but
it is important.

Doug Ewell asked:

> You mean the one where you said that Gmail has had ROBOT FACE for a long
> time?


Let me use copy-paste for your convenience:

> Robot Face is available on Gmail (GChat), Facebook, and Twitch among others
> (calculating the size of user community is left as an assignment for the
> reader). That's enough usage for consideration by the UTC even if the
> symbol is not present in a character encoding standard.


and then, Doug Ewell wondered:

> You mean to say that any time Gmail or someone adds a private-use
> character or embeddable graphic for TOILET PAPER or TIRE IRON or BEER
> KEG, that Unicode is essentially obliged to add an emoji to maintain
> compatibility with it?
>

Yes, but the industry is already moving away from character-based solutions
and towards sticker-based solutions as we speak. Right now, Facebook is
moving in this direction, as well as Line, Trello, and many others. But
for things which were added beforehand, there is precedent for proposing
them to Unicode.


> Well, perhaps that's how it is now. But that isn't the way Unicode used
> to be.


Well...Since you seem to be so keen on Internet memes, here's one[6] for
you.

[6]:
http://www.quickmeme.com/img/2a/2ab86791fe23ec5c73dc6d46c2cc5bef14e5ca47ba9208571b79c078fb2af561.jpg


↪ Shervin

On Tue, Feb 10, 2015 at 10:27 AM, Doug Ewell  wrote:

> Shervin Afshar  wrote:
>
> >> Of course not. But that's been a stated condition for labeling
> >> something as "compatibility."
> >
> > It *is* compatibility; go back and read my email where I mentioned
> > exactly where it was used.
>
> You mean the one where you said that Gmail has had ROBOT FACE for a long
> time?
>
> You mean to say that any time Gmail or someone adds a private-use
> character or embeddable graphic for TOILET PAPER or TIRE IRON or BEER
> KEG, that Unicode is essentially obliged to add an emoji to maintain
> compatibility with it?
>
> Well, perhaps that's how it is now. But that isn't the way Unicode used
> to be.
>
> Fuddily-duddily,
>
> --
> Doug Ewell | Thornton, CO, USA | http://ewellic.org
>
>


Re: About cultural/languages communities flags

2015-02-10 Thread Christopher Fynn
One area where this would be useful is for indicating national teams
in football (soccer), rugby and other sports where England, Scotland,
Wales and N. Ireland play separately internationally.

On 10 February 2015 at 12:10, Mark Davis ☕️  wrote:
>
> On Tue, Feb 10, 2015 at 12:11 AM, Ken Whistler  wrote:
>>
>> for the full context, and for the current 26x26 letter matrix which is
>> the basis for the flag glyph implementations of regional indicator
>> code pairs on smartphones.
>>
>> SC, SO, ST are already taken, but might I suggest putting in for
>> registering "AB" for Alba? That one is currently unassigned.
>>
>> Yeah, yeah, what is the likelihood of BSI pushing for a Scots two-letter
>> code?! But seriously, if folks are planning ahead for Scots independence
>> or even some kind of greater autonomy, this is an issue that needs to
>> be worked, anyway.
>>
>> In the meantime, let me reiterate that there is *no* formal relationship
>> between TLD's and the regional indicator codes in Unicode (or the
>> implementations built upon them). Well, yes, a bunch of registered TLD's
>> do match the country codes, but there is no two-letter constraint on
>> TLD's. This should already be apparent, as Scotland has registered
>> ".scot". At this point there isn't even a limitation of TLD's to ASCII
>> letters, so there is no way to map them to the limited set of regional
>> indicator codes in the Unicode Standard.
>>
>> Not having a two letter country code for Scotland that matches the
>> four letter TLD for Scotland might indeed be a problem for someone,
>> but I don't see *this* as a problem that the Unicode Standard needs
>> to solve.
>
>
> I want to add to that that there are already a fair number of ISO 2-letter
> codes for regions that are administered as part of another country, like
> Hong Kong. There are also codes for crown possessions like Guernsey. So
> having a code for Scotland (and Wales, and N. Ireland) does not really break
> precedent. But as Ken says, the best mechanism is for the UK to push for a
> code in ISO and the UN.
>
> Mark
>
> — Il meglio è l’inimico del bene —
>



Re: Emoji (was: Re: Unicode block for programming related symbols and codepoints?)

2015-02-10 Thread Shervin Afshar
>
> I was responding to a point that Frédéric Grosshans made [1] about
> these symbols being added for compatibility with Japanese telco usage.
> That argument could be used for the original emoji set, but not for new
> emoji; those are supposed to follow the regular criteria.


The compatibility argument can also be applied to major vendors other than
the Japanese ones who are using emoji; you can find a list of 20-30 of
them here[3]. Add Facebook and Google to that list. If something is
commonly in use, there is precedent for proposing its addition to Unicode.

To have an informed, objective conversation, people should first look at
the actual criteria[4] (as well as the criteria for encoding symbols[5])
and check whether what they are claiming actually accords with those
criteria.

[3]: http://www.emoji-cheat-sheet.com/
[4]: http://www.unicode.org/reports/tr51/#Selection_Factors
[5]: http://unicode.org/pending/symbol-guidelines.html


> If you look at the set of new emoji proposed in L2/15-054 [2], you'll
> see that quite a few of them are justified by their current popularity
> on the Web. ("Selfie are very popular" was kind of striking. I guess at
> least one of my predictions was right.)
> [2] http://www.unicode.org/L2/L2015/15054r-emoji-tranche5.pdf
>

First of all, these are just proposed, not accepted. Secondly, requests
by online communities (either directly to the UTC or through corporate
members) create a precedent for the UTC to consider the symbol for encoding.


> > For a longer while now, some folks tend to use emoji as means to an
> > end other than what is in the scope of conversation regarding emoji.
> > And that is not acceptable.
> Sorry, I don't understand this.


No worries. I don't blame you. It's just the good ol' circular logic.


↪ Shervin

On Tue, Feb 10, 2015 at 10:07 AM, Shervin Afshar 
wrote:

> > Of course not. But that's been a stated condition for labeling something
> > as "compatibility."
>
> It *is* compatibility; go back and read my email where I mentioned exactly
> where it was used.
>
>
> ↪ Shervin
>
> On Tue, Feb 10, 2015 at 9:03 AM, Doug Ewell  wrote:
>
>> Mark Davis ☕️  wrote:
>>
>> >> In what character encoding standard, or extension, does ROBOT FACE
>> >> appear?
>> >
>> > Unicode has never been limited to what is in other character encoding
>> > standard or extensions, "official" or de facto.
>>
>> Of course not. But that's been a stated condition for labeling something
>> as "compatibility."
>>
>> --
>> Doug Ewell | Thornton, CO, USA | http://ewellic.org
>>
>>
>


RE: Emoji (was: Re: Unicode block for programming related symbols and codepoints?)

2015-02-10 Thread Doug Ewell
Shervin Afshar  wrote:

>> Of course not. But that's been a stated condition for labeling
>> something as "compatibility."
>
> It *is* compatibility; go back and read my email where I mentioned
> exactly where it was used.

You mean the one where you said that Gmail has had ROBOT FACE for a long
time?

You mean to say that any time Gmail or someone adds a private-use
character or embeddable graphic for TOILET PAPER or TIRE IRON or BEER
KEG, that Unicode is essentially obliged to add an emoji to maintain
compatibility with it?

Well, perhaps that's how it is now. But that isn't the way Unicode used
to be.

Fuddily-duddily,

--
Doug Ewell | Thornton, CO, USA | http://ewellic.org




Re: Emoji (was: Re: Unicode block for programming related symbols and codepoints?)

2015-02-10 Thread Shervin Afshar
> Of course not. But that's been a stated condition for labeling something
> as "compatibility."

It *is* compatibility; go back and read my email where I mentioned exactly
where it was used.


↪ Shervin

On Tue, Feb 10, 2015 at 9:03 AM, Doug Ewell  wrote:

> Mark Davis ☕️  wrote:
>
> >> In what character encoding standard, or extension, does ROBOT FACE
> >> appear?
> >
> > Unicode has never been limited to what is in other character encoding
> > standard or extensions, "official" or de facto.
>
> Of course not. But that's been a stated condition for labeling something
> as "compatibility."
>
> --
> Doug Ewell | Thornton, CO, USA | http://ewellic.org
>
>


RE: Emoji (was: Re: Unicode block for programming related symbols and codepoints?)

2015-02-10 Thread Doug Ewell
Mark Davis ☕️  wrote:

>> In what character encoding standard, or extension, does ROBOT FACE
>> appear?
>
> Unicode has never been limited to what is in other character encoding
> standard or extensions, "official" or de facto.

Of course not. But that's been a stated condition for labeling something
as "compatibility."

--
Doug Ewell | Thornton, CO, USA | http://ewellic.org




Re: Emoji (was: Re: Unicode block for programming related symbols and codepoints?)

2015-02-10 Thread Doug Ewell
Shervin Afshar  wrote:

>>> The issue is with your very rigid interpretation of the criteria for
>>> encoding new symbols. Is "appearing in an industry character set
>>> extension" an official phrasing that you keep referring to?
>>
>> It was either from the WG2 Principles and Procedures document, or
>> some other bit of Unicode/10646 folklore that I've read over the past
>> 22 years of keeping up with Unicode/10646. I should look up the exact
>> wording.
>
> Yes, please. I would like to have that policy noted for my future use.

I hadn't said, of course, that no new symbols could ever be encoded
unless they appeared in an industry character set or extension.

I was responding to a point that Frédéric Grosshans made [1] about
these symbols being added for compatibility with Japanese telco usage.
That argument could be used for the original emoji set, but not for new
emoji; those are supposed to follow the regular criteria.

[1] http://unicode.org/pipermail/unicode/2015-February/001246.html

Here is a passage from TUS 7.0, Section 2.3 that may shed light:

"Conceptually, compatibility characters are characters that would not
have been encoded in the Unicode Standard except for compatibility and
round-trip convertibility with other standards. Such standards include
international, national, and vendor character encoding standards. For
the most part, these are widely used standards that pre-dated Unicode,
but because continued interoperability with new standards and data
sources is one of the primary design goals of the Unicode Standard,
additional compatibility characters are added as the situation warrants.

"Compatibility characters can be contrasted with ordinary (or
non-compatibility) characters in the standard—ones that are generally
consistent with the Unicode text model and which would have been
accepted for encoding to represent various scripts and sets of symbols,
regardless of whether those characters also existed in other character
encoding standards."

> It's not about encoding what "they" please. Compatibility was the
> issue with the first set of emoji symbols. The rest of symbols are
> being added for various other reasons; e.g. diversity, parity,
> requests, etc.

Right. So the "compatibility with Japanese telcos" argument cannot be
used here.

> Also, random JPEG and meme don't apply here and you're mistaken to
> assume that GChat and Facebook fit in this category.

If you look at the set of new emoji proposed in L2/15-054 [2], you'll
see that quite a few of them are justified by their current popularity
on the Web. ("Selfie are very popular" was kind of striking. I guess at
least one of my predictions was right.)

[2] http://www.unicode.org/L2/L2015/15054r-emoji-tranche5.pdf

>> Great. Go ahead and encode them, UTC. But don't say it's because your
>> hands are tied and you have no choice.
>
> Quoting an official UTC communication?

Quoting an off-list remark.

> For a longer while now, some folks tend to use emoji as means to an
> end other than what is in the scope of conversation regarding emoji.
> And that is not acceptable.

Sorry, I don't understand this.

--
Doug Ewell | Thornton, CO, USA | http://ewellic.org




Re: Emoji (was: Re: Unicode block for programming related symbols and codepoints?)

2015-02-10 Thread Mark Davis ☕️
> In what character encoding standard, or extension, does ROBOT FACE appear?

Unicode has never been limited to what is in other character encoding
standard or extensions, "official" or de facto.


Mark 

*— Il meglio è l’inimico del bene —*

On Mon, Feb 9, 2015 at 9:16 PM, Doug Ewell  wrote:

> Shervin Afshar  wrote:
>
> >> There is no longer any requirement that the robot faces and
> >> burritos appear first in any sort of industry character set
> >> extension, with which Unicode is then obliged to maintain
> >> compatibility.
> >
> > Only if you don't consider existing usage and popular requests as
> > requirement and precedence; for example Gmail had Robot Face for a
> > long time.
>
> I said there was no longer a requirement *that the items appear first in
> an industry character set extension*, right?
>
> In what character encoding standard, or extension, does ROBOT FACE
> appear? "Gmail has it" is not a character encoding standard. Neither is
> "People want to see it."
>
> "Most popularly requested," as a criterion for adding a character, is
> absolutely new to Unicode. Earlier I wrote privately to a Unicode
> officer about whether PERSON TAKING SELFIE and GIRL TWERKING and PERSON
> DUMPING ICE BUCKET OVER HEAD would be ephemeral enough, and got no
> reply. (What, you've forgotten the ice-bucket craze already? That's
> exactly why "most popular at the moment" wasn't supposed to be a
> criterion.)
>
> --
> Doug Ewell | Thornton, CO, USA | http://ewellic.org
>
>


Re: Emoji (was: Re: Unicode block for programming related symbols and codepoints?)

2015-02-10 Thread Mark Davis ☕️
We are being pretty conservative about what we add. There are approximately
1,200 emoji characters now (see tr51), and we're anticipating adding
perhaps 50 per release. And we are encouraging a "sticker" approach for the
longer term.

On the other hand, I wouldn't be surprised if the 41 emoji characters that
we are planning on for Unicode 8.0 end up having a higher frequency of use
than the other 7K characters in the release.


Mark 

*— Il meglio è l’inimico del bene —*

On Mon, Feb 9, 2015 at 9:36 PM, Michael Everson 
wrote:

> I like symbols a lot. But I know that I and a number of people have been
> thinking that too much emphasis is being put on emoji.
>
> Michael Everson * http://www.evertype.com/
>
>


Re: About cultural/languages communities flags

2015-02-10 Thread Doug Ewell
Joan Montané  wrote:

> As far as I see, my informal request for expanding current RIS design
> hasn't a good response. I understand it. Flags are cause of disputes,
> and it isn't an issue for Unicode encode them.

There are technical limitations as well. Because the mechanism is
already defined on pairs of symbols, it's not trivial to expand it to
three or more symbols. Earlier, you had written:

> I agree some strange behaviour can appear if a 3 RIS string, take CAT,
> is shown in a system with only 2 RIS support (a Canadian will appear
> followed by a T).

but in fact, every one of the combinations in the original post will
generate incorrect output (if any):

> [S][C][O][T] --> it shows Scottish flag

Seychelles, "undefined"

> [C][Y][M][R][U] --> it shows a Welsh flag

Cyprus, Mauritania, unpaired symbol

> [B][Z][H] --> it shows a Breton flag

Belize, unpaired symbol

> [C][A][T] --> it shows Catalan flag

Canada, unpaired symbol

> [E][U][S] --> it shows a Basque flag

"Undefined" (or European Union if the implementation happens to include
an extension to ISO 3166 exceptionally reserved code elements), unpaired
symbol

> [G][A][L] --> it shows a Gallician flag

Gabon, unpaired symbol
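
The pairing behaviour described above can be sketched in a few lines of
Python. This is only an illustration, not any vendor's implementation: the
helper names (to_ris, from_ris, pair_up) are made up for this example, and
the only fact taken from the standard is that the regional indicator
symbols for A through Z occupy U+1F1E6 through U+1F1FF.

```python
# Regional indicator symbols for A..Z occupy U+1F1E6..U+1F1FF.
RIS_BASE = 0x1F1E6


def to_ris(letters):
    """Map ASCII letters A-Z to the corresponding regional indicator symbols."""
    return "".join(chr(RIS_BASE + ord(c) - ord("A")) for c in letters.upper())


def from_ris(symbol):
    """Map one regional indicator symbol back to its ASCII letter."""
    return chr(ord(symbol) - RIS_BASE + ord("A"))


def pair_up(ris_string):
    """Group a run of regional indicators into pairs, left to right.

    This mimics (hypothetically) a renderer that only knows 2-symbol
    sequences; a trailing single symbol is reported as unpaired.
    """
    letters = [from_ris(s) for s in ris_string]
    groups = []
    for i in range(0, len(letters), 2):
        chunk = "".join(letters[i:i + 2])
        groups.append(chunk if len(chunk) == 2 else chunk + " (unpaired)")
    return groups


for word in ("SCOT", "CYMRU", "BZH", "CAT", "EUS", "GAL"):
    print(word, "->", pair_up(to_ris(word)))
```

Whether a given pair such as "OT" or "EU" then displays as a flag depends on
the implementation's own table of supported codes, which this sketch does
not model.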

In order to make a system like this work with an arbitrary number of
symbols, a terminating symbol would have to be defined. Finding the
longest match between a string of symbols and a TLD wouldn't work;
someone might really want to encode "Brazil, United States, Sweden,
Lesotho" consecutively, and would not want this converted to "Brussels."
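
To make the ambiguity concrete, here is a small Python sketch, working on
plain letters rather than actual regional indicator symbols. The set of
names and the functions pairwise and longest_match are assumptions made up
for this example; no such algorithm is defined anywhere in the standard.

```python
# Illustrative set of longer names a hypothetical extended scheme might know.
GEO_NAMES = {"BRUSSELS", "SCOT", "CAT"}


def pairwise(letters):
    """Current behaviour: read the string as consecutive 2-letter codes."""
    return [letters[i:i + 2] for i in range(0, len(letters), 2)]


def longest_match(letters, names):
    """Hypothetical scheme: at each position, prefer the longest known name,
    falling back to a 2-letter code when no name matches."""
    groups = []
    i = 0
    while i < len(letters):
        for end in range(len(letters), i, -1):
            if letters[i:end] in names:
                groups.append(letters[i:end])
                i = end
                break
        else:
            groups.append(letters[i:i + 2])
            i += 2
    return groups


text = "BRUSSELS"  # intended as BR, US, SE, LS
print(pairwise(text))                  # four country codes
print(longest_match(text, GEO_NAMES))  # swallowed into one name
```

Without an explicit terminator, nothing in the string itself tells the two
readings apart.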

And as Ken pointed out, TLDs are TLDs; they are not a general-purpose
geographic coding system. They don't include every sub-national region
or separatist group, only the ones that Donuts and similar companies
chose to register. There's no TLD for Abkhazia, for example, or for
ISIS.

> IMHO keeping it tied to 2-alpha codes is a poor choice for users. Maybe
> industry manufacturers could find a better approach.

Let's hope that industry manufacturers adhere to the standard instead of
going off on their own. I thought that was the idea when all these
cell-phone symbols were added to Unicode in the first place.

--
Doug Ewell | Thornton, CO, USA | http://ewellic.org

