Re: Flag tags

2012-05-31 Thread Philippe Verdy
Note that I gave a URL for the Flags Of The World site which is
hosted by a commercial vendor of manufactured flags.
But as the site is built from a collection of static HTML pages,
without any script, it is easily mirrored in various places.
For now, Wikipedia prefers referencing a vendor-neutral website at this address:

http://flagspot.net/flags/

The pages are identical; only the base URL changes. All relative URLs
are identical starting at the "flags/" folder.

It is fed by discussions and contributions on its old mailing list (the
main place of discussion for the FOTW project), whose volume
is huge (about half a million messages sent since 1993, about
2000 or 3000 mails per month), notably because it also carries photos
and graphic designs. But the actual discussions are even larger
within the local associations that are members of FIAV.

The FIAV itself (of which the FOTW website is just a small visible part,
containing a summary of the huge collection of flags discussed and
maintained by the various member associations and their contributors)
has offices in Belgium (presidency), Texas (general secretariat), and
the UK (conferences). I think it is unrealistic to redo from scratch the
huge work already performed by the FIAV and partly exposed on the FOTW
website.

If you ever want to know how best the codification should be made (how
many distinct characters you need to support the reencoding into
abstract symbols that can later be recombined into ligatures
showing the actual flags), I suggest that the UTC contact the general
secretary.

All contact details are on this page: http://flagspot.net/flags/vex-fiav.html

(once again, ignore the base URL "http://flagspot.net" before "/flags",
which varies depending on the website mirror; you'll find the mirrors
easily with web search engines).

For now, you won't need anything more than a set of symbols to
represent each letter of the code. The registry can be developed
later, only for standardizing the recognized ligatures. In the Unicode
standard, there's no need to encode any subcollection of flags, even
if we can explain how to combine these symbols into ligatures.

The representative glyphs shown in TUS and ISO/IEC 10646 will just
display the default symbols containing the associated ASCII letter
used in the registry (and most probably this should be restricted to
ASCII characters usable in common filesystems for naming graphic files
in whatever format the rendering applications will recognize, or for
naming the glyphs of ligatures when developing fonts showing more than
just the separate representative glyphs displaying the codes). For
compatibility with filenames in various filesystems, I just
suggest using a single letter case, and avoiding characters like "/" or
"\" which could be incompatible with some OSes or with the syntax of
hierarchical URLs.

Characters currently allowed for language codes should all be usable:
ASCII letters, digits, hyphen separators. The slash could be added
later for precise versioning purposes. The slash in a standard code
would be mapped to a symbol whose representative glyph shows not a
letter or digit but a space.
I'm not sure that the colon should be used, as it may cause
compatibility problems when deciphering a series of symbols into
their associated ASCII characters, as part of codes that would be
remapped to filenames. And the dot should not be used if it breaks file
extensions in local filesystems or in URLs for receiving a known flag
from a collection of prebuilt glyphs stored as graphic files (SVG,
PNG...).

If we encode each character of the flag code into symbols, we'll then
need fewer than 50 characters in each subcollection: the start
symbols, the medial symbols, and the final symbols. All would fit within
192 code points allocated in the SMP (or in Plane 14, but that plane is
not intended for visible symbols).
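
A rough way to check that budget, as a sketch only: it assumes the
alphabet is uppercase ASCII letters, digits and the hyphen (the set
suggested above), and it borrows the "FLAG INITIAL SYMBOL J" naming
style used elsewhere in this thread. The name given to the hyphen symbol
and the helper `symbol_names` are hypothetical.

```python
import string

# Assumed alphabet: one letter case, digits, and the hyphen separator.
ALPHABET = string.ascii_uppercase + string.digits + "-"
POSITIONS = ("INITIAL", "MEDIAL", "FINAL")

# Three positional copies of the alphabet must fit in the proposed block.
TOTAL_SYMBOLS = len(ALPHABET) * len(POSITIONS)


def symbol_names(code: str) -> list[str]:
    """Return hypothetical symbol names for each character of a flag code."""
    code = code.upper()
    if len(code) < 2:
        raise ValueError("a flag code needs an initial and a final symbol")
    if any(ch not in ALPHABET for ch in code):
        raise ValueError(f"unsupported character in {code!r}")
    names = []
    for i, ch in enumerate(code):
        if i == 0:
            pos = "INITIAL"
        elif i == len(code) - 1:
            pos = "FINAL"
        else:
            pos = "MEDIAL"
        names.append(f"FLAG {pos} SYMBOL {ch}")
    return names
```

With this alphabet each subcollection has 37 members, so the three
together stay well inside the 192 code points mentioned above.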

As long as a policy is documented that allows us to start representing
immediately at least the generic country flags with their ISO 3166
codes, in a viable namespace, using just 3 Unicode symbols, it will
remain safe for immediate use. Versioned flags may be encoded later
once the registry is working.

2012/6/1 Asmus Freytag :
> On 5/31/2012 5:06 PM, Michael Everson wrote:
>>
>> On 1 Jun 2012, at 00:59, Doug Ewell wrote:
>>
>>> So I could propose, say, the Pigpen cipher?
>>
>> I would rather you help convince people about the Unifon proposal.
>>
> hehe.
>
> A./
>
> PS:what's Unifon and what's it got to do with it?



Re: Flag tags

2012-05-31 Thread Asmus Freytag

On 5/31/2012 5:06 PM, Michael Everson wrote:

On 1 Jun 2012, at 00:59, Doug Ewell wrote:


So I could propose, say, the Pigpen cipher?

I would rather you help convince people about the Unifon proposal.


hehe.

A./

PS:what's Unifon and what's it got to do with it?

Re: Flag tags

2012-05-31 Thread Philippe Verdy
e.g. the empty namespace could be reserved for country codes.
Namespace separation could use the hyphen (like in language codes).

So the generic US flag would be coded as simply as -US (with the leading hyphen)

If rendering the default glyphs, you'll see that hyphen. The
alternative is to use a space separator, so that the standard code
would be rendered showing only the country code with the default
glyphs.

Other namespace extensions will use a non-empty prefix per category.
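
A minimal sketch of that convention, assuming the first hyphen separates
the namespace from the identifier and that an empty namespace means "ISO
3166 country code". The function name `parse_flag_code` and the "X"
prefix in the usage note are hypothetical.

```python
def parse_flag_code(code: str) -> tuple[str, str]:
    """Split a flag code into (namespace, identifier) at the first hyphen.

    An empty namespace, as in "-US", stands for the country-code namespace.
    """
    namespace, sep, identifier = code.partition("-")
    if not sep or not identifier:
        raise ValueError(f"missing namespace separator or identifier: {code!r}")
    return namespace, identifier
```

Here `parse_flag_code("-US")` gives an empty namespace with identifier
"US", while a made-up code such as "X-ABC" would land in namespace "X".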

2012/6/1 Philippe Verdy :
> That's why I just propose an external registry rather than a direct
> encoding of individual flags.
>
> A naming convention (using namespace prefixes) could be used to make
> sure that the common codes from ISO 3166-1 will be usable.
>
> I'm not sure that the CLDR TC is currently competent to develop such a
> registry, but it may work along with the FIAV to develop the naming
> convention for use in the registry (which could be hosted by the FIAV or
> by Unicode; to be decided later).
>
> The CLDR TC would be involved in the development of the registry
> rules, for its stability.
>
> 2012/6/1 Doug Ewell :
>> This would be a great resource for developing a flags code, as Philippe
>> suggested earlier, an idea I actually think has quite a bit of merit.
>> However, I'm not sure it has much relevance to character encoding. It's not
>> that hard to imagine encoding 220 or so current national flags or
>> placeholders, but you wouldn't want to expand this to, say, tens of
>> thousands.



Re: Flag tags

2012-05-31 Thread Philippe Verdy
That's why I just propose an external registry rather than a direct
encoding of individual flags.

A naming convention (using namespace prefixes) could be used to make
sure that the common codes from ISO 3166-1 will be usable.

I'm not sure that the CLDR TC is currently competent to develop such a
registry, but it may work along with the FIAV to develop the naming
convention for use in the registry (which could be hosted by the FIAV or
by Unicode; to be decided later).

The CLDR TC would be involved in the development of the registry
rules, for its stability.

2012/6/1 Doug Ewell :
> This would be a great resource for developing a flags code, as Philippe
> suggested earlier, an idea I actually think has quite a bit of merit.
> However, I'm not sure it has much relevance to character encoding. It's not
> that hard to imagine encoding 220 or so current national flags or
> placeholders, but you wouldn't want to expand this to, say, tens of
> thousands.



Re: Flag tags

2012-05-31 Thread Doug Ewell

Michael Everson wrote:


So I could propose, say, the Pigpen cipher?


I would rather you help convince people about the Unifon proposal.


I actually wasn't planning to propose Pigpen. I was just surprised the 
idea would even be considered.


--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell




Re: Flag tags

2012-05-31 Thread Doug Ewell
This would be a great resource for developing a flags code, as Philippe 
suggested earlier, an idea I actually think has quite a bit of merit. 
However, I'm not sure it has much relevance to character encoding. It's 
not that hard to imagine encoding 220 or so current national flags or 
placeholders, but you wouldn't want to expand this to, say, tens of 
thousands.


--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell


-Original Message- 
From: Philippe Verdy

Sent: Thursday, May 31, 2012 18:06
To: Doug Ewell
Cc: Asmus Freytag ; Shawn Steele ; Michael Everson ; unicode Unicode 
Discussion

Subject: Re: Flag tags

Let's not forget the largest collection of flags on the web:
Flags of the World, maintained for many years (initially via
Usenet, before the Internet we know today). All other references can be
found there, including the International Federation of Vexillological
Associations (FIAV), which should be involved in the project of building
and maintaining a registry of flag codes.

The FOTW web site has always had several domains, some disappearing,
but mirrored together. This one is the most stable:

http://www.crwflags.com/fotw/flags/index.html 





Re: Flag tags

2012-05-31 Thread Philippe Verdy
2012/6/1 Asmus Freytag :
>> They are not stable across history, so they
>> should be versioned, but most frequent uses will omit the precise
>> versioning, so that flags will be instantly replaced at any time (e.g.
>> if you encode a flag for US, how many stars will there be on it ?
>
> Obviously these are all glyph variants?

If you speak about the flag of Libya, differences are significant when
there are opposing parties. During the last Libyan revolution, those
flags were used very distinctly. They were not free variants of each
other.

Yes, you may have a generic flag code that maps to the latest version of
the flag, but versioned flags should be encoded separately.

Similar to the encoding of languages : you may have "en" or "en-US"
vs. "en-GB" and several subtags for variants...
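
To illustrate the analogy with language tags, here is a sketch of a
lookup that falls back from a versioned code to the generic one. The
registry contents, the "1977" version label, the file names and the
truncation fallback itself are all assumptions made for illustration;
nothing in this thread defines them.

```python
# Hypothetical registry: keys are flag codes, values are glyph file names.
# The "1977" version label is invented for illustration only.
REGISTRY = {
    "LY": "ly.svg",            # generic code: latest version of the flag
    "LY-1977": "ly-1977.svg",  # a versioned entry, registered separately
}


def resolve(code: str) -> str:
    """Return the glyph for a flag code, dropping trailing subtags until a match."""
    parts = code.upper().split("-")
    while parts:
        key = "-".join(parts)
        if key in REGISTRY:
            return REGISTRY[key]
        parts.pop()  # fall back to a less specific code
    raise KeyError(code)
```

A text that needs a stable design would use the versioned code; a text
that only says "LY" gets whatever the registry currently maps it to.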



Re: Flag tags

2012-05-31 Thread Asmus Freytag

On 5/31/2012 3:29 PM, Philippe Verdy wrote:

2012/5/31 Asmus Freytag:

On 5/31/2012 12:07 PM, Karl Pentzlin wrote:

Am Donnerstag, 31. Mai 2012 um 20:09 schrieb John H. Jenkins:

JHJ>
JHJ>... that because some
JHJ>countries have currency symbols with dedicated code points, other
JHJ>countries will make *new* currency symbols and demand that *they*
JHJ>get dedicated code points ...

Seriously speaking, flag symbols and currency signs are completely
different topics.

Every country has exactly one flag, right now.

This is wrong if you consider their dependencies. Some dependencies
legally have their own flag used *instead* of the flag for the
main/metropolitan part of the country. So countries can have several
flags.
And some have well-established flags for their constituent parts,
because they arose from a federation of entities.


Then consider that countries may also have several flags for different
usages (national flag, civil flag, naval flag...)


Good point.


Also the same flag may be shared by different political entities (e.g.
the European Union reuses the flag of the Council of Europe, with
permission, and made it one of its official emblems). Some flags are
also shared without permission, because the original design was not
protected internationally or had fallen into the public domain (including
in the country of origin).

Examples?



Flags have strong political issues that are out of scope for encoding
directly in the UCS. They are not stable across history, so they
should be versioned, but most frequent uses will omit the precise
versioning, so that flags will be instantly replaced at any time (e.g.
if you encode a flag for US, how many stars will there be on it ?

Obviously these are all glyph variants?

A./


Libya changed its flag recently, returning to an older flag ; in many
cases it will not really matter, but if you have to deal with encoded
texts that are also versioned themselves, it will not be acceptable to
have flag designs freely interchanged as it would cause confusion :
consider the case of countries that appeared in the history as part of
a split or merge, in an article speaking about their history, and
identifying the armies and generals with their respective flag...).






Re: Flag tags

2012-05-31 Thread Philippe Verdy
Let's not forget the largest collection of flags on the web:
Flags of the World, maintained for many years (initially via
Usenet, before the Internet we know today). All other references can be
found there, including the International Federation of Vexillological
Associations (FIAV), which should be involved in the project of building
and maintaining a registry of flag codes.

The FOTW web site has always had several domains, some disappearing,
but mirrored together. This one is the most stable:

http://www.crwflags.com/fotw/flags/index.html



Re: Flag tags

2012-05-31 Thread Michael Everson
On 1 Jun 2012, at 00:59, Doug Ewell wrote:

> So I could propose, say, the Pigpen cipher?

I would rather you help convince people about the Unifon proposal.

Michael Everson * http://www.evertype.com/




Re: Flag emoji

2012-05-31 Thread Markus Scherer
On Thu, May 31, 2012 at 4:18 PM, Mark Davis ☕  wrote:

> If we used ZWSP, then we'd have:
>
>  ← 🇦🇦 // but the code wouldn't know when to also absorb
> adjacent ZWSPs.
>
>  → 🇦🇦 // but the code would need context to know when to add
> adjacent ZWSPs.
>

I think we could do this reasonably well by providing two mappings for the
same sjis bytes:

sjis <-> A+A+ZWSP
sjis <- A+A

A longest-match conversion would get the desired results.
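
A sketch of what such a longest-match converter could look like. The
legacy byte value is a placeholder, not a real Shift-JIS assignment, and
the table and function names are invented; only the two-mapping idea
comes from the text above.

```python
ZWSP = "\u200B"
RI_J, RI_P = "\U0001F1EF", "\U0001F1F5"  # regional indicator symbols J and P

# Placeholder legacy bytes; NOT a real Shift-JIS assignment.
LEGACY_JP = b"\xF9\x01"

# Legacy -> Unicode always emits the pair followed by ZWSP (round-trip form).
TO_UNICODE = {LEGACY_JP: RI_J + RI_P + ZWSP}

# Unicode -> legacy accepts both forms.
FROM_UNICODE = {
    RI_J + RI_P + ZWSP: LEGACY_JP,  # round-trip mapping
    RI_J + RI_P: LEGACY_JP,         # fallback for input without the ZWSP
}


def encode(text: str) -> bytes:
    """Convert to legacy bytes, trying the longest table key first."""
    keys = sorted(FROM_UNICODE, key=len, reverse=True)
    out = bytearray()
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out += FROM_UNICODE[key]
                i += len(key)
                break
        else:
            # Not in the table: pass ASCII through, replace anything else.
            out += text[i].encode("ascii", "replace")
            i += 1
    return bytes(out)
```

Because the three-character key is tried first, a trailing ZWSP is
absorbed when present and simply not required when absent.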

I believe there were more objections to the ZWSP approach though. I think
one was about losing the ZWSP in editing and copy-paste. (I didn't write
down details.)

markus


Re: Flag tags

2012-05-31 Thread Doug Ewell

So I could propose, say, the Pigpen cipher?

--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell

-Original Message- 
From: Asmus Freytag

Sent: Thursday, May 31, 2012 16:03
To: Doug Ewell
Cc: Shawn Steele ; verd...@wanadoo.fr ; Michael Everson ; unicode 
Unicode Discussion

Subject: Re: Flag tags

On 5/31/2012 12:03 PM, Doug Ewell wrote:



Another alphabet, even that with 1:1 correspondence to Latin, but,
again, not recognizable as such are the "dancing men". They at least
can be demonstrated to have appeared in print.
Are substitution ciphers candidates for encoding?



To the degree that the use of the substitution is "style", no. Fraktur
and Insular forms have been unified for Latin. But these styles are also
recognizable (if not to all users, then a significant number). And,
there's a benefit in identifying them primarily with the Latin alphabet,
and only secondarily with the precise style.

The "dancing men" are more like Braille. There's one source where they
have been given a particular "mapping" to the Latin alphabet, but that
mapping is not the only one possible. The whole point of them is that
the actual mapping has to be known or discovered each time.

So, yes, these would have to be encoded by shape, not by target.
A./ 





Flag emoji

2012-05-31 Thread Mark Davis ☕
The UTC considered that as one of the possible approaches to the problem. While
easier in terms of line breaking, there'd still be a requirement to change
grapheme cluster boundaries and word boundaries to join sequences
like 🇦🇦, and people felt the approach didn't work well with encoding
conversion. About conversion, I think the discussion was something like the
following:

It is relatively simple to have a mapping like:

   ↔   🇦[joiner]🇦

If we used ZWSP, then we'd have:

 ← 🇦🇦 // but the code wouldn't know when to also absorb
adjacent ZWSPs.

 → 🇦🇦 // but the code would need context to know when to add
adjacent ZWSPs.

Both of those would be complicated for encoding converters to handle.
People also felt that 🇦[joiner]🇦 would be more consistent with treating
the sequence as a unit, both conceptually and in fonts.

I personally favored the ZWSP, but was convinced during the discussion that
ZWJ was a better approach.

--
Mark 
*— Il meglio è l’inimico del bene —*



On Thu, May 31, 2012 at 2:47 AM, Andrew West  wrote:

> On 31 May 2012 00:24, Mark Davis ☕  wrote:
> >
> > There is definitely a problem.
>
> Is it really such a problem?  Why can't implementations simply use
> ZWSP to demarcate the 2-character units in a sequence of more than two
> regional indicator symbols (and maybe always emit 2-character codes
> wrapped between ZWSP on either side to be safe), so for example
> US<ZWSP>ES<ZWSP>GE would be parsed as the regional indicator symbols
> for USA, SPAIN and Georgia, whereas U<ZWSP>SE<ZWSP>SG<ZWSP>E would be
> parsed as the regional indicator symbols for U (invalid), Sweden,
> Singapore and E (invalid).  Algorithms such as line-breaking would not
> break between two regional indicator symbols, but only at a ZWSP.
>
> And if implementations wanted to support two- and three-letter
> regional codes, they might parse
> GB<ZWSP>CYM<ZWSP>ENG<ZWSP>NIR<ZWSP>SCO as the codes for
> United Kingdom, Wales, England, Northern Ireland, and Scotland, and
> represent them visually with the appropriate flag icons.
>
> Andrew
>
>
>
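
The ZWSP-delimited parsing described in the quoted message can be
sketched as follows. The regional indicator symbols start at U+1F1E6 for
A; the set of known codes is just a small sample taken from the examples
in the message, and the helper names are invented. The sketch assumes
the input contains only regional indicator symbols and ZWSP.

```python
ZWSP = "\u200B"
RI_BASE = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A

# Small sample of accepted codes, taken from the examples in the message.
KNOWN_CODES = {"US", "ES", "GE", "SE", "SG", "GB", "CYM", "ENG", "NIR", "SCO"}


def to_ri(code: str) -> str:
    """Map ASCII capital letters to regional indicator symbols."""
    return "".join(chr(RI_BASE + ord(c) - ord("A")) for c in code)


def from_ri(symbols: str) -> str:
    """Map regional indicator symbols back to ASCII capital letters."""
    return "".join(chr(ord(c) - RI_BASE + ord("A")) for c in symbols)


def parse_units(text: str) -> list[tuple[str, bool]]:
    """Split on ZWSP and report each unit with whether it is a known code."""
    units = []
    for unit in text.split(ZWSP):
        if unit:  # ignore empty pieces from leading or trailing ZWSP
            code = from_ri(unit)
            units.append((code, code in KNOWN_CODES))
    return units
```

The same letters give different results depending only on where the
separators fall, which is the point of the two examples in the message.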


Unicode sessions at Localization World Paris

2012-05-31 Thread Lisa Moore
On Monday, 4 June, noted experts Richard Ishida (W3C) and Addison Phillips
(Lab126) will team up to present a full day of sessions on Unicode.

In the morning, Richard Ishida will present “An Introduction to Writing
Systems and Unicode”, a tutorial that will introduce the basic functioning
of Unicode in dealing with non-Latin writing systems. It is an excellent
orientation for people new to these concepts, but it also offers content
for people at intermediate and advanced levels due to the breadth of
scripts discussed.

In the afternoon, Addison will present "Internationalization: An
Introduction", a two-part tutorial covering:

• What is internationalization?
• What is Unicode? Implementing and using the standard.
• How do you prepare software localization and translation?

Finally, Richard and Addison will present "Towards the Promised Land:
Globalization Developments in Web Standards", which surveys current
developments at the W3C.

You may register for any or all of these sessions via
http://localizationworld.com/lwparis2012/registration.php where you will
see the sessions in the preconference day.

This is an opportunity to get a taste of the Unicode conference to be held
in California on the following October 22-24, and see how the people on
your staff can benefit from a deeper knowledge of Unicode and
internationalization.

Lisa Moore
--

Re: Flag tags

2012-05-31 Thread Philippe Verdy
2012/5/31 Doug Ewell :
> Philippe Verdy wrote:
>
>> So to represent the flag of Japan, you could encode:
>>
>> FLAG INITIAL SYMBOL J
>> FLAG FINAL SYMBOL P
>> [...]
>
> For me, the existing Plane 14 mechanism would have worked just as well,
> without requiring three more duplicate sets of printable Basic Latin.

You can perfectly well map this small set of symbols in Plane 14.

And no, they are NOT confusable and not a duplicate set of Basic Latin:
their representative glyphs will be clearly different. They will be
REAL symbols, even if they embed a letter in their default
representative glyph (this letter will disappear when the ligatures
are generated by renderers supporting a mapping from flag codes to
actual glyphs, either with fonts built specifically for some
recognized ligatures, or with the help of an external protocol to get
a flag from an external flags registry, which we don't need to specify
in Unicode).



Re: Flag tags

2012-05-31 Thread Philippe Verdy
2012/5/31 Asmus Freytag :
> On 5/31/2012 12:07 PM, Karl Pentzlin wrote:
>>
>> Am Donnerstag, 31. Mai 2012 um 20:09 schrieb John H. Jenkins:
>>
>> JHJ>  
>> JHJ>  ... that because some
>> JHJ>  countries have currency symbols with dedicated code points, other
>> JHJ>  countries will make *new* currency symbols and demand that *they*
>> JHJ>  get dedicated code points ...
>>
>> Seriously speaking, flag symbols and currency signs are completely
>> different topics.
>>
>> Every country has exactly one flag, right now.

This is wrong if you consider their dependencies. Some dependencies
legally have their own flag used *instead* of the flag for the
main/metropolitan part of the country. So countries can have several
flags.

Then consider that countries may also have several flags for different
usages (national flag, civil flag, naval flag...)

Also the same flag may be shared by different political entities (e.g.
the European Union reuses the flag of the Council of Europe, with
permission, and made it one of its official emblems). Some flags are
also shared without permission, because the original design was not
protected internationally or had fallen into the public domain (including
in the country of origin).

Flags have strong political issues that are out of scope for encoding
directly in the UCS. They are not stable across history, so they
should be versioned, but most frequent uses will omit the precise
versioning, so that flags will be instantly replaced at any time (e.g.
if you encode a flag for US, how many stars will there be on it ?
Libya changed its flag recently, returning to an older flag ; in many
cases it will not really matter, but if you have to deal with encoded
texts that are also versioned themselves, it will not be acceptable to
have flag designs freely interchanged as it would cause confusion :
consider the case of countries that appeared in the history as part of
a split or merge, in an article speaking about their history, and
identifying the armies and generals with their respective flag...).




Re: Flag tags

2012-05-31 Thread Asmus Freytag

On 5/31/2012 3:13 PM, Philippe Verdy wrote:

2012/5/31 Asmus Freytag:

On 5/30/2012 10:15 PM, Doug Ewell wrote:

A seemingly straightforward solution to the “unambiguous mapping” problem
would be to use the existing Plane 14 tag letters along with a new FLAG TAG,
say at U+E0002. Then  would unequivocally denote the
current Swiss flag. No need for separate lead and trail. Simple.

... What’s that? Oh, sorry, never mind. Deprecated.
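
For concreteness, a sketch of how such a sequence would be built. The
Plane 14 tag characters U+E0020..U+E007E mirror ASCII U+0020..U+007E;
the FLAG TAG at U+E0002 is only the value floated in the message ("say
at U+E0002"), not an assigned character, and the function name is
invented.

```python
# Hypothetical: the message only suggests "say at U+E0002" for a FLAG TAG.
FLAG_TAG = "\U000E0002"

# Tag characters sit at a fixed offset from their ASCII counterparts.
TAG_OFFSET = 0xE0000


def flag_tag_sequence(code: str) -> str:
    """Return FLAG TAG followed by the tag letters spelling the code."""
    if not (code.isascii() and code.isalpha()):
        raise ValueError(f"expected ASCII letters only: {code!r}")
    return FLAG_TAG + "".join(chr(TAG_OFFSET + ord(c)) for c in code.upper())
```

For the Swiss example, `flag_tag_sequence("CH")` yields three code
points: the tag introducer, TAG LATIN CAPITAL LETTER C and TAG LATIN
CAPITAL LETTER H.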


Doug,

you put your finger on it. Any form of combining scheme is doomed to fail.

This includes the current approach of "Regional indicators".

You're wrong. The regional indicators failed because they were encoded
at the character level, so that their scope of effect was supposed to
extend to arbitrary lengths of text.


Why you call my position "wrong" when you agree with it, I don't know.

A./





Re: Flag tags

2012-05-31 Thread Philippe Verdy
2012/5/31 Asmus Freytag :
> On 5/30/2012 10:15 PM, Doug Ewell wrote:
>
> A seemingly straightforward solution to the “unambiguous mapping” problem
> would be to use the existing Plane 14 tag letters along with a new FLAG TAG,
> say at U+E0002. Then  would unequivocally denote the
> current Swiss flag. No need for separate lead and trail. Simple.
>
> ... What’s that? Oh, sorry, never mind. Deprecated.
>
>
> Doug,
>
> you put your finger on it. Any form of combining scheme is doomed to fail.
>
> This includes the current approach of "Regional indicators".

You're wrong. The regional indicators failed because they were encoded
at the character level, so that their scope of effect was supposed to
extend to arbitrary lengths of text.

Here it's just about how to represent a glyph (even if it's colored)
locally representing a flag. The scope of the encoded substring will
not go outside of this flag indicator, so it will work the same way as
if this were encoded as ligatures.

You can perfectly well create a breaking rule that will avoid breaking the
sequence of encoded characters representing the flag with its code. It
can be handled exactly as if it were an unbreakable word, surrounded
by two punctuation marks (which will still be a valid fallback display
method, in the absence of glyphs in fonts for this type of
string).

You can perfectly well assign representative glyphs to the individual
characters (these glyphs don't have to represent any complete flag,
just a part of a flag showing its code internally).

In fact, all characters used will be treated as separate symbols
(independently of the fact that they *may* be ligatured to show the
actual flag design). The encoding will provide a clear indication that
substituting an actual flag for the list of default representative
glyphs will be valid (it won't break the character identities, as long
as there exists a registry describing the assigned flag codes,
reencoded with these symbols).

In other words, it avoids completely the need to encode directly any
flag of any political entity (or with a naming convention applied in
the vexillologist registry, for any other personal or organisational
flag). It avoids all copyright issues and the problem of legal
restriction of use of flags (including in some countries where some
flags are prohibited).




Re: Flag tags (was: Re: Unicode 6.2 to Support the Turkish Lira Sign)

2012-05-31 Thread Philippe Verdy
Here he probably meant that if we need to encode many flags, each flag
code may be arbitrarily long. A solution based on combining characters
will not work correctly, and it will be better to use leading and
trailing markers, or to use a codification that allows knowing where a
flag starts and where it finishes.

There are two solutions:

(1) use specific punctuation-like characters acting like brackets
(those brackets can also be given a visual glyph by themselves), and
encode the intermediate flag code using usual characters. This would
allow viable fallback representations of flags, even if they show the
codes (as letters will be enclosed, for readability, the set should be
restricted and probably only uppercase, so that letters can be reduced
easily within the enclosing sym

(2) restrict the subset of characters that are usable in flag
identification codes to a useful and productive subset of ASCII, then
reencode them as enclosed letters marking the start and end of the
code, as well as any medial codes. This eases the production of
fonts for a reasonable representation of these codes within a visual
band looking like a flag, and also allows those sequences to be
easily converted into ligatures for showing the actual flags
(including with their colors if needed).
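
The reason solution (2) is self-delimiting can be sketched like this.
Symbols are modelled as (position, letter) pairs with positions "I", "M"
and "F" for start, medial and end; that representation and the function
name are assumptions for illustration, not part of any proposal.

```python
def split_flag_codes(symbols: list[tuple[str, str]]) -> list[str]:
    """Group (position, letter) pairs into complete flag codes.

    A start symbol always opens a new code and an end symbol closes it,
    so no external separator is needed between adjacent flags.
    """
    codes = []
    current = None
    for position, letter in symbols:
        if position == "I":
            current = [letter]  # any unterminated code before this is dropped
        elif current is not None:
            current.append(letter)
            if position == "F":
                codes.append("".join(current))
                current = None
        # medial or end symbols with no open code are ignored as stray
    return codes
```

Two adjacent flags such as JP followed by USA are recovered without any
separator between them, and a stray symbol does not corrupt the code
that follows it.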

Your solution based on a ZWSP *separator* does not solve anything: it
does not clearly indicate that this is representing a flag, and will
not allow automated recognition and production of ligatures.

2012/5/31 Andrew West :
> On 31 May 2012 00:24, Mark Davis ☕  wrote:
>>
>> There is definitely a problem.
>
> Is it really such a problem?  Why can't implementations simply use
> ZWSP to demarcate the 2-character units in a sequence of more than two
> regional indicator symbols (and maybe always emit 2-character codes
> wrapped between ZWSP on either side to be safe), so for example
> US<ZWSP>ES<ZWSP>GE would be parsed as the regional indicator symbols
> for USA, SPAIN and Georgia, whereas U<ZWSP>SE<ZWSP>SG<ZWSP>E would be
> parsed as the regional indicator symbols for U (invalid), Sweden,
> Singapore and E (invalid).  Algorithms such as line-breaking would not
> break between two regional indicator symbols, but only at a ZWSP.
>
> And if implementations wanted to support two- and three-letter
> regional codes, they might parse
> GB<ZWSP>CYM<ZWSP>ENG<ZWSP>NIR<ZWSP>SCO as the codes for
> United Kingdom, Wales, England, Northern Ireland, and Scotland, and
> represent them visually with the appropriate flag icons.
>
> Andrew
>
>




Re: Flag tags

2012-05-31 Thread Michael Everson
On 31 May 2012, at 22:57, Asmus Freytag wrote:

> See, there you go.

What do you mean by this?

Michael Everson * http://www.evertype.com/




Re: Flag tags

2012-05-31 Thread Asmus Freytag

On 5/31/2012 12:03 PM, Doug Ewell wrote:



Another alphabet, even that with 1:1 correspondence to Latin, but,
again, not recognizable as such are the "dancing men". They at least
can be demonstrated to have appeared in print.
Are substitution ciphers candidates for encoding?



To the degree that the use of the substitution is "style", no. Fraktur 
and Insular forms have been unified for Latin. But these styles are also 
recognizable (if not to all users, then a significant number). And, 
there's a benefit in identifying them primarily with the Latin alphabet, 
and only secondarily with the precise style.


The "dancing men" are more like Braille. There's one source where they 
have been given a particular "mapping" to the Latin alphabet, but that 
mapping is not the only one possible. The whole point of them is that 
the actual mapping has to be known or discovered each time.


So, yes, these would have to be encoded by shape, not by target.
A./



Re: Flag tags

2012-05-31 Thread Asmus Freytag

  
  
On 5/31/2012 1:56 PM, Shawn Steele wrote:

> First, reprinting Shakespeare's works using flags would make it
> immediately and utterly illegible to most speakers of English. So they
> would fail the test of being recognizably the same letter.

FWIW: The "Alpha" flag doesn't mean "A".  For example it also means
"Diver Down".  Most of the flags have other meanings beyond just a
letter, like Quebec & Quarantine.  So it's not just a substitution
cipher.  Combinations can also have special meanings.  Additionally,
repeaters make it more complicated than a simple substitution cipher,
eg: November, Oscar, Repeat2, Repeat1 for noon == 4 different flags for
2 letters.

See, there you go.

A./



Re: Flag tags

2012-05-31 Thread Asmus Freytag

On 5/31/2012 12:07 PM, Karl Pentzlin wrote:

Am Donnerstag, 31. Mai 2012 um 20:09 schrieb John H. Jenkins:

JHJ>  
JHJ>  ... that because some
JHJ>  countries have currency symbols with dedicated code points, other
JHJ>  countries will make *new* currency symbols and demand that *they*
JHJ>  get dedicated code points ...

Seriously speaking, flag symbols and currency signs are completely
different topics.

Every country has exactly one flag, right now.


But not all of these flags are used in writing - right now.

This is similar to not all currencies having a "symbol".

There's nothing wrong with encoding a subset and leaving the door open 
for additions - there's no reason to jump to encoding hundreds of 
concrete cloth and thread "symbols" without any indication that they are 
used in text. Or is there?


Also, for those of you not residing in North America, a point of 
information: the state flags of the 50 states of the USA are flown 
widely - if not as widely as the federal flag, and the accompanying 
symbols and designs (including seals) are widely used in publications. 
So, there's not a simple 1 country : 1 flag principle here - if you look 
at actual usage, there's a wide variety of practices.


A./

Thus, in fact, an
encoding proposal proposing only a few of them, based on an
arbitrary collection made by some telephone companies without evidence
of any scrutiny in its making, can never be acceptable for most national
bodies represented in ISO.

On the other hand, currencies may exist without a currency symbol
(as in fact most currencies do). In fact, all currency symbols
assigned to currencies valid today are included in Unicode now, with
only two exceptions after acceptance for the new Turkish Lira sign:

AZN Azerbaijan Manat (waiting for confirmation of its actual use),

ANG Netherlands Antillean guilder (used formerly mostly for NLG Dutch
  guilder which was valid until 2002; problematically unified with
  U+0192 LATIN SMALL LETTER F WITH HOOK; see
  http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3588.pdf )

On this base, nobody will request the addition of other symbols
as precondition for acceptance for any new currency sign on ballot.

- Karl

RE: Flag tags

2012-05-31 Thread Shawn Steele
> First, reprinting Shakespeare's works using flags would make it immediately
> and utterly illegible to most speakers of English. So they would fail the test
> of being recognizably the same letter.



FWIW: The "Alpha" flag doesn't just mean "A".  For example it also means "Diver 
Down".  Most of the flags have other meanings beyond just a letter, like Quebec 
& Quarantine.  So it's not just a substitution cipher.  Combinations can also 
have special meanings.  Additionally, repeaters make it more complicated than a 
simple substitution cipher, e.g. November, Oscar, Repeat2, Repeat1 for "noon": 
4 different flags for 2 letters.
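
To make the repeater point concrete, here is a rough sketch in Python. It assumes a simplified rule (substitute N repeats the Nth flag of the hoist, each substitute is flown at most once, only three substitutes exist, letters only); the function name hoist is made up for illustration and the real signalling rules have more cases than this.

```python
# Simplified sketch: spell a word as a single hoist of ICS flags.
# Assumption: "RepeatN" repeats the Nth flag of the hoist and can be
# used only once; only the first three positions can be repeated.
NAMES = ("Alfa Bravo Charlie Delta Echo Foxtrot Golf Hotel India Juliett "
         "Kilo Lima Mike November Oscar Papa Quebec Romeo Sierra Tango "
         "Uniform Victor Whiskey Xray Yankee Zulu").split()
FLAG = {name[0]: name for name in NAMES}


def hoist(word):
    letters = word.upper()
    flags = []
    used = set()  # positions whose substitute is already flying
    for idx, ch in enumerate(letters):
        # Look for an earlier occurrence among the first three positions.
        pos = next((i for i in range(min(3, idx))
                    if letters[i] == ch and i not in used), None)
        if pos is None:
            # No substitute available: fly the letter's own flag
            # (a real ship would need a spare flag for a repeat here).
            flags.append(FLAG[ch])
        else:
            used.add(pos)
            flags.append(f"Repeat{pos + 1}")
    return flags


print(hoist("NOON"))  # ['November', 'Oscar', 'Repeat2', 'Repeat1']
```

The mapping from text to flags therefore depends on position within the hoist, which is exactly what a plain letter-for-letter substitution does not do.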


[Images: ICS November.svg, ICS Oscar.svg, ICS Repeat Two.svg, ICS Repeat One.svg]






-Shawn



Re: Flag tags

2012-05-31 Thread David Starner
On Thu, May 31, 2012 at 12:03 PM, Doug Ewell  wrote:
> Asmus Freytag  wrote:
>
>> First, reprinting Shakespeare's works using flags would make it
>> immediately and utterly illegible to most speakers of English. So they
>> would fail the test of being recognizably the same letter.
>[...]
>> Another alphabet, even that with 1:1 correspondence to Latin, but,
>> again, not recognizable as such are the "dancing men". They at least
>> can be demonstrated to have appeared in print.
>
> Are substitution ciphers candidates for encoding?

Exactly. I've always thought that Cyrillicized Latin fonts (Яussiaи
with all Latin backing) and flag letters and various other weird
symbolic conversions are perfectly legal, if limited, Unicode fonts. The
Dancing Men are really a special font for Latin.

-- 
Kie ekzistas vivo, ekzistas espero.




Re: Flag tags

2012-05-31 Thread Michael Everson
On 31 May 2012, at 20:07, Karl Pentzlin wrote:

> AZN Azerbaijan Manat (waiting for confirmation of its actual use),

Which we cannot get. I even tried talking to an Azeri students' group on 
Facebook, and could not get them to even acknowledge that they understood what 
I was asking for. 

Somebody else sent us to the Central Bank.

All we would need is a photo of the symbol being used in a convenience store. 
But we can't even get that. 

> ANG Netherlands Antillean guilder (used formerly mostly for NLG Dutch
> guilder which was valid until 2002; problematically unified with
> U+0192 LATIN SMALL LETTER F WITH HOOK; see
> http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3588.pdf )

I wish there were an easy solution for that one.

Michael Everson * http://www.evertype.com/





Re: Flag tags

2012-05-31 Thread Karl Pentzlin
On Thursday, 31 May 2012 at 20:09, John H. Jenkins wrote:

JHJ> 
JHJ> ... that because some
JHJ> countries have currency symbols with dedicated code points, other
JHJ> countries will make *new* currency symbols and demand that *they*
JHJ> get dedicated code points ...

Seriously speaking, flag symbols and currency signs are completely
different topics.

Every country has exactly one flag, right now. Thus, an encoding
proposal covering only a few of them, based on an arbitrary collection
made by some telephone companies with no evidence of any scrutiny in
its making, can never be acceptable to most national bodies
represented in ISO.

On the other hand, currencies may exist without a currency symbol
(as in fact most currencies do). In fact, all currency symbols
assigned to currencies valid today are included in Unicode now, with
only two exceptions after acceptance for the new Turkish Lira sign:

AZN Azerbaijan Manat (waiting for confirmation of its actual use),

ANG Netherlands Antillean guilder (used formerly mostly for NLG Dutch
 guilder which was valid until 2002; problematically unified with
 U+0192 LATIN SMALL LETTER F WITH HOOK; see
 http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3588.pdf )

On this basis, nobody will request the addition of other symbols
as a precondition for acceptance of any new currency sign on ballot.

- Karl








RE: Flag tags

2012-05-31 Thread Doug Ewell
Asmus Freytag  wrote:

> First, reprinting Shakespeare's works using flags would make it
> immediately and utterly illegible to most speakers of English. So they
> would fail the test of being recognizably the same letter.
>
> Second, one place where the flags are still used today is sailboat
> races. Replacing the flag by a placard showing the letter would also
> not be acceptable in that context.
>
> So, seeing that Unicode nowadays has the support of SMS-specific
> symbols as part of its scope, who would like to be able to communicate
> with flags?
>
> Another alphabet, even that with 1:1 correspondence to Latin, but,
> again, not recognizable as such are the "dancing men". They at least
> can be demonstrated to have appeared in print.

Are substitution ciphers candidates for encoding?

--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell






RE: Flag tags

2012-05-31 Thread Doug Ewell
One possible problem with either (a) encoding flags or (b) encouraging
the display of Regional Indicator Symbols as flags is that some authors
would want to use them to indicate the language of the text that
follows. I'm not talking about inline, plain-text "language tagging" in
the sense that UTC frowns upon, but literally a visual display of a
flag. 

It's common, particularly in Europe, to see English-language text marked
with a Union Jack, French-language text marked with the flag of France,
and so forth. Of course, we all know the problems with using national
flags to indicate languages, but it's common practice nevertheless.
Having Unicode characters for flags, especially well-supported ones,
might encourage this practice.

Of course, the Japanese phone users might have been doing this all along
with the existing 10 emoji flags.

--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell






Re: Preliminary proposal to encode Unifon in the UCS.

2012-05-31 Thread Jean-François Colson

Hello

I wrote: “1st possibility: a separate script. There’ll be no problem.”
You wrote: “There would, because the bulk of the script would look just 
like Latin, and the encoding committees consider this to be a security 
issue for internet spoofing for instance.”

I don’t understand.
Internet spoofing would be possible for example by mixing Latin and 
Cyrillic letters in internationalized domain names. For example, instead 
of paypal.com, you could take advantage of the fact that the first five 
letters all have look-alike Cyrillic letters and register one of the 
31 (2⁵-1) DIFFERENT domain names paypаl.com, payрal.com, payраl.com, 
paуpal.com, paуpаl.com, paурal.com, paураl.com, pаypal.com, pаypаl.com, 
pаyрal.com, pаyраl.com, pауpal.com, pауpаl.com, pаурal.com, pаураl.com, 
рaypal.com, рaypаl.com, рayрal.com, рayраl.com, рaуpal.com, рaуpаl.com, 
рaурal.com, рaураl.com, раypal.com, раypаl.com, раyрal.com, раyраl.com, 
рауpal.com, рауpаl.com, раурal.com or раураl.com to ask your “customers” 
for their PayPal e-mail and password. That could only work if the 
said customer is very distracted or if he has previously typed 
“about:config” in the address bar and set network.IDN_show_punycode to 
false. (That works with Firefox. The way to do it could be different 
with other browsers.)
But, as far as I know, domain names are commonly written in 
lowercase. When I type in capitals a domain name which doesn’t exist, 
such as CUYOPUIESVRDKRSIXTVESVRDSHKSE.com, it is automatically converted 
to lowercase (http://www.cuyopuiesvrdkrsixtvesvrdshkse.com/) before the 
“not found” message is displayed.
In Unifon, only the capital letters would look alike. The lowercase 
letters would be different. There could be a problem with the letter o, 
but that would be a drop in the ocean, not more problematic than the 
letter ᴏ (small capital o), ο (Greek omicron), о (Cyrillic o), ⲟ (Coptic 
o), 𐐬 (Deseret o), ჿ (Georgian labial sign), ੦ (Gurmukhi zero), all the 
zeros, most of which look like circles, etc.
What exactly is the real security issue with Unifon as a separate 
script? Someone who wants to spoof will find a way to do it without that.
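
For anyone who wants to check the count of 31, here is a small Python sketch. It only uses the three Latin/Cyrillic pairs from the list above (p/р, a/а, y/у); the function names are made up for illustration, and the script test based on character names is a crude assumption, not how browsers actually decide what to display.

```python
from itertools import product
import unicodedata

# Latin letter -> look-alike Cyrillic letter (the three pairs used above).
LOOKALIKE = {"p": "\u0440", "a": "\u0430", "y": "\u0443"}


def spoof_variants(label):
    """All spellings of label with at least one look-alike swapped in."""
    choices = [(ch, LOOKALIKE[ch]) if ch in LOOKALIKE else (ch,)
               for ch in label]
    combos = ("".join(c) for c in product(*choices))
    return [v for v in combos if v != label]


def scripts(label):
    """Crude script detection: first word of each character name."""
    return {unicodedata.name(ch).split()[0] for ch in label}


def to_ascii(domain):
    """Punycode (xn--) form, via Python's built-in IDNA 2003 codec."""
    return domain.encode("idna").decode("ascii")


variants = spoof_variants("paypal")
print(len(variants))                  # 31, i.e. 2**5 - 1
print(scripts(variants[0]))           # mixes LATIN and CYRILLIC
print(to_ascii(variants[0] + ".com")) # the form a cautious browser shows
```

Every one of the 31 variants still contains the Latin l, so each is a mixed-script label, which is the thing a registry or browser can test for.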






NOW, a few comments about the Unifon proposal.

You didn’t correct “for several the Hupa, Yurok, Tolowa, and Karok 
languages”.

There’s also the word “Karok”. Below, you write “Karuk”.

In the Unifon letters unified with existing characters, you forgot the 
letter I.


You propose a Latin capital letter small capital i to be paired with ɪ 
(Latin letter small capital i). Would ɪ have wider serifs when displayed 
in small caps?


For the Latin capital beta, you wrote: “The unique Latin capital form 
meets one of the major criteria for disunification.”
Could I use the same formula for Unifon? The unique Unifon small forms 
meet one of the major criteria for disunification…


In the previous proposal, you also included a letter which looked a 
little like a ƆC ligature or a rounded X. You called it zhay in n4195. 
Have you forgotten it deliberately? That’s the last letter in figure 1, 
although you wrote X in the caption.


You also used an X in Figure 7’s caption: it would be strange to have an 
X pronounced /ʒ/ (zh) in a phonemic alphabet for English.


In the first three columns of the table at page 12, the two parts of 
Latin letter oy are detached. In all samples of Unifon I’ve seen which 
use that letter, the vertical line of the turned Ⱶ is tangent to the 
right of the O.


In the same table, the Latin letter dhe should have a round shape. 
That’s one of the two features which make it possible to distinguish it 
from the Latin letter the.
In all Unifon fonts I know except one, the left part of the letter dhe 
is not really a T but something midway between a T and a Γ.


I think Latin letter the should have a small top bar.

In this table of the Tolowa Unifon alphabet, 
http://unifon.org/images/TOLOWA.jpg , some letters have a different 
value when followed by a small stroke which looks like an apostrophe. 
Should it be an ASCII apostrophe, a ’ (U+2019), a ʼ (U+02BC), a Ꞌ 
(saltillo) or something else?


On page 3, the capital ʃ looks like an enlarged form of the lowercase 
letter, different from the Greek capital sigma-like Ʃ. Would the unique 
Latin capital form meet one of the major criteria for disunification? 
What about the capital U with a tail?


I wonder whether the 8th letter of the 42-letter “Indian Unifon 
Single-Sound Alphabet” is a turned or a reversed C.


For the turned e-r, I think a new lower case is needed.

For the Latin letter reversed-e e, could the double ϵ, used for the same 
sound in the Initial Teaching Alphabet, be used as a lower case letter?


Would a separate proposal be required for the Initial Teaching Alphabet 
(http://en.wikipedia.org/wiki/Initial_Teaching_Alphabet)?

28 or 29 letters of this 44 letter alphabet are already supported:
b, c, d, f, ɡ, h, j, k, l, m, n are already supported.
ng ligature is different from ŋ.
p, r, s are already suppo

RE: Flag tags (was: Re: Unicode 6.2 to Support the Turkish Lira Sign)

2012-05-31 Thread Doug Ewell
William_J_G Overington 
wrote:

> Further to that point of order, is there any rule that absolutely
> prevents the deprecated status of a character or collection of
> characters being removed?

UTC has not ever shown the slightest inclination to do so, if that
answers your question.

> I feel that by hybridizing the suggestions of Doug and Philippe that
> an elegant solution using tags and an advanced format font could be
> designed.

I had forgotten that the Regional Indicator Symbols from U+1F1E6 through
U+1F1FF had already been encoded. You can create such a font today if
you like, mapping pairs of these symbols to a flag representing the
country with that ISO 3166-1 code element. See TUS 6.1, Section 5.10,
next-to-last subsection (page 534) for details.
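
As a minimal sketch of the character side of that (the font side is a separate matter), the following Python maps a two-letter code to the corresponding pair of Regional Indicator Symbols. The function name is made up for illustration, and it does not check that the code is actually assigned in ISO 3166-1.

```python
RI_BASE = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def regional_indicators(code):
    """Map a two-letter code such as 'CH' to a Regional Indicator pair."""
    code = code.upper()
    if len(code) != 2 or not all("A" <= c <= "Z" for c in code):
        raise ValueError("expected two letters A-Z")
    return "".join(chr(RI_BASE + ord(c) - ord("A")) for c in code)


pair = regional_indicators("CH")
print([f"U+{ord(c):04X}" for c in pair])  # ['U+1F1E8', 'U+1F1ED']
```

Whether that pair is displayed as a flag or as two boxed letters is entirely up to the font and renderer.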

--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell






Re: Flag tags

2012-05-31 Thread John H. Jenkins

Michael Everson wrote on 31 May 2012 at 11:57 AM:

> When you encode a flag for Germany and the US, you automatically get a demand 
> for the encoding of a flag for Ireland and Iceland. That's the way it is. 


Oh, c'mon, Michael, next you'll be saying that because some countries have 
currency symbols with dedicated code points, other countries will make *new* 
currency symbols and demand that *they* get dedicated code points, too. We all 
know how unrealistic a scenario *that* is.


=
John H. Jenkins
jenk...@apple.com






Re: Flag tags

2012-05-31 Thread Michael Everson
On 31 May 2012, at 18:51, Asmus Freytag wrote:

> The right answer would have been to encode the 10 flags and then agree to 
> *study* the needs for and best solutions available to address a more 
> comprehensive system "at a future date". The main problem I see in that 
> regard is impatience.

ISO NBs were, correctly, uncomfortable with the idea of encoding the flags of 
some countries and not of others. As representative of one of those NBs, I have 
no regrets about having made our proposal, which is still better than the 
current solution. 

> It's like with currency symbols - you code things when there's demonstrated 
> demand, you don't put place holders in, and you don't give codes to all the 
> three letter currency codes (like "USD" "CND" etc.).

When you encode a flag for Germany and the US, you automatically get a demand 
for the encoding of a flag for Ireland and Iceland. That's the way it is. And 
no, waiting for some vendor to put more flags in the phone is not going to 
solve it. If you don't understand the politics of this matter, well, I can't 
help you to do it. 

Michael Everson * http://www.evertype.com/





Re: Flag tags

2012-05-31 Thread Asmus Freytag

On 5/31/2012 9:34 AM, Michael Everson wrote:

On 31 May 2012, at 17:26, Asmus Freytag wrote:


you put your finger on it. Any form of combining scheme is doomed to fail.

That's why http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3680.pdf was the right 
solution.




No Michael.

While I've come to the conclusion that encoding some form of 
combinatorial tags is indeed doomed, I don't believe that encoding 
images for codes (or if you will, ASCII strings) is the answer - that's 
meta encoding of a different sort.


The right answer would have been to encode the 10 flags and then agree 
to *study* the needs for and best solutions available to address a more 
comprehensive system "at a future date". The main problem I see in that 
regard is impatience.


It's like with currency symbols - you code things when there's 
demonstrated demand, you don't put place holders in, and you don't give 
codes to all the three letter currency codes (like "USD" "CND" etc.).


A./



Re: Flag tags

2012-05-31 Thread Asmus Freytag

On 5/31/2012 9:30 AM, Michael Everson wrote:

On 31 May 2012, at 17:19, Asmus Freytag wrote:


Some of them can be substituted and will be recognized by all as "jolly roger", 
others will not.

The former set "may" be glyph variants - that is, if there's no contrastive 
usage, the latter cannot be.

They are logos for the actual dead pirate captains.


That's so. Do their heirs claim rights to them? That would exclude them 
from encoding forever.


But wait, aren't national flags "logos" for their respective countries?

A./

PS: This is the part I can't find funny:


They are glyph variants of "pirate flag" otherwise. Some are just obscure glyph 
variants.


In this case, on top of that, many represent symbols identifying particular 
bands, captains or ships (or nowadays, movie cycles). As such they resemble the 
distinguishing function of national flags.

Then, yes, but now we do have a notion of "pirate flag" which is basically 
black with a skull and crossbones on it.


"Pirate flag" is a generic concept. Encoding a generic concept as such in 
Unicode is a problematic notion - especially if from that the mistaken 
conclusion is drawn that all concrete realizations of symbols that 
somehow pertain to the same general concept are mere glyph variants.


What you would encode is not the concept of "pirate flag" but the 
"archetypical representation of a (generic) pirate flag". That means 
that minor variations in the skull and crossbones are indeed glyph 
variants (representing different artists' attempt to depict the same 
thing), but that other types of flags, used as pirate flags, do not 
constitute mere variants, but represent their own symbols (of related, 
but not identical semantics).


The distinction between these concepts has been sorely lacking in much 
of the recent and not so recent discussion of encoding symbols, and 
that's why I can't find it funny...




Re: Flag tags (was: Re: Unicode 6.2 to Support the Turkish Lira Sign)

2012-05-31 Thread William_J_G Overington
Doug Ewell  wrote:
 
> A seemingly straightforward solution to the “unambiguous mapping” problem 
> would be to use the existing Plane 14 tag letters along with a new FLAG TAG, 
> say at U+E0002. Then  would unequivocally denote the 
> current Swiss flag. No need for separate lead and trail. Simple.
 
> ... What’s that? Oh, sorry, never mind. Deprecated.
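
Purely to illustrate the quoted idea, here is what such a sequence would look like in Python. Note the assumptions: U+E0002 is not an assigned character and FLAG TAG does not exist, both are taken from the suggestion above; the tag letters themselves are the real (deprecated) Plane 14 characters at U+E0000 plus the ASCII value.

```python
FLAG_TAG = "\U000E0002"  # hypothetical: unassigned code point, from the suggestion


def tag_letter(ch):
    """Plane 14 tag character corresponding to an ASCII character."""
    if not 0x20 <= ord(ch) <= 0x7E:
        raise ValueError("tag characters exist only for printable ASCII")
    return chr(0xE0000 + ord(ch))


def flag_tag_sequence(code):
    """Hypothetical FLAG TAG followed by tag letters for the country code."""
    return FLAG_TAG + "".join(tag_letter(c) for c in code.upper())


seq = flag_tag_sequence("CH")
print([f"U+{ord(c):04X}" for c in seq])  # ['U+E0002', 'U+E0043', 'U+E0048']
```

The leading tag makes the start of each flag explicit, which is why no separate lead and trail letters would be needed.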
 
On a point of order, is deprecation of a character or collection of characters 
carried out by only the Unicode Technical Committee or by both of the Unicode 
Technical Committee and the ISO/IEC 10646 Committee?
 
Further to that point of order, is there any rule that absolutely prevents the 
deprecated status of a character or collection of characters being removed?
 
I feel that by hybridizing the suggestions of Doug and Philippe that an elegant 
solution using tags and an advanced format font could be designed.
 
William Overington
 
31 May 2012







Re: Flag tags

2012-05-31 Thread Asmus Freytag

On 5/31/2012 9:40 AM, Shawn Steele wrote:

Which ones are used in print?  Isn't that the criteria?  Personally, I'd like to see the maritime flags 
encoded, because I've always been interested in them, but I can see a case for them not being encoded.  
(Though a couple weeks ago on a cruise ship I did see them used in several places "in print" as it 
were, though I'd have to concede that the reason they were "in print" was primarily decorative, 
though they were readable.  Eg: "Signals" bar spelled out in flags).


The decorative use of those is in fact not uncommon, and when they are 
used that way, in print, they do form "strings".


They do, by definition, require colors for their representation, 
although, the design is such that colors and shapes work together in a 
redundant way, to improve their recognition under poor visibility.


They are also not "glyph variants" of ordinary letters and digits, even 
where there is a 1:1 correspondence.


First, reprinting Shakespeare's works using flags would make it 
immediately and utterly illegible to most speakers of English. So they 
would fail the test of being recognizably the same letter.


Second, one place where the flags are still used today is sailboat 
races. Replacing the flag by a placard showing the letter would also not 
be acceptable in that context.


So, seeing that Unicode nowadays has the support of SMS-specific symbols 
as part of its scope, who would like to be able to communicate with flags?


Another alphabet, even that with 1:1 correspondence to Latin, but, 
again, not recognizable as such are the "dancing men". They at least can 
be demonstrated to have appeared in print.


A./


Seems like swimming flags or shark flags or dive flags wouldn't be used much in 
print?

-Shawn

-Original Message-
From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On Behalf 
Of Asmus Freytag
Sent: Thursday, May 31, 2012 9:00 AM
To: verd...@wanadoo.fr
Cc: Michael Everson; unicode Unicode Discussion
Subject: Re: Flag tags

On 5/31/2012 2:06 AM, Philippe Verdy wrote:

2012/5/31 Asmus Freytag:

On 5/30/2012 7:19 PM, Philippe Verdy wrote:

2012/5/31 Michael Everson:

On 31 May 2012, at 00:24, Mark Davis ☕ wrote:
Members of ISO National Bodies quite properly thought that it is
inappropriate for an International Standard to encode the flags of
some countries and not the flags of others. You can stuff your
condescension, Mark.

I fully agree. Either all of them or none of them (or just a generic
white flag).

No at least the black pirate flag, and the checkered flag (for car racing).

There are two black pirate flags. One is all black (the most generic
one), another has bones and skullhead. OK these ones are generic
enough to not convey country/territory specific information.

There are also conventional sky blue flags used in Europe (may be
elsewhere) for the quality of waters. There may be others used for
signaling (including surveillance of beaches and dangers for swimming
: red, orange, green) : may be unified with the all-black flag (if
color is not really encoded but assignable by external styles).

If you add the flag for car racing, then why wouldn't there be flags used
in other transportation areas?

You are right! I missed these:

Add also flags used as maritime alphabets (they are a true script by
themselves, whose mapping to actual letters depend on the locale's
script, so they are not really a visual variant of any script, just
like the Braille script is not tied to Latin), or other "ideographic"
flags displayed much like the pirate flag (e.g. signaling deceases on
board)...











RE: Flag tags

2012-05-31 Thread Shawn Steele
Which ones are used in print?  Isn't that the criteria?  Personally, I'd like 
to see the maritime flags encoded, because I've always been interested in them, 
but I can see a case for them not being encoded.  (Though a couple weeks ago on 
a cruise ship I did see them used in several places "in print" as it were, 
though I'd have to concede that the reason they were "in print" was primarily 
decorative, though they were readable.  Eg: "Signals" bar spelled out in flags).

Seems like swimming flags or shark flags or dive flags wouldn't be used much in 
print?

-Shawn

-Original Message-
From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On Behalf 
Of Asmus Freytag
Sent: Thursday, May 31, 2012 9:00 AM
To: verd...@wanadoo.fr
Cc: Michael Everson; unicode Unicode Discussion
Subject: Re: Flag tags

On 5/31/2012 2:06 AM, Philippe Verdy wrote:
> 2012/5/31 Asmus Freytag:
>> On 5/30/2012 7:19 PM, Philippe Verdy wrote:
>>> 2012/5/31 Michael Everson:
>>>> On 31 May 2012, at 00:24, Mark Davis ☕ wrote:
>>>> Members of ISO National Bodies quite properly thought that it is 
>>>> inappropriate for an International Standard to encode the flags of 
>>>> some countries and not the flags of others. You can stuff your 
>>>> condescension, Mark.
>>> I fully agree. Either all of them or none of them (or just a generic 
>>> white flag).
>> No at least the black pirate flag, and the checkered flag (for car racing).
> There are two black pirate flags. One is all black (the most generic 
> one), another has bones and skullhead. OK these ones are generic 
> enough to not convey country/territory specific information.
>
> There are also conventional sky blue flags used in Europe (may be
> elsewhere) for the quality of waters. There may be others used for 
> signaling (including surveillance of beaches and dangers for swimming
> : red, orange, green) : may be unified with the all-black flag (if 
> color is not really encoded but assignable by external styles).
>
> If you add the flag for car racing, then why wouldn't there be flags used 
> in other transportation areas?
You are right! I missed these:
>
> Add also flags used as maritime alphabets (they are a true script by 
> themselves, whose mapping to actual letters depend on the locale's 
> script, so they are not really a visual variant of any script, just 
> like the Braille script is not tied to Latin), or other "ideographic"
> flags displayed much like the pirate flag (e.g. signaling deceases on 
> board)...
>









Re: Flag tags

2012-05-31 Thread Michael Everson
On 31 May 2012, at 17:26, Asmus Freytag wrote:

> you put your finger on it. Any form of combining scheme is doomed to fail. 

That's why http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3680.pdf was the right 
solution.

Michael Everson * http://www.evertype.com/





Re: Flag tags

2012-05-31 Thread Michael Everson
On 31 May 2012, at 17:19, Asmus Freytag wrote:

> Some of them can be substituted and will be recognized by all as "jolly 
> roger", others will not.
> 
> The former set "may" be glyph variants - that is, if there's no contrastive 
> usage, the latter cannot be.

They are logos for the actual dead pirate captains. They are glyph variants of 
"pirate flag" otherwise. Some are just obscure glyph variants. 

> In this case, on top of that, many represent symbols identifying particular 
> bands, captains or ships (or nowadays, movie cycles). As such they resemble 
> the distinguishing function of national flags.

Then, yes, but now we do have a notion of "pirate flag" which is basically 
black with a skull and crossbones on it. 

Michael Everson * http://www.evertype.com/





Re: Flag tags

2012-05-31 Thread Asmus Freytag

On 5/30/2012 10:15 PM, Doug Ewell wrote:
A seemingly straightforward solution to the “unambiguous mapping” 
problem would be to use the existing Plane 14 tag letters along with a 
new FLAG TAG, say at U+E0002. Then  would 
unequivocally denote the current Swiss flag. No need for separate lead 
and trail. Simple.

... What’s that? Oh, sorry, never mind. Deprecated.


Doug,

you put your finger on it. Any form of combining scheme is doomed to fail.

This includes the current approach of "Regional indicators".

The simple reason is that the use case is too remote to attract 
implementer attention, and, because the user community is not, in 
principle, limited, implementation support would have to be widespread 
to make any of these schemes work if and where desired.


The lesson for the UTC (and WG2) should be to cease such efforts of 
"meta encoding".


A./


Re: Flag tags

2012-05-31 Thread Michael Everson
On 31 May 2012, at 17:19, Asmus Freytag wrote:

> On 5/31/2012 8:56 AM, Shawn Steele wrote:
>>>>> We are missing the JOLLY ROGER.
>>>> At least one, there're lots :)
>>>> http://en.wikipedia.org/wiki/Pirate_flag#Jolly_Roger_gallery
>>> A, glyph variants.
>> Ar, you're right, missed that :)
> 
> No, that's a misunderstanding of glyph variants.

Lordy. It was FUNNY, Asmus. 

Michael Everson * http://www.evertype.com/




Re: Flag tags

2012-05-31 Thread Asmus Freytag

On 5/31/2012 8:56 AM, Shawn Steele wrote:

We are missing the JOLLY ROGER.

At least one, there're lots :)
http://en.wikipedia.org/wiki/Pirate_flag#Jolly_Roger_gallery

A, glyph variants.

Ar, you're right, missed that :)




No, that's a misunderstanding of glyph variants.

Some of them can be substituted and will be recognized by all as "jolly 
roger", others will not.


The former set "may" be glyph variants - that is, if there's no 
contrastive usage, the latter cannot be.


Why? Because for symbols, you don't have a word-context that gives you a 
definite, secondary clue to the identity of a shape, so the shape alone 
has to be recognized. Hence, designs that cannot be recognized for each 
other are not glyph variants.


In this case, on top of that, many represent symbols identifying 
particular bands, captains or ships (or nowadays, movie cycles). As such 
they resemble the distinguishing function of national flags.


A./



Re: Unicode 6.2 to Support the Turkish Lira Sign

2012-05-31 Thread Asmus Freytag

On 5/30/2012 11:29 PM, Philippe Verdy wrote:


The situation became a problem when the Japanese ISO 646 started to be
mapped to Unicode/ISO/IEC 10646 within fonts using incorrect mappings.
This occurred in the early stages of ISO/IEC 10646 development.


The situation was a problem a long time before that.

A./





Re: Flag tags

2012-05-31 Thread Asmus Freytag

On 5/31/2012 2:06 AM, Philippe Verdy wrote:

2012/5/31 Asmus Freytag:

On 5/30/2012 7:19 PM, Philippe Verdy wrote:

2012/5/31 Michael Everson:

On 31 May 2012, at 00:24, Mark Davis ☕ wrote:
Members of ISO National Bodies quite properly thought that it is
inappropriate for an International Standard to encode the flags of some
countries and not the flags of others. You can stuff your condescension,
Mark.

I fully agree. Either all of them or none of them (or just a generic
white flag).

No at least the black pirate flag, and the checkered flag (for car racing).

There are two black pirate flags. One is all black (the most generic
one), another has bones and skullhead. OK these ones are generic
enough to not convey country/territory specific information.

There are also conventional sky blue flags used in Europe (may be
elsewhere) for the quality of waters. There may be others used for
signaling (including surveillance of beaches and dangers for swimming
: red, orange, green) : may be unified with the all-black flag (if
color is not really encoded but assignable by external styles).

If you add the flag for car racing, then why wouldn't there be flags used
in other transportation areas?

You are right! I missed these:


Add also flags used as maritime alphabets (they are a true script by
themselves, whose mapping to actual letters depend on the locale's
script, so they are not really a visual variant of any script, just
like the Braille script is not tied to Latin), or other "ideographic"
flags displayed much like the pirate flag (e.g. signaling deceases on
board)...






RE: Flag tags

2012-05-31 Thread Shawn Steele
>>> We are missing the JOLLY ROGER.
> 
>> At least one, there're lots :)
> 
>> http://en.wikipedia.org/wiki/Pirate_flag#Jolly_Roger_gallery

> A, glyph variants. 

Ar, you're right, missed that :)

-Shawn






Re: Flag tags

2012-05-31 Thread Michael Everson
On 31 May 2012, at 16:04, Shawn Steele wrote:

>> We are missing the JOLLY ROGER.
> 
> At least one, there're lots :)
> 
> http://en.wikipedia.org/wiki/Pirate_flag#Jolly_Roger_gallery

A, glyph variants. 

Yo ho ho,
Michael Everson * http://www.evertype.com/




RE: Flag tags

2012-05-31 Thread Shawn Steele
> We are missing the JOLLY ROGER.

At least one, there're lots :)

http://en.wikipedia.org/wiki/Pirate_flag#Jolly_Roger_gallery






[OT] Re: Flag tags

2012-05-31 Thread Doug Ewell
Philippe Verdy wrote:

> Also there should exist somewhere a registry of known flag codes.
> There are wellknown vexillologic sites that list large collections of
> flags, but for now they still did not develop a standard (ASCII-based)
> codification.
>
> [...]
>
> But this registry does not have to be defined and maintained by the
> Unicode Consortium or by ISO, unless they have the desire to develop
> it.

This doesn't seem at all within the scope of Unicode, though perhaps
CLDR would want it.

--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell






Re: Flag tags

2012-05-31 Thread Doug Ewell
Philippe Verdy wrote:

> So to represent the flag of Japan, you could encode: 
>
> FLAG INITIAL SYMBOL J 
> FLAG FINAL SYMBOL P 
> [...]

For me, the existing Plane 14 mechanism would have worked just as well,
without requiring three more duplicate sets of printable Basic Latin.

--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell






Re: Flag tags

2012-05-31 Thread Philippe Verdy
2012/5/31, Michael Everson  wrote:
> U+26FF WHITE FLAG WITH HORIZONTAL MIDDLE BLACK STRIPE

What does this mean? Is it really useful for something?



Re: Flag tags

2012-05-31 Thread Andrew West
On 31 May 2012 10:20, Michael Everson  wrote:
>
>> No at least the black pirate flag, and the checkered flag (for car racing).
>
> U+2690 WHITE FLAG
> U+2691 BLACK FLAG
> U+26FF WHITE FLAG WITH HORIZONTAL MIDDLE BLACK STRIPE
> U+1F38C CROSSED FLAGS
> U+1F3C1 CHEQUERED FLAG
>
> We are missing the JOLLY ROGER.

I propose U+20F1 COMBINING ENCLOSING FLAG, and a named sequence
 = JOLLY ROGER.

Andrew



Re: Flag tags (was: Re: Unicode 6.2 to Support the Turkish Lira Sign)

2012-05-31 Thread Andrew West
On 31 May 2012 00:24, Mark Davis ☕  wrote:
>
> There is definitely a problem.

Is it really such a problem?  Why can't implementations simply use
ZWSP to demarcate the 2-character units in a sequence of more than two
regional indicator symbols (and maybe always emit 2-character codes
wrapped between ZWSP on either side to be safe), so for example
<ZWSP>US<ZWSP>ES<ZWSP>GE<ZWSP> would be parsed as the regional
indicator symbols for USA, Spain and Georgia, whereas
U<ZWSP>SE<ZWSP>SG<ZWSP>E would be parsed as the regional indicator
symbols for U (invalid), Sweden, Singapore and E (invalid).  Algorithms
such as line-breaking would not break between two regional indicator
symbols, but only at a ZWSP.

And if implementations wanted to support two- and three-letter
regional codes, they might parse
<ZWSP>GB<ZWSP>CYM<ZWSP>ENG<ZWSP>NIR<ZWSP>SCO<ZWSP> as the codes for
United Kingdom, Wales, England, Northern Ireland, and Scotland, and
represent them visually with the appropriate flag icons.
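
A minimal sketch of that parsing rule, assuming the text contains only
regional indicator symbols (U+1F1E6..U+1F1FF) and ZWSP (U+200B). The
function names are invented for illustration, and "valid" here only
means "has an accepted length", not "is an assigned code":

```python
ZWSP = "\u200b"
RI_BASE = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def to_indicators(letters: str) -> str:
    """Map ASCII capitals A-Z to regional indicator symbols."""
    return "".join(chr(RI_BASE + ord(c) - ord("A")) for c in letters)


def parse_units(text: str, allowed_lengths=(2,)):
    """Split at ZWSP and return (letters, has_accepted_length) per unit."""
    units = []
    for run in text.split(ZWSP):
        if not run:
            continue  # empty piece produced by a leading or trailing ZWSP
        if not all(RI_BASE <= ord(c) <= RI_BASE + 25 for c in run):
            raise ValueError("unexpected character in regional indicator run")
        letters = "".join(chr(ord(c) - RI_BASE + ord("A")) for c in run)
        units.append((letters, len(letters) in allowed_lengths))
    return units


wrapped = ZWSP + ZWSP.join(to_indicators(c) for c in ("US", "ES", "GE")) + ZWSP
shifted = ZWSP.join(to_indicators(c) for c in ("U", "SE", "SG", "E"))
print(parse_units(wrapped))  # three units of length 2
print(parse_units(shifted))  # U and E are flagged as not having length 2
```

Passing allowed_lengths=(2, 3) would accept the three-letter units as
well.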

Andrew




Re: Flag tags

2012-05-31 Thread Michael Everson
On 31 May 2012, at 04:49, Asmus Freytag wrote:

> On 5/30/2012 7:19 PM, Philippe Verdy wrote:
>>> 2012/5/31 Michael Everson:
>>> Members of ISO National Bodies quite properly thought that it is 
>>> inappropriate for an International Standard to encode the flags of some
>>> countries and not the flags of others. [...]
>> 
>> I fully agree. Either all of them or none of them (or just a generic white 
>> flag).
> 
> No at least the black pirate flag, and the checkered flag (for car racing).
> 
> Those would constitute the minimum useful set.

U+2690 WHITE FLAG
U+2691 BLACK FLAG
U+26FF WHITE FLAG WITH HORIZONTAL MIDDLE BLACK STRIPE
U+1F38C CROSSED FLAGS
U+1F3C1 CHEQUERED FLAG

We are missing the JOLLY ROGER.

Michael Everson * http://www.evertype.com/





Re: Flag tags

2012-05-31 Thread Philippe Verdy
2012/5/31 Asmus Freytag:
> On 5/30/2012 7:19 PM, Philippe Verdy wrote:
>>
>> 2012/5/31 Michael Everson:
>>>
>>> On 31 May 2012, at 00:24, Mark Davis ☕ wrote:
>>> Members of ISO National Bodies quite properly thought that it is
>>> inappropriate for an International Standard to encode the flags of some
>>> countries and not the flags of others. You can stuff your condescension,
>>> Mark.
>>
>> I fully agree. Either all of them or none of them (or just a generic
>> white flag).
>
> No at least the black pirate flag, and the checkered flag (for car racing).

There are two black pirate flags. One is all black (the most generic
one); the other has a skull and bones. OK, these are generic
enough not to convey country- or territory-specific information.

There are also conventional sky-blue flags used in Europe (maybe
elsewhere too) to signal water quality. There may be others used for
signaling (including beach surveillance and swimming hazards:
red, orange, green); these could be unified with the all-black flag (if
color is not really encoded but assignable by external styles).

If you add the flag for car racing, then why not the flags used
in other areas of transportation?

Add also the flags used as maritime alphabets (they are a true script in
themselves, whose mapping to actual letters depends on the locale's
script, so they are not really a visual variant of any script, just
as the Braille script is not tied to Latin), or other "ideographic"
flags displayed much like the pirate flag (e.g. signaling deceases on
board)...




Re: Flag tags (was: Re: Unicode 6.2 to Support the Turkish Lira Sign)

2012-05-31 Thread Philippe Verdy
Also there should exist somewhere a registry of known flag codes.
There are well-known vexillological sites that list large collections of
flags, but so far they have not developed a standard (ASCII-based)
codification.

In my opinion, this codification should need only BASIC LATIN CAPITAL
LETTERs, ASCII digits, the ASCII HYPHEN as a separator for
country/region subcodes, and the colon and dot for versioning/dating,
and it should be based on ISO 3166-1 (using extension/private codes
for historic countries or regions that are not encoded in ISO 3166).
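
To make the character repertoire concrete, here is one possible syntax
check. The grammar itself (a two-letter base code, optional
hyphen-separated subcodes, then an optional colon followed by a
dot-separated numeric version) is an assumption made for this sketch;
the paragraph above only fixes which characters are allowed.

```python
import re

# Hypothetical grammar: CC[-SUB]...[:N[.N]...]
FLAG_CODE = re.compile(
    r"(?P<country>[A-Z]{2})"                   # ISO 3166-1 alpha-2 style base
    r"(?P<subcodes>(?:-[A-Z0-9]+)*)"           # hyphen-separated subcodes
    r"(?::(?P<version>[0-9]+(?:\.[0-9]+)*))?"  # optional version or date
)


def parse_flag_code(code: str):
    """Return the parts of a flag code, or None if it does not match."""
    match = FLAG_CODE.fullmatch(code)
    if match is None:
        return None
    subcodes = [s for s in match.group("subcodes").split("-") if s]
    return {
        "country": match.group("country"),
        "subcodes": subcodes,
        "version": match.group("version"),
    }


print(parse_flag_code("JP:1946"))
print(parse_flag_code("GB-SCT"))
print(parse_flag_code("jp"))  # lowercase is outside the repertoire
```

A real registry would also have to check the base code against
ISO 3166-1 and its own private-use list; this sketch checks syntax only.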

Such a registry should contain a search form for codes, showing the
designs, the preferred aspect ratio, the color mappings, and whether
the flag itself is protected by some copyright restrictions. Such
restrictions won't limit the use of fallback glyphs (showing the
letters of the code in an enclosing blank flag) in free fonts that do
not want to violate them, while those fonts can still define ligatures
for flag designs that are free from those restrictions.

But this registry does not have to be defined and maintained by the
Unicode Consortium or by ISO, unless they have the desire to develop
it. In any case, it is not necessary to make it part of the Unicode
and ISO/IEC 10646 standards themselves (but there could be an
informative reference to the registry, to help font developers).



Re: Flag tags (was: Re: Unicode 6.2 to Support the Turkish Lira Sign)

2012-05-31 Thread Philippe Verdy
2012/5/31 Doug Ewell:
> A seemingly straightforward solution to the “unambiguous mapping” problem
> would be to use the existing Plane 14 tag letters along with a new FLAG TAG,
> say at U+E0002. Then  would unequivocally denote the
> current Swiss flag. No need for separate lead and trail. Simple.
>
> ... What’s that? Oh, sorry, never mind. Deprecated.

Not necessarily: you could very well have sets of characters with
unambiguous glyphs showing the ASCII capital letter partly enclosed:
- in a first set, it encloses the letter on the left/top/bottom sides
with the strokes that start displaying the flag (this glyph could also
include the pole)
- in a second set, it encloses the letter only on the top/bottom sides
- in the third set, it encloses the letter only on the top/bottom/right sides.

Let's not forget that even if countries do not change, and keep their
ISO 3166-1 code, their flag may change over time. So a flag encoded
with such characters should contain the year of its first official
use: this would require mapping the colon ":" and the digits in the
second set for specifying the year, and mapping the digits in the last
set as well. The colon and digits are "a priori" not needed in the
first set.

So to represent the flag of Japan, you could encode:

FLAG INITIAL SYMBOL J
FLAG FINAL SYMBOL P

But if you want to use explicitly the post-1945 flag (and not the
imperial flag with sun rays), you would encode:

FLAG INITIAL SYMBOL J
FLAG MEDIAL SYMBOL P
FLAG MEDIAL SYMBOL COLON
FLAG MEDIAL SYMBOL ONE
FLAG MEDIAL SYMBOL NINE
FLAG MEDIAL SYMBOL FOUR
FLAG FINAL SYMBOL SIX

Which would render mostly like this, if there's no ligature defined
(several lines used here to approximate the glyphs):

 +–––\
  | J P : 1 9 4 6   >
 +–––/
  |

Here again, a font-defined ligature (if available) could remap it to
the actual flag.
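
The mapping from an ASCII code to these symbols is mechanical. The
sketch below assumes that the first character always comes from the
initial set, the last from the final set and everything in between
from the medial set, with the repertoire of each set as described
above. The character names follow the pattern used in the examples;
the function name is invented.

```python
import string

DIGIT_NAMES = dict(zip(string.digits, (
    "ZERO", "ONE", "TWO", "THREE", "FOUR",
    "FIVE", "SIX", "SEVEN", "EIGHT", "NINE")))
SYMBOL_NAMES = {**{c: c for c in string.ascii_uppercase},
                **DIGIT_NAMES, ":": "COLON"}

# Repertoire of each set: the colon is medial only, digits never initial.
ALLOWED = {
    "INITIAL": set(string.ascii_uppercase),
    "MEDIAL": set(string.ascii_uppercase) | set(string.digits) | {":"},
    "FINAL": set(string.ascii_uppercase) | set(string.digits),
}


def flag_symbol_names(code: str) -> list:
    """Return the proposed character names for an ASCII flag code."""
    if len(code) < 2:
        raise ValueError("a flag code needs an initial and a final symbol")
    names = []
    last = len(code) - 1
    for index, char in enumerate(code):
        position = ("INITIAL" if index == 0
                    else "FINAL" if index == last else "MEDIAL")
        if char not in ALLOWED[position]:
            raise ValueError(f"{char!r} is not in the {position} set")
        names.append(f"FLAG {position} SYMBOL {SYMBOL_NAMES[char]}")
    return names


for name in flag_symbol_names("JP:1946"):
    print(name)
```

A renderer would look the whole sequence up in the font's ligature
table first and fall back to these enclosed letters only if no
ligature exists.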

A font can then easily be made, with the only constraint that the
glyphs in it should join these enclosures. If needed, such fonts
can then create ligatures for well-known flags, showing their apparent
geometry. The pole could also be removed, and colors added if
supported by the font technology, or replaced by hatches in a basic
monochromatic font technology.

All these would remain standard symbols (they are superficially
"letter-like", except that the standard can say that the letters shown
in the enclosing glyphs are only used as a default fallback, and
ligatures can SAFELY replace them by the actual flag, including
its true colors). The renderer can use the color capabilities of fonts,
if the font format supports them, or a set of icons (e.g. encoded in a
zipped archive containing SVG files and a small mapping file
associating the flag codes with the names of the SVG files, or within a
single SVG file containing this mapping internally and mapping each
code to an internal XML anchor ID using standard XML hrefs).
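
The zipped-archive variant could look roughly like this. The file name
"mapping.json", the JSON format and both function names are
assumptions for the sketch; nothing of the kind is standardized.

```python
import io
import json
import zipfile

JP_SVG = (
    '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 3 2">'
    '<rect width="3" height="2" fill="#fff"/>'
    '<circle cx="1.5" cy="1" r="0.6" fill="#bc002d"/></svg>'
)


def build_archive(flags: dict) -> bytes:
    """Pack {flag code: SVG text} into a zip with a code-to-file mapping."""
    buffer = io.BytesIO()
    mapping = {}
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for code, svg in flags.items():
            filename = code.replace(":", "_") + ".svg"
            archive.writestr(filename, svg)
            mapping[code] = filename
        archive.writestr("mapping.json", json.dumps(mapping))
    return buffer.getvalue()


def load_flag(archive_bytes: bytes, code: str):
    """Return the SVG text for a flag code, or None if it is not mapped."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        mapping = json.loads(archive.read("mapping.json"))
        filename = mapping.get(code)
        if filename is None:
            return None  # caller falls back to the enclosed-letter glyphs
        return archive.read(filename).decode("utf-8")


data = build_archive({"JP:1946": JP_SVG})
print(load_flag(data, "JP:1946") == JP_SVG)
print(load_flag(data, "CH"))
```

Returning None for an unmapped code keeps the fallback decision with
the renderer, which can then show the code letters in a blank flag.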

Note that OpenType currently does not contain any standard way to
map true colors used in glyphs, but there's nothing in OpenType that
prevents a font from exposing several glyph variants for the same
characters (or their defined ligatures): a monochromatic version like
today, and, with a new OpenType feature, a colorful version, with an
extra table in the font that exposes the color mapping either
to an sRGBA color, or to a hatched filling pattern also exposed
by the font as a rectangle glyph with metrics (and possibly an angle
relative to the baseline).

I am surprised that OpenType still does not include such a
standard. Note that hatching patterns would be defined using the
em-square of glyphs assigned to characters and ligatures, so they would
scale the same way, and would be grid-fitted and hinted the same way.
A separate definition of patterns would simplify the design of colored
fonts, as the same glyph geometries would be used. But a separate
monochromatic glyph could also be defined in the
same font, in such a way that the pattern is
integrated into the glyph's geometry.

And CSS, for example, could specify a way to indicate that the
rendered characters should not use an sRGBA color (with the hatching
pattern defined in the font) but the "natural" colors defined by the
glyphs themselves: this would require only a new value, "color:
natural". If the font does not define any "natural" color for its
mapped glyphs, or the glyphs do not map any hatching patterns, then
this CSS value would be interpreted as if it were "color: inherit". An
extended version could also be "color: natural #rrggbb": the #rrggbb
would still allow specifying the color to use if there's no natural
color in the font, or if the colors defined in the font (those that
are marked as being "important") are incompatible with, or not easily
distinguished from, the current background (according to the user's
preferences), or not accessible to the user (also according to his
preferences): in which case the renderer would use the hatching