date:20120530

Re: Unicode 6.2 to Support the Turkish Lira Sign

2012-05-30 Thread Philippe Verdy

2012/5/30 "Martin J. Dürst" :
> On 2012/05/30 4:42, Roozbeh Pournader wrote:
>
>> Just look what happened when the Japanese did their own font/character set
>> hack. The backslash/yen problem is still with us, to this day...
>
>
> To be fair, the Japanese Yen at 0x5C was there long before Unicode, in the
> Japanese version of ISO 646. That it has remained as a font hack is very
> unfortunate, but for that, not only the Japanese, but also major
> international vendors are to blame.

As long as it was part of the Japanese version of ISO 646 (which
itself was only the first page of the SJIS encoding), there was
absolutely NO problem at all. This was not different from the
situation of all other national versions of ISO 646, which were all
distinct encodings.

The situation became a problem when the Japanese ISO 646 started to be
mapped to Unicode/ISO/IEC 10646 within fonts using incorrect mappings.
This occurred in the early stages of ISO/IEC 10646 development.

And unfortunately several OSes for Japan used those incorrect
mappings, assuming that it was still safe to blindly convert text
containing backslashes by showing yen symbols instead, just as the
same systems blindly converted US-ASCII (the American version of ISO 646)
into SJIS with broken algorithms, simply because that software could
not really work with Unicode but still worked only with SJIS, and did
not correctly track which source encoding was used.

This would probably not have occurred if Japan had defined and
standardized an ISO 8859 part mapping the yen outside the ASCII range
(along with basic kana letters and Asian punctuation); but they
preferred to develop only SJIS to support kanji (and later the
emerging UCS remapped on it). It would also have offered an easier
migration.

They were ambitious at the beginning, but the ambition was premature
when the surrounding technologies to support a large character set were
still very incomplete (forcing a lot of software to use unsafe/lossy
remappings to smaller character sets). So for several decades,
there have been a lot of interoperability problems caused by the
various implementations of SJIS, many of them not compatible with each
other in their limitations or in the way the "simplifications" were
applied to support different parts of it.

The backslash character, though it was common in many programming
languages and OSes, then appeared to be replaced there by the yen
symbol, and people were trained with it (for example when using
pathnames in DOS/Windows filesystems, or when using the yen symbol as
the escaping prefix when programming in C/C++); and it was then
perceived that the backslash was for them a variant form (of their yen
symbol) that they did not need (SJIS was later adapted to map the
backslash somewhere else, but the SJIS users did not immediately fix
it).

As a result, the mapping of 0x5C in SJIS has always been ambiguous,
depending on the implementations, but it has never been ambiguous in
the Japanese version of ISO 646, that did not include the backslash.
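
To make the distinction concrete, here is a minimal Python sketch. The charset labels and the one-entry override tables are assumptions made for illustration; they are not real codec names, and only byte 0x5C is modelled (the real Japanese set also differs from US-ASCII at 0x7E). The point is only that the byte is unambiguous as long as the charset is known.

```python
import unicodedata

# Illustrative labels and tables (assumed for this sketch, not real codecs).
# Only 0x5C is modelled; every other 7-bit byte is treated like US-ASCII.
OVERRIDES = {
    "us-ascii": {0x5C: "\u005C"},   # REVERSE SOLIDUS (backslash)
    "iso646-jp": {0x5C: "\u00A5"},  # YEN SIGN
}


def decode_7bit(data: bytes, charset: str) -> str:
    """Decode 7-bit data using the per-charset override for 0x5C."""
    table = OVERRIDES[charset]
    out = []
    for b in data:
        if b > 0x7F:
            raise ValueError(f"byte 0x{b:02X} is not 7-bit")
        out.append(table.get(b, chr(b)))
    return "".join(out)


if __name__ == "__main__":
    path = b"C:\\TEMP"
    for charset in OVERRIDES:
        text = decode_7bit(path, charset)
        # text[2] is whatever byte 0x5C decoded to under this charset
        print(charset, text, unicodedata.name(text[2]))
```

With the charset tracked, each decoding is well defined; the trouble described above only starts once that label is lost or ignored.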

So don't criticize ISO 646, there was no problem there. The problem is
fully within the early versions of SJIS which allowed such variation
of glyphs, when it should have considered the yen symbol and the
backslash as distinct abstract characters requiring separate mappings.

But who uses the Japanese version of ISO 646 in Japan now? Only SJIS
seems to survive, with all its intrinsic ambiguities and its many
incompatible implementations (whose exact versions are most often not
identified correctly by software).

The Japanese NB should have stopped this nightmare by setting a rule to
strongly deprecate the ambiguous versions (and withdraw all past
recommendations), so that only one version of SJIS would survive, and
old data encoded with an ambiguous SJIS version would be left in its
black box:

It would have been simpler and more effective for the Japanese NB to
rename the SJIS standard for the only remaining version, such as
"UJIS" ("U" for "Universal", meaning that it has full roundtrip
compatibility with the UCS and no longer allows any ambiguity) and
then freeze it completely in this state (all other developments being
made in the UCS), with a strong recommendation NOT to perform any
blind conversion to UJIS or interpretation as UJIS of any past data
encoded in an unversioned SJIS: all ambiguous characters in these
old data should be detected as ambiguous, meaning that the
document/data is not convertible without proper versioning.
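
A rough Python sketch of that "detect, do not guess" rule follows. Everything named here is an assumption for illustration: the lead-byte ranges (0x81-0x9F and 0xE0-0xFC), the choice to flag only single-byte 0x5C, the exception name, and the version label used in the example. A real check would need the full list of code points that differ between SJIS variants.

```python
from typing import List, Optional

# Bytes treated as ambiguous when they stand alone (assumed: only 0x5C here).
AMBIGUOUS_SINGLE_BYTES = {0x5C}


class AmbiguousSJISError(ValueError):
    """Raised when unversioned data contains bytes whose meaning is unknown."""


def is_lead_byte(b: int) -> bool:
    # Lead-byte ranges assumed for this sketch.
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def find_ambiguous_offsets(data: bytes) -> List[int]:
    """Return offsets of single-byte 0x5C, skipping trail bytes."""
    offsets = []
    i = 0
    while i < len(data):
        b = data[i]
        if is_lead_byte(b):
            # The trail byte may itself be 0x5C (e.g. 0x95 0x5C); that is
            # part of a double-byte character and is not ambiguous.
            i += 2
            continue
        if b in AMBIGUOUS_SINGLE_BYTES:
            offsets.append(i)
        i += 1
    return offsets


def check_convertible(data: bytes, version: Optional[str] = None) -> None:
    """Refuse blind conversion of unversioned data with ambiguous bytes."""
    if version is None:
        offsets = find_ambiguous_offsets(data)
        if offsets:
            raise AmbiguousSJISError(
                f"unversioned SJIS data, ambiguous bytes at offsets {offsets}"
            )


if __name__ == "__main__":
    check_convertible(b"C:\\DOS", version="hypothetical-vendor-v1")  # accepted
    try:
        check_convertible(b"C:\\DOS")
    except AmbiguousSJISError as err:
        print("rejected:", err)
```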

This would also have forced the various private software makers and
manufacturers that had used their own version of SJIS to register
again with the Japanese NB a SINGLE (and unique) string recommended to
identify their implementation of SJIS, removing all past known aliases
that were also ambiguous between each other, so that the effective
encoding of old data could be uniquely identified and would then become
uniquely convertible first to the national standard UJIS, then to the
UCS by its warran

Re: Flag tags (was: Re: Unicode 6.2 to Support the Turkish Lira Sign)

2012-05-30 Thread Doug Ewell

A seemingly straightforward solution to the “unambiguous mapping” problem would 
be to use the existing Plane 14 tag letters along with a new FLAG TAG, say at 
U+E0002. Then  would unequivocally denote the current 
Swiss flag. No need for separate lead and trail. Simple.

... What’s that? Oh, sorry, never mind. Deprecated.
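
For illustration only, the sequence such a scheme would produce can be sketched in Python. The existing Plane 14 tag characters mirror ASCII at an offset of 0xE0000; the FLAG TAG at U+E0002 is the hypothetical character suggested above and is not an assigned code point. The region code "ch" and the use of lowercase tag letters are assumptions of this sketch.

```python
TAG_BASE = 0xE0000                    # tag characters mirror ASCII at this offset
HYPOTHETICAL_FLAG_TAG = "\U000E0002"  # suggested above; NOT an assigned character


def to_tag_letters(code: str) -> str:
    """Map ASCII letters to the corresponding Plane 14 tag letters."""
    if not (code.isascii() and code.isalpha()):
        raise ValueError("expected ASCII letters only")
    # Lowercase is an assumption of this sketch.
    return "".join(chr(TAG_BASE + ord(c)) for c in code.lower())


def flag_tag_sequence(region: str) -> str:
    """Build the hypothetical FLAG TAG sequence for a region code."""
    return HYPOTHETICAL_FLAG_TAG + to_tag_letters(region)


if __name__ == "__main__":
    seq = flag_tag_sequence("ch")
    print(" ".join(f"U+{ord(c):04X}" for c in seq))
```

Because the lead character differs from the letters that follow it, a parser landing anywhere in the sequence can tell where a flag starts.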

--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell 



From: Mark Davis ☕ 
Sent: Wednesday, May 30, 2012 17:24
To: Martin J. Dürst 
Cc: unicode Unicode Discussion 
Subject: Re: Flag tags (was: Re: Unicode 6.2 to Support the Turkish Lira Sign)

There is definitely a problem. 


The origin is complicated. All that anyone really needed were 10 characters for 
emoji flags, encoded as compatibility characters. However, certain people (I'll 
call Completionists) who think that if you encode one member of a set (even for 
compatibility characters!), you need to encode all of them. So the request 
expanded from 10 to all countries, then to all possible countries. And I 
wouldn't be surprised to have them then push for state/provincial flags for 
completeness, and who knows, maybe someday municipal flags (my old town 
http://en.wikipedia.org/wiki/File:Sargans-coat_of_arms.svg). 


So, some people came up with a way to handle this, using combinations of 
special characters. The only problem is that we didn't have lead and trail 
characters separately defined, to allow for an unambiguous mapping.





Mark


— Il meglio è l’inimico del bene —




On Tue, May 29, 2012 at 2:38 AM, "Martin J. Dürst"  
wrote: 


... 

  On a slightly (although maybe only slightly) related matter, what about if 
Unicode didn't judge how difficult it should be to display national flags. 
Creating a way to display flags from two-tag combinations and then later 
realizing that a sequence of such tags didn't locally parse, and the whole 
thing has to be redone, doesn't seem like a very good alternative to just 
encoding these things (not that I think that just encoding these is a very good 
alternative either, though).

  Regards,   Martin.

Re: Flag tags

2012-05-30 Thread Asmus Freytag


On 5/30/2012 7:19 PM, Philippe Verdy wrote:
> 2012/5/31 Michael Everson:
>> On 31 May 2012, at 00:24, Mark Davis ☕ wrote:
>> Members of ISO National Bodies quite properly thought that it is
>> inappropriate for an International Standard to encode the flags of some
>> countries and not the flags of others. You can stuff your condescension, Mark.
>
> I fully agree. Either all of them or none of them (or just a generic
> white flag).

No, at least the black pirate flag, and the checkered flag (for car racing).

Those would constitute the minimum useful set.

A./

Re: Preliminary proposal to encode Unifon in the UCS.

2012-05-30 Thread Benjamin M Scarborough

Actually, I just noticed that Hupa and Yurok have TLE sorted after Y, so point 
ϛʹ is moot.

—Ben Scarborough

Re: Preliminary proposal to encode Unifon in the UCS.

2012-05-30 Thread Benjamin M Scarborough

I do have a few comments and questions I'd like to make about N4262.

αʹ) I think LATIN LETTER TURNED-E R should be disunified from U+025A LATIN 
LETTER SCHWA WITH HOOK. I don't think the identity of the new capital character 
matches the established identity of U+025A. Of the five glyphs provided for 
LATIN SMALL LETTER TURNED-E R, I think the first one is the best choice. The 
second glyph resembles ɚ too closely (confusable!), and the other three use a 
small capital r which doesn't seem fitting.

βʹ) Should the glyph for LATIN SMALL LETTER CHE extend below the baseline, like 
in the Metelko alphabet? Obviously this doesn't matter for Unifon, where the 
character will appear as a small capital anyway. However, this could make it 
look too similar to U+0265 LATIN SMALL LETTER TURNED H.

γʹ) On page 7, there are two characters that "derive from earlier versions of 
Unifon." The letter on the right is clearly U+023D LATIN CAPITAL LETTER L WITH 
BAR, but the character on the left is discussed nowhere else in the document. 
What is it? I honestly can't tell.

δʹ) In the Lepsius text example on page 5, on the sixth line I see a 
delta-looking symbol. I assume this is U+1E9F LATIN SMALL LETTER DELTA. Since 
this is normally-cased text, is there any evidence of a LATIN CAPITAL LETTER 
DELTA, or is this particular letter just an anomaly?

εʹ) LATIN LETTER OVERTURNED WINEGLASS stands out to me as an odd character 
name. I know that a few other characters, such as U+0264 LATIN SMALL LETTER 
RAMS HORN, have such illustrative names, but this still seems like an odd name 
choice to me. However, I cannot think of a more fitting name.

ϛʹ) The only Unifon alphabets that use LATIN LETTER TLE put it at the very 
beginning of the alphabet. Will the finished proposal sort TLE before A? Could 
this have a negative impact on collation? (I notice that N4262 does not address 
the issue of collation for any character.)

That's all I can think of for now.

—Ben Scarborough

Re: Flag tags (was: Re: Unicode 6.2 to Support the Turkish Lira Sign)

2012-05-30 Thread Philippe Verdy

2012/5/31 Michael Everson :
> On 31 May 2012, at 00:24, Mark Davis ☕ wrote:
> Members of ISO National Bodies quite properly thought that it is 
> inappropriate for an International Standard to encode the flags of some 
> countries and not the flags of others. You can stuff your condescension, Mark.

I fully agree. Either all of them or none of them (or just a generic
white flag).

>> So the request expanded from 10 to all countries, then to all possible 
>> countries.
>
> THIS is the actual proposal, Mark: 
> http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3680.pdf which you should have cited 
> while you were rubbishing it. In my judgement it is better than the kludge 
> that Martin criticized.

The interpretation of "all countries" has political and historical
consequences:

How many countries? Only those that are UN members TODAY, excluding
historic UN members and other historic countries that have had various
flags before adopting a standard one long before the UN (or the former
League of Nations) ever existed?

And what about existing countries that are not UN members? We should
then add UN observers, plus countries that are still recognized by the
UN but not existing in their territories (e.g. Western Sahara), or
whose existence is contested but which really have a local administration
not obeying the legislation of the other UN member that claims
authority over them (e.g. Taiwan, Palestine, or the secessionist part
of Moldova)?

And do all countries really have an official flag ?

And should we also include flags that are legalized in dependencies of
another country (e.g. in the French dependency of New Caledonia, with
its special and transitory autonomy status) ? Then what about the
non-official flags used informally in those dependencies (see for
example flags used in Martinique), which we often see as flags
representing Internet domains or ISO 3166-1 entities ?

For Unicode, the guide should be based on actual usage of these flags,
but there will be vetoes at ISO... This is a difficult issue, for
which the current solution based on images embedded in rich text
still works. For plain text, we still have country codes, country
names, possibly with surrounding punctuation marks, and in my opinion
that is enough.

Finally, many countries have flags that are confusable if colors
are not very precisely represented (some of them differ only by a small
shading difference in one of their colors, using exactly the same
geometries; tricolors are the most frequent cases, with the additional
problem that some flags are normally shown vertically when others are
shown horizontally, or may even be in the opposite direction, or in
all possible directions where they will create new confusions).

So I'd advocate for now only very few flags : only generic white or
black flag symbols, not representing any particular political entity.

The alternative would be to encode a pair of special punctuation marks
surrounding an ISO 3166-1 code, so that renderers would draw either
these special flag-like brackets around the code, or the code within
a generic white flag, or substitute an image...
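
As a rough sketch of how a renderer might treat such a bracketed code, assuming two private-use code points (U+E000 and U+E001) as stand-ins for the bracket characters, which do not exist: the function below substitutes an image reference when one is available and otherwise falls back to showing the code in plain brackets. The image table and the output notation are also made up for the example.

```python
import re
from typing import Dict, Optional

# Private-use stand-ins for the proposed flag brackets (hypothetical).
FLAG_OPEN = "\uE000"
FLAG_CLOSE = "\uE001"
FLAG_RE = re.compile(FLAG_OPEN + r"([A-Z]{2})" + FLAG_CLOSE)


def render(text: str, images: Optional[Dict[str, str]] = None) -> str:
    """Replace bracketed ISO 3166-1 codes with an image or a plain fallback."""
    images = images or {}

    def replace(match: "re.Match[str]") -> str:
        code = match.group(1)
        if code in images:
            return f"[img:{images[code]}]"  # renderer has an image for this code
        return f"[{code}]"                  # fallback: the code inside brackets

    return FLAG_RE.sub(replace, text)


if __name__ == "__main__":
    sample = "Made in " + FLAG_OPEN + "CH" + FLAG_CLOSE
    print(render(sample))
    print(render(sample, {"CH": "ch.png"}))
```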

Re: Flag tags (was: Re: Unicode 6.2 to Support the Turkish Lira Sign)

2012-05-30 Thread Michael Everson

On 31 May 2012, at 00:24, Mark Davis ☕ wrote:

> There is definitely a problem. 

The problem is the condescending revisionism you are about to indulge in, Mark. 

> The origin is complicated. All that anyone really needed were 10 characters 
> for emoji flags, encoded as compatibility characters.

This is incorrect. All that some guys in Japan who had never thought about 
character encoding did was to encode a few flags that they liked into some 
proprietary telephone standards. Then those flags started leaking into e-mails 
and one large company that you happen to work for decided that they needed to 
encode these blorts that the guys in Japan had put into their phones. 

> However, certain people (I'll call Completionists)

This is condescending and offensive. 

> who think that if you encode one member of a set (even for compatibility 
> characters!), you need to encode all of them.

Members of ISO National Bodies quite properly thought that it is inappropriate 
for an International Standard to encode the flags of some countries and not the 
flags of others. You can stuff your condescension, Mark. 

> So the request expanded from 10 to all countries, then to all possible 
> countries.

THIS is the actual proposal, Mark: 
http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3680.pdf which you should have cited 
while you were rubbishing it. In my judgement it is better than the kludge that 
Martin criticized.

> And I wouldn't be surprised to have them then push for state/provincial flags 
> for completeness, and who knows, maybe someday municipal flags (my old town 
> http://en.wikipedia.org/wiki/File:Sargans-coat_of_arms.svg).

You ALWAYS jump to this unwarranted conclusion, Mark. There are, outside of the 
UTC, people who are concerned that the symbol sets encoded are coherent and 
useful. We do not believe that "compatibility with industry" is always 
sufficient. 

What you have said is insulting to the German and Irish National Bodies who 
proposed N3680, and to their representatives. 

> So, some people came up with a way to handle this, using combinations of 
> special characters. The only problem is that we didn't have lead and trail 
> characters separately defined, to allow for an unambiguous mapping.

No, we came up with a good sensible scheme, and you wouldn't have it, so we the 
committees came up with the kludge of combining letters in pairs to represent 
these flags. That's what Martin is complaining about, and it's the UTC's fault 
that the scheme is complex rather than simple.

As far as your thesis that "ten flags are enough", that does not fly in an 
International Standard, and you ought well to know it.

Michael Everson * http://www.evertype.com/

Re: Flag tags (was: Re: Unicode 6.2 to Support the Turkish Lira Sign)

2012-05-30 Thread Mark Davis ☕

There is definitely a problem.

The origin is complicated. All that anyone really needed were 10 characters
for emoji flags, encoded as compatibility characters. However, certain
people (I'll call Completionists) who think that if you encode one member
of a set (even for compatibility characters!), you need to encode all of
them. So the request expanded from 10 to all countries, then to all
possible countries. And I wouldn't be surprised to have them then push for
state/provincial flags for completeness, and who knows, maybe someday
municipal flags (my old town
http://en.wikipedia.org/wiki/File:Sargans-coat_of_arms.svg).

So, some people came up with a way to handle this, using combinations of
special characters. The only problem is that we didn't have lead and trail
characters separately defined, to allow for an unambiguous mapping.
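
A short Python sketch of why the missing lead/trail distinction matters, using the regional indicator symbols (U+1F1E6 to U+1F1FF) that stand for the letters A to Z. The helper names and the sample run of three region codes are made up for the example. Since any symbol may be the first or the second half of a pair, the pairing depends on where parsing starts.

```python
from typing import List

RI_BASE = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def to_regional_indicators(code: str) -> str:
    """Map ASCII letters A-Z to regional indicator symbols."""
    return "".join(chr(RI_BASE + ord(c) - ord("A")) for c in code.upper())


def from_regional_indicators(text: str) -> str:
    """Map regional indicator symbols back to ASCII letters."""
    return "".join(chr(ord(c) - RI_BASE + ord("A")) for c in text)


def pair_up(text: str, start: int = 0) -> List[str]:
    """Pair symbols two by two, beginning at offset `start`."""
    letters = from_regional_indicators(text[start:])
    return [letters[i:i + 2] for i in range(0, len(letters), 2)]


if __name__ == "__main__":
    run = "".join(to_regional_indicators(c) for c in ("CH", "DE", "IE"))
    print(pair_up(run, 0))  # parsing from the start of the run
    print(pair_up(run, 1))  # parsing from one symbol in: different pairs
```

Starting one symbol late yields a different set of codes, so a process that lands in the middle of a run has to scan back to the start of the run to recover the intended pairs.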

--
Mark

— Il meglio è l’inimico del bene —

On Tue, May 29, 2012 at 2:38 AM, "Martin J. Dürst"
wrote:
>
>
>> ...

>
> On a slightly (although maybe only slightly) related matter, what about if
> Unicode didn't judge how difficult it should be to display national flags.
> Creating a way to display flags from two-tag combinations and then later
> realizing that a sequence of such tags didn't locally parse, and the whole
> thing has to be redone, doesn't seem like a very good alternative to just
> encoding these things (not that I think that just encoding these is a very
> good alternative either, though).
>
> Regards,   Martin.
>
>

Re: Unicode 6.2 to Support the Turkish Lira Sign

2012-05-30 Thread Asmus Freytag


On 5/29/2012 9:34 PM, Jukka K. Korpela wrote:

> For comparison: The design of the euro sign was published in 1996. It
> was added to Unicode in version 2.1 in 1998. As physical money, notes
> and coins, the euro was taken into use in 2002. Considerable resources
> were spent on the introduction of the euro sign, as part of a very
> large process of introducing the euro currency. Now, over ten years
> later, the adoption of the euro sign is still incomplete. In informal and
> formal texts, printed and online, not to mention receipts and other
> documents generated by various systems, “eur”, “EUR”, “e”, “E”, and
> simple omission of currency denotation are still very common.

EUR is like using USD for $ - it may be done for other reasons than font 
issues.


That aside, while ALL changes to character encoding have a long 
trail of incompatible support, the fact is that the Euro is correctly 
displayed in millions if not billions of documents and websites. And 
that this began pretty much immediately across large parts of Europe.


None of this would have been any easier by *waiting* with encoding a 
character - or refusal by the character encoding committees to act, 
based on some principled objections to the design of the symbol or a 
myriad of other specious reasons that some people seem to delight in 
raising.


A./

PS: I fully agree with the more large-picture part of your post: adding 
a character code merely acts as an enabler - it does not actually 
deliver the support. And yes, some people do forget that on occasion. 
But this is not one of them.

Re: Preliminary proposal to encode Unifon in the UCS.

2012-05-30 Thread Michael Everson

On 30 May 2012, at 20:46, Doug Ewell wrote:

> N4262 says the same, and so do practically all proposal forms in response to 
> that question, no matter how similar any of the characters are to others in 
> appearance or function. I think authors know it's a big red flag if they say 
> "Yes."

That, or we don't really care about any but the lines in the form which are 
actually looked at when a script is discussed in WG2, namely the block name and 
character count. 

Michael Everson * http://www.evertype.com/

RE: Preliminary proposal to encode Unifon in the UCS.

2012-05-30 Thread Doug Ewell

Michael Everson  wrote:

>> “10a. Can any of the proposed character(s) be considered to be
>> similar (in appearance or function) to an existing character?”
>> “No.”
>> I’m a little surprised. If the 2nd possibility was envisioned, isn’t
>> it because many Unifon letters are similar in appearance and often in
>> function with some capital Latin letters?
>
> I didn't bother with that in an exploratory proposal.

N4262 says the same, and so do practically all proposal forms in
response to that question, no matter how similar any of the characters
are to others in appearance or function. I think authors know it's a big
red flag if they say "Yes."

--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell

Re: [OT] Re: Exact positioning of Indian Rupee symbol according to Unicode Technical Committee

2012-05-30 Thread Richard Wordingham

On Tue, 29 May 2012 12:52:12 -0700
"Doug Ewell"  wrote:

> And yes, of course it's possible to stack an entire new layer on top
> of the existing Windows key architecture, as Keyman does. Maybe that
> is the long-term solution, but I haven't heard that MS is planning to
> go that route.

I'm confused by this technology discussion.  I thought the Windows
'Text Services Framework' was already available as a ready way to
implement one's own IME.  For example, it's used by an open source
Keyman for Linux (KMFL) implementation for Windows by the name of
Ekaya, and at the very least it has been used to implement a brute force
method of entering BMP characters by hex. (At least, it sounded like
brute force to me.)

Richard.

Preliminary proposal to encode Unifon in the UCS.

2012-05-30 Thread Michael Everson


On 28 May 2012, at 01:33, Jean-François Colson wrote:

> Le 05/03/12 19:34, Michael Everson a écrit :
>> Comment is invited.
>> 
>> http://std.dkuug.dk/JTC1/SC2/WG2/docs/n4195.pdf
>> 
>> I have had some feedback from the UTC already.
>> 
>> Michael Everson * http://www.evertype.com/
> 
> Hello
> 
> I’m sorry I missed your post. I hope it’s not too late to comment.
> 
> You wrote: “Unifon was adapted principally by Tom Parsons of Humboldt State 
> University to provide a practical orthography for several the Hupa, Yurok, 
> Tolowa, and Karok  languages.”
> Shouldn’t you remove the word “several”?

Yes. That may have been corrected in the second version. I will check anyway.

> Encoding model.
> 1st possibility: a separate script. There’ll be no problem.

There would, because the bulk of the script would look just like Latin, and the 
encoding committees consider this to be a security issue for internet spoofing 
for instance.

> 2nd possibility: Latin extensions. We’ll have to format the text in lowercase 
> to get correct small Unifon letters. The situation won’t be a 
> lot better than today.

I don't see how. Firstly most Unifon text is in all caps. So the minority of 
text which would have to be styled in small caps would not be problematic. 

> Now, in mixed text (where we used both Latin and Unifon letters), we must 
> change the font whenever we change the alphabet.

Yes, but if Unifon were unified with Latin then a single font would serve them 
both, so long as there were no overunifications for a few of the letters. 

> If the 2nd possibility is chosen, we’ll have to change the format (either 
> standard text or small caps) endlessly.

Only if you were mixing casing Unifon orthography with regular English text. 

> I wonder whether we could imagine a 3rd possibility: use Latin letters with a 
> variation selector which would be interpreted as “The preceding 
> letter is a Unifon letter. The lowercase should be displayed as a small 
> capital.” The VS could be encoded automatically by the keyboard driver.

That would not tempt either me or the encoding committees. It would simply be 
adding another mechanism to produce small caps. 

> Combining diacritical marks.
> Why did you keep the letters hah, kah, ghah and xah in the chart?  Shouldn’t 
> you remove them? They could be written with the combining  diacritics.

The exploratory proposal you reviewed was not a proposal per se. Those were 
listed just so people could see the repertoire. 

> On page 5, I read “UNIFON CAPITAL LETTER THIING” and “UNIFON SMALL 
> LETTER THIING” both with a double I. Is it “THIING” or “THING”?

THING, though these names may change. 

> On the same page and the following one, the subtitles “Archaic small letters” 
> are just before the letters with macron while they should 
> be, IIRC, between ewe and chay.

Yes, well, people make mistakes. :-)

> “Figure 8. The Unifon alphabet for Hupe.” Shouldn’t that be “Hupa”?

Yes.

> “10a. Can any of the proposed character(s) be considered to be similar 
> (in appearance or function) to an existing character?”
> “No.”
> I’m a little surprised. If the 2nd possibility was envisioned, isn’t it  
> because many Unifon letters are similar in appearance and often in 
> function with some capital Latin letters?

I didn't bother with that in an exploratory proposal. 

> I think that’s all I have to say about this proposal.

I'd be grateful if you would review the later proposal, 
http://std.dkuug.dk/jtc1/sc2/wg2/docs/n4262.pdf

Michael Everson * http://www.evertype.com/

Re: Plese add a Chinese Hanzi

2012-05-30 Thread John H. Jenkins

Making a proposal directly to the IRG isn't possible under the present 
procedures.  What's usually done for this kind of thing is to have the UTC 
propose them.  

Andrew West  於 2012年5月30日 上午8:14 寫道：

> I personally think that rather than add characters such as this
> piecemeal, it would be more useful if someone or some organization
> could research what newly devised, unencoded characters are in use in
> biology, chemistry, etc., and make a proposal to encode them all,
> either via the Chinese national body or directly to IRG.  Characters
> used in modern scientific literature should be considered urgent use,
> in my opinion, and encoded sooner rather than later.
> 

=
Hoani H. Tinikini
John H. Jenkins
jenk...@apple.com

Re: Plese add a Chinese Hanzi

2012-05-30 Thread Andrew West

On 30 May 2012 16:12, Andrew West  wrote:
>
>> And (鱼芒) on page 4.
>
> But that is an unencoded simplified form of U+29DF6 𩷶

Oddly, the editors have created a new glyph for the unencoded
simplified form of U+29DF6 𩷶, but have not done the same for U+9BA1 鮡
and U+9B88 鮈 (for which there are no corresponding encoded simplified
forms), which they print using the traditional form characters.  I
suppose this is because they were unaware of the existence of U+29DF6
𩷶 (or did not have an appropriate font), and so had to create a glyph
for the character anyway.

Andrew

Re: Plese add a Chinese Hanzi

2012-05-30 Thread Andrew West

On 30 May 2012 15:30, Michael Everson  wrote:
>
>> http://www.bioline.org.br/pdf?zr09050
>
> (鱼皮) is also found there, on page 3.

That one is already encoded as U+9C8F 鲏 so it is odd that they needed
to create their own custom glyph for it.

> And (鱼芒) on page 4.

But that is an unencoded simplified form of U+29DF6 𩷶

Which just goes to prove my point that someone needs to research these
characters methodically.
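
A first pass at that kind of methodical check could be sketched in Python as below. The lookup table is hand-made and contains only the one finding reported above (鱼皮 is U+9C8F); a real survey would need a complete component-to-character database, which this sketch does not have. Each candidate is written as an ideographic description sequence with U+2FF0 (⿰, left-to-right composition).

```python
import unicodedata
from typing import Dict, Tuple

IDC_LEFT_TO_RIGHT = "\u2FF0"  # ⿰ IDEOGRAPHIC DESCRIPTION CHARACTER LEFT TO RIGHT

# Hand-made table (illustrative only): component pairs already known to be
# encoded. The single entry is the finding reported in this thread.
KNOWN_ENCODED: Dict[Tuple[str, str], int] = {
    ("鱼", "皮"): 0x9C8F,
}


def describe(left: str, right: str) -> str:
    """Report whether a left-right composition is recorded as encoded."""
    ids = IDC_LEFT_TO_RIGHT + left + right
    cp = KNOWN_ENCODED.get((left, right))
    if cp is None:
        return f"{ids}: no encoded character recorded; needs research"
    return f"{ids}: U+{cp:04X} {unicodedata.name(chr(cp))}"


if __name__ == "__main__":
    for pair in (("鱼", "皮"), ("鱼", "丹"), ("鱼", "芒")):
        print(describe(*pair))
```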

Andrew

Re: Plese add a Chinese Hanzi

2012-05-30 Thread Ed Trager

Following up on Andrew West's comment that all such new scientific
characters should be researched and encoded en masse, I would also add
that consideration should be given to encode both the simplified and
traditional versions of such characters in pairs, where merited, as
clearly would be the case with these characters for fishes.

- Ed Trager
http://unifont.org/keycurry/

On Wed, May 30, 2012 at 10:30 AM, Michael Everson  wrote:
> On 30 May 2012, at 15:14, Andrew West wrote:
>
>> I have found examples of the use of this character (鱼丹) in print in
>> the following academic article available on line:
>>
>> "Composition and Status of Fishes of Nanla River in Xishuangbanna,
>> Yunnan, China"
>> ZHENG Lan-ping, CHEN Xiao-yong*, YANG Jun-xing
>> Zoological Research 2009, Jun. 30(3): 334−340
>>
>> http://www.bioline.org.br/pdf?zr09050
>
> (鱼皮) is also found there, on page 3.
>
> Michael Everson * http://www.evertype.com/
>
>
>

Re: Plese add a Chinese Hanzi

2012-05-30 Thread Michael Everson

On 30 May 2012, at 15:14, Andrew West wrote:

> I have found examples of the use of this character (鱼丹) in print in
> the following academic article available on line:
> 
> "Composition and Status of Fishes of Nanla River in Xishuangbanna,
> Yunnan, China"
> ZHENG Lan-ping, CHEN Xiao-yong*, YANG Jun-xing
> Zoological Research 2009, Jun. 30(3): 334−340
> 
> http://www.bioline.org.br/pdf?zr09050

(鱼皮) is also found there, on page 3.

Michael Everson * http://www.evertype.com/

Re: Plese add a Chinese Hanzi

2012-05-30 Thread Michael Everson

And (鱼芒) on page 4.
 
Michael Everson * http://www.evertype.com/

Re: Plese add a Chinese Hanzi

2012-05-30 Thread Andrew West

I have found examples of the use of this character (鱼丹) in print in
the following academic article available on line:

"Composition and Status of Fishes of Nanla River in Xishuangbanna,
Yunnan, China"
ZHENG Lan-ping, CHEN Xiao-yong*, YANG Jun-xing
Zoological Research 2009, Jun. 30(3): 334−340

http://www.bioline.org.br/pdf?zr09050

See page 6: Wang XZ ref.; and Appendix I #1 Danio chrysotaeniata.

I personally think that rather than add characters such as this
piecemeal, it would be more useful if someone or some organization
could research what newly devised, unencoded characters are in use in
biology, chemistry, etc., and make a proposal to encode them all,
either via the Chinese national body or directly to IRG.  Characters
used in modern scientific literature should be considered urgent use,
in my opinion, and encoded sooner rather than later.

Andrew

On 30 May 2012 14:10, shi zhao  wrote:
>
> (鱼丹) pinyin is: dan1.
>
> (鱼丹) is the Chinese name of some fishes.
>
> In chinese:
> Danioninae =  (鱼丹)亚科
> Gymnodanid = 裸 (鱼丹)属
> Gymnodanid strigatus = 条纹裸(鱼丹)
> Danio =  (鱼丹)属
> Danio aequipinnatus = 波条(鱼丹)
> Danio kakhienansis = 红蚌(鱼丹)
> Danio myersi = 麦氏(鱼丹)
> Danio interrupta = 半线(鱼丹)
> Danio apogon = 缺须(鱼丹)
> Danio chrysotaeniatus = 金线(鱼丹)
>
> References:
>
> [1] Xin-Luo Chu, A Preliminary Revision of Fishes of The Genus Danio From
> China, Zoological Research, 1981, 2(2), p 145-154
> [2] CHEN YI-FENG,  HE SHUN-PING, A NEW GENUS AND A NEW SPECIES OF CYPRINID
> FISHES FROM YUNNAN, CHINA (CYPRINIFORMES: CYPRINIDAE: DANIONINAE),  Acta
> Zootaxonomica Sinica, 1992, 17(2), 238-240
> [3] CHEN Min,  A study of the Freshwater Fishes in Guangxi Part
> I.Cyprinidae:Danioninae,Leuciscinae and Cultrinae, Journal of Liuzhou
> Vocational & Technical College, 2001, 1(2), 64-69
>
> PS: In China, scientists sometimes create new Hanzi for a new
> concept or term, especially in fields such as biology and chemistry.
>

Re: Plese add a Chinese Hanzi

2012-05-30 Thread shi zhao

(鱼丹) pinyin is: dan1.

(鱼丹) is the Chinese name of some fishes.

In chinese:
Danioninae =  (鱼丹)亚科
Gymnodanid = 裸 (鱼丹)属
Gymnodanid strigatus = 条纹裸(鱼丹)
Danio =  (鱼丹)属
Danio aequipinnatus = 波条(鱼丹)
Danio kakhienansis = 红蚌(鱼丹)
Danio myersi = 麦氏(鱼丹)
Danio interrupta = 半线(鱼丹)
Danio apogon = 缺须(鱼丹)
Danio chrysotaeniatus = 金线(鱼丹)

References:

[1] Xin-Luo Chu, A Preliminary Revision of Fishes of The Genus Danio From
China, Zoological Research, 1981, 2(2), p 145-154
[2] CHEN YI-FENG,  HE SHUN-PING, A NEW GENUS AND A NEW SPECIES OF
CYPRINID FISHES FROM YUNNAN, CHINA (CYPRINIFORMES: CYPRINIDAE: DANIONINAE),
Acta Zootaxonomica Sinica, 1992, 17(2), 238-240
[3] CHEN Min,  A study of the Freshwater Fishes in Guangxi Part
I.Cyprinidae:Danioninae,Leuciscinae and Cultrinae, Journal of Liuzhou
Vocational & Technical College, 2001, 1(2), 64-69

PS: In China, scientists sometimes create new Hanzi for a new
concept or term, especially in fields such as biology and chemistry.

2012/5/30 Philippe Verdy 

> How do you name that fish? Danio or Devario?
>
> And then how do you transliterate this phonetically in Chinese if you
> can't use the two ideographs in a simple row?
>
> Note that the term "Danio" is vernacular and now deprecated; the
> classification has changed (and research is still in progress)... So
> the important thing is how you would name it (and transcribe it
> phonetically, in Pinyin for example) in Chinese for vernacular
> usage. If this name is in fact not vernacular but comes from an old
> scientific classification, it would probably be better to use its
> scientific Latin name, and then there's no established distinct word
> for these species in the vernacular language (e.g. in shops or
> magazines for aquarists, where the Latin names are probably displayed).
>
> 2012/5/28 Charlie Ruland :
> > * John H. Jenkins  [2012-05-28 20:54]:
> >
> >
> > On 28 May 2012, at 10:21 AM, Charlie Ruland  wrote:
> >
> > Zhao,
> > 1. If the character 鱼⿰丹 that you would like to have encoded is a
> > contemporary Standard Chinese word or morpheme, then what is its
> > pronunciation?
> >
> >
> > FWIW, the correct syntax is ⿰鱼丹.  I take it that he would also like ⿰魚丹.
> >
> >
> > Right. 鱼⿰丹 is the syntax I sometimes use when teaching Chinese
> > characters, whereas ⿰鱼丹 is Unicode. Sorry for the mess!
> > Charlie
> >
> >
> >
> > 2. Can you provide material (for example photos, scans from books, etc.)
> > that clearly shows that 鱼⿰丹 is used as a single character? By which
> > group of people is it used?
> >
> >
> > Exactly.  *No* hanzi will be added to Unicode/ISO 10646 without solid
> > evidence of actual use. Generally, this means authoritative, printed
> > materials (a dictionary, government ID). Handwritten materials could
> > conceivably be used, but they would have to be awfully convincing.
> > Well-known websites with the character embedded *as a graphic* have been
> > used in the past, but in those cases the character was quite well-known.
> >
> >
> > Charlie
> >
> > * shi zhao  [2012-05-28 17:07]:
> >
> > PS:
> >  zh-hans:  鱼+丹
> > zh-hant: 魚+丹
> >
> >
> > 2012/5/28 shi zhao 
> >>
> >> Please add a Hanzi to Unihan: a fish name 鱼+丹 = Danio.
> >>
> >> see:
> >>  https://en.wikipedia.org/wiki/Danio
> >>
> >> https://zh.wikipedia.org/wiki/Category:%28%E9%AD%9A%E4%B8%B9%29%E5%B1%AC
> >> http://www.cnffd.com/index.php?route=product/category&path=3_11_64_284
> >> http://zd1.brim.ac.cn/Mnamelist.asp?start=1982
> >> http://hello.area.com.tw/is_bs.cgi?areacode=nt097&bsid=2.9.1.1.3
> >> https://www.google.com/search?q=Danio+ 魚丹
> >>
> >>
> >> Chinese wikipedia: http://zh.wikipedia.org/
> >> My blog: http://shizhao.org
> >> twitter: https://twitter.com/shizhao
> >>
> >> [[zh:User:Shizhao]]
> >>
> >
> >
> > =
> > Hoani H. Tinikini
> > John H. Jenkins
> > jenk...@apple.com
> >
> >
>
>
>
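
The distinction drawn in the thread above between the informal infix form (鱼⿰丹) and the Unicode Ideographic Description Sequence (⿰鱼丹) can be illustrated with a minimal Python sketch. The helper names `infix_to_ids` and `code_points` are hypothetical and only handle the single left-to-right operator U+2FF0 used in this thread; this is not a general IDS parser.

```python
# U+2FF0 IDEOGRAPHIC DESCRIPTION CHARACTER LEFT TO RIGHT
IDC_LEFT_TO_RIGHT = "\u2FF0"


def infix_to_ids(infix: str) -> str:
    """Turn the informal 3-character form 'A⿰B' into the prefix form '⿰AB'.

    Assumption: exactly one component on each side of U+2FF0.
    """
    if len(infix) != 3 or infix[1] != IDC_LEFT_TO_RIGHT:
        raise ValueError("expected: component, U+2FF0, component")
    left, operator, right = infix
    # In an IDS the description character comes first, then its operands.
    return operator + left + right


def code_points(text: str) -> list:
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


simplified = infix_to_ids("鱼" + IDC_LEFT_TO_RIGHT + "丹")   # zh-hans form
traditional = infix_to_ids("魚" + IDC_LEFT_TO_RIGHT + "丹")  # zh-hant form

print(simplified, code_points(simplified))
print(traditional, code_points(traditional))
```

Such a sequence only describes the shape of an unencoded character; it is not a substitute for an encoded code point.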

Re: [OT] Re: Exact positioning of Indian Rupee symbol according to Unicode Technical Committee

2012-05-30 Thread Jean-François Colson


On 30/05/12 06:26, Jean-François Colson wrote:

On 28/05/12 22:53, Doug Ewell wrote:

Karl Pentzlin wrote:


As said in an earlier posting, part 9995-9 is now at the DIS stage, which
means that its final version will be published in 2013 or 2014. Thus,
national standards referring to this part will hardly be published
before 2015.

Thus, there is enough time for any manufacturer of operating systems
or third-party software suppliers to announce their support of any
keyboard layout compliant with a standard referring to ISO/IEC 9995-9.


Again, just speaking about one platform (Windows) that seems to be in 
somewhat common use, the problem is that the underlying architecture 
doesn't support multiple dead keys on a single base character, nor 
does it support a fifth, sixth, etc. shift state (unless one chooses 
to be reckless and use Ctrl). This is unlikely to change in the next 
two to three years. It isn't a matter of providing a 
layout—otherwise, anyone with MSKLC and a supported Windows version 
could create one.




The only limit I know of for Windows’ dead keys is that they can’t 
handle characters outside the BMP.


… and that there can only be one single character at the output.
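
As a rough illustration of those two limits, here is a small Python sketch. It assumes, as the constraints above suggest, that a dead key's output must fit in exactly one UTF-16 code unit; the function names are made up for this example and do not correspond to any Windows or MSKLC API.

```python
def utf16_units(text: str) -> int:
    """Number of UTF-16 code units needed to encode the text."""
    return len(text.encode("utf-16-le")) // 2


def usable_as_dead_key_output(text: str) -> bool:
    """True when the text is a single BMP character (one UTF-16 code unit).

    Assumption: this models the limits described in this message only.
    """
    return utf16_units(text) == 1


samples = {
    "e acute (precomposed)": "\u00E9",      # BMP, one code unit
    "Turkish lira sign": "\u20BA",          # BMP, one code unit
    "musical G clef": "\U0001D11E",         # outside the BMP, surrogate pair
    "e + combining acute": "e\u0301",       # two characters
}

for name, text in samples.items():
    print(f"{name}: {usable_as_dead_key_output(text)}")
```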

With MSKLC, it is possible to support multiple dead keys on a single 
base character: 
http://msdnrss.thecoderblogs.com/2011/04/chain-chain-chain-chain-of-dead-keys/
(I didn’t say it’s easy: you need to edit the klc file with a text 
editor and compile it manually.)


Using the same technique, you can even make a compose key.
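
To make the idea of chained dead keys concrete, here is a toy Python model. It is only a sketch of the concept, not of the klc file format or of how Windows processes keystrokes: the table `DEAD_KEYS` and the function `type_keys` are hypothetical, and an unmapped key simply cancels the pending chain.

```python
# Each entry maps a key either to a nested table (another dead-key state)
# or to a final single-character output.
DEAD_KEYS = {
    "'": {                      # acute dead key
        "e": "\u00E9",          # é
        "u": "\u00FA",          # ú
        '"': {                  # acute, then diaeresis dead key
            "u": "\u01D8",      # ǘ
        },
    },
    '"': {                      # diaeresis dead key
        "u": "\u00FC",          # ü
    },
}


def type_keys(keys: str) -> str:
    """Simulate typing a sequence of keys through the dead-key table."""
    output = []
    state = DEAD_KEYS
    for key in keys:
        target = state.get(key)
        if isinstance(target, dict):
            state = target              # go one level deeper in the chain
        elif isinstance(target, str):
            output.append(target)       # chain completed: emit one character
            state = DEAD_KEYS
        else:
            output.append(key)          # no mapping: emit the key itself
            state = DEAD_KEYS
    return "".join(output)


print(type_keys("'e"))      # acute + e
print(type_keys("'\"u"))    # acute + diaeresis + u
```

A compose key fits the same model: it would be one more top-level entry whose nested tables spell out the compose sequences.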

And for the 5th and 6th layers, perhaps you could look at the Neo 
layout (a Dvorak-like keyboard layout for German, 
http://neo-layout.org). They made Windows drivers for their very 
special layout which uses three pairs of modifiers: Shift, Mod3 and 
Mod4. You could certainly find ideas there.


JF
