Reminder about reporting bugs, errors, and other feedback

2020-03-07 Thread Rick McGowan via Unicode

Hello everyone...

This is just a little public service reminder that discussions on the 
Unicode mail list are not considered official feedback, and are not 
reviewed by UTC members or staff as a source for bug reports.


If you want to make sure your feedback and/or report gets into the UTC 
process, it is best to submit it through our reporting form, which can 
be found here:


https://www.unicode.org/reporting.html

Cheers,







Unicode CLDR 35 beta available for testing

2019-03-18 Thread Rick McGowan via Unicode
The *beta* version of Unicode CLDR 35 is available for testing. The 
final release is expected on March 27.


Aside from documenting additional structure, there have been important 
modifications to LDML (scan for the yellow highlighted sections). See 
Modifications for details. There is (limited) time for feedback on the 
changes to the specification: please file feedback at 
http://unicode.org/cldr/trac/newticket.


Unicode CLDR 35 provides an update to the key building blocks for 
software supporting the world's languages. CLDR data is used by all 
major software systems for their software internationalization and 
localization, adapting software to the conventions of different 
languages for such common software tasks.


CLDR 35 included a limited Survey Tool data collection phase, adding 
approximately 54 thousand new translated fields:


*Basic coverage*



New languages at *Basic* coverage: Cebuano (ceb), Hausa (ha), Igbo (ig), 
Yoruba (yo)


*Modern coverage*



Somali (so) and Javanese (jv) have increased coverage, from 
*Moderate* to *Modern*


*Emoji 12.0*



Names and annotations (search keywords) for 90+ new emoji;


Also includes fixes for previous names & keywords

*Collation*



Collation updated to *Unicode 12.0*, including new emoji;

Japanese single-character (ligature) era names added to collation and 
search collation


*Measurement units*



23 additional units 



*Date formats*



Two additional flexible formats, and 20 new interval formats

*Japanese calendar*



Updated to Gannen (元年) number format for years

*Region Names*



Many names updated to local equivalents of “North Macedonia” (MK) 
and “Eswatini” (SZ)


A dot release, version 35.1, is expected in April, with further changes 
for the Japanese calendar.


For details, see Detailed Specification Changes, Detailed Structure 
Changes, Detailed Data Changes, and Growth.





IUC 43 - call for presentations closing soon

2019-03-04 Thread Rick McGowan via Unicode
For those who might be interested in submitting for the 43rd 
Internationalization & Unicode® Conference (IUC 43) in Santa Clara, 
California, October 16-18, 2019...


The call for presentations closes at the end of this week.

http://www.unicodeconference.org/e-marketing/IUC43-CfP-030419.htm




Unicode CLDR 34 beta available for testing

2018-10-04 Thread Rick McGowan via Unicode
The *beta* version of Unicode CLDR 34 is available for testing. The 
final release is expected on October 12.


CLDR 34 provides an update to the key building blocks for software 
supporting the world’s languages. This data is used by all major 
software systems for their software internationalization and 
localization, adapting software to the conventions of different 
languages for such common software tasks.


CLDR 34 included a full Survey Tool data collection phase. Other 
enhancements include several changes to prepare for the new Japanese 
calendar era starting 2019-05-01; updated emoji names, annotations, 
collation and grouping; and other specific fixes. The draft release page 
at http://cldr.unicode.org/index/downloads/cldr-34 lists the major 
features, and has pointers to the newest data and charts. It will be 
fleshed out over the coming weeks with more details, migration issues, 
known problems, and so on. Particularly useful for review are:


   * Delta Charts - the data that changed during the release
   * By-Type Charts - a side-by-side comparison of data from different
     locales
   * Annotation Charts - new emoji names and keywords

Please report any problems that you find using a CLDR ticket. We’d also 
appreciate it if programmatic users of CLDR data download the xml files 
and do a trial integration to see if any problems arise.





Server move notice, Unicode

2018-07-18 Thread Rick McGowan via Unicode

Hello everyone,

On Wednesday evening (US time) July 18 the www.unicode.org server will 
again be undergoing migration. Downtime / off-line period is expected to 
be a few hours at most, beginning shortly after 17:00 Pacific time.


We apologize for the inconvenience.

Rick



Server move notice, Unicode

2018-07-14 Thread Rick McGowan via Unicode

Hello everyone,

Over this weekend on Sunday (US time) July 15 the *www.unicode.org* 
server will be undergoing a migration. Downtime should be minimal, but 
there is some possibility of brief periods off-line from Sunday morning 
through evening.


The Unicode mail list may be unavailable for parts of the day on Sunday.

We expect to be complete and functioning normally before Monday morning.

Only the "www" server is affected. CLDR, Survey Tool, and ICU facilities 
are not affected at this time.


Rick



IUC 42 - abstract submission deadline extended to March 16

2018-03-12 Thread Rick McGowan via Unicode

Hello everyone,

The submission deadline for IUC 42 abstracts has been extended to 
Friday, March 16.


http://www.unicodeconference.org/call-for-participation.htm

Hope you can join us in September.

Regards,
Rick



Re: TIRONIAN SIGN ET

2018-01-27 Thread Rick McGowan via Unicode

Hello Janusz --

Try this: http://www.unicode.org/L2/L2017/17300-n4841-tironian-et.pdf

Regards,

On 1/27/2018 11:40 AM, Janusz S. Bień via Unicode wrote:

Hi!

I try to find in UTC Document Register the proposals for characters
which interest me for some reasons. I'm usually rather successful, but
I'm unable to find the proposal for TIRONIAN SIGN ET.

Any hints?

Best regards

Janusz





Re: 0027, 02BC, 2019, or a new character?

2018-01-19 Thread Rick McGowan via Unicode

Michael -

Lemme know when you're ready to print. I have a huge bag of leftover 
apostrophes I can send you.




On 1/19/2018 5:51 AM, Andrew West via Unicode wrote:

On 19 January 2018 at 13:19, Michael Everson via Unicode wrote:

I’d go talk with him :-) I published Alice in Kazakh. He might like that.

Damn, you'll have to reprint it with apostrophes now.

Andrew






Emoji candidate chart update

2017-11-02 Thread Rick McGowan via Unicode

Hi Everyone,

Just FYI... The new Unicode emoji candidate charts, with updates from 
the UTC #153 meeting, are now posted at: 
http://www.unicode.org/emoji/future/emoji-candidates.html


R



Public review of draft repertoire for ISO/IEC 10646

2016-06-15 Thread Rick McGowan
The UTC would appreciate feedback on new repertoire that is currently 
under ballot for future additions to ISO/IEC 10646. This includes 
repertoire that has already been reviewed and approved by the UTC, but 
which will not be published until next year, as part of Version 10.0 of 
the Unicode Standard.


This is your opportunity to review the planned new repertoire for 
possible problems, and to make any suggestions you might have about 
improvements for glyphs or character names.


See PRI #327 and PRI #328 for details on access to the 
draft repertoire documents for review, and for how to provide your 
feedback. The characters of interest -- the new repertoire under ballot 
-- are highlighted in yellow in the code charts in those documents. 
Glyph corrections or improvements in the charts are highlighted in a 
light blue.


Note that we already know about the mistaken glyph for the new character 
U+1D378 TALLY MARK FIVE, so you do not need to report that problem again!


Note also that a few of the characters for review in PRI #328, including 
the 72 new emoji characters, have been accelerated for publication in 
Unicode 9.0. The UTC will not be able to respond to further feedback on 
those 9.0 characters, which are already frozen for publication.




Unicode Emoji Charts updated

2015-10-24 Thread Rick McGowan
The Unicode Emoji charts have been updated to show the new images from 
Apple, and the Selection Factors for emoji proposals have also been 
updated. Among the many other topics at the Unicode technical conference 
on October 26-28, there will be a new panel session on emoji for people 
to find out more about how new emoji are developed.





Re: The scope of Unicode (from Re: How can my research become implemented in a standardized manner?)

2015-10-23 Thread Rick McGowan

William,

All right... This is likely to be my last posting on the subject...

... there has been much objection to my invention in this mailing list 
over the years, with no good reason ever stated, ...


If this invention had been made in the research laboratory of a large 
information technology company maybe things would be very different.




Please see attached image, for example. While it's not yet as fun as 
Star Trek, this kind of thing can be done for simple interactions in a 
variety of languages using a $20 cell phone...


See also: https://en.wikipedia.org/wiki/Google_Translate

   /As of October 2015, Google Translate supports 90 languages at
   various levels and serves over 200 million people daily./





Re: The scope of Unicode (from Re: How can my research become implemented in a standardized manner?)

2015-10-22 Thread Rick McGowan

Hello William,

Answers to most of your questions can be found among the pages of the 
Unicode Consortium website. I'll try to answer your questions about 
scope which may also be of interest to other subscribers, but please 
note that *everything I say in this e-mail is solely my own opinion and 
does not reflect the opinions or policies of Unicode, Inc, or any of its 
committees.*


What is the scope of Unicode please?



The scope of The Unicode /Standard/ (TUS) is set forth in Chapter 1, 
which you can find here: 
http://www.unicode.org/versions/Unicode8.0.0/ch01.pdf


The scope of the Unicode /Consortium/ is essentially distilled in the 
mission statement, which is on the home page:
http://www.unicode.org/
and on the "What is Unicode" page here:
http://www.unicode.org/standard/WhatIsUnicode.html
under the heading "About the Unicode Consortium"... and formally here, 
in the corporate bylaws:
http://www.unicode.org/consortium/Unicode-Bylaws.pdf
under "Article I - Purpose and Membership", which says:

   ...This Corporation’s specific purpose shall be to enable people
   around the world to use computers in any language, by providing
   freely-available specifications and data to form the foundation for
   software internationalization in all major operating systems, search
   engines, applications, and the World Wide Web. An essential part of
   this purpose is to standardize, maintain, educate and engage
   academic and scientific communities, and the general public about,
   make publicly available, promote, and disseminate to the public a
   standard character encoding that provides for an allocation for more
   than a million characters.



Can it ever change?



The answer to that question depends on what you mean by "it", and 
"change", really. The scope of the /standard/ has changed several times 
over the course of its history, as has the scope of the /consortium/, 
for good reasons. For example, the corporate scope was expanded to 
include a variety of standards beyond just the character encoding 
standard, which were of interest to members (and continue to be of 
interest). The scope of the /standard/ was expanded to include code 
space for more than 65,536 characters, to include characters needed for 
historical scripts, and so forth.


If it can change, who makes the decision? For example, does it need an 
ISO decision at a level higher than the WG2 committee or can the WG2 
committee do it if it so pleases?




Like any /corporation/, the Unicode Consortium bylaws are subject to 
changes from time to time. The full members, as set forth in the bylaws, 
are the ones who may make changes to the bylaws. There are some 
restrictions, of course, such as operating within various legal 
parameters and within the scope of a public-benefit charitable 
organization, as defined under US law.


The /standard/ is mainly controlled by the Unicode Technical Committee, 
operating under the TC Procedures laid out here:
http://www.unicode.org/consortium/tc-procedures.html
and subject to interpretation or restriction by the officers and board 
of directors. The UTC works very closely with members of ISO/IEC 
JTC1/SC2 and the working group WG2 under it. (You can find out about ISO 
procedures and so forth on their site.)



How can a person apply for the scope of Unicode to become changed please?



The most direct way to influence the scope of the Unicode Standard is 
through becoming a full member of the consortium:

http://www.unicode.org/consortium/join.html
so that you can vote in corporate meetings and for members of the board, 
as well as in technical committees. Then, presumably, you would go to an 
annual members' meeting (or call for a special meeting) and present your 
case for the scope of the consortium to be changed. Then, if you want to 
change the scope of The Unicode Standard, you call for a vote in the UTC 
and achieve a majority of votes on whatever resolution you put to the 
committee. This is /intentionally/ a weighty process.


I have been considering how to make progress with trying for my 
research to become implemented in a standardized manner.




Personally, I think you're getting ahead of yourself. First, you should 
demonstrate that you have done research and produced results that at 
least some people find so useful and important that they are eager to 
implement the findings. Then, once you have done that, think about 
standardizing something, but only after you have a /working model/ of 
the thing sufficient to demonstrate its general utility.


While I do not speak for the UTC in any way, observations of the 
committee over a period of some years have led me to conclude that they 
never encode something, call it "X", on pure speculation that some 
future research might result in "X" being useful for some purpose that 
has not even been demonstrated as a need, or clearly enough articulated 
to engender the committee's confidence in its potential 

Re: VS: [somewhat off topic] straw poll

2015-09-11 Thread Rick McGowan

Doug, et al --

The primordial statement you're looking for is in TUS, Chapter 1 and has 
been there forever. See:


http://www.unicode.org/versions/Unicode8.0.0/ch01.pdf

In section 1.1, page 3:

*Note, however, that the Unicode Standard does not encode idiosyncratic, 
personal, novel, or private-use characters, nor does it encode logos or 
graphics.*


I'm not sure UTC has ever made any specific pronouncement on the topic, 
but they do sometimes add things to the notice of non-approvals, which 
can generally be taken as a precedent.


http://unicode.org/alloc/nonapprovals.html

If there is any such statement from the UTC, Ken Whistler would probably 
be the one who could put his hand upon it most quickly. :-)


R.




On 9/11/2015 10:25 AM, Doug Ewell wrote:

I absolutely agree that UTC -- the technical committee, not the
corporation -- should issue a formal statement expressing its position
as to:

1. Generally, whether novel and untested concepts, particularly those
for which a sizable body of popular support has not been established,
are viewed by UTC as suitable and appropriate candidates for encoding in
the Unicode Standard, on the basis of their perceived future usefulness.
(I believe this statement has been made already; if so, a reference that
can be easily cited would serve the purpose.)

2. Specifically, whether the particular concept that William proposes,
to encode entities that are not characters into the Unicode Standard on
the basis of their perceived future usefulness, is viewed by UTC as
being suitable for and appropriate to the standard.




IUC 39 call for participation - abstract submission reminder

2015-03-26 Thread Rick McGowan

Hi everyone,

Just a quick reminder that the Call for participation in IUC #39 is now 
open, and the deadline for submitting an abstract is coming up quickly: 
April 3.


All the information is here, on the conference website:

http://www.unicodeconference.org/

The conference itself is October 26-28, in Santa Clara.

Watch this space in upcoming weeks for further announcements...

Regards,
Rick

___
Unicode mailing list
Unicode@unicode.org
http://unicode.org/mailman/listinfo/unicode


Re: Admuncher javascript on Unicode site

2014-12-25 Thread Rick McGowan

Thank you for the report.

This is an error on my part: saving an HTML file from a browser window
on my own machine while running an ad blocker. I usually don't do that. 
I will correct this file and update it as soon as I have an opportunity.


Regards,
Rick

On 12/25/2014 6:14 AM, Neil Harris wrote:

I've just noticed that loading the web page

http://www.unicode.org/L2/L2014/14250.htm

loads a script from interceptedby.admuncher.com

This seems pretty peculiar to me. Is this intended?

Neil Harris




Re: New Unicode Emoji draft, available for review

2014-11-05 Thread Rick McGowan
FYI, posting this on behalf of Mark Davis... Something in his original 
reply message is apparently so toxic to our mail gateway that it can't 
get through. (Investigating.)


May be the literal U+1F4A9, which I have (I'm sorry) redacted below.

Rick



 Could be either one [U+1F4A9]

 The exact contents of minimal and optional characters is something that
 we want to get feedback on. But I don't think [U+1F4A9] is in the
 running!

 BTW, I'm seeing about 250 new news articles on this, per hour (in
 English).

 https://www.google.com/search?q=emoji+unicodetbm=nwstbs=qdr:h

 Plus a scattering of others, s.a.
 http://www.spiegel.de/netzwelt/web/unicode-consortium-emojis-demnaechst-fuer-alle-hautfarben-a-1001125.html












Re: What happened to...?

2014-09-19 Thread Rick McGowan

Hi Mark,

This document ended up being delayed all the way into meeting #133, so 
the resolution is in those minutes:


http://www.unicode.org/L2/L2012/12343.htm#133-A62

Regards,
Rick


On 9/19/2014 3:56 PM, Mark E. Shoulson wrote:
that, http://www.unicode.org/L2/L2011/11373-linguistic-doubt.pdf 
proposes some 




Re: PRI #273, UTS #39 draft data updated

2014-07-24 Thread Rick McGowan
The draft and data for the proposed update UTS #39 were both changed on 
2014-07-24.


It appears that the issue previously noted with idempotence in the UTS 
#39 tables can be addressed for all of the mappings, with some extensive 
changes. The issue will be documented in the text (see 
http://www.unicode.org/reports/tr39/proposed.html) and the PRI text is 
being adjusted as well.


The PRI page has been updated as well:
http://www.unicode.org/review/pri273/
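
For readers unfamiliar with the term, a mapping table is idempotent when 
applying it twice gives the same result as applying it once. The sketch 
below illustrates the idea only; it is not the UTS #39 algorithm or data. 
The table TOY_MAP and the helper names are hypothetical, and the closing 
step assumes the table contains no cycles.

```python
# Hypothetical toy table, NOT taken from the UTS #39 data files.
TOY_MAP = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A -> a
    "1": "l",       # DIGIT ONE -> l
    "\u2160": "I",  # ROMAN NUMERAL ONE -> I (target is itself remapped below)
    "I": "l",       # LATIN CAPITAL LETTER I -> l
}


def apply_map(table, text):
    """Replace each character by its mapping, leaving unmapped ones alone."""
    return "".join(table.get(ch, ch) for ch in text)


def non_idempotent_entries(table):
    """Return entries whose target changes again when the table is re-applied."""
    return {src: tgt for src, tgt in table.items()
            if apply_map(table, tgt) != tgt}


def close_table(table):
    """Re-map targets until nothing changes (assumes no cycles in the table)."""
    closed = dict(table)
    changed = True
    while changed:
        changed = False
        for src, tgt in list(closed.items()):
            new_tgt = apply_map(closed, tgt)
            if new_tgt != tgt:
                closed[src] = new_tgt
                changed = True
    return closed


if __name__ == "__main__":
    print(non_idempotent_entries(TOY_MAP))
    print(non_idempotent_entries(close_table(TOY_MAP)))
```

In the toy table, U+2160 maps to "I", which the same table maps on to "l", 
so one pass and two passes disagree; closing the table removes that 
disagreement.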



Erratum report for UTS #46

2014-07-14 Thread Rick McGowan
Recently we received an error report about one of the data files for the 
latest release of UTS #46, specifically the testing data in IdnaTest.txt.


The erratum notice is here:
http://www.unicode.org/errata/#current_errata

An update to this file has been generated, and that has now been posted. 
Users of UTS #46 test data may wish to take note.




Re: Emoji [And crash in the Web interface to the mailing list]

2014-04-02 Thread Rick McGowan
Also, fwiw, the new Mailman archives using Pipermail seem to do better 
than the legacy archives: 
http://unicode.org/pipermail/unicode/2014-April/000382.html




On 4/2/2014 12:44 PM, Doug Ewell wrote:

I didn't have any trouble viewing James's examples from the Web
interface, although of course the private-use characters showed up as
dots instead of whatever they were supposed to be.




Re: Pali in Thai Script

2014-03-27 Thread Rick McGowan

Hello,

This is an interesting discussion so far...

What is the current situation of Pali written in the Thai script? Is 
there a scholarly tradition already? Why are new symbols being used for 
this purpose in this project? Is it because nothing else exists at this 
time? Or some other reason? Has this never been done before?


I'm trying to understand the particular scholarly need that will be 
addressed by this project, and to know why some other existing symbols 
are not, or cannot, be used for this purpose. It would help to get a 
sense of the project scope, and how it relates to previous and current 
Pali scholarship in Thailand. And what alternative solutions have been 
discussed and/or used by the project participants.


(Also to be clear: I'm only asking these questions out of personal 
curiosity, not an official question on behalf of the UTC or anything 
like that.)


Thanks,
Rick

On 3/27/2014 1:14 AM, Sittipon Simasanti wrote:

In order to ease this situation, we have created an orthography font (slightly 
modified from the existed Thai font) and used them internally. I have to admit 
that, currently, we are changing the glyphs from time to time. But, we are 
looking forward to establish the studies nationwide in the near future once 
everything is in place.




Re: Mail list changes for 2014

2014-01-02 Thread Rick McGowan

Hello everyone.

The Unicode mail list has now been re-activated. If you experience 
trouble with subscription issues or functionality, please feel free to 
e-mail me directly.


Regards,
Rick


On 12/31/2013 8:48 AM, Rick McGowan wrote:
The mail list will now go off-line shortly, and be back after the new 
year.

Regards,
Rick

On 12/3/2013 2:56 PM, Rick McGowan wrote:
At the end of the year, we will be changing the mail list server for 
the public-access mail lists, including this one. The new system will 
be Gnu Mailman, an interface familiar to many. This should make it 
easier for users to handle their subscriptions and options in one 
place, via the web interface.


We will thus be shutting down the public mail lists over the holiday 
break in the final days of 2013, and re-open with the new system in 
January 2014.


Affected mail lists are those listed on the Mail Lists page here:
http://www.unicode.org/consortium/distlist.html
including Unicode, CLDR-Users, ULI-Users, and Indic.

The new mail list system is documented here: 
http://www.gnu.org/software/mailman/








Re: Mail list changes for 2014

2013-12-31 Thread Rick McGowan

The mail list will now go off-line shortly, and be back after the new year.
Regards,
Rick

On 12/3/2013 2:56 PM, Rick McGowan wrote:
At the end of the year, we will be changing the mail list server for 
the public-access mail lists, including this one. The new system will 
be Gnu Mailman, an interface familiar to many. This should make it 
easier for users to handle their subscriptions and options in one 
place, via the web interface.


We will thus be shutting down the public mail lists over the holiday 
break in the final days of 2013, and re-open with the new system in 
January 2014.


Affected mail lists are those listed on the Mail Lists page here:
http://www.unicode.org/consortium/distlist.html
including Unicode, CLDR-Users, ULI-Users, and Indic.

The new mail list system is documented here: 
http://www.gnu.org/software/mailman/







Re: Mail list changes for 2014

2013-12-12 Thread Rick McGowan

Hi Don,


Rick, Will the existing mail archives be maintained? At same location?


This is a good question.

The current archives are actively manufactured by Hypermail, and unless 
something goes terribly wrong, the archiving system will continue to 
work for now. There will be new Mailman archives for the lists, also, 
from the date of the switch-over. It hasn't been decided whether to 
consolidate, and how the archive documentation page will work. I'm kind 
of taking one step at a time.



On that topic, I couldn't find the mail archive for the oldafr...@unicode.org  
list. Is that still maintained?


There were never public Hypermail archives for that specialist list, 
which has been retired since April 2009. We can discuss that off-list.


Regards,
Rick




Mail list changes for 2014

2013-12-03 Thread Rick McGowan
At the end of the year, we will be changing the mail list server for the 
public-access mail lists, including this one. The new system will be Gnu 
Mailman, an interface familiar to many. This should make it easier for 
users to handle their subscriptions and options in one place, via the 
web interface.


We will thus be shutting down the public mail lists over the holiday 
break in the final days of 2013, and re-open with the new system in 
January 2014.


Affected mail lists are those listed on the Mail Lists page here:
http://www.unicode.org/consortium/distlist.html
including Unicode, CLDR-Users, ULI-Users, and Indic.

The new mail list system is documented here: 
http://www.gnu.org/software/mailman/





public review issue reminder

2013-10-29 Thread Rick McGowan

Hi everyone,

This is just a friendly reminder... There are several open public review 
issues that close this week, and the next UTC meeting is next week. 
(Yes, formally speaking some of these issues closed on Oct 28, but I'm 
still gathering comments, in case anyone wants to submit them today.) 
Please see:

http://www.unicode.org/review/

The two most recent issues, for UTR #36 and UTS #39 will be open through 
January.


Regards,
Rick




IUC 37 - Only one week away

2013-10-14 Thread Rick McGowan

Hi everyone,

This year's Internationalization and Unicode Conference is only a week 
away. You can read more about it, and check out the program here:


http://www.unicodeconference.org/e/IUC37-2WEEK-10-09-13.htm





IUC 37 - Call for Participation reminder

2013-03-18 Thread Rick McGowan

Hello everyone --

This is just a quick reminder that March 29 is the deadline for 
submission of abstracts for the October conference (IUC 37). You can see 
the announcement and instructions here:


http://www.unicodeconference.org/call_for_papers.htm

If you're planning to submit an abstract for consideration, please don't 
forget!


Regards,
Rick




Re: pIqaD in actual use

2013-02-20 Thread Rick McGowan

On 2/20/2013 4:38 PM, Richard Wordingham wrote:

How did they find a native speaker?


Good question. And then: Why didn't that native speaker beam into the 
UTC meeting when we were debating the encoding of Klingon?


Rick





Brief downtime for www.unicode.org

2013-02-13 Thread Rick McGowan

Hello everyone,

The main server www.unicode.org will be taken off-line for a bit this 
evening after 6pm Pacific time for some maintenance. The outage is 
expected to be brief.


Regards,
Rick




Unicode.org downtime tonight

2012-11-26 Thread Rick McGowan

Hello everyone,

Just a quick heads-up... We received a report of a problem in a RAID 
disk and have arranged to have it swapped out this evening, Pacific 
time. That will be sometime after 6pm today November 26 (0200 GMT Nov 
27, 1100 Tokyo time), for up to an hour. It should be back online after 
that, and there may be some degradation of response time while the RAID 
rebuilds overnight.
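
The time conversion in the notice above can be checked with a short 
sketch using Python's standard zoneinfo module. The year 2012 comes from 
the post date; the variable names are illustrative, and the script 
assumes time zone data is available on the system.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# 6pm Pacific time on November 26, 2012 (the start of the maintenance window).
start_pacific = datetime(2012, 11, 26, 18, 0,
                         tzinfo=ZoneInfo("America/Los_Angeles"))

# The same instant expressed in UTC (GMT) and in Tokyo local time.
start_utc = start_pacific.astimezone(ZoneInfo("UTC"))
start_tokyo = start_pacific.astimezone(ZoneInfo("Asia/Tokyo"))

if __name__ == "__main__":
    print(start_utc.strftime("%Y-%m-%d %H%M"), "GMT")
    print(start_tokyo.strftime("%Y-%m-%d %H%M"), "Tokyo")
```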


Regards,
Rick





Re: texteditors that can process and save in different encodings

2012-10-05 Thread Rick McGowan

Doug, et al -

In your experience, what are the best (plaintext) texteditors or word 
processors for Linux / Mac OS X / Windows that have the ability to 
save in many different encodings? 


Ok, for what it's worth... I personally now use SublimeText2 as my sole 
plain-text editor on Windows. It supports UTF-8 natively -- with or 
without BOM. (Has apparently no bidi support, so for some people that 
would be a non-starter.) Can read/write a healthy variety of encodings, 
easily accessible via menu, including an option to re-open the current 
file with some other encoding.

http://www.sublimetext.com/2
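
As a small illustration of the "with or without BOM" distinction 
mentioned above, here is a minimal Python sketch using only the standard 
codecs. The sample string is arbitrary, and nothing here is specific to 
any particular editor.

```python
text = "Unicode \u2713"  # arbitrary sample containing one non-ASCII character

# "utf-8" writes no byte order mark; "utf-8-sig" prepends EF BB BF.
plain = text.encode("utf-8")
with_bom = text.encode("utf-8-sig")

# Decoding with "utf-8-sig" strips a leading BOM if present and also
# accepts input that has none.
decoded_from_bom = with_bom.decode("utf-8-sig")
decoded_from_plain = plain.decode("utf-8-sig")

# Decoding BOM-prefixed bytes as plain "utf-8" keeps U+FEFF in the text.
decoded_keeping_bom = with_bom.decode("utf-8")

if __name__ == "__main__":
    print(plain)
    print(with_bom)
    print(repr(decoded_keeping_bom))
```

The last case shows why a tool that re-opens a file under a different 
encoding setting can leave a stray U+FEFF at the start of the text.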

And because I'm an emacs geek, I hacked up an emulation keyboard mapping 
that works pretty well. ST2 also has a bunch of other features for 
working on projects and will auto-save buffers and window configs across 
reboot/restart.


Rick




Re: Mayan numerals

2012-08-23 Thread Rick McGowan

Jameson, Michael, et al -

OK, I'm going to join in here before this goes much further. (And as 
usual on this list I'm writing as an individual, and this is only my 
personal opinion.)


You are always welcome to put forward a proposal for whatever you want 
to see encoded. I'm happy to receive serious proposals, and in my 
experience, the committee is generally happy to look at them.


But when you ask the question and make the observations below, my simple 
answer would be that you don't see any problem because you haven't been 
sitting in the character encoding committees for 20+ years to observe 
how things gang aft agley as they say.


In my opinion, the UTC would be irresponsible to approve the encoding 
for a set of digits for a complicated system like Mayan without even 
having a preliminary script proposal on record; and without any 
involvement of the actual serious scholars in the field.


No matter what you say about how safe it is, well... I wouldn't tend to 
believe that without firm evidence, which means -- at least -- someone 
having done significant work on the whole script in the context of a 
character encoding proposal to prove it. And given a lot of the other 
questions and speculation in your recent e-mail, I'm inclined to think 
that yes, you aren't an expert and probably don't have enough clear 
answers to detailed questions, as required to convince a committee.


In any case: you're welcome to write up a proposal for Mayan digits and 
give your opinions and findings. It would not be a waste of time to do 
so. But I would expect the outcome to be that the committee would set it 
aside and eventually pass it along to the scholars who end up working on 
the actual proposal for the Mayan script. At that time, it would be a 
valuable input document.


Cheers,
Rick

---

On 8/23/2012 10:33 AM, Jameson Quinn wrote:


Because we aren't ready to do it without doing it in the context
of the whole script.


Why not? Can you give some indication of what you're afraid of, some 
scenario of how we could possibly later regret having included the 
basic digits now?


I understand you may be reluctant to speculate, but I really don't see 
how it could be a problem.




Re: Website unavailable

2012-07-15 Thread Rick McGowan
To all concerned: Please accept our apologies for problems with the mail 
list, website etc.


Our provider reports experiencing a large-scale distributed denial of 
service attack that has crippled their data center connectivity since 
yesterday (Saturday).


We do not have an estimated time when this will be fixed, but our 
provider is working diligently around the clock.


Regards,
Rick

On 7/15/2012 1:00 AM, Jean-François Colson wrote:

I can’t access to unicode.org.
Is there a problem with the website?






Re: Mandombe

2012-06-08 Thread Rick McGowan

Jean-François --

 What decision has been made? Has it been accepted? Rejected? For ever 
or till more information is provided?


I will follow up with you off-list.

Yes, the committee reviewed the proposal during meeting 126, and is 
following up with questions to the authors of the proposal. See meeting 
minutes:

http://www.unicode.org/consortium/utc-minutes/UTC-126-201102.html

Rick




IUC 36 - call for participation reminder

2012-05-10 Thread Rick McGowan
Hello everyone. Just a reminder that the deadline for abstract 
submissions is coming up soon. If you are thinking of submitting 
something for the October conference, please see the information below.


Regards,
Rick


*Call for Participation Announced!*

*The Internationalization and Unicode Conference (IUC)* is the premier 
event covering the latest in industry standards and best practices for 
bringing software and Web applications to worldwide markets. This annual 
event focuses on software and Web globalization, bringing together 
internationalization experts, tools vendors, software implementers, and 
business and program managers from around the world.


The Program Committee is soliciting proposals for presentations that 
describe case studies, best practices, effective software design, 
innovative technology, or important standards. Tutorial presentations 
are also welcome. Suitable topics include, but are not limited to:


*Application Areas*

*   Designing software platforms, operating systems, software as a 
    service (SAAS), or programming environments
*   Social networks
*   Search engines, SEO, discovery and navigation best practices
*   Websites and web services
*   Libraries and education
*   Mobile applications including iPhone, Android, iPad, Kindle, Windows 
    Mobile, tablets, etc.
*   Publishing and broadcasting for a global audience
*   Internationalized Domain Names and other identifiers
*   Security concerns and practices
*   Voice to text, text to voice
*   Machine translation
*   Unicode, encodings, scripts, character properties, and algorithms

*General Techniques*

*   Advances in technologies, algorithms or methodologies
*   Using internationalization libraries and programming environments
*   Handling bidirectional or other complex scripts
*   HTML5 and HTML5-based applications
*   Dealing with data formats: XML, JSON, HTML5, DITA, and upcoming 
    standards
*   Project management and methodologies for global development teams, 
    e.g. Agile
*   Best practices in localization process and technology
*   Best practices in world-ready development, test, and deployment
*   Improving globalization capabilities within organizations
*   Approaches for migrating legacy applications to global markets
*   Font development and Typography

*Culture and Technology*

*   Endangered Languages
*   Unencoded Languages
*   Case studies and research on cross-culture communication
*   Digital Divide
*   ISO language tag issues

*Regional Considerations*

*   Languages of Africa, Asia, and the Middle East
*   Locales and the Unicode Common Locale Data Repository (CLDR)
*   Emoji support

Tutorial presenters receive complimentary conference registration and 
two nights' lodging. Session presenters receive a fifty percent 
conference discount and two nights' lodging.


To be considered as a presenter for the conference, please submit a 
brief abstract at http://www.unicodeconference.org/abstracts.htm by 
the deadline of Friday, May 18th.


The Program Committee will notify authors by Friday, June 1st. Final 
presentation materials will be required from selected presenters by 
Friday, August 3.



*About The Unicode Consortium*
The Unicode Consortium is a non-profit organization founded to develop, 
extend and promote use of the Unicode Standard and related globalization 
standards.


The membership of the consortium represents a broad spectrum of 
corporations and organizations in the computer and information 
processing industry. Members are: Adobe Systems, Apple, Google, 
Government of Bangladesh, Government of India, IBM, Microsoft, Monotype 
Imaging, Oracle, SAP, The Society for Natural Language Technology 
Research, The University of California (Berkeley), The University of 
California (Santa Cruz), Yahoo!, plus well over a hundred Associate, 
Liaison, and Individual members.


For more information, please contact the Unicode Consortium 
http://www.unicode.org/contacts.html.


*About the Event Producer*
OMG® is the Event Producer for the Internationalization & Unicode 
Conferences. OMG is an open membership, not-for-profit consortium that 
produces and maintains computer industry specifications for 
interoperable enterprise applications. Our specifications include MDA®, 
UML®, CORBA®, MOF™, XMI® and CWM™. OMG's specifications are all 
available for download by everyone without charge.


For more information about OMG, visit us online at http://www.omg.org.

This email may be considered to be commercial email, an advertisement or 
a solicitation. If you would prefer not to receive messages from the 
OMG, please go to: 
http://www.omg.org/registration/registration-unsubscribe.htm.





Re: Notice of brief Unicode.org system outage on Friday

2012-05-02 Thread Rick McGowan

Christian,


Just wondering why the time zone reference is not given in a universal format, 
like UTC±n, so that people in other parts of the world can calculate.


That's 0700 GMT on Friday.

For others:  http://wwp.greenwichmeantime.com/
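
For anyone who wants to check the arithmetic, here is a minimal Python 
sketch of the conversion. It assumes that US Central time on May 4 is 
daylight time (CDT, five hours behind UTC) and uses a fixed offset rather 
than a time zone database; the variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: US Central time in early May is daylight time (CDT), i.e. UTC-5.
CDT = timezone(timedelta(hours=-5), "CDT")

# Announced start of the outage window: 2am US Central, Friday May 4, 2012.
outage_start_local = datetime(2012, 5, 4, 2, 0, tzinfo=CDT)

# Convert to UTC (equivalent to GMT for this purpose).
outage_start_utc = outage_start_local.astimezone(timezone.utc)

print(outage_start_utc.isoformat())
```

A fixed offset keeps the sketch self-contained; code that has to handle 
arbitrary dates would normally look the zone up by name instead.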

Rick




Notice of brief Unicode.org system outage on Friday

2012-05-01 Thread Rick McGowan
Due to some electrical work, the Unicode web servers will be off line 
for as much as 2 hours, from 2am to 4am US Central time, on Friday, May 4.





Re: Origins of ẘ

2012-04-15 Thread Rick McGowan

 At Wiktionary, we're looking at ẘ (U+1E98) and
 we can't figure out where it came from.

Good catch. It's obviously another stowaway...
Just throw it in the brig until we can get around to deporting it.





Re: Code2000 on SourceForge

2012-02-04 Thread Rick McGowan
And, the most recent mysterious development: unsubscription requests 
have come in from that James Kass account for all of the mail lists that 
it just subscribed to a few days ago. It obviously wasn't the real guy.


Posting from that address has now been disabled on the Unicode list...

Rick


On 2/4/2012 1:25 PM, Doug Ewell wrote:

Well, that's a disappointment. I was hoping the real James had surfaced, even 
if he can't work on the fonts any more. That means a whole lot of digital ink 
just got wasted arguing over the perfect license.

--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org  | @DougEwell
Sent via BlackBerry by AT&T







Re: Upside Down Fu character

2012-01-03 Thread Rick McGowan
I would say to use higher level mark-up or images for this. I don't see 
any reason to start down the road of encoding upside down Chinese 
characters, or variation sequences, for such things. They are decorative 
anomalies, not plain text.


Rick


On 12/30/2011 7:34 AM, Andre Schappo wrote:
The character 福 means happiness: 
http://www.mdbg.net/chindict/chindict.php?page=chardictcdcanoce=0cdqchi=%E7%A6%8F 



Unicode entry: U+798F  CJK UNIFIED IDEOGRAPH-798F

It is customary to use an upside-down version of 福 during the Spring 
Festival http://en.wikipedia.org/wiki/Fu_character
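
The code point and name quoted above can be checked with Python's standard 
unicodedata module. This is only a small sketch; the variable name is 
illustrative.

```python
import unicodedata

fu = "\u798f"  # the character 福

# Print the code point in U+XXXX form together with its character name.
print(f"U+{ord(fu):04X}", unicodedata.name(fu))
```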





Re: Upside Down Fu character

2012-01-03 Thread Rick McGowan

Michael,


What's the inline markup for displaying this glyph upside down?


It doesn't really matter, and it would depend on the system anyway.

My argument here is that this is a one-off need for some character in 
a specialized, decorative context. This upside-downness or rotation is 
not systematic in Chinese, nor part of a notational system. There is no 
need for this thing to be expressed inline in plain text at all.


The one example given on the Wikipedia page, for instance, is not 
textual; it is a paper decoration hung upside down.


Rick




IUC 35 Conference T-Shirt Design

2011-09-02 Thread Rick McGowan

Hello everyone,

Have you ever thought you could design a great Unicode themed T-shirt? 
We're putting out a call for T-shirt design ideas. But we must move 
quickly -- the deadline is looming.


While you're enjoying your Labor Day barbecue (in the US) and doodling 
on napkins, maybe you'll come up with a great design. Maybe you've 
already thought of it.


Send us your ideas, and we will forward them to the graphic designer as 
input for this year's conference T-shirt. You can send text, pictures of 
hand drawings, full digital illustrations, etc.


The deadline is Monday night, US time. Just e-mail your entry directly 
to me. I will collect them and forward to the designer next Tuesday.


And, while you're thinking about the October conference, please take a 
look at the great line-up of talks we have this year:

http://www.unicodeconference.org/conference-at-a-glance.htm

Cheers,
Rick



Re: Fw: Endangered Alphabets [OT]

2011-09-02 Thread Rick McGowan

Chris,


How does this differ from what the Script Encoding Initiative
http://linguistics.berkeley.edu/sei/   is already trying to do?


It is an art project, not a script encoding project. The artist is 
seeking financial support for finishing a massive woodworking project 
with an accompanying book.


Rick




Re: Pupil's question about Burmese

2010-11-08 Thread Rick McGowan

Hello Philippe Blankert -

Thanks for your interest in Unicode...

http://www.burmese-dictionary.org/tastatur.php?terme=hoteltermb=%5Bkdw%2Cfid=2970 



That page isn't in Unicode at all; it's an ISO 8859-1 encoded page. That's 
part of the problem.


Then, the Burmese characters on the page are all *images*, and when you 
click the buttons to type into the field, it seems to send ASCII text to the 
input field.


And the WWin_Burmese1 font, which I just downloaded to check, is an 
ASCII-hack font that is not encoded in Unicode.
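
To illustrate the difference, here is a minimal Python sketch that checks 
whether a string actually contains code points from the Unicode Myanmar 
block (U+1000..U+109F). The function name is hypothetical, and the ASCII 
sample is simply the decoded "termb" value from the URL above, standing in 
for what an ASCII-hack font page sends.

```python
def contains_myanmar(text: str) -> bool:
    """Return True if any character lies in the Myanmar block U+1000..U+109F."""
    return any(0x1000 <= ord(ch) <= 0x109F for ch in text)

# What an ASCII-hack font page sends: plain ASCII characters that only
# *look* Burmese when rendered with the special font.
ascii_hack_text = "[kdw,f"

# Real Unicode Burmese text: MYANMAR LETTER KA (U+1000) followed by
# MYANMAR VOWEL SIGN AA (U+102C).
unicode_text = "\u1000\u102c"

print(contains_myanmar(ascii_hack_text))  # False
print(contains_myanmar(unicode_text))     # True
```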


Hope that helps.

Rick




Fwd: Walk-ins Welcome: 34th Internationalization & Unicode Conference, October 18-20, Santa Clara, CA USA

2010-10-15 Thread Rick McGowan
 Hello everyone! Just a quick reminder that next week is the conference. We 
have a great line-up of talks:

http://www.unicodeconference.org/conference-at-a-glance.htm
and a keynote talk by Brewster Kahle on Tuesday morning:
http://www.unicodeconference.org/bios.htm#Kahle

This year we're also in a new venue, the Hyatt in Santa Clara. I hope to see 
many of you there!


Cheers,
Rick






 WALK-INS WELCOME!
 Register onsite!


*New to software internationalization or just need a refresher course?*
Attend Monday's tutorials and listen to speakers from companies such as Adobe 
Systems, Amazon, IBM, Jim DeLaHunt & Assoc., W3C, XenCraft, and Yahoo! Inc.


*Prefer conference sessions?*
Stop by on Tuesday to sign up for *2 days of presentations* and hear how 
software works in all world languages. Session presenters come from such 
organizations as Across Systems, Inc., Adobe Systems, Aeontera, Inc., Aoyama 
Gakuin University, Apple, Asia Online, Byte Level Research, Capella Software 
Ltd., Diwan Software, Google, Inc., IBM, Intel, Kamusi Project 
International, Lab126, Microsoft, Monotype Imaging Inc., SAS Institute, 
Twitter, Inc., UC Berkeley, W3C, XenCraft, and Yahoo! Inc.


To see everything this program has to offer, click *here* 
http://www.unicodeconference.org/conference-at-a-glance.htm.


Safe travels and we look forward to seeing you at the Hyatt Hotel in Santa 
Clara, CA!


Event Producer: OMG http://www.omg.org
Gold Sponsor: http://www.unicodeconference.org/adobe-banner
Media Sponsor: MultiLingual http://www.unicodeconference.org/ml-banner



About The Unicode Consortium
The Unicode Consortium is a non-profit organization founded to develop, 
extend and promote use of the Unicode Standard and related globalization 
standards.


The membership of the consortium represents a broad spectrum of corporations 
and organizations in the computer and information processing industry. 
Members are: Adobe Systems, Apple, DENIC eG, Google, Government of India, 
Government of Tamil Nadu, IBM, Microsoft, Monotype Imaging, Oracle, The 
Society for Natural Language Technology Research, Sun Microsystems, Sybase, 
The University of California at Berkeley, Yahoo!, plus well over a hundred 
Associate, Liaison, and Individual members.


For more information, please contact the Unicode Consortium 
http://www.unicode.org/contacts.html.


About the Event Producer
The Object Management Group™ (OMG™) is the Event Producer for the 
Internationalization & Unicode Conferences. OMG is an open membership, 
not-for-profit consortium that produces and maintains computer industry 
specifications for interoperable enterprise applications. Our specifications 
include MDA®, UML®, CORBA®, MOF™, XMI® and CWM™. OMG's specifications are 
all available for download by everyone without charge.


For more information about OMG, visit us online at http://www.omg.org.

If you would prefer not to receive messages about the Unicode Conference, or 
have address corrections, please reply to this email message, requesting 
Unsubscribe or describing your address corrections in the body of the text. 
Please leave subject line intact.





Re: Indian new rupee sign

2010-07-30 Thread Rick McGowan

On 7/30/2010 4:01 AM, William_J_G Overington wrote:

I find it strange that for a new currency symbol that is to come into use in 
six months that, in the twenty-first century, with all the modern communication 
methods available, that encoding in Unicode will take longer than six months.


William, perhaps you can read the RFC, particularly section 8.
http://www.rfc-archive.org/getrfc.php?rfc=3718

That describes the ordinary process of character encoding.

Rick




Re: Writing a proposal for an unusual script: SignWriting

2010-06-14 Thread Rick McGowan

OK, I wasn't going to weigh in here, but...

On 6/14/2010 1:18 PM, Mark E. Shoulson wrote:
Here, the question is more a matter of "given that SignWriting is 
nifty, does it qualify as plain text?"  Or even "Does the way 
SignWriting does its thing map well to the way Unicode does things?"


After looking at this discussion for a while, and taking a look at what 
Steven Slevinski proposes, I think it matches Unicode about as well as 
math or music notation, or even Egyptian hieroglyphs. I.e., yes, the set 
of primitives seems encodable, and any English-language (or other 
language) pedagogical discussion of SignWriting will want *at least* the 
basic symbols. But... the whole system is not plain text, any more than 
music notation or math expressions. And even if UTC were to miraculously 
encode it all, with suitable semantics and lots of rules and give away a 
parser: nobody would implement it in standard software for the mass 
market, any more than they implement MdC for typesetting Egyptian 
Hieroglyphs.


Having said that, in theory I see no real reason (other than perhaps a 
bunch of intellectual property issues) that the basic symbols of 
SignWriting could not be encoded, given a suitable proposal, suitable 
stability, and assuming there is a sizable community of users.


I suggest that Steven take a look at Murray Sargent's UTN:
http://www.unicode.org/notes/tn28/

The set of entities listed in Steven's report is divided into several 
sections:

Structural Markers, BaseSymbols, Modifiers, Number Characters

Of those, it seems fairly obvious that the 652 base symbols are just 
symbols, which can be combined in various ways. The structural markers 
could be encoded as control characters, or, in fact, as visual symbols 
for the thing they do, e.g. "symbol for left lane signbox marker", much 
like we have encoded the pictures for control symbols. The modifiers 
could likewise be encoded as "signwriting fill modifier X" and so forth. 
(In fact, the proposal shows visual representations of the rotation 
symbols, etc., so presumably they already exist.)


*Parsing* a stream of this stuff into something that's legible and/or 
beautiful is beyond the scope of the standard, and I'm fairly sure the 
committee wouldn't even entertain such a thing any more than they 
entertained specifying the layout of western music notation.


But once you've got all of those symbols encoded, you could use a 
light-weight protocol similar to what Murray has done for embedding Math 
expressions in plain text. Software that recognizes the protocol can do 
fancy things to the contents of the sign zone. Off-the-shelf software 
that doesn't understand the protocol would do no worse than an ordinary 
word processor can now do with Egyptian Hieroglyphs or music symbols: 
blast out a line of symbols in-line.


In looking at the actual proposal, I'm not sure why sign language users 
are only allowed to count from -299 to 300, but presumably there's a 
logical explanation for that.


(This is all my personal opinion of course and does not reflect an 
official opinion of the consortium or the UTC.)


Rick




Unicode 6.0 beta code charts - updated today

2010-06-14 Thread Rick McGowan

Hi everyone,

The Unicode 6.0 beta has been out for a while. Today I posted an 
entirely new set of code charts, and this *should* take care of all 
known chart bugs. But we could use more eyes!


Please see:
http://www.unicode.org/Public/6.0.0/charts/
and the beta page:
http://www.unicode.org/versions/beta.html

If you spot any errors or regressions in the charts, please file a bug 
through the reporting form: http://www.unicode.org/reporting.html


Thanks,
Rick




Re: Hexadecimal digits

2010-06-05 Thread Rick McGowan

On 6/5/2010 10:42 AM, Doug Ewell wrote, responding to Luke-jr:


Draft characters would be ones which are not final and can be 
removed or replaced in the future, if they don't in the meantime gain 
popularity within some reasonable timeframe.


There is no precedent for this in either Unicode or ISO/IEC 10646.  If 
you think it has been difficult persuading people that your characters 
should be encoded in the existing framework, just try suggesting a 
basic architectural change like this. 


Speaking only for my personal opinion on this one point: Doug is right. 
This won't happen. Once you have characters in real usage because a 
standard was released that contains them, even if the standard called 
them "draft", you'd have data in the wild that could potentially 
become non-conformant.


Rick




Re: Hexadecimal digits

2010-06-04 Thread Rick McGowan

Luke-jr wrote,


Hexadecimal/tonal will never be popularised as long as it can be confused with 
letters...
   


and


But I'm not talking about programming languages, just common everyday uses by 
people who have it as their primary (not secondary) system of numbers.


Hexadecimal already is popular with programmers in programming 
situations. It's useful enough for dealing with computers that 
programmers have adopted it despite the shortcoming of being 
potentially confusable. People use complicated and potentially confusing 
systems all the time because to not use them would mean that (a) they 
can no longer communicate with everyone else and/or (b) they would 
represent an unnecessary discontinuity with all past usage, and thus 
people would lose touch with their history and literature. In the 
absence of cultural disasters, that doesn't typically happen on short 
time scales. (Look, for example, at the Japanese writing system.)


Hexadecimal/tonal will never be popular with ordinary humans for 
ordinary counting in social situations because people have ten 
fingers and nobody uses hexadecimal for ordinary counting, nor has any 
significant population ever done so, as far as I know.


Just out of curiosity, why do you think it's useful or important for 
people to use hexadecimal as their primary system of counting? What 
advantages would it confer?


(As usual on this list, this reflects purely my personal opinion.)

Rick




Re: Greek letter LAMDA?

2010-06-01 Thread Rick McGowan

For further info on names, see also http://www.unicode.org/notes/tn27/


Yep. Known inconsistency -- but this is not considered an error.
--Ken
   





New Public Review Issue posted

2004-12-23 Thread Rick McGowan
The Unicode Technical Committee has posted a new issue for public review  
and comment. Details are on the following web page:

http://www.unicode.org/review/

Review periods for the new items close on January 31, 2005.

Please see the page for links to discussion and relevant documents.  
Briefly, the new issue is:


59  Disunification of Dandas

The UTC is considering the question of disunifying the characters U+0964  
DEVANAGARI DANDA and U+0965 DEVANAGARI DOUBLE DANDA from their counterparts  
in several other Indic scripts. Feedback on this issue, for or against the  
disunification, is being sought.

A background document is available here:

http://www.unicode.org/review/pr-59.html


If you have comments for official UTC consideration, please post them by  
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode mail list, then please use  
the following link to subscribe (if necessary). Please be aware that  
discussion comments on the Unicode mail list are not automatically recorded  
as input to the UTC. You must use the reporting link above to generate  
comments for UTC consideration.

http://www.unicode.org/consortium/distlist.html

Regards,
Rick McGowan
Unicode, Inc.



New Public Review Issue posted

2004-12-22 Thread Rick McGowan
The CLDR Technical Committee has posted a new issue for public
review and comment. Details are on the following web page:

http://www.unicode.org/review/#pri58

Review periods for the new items close on January 31, 2005.

Please see the page for links to discussion and relevant documents.
Briefly, the new issue is:


58  Characters with cedilla and comma below in Romanian language

The CLDR Technical Committee is seeking feedback regarding the
relative frequency of use of the characters with comma below and
of the characters with cedilla in Romanian language textual material.
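
As a rough illustration of what measuring "relative frequency" could look 
like, here is a minimal Python sketch that counts the two competing sets of 
letters in a piece of text. The function name and the sample string are 
invented for this example and are not part of the review issue.

```python
from collections import Counter

# Code points for the two competing sets of Romanian letters.
CEDILLA = set("\u015e\u015f\u0162\u0163")      # S and T with cedilla
COMMA_BELOW = set("\u0218\u0219\u021a\u021b")  # S and T with comma below

def count_variants(text: str) -> Counter:
    """Count cedilla forms and comma-below forms occurring in text."""
    counts = Counter(cedilla=0, comma_below=0)
    for ch in text:
        if ch in CEDILLA:
            counts["cedilla"] += 1
        elif ch in COMMA_BELOW:
            counts["comma_below"] += 1
    return counts

# Invented sample mixing both conventions.
sample = "\u015ftiin\u0163\u0103 \u0219i \u021bar\u0103"
print(count_variants(sample))
```

Running such a count over a large body of real Romanian text would give the 
kind of usage figures the committee is asking about.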



If you have comments for official UTC consideration, please post them by
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode mail list, then please
use the following link to subscribe (if necessary). Please be aware
that discussion comments on the Unicode mail list are not automatically
recorded as input to the UTC. You must use the reporting link above
to generate comments for UTC consideration.

http://www.unicode.org/consortium/distlist.html

Regards,
Rick McGowan
Unicode, Inc.



Unicode Version 4.1.0 Beta Release

2004-12-14 Thread Rick McGowan
The next version of the Unicode Standard will be Version 4.1.0, due for  
release in March, 2005.

A BETA version of the updated Unicode Character Database files is  
available for public comment. We strongly encourage implementers to  
download these files and test them with their programs, well before the end  
of the beta period. These files are located at the following URL:

http://www.unicode.org/Public/4.1.0/

(or ftp://www.unicode.org/Public/4.1.0/)

A detailed description of the beta is located here:

http://www.unicode.org/versions/beta.html

Any comments on the beta Unicode Character Database should be reported  
using the Unicode reporting form. The comment period ends January 31, 2005.  
All substantive comments must be received by that date for consideration  
at the next UTC meeting. Editorial comments (typos, etc) may be submitted  
after that date for consideration in the final editorial work.

Note: All beta files may be updated, replaced, or superseded by other  
files at any time. The beta files will be discarded once Unicode 4.1.0 is  
final. It is inappropriate to cite these files as other than a work in  
progress.

Testers should not commit any product or implementation to the code points  
in the current beta data files. Testers should also be ready for retesting  
based on updated data files which will be posted after the February, 2005  
UTC meeting.



If you have comments for official consideration, please post them by  
submitting your comments through our feedback & reporting page:

  http://www.unicode.org/reporting.html

If you wish to discuss beta issues on the Unicode mail list, then please  
use the following link to subscribe (if necessary). Please be aware that  
discussion comments on the Unicode mail list are not automatically recorded  
as beta comments. You must use the reporting link above to generate  
comments for official consideration.

  http://www.unicode.org/consortium/distlist.html


Regards,
Rick McGowan
Unicode, Inc.



Re: Nicest UTF.. UTF-9, UTF-36, UTF-80, UTF-64, ...

2004-12-07 Thread Rick McGowan
 Yes, and pigs could fly, if they had big enough wings.

An 8-foot wingspan should do it. For a picture of said flying pig see:

http://www.cincinnati.com/bigpiggig/profile_091700.html
http://www.cincinnati.com/bigpiggig/images/pig091700.jpg

Rick



Public Review Issues updated

2004-11-24 Thread Rick McGowan
There have been a number of updates to Public Review Issues on the Unicode  
web site.

The comment periods for Public Review Issues 51, 53, 54, and 56 have been
extended to January 31, 2005. During the review period, new drafts may be
issued, and if so, they will be announced at the time.

http://www.unicode.org/review/index.html

51  Proposed Update UAX #29 Text Boundaries
53  Proposed Draft UTR #33 Unicode Conformance Model
54  Proposed Update UTS #22 Character Mapping Markup Language
56  Proposed Update UAX #14 Line Breaking Properties


The following Public Review Issues have been closed, and their resolutions
posted on the Resolved Issues page.

http://www.unicode.org/review/resolved-pri.html

36  Draft Unicode Technical Report #30 Character Foldings
39  Draft Unicode Technical Standard #31 Identifier and Pattern Syntax
40  Encoding of Latin Capital and Small Letter At
41  Encoding of INVISIBLE LETTER
42  Proposed Draft UAX #34 Unicode Named Character Sequences
43  Proposed Update UAX #24 Script Names
44  Bidi Category of Fullwidth Solidus
45  Bidi Category of Narrow No-Break Space
46  Proposal for Encoded Representations of Meteg
47  Changes to default collation of Latin in UCA
48  Definition of Directional Run
49  Proposed Update UTS #6 A Standard Compression Scheme for Unicode
50  Proposed Update UTS #18 Unicode Regular Expressions
52  Proposed Draft UTR #36 Security Considerations
55  Proposed Change to Character Properties for Two Katakana Characters


Regards,
Rick McGowan
Unicode, Inc.





New Public Review Issue

2004-11-24 Thread Rick McGowan
The Unicode Technical Committee has posted a new issue for public
review and comment. Details are on the following web page:

http://www.unicode.org/review/

Review period for the new item closes on January 31, 2005.

Please see the page for links to discussion and relevant documents.
Briefly, the new issue is:


57  Changes to Bidi categories of some characters used with Mathematics

The UTC is considering changing the bidi category of seven compatibility
characters from ET to ES:
U+207A SUPERSCRIPT PLUS SIGN
U+208A SUBSCRIPT PLUS SIGN
U+FB29 HEBREW LETTER ALTERNATIVE PLUS SIGN
U+FE62 SMALL PLUS SIGN
U+FE63 SMALL HYPHEN-MINUS
U+FF0B FULLWIDTH PLUS SIGN
U+FF0D FULLWIDTH HYPHEN-MINUS

The UTC is also seeking feedback on the bidi categories of the following
characters, and whether to also change these from ET to ES:
U+2212 MINUS SIGN
U+207B SUPERSCRIPT MINUS
U+208B SUBSCRIPT MINUS

All of these characters may be used in connection with mathematical 
applications.
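
Implementers who want to see what their own platform currently reports for 
these characters could use something like the following Python sketch. It 
relies only on the standard unicodedata module; the values printed depend 
on the Unicode version bundled with the interpreter, so no particular 
output is claimed here.

```python
import unicodedata

# The ten code points named in this review issue.
CANDIDATES = [0x207A, 0x208A, 0xFB29, 0xFE62, 0xFE63, 0xFF0B, 0xFF0D,
              0x2212, 0x207B, 0x208B]

for cp in CANDIDATES:
    ch = chr(cp)
    # bidirectional() returns the Bidi category as a short code such as ET or ES.
    print(f"U+{cp:04X} {unicodedata.bidirectional(ch):<3} {unicodedata.name(ch)}")
```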


If you have comments for official UTC consideration, please post them by
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode mail list, then please
use the following link to subscribe (if necessary). Please be aware
that discussion comments on the Unicode mail list are not automatically
recorded as input to the UTC. You must use the reporting link above
to generate comments for UTC consideration.

http://www.unicode.org/consortium/distlist.html

Regards,
Rick McGowan
Unicode, Inc.



Re: Font selection, font downloads, and (writing system) scripts

2004-11-21 Thread Rick McGowan
Fantasai wrote,

 This discussion belongs on www-style, so setting Reply-To to there.

And if you're going to do that then, as a matter of etiquette, please  
don't CC the Unicode list.

When you CC the Unicode list and some other list, people on the other list  
may try to "reply all" and include both lists. For hot topics, this can  
result in a cross-posting mess and people seeing half the story. And some  
people may get "you can't post here because you're not subscribed"  
messages.

Thanks,

Rick





Public Review Issues update - UAX #14

2004-11-05 Thread Rick McGowan
The Unicode Technical Committee has posted a new issue for public
review and comment. Details are on the following web page:

http://www.unicode.org/review/
http://www.unicode.org/reports/tr14/tr14-16.html

The review period for the new item closes on November 8, 2004.

Please see the page for links to discussion and relevant documents.
Briefly, the new issue is:


56  Proposed Update UAX #14 Line Breaking Properties

This is a proposed update to a previously approved Unicode Standard Annex.  
It incorporates some changes in Hangul syllable rules, word separators,  
U+00A0 as a base for combining marks, and other updates. The UTC is  
seeking public feedback on these changes.


If you have comments for official UTC consideration, please post them by
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode mail list, then please
use the following link to subscribe (if necessary). Please be aware
that discussion comments on the Unicode mail list are not automatically
recorded as input to the UTC. You must use the reporting link above
to generate comments for UTC consideration.

http://www.unicode.org/consortium/distlist.html

Regards,
Rick McGowan
Unicode, Inc.





Announcement: CLDR 1.2 Released

2004-11-03 Thread Rick McGowan
The Unicode Consortium is pleased to announce the release of new versions  
of the Common Locale Data Repository (CLDR 1.2) and the Locale Data Markup  
Language specification (LDML 1.2).

For more information on the contents of this release, see:
http://www.unicode.org/cldr/repository_access.html

For more information on the new version of LDML, see:
http://www.unicode.org/reports/tr35/

And for general information on CLDR, see:
http://www.unicode.org/cldr/index.html

Please note that the freeze date for the next version of CLDR is January  
15, 2005. All new data or defect reports for CLDR 1.3 must be submitted by  
then.


Regards,
Rick McGowan
Unicode, Inc.



Public Review Issues update

2004-11-02 Thread Rick McGowan
The Unicode Technical Committee has posted a new issue for public
review and comment. Details are on the following web page:

http://www.unicode.org/review/

The review period for the new item closes on November 8, 2004.

Please see the page for links to discussion and relevant documents.
Briefly, the new issue is:



55  Proposed Change to Character Properties for Two Katakana Characters

The UTC has received a proposal to change the General Category of two  
characters. Reports indicate that they should not have the General Category  
"Connector Punctuation" (gc=Pc) because the characters don't connect other  
elements; they separate elements. The two characters are:
 U+30FB KATAKANA MIDDLE DOT
 U+FF65 HALFWIDTH KATAKANA MIDDLE DOT
The proposal is to change the General Category of those characters from  
Pc (Connector Punctuation) to Po (Other Punctuation).
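
To see which General Category a given platform currently reports for these 
two characters, a small Python sketch along these lines can be used. It 
uses only the standard unicodedata module, and the result depends on the 
Unicode version shipped with the interpreter.

```python
import unicodedata

# The two characters named in this review issue.
for cp in (0x30FB, 0xFF65):
    ch = chr(cp)
    # category() returns the two-letter General Category code, e.g. Pc or Po.
    print(f"U+{cp:04X} gc={unicodedata.category(ch)} {unicodedata.name(ch)}")
```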



If you have comments for official UTC consideration, please post them by
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode mail list, then please
use the following link to subscribe (if necessary). Please be aware
that discussion comments on the Unicode mail list are not automatically
recorded as input to the UTC. You must use the reporting link above
to generate comments for UTC consideration.

http://www.unicode.org/consortium/distlist.html

Regards,
Rick McGowan
Unicode, Inc.



Public Review Issues Update

2004-10-28 Thread Rick McGowan
The Unicode Technical Committee has posted a new issue for public review  
and comment. Details are on the following web page:

http://www.unicode.org/review/

The review period for the new item closes on November 8, 2004.

Please see the page for links to discussion and relevant documents.  
Briefly, the new issue is:



54  Proposed Update UTS #22 Character Mapping Markup Language

This is a proposed update to a previously approved Unicode Technical   
Report. It will change to a Unicode Technical Standard, so the update   
includes a new conformance section. Included in the update are many   
editorial changes and explicit text about multiple-character mappings.



If you have comments for official UTC consideration, please post them by  
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode mail list, then please use  
the following link to subscribe (if necessary). Please be aware that  
discussion comments on the Unicode mail list are not automatically recorded  
as input to the UTC. You must use the reporting link above to generate  
comments for UTC consideration.

http://www.unicode.org/consortium/distlist.html

Regards,
Rick McGowan
Unicode, Inc.



Public Review Issues Update

2004-10-27 Thread Rick McGowan
The Unicode Technical Committee has posted a new issue for public
review and comment. Details are on the following web page:

http://www.unicode.org/review/

The review period for the new item closes on November 8, 2004.

Please see the page for links to discussion and relevant documents.
Briefly, the new issue is:


Proposed Draft UTR #33 Unicode Conformance Model

This proposed draft Unicode Technical Report explains the issue of  
conformance relating to the Unicode Standard so that users better  
understand the contexts in which products are making claims for support of  
the standard, and implementers better understand how to meet the formal  
conformance requirements while satisfying the expectations of their users.  
It does not alter, augment or override the actual Unicode conformance  
requirements. Rather it attempts to provide a conceptual framework to make  
it easier for users and implementers to identify and understand the  
specific conformance requirements.


If you have comments for official UTC consideration, please post them by
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode mail list, then please
use the following link to subscribe (if necessary). Please be aware
that discussion comments on the Unicode mail list are not automatically
recorded as input to the UTC. You must use the reporting link above
to generate comments for UTC consideration.

http://www.unicode.org/consortium/distlist.html

Regards,
Rick McGowan
Unicode, Inc.




Public Review Issues Update

2004-10-19 Thread Rick McGowan
The Unicode Technical Committee has posted three new issues for public
review and comment. Details are on the following web page:

http://www.unicode.org/review/

Review periods for the new items close on November 8, 2004.

Please see the page for links to discussion and relevant documents.  
Briefly, the new issues are:


50  Proposed Update UTS #18 Unicode Regular Expressions

This is a proposed update to a previously approved Unicode Technical  
Standard. The update includes some new notation, new notes on Compatibility  
Properties, and other changes. The UTC is seeking public feedback on these  
changes.


51  Proposed Update UAX #29 Text Boundaries

This is a proposed update to a previously approved Unicode Standard Annex.  
It contains some important changes in categories for some characters and
changes in linebreaking rules. The UTC is seeking public feedback on these  
changes.


52  Proposed Draft UTR #36 Security Considerations

This draft Unicode Technical Report describes some of the security  
considerations that should be taken into account by programmers, system  
analysts, standards-developers, and others when implementing the Unicode  
Standard and related technologies. The UTC is seeking public feedback on  
this document.



In addition to the above new items, the document for Public Review Issue  
#39 has been updated today, with minor changes:


39  Draft Unicode Technical Standard #31 Identifier and Pattern Syntax



If you have comments for official UTC consideration, please post them by
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode mail list, then please
use the following link to subscribe (if necessary). Please be aware
that discussion comments on the Unicode mail list are not automatically
recorded as input to the UTC. You must use the reporting link above
to generate comments for UTC consideration.

http://www.unicode.org/consortium/distlist.html

Regards,
Rick McGowan
Unicode, Inc.



Public Review Issues Update

2004-10-12 Thread Rick McGowan
The Unicode Technical Committee has posted three new public review issues.
Details are on the following web page:

http://www.unicode.org/review/

Briefly the new issues are:


47   Changes to default collation of Latin in UCA

In collation, searching, and matching according to the Unicode Collation  
algorithm, the 10 characters , , , , , , , , ,  (and  
their lowercase forms) currently have primary (base letter) differences  
from the letters A, D, H, L, and O respectively. There is a proposal before  
the UTC to change these to have secondary (accent) differences from AE, D,  
H, L, O, respectively. We would welcome feedback on this issue -- pro or  
con. Arguments for the change are in the background document. We expect to  
add the contrary point of view to that document.
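
To make the difference between a primary and a secondary distinction concrete,
here is a toy sketch in Python. All weights are invented for illustration and
are not taken from the UCA default table, and "ø" is used purely as an assumed
example of a letter related to O. The sketch only shows how moving a letter
from its own primary weight to a shared primary weight changes sort order.

```python
# Toy two-level collation keys; every weight here is hypothetical.

def sort_key(word, table):
    """Compare all primary weights first, then all secondary weights."""
    primaries = tuple(table[ch][0] for ch in word)
    secondaries = tuple(table[ch][1] for ch in word)
    return (primaries, secondaries)

# Model 1: "ø" is a separate base letter (primary difference from "o").
PRIMARY_MODEL = {"a": (1, 0), "o": (2, 0), "ø": (3, 0), "z": (4, 0)}
# Model 2: "ø" shares the primary weight of "o"; only the secondary differs.
SECONDARY_MODEL = {"a": (1, 0), "o": (2, 0), "ø": (2, 1), "z": (4, 0)}

words = ["øa", "oz", "oa"]
by_primary = sorted(words, key=lambda w: sort_key(w, PRIMARY_MODEL))
by_secondary = sorted(words, key=lambda w: sort_key(w, SECONDARY_MODEL))

print(by_primary)    # "ø" words sort after every "o" word
print(by_secondary)  # "ø" words interleave with "o" words
```

Under the first model a search for "o" would not match "ø" at primary strength;
under the second it would, which is the practical effect the proposal weighs.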



48   Definition of Directional Run

A definition of directional run is proposed for inclusion in UAX #9 The  
Bidirectional Algorithm. The UTC is seeking public feedback on this  
definition. See the background document for details.



49   Proposed Update UTS #6 A Standard Compression Scheme for Unicode

This is a proposed update to a previously approved Unicode Technical  
Standard. This standard describes a compression scheme (SCSU) mainly  
intended for use with short to medium length Unicode strings. A number of  
changes and clarifications have been made in the text, and the UTC is  
seeking public feedback on these changes.




The closing date for comments on these issues is 2004/11/08.

If you have comments for official UTC consideration, please post them by
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode mail list, then please
use the following link to subscribe (if necessary). Please be aware
that discussion comments on the Unicode mail list are not automatically
recorded as input to the UTC. You must use the reporting link above
to generate comments for UTC consideration.

http://www.unicode.org/consortium/distlist.html

Regards,
Rick McGowan
Unicode, Inc.




Public Review Issues Update (correction)

2004-10-12 Thread Rick McGowan
There has been a further update to the document for Public Review Issue  
#48 (Directional Run) to clarify and expand the proposed definition. If you  
have already reviewed the document, I apologize for the inconvenience. The  
revised document is linked from the review page:

http://www.unicode.org/review/

And may be accessed directly from the following URL:

http://www.unicode.org/review/pr-48.html

The closing date for comments on this issue remains 2004/11/08.

If you have comments for official UTC consideration, please post them by
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode mail list, then please
use the following link to subscribe (if necessary). Please be aware
that discussion comments on the Unicode mail list are not automatically
recorded as input to the UTC. You must use the reporting link above
to generate comments for UTC consideration.

http://www.unicode.org/consortium/distlist.html

Regards,
Rick McGowan
Unicode, Inc.




Two new tech notes posted

2004-10-08 Thread Rick McGowan
Two new Unicode Technical Notes have been posted. They are numbers 17  
and 18, indexed here:

http://www.unicode.org/notes/index.html

The new notes by P. Chellappan detail the conversion of two prevalent  
encodings for Tamil into Unicode.

Details about tech notes in general may be found here:

http://www.unicode.org/notes/about-notes.html

Regards,
Rick



CLDR 1.2 Alpha now available

2004-09-30 Thread Rick McGowan
The Unicode Consortium is pleased to announce that the alpha version of  
the Common Locale Data Repository (CLDR) 1.2 is available for public  
review. The contents include:

SPECIFICATION
* Updated Locale Data Markup Language (LDML) specification (UTS #35  
draft), and updated DTD.
* Added explicit documentation of Date Format Patterns, Number Format  
Patterns, Choice Patterns, Calendar Field
* Revised documentation of characters element, especially exemplar characters.
* Added alt attribute, references attribute, Inheritance and Validity  
specification
* Incorporated new model of time zone localization

DATA
* Added weekend data for most locales
* Added yes / no POSIX data to most locales
* Added locale data for new locales Oriya, Malayalam, Assamese, Welsh,
Dzongkha, Bhutan, Khmer and Lao
(All of these need vetting: in particular, Oriya and Malayalam.)
* Added significant amounts of data (country, language, currency, and type  
display names) to ar, bg, cs, el, he, hr, hu, is, mk, pl, ro, ru, sk, sl,  
sr, tr, uk
* Incorporated other fixes / additions according to bugs / feature
requests filed, for French (inc. Canadian), Afrikaans, Swedish, and others.

For more information please look at the following:

* General information on CLDR
http://www.unicode.org/cldr/

* Comparing CLDR data to platform data
http://www.unicode.org/cldr/comparison_charts.html

* Latest LDML
http://oss.software.ibm.com/cvs/icu/~checkout~/locale/docs/tr35.html
(Note that all LDML updates are to be backwards compatible.)

* For reporting bugs on the alpha
http://www.unicode.org/cldr/filing_bug_reports.html





Public Review Issues Update

2004-08-26 Thread Rick McGowan
The Unicode Technical Committee has posted new public review issues.
Details are on the following web page:

http://www.unicode.org/review/

Briefly the new issues are:


40   Encoding of Latin Capital and Small Letter At

LATIN CAPITAL LETTER AT and LATIN SMALL LETTER AT are used as orthographic  
characters in the Koalib language of Sudan. Although similar in appearance  
to COMMERCIAL AT, LATIN SMALL LETTER AT should have different character  
properties. The main concern is the similarity in appearance of LATIN SMALL  
LETTER AT to COMMERCIAL AT. There are potential implications for Internet  
protocols that use @.


41   Encoding of INVISIBLE LETTER

UTC is seeking feedback regarding a proposal to encode INVISIBLE LETTER  
to serve as an unambiguous base letter for combining marks in isolation.  
The character properties would be specifically designed to aid in  
processing. Details are provided in an accompanying document.


The closing date for comments on both issues is 2004/11/08.

If you have comments for official UTC consideration, please post them by
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode mail list, then please
use the following link to subscribe (if necessary). Please be aware
that discussion comments on the Unicode mail list are not automatically
recorded as input to the UTC. You must use the reporting link above
to generate comments for UTC consideration.

http://www.unicode.org/consortium/distlist.html

Regards,
Rick McGowan
Unicode, Inc.



Re: Problem with accented characters

2004-08-25 Thread Rick McGowan
Philippe wrote:

 Actually, it was based on decompositions in Unicode 2.01.

There is no such version. Perhaps you meant another version?

Rick



New mail list for African script issues

2004-08-19 Thread Rick McGowan
The list [EMAIL PROTECTED] is a new public forum for discussion of  
African scripts -- specifically technical issues and proposals -- including  
native scripts and imported scripts (e.g., Latin-based orthographies for  
African languages).

Please note that it is not a general discussion list. People interested in  
contributing to encoding proposals for African scripts, or reviewing  
proposal documents during preparation may subscribe themselves by following  
instructions below. (As usual, the list is closed to outside posting, so  
you must subscribe before posting.)

To subscribe to the list, send mail to ecartis @ unicode.org and put
"subscribe africa" in the subject line. You'll get a confirmation note, and
you must forward and reply to the note before your subscription is
effective.

To un-subscribe, send mail to ecartis @ unicode.org and put "unsubscribe
africa" in the subject line.

Rick




Public Review Issues Update - UTS #31 Draft

2004-07-27 Thread Rick McGowan
The Unicode Technical Committee has posted a new public review issue.
Details are on the following web page:

http://www.unicode.org/review/

Briefly the new issue is:


39  Draft Unicode Technical Standard #31 Identifier and Pattern Syntax

An updated draft of UTS #31 Identifier and Pattern Syntax is available  
at the above link. This draft has new conformance information as well as a  
new section on Normalization and Case and other changes. This document has  
implications for programming languages, regular expressions, and scripting  
languages.
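
As one illustration of how identifier syntax interacts with normalization in a
programming language, the sketch below uses Python, whose identifier rules are
derived from Unicode identifier properties. It is an example of a single
language's behavior, not a rendering of the draft's text, and the helper name
`is_identifier_after_nfkc` is made up for this illustration.

```python
import unicodedata

def is_identifier_after_nfkc(text):
    """Normalize to NFKC, then apply Python's own identifier check."""
    return unicodedata.normalize("NFKC", text).isidentifier()

# The last sample begins with U+FB01 LATIN SMALL LIGATURE FI.
samples = ["caf\u00e9", "1abc", "\ufb01le"]
for s in samples:
    normalized = unicodedata.normalize("NFKC", s)
    print(repr(s), repr(normalized), is_identifier_after_nfkc(s))
```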


Please note: The closing date for comments is 2004/08/12 (August 12).


If you have comments for official UTC consideration, please post them by
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode mail list, then please
use the following link to subscribe (if necessary). Please be aware
that discussion comments on the Unicode mail list are not automatically
recorded as input to the UTC. You must use the reporting link above
to generate comments for UTC consideration.

http://www.unicode.org/consortium/distlist.html

Regards,
Rick McGowan
Unicode, Inc.



Re: I didn't send the virus

2004-07-22 Thread Rick McGowan
Doug wrote...

 This didn't come from me.

Indeed, it didn't come from Doug. And it has been removed from the  
archives as well.

FYI... We have some software in place on Unicode.ORG to assist with  
filtering of attachments. However, we pass through .zip files because  
many of us who work with proposals and such need to be able to receive them  
on a regular basis. Viruses attached as .zip files do occasionally get  
through if they just happen to have forged return addresses belonging to  
list subscribers. (The list filters out mail from non-subscribers anyway.)  
It's fairly rare but it does happen. And I am always gratified to see a  
flood of bounces back from subscribers' systems objecting to the viral
content when one does go through. Many of you never see these viruses.

Cheers,
Rick



Public Review Issues Update

2004-07-16 Thread Rick McGowan
The Unicode Technical Committee has posted a new public review issue.  
Details are on the following web page:

http://www.unicode.org/review/

Briefly the new issue is:


38   Draft Unicode Technical Report #30 Character Foldings

An updated draft of UTR #30 Character Foldings is available. This update  
also provides a new set of draft data files for several types of character  
foldings. The Unicode Technical Committee especially seeks review of the  
data files.
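
For readers unfamiliar with the idea of a folding, the following Python sketch
shows two common kinds, case folding and a rough diacritic folding. It uses
only the standard library and does not read the draft data files mentioned
above; the function names are made up for this illustration, and the diacritic
step is an approximation built on canonical decomposition.

```python
import unicodedata

def fold_case(text):
    """Case folding as provided by Python's str.casefold()."""
    return text.casefold()

def fold_diacritics(text):
    """Decompose (NFD) and drop combining marks; an approximation only."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)

print(fold_case("Stra\u00dfe"))
print(fold_diacritics("r\u00e9sum\u00e9"))
```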

The closing date for comments is 2004/08/03.

If you have comments for official UTC consideration, please post them by
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode mail list, then please
use the following link to subscribe (if necessary). Please be aware
that discussion comments on the Unicode mail list are not automatically
recorded as input to the UTC. You must use the reporting link above
to generate comments for UTC consideration.

http://www.unicode.org/consortium/distlist.html

Regards,
Rick McGowan
Unicode, Inc.



Unicode Technical Report #23 Published

2004-07-15 Thread Rick McGowan
The Unicode Technical Committee has published a new Technical Report:

UTR #23 The Unicode Character Property Model

This technical report covers a conceptual model of character properties  
defined in the Unicode Standard.

The report can be obtained at the following URL:

http://www.unicode.org/reports/tr23/

Regards,
Rick McGowan
Unicode, Inc.






Announcement: Unicode Technical Note #15

2004-07-15 Thread Rick McGowan
The Unicode Consortium announces the availability of a new Unicode  
Technical Note:

Unicode Technical Note #15
Text Conversion from TSCII 1.7 to Unicode
by Muthu Nedumaran
http://www.unicode.org/notes/tn15/

Summary: This document is written to assist with the conversion of TSCII  
encoded Tamil text to Unicode. TSCII is an 8-bit glyph encoding scheme used  
to exchange and store electronic text in the Tamil language prior to the  
emergence of Unicode enabled platforms that support Tamil Unicode.


The main Unicode Technical Notes page is available here:
http://www.unicode.org/notes/

Unicode Technical Notes provide informative material that may be of  
interest to users of the specifications published by the Unicode  
Consortium. Technical Notes are neither reviewed nor approved by the  
Unicode Technical Committee. Their publication does not imply endorsement  
by the Unicode Consortium in any way. For more information see
http://www.unicode.org/notes/about-notes.html


Regards,
Rick McGowan
Unicode, Inc.




Re: Greek tonos and oxia

2004-06-30 Thread Rick McGowan
See also the FAQ on Greek:
http://www.unicode.org/faq/greek.html





APL mapping tables

2004-06-28 Thread Rick McGowan
Does anyone know of a mapping table from APL character set to Unicode? I'm  
looking for something that maps APL to Unicode numerically, in a format  
similar to the various mapping tables on the Unicode site.
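
For reference, a table in that style can be read with a few lines of Python.
This sketch assumes the common three-column layout (source code in hex, Unicode
code point in hex, then a "#" comment with the character name); the sample rows
are hypothetical placeholders and are not an actual APL mapping.

```python
SAMPLE_TABLE = """\
# Hypothetical sample rows; not an actual APL mapping table.
0x41\t0x0041\t# LATIN CAPITAL LETTER A
0x42\t0x0042\t# LATIN CAPITAL LETTER B
"""

def parse_mapping(text):
    """Return {source code: Unicode code point} from a mapping-table string."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # source code listed without a Unicode mapping
        mapping[int(fields[0], 16)] = int(fields[1], 16)
    return mapping

table = parse_mapping(SAMPLE_TABLE)
print({hex(k): f"U+{v:04X}" for k, v in table.items()})
```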

Thanks,
Rick



Re: Bob Bemer, father of ASCII, has died

2004-06-24 Thread Rick McGowan
A 95 byte salute to Bob Bemer:

 !"#$%&'()*+,-./
0123456789:;<=>?
@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_
`abcdefghijklmnopqrstuvwxyz{|}~
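
The 95 bytes in question are the printable ASCII characters, U+0020 (space)
through U+007E (tilde). A short Python sketch that rebuilds the sequence and
confirms the count:

```python
# Printable ASCII runs from 0x20 (space) through 0x7E (tilde), inclusive.
printable = "".join(chr(cp) for cp in range(0x20, 0x7F))

# Each of these characters encodes to exactly one byte in ASCII.
byte_count = len(printable.encode("ascii"))
print(byte_count)
print(printable)
```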



Rick



Re: lines 05-08, version 4.7 of Roadmap to BMP and 'Hebrew extensions'

2004-06-24 Thread Rick McGowan
Elaine asked...

 Questions about Roadmap to BMP---
 1) from Hebrew in line 05 to end of line 08,
 are these *all* right-to-left languages?

Not right now. However, Tifinagh was just accepted by UTC and WG2, and has  
been moved out of that slot, so that line 08 *will* be all right to left,  
and Tifinagh will (likely) be encoded up at U+2D30. Some other stuff is  
also moving around. The roadmap will soon be updated to reflect the  
movement of Tifinagh, etc.

 2) In line 08, can I formally request insertion of a
 'Hebrew Extensions' section next to 'Samaritan'--do I
 do this via the online reporting URL or ?

No. Blocks are created and populated as proposals are accepted. The  
roadmap is just a guide.

 Or do I need to write a short proposal asking to
 change line 08?

There is no need to request any change to it.

 I want to suggest that Babylonian vowels should also
 be considered for BMP insertion.

Then you should write a proposal for them.

Rick




Re: lines 05-08, version 4.7 of Roadmap to BMP and 'Hebrew extensions'

2004-06-24 Thread Rick McGowan
Elaine,

 Copying part of Rick's answer back to list, didn't
 seem private.

Yes, it was indeed CCed to the list. This is also CCed to the list.

 How many code points are there in the Roadmap's line
 08?---I still don't understand hex well enough to
 compute

You need to look at Unicode 4.0 as well. There are 30 codepoints not  
encoded within the Hebrew block on line 05 of the Roadmap. Nothing *else* is
currently proposed for allocation to Hebrew. If a proposal is received and  
accepted, then we would look at where to allocate things. That might mean  
opening a section in 08.
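
For anyone who finds the hex arithmetic awkward, the size of a code point range
is simply last minus first plus one. The Python sketch below assumes that a
roadmap "line" corresponds to one row of 256 code points (so line 08 would be
U+0800..U+08FF); the Hebrew block range U+0590..U+05FF is the published block
boundary. The helper name `range_size` is made up for this illustration.

```python
def range_size(first, last):
    """Number of code points in the inclusive range first..last."""
    return last - first + 1

HEBREW_BLOCK = (0x0590, 0x05FF)
ROW_08 = (0x0800, 0x08FF)  # assumption: roadmap line 08 is this row

print(range_size(*HEBREW_BLOCK))
print(range_size(*ROW_08))
```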

 Hebrew script, even without any 'artificial'
 disunification, still could potentially fill 3-4x as
 many code points as it currently has.

No problem. We have 16 other planes in which to put things, and Hebrew may  
spill over. There would naturally be much consideration to putting things  
on the BMP for scripts that are already allocated to the BMP, such as  
Hebrew, but space is filling up quickly.

 I actually wasn't sure about Avestan's directionality.

Right to left.

 I mistakenly thought Tifinagh was rtl.

That's OK. It has been, and sometimes still is, written right to left,  
hence it was roadmapped in a right-to-left allocation block. However, in  
modern usage, and in the Moroccan national standard now being drafted, it  
is specifically left to right. That is how it will be encoded. Hence it was  
moved away from the block of right to left scripts.
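
With a current Unicode database, the directionality of encoded characters can
be checked from their Bidi_Class property. The Python sketch below is an
illustration only: it uses U+2D30 for Tifinagh as anticipated in this thread,
and U+10B00 for Avestan, which is where that script was encoded later and is
not something stated in this message. The helper name `bidi_class` is made up.

```python
import unicodedata

SAMPLES = {
    "Hebrew": "\u05d0",        # HEBREW LETTER ALEF
    "Tifinagh": "\u2d30",      # first Tifinagh letter
    "Avestan": "\U00010b00",   # assumption: first Avestan letter
}

def bidi_class(ch):
    """Return the Bidi_Class ("R", "L", ...) from the local Unicode data."""
    return unicodedata.bidirectional(ch)

for script, ch in SAMPLES.items():
    print(script, f"U+{ord(ch):04X}", bidi_class(ch))
```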

 I do actually have two brief suggestions that are
 important to Semitists, for the BMP and the SMP...

Submit your doc whenever you are ready.

 I wrote the preliminary Babylonian and Samaritan
 proposals in May.

UTC saw them. As usually happens with documents that are submitted in  
preliminary fashion, people were informed of the docs, and that you are  
receiving comments. UTC will be awaiting your updated proposals for the  
August meeting, or whenever they are ready. (This is a non-official  
response. With preliminary docs, UTC doesn't usually respond officially or  
take any recorded action.)

Rick



Public Review Issues Updated

2004-06-24 Thread Rick McGowan
The Unicode Technical Committee has posted resolutions for several public  
review issues. Details are on the following web page:

http://www.unicode.org/review/resolved-pri.html

Please see the page for links to discussion and relevant documents.

Briefly, the resolved issues are:

20 Draft UTR #31 Identifier and Pattern Syntax
29 Normalization Issue
30 Bengali Khanda Ta
32 Proposed Update UTR #23 Character Property Model
35 Encoding of LATIN SMALL LETTER C WITH STROKE
36 Draft UTR #30 Character Foldings

The following public review issues are still open, and their closing dates  
have been extended to August 3, 2004:

25  Proposed UTR #17 Character Encoding Model
31  Cantonese Romanization
33  UTF Conversion Code Update
34  Draft UTS #35 LDML

If you have comments for official UTC consideration, please post them by
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode mail list, then please
use the following link to subscribe (if necessary). Please be aware
that discussion comments on the Unicode mail list are not automatically
recorded as input to the UTC. You must use the reporting link above
to generate comments for UTC consideration.

http://www.unicode.org/consortium/distlist.html

Regards,
Rick McGowan
Unicode, Inc.



Re: Response to a Proposal to Encode Phoenician in Unicode

2004-06-09 Thread Rick McGowan
Dean, et al,

Can we stop with the cross posting here? This is just my opinion, but  
conversations cross-posted to several lists end up with some people not  
being able to read the responses posted elsewhere (for lists they aren't  
on), or being frustrated to post undeliverable replies (for lists they  
aren't on). And nobody ends up with a complete archive of the conversation.

Rick



New versions of the Common Locale Data Repository (CLDR 1.1)

2004-06-08 Thread Rick McGowan
The Unicode® Consortium announced today the release of new versions of the  
Common Locale Data Repository (CLDR 1.1) and the Locale Data Markup  
Language specification (LDML 1.1), providing key building blocks for  
software to support the world's languages. This new release contains data  
for 247 locales, covering 78 languages and 118 countries. There are also 36  
draft locales in the process of being developed, covering an additional 17  
languages and 7 countries.

For more information, see http://news.google.com/news?q=CLDR

Regards,
Rick McGowan
Unicode, Inc.





Re: Updated Phoenician proposal: confidential?

2004-06-02 Thread Rick McGowan
Peter Kirk wrote...

... I suppose he didn't 
want to put his proposal at risk by describing how the user community 
was, at least in part, opposed to the proposal.

That imputes to him a motive that I doubt he had. My impression is that 
Michael didn't know there would be such violent opposition, or indeed 
*any* opposition, when he posted it.

Rick



Re: Updated Phoenician proposal: confidential?

2004-06-02 Thread Rick McGowan
Peter Kirk wrote..
  You, Rick, also replied on 22 December 2003 to the same posting
  of mine, so you can't claim to be ignorant of this discussion.
  You wrote:

You can't just call for a review and expect anything to happen. 
Please document your opinions and document some facts. If you have
a different model of Aramaic, Phoenician, and related scripts,...

Ah yes, I'd forgotten about that earlier discussion. You called for a 
review, but you never submitted any model document to UTC.

Anyway, this Phoenician discussion has become pretty old and isn't going 
much of anywhere on this list, so I'll sign off for now.

Rick



Re: Updated Phoenician proposal: confidential?

2004-05-29 Thread Rick McGowan
Peter Kirk wrote...

 I understand that a revised version of Final proposal for encoding
 the Phoenician script [WG2-N2746R] has been submitted to the UTC
 and included in the official document register.

Posted Friday night, yes. I insisted on receiving it, and I postponed my  
Friday evening dinner to upload it to the register and announce it.

 Will this document be made public? Or is there an intention to conceal
 it from the public, or from the user community of the scripts in
 question?

What an absurd insinuation. I am mortified and demand an apology. I'm sure  
Mr Everson would also demand an apology. The fact that this document was  
posted first to the UTC doc register reflects only my faster-than-light  
reflexes, being the person who insisted most emphatically on a revised  
proposal.

 Will this fullness of time allow time for interested
 parties to comment to the UTC and to WG2 before the proposal is
 discussed by them? I am sure that these committees will want to make
 sure of this.

I find your tone and insinuations offensive. The fullness of time for  
public posting of the document does not necessarily depend on Mr. Everson,  
it depends more on when the WG2 convenor posts the document!

 I am certain that WG2 will not be able to accept any proposal which
 has not been made public and on which the user community has not been
 given the opportunity to comment.

You mistake the procedure. One makes a document and submits it to the WG2   
chair, who then is at liberty to post it, or not post it, on the WG2  
website, at his sole discretion. The WG2 website happens to be publicly  
accessible, and posting a document there *is* the act of making public. So  
it is not that WG2 accepts only public documents; it is that the  
document register of WG2 is open to the public.

 I have also accepted that this particular script should be
 encoded, but that certain other specific definitions should be
 made to enshrine within the standard the special close relationship
 between the various 22 character Semitic scripts.

After a month of rather unpleasant wrangling, it is a relief to hear you  
publicly proclaim that you accept the encoding of Phoenician. Please do  
propose some wording for other specific definitions and submit a document  
with your suggestions. You would have at least a year, or two, between the  
time Phoenician is accepted for encoding and the time a block intro to it
would be published. I'm sure the committee will welcome your input.

Since you have now concluded that the Phoenician script *should* be  
encoded, a brief statement to that effect submitted to the Unicode online  
Reporting Form would make it into the UTC record, and be appreciated.

Rick



ISO 15924 beta period is now over

2004-05-29 Thread Rick McGowan
The following notice has been received by the Unicode office. I pass it  
along to interested parties:


The ISO 15924 beta period is over. The tables are all
now generated from a single file; the differences from
the release version are reflected on the code changes page.

I would like to thank everyone who helped fix the problems
we had with the tables.

Michael Everson
ISO 15924 Registrar


To submit official comments on ISO 15924, please see the on-line reporting  
form here:

http://www.unicode.org/reporting.html

There is a selection box on the form for ISO 15924 issue or comment.

The standard and related documents, including a request form for additions  
and changes, are available here:

http://www.unicode.org/iso15924/

Regards,

Rick McGowan
Unicode, Inc.




Re: Updated Phoenician proposal: confidential?

2004-05-29 Thread Rick McGowan
 I will make a statement for the UTC record, but it will not be
 as brief as you seem to expect.

Hmmm. You apparently misunderstood my intent. I said "brief statement"
merely intending to save you trouble if you don't have time or inclination  
to make a long statement. You are of course welcome to comment, or not, at  
whatever length you choose.

Rick



Re: Glyph Stance

2004-05-28 Thread Rick McGowan
Bob Richmond discussed...

 Recap. Michael's 'n1944' proposal for Egyptian Hieroglyphs in Unicode
 (1999)

Just FYI, the control codes were a rather controversial feature of that  
proposal. It would also be worth surveying (again) the use of controls in  
existing Egyptian implementations.

 I understand the UTC position was in favour of coding a basic
 (partial-Gardiner) character set but deferring the larger corpus
 and control elements. This would have been useful and fine to
 build on incrementally but IMO 5 years on, it is not only
 possible but highly desirable to go further than this.

UTC at last check was still in favor of encoding the Gardiner set, minus  
control codes, just as soon as someone is able to come up with a revised  
proposal for the 700+ characters. Funding and time are the current  
inhibitors to work on the proposal, as I understand it. And I'm afraid at  
this point, Egyptian is nowhere near being encoded.

Rick





Re: PH technical issues

2004-05-28 Thread Rick McGowan
D. Starner wrote:

 Shouldn't the encoding be geared towards those who use it the most?

Who use *what* the most?

 So far, all the people who actually use this script on a day to
 day basis who have actually spoken up have been in favor of
 unification.

By "this script" presumably you mean Phoenician script, not Hebrew script.
On this list we've heard at length from Semiticists who disfavor encoding  
Phoenician script -- which they don't use, since they report using  
Hebrew-script transliteration, or refer to them as the same thing, etc.

Speaking up on this list isn't a criterion for anything. We don't happen  
to have *on this list* any of the actual daily basis Phoenician-script  
user community. If we did, we might have heard from them as well.

Deborah Anderson has, off-list, reported contact with scholars other than  
herself who favor the encoding of Phoenician. But if by "scholar" you have a
narrow meaning requiring a PhD, I think she may be the only scholar who  
has *on this list* reported that she favors encoding of Phoenician script.  
Here:

http://www.unicode.org/mail-arch/unicode-ml/y2004-m05/0083.html

That was on May 2, which is ancient history by now. Others may correct me  
if I'm leaving anyone out.

Rick



Re: Why Fraktur is irrelevant (was RE: Fraktur Legibility (was Re:Response to Everson Phoenician)

2004-05-26 Thread Rick McGowan
Personally speaking, at this juncture, I usually yawn and hit the delete  
button when I see the word "Phoenician" on this list. The discussion has
gone way past any sane argument.

However, Peter Kirk asked a question to which I have a response.

 ... we need to ask a more general question: should the
 UTC encode scripts for which there is a (small, in this case)
 demand but no technical justification?

Do you even have to ask this question? If so, I have to think you haven't  
been listening at all. There are technical justifications for the encoding,  
but you are either failing to listen to them, or are refusing to believe  
that some of the justifications are technical. I will not repeat any
arguments here; I've really had enough Phoenician.

It's my personal opinion that yes, UTC *should* encode Phoenician  
precisely because there is a group of scholars and others who have  
indicated they desire its encoding and would use it, and there *are*  
technical justifications which appeal to those Phoenician proponents, even  
if you won't acknowledge them as such. It's apparent that one reason you  
won't acknowledge any technical issues is that you disagree on first  
principles and refuse to acknowledge any other needs or viewpoints than  
your own.

As far as I'm concerned, that's about the end of the discussion.

Personally speaking only, and not any official policy, and not speaking  
for any UTC member, etc,

Rick



New Public Review Issue posted

2004-05-25 Thread Rick McGowan
The Unicode Technical Committee has posted a new issue for public
review and comment. Details are on the following web page:

http://www.unicode.org/review/

Review period for the new item closes on June 8, 2004.

Please see the page for links to discussion and relevant documents.
Briefly, the new issue is:


Draft Unicode Technical Report #30 Character Foldings   2004.06.08

An updated draft of UTR #30 Character Foldings is now available. This
update also provides draft data files for four types of character foldings.
The Unicode Technical Committee especially seeks review of the data files.


If you have comments for official UTC consideration, please post them by
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode mail list, then please
use the following link to subscribe (if necessary). Please be aware
that discussion comments on the Unicode mail list are not automatically
recorded as input to the UTC. You must use the reporting link above
to generate comments for UTC consideration.

http://www.unicode.org/consortium/distlist.html

Regards,
Rick McGowan
Unicode, Inc.



Re: Proposal to encode dominoes and other game symbols

2004-05-25 Thread Rick McGowan
Ken wrote...

 P.S. Regarding the dominoes per se, I'm coming down on the side of
 those arguing (as John Cowan has) that the *orientation* of the bones
 is not significant in the plain text usages. The *characters* to
 encode here should be for each distinct bone, regardless of
 orientation.

That is also my opinion.

 John that going beyond the double-twelve (for now) is just speculative
 and not supported by actual use in dominoes books.

I don't think this is speculative. A photograph of production domino sets  
above 12 is included in the proposal. We might as well add them now as  
later.

Rick



Re: ISO 15924

2004-05-21 Thread Rick McGowan
 Use of &#x200B; is perfectly appropriate to allow line breaks.
 What is not yet being done is to *disallow* line breaks in the dates;
 that is a mistake, since IE will break in dates and numbers, e.g.
 the number -
 3.

Yes, but... In this particular set of files, no matter *HOW* narrow I made  
the windows, I couldn't seem to get Netscape to line-break within the date  
field. IE line-breaks more stupid^H^H^H^H^H^Hreadily, but you have to make  
the window unreadably narrow to do so. It thus seemed not important to  
insert break positions before every "-" in the whole file, or even in the
date field(s). For IE we could just set "nowrap" on the cells (or even just
the first cell!), instead of inserting ZWSPs all over. I would find that
preferable.
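
To make the two options concrete, here is a minimal sketch in Python that generates one table row each way. The helper names `row_with_nowrap` and `row_with_zwsp` and the sample cell values are hypothetical, invented for illustration; they are not taken from the ISO 15924 files or the scripts that build them. The only HTML feature assumed is the legacy `nowrap` attribute on `<td>`.

```python
from html import escape

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE, a permitted line-break position


def row_with_nowrap(cells):
    """Render a row whose first cell carries the legacy HTML nowrap attribute."""
    first, *rest = cells
    parts = [f"<td nowrap>{escape(first)}</td>"]
    parts += [f"<td>{escape(c)}</td>" for c in rest]
    return "<tr>" + "".join(parts) + "</tr>"


def row_with_zwsp(cells):
    """Render a row with a ZWSP inserted before every hyphen in every cell."""
    parts = [f"<td>{escape(c).replace('-', ZWSP + '-')}</td>" for c in cells]
    return "<tr>" + "".join(parts) + "</tr>"


if __name__ == "__main__":
    sample = ["2004-05-21", "hypothetical entry"]  # made-up data
    print(row_with_nowrap(sample))
    print(row_with_zwsp(sample))
```

The first variant touches one attribute per row and leaves the cell text alone; the second rewrites the text of every cell. In current HTML the same effect as `nowrap` is normally obtained with the CSS declaration `white-space: nowrap`.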

Rick



Response to Everson Phoenician and why June 7?

2004-05-19 Thread Rick McGowan
Elaine asked:

 Why did Debbie suggest June 7 as the latest date for
 responses?

Probably because that is the deadline for documents to be submitted for  
consideration at the upcoming UTC meeting. The issue will be discussed  
there, so anyone who wants to get their input into that meeting should do  
it soon.

Rick



Subject lines that have nothing to do with message content

2004-05-10 Thread Rick McGowan
Personally speaking, I would have expected that a recent message on this  
list with the subject line "Katakana_Or_Hiragana" might have something to
do with Japanese, Hiragana, Katakana, or at least Han, or perhaps even  
Asia. But no... It was about Phoenician.

It would be really helpful if people could use subject lines that have  
something to do with the subject of the message.

It just can't be that difficult for people to pick a reasonable subject  
line. And if you're going to go off-topic in a thread, you might consider  
getting a different subject line -- or at least adding a parenthetical  
about how you're going to go off the thread...

(As usual, this is my personal opinion and doesn't reflect an official  
policy, etc.)

Rick


