Re: ICU and Parrot
On Sat, Jun 01, 2002 at 02:20:15AM +0900, Dan Kogai wrote:
> > 2) If not, would an Encode::ICU be wise?
>
> I'm not so sure. But if I were the one to implement Encode::ICU, it
> will not be just a compiled collection of UCM files but a wrapper to all
> library functions that ICU has to offer. I, for one, am too lazy for
> that.

That would be Text::Uconv's job, wouldn't it? Then Encode::ICU could just
interface to that module instead.

> > 3) A number of encodings are in HanExtra but not their ucm repository,
> >    namely big5plus, big5ext and cccii. Is it wise to feed back to them
> >    under the name of e.g. perl-big5plus.ucm?
>
> You should in time and I should, too, because I have expanded UCM a
> little so that you can define combined characters commonly seen in
> Mac*. But I don't see any reason to be in a hurry for the time being.

Understood.

On a related note, http://www.li18nux.org/docs/html/CodesetAliasTable-V10.html
has spurred quite a bit of discussion in Taiwan because of the mandated
standardization of Big5 => TCA-BIG5 and Big5-HKSCS => HKSCS-BIG5 (i.e. the
standards body comes first in the name). But it struck me as making a lot
of sense, if in a rather rigid way.

Should Encode.pm perhaps add them to the Alias table, in the name of
'practical'? In particular, supporting CP-xxx (=> CPxxx) and ISO-646-US
(=> US-ASCII) should be rather beneficial.

/Autrijus/
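The alias question above can be made concrete with a small sketch. This is not Encode.pm's code or its Alias table; it is a language-neutral illustration in Python of how a table with exact entries plus one pattern rule could cover the names mentioned in the message. The function name `resolve_alias`, both table names, and the lowercase canonical names on the right-hand side are all assumptions made for the example.

```python
import re

# Hypothetical exact-match aliases (keys are lowercased input names).
# The canonical names on the right are illustrative only; they are not
# taken from Encode.pm's actual alias table.
EXACT_ALIASES = {
    "tca-big5": "big5",
    "hkscs-big5": "big5-hkscs",
    "iso-646-us": "us-ascii",
}

# Hypothetical pattern rules: (compiled regex, replacement template).
# This single rule covers the whole CP-xxx family instead of listing
# every code page number separately.
PATTERN_ALIASES = [
    (re.compile(r"cp-(\d+)"), r"cp\1"),
]


def resolve_alias(name: str) -> str:
    """Map an encoding name to its canonical form, case-insensitively."""
    key = name.strip().lower()
    if key in EXACT_ALIASES:
        return EXACT_ALIASES[key]
    for pattern, template in PATTERN_ALIASES:
        match = pattern.fullmatch(key)
        if match:
            return match.expand(template)
    # Unknown names pass through unchanged (apart from lowercasing).
    return key


if __name__ == "__main__":
    for raw in ("TCA-BIG5", "HKSCS-BIG5", "CP-1252", "ISO-646-US"):
        print(raw, "=>", resolve_alias(raw))
```

The pattern rule is the part that matters for the CP-xxx case: one entry handles every numbered code page, which keeps such a table short.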
22nd Unicode Conference, September 2002, San Jose, CA, USA
Twenty-second International Unicode Conference (IUC22)
Unicode and the Web: Evolution or Revolution?
http://www.unicode.org/iuc/iuc22
September 9-13, 2002
San Jose, California

Mark your diary! >> 14 weeks to go

The software industry continues its rapid growth and change. In this
year alone, Unicode 3.2 was released and several new proposals for the
Internet and the World Wide Web were promoted to standards. Web Services
is the latest buzz. Are the vendors of software that support these
technologies keeping up? How can you be sure that you are deploying
software components that work well together today and in the future?

This Conference is where you go to find out. Experts will describe the
latest changes to the Unicode standard and the other standards used for
e-business today. You will also learn about the best practices for
utilizing, integrating and deploying these technologies based on
real-world examples and experience. Demonstrations are often provided.

Conference attendees are generally involved in either the development,
deployment or use of Unicode software or content, or the globalization
of software and the Internet. They include managers, software engineers,
systems analysts, font designers, graphic designers, content developers,
technical writers, and product marketing personnel.

CONFERENCE WEB SITE, PROGRAM and REGISTRATION

The Conference Program and Registration form will be available soon at
the Conference Web site: http://www.unicode.org/iuc/iuc22

CONFERENCE SPONSORS

Agfa Monotype Corporation
Basis Technology Corporation
Microsoft Corporation
Netscape Communications
Oracle Corporation
Reuters Ltd.
Sun Microsystems, Inc.
World Wide Web Consortium (W3C)

GLOBAL COMPUTING SHOWCASE

Visit the Showcase to find out more about products supporting the
Unicode Standard, and products and services that can help you
globalize/localize your software, documentation and Internet content.
For details, visit the Conference Web site.

CONFERENCE VENUE

The Conference will take place at:

DoubleTree Hotel San Jose
2050 Gateway Place
San Jose, CA 95110 USA
Tel: +1 408 453 4000
Fax: +1 408 437 2898

CONFERENCE MANAGEMENT

Global Meeting Services Inc.
8949 Lombard Place, #416
San Diego, CA 92122, USA
Tel: +1 858 638 0206 (voice)
     +1 858 638 0504 (fax)
Email: [EMAIL PROTECTED]
   or: [EMAIL PROTECTED]

THE UNICODE CONSORTIUM

The Unicode Consortium was founded as a non-profit organization in 1991.
It is dedicated to the development, maintenance and promotion of The
Unicode Standard, a worldwide character encoding. The Unicode Standard
encodes the characters of the world's principal scripts and languages,
and is code-for-code identical to the international standard ISO/IEC
10646. In addition to cooperating with ISO on the future development of
ISO/IEC 10646, the Consortium is responsible for providing character
properties and algorithms for use in implementations. Today the
membership base of the Unicode Consortium includes major computer
corporations, software producers, database vendors, research
institutions, international agencies and various user groups.

For further information on the Unicode Standard, visit the Unicode Web
site at http://www.unicode.org or e-mail <[EMAIL PROTECTED]>

* * * * *

Unicode(r) and the Unicode logo are registered trademarks of Unicode,
Inc. Used with permission.
Re: ICU and Parrot
On Saturday, June 1, 2002, at 12:34 AM, Autrijus Tang wrote:
> On Fri, May 31, 2002 at 06:18:55AM +0900, Dan Kogai wrote:
>> As a matter of fact GB18030 is ALREADY supported via Encode::HanExtra
>> by Autrijus Tang. The only reason GB18030 was not included in Encode
>> main is the sheer size of the map.
>
> Yes, partly because it was not implemented algorithmically. :)
>
> I was browsing http://www-124.ibm.com/cvs/icu/charset/data/ucm/ and
> toying with uconv, and wondered:
>
> 1) Does Encode have (or intend to have) them all covered?

No, unless they appear on www.unicode.org, though some of them are
actually adopted. Useful as it may be, I found raw ICM too Big and too
Blue :)

> 2) If not, would an Encode::ICU be wise?

I'm not so sure. But if I were the one to implement Encode::ICU, it
will not be just a compiled collection of UCM files but a wrapper to all
library functions that ICU has to offer. I, for one, am too lazy for
that.

> 3) A number of encodings are in HanExtra but not their ucm repository,
>    namely big5plus, big5ext and cccii. Is it wise to feed back to them
>    under the name of e.g. perl-big5plus.ucm?

You should in time and I should, too, because I have expanded UCM a
little so that you can define combined characters commonly seen in
Mac*. But I don't see any reason to be in a hurry for the time being.

If any of you are members of team ICU, you may redirect this dialogue
to your team so we can work together in the future (after 5.8.0, that
is).

Dan the Encode Maintainer
Re: ICU and Parrot
On Fri, May 31, 2002 at 06:18:55AM +0900, Dan Kogai wrote:
> As a matter of fact GB18030 is ALREADY supported via Encode::HanExtra by
> Autrijus Tang. The only reason GB18030 was not included in Encode main
> is the sheer size of the map.

Yes, partly because it was not implemented algorithmically. :)

I was browsing http://www-124.ibm.com/cvs/icu/charset/data/ucm/ and toying
with uconv, and wondered:

1) Does Encode have (or intend to have) them all covered?

2) If not, would an Encode::ICU be wise?

3) A number of encodings are in HanExtra but not their ucm repository,
   namely big5plus, big5ext and cccii. Is it wise to feed back to them
   under the name of e.g. perl-big5plus.ucm?

Thanks,
/Autrijus/
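The message does not say which parts of GB18030 could be computed rather than tabulated. One range is purely arithmetic, though: the four-byte sequences for the supplementary planes (U+10000 to U+10FFFF) follow a linear formula. The Python sketch below shows that range only. It is an illustration, not code from Encode::HanExtra, and the function name `gb18030_supplementary` is hypothetical.

```python
# Linear index of the four-byte sequence 0x90 0x30 0x81 0x30, which is
# where U+10000 lands: (0x90 - 0x81) * 10 * 126 * 10 = 189000.
SUPPLEMENTARY_BASE = 189000


def gb18030_supplementary(code_point: int) -> bytes:
    """Encode a supplementary-plane code point as GB18030 without a table."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF is handled by this sketch")
    index = code_point - 0x10000 + SUPPLEMENTARY_BASE
    # Four-byte layout: byte 1 in 0x81..0xFE, byte 2 in 0x30..0x39,
    # byte 3 in 0x81..0xFE (126 values), byte 4 in 0x30..0x39.
    index, b4 = divmod(index, 10)
    index, b3 = divmod(index, 126)
    b1, b2 = divmod(index, 10)
    return bytes([0x81 + b1, 0x30 + b2, 0x81 + b3, 0x30 + b4])


if __name__ == "__main__":
    # Compare against Python's built-in gb18030 codec.
    for cp in (0x10000, 0x20000, 0x10FFFF):
        assert gb18030_supplementary(cp) == chr(cp).encode("gb18030")
        print(f"U+{cp:X} => {gb18030_supplementary(cp).hex()}")
```

The two-byte region and most BMP four-byte mappings still need lookup data, so a rule like this shrinks a map-based implementation without replacing it.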