Re: ICU and Parrot

2002-05-31 Thread Autrijus Tang

On Sat, Jun 01, 2002 at 02:20:15AM +0900, Dan Kogai wrote:
> >2) If not, would an Encode::ICU be wise?
> I'm not so sure.  But if I were the one to implement Encode::ICU, it
> would not be just a compiled collection of UCM files but a wrapper to all
> library functions that ICU has to offer.  I, for one, am too lazy for
> that.

That would be Text::Uconv's job, wouldn't it? Then Encode::ICU could just
interface to that module instead.

> >3) A number of encodings are in HanExtra but not their ucm repository,
> >   namely big5plus, big5ext and cccii. Is it wise to feed back to them
> >   under the name of e.g. perl-big5plus.ucm?
> You should in time and I should, too, because I have expanded UCM a
> little so that you can define combined characters commonly seen in
> Mac*.  But I don't see any reason to be in a hurry for the time being.

Understood.

On a related note:

http://www.li18nux.org/docs/html/CodesetAliasTable-V10.html

has spurred quite a bit of discussion in Taiwan because of the mandated
standardization of Big5 => TCA-BIG5, and Big5-HKSCS => HKSCS-BIG5 (i.e.
the standards body comes first).  But it struck me as making a lot of
sense, if in a rather rigid way.

Should Encode.pm perhaps add them to the alias table, in the name of
practicality? In particular, supporting CP-xxx (=> CPxxx) and ISO-646-US
(=> US-ASCII) should be rather beneficial.
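
To make the idea concrete, here is a rough sketch of what such an alias
lookup could look like. It is written in Python purely for illustration
and is not how Encode.pm implements aliases. The function name
resolve_alias, the table contents, and the choice of "Big5" and
"Big5-HKSCS" as target names are all assumptions taken from the examples
above.

```python
import re

# Hypothetical exact-match aliases, keyed by upper-cased alias name.
# The target names are assumptions based on the names used in this thread.
EXACT_ALIASES = {
    "ISO-646-US": "US-ASCII",
    "TCA-BIG5": "Big5",
    "HKSCS-BIG5": "Big5-HKSCS",
}

# Hypothetical pattern rule: CP-xxx is treated as an alias for CPxxx.
CP_PATTERN = re.compile(r"^CP-(\d+)$")


def resolve_alias(name: str) -> str:
    """Return the target name for a known alias, or the name unchanged."""
    key = name.strip().upper()
    if key in EXACT_ALIASES:
        return EXACT_ALIASES[key]
    match = CP_PATTERN.match(key)
    if match:
        return "CP" + match.group(1)
    # Unknown names pass through untouched.
    return name


if __name__ == "__main__":
    for candidate in ("CP-1252", "iso-646-us", "TCA-BIG5", "utf-8"):
        print(candidate, "=>", resolve_alias(candidate))
```

Matching is case-insensitive here, and unknown names pass through
unchanged, so a table like this would only ever add spellings rather
than change how existing names resolve.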

/Autrijus/





22nd Unicode Conference, September 2002, San Jose, CA, USA

2002-05-31 Thread Misha . Wolf


 Twenty-second International Unicode Conference (IUC22)
 Unicode and the Web: Evolution or Revolution?
http://www.unicode.org/iuc/iuc22
  September 9-13, 2002
  San Jose, California

Mark your diary! >> 14 weeks to go >> Mark your diary! >> 14 weeks to go


The software industry continues its rapid growth and change. In this
year alone, Unicode 3.2 was released and several new proposals for the
Internet and the World Wide Web were promoted to standards. Web Services
is the latest buzz. Are the vendors of software that support these
technologies keeping up? How can you be sure that you are deploying
software components that work well together today and in the future?
This Conference is where you go to find out. Experts will describe the
latest changes to the Unicode standard and the other standards used for
e-business today. You will also learn about the best practices for
utilizing, integrating and deploying these technologies based on
real-world examples and experience. Demonstrations are often provided.

Conference attendees are generally involved in either the development,
deployment or use of Unicode software or content, or the globalization
of software and the Internet. They include managers, software engineers,
systems analysts, font designers, graphic designers, content developers,
technical writers, and product marketing personnel.

CONFERENCE WEB SITE, PROGRAM and REGISTRATION

   The Conference Program and Registration form will be available soon
   at the Conference Web site:
  http://www.unicode.org/iuc/iuc22

CONFERENCE SPONSORS

   Agfa Monotype Corporation
   Basis Technology Corporation
   Microsoft Corporation
   Netscape Communications
   Oracle Corporation
   Reuters Ltd.
   Sun Microsystems, Inc.
   World Wide Web Consortium (W3C)

GLOBAL COMPUTING SHOWCASE

   Visit the Showcase to find out more about products supporting the
   Unicode Standard, and products and services that can help you
   globalize/localize your software, documentation and Internet content.
   For details, visit the Conference Web site.

CONFERENCE VENUE

The Conference will take place at:

   DoubleTree Hotel San Jose
   2050 Gateway Place
   San Jose, CA 95110
   USA

   Tel: +1 408 453 4000
   Fax: +1 408 437 2898

CONFERENCE MANAGEMENT

   Global Meeting Services Inc.
   8949 Lombard Place, #416
   San Diego, CA 92122, USA

   Tel: +1 858 638 0206 (voice)
        +1 858 638 0504 (fax)

   Email: [EMAIL PROTECTED]
      or: [EMAIL PROTECTED]

THE UNICODE CONSORTIUM

The Unicode Consortium was founded as a non-profit organization in 1991.
It is dedicated to the development, maintenance and promotion of The
Unicode Standard, a worldwide character encoding. The Unicode Standard
encodes the characters of the world's principal scripts and languages,
and is code-for-code identical to the international standard ISO/IEC
10646. In addition to cooperating with ISO on the future development of
ISO/IEC 10646, the Consortium is responsible for providing character
properties and algorithms for use in implementations. Today the
membership base of the Unicode Consortium includes major computer
corporations, software producers, database vendors, research
institutions, international agencies and various user groups.

For further information on the Unicode Standard, visit the Unicode Web
site at http://www.unicode.org or e-mail <[EMAIL PROTECTED]>

   *  *  *  *  *

Unicode(r) and the Unicode logo are registered trademarks of Unicode,
Inc. Used with permission.




Re: ICU and Parrot

2002-05-31 Thread Dan Kogai

On Saturday, June 1, 2002, at 12:34 AM, Autrijus Tang wrote:
> On Fri, May 31, 2002 at 06:18:55AM +0900, Dan Kogai wrote:
>> As a matter of fact GB18030 is ALREADY supported via Encode::HanExtra
>> by Autrijus Tang.  The only reason GB18030 was not included in Encode
>> main is the sheer size of the map.
>
> Yes, partly because it was not implemented algorithmically. :)
>
> I was browsing http://www-124.ibm.com/cvs/icu/charset/data/ucm/ and
> toying with uconv, and wondered:
>
> 1) Does Encode have (or intend to have) them all covered?

No, unless they appear on www.unicode.org, though some of them have
actually been adopted.  Useful as it may be, I found raw ICM too Big and
too Blue :)

> 2) If not, would an Encode::ICU be wise?

I'm not so sure.  But if I were the one to implement Encode::ICU, it
would not be just a compiled collection of UCM files but a wrapper to all
library functions that ICU has to offer.  I, for one, am too lazy for
that.

> 3) A number of encodings are in HanExtra but not their ucm repository,
>    namely big5plus, big5ext and cccii. Is it wise to feed back to them
>    under the name of e.g. perl-big5plus.ucm?

You should in time and I should, too, because I have expanded UCM a
little so that you can define combined characters commonly seen in
Mac*.  But I don't see any reason to be in a hurry for the time being.

If any of you is a member of the ICU team, feel free to forward this
discussion to your team so we can work together in the future (after
5.8.0, that is).

Dan the Encode Maintainer




Re: ICU and Parrot

2002-05-31 Thread Autrijus Tang

On Fri, May 31, 2002 at 06:18:55AM +0900, Dan Kogai wrote:
> As a matter of fact GB18030 is ALREADY supported via Encode::HanExtra by
> Autrijus Tang.  The only reason GB18030 was not included in Encode main
> is the sheer size of the map.

Yes, partly because it was not implemented algorithmically. :)

I was browsing http://www-124.ibm.com/cvs/icu/charset/data/ucm/ and toying
with uconv, and wondered:

1) Does Encode have (or intend to have) them all covered?
2) If not, would an Encode::ICU be wise?
3) A number of encodings are in HanExtra but not their ucm repository,
   namely big5plus, big5ext and cccii. Is it wise to feed back to them
   under the name of e.g. perl-big5plus.ucm?

Thanks,
/Autrijus/


