He Zhiqiang wrote:
> can easily convert via the JavaScript function escape(), but I wonder
> whether there is some method, function, or module that can do the same
> job. If I can do it, then in one HTML page I can display not only
> Chinese but also Japanese, Korean, etc. This is something like HTML
> Unicode, am I right?
> The ord() function can't do the job because it returns the incorrect
> decimal value. perhaps i do not describe
Calling it "HTML Unicode" is not quite right; I think the proper term is "numeric character references".
<http://www.w3.org/TR/html4/charset.html#entities>

Numeric character references are not the only way to display multilingual text in an HTML document. In fact, I use 'raw' UTF-8 characters in some HTML documents: I edit the HTML source file with a text editor that can handle the UTF-8 encoding. Please look at the sample.html attached to this mail, both in a browser and as source text.

You may want to learn more about Unicode and HTML.
About Unicode: <http://www.unicode.org/standard/WhatIsUnicode.html>
About HTML: <http://www.w3.org/MarkUp/>

BTW, if you use the numeric character reference method, there is no need to look for any modules. The built-in unpack('U*', $string) function is enough. Please take a look at my sample code, attached as sample.pl.

-- 
Masanori HATA <[EMAIL PROTECTED]>
He's always with us!
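One possible reason for the ord() result described in the quoted question is that the string still held undecoded UTF-8 bytes. This is an assumption, since the original message does not show its code. In that case ord() only sees the first byte, not the first character. A minimal sketch, in which the helper name code_points and the sample bytes are made up for illustration:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: UTF-8 encoded bytes in, list of Unicode code points out.
sub code_points {
    my ($bytes) = @_;
    my $chars = decode('UTF-8', $bytes);   # bytes -> Perl character string
    return unpack('U*', $chars);           # one number per character
}

# Illustrative input: the UTF-8 bytes of U+65B0 U+95FB.
my $bytes = "\xE6\x96\xB0\xE9\x97\xBB";

# ord() on the undecoded bytes reports the value of the first byte only.
print 'ord on undecoded bytes: ', ord($bytes), "\n";

# After decoding, each character yields its own code point.
print 'code points after decoding: ', join(' ', code_points($bytes)), "\n";
```

If the string comes from a literal in a source file saved as UTF-8, "use utf8" (as in sample.pl) plays the role of the decode step.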
View this HTML with the Unicode (UTF-8) encoding.
Using raw UTF-8 data:
- News in English
- Actualités in French
- 新闻 in Simplified Chinese
- 新聞 in Japanese
- 뉴스 in Korean Hangul
Using numeric character references (every character is written using only ASCII characters):
- News in English
- Actualités in French
- 新闻 in Chinese
- 新聞 in Japanese
- 뉴스 in Hangul
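The raw UTF-8 variant of the list above can also be generated from Perl rather than typed in an editor. The following is only a sketch under stated assumptions: the helper name raw_utf8_page is hypothetical, the script file itself is assumed to be saved as UTF-8, and the charset declaration in the meta element is one common way to tell the browser which encoding to use.

```perl
use strict;
use warnings;
use utf8;   # assumption: this source file is saved as UTF-8

# Hypothetical helper: build a small HTML page holding raw (unescaped) characters.
sub raw_utf8_page {
    my (@items) = @_;
    my $list = join '', map { "<li>$_</li>\n" } @items;
    return <<"HTML";
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<ul>
$list</ul>
</body>
</html>
HTML
}

# Encode Perl's character strings as UTF-8 bytes on output.
binmode STDOUT, ':encoding(UTF-8)';

print raw_utf8_page(
    'News in English',
    'Actualités in French',
    '新闻 in Simplified Chinese',
);
```

The binmode call matters here: without an output encoding layer, Perl would warn about wide characters when printing the Chinese text.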
#!/usr/local/bin/perl -w

use 5.008;
use strict;
use warnings;
use utf8;

my %raw = (
    'English'  => 'News',
    'French'   => 'Actualités',
    'Chinese'  => '新闻',
    'Japanese' => '新聞',
    'Hangul'   => '뉴스',
);

my %numeric_ref;
foreach my $lang (keys %raw) {
    my @numbers = unpack('U*', $raw{$lang});
    $numeric_ref{$lang} = join ';&#', @numbers;
    $numeric_ref{$lang} = '&#' . $numeric_ref{$lang} . ';';
}

print "<ul>\n";
foreach my $lang (keys %numeric_ref) {
    print "<li>$numeric_ref{$lang} in $lang</li>\n";
}
print "</ul>\n";

__END__
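sample.pl converts every character, including plain ASCII letters, into a numeric character reference. As a variation, and not part of the attached sample, the sketch below leaves ASCII untouched and converts only the non-ASCII characters, which keeps the HTML source easier to read. The helper name to_ncr is hypothetical, and the input strings use \x{...} escapes so that the example does not depend on the encoding of the script file.

```perl
use strict;
use warnings;

# Hypothetical helper: replace each non-ASCII character with &#N;,
# where N is the decimal code point. ASCII characters pass through unchanged.
sub to_ncr {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7F])/'&#' . ord($1) . ';'/ge;
    return $text;
}

my %raw = (
    'English' => 'News',
    'French'  => "Actualit\x{E9}s",
    'Chinese' => "\x{65B0}\x{95FB}",
);

print "<ul>\n";
# Sorting the keys gives the same output order on every run.
foreach my $lang (sort keys %raw) {
    print '<li>', to_ncr($raw{$lang}), " in $lang</li>\n";
}
print "</ul>\n";
```

Note that this sketch does not escape the HTML special characters &, < and >; text that may contain them would need that handled separately.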