Re: multilanguage site
On Tue, 5 Sep 2000, Paul Lindner wrote:

> On Tue, Sep 05, 2000 at 10:23:45AM +0100, Matt Sergeant wrote:
> > On Tue, 5 Sep 2000, Paul Lindner wrote:
> >
> > > Anyway, here's what's in my global.asa to take care of this character
> > > set conversion mess.. Full details available to those that are
> > > interested..
> >
> > [snip]
> >
> > Yikes, you redhat guys really need to look at AxKit:
>
> We have.
>
> > # in .htaccess
> > AxOutputCharset ISO-8859-1
> >
> > And that's it. :-)
>
> But that doesn't provide me dynamic switching between character sets
> based on user preferences. Based on HTTP_ACCEPT_CHARSET we can choose
> to use iso-8859-1 or utf8, plus we need to force Japanese to use
> x-euc-jp on certain platforms, sjis on others.

That's why everything is a plugin in AxKit. You're free to do that.
However you've reminded me that I do need to implement ACCEPT_CHARSET
directly.

> Tell us, how do you do the character set conversion behind the scenes
> for various data sources?

Well, all data sources are XML at some point, so XML::Parser converts to
UTF8 for us. The output is then converted to the outgoing charset via
Unicode::String and Map8.

This certainly ignores DBs and random files whose format we don't know,
but how could we expect to cope with that? I'll consider adding an
incoming-charset attribute to the SQL taglib though - that sounds like a
good idea. And maybe some day perl will get input filters or something
to control that for ordinary files...

--
Fastnet Software Ltd. High Performance Web Specialists
Providing mod_perl, XML, Sybase and Oracle solutions
Email for training and consultancy availability.
http://sergeant.org | AxKit: http://axkit.org
Re: multilanguage site
On Tue, Sep 05, 2000 at 10:23:45AM +0100, Matt Sergeant wrote:
> On Tue, 5 Sep 2000, Paul Lindner wrote:
>
> > Anyway, here's what's in my global.asa to take care of this character
> > set conversion mess.. Full details available to those that are
> > interested..
>
> [snip]
>
> Yikes, you redhat guys really need to look at AxKit:

We have.

> # in .htaccess
> AxOutputCharset ISO-8859-1
>
> And that's it. :-)

But that doesn't provide me dynamic switching between character sets
based on user preferences. Based on HTTP_ACCEPT_CHARSET we can choose
to use iso-8859-1 or utf8, plus we need to force Japanese to use
x-euc-jp on certain platforms, sjis on others.

Tell us, how do you do the character set conversion behind the scenes
for various data sources? Do you just leave everything to XML::Parser?
What about DB sources? What about data on disk? Can you switch this on
the fly for various browser conditions?

Inquiring minds want to know. Thanks.

--
Paul Lindner [EMAIL PROTECTED] Red Hat Inc.
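To make the "choose based on HTTP_ACCEPT_CHARSET" idea concrete, here is a minimal sketch of one way the header could be parsed and matched against what a site supports. It is not code from AxKit or from the Red Hat setup; the function names parse_accept_charset and choose_charset are made up for illustration, and wildcard ("*") handling is left out.

```perl
use strict;
use warnings;

# Parse an Accept-Charset header value into a list of charset names,
# highest q-value first. A missing q parameter counts as q=1, and
# entries with q=0 are dropped.
sub parse_accept_charset {
    my ($header) = @_;
    my @prefs;
    for my $item (split /\s*,\s*/, $header // '') {
        my ($name, @params) = split /\s*;\s*/, $item;
        next unless defined $name && length $name;
        my $q = 1;
        for my $p (@params) {
            $q = $1 if $p =~ /^q=(\d*\.?\d+)$/i;
        }
        push @prefs, [ lc $name, $q ] if $q > 0;
    }
    return map { $_->[0] } sort { $b->[1] <=> $a->[1] } @prefs;
}

# Return the first client preference that the site supports,
# or the given default when nothing matches.
sub choose_charset {
    my ($header, $supported, $default) = @_;
    my %ok = map { lc($_) => 1 } @$supported;
    for my $name (parse_accept_charset($header)) {
        return $name if $ok{$name};
    }
    return $default;
}
```

Under mod_perl the header value would typically come from the request object or from $ENV{HTTP_ACCEPT_CHARSET}; forcing x-euc-jp or sjis per platform, as described above, would be an extra rule layered on top of this.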
Re: multilanguage site
On Tue, 5 Sep 2000, Paul Lindner wrote:

> Anyway, here's what's in my global.asa to take care of this character
> set conversion mess.. Full details available to those that are
> interested..

[snip]

Yikes, you redhat guys really need to look at AxKit:

# in .htaccess
AxOutputCharset ISO-8859-1

And that's it. :-)

--
Fastnet Software Ltd. High Performance Web Specialists
Providing mod_perl, XML, Sybase and Oracle solutions
Email for training and consultancy availability.
http://sergeant.org | AxKit: http://axkit.org
Re: multilanguage site
On Fri, Sep 01, 2000 at 10:44:10PM -0400, Greg Stark wrote:
>
> > >> can someone suggest me the best way to build a multilanguage web site
> > >> (english, french, ..).
> > >> I'm using Apache + mod_perl + Apache::asp (for applications)
>
> I'm really interested in what other people are doing here. We've just released
> our first cut at i18n and it's going fairly well. But so far we haven't dealt
> with the big bugaboo, character encoding.

> One major problem I anticipate is what to do when individual include files are
> not available in the local language. For iso-8859-1 encoded languages that's
> not a major hurdle as we can simply use the english text until it's
> translated. But for other encodings does it make sense to include english
> text?

> If we use UTF-8 all the ascii characters would display properly, but do most
> browsers support UTF-8 now? Or do people still use BIG5, EUC, etc?

> As far as I can tell there's no way in html to indicate to the browser that a
> chunk of content is in some other encoding other than what was specified in
> the headers or meta tag. There's no attribute or anything
> like that. This seems to make truly multilingual pages really awkward. You
> basically must use an encoding like UTF-8 which can reach the entire unicode
> character set or else you cannot mix languages.

It's a mess, but you're just going to have to assume multiple
character sets for the foreseeable future.

We try to use all utf8 data sources. XML defaults to this. Oracle can
be easily set up this way, and you can use utf8 in your html sources
too. You just have to be careful, for example in our message catalogs
we source translations into utf8.

Anyway, here's what's in my global.asa to take care of this character
set conversion mess.. Full details available to those that are
interested..

In Script_OnStart we convert submitted data to utf8

  ...
  #set $Apps::Param to form data or querystring.

  # decide on character set based on submitted form data element
  # 'asp_charset', or based on user's language.
  my $charset = $Apps::Param->{'asp_charset'};
  $charset = 'x-euc-jp' if (!$charset && $Session->{"Lang"} eq 'ja');
  $charset ||= 'iso-8859-1';

  # Convert japanese to UTF8
  ... messy Jcode stuff removed..

  # convert utf8
  ; # no-op

  # convert iso8859-1 to utf8
  ... messy Unicode::String code..

  $Response->{Charset} = $charset;

In Script_OnFlush we convert the internal utf8 data to the target charset

  my $charset = $Response->{Charset};

  # do character set conversion..
  if ($charset eq 'x-euc-jp') {
    ... messy Jcode stuff
  } elsif ($charset eq 'iso-8859-1') {
    ... unicode::string stuff here.
  }

  # here's the tricky part:
  # Automatically add hidden charset fields to forms?
  $$data =~ s,(),formfixer($1),sige;

Here's the formfixer thing, it adds hidden charset values to the form:

  sub formfixer {
    my $form = shift;
    return($form) if ($form =~ /action="?http/);
    $form =~ s,,,si;
    return($form);
  }

--
Paul Lindner [EMAIL PROTECTED] Red Hat Inc.
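For readers who want something runnable, here is a rough sketch of the same flow. It swaps in the core Encode module (which appeared in later Perl releases) for the Jcode and Unicode::String code that was elided above, and it works on Perl character strings rather than raw utf8 bytes, so it is not the code from the message. The helper names and the <form> pattern are assumptions: the patterns in the message were lost, so add_charset_field is only a guess at the idea behind formfixer.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Pick the response charset the way the Script_OnStart fragment does:
# explicit form field first, then the user's language, then Latin-1.
sub pick_charset {
    my ($param, $lang) = @_;
    my $charset = $param->{asp_charset};
    $charset = 'x-euc-jp' if !$charset && defined $lang && $lang eq 'ja';
    return $charset || 'iso-8859-1';
}

# Map the charset labels used in the thread to Encode's names.
my %encode_name = (
    'x-euc-jp'   => 'euc-jp',
    'iso-8859-1' => 'iso-8859-1',
    'utf8'       => 'UTF-8',
);

# Submitted bytes in the given charset -> Perl character string.
sub to_internal {
    my ($bytes, $charset) = @_;
    return decode($encode_name{$charset}, $bytes);
}

# Perl character string -> bytes in the target charset.
sub to_output {
    my ($text, $charset) = @_;
    return encode($encode_name{$charset}, $text);
}

# Assumed behaviour of the form fixer: add a hidden asp_charset field
# after every opening <form> tag.
sub add_charset_field {
    my ($html, $charset) = @_;
    my $field = qq{<input type="hidden" name="asp_charset" value="$charset">};
    $html =~ s{(<form\b[^>]*>)}{ form_with_field($1, $field) }gie;
    return $html;
}

# Leave forms that post to an absolute http URL untouched.
sub form_with_field {
    my ($tag, $field) = @_;
    return $tag if $tag =~ /action="?http/i;
    return $tag . $field;
}
```

The hidden field is what lets the next request say which charset the submitted form data is in, which is why pick_charset looks at asp_charset before anything else. The charset value here is assumed to come from the small fixed set above, so it is not HTML-escaped.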
Re: [OT] multilanguage site
Hi all,

On Sun, 3 Sep 2000, [UTF-8] Ričardas Čepas wrote:

> On Fri Sep 1 23:18:13 2000 -0400 Eric L. Brine wrote:
>
> This would require unicode capable browser anyway. Even more,
> Netscape v4 doesn't show these escapes unless you set encoding to utf-8.

There's a rather good document about character set encoding at

http://www.physics.gla.ac.uk/r2h-extras/rtfunicode.html

and some useful background stuff at

http://ppewww.ph.gla.ac.uk/~flavell/charset/

Flavell has done a lot of good work on browser response too. If you
browse around those sites you'll find there's a table there somewhere
which shows how many different browser versions respond to what I'd
call `funny characters'.

See also 'man unicode', 'man utf-8' (even 'man latin-1') on Linux.

73,
Ged.

(And what's all this \342\230\273 stuff? Looks funny in Pine...:)
Re: multilanguage site
On Fri Sep 1 23:18:13 2000 -0400 Eric L. Brine wrote:
>
> > You basically must use an encoding like UTF-8 which can reach the
> > entire unicode character set or else you cannot mix languages.
>
> Not quite. To display characters not in the current character set, use
> "&...;" encodings, such as "&eacute;" and "&#9999;" (where 9999 is
> unicode).

This would require a Unicode-capable browser anyway. What's more,
Netscape v4 doesn't show these escapes unless you set the encoding to
utf-8.

--
☻ Ričardas Čepas ☺
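As an illustration of the escapes being discussed, here is a small sketch that writes a string out as iso-8859-1 and turns every character that does not fit into a decimal numeric character reference. The function name latin1_with_ncrs is invented for this example, and it uses the core Encode module from later Perl releases, not anything mentioned in the thread.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Turn a Perl character string into Latin-1 bytes, writing any character
# above U+00FF as a decimal numeric character reference (&#NNNN;).
sub latin1_with_ncrs {
    my ($text) = @_;
    $text =~ s/([^\x{00}-\x{FF}])/sprintf('&#%d;', ord $1)/ge;
    return encode('iso-8859-1', $text);
}

# U+00E9 fits in Latin-1 and stays a single byte; U+270F does not
# and is written as a reference.
print latin1_with_ncrs("caf\x{E9} \x{270F}"), "\n";
```

Whether a given browser then displays the reference is a separate question, which is the point made above about Netscape v4.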
Re: multilanguage site
On Sat, 2 Sep 2000, Eric L. Brine wrote:

>
> > > As far as I can tell there's no way in html to indicate to the
> > > browser that a chunk of content is in some other encoding other
> > > than what was specified in the headers or meta tag. There's no
> > > attribute or anything like that.
> >
> > Yes, there is.
>
> None exists in the standard, as seen below, and I don't see anything in
> CSS either.

My bad. I was misled by the HTML form's accept-charset attribute.

--
Fastnet Software Ltd. High Performance Web Specialists
Providing mod_perl, XML, Sybase and Oracle solutions
Email for training and consultancy availability.
http://sergeant.org | AxKit: http://axkit.org
Re: multilanguage site
> > As far as I can tell there's no way in html to indicate to the
> > browser that a chunk of content is in some other encoding other
> > than what was specified in the headers or meta tag. There's no
> > attribute or anything like that.
>
> Yes, there is.

None exists in the standard, as seen below, and I don't see anything in
CSS either.

ELB
Re: multilanguage site
On 1 Sep 2000, Greg Stark wrote:

>
> > >> can someone suggest me the best way to build a multilanguage web site
> > >> (english, french, ..).
> > >> I'm using Apache + mod_perl + Apache::asp (for applications)
>
> I'm really interested in what other people are doing here. We've just released
> our first cut at i18n and it's going fairly well. But so far we haven't dealt
> with the big bugaboo, character encoding.
>
> One major problem I anticipate is what to do when individual include files are
> not available in the local language. For iso-8859-1 encoded languages that's
> not a major hurdle as we can simply use the english text until it's
> translated. But for other encodings does it make sense to include english
> text?
>
> If we use UTF-8 all the ascii characters would display properly, but do most
> browsers support UTF-8 now? Or do people still use BIG5, EUC, etc?

My experience has been really good. With 4.x+ browsers UTF8 displays just
fine, with the obvious caveat that you have to be using the right
fonts. Generally the people you are displaying to have the right fonts
(otherwise they wouldn't be able to use their computers!).

My only problems were two things:

1. Title bars in Linux just displayed junk. This was probably both an
   encoding/window manager issue and a font issue.

2. People don't want their content in UTF8 - they want it in the
   character set they are used to, like ISO-8859-2. So I added support
   in AxKit for alternate output encodings.

Of course being XML, AxKit handles different character sets in included
files just fine - everything is UTF8 to AxKit.

> As far as I can tell there's no way in html to indicate to the browser that a
> chunk of content is in some other encoding other than what was specified in
> the headers or meta tag. There's no attribute or anything
> like that.

Yes, there is.

> This seems to make truly multilingual pages really awkward. You
> basically must use an encoding like UTF-8 which can reach the entire unicode
> character set or else you cannot mix languages.

Or use AxKit ;-)

--
Fastnet Software Ltd. High Performance Web Specialists
Providing mod_perl, XML, Sybase and Oracle solutions
Email for training and consultancy availability.
http://sergeant.org | AxKit: http://axkit.org
Re: multilanguage site
> As far as I can tell there's no way in html to indicate to the browser
> that a chunk of content is in some other encoding other than what was
> specified in the headers or meta tag. There's no
> attribute or anything like that. This seems to make truly multilingual
> pages really awkward.

> You basically must use an encoding like UTF-8 which can reach the
> entire unicode character set or else you cannot mix languages.

Not quite. To display characters not in the current character set, use
"&...;" encodings, such as "&eacute;" and "&#9999;" (where 9999 is
unicode).

ELB
Re: multilanguage site
> >> can someone suggest me the best way to build a multilanguage web site
> >> (english, french, ..).
> >> I'm using Apache + mod_perl + Apache::asp (for applications)

I'm really interested in what other people are doing here. We've just released
our first cut at i18n and it's going fairly well. But so far we haven't dealt
with the big bugaboo, character encoding.

One major problem I anticipate is what to do when individual include files are
not available in the local language. For iso-8859-1 encoded languages that's
not a major hurdle as we can simply use the english text until it's
translated. But for other encodings does it make sense to include english
text?

If we use UTF-8 all the ascii characters would display properly, but do most
browsers support UTF-8 now? Or do people still use BIG5, EUC, etc?

As far as I can tell there's no way in html to indicate to the browser that a
chunk of content is in some other encoding other than what was specified in
the headers or meta tag. There's no attribute or anything
like that. This seems to make truly multilingual pages really awkward. You
basically must use an encoding like UTF-8 which can reach the entire unicode
character set or else you cannot mix languages.

--
greg
Re: multilanguage site
On Tue, Aug 29, 2000 at 01:10:46PM -0700, Joshua Chamas wrote:
> Francesco Pasqualini wrote:
> >
> > can someone suggest me the best way to build a multilanguage web site
> > (english, french, ..).
> > I'm using Apache + mod_perl + Apache::asp (for applications)
> >
> > Can be useful XML/XSL with AxKit ?
> > Is there any example/guideline ?
> >
>
> The approach used by Paul at RedHat seems to have been
> to wrap internationalized messages with message
> where is an XMLSub, which would do a lookup at runtime
> into a message catalog for the right message, based on what
> language the client was set to. I'm sure it's much more
> complicated than that, but that was the gist of it.

Yeah, it's more complicated than that. :-)

Basically there are four tools that we use, based on a hacked version
of Locale::PGetText, and the standard .po file format provided by GNU
gettext.

The tools are:

  XText      - extracts xxx text, Apps::gettext() strings into
               messages.po

  ... then we cp messages.po to messages..po and convert

  MsgProcess - processes messages..po into messages.db

  msgmerge   - standard GNU gettext stuff.

At runtime the code dynamically looks up the message text in the local
messages.db file.

Let me know if anyone is interested in this stuff. It's a bit rough at
this point but works quite well for us.

--
Paul Lindner [EMAIL PROTECTED] Red Hat Inc.
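The runtime lookup described above can be pictured roughly as follows. This is only a sketch under assumptions: the in-memory %catalog hash stands in for the per-language messages.db files, gettext_for is an invented name and not the Apps::gettext() or Locale::PGetText interface, and falling back to the untranslated text is assumed behaviour (it is what GNU gettext itself does).

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for the per-language messages.db files.
# Keys are the original (English) strings, as in a .po file's msgid.
my %catalog = (
    fr => { 'Hello' => 'Bonjour', 'Log out' => "D\x{E9}connexion" },
    ja => { 'Hello' => "\x{3053}\x{3093}\x{306B}\x{3061}\x{306F}" },
);

# Look up a message for a language; fall back to the original text
# when the language or the individual translation is missing.
sub gettext_for {
    my ($lang, $msgid) = @_;
    my $messages = $catalog{$lang} || {};
    return exists $messages->{$msgid} ? $messages->{$msgid} : $msgid;
}
```

With this shape, a page template only ever contains the original strings, and an untranslated message degrades to English instead of disappearing.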
Re: multilanguage site
Francesco Pasqualini wrote:
>
> can someone suggest me the best way to build a multilanguage web site
> (english, french, ..).
> I'm using Apache + mod_perl + Apache::asp (for applications)
>
> Can be useful XML/XSL with AxKit ?
> Is there any example/guideline ?
>

The approach used by Paul at RedHat seems to have been
to wrap internationalized messages with message
where is an XMLSub, which would do a lookup at runtime
into a message catalog for the right message, based on what
language the client was set to. I'm sure it's much more
complicated than that, but that was the gist of it.

-- Joshua
_________________________________________________________________
Joshua Chamas                           Chamas Enterprises Inc.
NodeWorks >> free web link monitoring   Huntington Beach, CA USA
http://www.nodeworks.com                1-714-625-4051
Re: multilanguage site
On Tue, 29 Aug 2000, Stas Bekman wrote:

> On Tue, 29 Aug 2000, Matt Sergeant wrote:
>
> > On Tue, 29 Aug 2000, Francesco Pasqualini wrote:
> >
> > > can someone suggest me the best way to build a multilanguage web site
> > > (english, french, ..).
> > > I'm using Apache + mod_perl + Apache::asp (for applications)
> > >
> > > Can be useful XML/XSL with AxKit ?
> > > Is there any example/guideline ?
> >
> > This month's Web Techniques is all about this (albeit in a framework
> > independent manner). I suggest you try as hard as you can to get a copy as
> > it covers way more than I could possibly type here.
>
> You can get as many copies as you want :) it's online:
> http://www.webtechniques.com/

Ah - I thought they had a lead time before it went online - guess not!
They also didn't used to include all articles online, but I guess that
has changed. Maybe I won't buy a subscription again (especially since
it's free to US readers!)...

--
Fastnet Software Ltd. High Performance Web Specialists
Providing mod_perl, XML, Sybase and Oracle solutions
Email for training and consultancy availability.
http://sergeant.org | AxKit: http://axkit.org
Re: multilanguage site
"Francesco Pasqualini" <[EMAIL PROTECTED]> writes:

> can someone suggest me the best way to build a multilanguage web site
> (english, french, ..).
> I'm using Apache + mod_perl + Apache::asp (for applications)
>
> Can be useful XML/XSL with AxKit ?
> Is there any example/guideline ?

I'm interested in this too :-)

The Deep Purple site just went vaguely multilingual, but I'm doing
this with straight Apache MultiViews (which _are_ honoured by SSI,
which is nice) and I can see this becoming a huge headache.

I'd like to do it with the Template Toolkit if at all possible.

Dave

--
Dave Hodgkinson,                    http://www.hodgkinson.org
Editor-in-chief, The Highway Star   http://www.deep-purple.com
Apache, mod_perl, MySQL, Sybase hired gun for, well, hire
Re: multilanguage site
On Tue, 29 Aug 2000, Matt Sergeant wrote:

> On Tue, 29 Aug 2000, Francesco Pasqualini wrote:
>
> > can someone suggest me the best way to build a multilanguage web site
> > (english, french, ..).
> > I'm using Apache + mod_perl + Apache::asp (for applications)
> >
> > Can be useful XML/XSL with AxKit ?
> > Is there any example/guideline ?
>
> This month's Web Techniques is all about this (albeit in a framework
> independent manner). I suggest you try as hard as you can to get a copy as
> it covers way more than I could possibly type here.

You can get as many copies as you want :) it's online:
http://www.webtechniques.com/

> Also look up content negotiation in the Apache docs.

_____________________________________________________________________
Stas Bekman              JAm_pH -- Just Another mod_perl Hacker
http://stason.org/       mod_perl Guide http://perl.apache.org/guide
mailto:[EMAIL PROTECTED] http://apachetoday.com http://jazzvalley.com
http://singlesheaven.com http://perlmonth.com perl.org apache.org
RE: multilanguage site
Try this:
http://webtechniques.com/archives/2000/09/yunker/

and perhaps this:
http://webtechniques.com/archives/2000/09/lagon/

>-----Original Message-----
>From: Matt Sergeant [mailto:[EMAIL PROTECTED]]
>Sent: Tuesday, August 29, 2000 9:16 AM
>To: Francesco Pasqualini
>Cc: [EMAIL PROTECTED]
>Subject: Re: multilanguage site
>
>
>On Tue, 29 Aug 2000, Francesco Pasqualini wrote:
>
>> can someone suggest me the best way to build a multilanguage web site
>> (english, french, ..).
>> I'm using Apache + mod_perl + Apache::asp (for applications)
>>
>> Can be useful XML/XSL with AxKit ?
>> Is there any example/guideline ?
>
>This month's Web Techniques is all about this (albeit in a framework
>independent manner). I suggest you try as hard as you can to
>get a copy as
>it covers way more than I could possibly type here.
>
>Also look up content negotiation in the Apache docs.
Re: multilanguage site
On Tue, 29 Aug 2000, Francesco Pasqualini wrote:

> can someone suggest me the best way to build a multilanguage web site
> (english, french, ..).
> I'm using Apache + mod_perl + Apache::asp (for applications)
>
> Can be useful XML/XSL with AxKit ?
> Is there any example/guideline ?

This month's Web Techniques is all about this (albeit in a framework
independent manner). I suggest you try as hard as you can to get a copy as
it covers way more than I could possibly type here.

Also look up content negotiation in the Apache docs.

--
Fastnet Software Ltd. High Performance Web Specialists
Providing mod_perl, XML, Sybase and Oracle solutions
Email for training and consultancy availability.
http://sergeant.org | AxKit: http://axkit.org
multilanguage site
Can someone suggest the best way to build a multilanguage web site
(English, French, ...)?
I'm using Apache + mod_perl + Apache::ASP (for applications).

Could XML/XSL with AxKit be useful?
Is there any example or guideline?

Thanks
Francesco Pasqualini