Re: Chinese big5 encoding and PO files
Denis Barbier:

> This is why I suggest to insert backslashes in MO files and not PO files.
> Those backslashes will be removed when processed through WML.

The thing is that there are no backslashes, only 0x5C bytes. Those are very different, although they might look very similar at first glance...

It's possible that one can apply workarounds for this, but it is much better to fix the tools that read the files to handle them properly. Gettext should already be encoding safe, shouldn't it?

--
\\// Peter - http://www.softwolves.pp.se/
I do not read or respond to mail with HTML attachments.
Processed: Re: Bug#133280: wrong links at wmweather+ package page
Processing commands for [EMAIL PROTECTED]:

> tags 133280 + patch
Bug#133280: file list for aewm++ displays aewm files
There were no tags set.
Tags added: patch

> thanks
Stopping processing here.

Please contact me if you need assistance.

Debian bug tracking system administrator
(administrator, Debian Bugs database)
Re: Chinese big5 encoding and PO files
On Wed, Jan 29, 2003 at 07:28:57AM +0100, Peter Karlsson wrote:
> Denis Barbier:
> > This is why I suggest to insert backslashes in MO files and not PO files.
> > Those backslashes will be removed when processed through WML.
>
> The thing is that there are no backslashes, only 0x5C bytes. Those are
> very different, although they might look very similar at first glance...

Err, ascii(7) tells me that 0x5C *is* a backslash.

> It's possible that one can apply workarounds for this, but it is much
> better to fix the tools that read the files to handle them properly.
> Gettext should already be encoding safe, shouldn't it?

It is, and I will fix WML too. Could you please have a look at chinese/po/others.zh.po and tell me what to do with Subscribe/Unsubscribe translations?

Denis
Re: family name, personal name in devel/people
On Tue, Jan 28, 2003 at 07:48:35PM -0800, Osamu Aoki wrote:
> I imagine names in http://www.debian.org/devel/people have the unified
> format of Surname, Given name. I found two exceptions:
>
> Shuzo, Hatta, where Hatta is surname and Shuzo is given name.
> Yuuma, Oohara, where Oohara is surname and Yuuma is given name.

Thanks! I've added them to the exception list.

> I am wondering why these strange entries exist. Are they mistakes in the
> original data entry? I never see Japanese names spelled that way.
>
> Shuzo Hatta (most common, passport)
> Hatta, Shuzo (many government forms, some academic papers)
> Hatta Shuzo (if you are a historical figure or a famous literary writer)
>
> Where were these data originally taken from? Just curious.

From their own packages' Maintainer fields, i.e. they wrote it themselves:

    % grep-available -F Maintainer Shuzo -s Maintainer | sort -u
    Maintainer: HATTA Shuzo [EMAIL PROTECTED]
    Maintainer: Hatta Shuzo [EMAIL PROTECTED]
    % grep-available -F Maintainer Yuuma -s Maintainer | sort -u
    Maintainer: Oohara Yuuma [EMAIL PROTECTED]

Perhaps one of you could politely inform these two developers that they might get the westerners to read their name right if they changed the ordering? :)

--
2. That which causes joy or happiness.
Re: Chinese big5 encoding and PO files
Denis Barbier:

> Err, ascii(7) tells me that 0x5C *is* a backslash.

Yes, but these documents aren't ASCII, so a 0x5C byte may or may not be a backslash there, depending on where it is located in the file.

> Could you please have a look at chinese/po/others.zh.po and tell me what
> to do with Subscribe/Unsubscribe translations?

Nothing should need to be done. Since the 0x5C byte is the trail byte of the character, a proper MBCS-aware string scanner will recognize that it is not a backslash character (unlike, for instance, in the "please respect the ad policy" string a bit further down, which *does* contain a backslash in the translation). Getting the string scanner to work properly requires configuring the locales properly.

Big5 is a bit problematic since it allows non-highbit characters as trail bytes, similar to the problems with ISO 2022-JP. A stateful string scanner is required to handle it properly. LibC should work fine as long as the proper locale is available, and I am pretty sure that the gettext utilities will handle this properly.

--
\\// Peter - http://www.softwolves.pp.se/
I do not read or respond to mail with HTML attachments.
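To make the byte-versus-character distinction concrete, here is a minimal Python sketch. The sample character 功 is an assumption chosen for illustration (its Big5 encoding is the byte pair 0xA5 0x5C); it is not taken from others.zh.po, and both function names are hypothetical.

```python
# Illustrative sample only: this character is not taken from others.zh.po.
text = "功"
raw = text.encode("big5")  # lead byte 0xA5 followed by trail byte 0x5C


def naive_has_backslash(data: bytes) -> bool:
    """Byte-oriented check: treats every 0x5C byte as a backslash."""
    return b"\\" in data


def aware_has_backslash(data: bytes, encoding: str = "big5") -> bool:
    """Character-oriented check: decode first, then look for U+005C."""
    return "\\" in data.decode(encoding)


print(raw.hex())                 # a55c
print(naive_has_backslash(raw))  # True
print(aware_has_backslash(raw))  # False
```

The byte-oriented check reports a backslash that is not there, while the decoding check does not, which is the behaviour expected from an encoding-aware scanner.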
Re: family name, personal name in devel/people
On Wed, Jan 29, 2003 at 01:56:30PM +0900, Tomohiro KUBOTA wrote: You know, some Japanese people write names in their native order, Family Given, and such expressions exist in db.debian.org database. ... but I checked the script (klecker:/org/www.debian.org/cron/people_scripts/people.pl) and I couldn't find additional handlers. On that note, there should indeed be one such handler -- to make cn, sn ldap fields preferred somehow, and one other handler -- to understand LASTNAME Firstname and Firstname LASTNAME properly. Anyone care to write a patch? :) -- 2. That which causes joy or happiness.
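As a rough illustration of the second handler, the following Python sketch treats a single all-uppercase word as the family name and otherwise falls back to western order. It is not a patch for people.pl (which is a Perl script); the function name and the fallback rule are assumptions made for this example.

```python
from typing import Tuple


def split_name(full_name: str) -> Tuple[str, str]:
    """Return (family, given); a single all-uppercase word marks the family name."""
    words = full_name.split()
    if not words:
        return "", ""
    # Candidate family names: words written entirely in uppercase (ignoring initials).
    upper = [w for w in words if len(w) > 1 and w.isupper()]
    if len(upper) == 1:
        family = upper[0]
        given = " ".join(w for w in words if w != family)
        return family, given
    # Assumed fallback: western order, last word is the family name.
    return words[-1], " ".join(words[:-1])


print(split_name("HATTA Shuzo"))
print(split_name("Shuzo HATTA"))
print(split_name("Josip Rodin"))
```

A name such as "Hatta Shuzo", written family-first without uppercase, would still be misread by this heuristic, so an exception list or the LDAP cn/sn fields would remain necessary.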
Bug#178831: packages.debian.org could use real substring searches
severity wishlist
merge 178831 103694
thanks

* Ralph Siemsen [EMAIL PROTECTED] [2003-01-28 17:11]:
> I realize the comments right on the page indicate that this doesn't
> happen, so it's not a bug.. so could this go on the wishlist?

You mean like #103694, #115004, #156794 and #175644? Yes, of course, you are now merged with them.

Have fun,
Alfie
--
Debian strictly separates stable, unstable and testing releases, so that you can decide whether you want to shoot at the opponent, your own foot, or both feet at once.
        -- Robin S. Socha in [EMAIL PROTECTED]
Re: Chinese big5 encoding and PO files
* Denis Barbier [EMAIL PROTECTED] [2003-01-29 10:32]:
> On Wed, Jan 29, 2003 at 07:28:57AM +0100, Peter Karlsson wrote:
> > The thing is that there are no backslashes, only 0x5C bytes. Those are
> > very different, although they might look very similar at first glance...
>
> Err, ascii(7) tells me that 0x5C *is* a backslash.

I guess Peter meant that in a multibyte environment, where 0x5C might not (always) be a backslash... Just trying to clear things up.

Alfie
--
...you might as well skip the Xmas celebration completely, and instead sit in front of your linux computer playing with the all-new-and-improved linux kernel version.
        -- Linus Torvalds
Re: family name, personal name in devel/people
Hi,

From: Josip Rodin [EMAIL PROTECTED]
Subject: Re: family name, personal name in devel/people
Date: Wed, 29 Jan 2003 10:33:58 +0100

> % grep-available -F Maintainer Shuzo -s Maintainer | sort -u
> Maintainer: HATTA Shuzo [EMAIL PROTECTED]
> Maintainer: Hatta Shuzo [EMAIL PROTECTED]
> % grep-available -F Maintainer Yuuma -s Maintainer | sort -u
> Maintainer: Oohara Yuuma [EMAIL PROTECTED]
>
> Perhaps one of you could politely inform these two developers that they
> might get the westerners to read their name right if they changed the
> ordering? :)

Well, I don't want to do this, and I don't want anybody else to do it either. It is not a good idea that non-westerners have to follow the customs of westerners while westerners don't need to follow those of non-westerners. Non-westerners already pay the cost of learning many western customs whenever we want to do something in international communities, and I want to reduce that load if possible.

I think I can ask them to write the family name in uppercase; that is the most I can ask of them. I don't know whether they will accept even this idea.

Please note that this *is* what I recently mentioned as a 10-year flamewar, and I *never* want to join it. Even asking them to write the family name in uppercase might arouse the flamewar. (If I asked them to change the name order, I would certainly stir up the core of the flamewar, and Japanese members of Debian might drop their activity as developers.)

---
Tomohiro KUBOTA [EMAIL PROTECTED]
http://www.debian.or.jp/~kubota/
Processed: Re: Processed: Re: Bug#178831: packages.debian.org could use real substring searches
Processing commands for [EMAIL PROTECTED]:

> severity 178831 wishlist
Bug#178831: packages.debian.org could use real substring searches
Severity set to `wishlist'.

> merge 178831 103694
Bug#103694: packages.d.o doesn't search on subwords in package names
Bug#178831: packages.debian.org could use real substring searches
Bug#115004: packages.d.o: needs searching on REAL subwords
Bug#156794: Package search subwords not working
Bug#175644: www.debian.org: searching on packages.debian.org doesn't seem to work
Merged 103694 115004 156794 175644 178831.

> thanks
Stopping processing here.

Please contact me if you need assistance.

Debian bug tracking system administrator
(administrator, Debian Bugs database)
Re: Chinese big5 encoding and PO files
On Wed, Jan 29, 2003 at 11:14:56AM +0100, Peter Karlsson wrote:
> Denis Barbier:
> > Err, ascii(7) tells me that 0x5C *is* a backslash.
>
> Yes, but these documents aren't ASCII, so a 0x5C byte may or may not be a
> backslash there, depending on where it is located in the file.

Ok.

> > Could you please have a look at chinese/po/others.zh.po and tell me what
> > to do with Subscribe/Unsubscribe translations?
>
> Nothing should need to be done. Since the 0x5C byte is the trail byte of
> the character, a proper MBCS-aware string scanner will recognize that it
> is not a backslash character (unlike, for instance, in the "please respect
> the ad policy" string a bit further down, which *does* contain a backslash
> in the translation). Getting the string scanner to work properly requires
> configuring the locales properly.

The problem with current WML is that streams are bytes and not characters; this is why 0x5C bytes have to be escaped. I am preparing a character-oriented version, but there are major backward compatibility problems. It means that any single file must contain only one encoding, so some files have to be fixed under webwml.

> Big5 is a bit problematic since it allows non-highbit characters as trail
> bytes, similar to the problems with ISO 2022-JP. A stateful string scanner
> is required to handle it properly. LibC should work fine as long as the
> proper locale is available, and I am pretty sure that the gettext
> utilities will handle this properly.

Yes, gettext is safe. Instead of escaping some problematic characters, a better solution could be to perform encoding conversions (as with Japanese files) to a safe encoding. Is there anyone interested in testing this scheme?

Denis
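A small Python sketch of the conversion idea follows. Choosing UTF-8 as the "safe" target is an assumption made for this example (the message does not name one), as are the function name and the sample text. In UTF-8 every byte of a multibyte sequence has its high bit set, so a 0x5C byte can only be a real backslash.

```python
def recode(data: bytes, source: str = "big5", target: str = "utf-8") -> bytes:
    """Decode from the source encoding and re-encode in the target encoding."""
    return data.decode(source).encode(target)


# Sample text chosen for illustration; its first character ends in 0x5C in Big5.
big5_bytes = "功能".encode("big5")
utf8_bytes = recode(big5_bytes)

print(0x5C in big5_bytes)   # the trail byte looks like a backslash to byte tools
print(0x5C in utf8_bytes)   # no such byte after conversion
print(utf8_bytes.decode("utf-8") == "功能")
```

A byte-oriented tool can then process the converted stream without escaping, and the output can be converted back to Big5 at the end if that encoding is still wanted for the served pages.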
Re: Chinese big5 encoding and PO files
Denis Barbier wrote:
> Hi,
>
> there is trouble with big5 encoding in PO files, because some
> backslashes are not escaped (e.g. MailingLists/subscribe.wml cannot be
> processed). Maybe fix_big5.pl should be run against those PO files so
> that MO files contain escaped backslashes? But I am not sure that the
> encoding is then still valid; could a Chinese translator investigate this
> issue?

I fixed the problem and committed.

No, fix_big5.pl cannot resolve the problem. There is a package called bg5cc that can convert `\' in Big-5 wide characters that appear in source programs to `\\'.

http://packages.debian.org/stable/devel/bg5cc.html

We can use this perl script, too.

http://i18n.linux.org.tw/bg5cc.pl

I think we can do this before commit.

--
-Rex, geek by nature linux by choice
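The kind of escaping described for bg5cc can be sketched in Python as below. This is not bg5cc's actual code: the function name is hypothetical, and the lead-byte range 0x81 to 0xFE is an assumption that covers Big5 and its common extensions. The scanner walks the bytes pairwise so that only a 0x5C in trail position is doubled, while a real ASCII backslash is left alone.

```python
def escape_trail_backslashes(data: bytes) -> bytes:
    """Double every 0x5C that is the trail byte of a two-byte Big5 character."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        # Assumed lead-byte range; anything in it starts a two-byte character.
        if 0x81 <= byte <= 0xFE and i + 1 < len(data):
            trail = data[i + 1]
            out += bytes((byte, trail))
            if trail == 0x5C:
                out.append(0x5C)  # add the escaping backslash
            i += 2
        else:
            out.append(byte)      # single-byte (ASCII) character, copied as-is
            i += 1
    return bytes(out)


sample = b"\xa5\x5c" + b"a\\b"   # one Big5 character plus ASCII text with a real backslash
print(escape_trail_backslashes(sample))
```

The escaped output is only meaningful for a consumer that later removes the extra backslash, which is the validity concern raised in the quoted message.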
Re: Chinese big5 encoding and PO files
Rex Tsai wrote:
> I fixed the problem and committed.

Oops, I removed the 0x5C again. I will check the WML scripts to fix the problem at WML compile time.

--
-Rex, geek by nature linux by choice
Re: Debian WWW CVS commit by chinese: webwml/chinese/po others.zh.po
On Wed, Jan 29, 2003 at 10:28:34AM -0700, Debian WWW CVS wrote:
> CVSROOT: /cvs/webwml
> Module name: webwml
> Changes by: chinese 03/01/29 10:28:34
>
> Modified files:
>     chinese/po : others.zh.po
>
> Log message:
> removed backslash. The 0x5C works with new gettext (after gettext-0.10.38?)

The problem occurs when strings are extracted from MO files, because it happens during WML pass 2. Consider for instance this code snippet:

    print "<gettext>foo</gettext>";

and suppose that foo is translated into bar\, then WML will write

    print "bar\";

and pass 3 (eperl) fails.

Denis
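The failure mode can be reproduced by analogy in Python; this is not WML or eperl. The template, the `substitute` helper and the use of Python's `compile` as a stand-in for the final pass are all assumptions made for illustration: a translation ending in a backslash is pasted into a quoted string, and the backslash then escapes the closing quote.

```python
def substitute(template: str, translation: str) -> str:
    """Paste the translation into the template, as a text-level pass would."""
    return template.replace("<gettext>foo</gettext>", translation)


def compiles(source: str) -> bool:
    """Stand-in for the last pass: does the generated code still parse?"""
    try:
        compile(source, "<generated>", "exec")
        return True
    except SyntaxError:
        return False


template = 'print("<gettext>foo</gettext>")'
good = substitute(template, "bar")
broken = substitute(template, "bar\\")   # translation ends in a single backslash

print(good, compiles(good))
print(broken, compiles(broken))
```

With a Big5 translation, the trailing "backslash" would be the 0x5C trail byte of the last character, so a byte-oriented pass produces the same unterminated string.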
Re: Debian WWW CVS commit by chinese: webwml/chinese/po others.zh.po
Denis Barbier:

> and pass 3 (eperl) fails.

Even with locales enabled ("use locale;" in standard Perl)?

--
\\// Peter - http://www.softwolves.pp.se/
I do not read or respond to mail with HTML attachments.
Question about the WML encoding of e-mail addresses on the website
Hi.

While working on the translation of a document from the website, I wondered whether there is any explanation why some e-mail addresses are written <kbd>[EMAIL PROTECTED]</kbd> and others are written <email [EMAIL PROTECTED]>. Is one of these ways deprecated, or is there a deeper reason behind it?

Greetings,
Frank Lichtenheld

--
Frank Lichtenheld
www: http://www.djpig.de
mail: [EMAIL PROTECTED]
PGP: http://www.djpig.de/Frank.Lichtenheld.asc