On Sat, Jul 14, 2007 at 01:54:20PM +0200, Jens Seidel wrote:
> On Sat, Jul 14, 2007 at 03:01:13AM -0500, Ming Hua wrote:
> > Chinese must be separated as (at least) zh_CN and zh_TW, as although
> > they are the same language, they use different scripts.  I have no idea
> > what script the current zh translations are, is there an easy way to
> > review them?  I'll do that and put them into correct category.
> >
> > Also please make sure DDTP/DDTSS do not accept ambiguous zh translations
> > anymore.
>
> Can you please check http://ddtp.debian.net/debian/dists/sid/main/i18n/
> and merge zh translation manually with either zh_CN or zh_TW (via mail
> interface), we cannot do this without knowledge of the language and
> don't want to drop all 22 translations.

Of course, I didn't suggest dropping them.  I've now downloaded the
Translation-zh file and found that they are all simplified (zh_CN)
translations, so they should be merged into zh_CN.  I'll do this via the
email interface in the following days (I need to learn how to use DDTP),
and it's safe to turn off accepting zh translations now.

> But are you really sure that it is not possible to convert a common
> Chinese translation into zh_CN AND zh_TW?

I'm so glad that you brought this up again.  I was reading the thread
"Re: DDTP - please activate support for pt" yesterday and found that you
had mentioned Chinese translation there.  I wanted to reply, only to
realize your mail was from February.

> Please note that this is done
> by the Debian website, there is only a single translation but multiple
> output encodings!?

I know that for the website translation, zh_CN and zh_TW pages are
generated from a single source file.  However, it's not exactly a single
translation: the source wml file supports the grammar
"[CN:foo][HKTW:bar]", so that the generated page will use "foo" for the
zh_CN html and "bar" for the zh_TW html.  It's quite a maintenance
hassle.

Also, the difference between the scripts is not only encoding (both
zh_CN and zh_TW translations can, and prefer to, use UTF-8 nowadays, at
least in the open source world).  Encoding is not even the main part of
the conversion.  The website's wml source is usually in zh_CN or zh_TW
depending on the preference of the main translator (as wml doesn't
support UTF-8, IIRC), then the conversion to the other script is done by
a third-party program (iconv is not really good enough for this task).
I think currently the tools in the zh-autoconvert package are used.  The
result, if not touched up by a translator of the other script (by adding
more [CN:foo][HKTW:bar] alternative tags), will read awkwardly most of
the time, and sometimes even be confusing.

So you see, generating both zh_CN and zh_TW translations from a single
source is not really ideal.
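
To illustrate the tagged-alternative idea described above, here is a
minimal Python sketch.  It is not the website's actual build code: the
render() helper and the TAGGED pattern are hypothetical, and the bracket
syntax simply follows the "[CN:foo][HKTW:bar]" form as written in this
mail, which may differ in detail from the real wml markup.  The sample
words are the usual terms for "software" in each locale, which differ
in vocabulary and not only in character shapes.

```python
import re

# Matches one tagged alternative such as "[CN:foo]" or "[HKTW:bar]".
# Syntax assumed from the form quoted in this mail, not from real wml.
TAGGED = re.compile(r"\[(CN|HKTW):([^\[\]]*)\]")


def render(source, variant):
    """Keep text tagged for `variant`; drop text tagged for the other one."""
    def pick(match):
        tag, text = match.group(1), match.group(2)
        return text if tag == variant else ""
    return TAGGED.sub(pick, source)


# One source line carrying both wordings.
source = "Debian [CN:软件][HKTW:軟體]"

zh_cn = render(source, "CN")
zh_tw = render(source, "HKTW")

# Both results are stored in the same encoding (UTF-8 here); only the
# script and wording differ.
print(zh_cn.encode("utf-8"))
print(zh_tw.encode("utf-8"))
```

Every place where the two wordings differ needs its own pair of tags in
the source, which is where the maintenance cost comes from.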

IMHO the maintenance hassle, as well as the suboptimal results, is one
of the reasons that Chinese website translations have been stagnant
these years.

> Could someone please explain this? Why waste time for two
> encodings/scripts if one is sufficient?

So in short, it's not an encoding issue.  I only know English and
Chinese, but I suspect the difference is probably on par with nn
(Norwegian Nynorsk) and nb (Norwegian Bokmål).  One translation is not
possible.  One source is possible, but inconvenient and suboptimal.
Both zh_CN translators and zh_TW ones (though I can't speak for them)
would prefer separate translations.

Hope this makes things a bit clearer.

Ming
2007.07.14

