On Fri, Jan 20, 2023 at 7:42 AM Bill Allombert <ballo...@debian.org> wrote:
>
> On Thu, Jan 19, 2023 at 11:47:42AM +0000, Simon McVittie wrote:
> > On Wed, 18 Jan 2023 at 16:30:46 -0700, Anthony Fok wrote:
> > > In their mind, GB 18030 encompasses a lot more than just
> > > a character encoding mapping table. It is the full support package
> > > (including fonts, display, printing, input methods, etc.) for Han
> > > Chinese and all other minority languages used in China.
> >
> > Preferring to use Unicode does seem to be the direction that all of
> > computing is going in, as a simplifying assumption - for example W3C
> > advice for HTML is "You should always use the UTF-8 character encoding"[1]
> > - and as we know, things that aren't tested usually don't work. So I
> > think the level of functionality for non-UTF-8 locales and encodings in
> > the software we package is going to decline over time, whether Debian
> > wants it to or not.
Re-reading Simon's comment: Yes, UTF-8 is the ideal, but supposedly
some older Chinese websites are still using "GBK" as their encoding,
probably with something like:

    <meta http-equiv="Content-Type" content="text/html;charset=gbk">

GBK has fewer than 30,000 characters and thus covers only a very limited
subset of Unicode. And since presumably not everyone has the know-how to
convert to UTF-8, the Chinese government wants those who cannot convert
to at least change that meta tag to:

    <meta http-equiv="Content-Type" content="text/html;charset=gb18030">

where GB18030, being a Unicode Transformation Format, albeit a somewhat
awkward one, would be able to represent any character in Unicode.

> It is true for everything. Users know how to pick the software that works
> for their environment. It is not relevant that software they do not use
> do not support their environment.
>
> Telling users to switch to UTF-8 because such and such software they never
> used and were never going to use do not support GB18030 does not make sense.

I have the feeling that many tech-savvy Chinese users have already
switched to UTF-8, but perhaps in some circles there are still lots of
legacy GB2312/GBK documents or systems that make GB18030 a necessity as
an intermediate step to Unicode. (Not so in Taiwan and Hong Kong, where
people jumped straight to UTF-8 from Big5 or Big5-HKSCS. For better or
for worse.)

> It is like saying the Linux console is deprecated because there are Debian
> packages that requires X or Wayland.
>
> Cheers,
> --
> Bill. <ballo...@debian.org>
>
> Imagine a large red swirl here.

Thanks for the wonderful discussion, Bill!

Cheers,
Anthony
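
P.S. To illustrate the GBK versus GB18030 point above, here is a minimal
sketch in Python. It assumes the "gbk" and "gb18030" codecs that ship with
CPython; the helper name try_encode and the two sample characters are only
my picks for illustration, not anything taken from the standard itself.

```python
# Sketch: compare how GBK and GB18030 handle a common and a rare character.
# COMMON and RARE are illustrative sample choices (an assumption of this sketch).
COMMON = "中"          # U+4E2D, present in GB2312 and GBK
RARE = "\U00020000"    # U+20000, CJK Extension B, outside GBK's repertoire


def try_encode(text, codec):
    """Return the encoded bytes, or None if the codec cannot represent text."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None


def report():
    """Collect (code point, codec, encoded bytes or None) for each sample."""
    rows = []
    for ch in (COMMON, RARE):
        for codec in ("gbk", "gb18030", "utf-8"):
            rows.append((f"U+{ord(ch):04X}", codec, try_encode(ch, codec)))
    return rows


if __name__ == "__main__":
    for code_point, codec, data in report():
        shown = data.hex(" ") if data is not None else "not encodable"
        print(f"{code_point:8} {codec:8} {shown}")
```

As far as I can tell, the common character gets the same two bytes under
both codecs, which is why relabelling a typical old GBK page as gb18030
should be mostly harmless, while the Extension B character cannot be
encoded in GBK at all and takes four bytes in GB18030.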