Bug#349938: Encoding of manpages
Wow, either that's an encoding disaster, or I'm not doing something right.

First of all, I cannot say that the files you sent are in UTF-8. How should I say this... The text is not in Korean at all. It uses characters that are legal in a Korean encoding, but the result is completely unintelligible.

I guess an analogy would be easier. French uses letters like a, h, or ç, right? But what if the h's are changed into a's and the a's into ç's? The text is still valid to the computer, but not to humans.

I have attached the man page files installed on my system. They are encoded in EUC-KR and display just fine. They might be a bit dated, though.

Could anyone help me investigate this problem?

2006/1/26, Nicolas François [EMAIL PROTECTED]:
> Hello,
>
> In rpm 4.4.1-5, the Korean man pages are encoded in UTF-8.
>
> Can you check if they are displayed correctly in a Korean environment?
>
> IMO, they should be encoded in EUC-KR, however iconv fails to recode the
> rpm man page to EUC-KR.
>
> If the man page must be recoded, can anybody check why iconv is failing,
> and submit a patch for the rpm.8 and rpm2cpio.8 man pages?
> I think the man pages use a character that can't be recoded in EUC-KR.
>
> $ LC_ALL=C iconv -t EUC-KR -f UTF-8 doc/ko/rpm.8 > /dev/null
> iconv: illegal input sequence at position 79
>
> I attach the 2 manpages.
>
> Thanks in advance,
> --
> Nekral

--
Sunjae Park(daréhanl)
We choose to go to the moon and do the other things, not because they are easy, but because they are hard. - John F. Kennedy -

Attachments:
rpm.8.gz (GNU Zip compressed data)
rpm2cpio.8.gz (GNU Zip compressed data)
Bug#349938: Encoding of manpages
Hello,

I think I understand what happened.

A bug was filed against rpm (https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=106050) requesting that the Japanese and Korean man pages be converted from the EUC-JP and EUC-KR encodings, respectively, to UTF-8.

But all the pages were converted assuming the original encoding was EUC-JP. Undoing that conversion restores the original bytes, which were EUC-KR all along, so

    iconv -f UTF-8 -t EUC-JP rpm.8 > rpm.8.recoded

gives the right man page (the same pages sent by Sunjae Park), in EUC-KR.

A bug should be filed against the Red Hat rpm package. They should recode the Korean pages with:

    cat rpm.8 | iconv -f UTF-8 -t EUC-JP | \
      iconv -f EUC-KR -t UTF-8 > rpm.8.recoded && \
      mv rpm.8.recoded rpm.8

(and the same for rpm2cpio.8)

Debian rpm maintainers, to fix the Korean pages, you can either change the encoding in recode_manpages.sh, or run the above command in debian/rules before calling recode_manpages.sh.

Kind Regards,
--
Nekral
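The mix-up described in this message can be reproduced on a short string. What follows is only a minimal sketch, assuming Python's built-in euc_kr and euc_jp codecs map these characters the same way iconv does; the sample text and the function names wrongly_convert and repair are made up for illustration and do not come from the rpm sources.

```python
# Sketch of the suspected upstream mistake and its reversal.
# Assumption: Python's "euc_kr" and "euc_jp" codecs agree with iconv here.


def wrongly_convert(euc_kr_bytes: bytes) -> str:
    """Decode EUC-KR bytes as if they were EUC-JP (the suspected mistake)."""
    return euc_kr_bytes.decode("euc_jp")


def repair(mojibake: str) -> str:
    """Undo the mistake: recover the original bytes, then read them as EUC-KR."""
    return mojibake.encode("euc_jp").decode("euc_kr")


original = "한글"                      # illustrative sample text
raw = original.encode("euc_kr")       # what the man page contained originally
broken = wrongly_convert(raw)         # valid text, but no longer Korean
fixed = repair(broken)                # same idea as the two chained iconv calls

print("broken text differs from original:", broken != original)
print("repair restores original:", fixed == original)
```

The repair works only because the wrong conversion happened to be lossless for these bytes, so encoding back to EUC-JP recovers exactly what was there before.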
Bug#349938: Encoding of manpages
Hello,

In rpm 4.4.1-5, the Korean man pages are encoded in UTF-8.

Can you check if they are displayed correctly in a Korean environment?

IMO, they should be encoded in EUC-KR, however iconv fails to recode the rpm man page to EUC-KR.

If the man page must be recoded, can anybody check why iconv is failing, and submit a patch for the rpm.8 and rpm2cpio.8 man pages? I think the man pages use a character that can't be recoded in EUC-KR.

    $ LC_ALL=C iconv -t EUC-KR -f UTF-8 doc/ko/rpm.8 > /dev/null
    iconv: illegal input sequence at position 79

I attach the 2 manpages.

Thanks in advance,
--
Nekral

Attachments:
rpm.8.gz (Binary data)
rpm2cpio.8.gz (Binary data)
Bug#349938: Encoding of manpages
On Thu, Jan 26, 2006 at 12:57:40PM +0100, Nicolas François wrote:
> Hello,
>
> In rpm 4.4.1-5, the Korean man pages are encoded in UTF-8.
>
> Can you check if they are displayed correctly in a Korean environment?
>
> IMO, they should be encoded in EUC-KR, however iconv fails to recode the
> rpm man page to EUC-KR.

You absolutely want to have the manpage *source* in UTF-8, principally to keep in sync with upstream, but also so we can make changes easily.

> If the man page must be recoded, can anybody check why iconv is failing,
> and submit a patch for the rpm.8 and rpm2cpio.8 man pages?
> I think the man pages use a character that can't be recoded in EUC-KR.
>
> $ LC_ALL=C iconv -t EUC-KR -f UTF-8 doc/ko/rpm.8 > /dev/null
> iconv: illegal input sequence at position 79

Ahh, that means the rpm.8 manpage will be corrupt for Korean users; we will have to see what the problematic characters are so we can strip them out before doing the conversion.

Thanks for pointing that out.

Regards,
Anand

--
`When any government, or any church for that matter, undertakes to say to its subjects, This you may not read, this you must not see, this you are forbidden to know, the end result is tyranny and oppression no matter how holy the motives' -- Robert A Heinlein, If this goes on --
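To see which characters trip a conversion like the iconv call quoted above, one option is to ask the codec where it gives up. This is a rough sketch, assuming Python's built-in euc_kr codec rejects the same characters iconv would; the helper name first_unencodable is hypothetical, and the byte offset it reports is only assumed to be comparable to the position iconv prints.

```python
# Sketch: find the first character that a target encoding cannot represent.
# Assumption: Python's "euc_kr" codec covers the same repertoire as iconv's.


def first_unencodable(text: str, encoding: str = "euc_kr"):
    """Return (char_index, utf8_byte_offset, char) for the first character
    that cannot be encoded, or None if the whole text encodes cleanly."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError as err:
        idx = err.start
        # Offset of that character within the UTF-8 form of the text.
        byte_offset = len(text[:idx].encode("utf-8"))
        return idx, byte_offset, text[idx]
    return None


# Illustrative inputs: plain Korean text encodes cleanly, while a Thai
# letter (not part of the Korean character set) does not.
print(first_unencodable("rpm 한글"))
print(first_unencodable("한ก"))
```

Inspecting the reported character before stripping anything helps distinguish a genuinely unsupported symbol from text that was converted with the wrong source encoding.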