Bug#349938: Encoding of manpages

2006-01-27 Thread Sunjae Park
Wow, either that's an encoding disaster, or I'm not doing something right.

First of all, I cannot say that the files you sent are in UTF-8. How
should I say this...
The text is not in Korean at all. It uses letters that are legal in a
Korean encoding, but the result is completely unintelligible.

I guess an analogy would be easier.

French uses letters like a, h, or ç, right? But what if the h's are
changed into a's and the a's into ç's? Still valid to the computer, but
not to humans.

I attached the manpage files installed on my system. They are encoded
in EUC-KR, and display just fine. They might be a bit dated, though.
Could anyone help me investigate this problem?

2006/1/26, Nicolas François [EMAIL PROTECTED]:
> Hello,
>
> In rpm 4.4.1-5, the Korean man pages are encoded in UTF-8.
> Can you check if they are displayed correctly in a Korean environment?
>
> IMO, they should be encoded in EUC-KR; however, iconv fails to recode the
> rpm man page to EUC-KR.
>
> If the man page must be recoded, can anybody check why iconv is failing,
> and submit a patch for the rpm.8 and rpm2cpio.8 man pages? I think the man
> pages use a character that can't be recoded in EUC-KR.
>
> $ LC_ALL=C iconv -t EUC-KR -f UTF-8 < doc/ko/rpm.8 > /dev/null
> iconv: illegal input sequence at position 79
>
> I attach the 2 manpages.
>
> Thanks in advance,
> --
> Nekral





--
Sunjae Park(daréhanl)

We choose to go to the moon and do the other things, not because they
are easy, but because they are hard.
 - John F. Kennedy -


rpm.8.gz
Description: GNU Zip compressed data


rpm2cpio.8.gz
Description: GNU Zip compressed data


Bug#349938: Encoding of manpages

2006-01-27 Thread Nicolas François
Hello,


I think I understand what happened.
A bug was filed against rpm
(https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=106050) to request
that the Japanese and Korean man pages be converted to UTF-8 from the
EUC-JP and EUC-KR encodings, respectively.
However, all the pages, including the Korean ones, were converted
assuming the original encoding was EUC-JP.

So
  iconv -f UTF-8 -t EUC-JP < rpm.8 > rpm.8.recoded
gives the right man page (the same pages sent by Sunjae Park), in EUC-KR.

A bug should be filed against the Red Hat rpm package. They should
recode the Korean pages with:
  cat rpm.8 | iconv -f UTF-8 -t EUC-JP | \
  iconv -f EUC-KR -t UTF-8 > rpm.8.recoded && \
  mv rpm.8.recoded rpm.8
(the same with rpm2cpio)

Debian rpm maintainers, to fix the Korean pages, you can either change the
encoding in recode_manpages.sh, or call the above command in debian/rules
before calling recode_manpages.sh.
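
The mechanism described above can be reproduced on a small sample. The following is only a sketch: the function names are made up for illustration, and it assumes an iconv that supports both EUC-JP and EUC-KR. It takes the EUC-KR bytes of a Korean word, mis-converts them as if they were EUC-JP, and then applies the two-step repair.

```shell
#!/bin/sh
# Illustrative helpers only; these names do not come from rpm or Debian.

# Simulate the mistaken conversion: bytes that are really EUC-KR are
# read as if they were EUC-JP and written out as UTF-8.
misconvert() {
    iconv -f EUC-JP -t UTF-8
}

# Undo it: encoding the UTF-8 text as EUC-JP restores the original bytes,
# which are then decoded with the encoding they were really written in.
repair() {
    iconv -f UTF-8 -t EUC-JP | iconv -f EUC-KR -t UTF-8
}

# Sample input: the EUC-KR bytes C7 D1 B1 DB (the Korean word "hangul").
sample_euckr() {
    printf '\307\321\261\333\n'
}

damaged=$(sample_euckr | misconvert)
repaired=$(printf '%s\n' "$damaged" | repair)
expected=$(sample_euckr | iconv -f EUC-KR -t UTF-8)

if [ "$repaired" = "$expected" ]; then
    echo "round trip OK"
else
    echo "round trip FAILED"
fi
```

The damaged text is valid UTF-8, which is why nothing complained when the pages were converted; it is simply made of the wrong characters.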

Kind Regards,
-- 
Nekral




Bug#349938: Encoding of manpages

2006-01-26 Thread Nicolas François
Hello,

In rpm 4.4.1-5, the Korean man pages are encoded in UTF-8.
Can you check if they are displayed correctly in a Korean environment?

IMO, they should be encoded in EUC-KR; however, iconv fails to recode the
rpm man page to EUC-KR.

If the man page must be recoded, can anybody check why iconv is failing,
and submit a patch for the rpm.8 and rpm2cpio.8 man pages? I think the man
pages use a character that can't be recoded in EUC-KR.

$ LC_ALL=C iconv -t EUC-KR -f UTF-8 < doc/ko/rpm.8 > /dev/null
iconv: illegal input sequence at position 79

I attach the 2 manpages.

Thanks in advance,
-- 
Nekral


rpm.8.gz
Description: Binary data


rpm2cpio.8.gz
Description: Binary data


Bug#349938: Encoding of manpages

2006-01-26 Thread Anand Kumria
On Thu, Jan 26, 2006 at 12:57:40PM +0100, Nicolas François wrote:
> Hello,
>
> In rpm 4.4.1-5, the Korean man pages are encoded in UTF-8.
> Can you check if they are displayed correctly in a Korean environment?
>
> IMO, they should be encoded in EUC-KR; however, iconv fails to recode the
> rpm man page to EUC-KR.

You absolutely want to have the manpages *source* in UTF-8, principally
to keep in sync with upstream, but also so we can make changes easily.

> If the man page must be recoded, can anybody check why iconv is failing,
> and submit a patch for the rpm.8 and rpm2cpio.8 man pages? I think the man
> pages use a character that can't be recoded in EUC-KR.
>
> $ LC_ALL=C iconv -t EUC-KR -f UTF-8 < doc/ko/rpm.8 > /dev/null
> iconv: illegal input sequence at position 79

Ahh, that means the rpm.8 manpage will be corrupt for Korean users; we will 
have to see what the problematic characters are so we can strip them out 
before doing the conversion then.
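
One way to see which characters are involved is to dump the bytes around the position iconv reports. This is a rough sketch: `bytes_at` is a made-up helper name, and it assumes the reported position is a 0-based byte offset into the input.

```shell
#!/bin/sh
# Hypothetical helper: print COUNT bytes of FILE in hex, starting at byte
# OFFSET. Assumes OFFSET is 0-based, as the position printed by iconv
# appears to be.
bytes_at() {
    file=$1
    offset=$2
    count=$3
    dd if="$file" bs=1 skip="$offset" count="$count" 2>/dev/null | od -An -tx1
}

# Small demonstration on a temporary file: three ASCII bytes, then one
# three-byte UTF-8 character (ED 95 9C), then more ASCII.
tmp=$(mktemp)
printf 'abc\355\225\234def\n' > "$tmp"
bytes_at "$tmp" 3 3
rm -f "$tmp"
```

For the man page in question the call would be something like `bytes_at doc/ko/rpm.8 79 6`, and the hex values can then be looked up to identify the character that has no EUC-KR equivalent.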

Thanks for pointing that out.

Regards,
Anand

-- 
 `When any government, or any church for that matter, undertakes to say to
  its subjects, This you may not read, this you must not see, this you are
  forbidden to know, the end result is tyranny and oppression no matter how
  holy the motives' -- Robert A Heinlein, If this goes on --
