Re: Chinese big5 encoding and PO files

2003-01-29 Thread Peter Karlsson
Denis Barbier:

 This is why I suggest to insert backslashes in MO files and not PO files.
 Those backslashes will be removed when processed through WML.

The thing is that there are no backslashes, only 0x5C bytes. Those are
very different, although they might look very similar at a first
look...

It's possible that one can apply workarounds for this, but it is much
better to fix the tools that read the files to handle them properly.
Gettext should already be encoding safe, shouldn't it?

-- 
\\//
Peter - http://www.softwolves.pp.se/
  I do not read or respond to mail with HTML attachments.



Processed: Re: Bug#133280: wrong links at wmweather+ package page

2003-01-29 Thread Debian Bug Tracking System
Processing commands for [EMAIL PROTECTED]:

 tags 133280 + patch
Bug#133280: file list for aewm++ displays aewm files
There were no tags set.
Tags added: patch

 thanks
Stopping processing here.

Please contact me if you need assistance.

Debian bug tracking system administrator
(administrator, Debian Bugs database)



Re: Chinese big5 encoding and PO files

2003-01-29 Thread Denis Barbier
On Wed, Jan 29, 2003 at 07:28:57AM +0100, Peter Karlsson wrote:
 Denis Barbier:
 
  This is why I suggest to insert backslashes in MO files and not PO files.
  Those backslashes will be removed when processed through WML.
 
 The thing is that there are no backslashes, only 0x5C bytes. Those are
 very different, although they might look very similar at a first
 look...

Err, ascii(7) tells me that 0x5C *is* a backslash.

 It's possible that one can apply workarounds for this, but it is much
 better to fix the tools that read the files to handle them properly.
 Gettext should already be encoding safe, shouldn't it?

It is, and I will fix WML too.

Could you please have a look at chinese/po/others.zh.po and tell me
what to do with Subscribe/Unsubscribe translations?

Denis



Re: family name, personal name in devel/people

2003-01-29 Thread Josip Rodin
On Tue, Jan 28, 2003 at 07:48:35PM -0800, Osamu Aoki wrote:
   I imagine names in http://www.debian.org/devel/people have the
   unified format of Surname, Given name.
   
   I found two exceptions:
   
   Shuzo, Hatta, where Hatta is surname and Shuzo is given name.
   
   Yuuma, Oohara, where Oohara is surname and Yuuma is given name.
  
  Thanks! I've added them to the exception list.
 
 I am wondering why these strange entries exist.  Are they mistakes in the
 original data entry?  I have never seen Japanese names spelled that way.
 
  Shuzo Hatta    (Most common, passport)
  Hatta, Shuzo   (Many government forms, some academic papers)
  Hatta Shuzo    (If you are a historical figure or famous literary writer)
  
 Where was this data originally taken from?  Just curious.

From their own packages' Maintainer fields, i.e. they wrote it themselves:

% grep-available -F Maintainer Shuzo -s Maintainer | sort -u
Maintainer: HATTA Shuzo [EMAIL PROTECTED]
Maintainer: Hatta Shuzo [EMAIL PROTECTED]

% grep-available -F Maintainer Yuuma -s Maintainer | sort -u
Maintainer: Oohara Yuuma [EMAIL PROTECTED]

Perhaps one of you could politely inform these two developers that they
might get the westerners to read their name right if they changed the
ordering? :)

-- 
 2. That which causes joy or happiness.



Re: Chinese big5 encoding and PO files

2003-01-29 Thread Peter Karlsson
Denis Barbier:

 Err, ascii(7) tells me that 0x5C *is* a backslash.

Yes, but these documents aren't ASCII, so 0x5C may or may not be a
backslash there, depending on where it is located in the file.

 Could you please have a look at chinese/po/others.zh.po and tell me
 what to do with Subscribe/Unsubscribe translations?

Nothing should need to be done. Since the 0x5C byte is the trail byte
of the character, a proper MBCS-aware string scanner will recognize
that it is not a backslash character (unlike, for instance, in the
"please respect the ad policy" string a bit further down, which *does*
contain a backslash in the translation). Getting the string scanner to
work properly requires configuring the locales properly.

Big5 is a bit problematic since it allows non-highbit characters as
trail bytes, similar to the problems with ISO 2022-JP. A stateful
string scanner is required to handle it properly. LibC should work fine
as long as the proper locale is available, and I am pretty sure that
the gettext utilities will handle this properly.
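
To illustrate the point about trail bytes, here is a minimal Python
sketch. It is not how gettext, WML or libc do it; it only assumes that
Big5 lead bytes fall in the range 0x81..0xFE and that each lead byte is
followed by exactly one trail byte. The function name
real_backslash_offsets is made up for this example.

```python
def real_backslash_offsets(data: bytes) -> list[int]:
    """Return offsets of 0x5C bytes that are genuine backslashes.

    Assumption for this sketch: a byte in 0x81..0xFE is a Big5 lead
    byte and the byte after it is its trail byte, whatever its value.
    """
    offsets = []
    i = 0
    while i < len(data):
        byte = data[i]
        if 0x81 <= byte <= 0xFE and i + 1 < len(data):
            # Two-byte character: skip the trail byte without inspecting it.
            i += 2
            continue
        if byte == 0x5C:
            offsets.append(i)
        i += 1
    return offsets


# One two-byte Big5 character whose trail byte is 0x5C, followed by
# the ASCII text a\n, which contains a genuine backslash.
sample = b"\xa5\x5c" + b"a\\n"

naive_count = sample.count(b"\\")          # counts every 0x5C byte
aware_offsets = real_backslash_offsets(sample)
print(naive_count, aware_offsets)
```

A byte-oriented search sees two 0x5C bytes in the sample, while the
lead-byte-aware scan reports only the one that is really a backslash.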

-- 
\\//
Peter - http://www.softwolves.pp.se/
  I do not read or respond to mail with HTML attachments.



Re: family name, personal name in devel/people

2003-01-29 Thread Josip Rodin
On Wed, Jan 29, 2003 at 01:56:30PM +0900, Tomohiro KUBOTA wrote:
 You know, some Japanese people write names in their native order,
 Family Given, and such expressions exist in db.debian.org database.
 
 ... but I checked the script
 (klecker:/org/www.debian.org/cron/people_scripts/people.pl) and
 I couldn't find additional handlers.

On that note, there should indeed be one such handler -- to make cn, sn ldap
fields preferred somehow, and one other handler -- to understand LASTNAME
Firstname and Firstname LASTNAME properly. Anyone care to write a patch? :)
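
As a starting point for the second handler, here is a rough Python
sketch of the idea (people.pl itself is Perl, so this is an
illustration, not a patch). It assumes that a single all-uppercase
token marks the family name and otherwise falls back to treating the
last token as the family name; split_name is a hypothetical helper.

```python
def split_name(full_name: str) -> tuple[str, str]:
    """Return (family, given) for 'LASTNAME Firstname' or 'Firstname LASTNAME'.

    Assumption for this sketch: exactly one all-uppercase token of more
    than one letter marks the family name. Without such a token, the
    last token is taken as the family name (Western order).
    """
    tokens = full_name.split()
    if not tokens:
        raise ValueError("empty name")
    upper = [t for t in tokens if len(t) > 1 and t.isupper()]
    if len(upper) == 1:
        family = upper[0]
        given = " ".join(t for t in tokens if t != family)
        return family.capitalize(), given
    return tokens[-1], " ".join(tokens[:-1])


for name in ("HATTA Shuzo", "Tomohiro KUBOTA", "Josip Rodin"):
    print(name, "->", split_name(name))
```

Names written in native order without an uppercase family name (for
example "Oohara Yuuma") cannot be detected this way and would still
need the exception list or the cn, sn fields.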

-- 
 2. That which causes joy or happiness.



Bug#178831: packages.debian.org could use real substring searches

2003-01-29 Thread Gerfried Fuchs
severity wishlist
merge 178831 103694
thanks

* Ralph Siemsen [EMAIL PROTECTED] [2003-01-28 17:11]:
 I realize the comments right on the page indicate that this doesn't
 happen, so it's not a bug.. so could this go on the wishlist?

 You mean like #103694, #115004, #156794 and #175644?  Yes, of course,
you are now merged with them.

 Have fun,
Alfie
-- 
Debian strictly separates stable, unstable and testing releases, so that
you can decide whether you want to shoot the enemy, your foot, or both
feet at once.
   -- Robin S. Socha in [EMAIL PROTECTED]



Re: Chinese big5 encoding and PO files

2003-01-29 Thread Gerfried Fuchs
* Denis Barbier [EMAIL PROTECTED] [2003-01-29 10:32]:
 On Wed, Jan 29, 2003 at 07:28:57AM +0100, Peter Karlsson wrote:
 The thing is that there are no backslashes, only 0x5C bytes. Those are
 very different, although they might look very similar at a first
 look...
 
 Err, ascii(7) tells me that 0x5C *is* a backslash.

 I guess Peter meant that in a multibyte environment.  In that 0x5C
might not (always) be a backslash...

 Just trying to clear the thing.
Alfie
-- 
...you might as well skip the Xmas celebration completely, and instead
sit in front of your linux computer playing with the all-new-and-improved
linux kernel version.
-- Linus Torvalds



Re: family name, personal name in devel/people

2003-01-29 Thread Tomohiro KUBOTA
Hi,

From: Josip Rodin [EMAIL PROTECTED]
Subject: Re: family name, personal name in devel/people
Date: Wed, 29 Jan 2003 10:33:58 +0100

 % grep-available -F Maintainer Shuzo -s Maintainer | sort -u
 Maintainer: HATTA Shuzo [EMAIL PROTECTED]
 Maintainer: Hatta Shuzo [EMAIL PROTECTED]
 
 % grep-available -F Maintainer Yuuma -s Maintainer | sort -u
 Maintainer: Oohara Yuuma [EMAIL PROTECTED]
 
 Perhaps one of you could politely inform these two developers that they
 might get the westerners to read their name right if they changed the
 ordering? :)

Well, I don't want to do this, and I don't want anybody else to do it.
It is not a good idea that non-westerners have to follow the customs
of westerners while westerners need not follow those of
non-westerners.  Non-westerners already bear the cost of learning
many western customs whenever we want to do something in
international communities, and I want to reduce that load if possible.

I think I can ask them to write the family name in uppercase; that is
the most I can ask of them.  I don't know whether they will accept even
this idea.  Please note that this *is* what I recently mentioned as
a 10-year flamewar, and I *never* want to join it; even asking them to
write the family name in uppercase might rekindle the flamewar.  (If I
asked them to change the name order, I would certainly stir up the
core of the flamewar, and Japanese members of Debian might drop
their activity as developers.)

---
Tomohiro KUBOTA [EMAIL PROTECTED]
http://www.debian.or.jp/~kubota/




Processed: Re: Processed: Re: Bug#178831: packages.debian.org could use real substring searches

2003-01-29 Thread Debian Bug Tracking System
Processing commands for [EMAIL PROTECTED]:

 severity 178831 wishlist
Bug#178831: packages.debian.org could use real substring searches
Severity set to `wishlist'.

 merge 178831 103694
Bug#103694: packages.d.o doesn't search on subwords in package names
Bug#178831: packages.debian.org could use real substring searches
Bug#115004: packages.d.o: needs searching on REAL subwords
Bug#156794: Package search subwords not working
Bug#175644: www.debian.org: searching on packages.debian.org doesn't seem to work
Merged 103694 115004 156794 175644 178831.

 thanks
Stopping processing here.

Please contact me if you need assistance.

Debian bug tracking system administrator
(administrator, Debian Bugs database)



Re: Chinese big5 encoding and PO files

2003-01-29 Thread Denis Barbier
On Wed, Jan 29, 2003 at 11:14:56AM +0100, Peter Karlsson wrote:
 Denis Barbier:
 
  Err, ascii(7) tells me that 0x5C *is* a backslash.
 
 Yes, but these documents aren't ASCII, so 0x5C may or may not be a
 backslash there, depending on where it is located in the file.

Ok.

  Could you please have a look at chinese/po/others.zh.po and tell me
  what to do with Subscribe/Unsubscribe translations?
 
 Nothing should need to be done, since the 0x5C byte is the trail byte
 of the character, a proper MBCS aware string scanner will recognize
 that it is not a backslash character (unlike, for instance, in the
 please respect the ad policy string a bit further down, which *does*
 contain a backslash in the translation). Getting the string scanner to
 work properly requires configuring the locales properly.

The problem with the current WML is that streams are bytes and not
characters; this is why 0x5C bytes have to be escaped.
I am preparing a character-oriented version, but there are major backward
compatibility problems.  It means that any single file must contain only
one encoding, so some files have to be fixed under webwml.

 Big5 is a bit problematic since it allows non-highbit characters as
 trail bytes, similar to the problems with ISO 2022-JP. A stateful
 string scanner is required to handle it properly. LibC should work fine
 as long as the proper locale is available, and I am pretty sure that
 the gettext utilities will handle this properly.

Yes, gettext is safe.

Instead of escaping some problematic characters, a better solution could
be to perform encoding conversions (as with Japanese files) to a safe
encoding.  Is there anyone interested in testing this scheme?
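
A minimal Python sketch of such a conversion, for anyone who wants to
experiment. It assumes UTF-8 as the "safe" encoding (my choice for the
example, nothing has been decided), since every byte of a non-ASCII
character in UTF-8 is 0x80 or above. The helper name to_safe_encoding
is hypothetical.

```python
def to_safe_encoding(data: bytes, source_encoding: str = "big5") -> bytes:
    """Re-encode bytes into UTF-8 (assumed safe encoding for this sketch).

    In UTF-8 every byte of a multibyte character is >= 0x80, so a 0x5C
    byte in the result can only be a genuine backslash.
    """
    return data.decode(source_encoding).encode("utf-8")


# A two-byte Big5 character whose trail byte is 0x5C, then ASCII text.
big5_text = b"\xa5\x5c" + b" ok"
utf8_text = to_safe_encoding(big5_text)

print(0x5C in big5_text, 0x5C in utf8_text)
```

The converted bytes can then go through byte-oriented passes without
escaping, and be converted back to Big5 at the end if that is the
encoding the pages must be served in.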

Denis



Re: Chinese big5 encoding and PO files

2003-01-29 Thread Rex Tsai

Denis Barbier wrote:

Hi,

there is trouble with big5 encoding in PO files, because some
backslashes are not escaped (e.g. MailingLists/subscribe.wml
cannot be processed).  Maybe fix_big5.pl should be run against
those PO files so that MO files contain escaped backslashes?
But I am not sure that encoding is then still valid, could a
Chinese translator investigate this issue?


I fixed the problem and committed.

No, fix_big5.pl cannot resolve the problem. There is a package called bg5cc
that converts `\' in Big-5 wide characters that appear in source
programs to `\\'. http://packages.debian.org/stable/devel/bg5cc.html

We can use this perl script, too. http://i18n.linux.org.tw/bg5cc.pl

I think we can do this before commit.
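
The kind of conversion described above can be sketched in Python as
follows. This is not the bg5cc code, only an illustration of the idea;
it assumes lead bytes in 0x81..0xFE followed by exactly one trail
byte, and the name escape_trail_backslashes is made up.

```python
def escape_trail_backslashes(data: bytes) -> bytes:
    """Double every 0x5C that is the trail byte of a two-byte character.

    Assumption for this sketch: a byte in 0x81..0xFE is a Big5 lead
    byte and the next byte is its trail byte. Genuine ASCII
    backslashes are left untouched.
    """
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        if 0x81 <= byte <= 0xFE and i + 1 < len(data):
            out += data[i:i + 2]
            if data[i + 1] == 0x5C:
                # A byte-oriented reader will collapse 0x5C 0x5C back
                # to a single 0x5C, restoring the original character.
                out.append(0x5C)
            i += 2
        else:
            out.append(byte)
            i += 1
    return bytes(out)


escaped = escape_trail_backslashes(b"\xa5\x5c!")
print(escaped)
```

Note that the escaped bytes are no longer valid Big5 text on their
own; they only make sense if a later byte-oriented pass removes the
extra 0x5C again.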
--
-Rex, geek by nature linux by choice



Re: Chinese big5 encoding and PO files

2003-01-29 Thread Rex Tsai

Rex Tsai wrote:

I fixed the problem and committed.


Oops, I removed the 0x5C again.  I will check the WML scripts to fix the
problem at WML compile time.


--
-Rex, geek by nature linux by choice



Re: Debian WWW CVS commit by chinese: webwml/chinese/po others.zh.po

2003-01-29 Thread Denis Barbier
On Wed, Jan 29, 2003 at 10:28:34AM -0700, Debian WWW CVS wrote:
 CVSROOT:  /cvs/webwml
 Module name:  webwml
 Changes by:   chinese 03/01/29 10:28:34
 
 Modified files:
   chinese/po : others.zh.po 
 
 Log message:
   removed backslash. The 0x5C works with new gettext (after 
 gettext-0.10.38?)

The problem occurs when strings are extracted from MO files, because it
happens during WML pass 2.  Consider for instance this code snippet:
   print "<gettext>foo</gettext>";
and suppose that foo is translated into bar\, then WML will write
   print "bar\";
and pass 3 (eperl) fails.
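
The same failure can be reproduced outside WML. The sketch below is a
Python analogue, not eperl: it assumes the string sits inside a
double-quoted literal marked with gettext tags, substitutes the
translation textually, and checks whether the generated source still
parses. The helper names substitute and compiles are hypothetical.

```python
def substitute(template: str, translation: str) -> str:
    """Textual substitution, standing in for the pass 2 replacement."""
    return template.replace("<gettext>foo</gettext>", translation)


def compiles(source: str) -> bool:
    """Report whether the generated source parses (stand-in for pass 3)."""
    try:
        compile(source, "<generated>", "exec")
    except SyntaxError:
        return False
    return True


template = 'print("<gettext>foo</gettext>")'
translation = "bar\\"  # three letters followed by one 0x5C character

# Raw substitution: the trailing backslash escapes the closing quote.
broken = substitute(template, translation)

# Escaping the backslash before substitution keeps the literal closed.
fixed = substitute(template, translation.replace("\\", "\\\\"))

print(compiles(broken), compiles(fixed))
```

This is why the 0x5C has to be escaped (or the stream made
character-aware) before the generated code reaches the next pass.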

Denis



Re: Debian WWW CVS commit by chinese: webwml/chinese/po others.zh.po

2003-01-29 Thread Peter Karlsson
Denis Barbier:

 and pass 3 (eperl) fails.

Even with locales enabled (use locale; in standard Perl)?

-- 
\\//
Peter - http://www.softwolves.pp.se/
  I do not read or respond to mail with HTML attachments.



Question to the wml encoding of e-mailadresses on the website

2003-01-29 Thread Frank Lichtenheld
Hi.

While working on the translation of a document from the website, I wondered
whether there is any explanation for why some e-mail addresses are written
<kbd>[EMAIL PROTECTED]</kbd> and others are written <email
[EMAIL PROTECTED]>. Is one of these ways deprecated, or is there a
deeper meaning behind it?

Greetings,
Frank Lichtenheld

-- 
Frank Lichtenheld
www:  http://www.djpig.de
mail: [EMAIL PROTECTED]
PGP:  http://www.djpig.de/Frank.Lichtenheld.asc