Bug#199422: Bug#470844: [Pkg-samba-maint] Bug#470844: encoding issue with spaces in nmblookup(1) synopsis

2008-03-30 Thread Colin Watson
On Sun, Mar 23, 2008 at 09:07:03PM -0700, Steve Langasek wrote:
 On Fri, Mar 14, 2008 at 07:29:07AM +0100, Christian Perrier wrote:
  Apparently, these are non-breakable spaces encoded in ISO-8859-1.
  I'm not sure this can be called incorrect. Colin?
 
  I don't see any deep reasons for these spaces to be unbreakable,
  though. I'd rather see them as regular spaces, which would help good
  portability of original manpages.
 
  Samba's manpages are generated from XML files in upstream's samba-doc
  SVN. They are included in upstream's distribution when releases are
  published. The fix should probably go there rather than having a
  Debian-cooked patch to the generated manpages.
 
 Interestingly, there are no non-break spaces in the original HTML sources
 for these documents; so these are entirely an artifact of the translation
 tools (db2man.xsl?) used by upstream.
 
 I've prepared a patch that replaces all of the \xA0 chars in the manpages
 with the \  sequence Colin mentions, and will commit to svn shortly; and
 I'm cloning this bug to man-db in case Colin thinks this smashing of 0xA0 is
 something that should be fixed there.

I initially thought that this was a bug in groff's fonts (they don't
define a character at position 160), and indeed it is possible to make
this bug go away by changing the fonts.

However, after plugging my brain in, I realised that this was wrong.
Fonts deal with output glyphs, not input characters, and this is a
matter of the correct handling of the input character at position 160 in
ISO-8859-1 (or U+00A0). In fact, /usr/share/groff/current/tmac/troffrc
already transliterates \[char160] to \~ (stretchable non-breaking
space). So what's going on?

It turns out that this is the fault of the multibyte patch. The
translation was being read by troff and applied to char160. However,
with the multibyte patch, input character 160 is handled as a wide
character (because its encoding in the input stream may vary).
Characters in the range 128-255 are strange, because, although they are
wide characters, their properties are already stored by ordinary troff
in its charset_table array. Unfortunately, troff wasn't looking these up
properly when the multibyte patch was in effect, and as a result it
ignored the translation requested by troffrc.
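
As an illustration of why the encoding of character 160 "may vary" (a minimal Python sketch, not taken from groff or from the multibyte patch), the same character arrives as one byte in ISO-8859-1 input and as two bytes in UTF-8 input:

```python
# U+00A0 NO-BREAK SPACE is character 160 in both ISO-8859-1 and Unicode.
NBSP = "\u00a0"

# The code point is the same, but the bytes on the wire differ by encoding.
latin1_bytes = NBSP.encode("iso-8859-1")  # a single byte, 0xA0
utf8_bytes = NBSP.encode("utf-8")         # two bytes, 0xC2 0xA0

print(ord(NBSP), latin1_bytes, utf8_bytes)
```

Once the input has been decoded, both forms should map to the same character 160, and so to the same troffrc translation.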

Fixing the lookup code in wcharset_table_entry makes the translation
work again. I've applied this fix for my next upload.
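
The kind of lookup failure described above can be modelled roughly as follows. This is a toy Python sketch under stated assumptions: every name in it (narrow_table, wide_table, buggy_wide_lookup, fixed_wide_lookup) is hypothetical, and it is not the actual troff code or the actual fix to wcharset_table_entry.

```python
# Toy model only: all names here are hypothetical, not groff's source.
NARROW_TABLE_SIZE = 256


class CharInfo:
    """Per-character properties, including an optional translation."""

    def __init__(self, code):
        self.code = code
        self.translation = None  # would be set by a translation request


# Table used by ordinary troff for characters 0-255.
narrow_table = [CharInfo(c) for c in range(NARROW_TABLE_SIZE)]
# Separate storage assumed for wide characters.
wide_table = {}


def buggy_wide_lookup(code):
    # Always consults the wide storage, so properties already recorded
    # in narrow_table for 128-255 are never found.
    return wide_table.setdefault(code, CharInfo(code))


def fixed_wide_lookup(code):
    # Characters below 256 already live in the ordinary table.
    if code < NARROW_TABLE_SIZE:
        return narrow_table[code]
    return wide_table.setdefault(code, CharInfo(code))


# Model troffrc translating character 160 to a stretchable no-break space.
narrow_table[0xA0].translation = r"\~"

print(buggy_wide_lookup(0xA0).translation)
print(fixed_wide_lookup(0xA0).translation)
```

In this model the buggy lookup never sees the translation stored for character 160, while the fixed lookup returns it.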

Cheers,

-- 
Colin Watson   [EMAIL PROTECTED]



-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Bug#470844: [Pkg-samba-maint] Bug#470844: encoding issue with spaces in nmblookup(1) synopsis

2008-03-23 Thread Steve Langasek
clone 470844 -1
reassign -1 man-db
retitle -1 man: does not handle embedded iso8859-1 nbsp chars
thanks

On Fri, Mar 14, 2008 at 07:29:07AM +0100, Christian Perrier wrote:
 Quoting Filipus Klutiero ([EMAIL PROTECTED]):
  Package: samba-common
  Version: 3.0.28-4
  Severity: minor

  konqueror shows the SYNOPSIS section of nmblookup's manpage like this:

  SYNOPSIS
  nmblookup [-M] [-R] [-S] [-r] [-A] [-h] [-B�broadcast�address] 
  [-U�unicast�address] [-d�debug�level] [-s�smb�config�file] 
  [-i�NetBIOS�scope] [-T] [-f] {name} 

  man like this:
  SYNOPSIS
 nmblookup [-M] [-R] [-S] [-r] [-A] [-h] [-Bbroadcastaddress] 
  [-Uunicastaddress] [-ddebuglevel] [-ssmbconfigfile] 
  [-iNetBIOSscope] 
  [-T] [-f] {name}

  The characters look like simple spaces in stable.

 Apparently, these are non-breakable spaces encoded in ISO-8859-1.
 I'm not sure this can be called incorrect. Colin?

 I don't see any deep reasons for these spaces to be unbreakable,
 though. I'd rather see them as regular spaces, which would help good
 portability of original manpages.

 Samba's manpages are generated from XML files in upstream's samba-doc
 SVN. They are included in upstream's distribution when releases are
 published. The fix should probably go there rather than having a
 Debian-cooked patch to the generated manpages.

Interestingly, there are no non-break spaces in the original HTML sources
for these documents; so these are entirely an artifact of the translation
tools (db2man.xsl?) used by upstream.

I've prepared a patch that replaces all of the \xA0 chars in the manpages
with the \  sequence Colin mentions, and will commit to svn shortly; and
I'm cloning this bug to man-db in case Colin thinks this smashing of 0xA0 is
something that should be fixed there.
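
A replacement of this kind could be sketched as below. This is a minimal Python illustration, not the patch that was committed; the function name replace_nbsp is hypothetical, and it assumes the manpage source is ISO-8859-1, where byte 0xA0 can only mean a no-break space.

```python
# Assumption: the input is ISO-8859-1 roff source, so every 0xA0 byte
# is a no-break space. The function name is hypothetical.
NBSP_LATIN1 = b"\xa0"
ROFF_FIXED_NBSP = b"\\ "  # backslash-space: fixed-width unbreakable space


def replace_nbsp(roff_source: bytes) -> bytes:
    """Replace raw 0xA0 bytes with the roff escape for an unbreakable space."""
    return roff_source.replace(NBSP_LATIN1, ROFF_FIXED_NBSP)


line = b"[-B\xa0broadcast\xa0address]"
print(replace_nbsp(line))
```

This byte-level approach would not be safe on UTF-8 input, where 0xA0 also occurs as a continuation byte of other characters (for example, "à" is encoded as 0xC3 0xA0).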

Long term, fixing this upstream will require a fix to whatever version of
the db2man toolchain upstream is using.

Cheers,
-- 
Steve Langasek   Give me a lever long enough and a Free OS
Debian Developer   to set it on, and I can move the world.
Ubuntu Developer   http://www.debian.org/
[EMAIL PROTECTED] [EMAIL PROTECTED]




Bug#470844: encoding issue with spaces in nmblookup(1) synopsis

2008-03-14 Thread Colin Watson
On Thu, Mar 13, 2008 at 08:07:09AM -0400, Filipus Klutiero wrote:
 konqueror shows the SYNOPSIS section of nmblookup's manpage like this:
 
 SYNOPSIS
 nmblookup [-M] [-R] [-S] [-r] [-A] [-h] [-B�broadcast�address] 
 [-U�unicast�address] [-d�debug�level] [-s�smb�config�file] 
 [-i�NetBIOS�scope] [-T] [-f] {name} 
 
 man like this:
 SYNOPSIS
nmblookup [-M] [-R] [-S] [-r] [-A] [-h] [-Bbroadcastaddress] 
 [-Uunicastaddress] [-ddebuglevel] [-ssmbconfigfile] [-iNetBIOSscope] 
 [-T] [-f] {name}
 
 The characters look like simple spaces in stable.

What version of konqueror do you have installed?

-- 
Colin Watson   [EMAIL PROTECTED]






Bug#470844: encoding issue with spaces in nmblookup(1) synopsis

2008-03-14 Thread Filipus Klutiero
On March 14, 2008 05:16:56 am, Colin Watson wrote:
 What version of konqueror do you have installed?

4:3.5.8.dfsg.1-7




Bug#470844: encoding issue with spaces in nmblookup(1) synopsis

2008-03-14 Thread Colin Watson
On Fri, Mar 14, 2008 at 05:26:52AM -0400, Filipus Klutiero wrote:
 On March 14, 2008 05:16:56 am, Colin Watson wrote:
  What version of konqueror do you have installed?
 
 4:3.5.8.dfsg.1-7

I'd be interested to know if 4:3.5.9.dfsg.1-1 helps with this.

  * Add 71_kio_man_utf8 patch to support the recode feature of man-db.
This allows the man kio slave's rendering code to work regardless of the
source encoding of the manual page.
Thanks to Colin Watson. (Closes: #449554)
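
As a rough illustration of why recoding matters (a Python sketch only; it has not been checked against what the man kio slave actually did, and the function name recode_latin1_to_utf8 is hypothetical), an ISO-8859-1 no-break space read as UTF-8 without recoding turns into a replacement character, much like the garbage seen in the report:

```python
def recode_latin1_to_utf8(page: bytes) -> bytes:
    """Convert ISO-8859-1 page source to UTF-8 (hypothetical helper)."""
    return page.decode("iso-8859-1").encode("utf-8")


raw = b"[-B\xa0broadcast\xa0address]"

# A lone 0xA0 byte is not valid UTF-8, so a lenient decoder substitutes
# U+FFFD REPLACEMENT CHARACTER for it.
misread = raw.decode("utf-8", errors="replace")

# After recoding, a UTF-8 consumer sees a real no-break space (U+00A0).
recoded = recode_latin1_to_utf8(raw).decode("utf-8")

print(repr(misread))
print(repr(recoded))
```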

-- 
Colin Watson   [EMAIL PROTECTED]






Bug#470844: [Pkg-samba-maint] Bug#470844: encoding issue with spaces in nmblookup(1) synopsis

2008-03-14 Thread Christian Perrier
Quoting Filipus Klutiero ([EMAIL PROTECTED]):
 Package: samba-common
 Version: 3.0.28-4
 Severity: minor
 
 konqueror shows the SYNOPSIS section of nmblookup's manpage like this:
 
 SYNOPSIS
 nmblookup [-M] [-R] [-S] [-r] [-A] [-h] [-B�broadcast�address] 
 [-U�unicast�address] [-d�debug�level] [-s�smb�config�file] 
 [-i�NetBIOS�scope] [-T] [-f] {name} 
 
 man like this:
 SYNOPSIS
nmblookup [-M] [-R] [-S] [-r] [-A] [-h] [-Bbroadcastaddress] 
 [-Uunicastaddress] [-ddebuglevel] [-ssmbconfigfile] [-iNetBIOSscope] 
 [-T] [-f] {name}
 
 The characters look like simple spaces in stable.


Apparently, these are non-breakable spaces encoded in ISO-8859-1.
I'm not sure this can be called incorrect. Colin?

I don't see any deep reasons for these spaces to be unbreakable,
though. I'd rather see them as regular spaces, which would help good
portability of original manpages.

Samba's manpages are generated from XML files in upstream's samba-doc
SVN. They are included in upstream's distribution when releases are
published. The fix should probably go there rather than having a
Debian-cooked patch to the generated manpages.





Bug#470844: [Pkg-samba-maint] Bug#470844: encoding issue with spaces in nmblookup(1) synopsis

2008-03-14 Thread Colin Watson
On Fri, Mar 14, 2008 at 07:29:07AM +0100, Christian Perrier wrote:
 Quoting Filipus Klutiero ([EMAIL PROTECTED]):
  konqueror shows the SYNOPSIS section of nmblookup's manpage like this:
  
  SYNOPSIS
  nmblookup [-M] [-R] [-S] [-r] [-A] [-h] [-B�broadcast�address] 
  [-U�unicast�address] [-d�debug�level] [-s�smb�config�file] 
  [-i�NetBIOS�scope] [-T] [-f] {name} 
  
  man like this:
  SYNOPSIS
 nmblookup [-M] [-R] [-S] [-r] [-A] [-h] [-Bbroadcastaddress] 
  [-Uunicastaddress] [-ddebuglevel] [-ssmbconfigfile] 
  [-iNetBIOSscope] 
  [-T] [-f] {name}
  
  The characters look like simple spaces in stable.
 
 Apparently, these are non-breakable spaces encoded in ISO-8859-1.
 I'm not sure this can be called incorrect. Colin?
 
 I don't see any deep reasons for these spaces to be unbreakable,
 though. I'd rather see them as regular spaces, which would help good
 portability of original manpages.

The correct groff spelling of non-breaking spaces is '\ ' rather than an
ISO-8859-1 character. That said, note what 'info groff' says:

 (2) The last solution, i.e., using escaped spaces, is classical in
  the sense that it can be found in most `troff' documents.
  Nevertheless, it is not optimal in all situations, since `\ ' inserts a
  fixed-width, non-breaking space character which can't stretch.
  `gtroff' provides a different command `\~' to insert a stretchable,
  non-breaking space.

Samba might not want to use the latter since it would be specific to
groff.

Aside from all of this, as soon as I get a chance to do so I would like
to investigate why Konqueror is misrendering this. In the most recent
version it's supposed to handle encodings correctly, but even in the
version Filipus is using it should have expected and correctly handled
ISO-8859-1.

-- 
Colin Watson   [EMAIL PROTECTED]






Bug#470844: encoding issue with spaces in nmblookup(1) synopsis

2008-03-14 Thread Filipus Klutiero
On March 14, 2008 05:46:22 am, Colin Watson wrote:
 On Fri, Mar 14, 2008 at 05:26:52AM -0400, Filipus Klutiero wrote:
  On March 14, 2008 05:16:56 am, Colin Watson wrote:
   What version of konqueror do you have installed?
 
  4:3.5.8.dfsg.1-7

 I'd be interested to know if 4:3.5.9.dfsg.1-1 helps with this.

Ah, it does fix the konqueror part. The problem with man(1) remains, but
after checking further I realized that it does not happen when running
$ LANG=C man nmblookup

$ locale
LANG=fr_CA.UTF-8
LC_CTYPE=fr_CA.UTF-8
LC_NUMERIC=fr_CA.UTF-8
LC_TIME=fr_CA.UTF-8
LC_COLLATE=fr_CA.UTF-8
LC_MONETARY=fr_CA.UTF-8
LC_MESSAGES=fr_CA.UTF-8
LC_PAPER=fr_CA.UTF-8
LC_NAME=fr_CA.UTF-8
LC_ADDRESS=fr_CA.UTF-8
LC_TELEPHONE=fr_CA.UTF-8
LC_MEASUREMENT=fr_CA.UTF-8
LC_IDENTIFICATION=fr_CA.UTF-8
LC_ALL=

So maybe this should be reassigned to man-db.




Bug#470844: [Pkg-samba-maint] Bug#470844: encoding issue with spaces in nmblookup(1) synopsis

2008-03-14 Thread Steve Langasek
On Fri, Mar 14, 2008 at 01:12:41PM +, Colin Watson wrote:
 The correct groff spelling of non-breaking spaces is '\ ' rather than an
 ISO-8859-1 character. That said, note what 'info groff' says:

  (2) The last solution, i.e., using escaped spaces, is classical in
   the sense that it can be found in most `troff' documents.
   Nevertheless, it is not optimal in all situations, since `\ ' inserts a
   fixed-width, non-breaking space character which can't stretch.
   `gtroff' provides a different command `\~' to insert a stretchable,
   non-breaking space.

 Samba might not want to use the latter since it would be specific to
 groff.

 Aside from all of this, as soon as I get a chance to do so I would like
 to investigate why Konqueror is misrendering this. In the most recent
 version it's supposed to handle encodings correctly, but even in the
 version Filipus is using it should have expected and correctly handled
 ISO-8859-1.

After a closer look I realize that man is also misrendering; the
non-breaking spaces are being turned into zero-width non-breaking spaces,
which isn't correct either :)

So we should probably fix the html source to use &nbsp; instead of a literal
iso8859-1 no-break space, and then try to work out afterwards whether fixes
are needed to the toolchain.
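
Swapping literal no-break spaces for an entity could look roughly like this. It is a Python sketch under stated assumptions, not something taken from the Samba documentation build, and the function name nbsp_to_entity is hypothetical. Note that plain XML only predefines lt, gt, amp, apos and quot, so a named nbsp entity needs a DTD that declares it; the numeric form &#160; always works.

```python
import html

NBSP = "\u00a0"


def nbsp_to_entity(text: str, numeric: bool = False) -> str:
    """Replace literal U+00A0 with an entity reference (hypothetical helper)."""
    entity = "&#160;" if numeric else "&nbsp;"
    return text.replace(NBSP, entity)


source = "-B\u00a0broadcast\u00a0address"
named = nbsp_to_entity(source)
numbered = nbsp_to_entity(source, numeric=True)

print(named)
print(numbered)
# Both forms decode back to the original text in an HTML context.
print(html.unescape(named) == source, html.unescape(numbered) == source)
```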

-- 
Steve Langasek   Give me a lever long enough and a Free OS
Debian Developer   to set it on, and I can move the world.
Ubuntu Developer   http://www.debian.org/
[EMAIL PROTECTED] [EMAIL PROTECTED]






Bug#470844: encoding issue with spaces in nmblookup(1) synopsis

2008-03-13 Thread Filipus Klutiero
Package: samba-common
Version: 3.0.28-4
Severity: minor

konqueror shows the SYNOPSIS section of nmblookup's manpage like this:

SYNOPSIS
nmblookup [-M] [-R] [-S] [-r] [-A] [-h] [-B�broadcast�address] 
[-U�unicast�address] [-d�debug�level] [-s�smb�config�file] 
[-i�NetBIOS�scope] [-T] [-f] {name} 

man like this:
SYNOPSIS
   nmblookup [-M] [-R] [-S] [-r] [-A] [-h] [-Bbroadcastaddress] 
[-Uunicastaddress] [-ddebuglevel] [-ssmbconfigfile] [-iNetBIOSscope] 
[-T] [-f] {name}

The characters look like simple spaces in stable.

--- System information. ---
Architecture: i386
Kernel:   Linux 2.6.24

Debian Release: lenny/sid
  990 testing         security.debian.org
  990 testing         ftp.ca.debian.org
  500 unstable        debian.savoirfairelinux.net

--- Package information. ---
Depends                  (Version) | Installed
==================================-+-===================
debconf                   (>= 0.5) | 1.5.19
 OR debconf-2.0                    |
libc6                   (>= 2.7-1) | 2.7-6
libcomerr2             (>= 1.33-3) | 1.40.6-1
libkrb53           (>= 1.6.dfsg.2) | 1.6.dfsg.3~beta1-3
libldap-2.4-2           (>= 2.4.7) | 2.4.7-5
libncurses5    (>= 5.6+20071006-3) | 5.6+20080203-1
libpam-modules                     | 0.99.7.1-5
libpopt0                 (>= 1.10) | 1.10-3
libreadline5              (>= 5.2) | 5.2-3
libuuid1                           | 1.40.6-1
ucf                                | 3.005