Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-15 Thread Vincent Lefevre
On 2011-02-14 16:43:11 +, Ian Jackson wrote: When LC_CTYPE=en_GB.utf-8, programs which attempt to print unicode characters to stdout should use UTF-8. That's what LC_CTYPE means. So, cat, grep, etc. are all broken. :) -- Vincent Lefèvre vinc...@vinc17.net - Web: http://www.vinc17.net/

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-15 Thread Adam Borowski
On Wed, Feb 16, 2011 at 01:01:07AM +0100, Vincent Lefevre wrote: On 2011-02-14 16:43:11 +, Ian Jackson wrote: When LC_CTYPE=en_GB.utf-8, programs which attempt to print unicode characters to stdout should use UTF-8. That's what LC_CTYPE means. So, cat, grep, etc. are all broken. :)

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-15 Thread Vincent Lefevre
On 2011-02-16 01:34:51 +0100, Adam Borowski wrote: On Wed, Feb 16, 2011 at 01:01:07AM +0100, Vincent Lefevre wrote: On 2011-02-14 16:43:11 +, Ian Jackson wrote: When LC_CTYPE=en_GB.utf-8, programs which attempt to print unicode characters to stdout should use UTF-8. That's what

Re: Make Unicode bugs release critical? (was: Re: RFA: all my packages)

2011-02-14 Thread Josselin Mouette
On Friday 11 February 2011 at 19:33 +0100, Axel Beckert wrote: Kicking out good and unique software, only because of missing or incomplete UTF-8 support, will surely lower Debian's quality more than missing or broken UTF-8 support in very few packages. And it would make those users (and

Re: Make Unicode bugs release critical? (was: Re: RFA: all my packages)

2011-02-14 Thread Ian Jackson
Josselin Mouette writes (Re: Make Unicode bugs release critical? (was: Re: RFA: all my packages)): Kicking out software that doesn't work at all in UTF-8 locales and requires the user to set a broken locale, OTOH, sounds like a sanitary emergency. Excellent, I look forward to the removal

Re: Make Unicode bugs release critical? (was: Re: RFA: all my packages)

2011-02-14 Thread Jakub Wilk
* Ian Jackson ijack...@chiark.greenend.org.uk, 2011-02-14, 12:42: Kicking out software that doesn't work at all in UTF-8 locales and requires the user to set a broken locale, OTOH, sounds like a sanitary emergency. Excellent, I look forward to the removal of python. I always hated that

OT: Python (was: Make Unicode bugs release critical?)

2011-02-14 Thread Klaus Ethgen
Hi, let's start a python rant. I love to hate this language. :-) On Mon, 14 Feb 2011 at 14:14, Jakub Wilk wrote: $ LC_CTYPE=en_GB.utf-8 python -c 'print u"\u00a3"' unicode pound sign [...] $ LC_CTYPE=en_GB.utf-8 python -c 'print u"\u00a3"' | cat

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-14 Thread Philipp Kern
On 2011-02-14, Klaus Ethgen kl...@ethgen.de wrote: ~ LC_CTYPE=en_GB.utf-8 perl -e 'print "\x{00a3}\n";' ~ LC_CTYPE=en_GB.utf-8 perl -e 'print "\x{00a3}\n";' | cat Both give the same result, a '£' sign as expected. And what's the value in that demonstration? Yes, you can treat UTF8 like a

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-14 Thread Lars Wirzenius
On Mon, 2011-02-14 at 14:37 +0100, Klaus Ethgen wrote: let's start a python rant. I love to hate this language. :-) Let's not. Let's not rant about any languages, or tools, or desktop environments. Let's be constructive on Debian mailing lists, shall we? We have plenty of side-channels for

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-14 Thread Jakub Wilk
* Klaus Ethgen kl...@ethgen.de, 2011-02-14, 14:37: ~ LC_CTYPE=en_GB.utf-8 perl -e 'print "\x{00a3}\n";' ~ LC_CTYPE=en_GB.utf-8 perl -e 'print "\x{00a3}\n";' | cat Let me try... $ LC_CTYPE=en_GB.utf-8 perl -e 'print "\x{00a3}\n";' | isutf8 stdin: line 1, char 1, byte offset 1: invalid UTF-8 code

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-14 Thread Klaus Ethgen
On Mon, 14 Feb 2011 at 15:15, Lars Wirzenius wrote: On Mon, 2011-02-14 at 14:37 +0100, Klaus Ethgen wrote: let's start a python rant. I love to hate this language. :-) Let's not. 'Till here it is personal desire. Let's not rant about any

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-14 Thread Adam Borowski
On Mon, Feb 14, 2011 at 02:02:11PM +, Philipp Kern wrote: On 2011-02-14, Klaus Ethgen kl...@ethgen.de wrote: ~ LC_CTYPE=en_GB.utf-8 perl -e 'print "\x{00a3}\n";' ~ LC_CTYPE=en_GB.utf-8 perl -e 'print "\x{00a3}\n";' | cat Both give the same result, a '£' sign as expected. And what's the

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-14 Thread Ian Jackson
Jakub Wilk writes (Re: OT: Python (was: Make Unicode bugs release critical?)): * Klaus Ethgen kl...@ethgen.de, 2011-02-14, 14:37: ~ LC_CTYPE=en_GB.utf-8 perl -e 'print "\x{00a3}\n";' ~ LC_CTYPE=en_GB.utf-8 perl -e 'print "\x{00a3}\n";' | cat Let me try... $ LC_CTYPE=en_GB.utf-8 perl -e 'print

Re: Make Unicode bugs release critical? (was: Re: RFA: all my packages)

2011-02-14 Thread Josselin Mouette
On Monday 14 February 2011 at 12:42 +, Ian Jackson wrote: Josselin Mouette writes (Re: Make Unicode bugs release critical? (was: Re: RFA: all my packages)): Kicking out software that doesn't work at all in UTF-8 locales and requires the user to set a broken locale, OTOH, sounds like

Re: Make Unicode bugs release critical? (was: Re: RFA: all my packages)

2011-02-14 Thread Henrique de Moraes Holschuh
On Mon, 14 Feb 2011, Josselin Mouette wrote: You must specify the encoding of your data in your bitstreams. I agree this is inconvenient (and one of the things I dislike in Python), but it is: 1. completely independent of the locale (UTF8 or not) 2. easy to work with once you

Re: Make Unicode bugs release critical? (was: Re: RFA: all my packages)

2011-02-14 Thread brian m. carlson
On Mon, Feb 14, 2011 at 02:01:08PM -0200, Henrique de Moraes Holschuh wrote: As long as python 3 is compiled to use UCS-4 as the internal representation, you mean. Are our packages set to use UCS-4? At least for python 3.1, yes: common_configure_args = \ --prefix=/usr \

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-14 Thread Klaus Ethgen
On Mon, 14 Feb 2011 at 16:24, Ian Jackson wrote: Jakub Wilk writes (Re: OT: Python (was: Make Unicode bugs release critical?)): * Klaus Ethgen kl...@ethgen.de, 2011-02-14, 14:37: ~ LC_CTYPE=en_GB.utf-8 perl -e 'print "\x{00a3}\n

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-14 Thread Ian Jackson
Klaus Ethgen writes (Re: OT: Python (was: Make Unicode bugs release critical?)): No, it is not. 00a3 is just not a utf-8 character, it is unicode. To get a correct utf-8 character you need to print \x{c2a3} and then isutf8 is happy. When LC_CTYPE=en_GB.utf-8, programs which attempt to print
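
The disagreement in this message turns on the difference between a code point (U+00A3) and the bytes that encode it. As a minimal sketch in Python 3 (not code from the thread; the names `pound`, `utf8_bytes`, `latin1_bytes` and `is_valid_utf8` are hypothetical, chosen only for illustration), the two can be compared like this:

```python
# Illustrative sketch only; none of these names come from the thread.
pound = "\u00a3"                        # the code point U+00A3 (POUND SIGN)
utf8_bytes = pound.encode("utf-8")      # UTF-8 encoding: two bytes
latin1_bytes = pound.encode("latin-1")  # Latin-1 encoding: one byte


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(utf8_bytes.hex(), is_valid_utf8(utf8_bytes))      # c2a3 True
print(latin1_bytes.hex(), is_valid_utf8(latin1_bytes))  # a3 False
```

A lone 0xA3 byte is a UTF-8 continuation byte with no lead byte, which is consistent with a validator such as isutf8 rejecting it at the first byte.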

Re: OT: Python (was: Make Unicode bugs release critical?)

2011-02-14 Thread Konstantin Khomoutov
On Mon, 14 Feb 2011 16:43:11 + Ian Jackson ijack...@chiark.greenend.org.uk wrote: Klaus Ethgen writes (Re: OT: Python (was: Make Unicode bugs release critical?)): No, it is not. 00a3 is just not a utf-8 character, it is unicode. To get a correct utf-8 character you need to print \x{c2a3

Re: Make Unicode bugs release critical?

2011-02-14 Thread Ron Johnson
On 02/14/2011 10:39 AM, Ian Jackson wrote: [snip] The fact that naive Python programs work (honouring LC_CTYPE as they should) unless you pipe their output to something is clearly a bug. The fact that it's a specification bug doesn't mean it's not a bug. It doesn't seem to work for me. $

Re: Make Unicode bugs release critical?

2011-02-14 Thread The Fungi
On Mon, Feb 14, 2011 at 03:57:44PM -0600, Ron Johnson wrote: It doesn't seem to work for me. [...] $ LC_CTYPE=en_GB.utf-8 python -c 'print u"\u00a3"' Traceback (most recent call last): File "<string>", line 1, in <module> UnicodeEncodeError: 'ascii' codec can't encode character u'\xa3' in position

Re: Make Unicode bugs release critical?

2011-02-14 Thread Ron Johnson
On 02/14/2011 04:26 PM, The Fungi wrote: On Mon, Feb 14, 2011 at 03:57:44PM -0600, Ron Johnson wrote: It doesn't seem to work for me. [...] $ LC_CTYPE=en_GB.utf-8 python -c 'print u"\u00a3"' Traceback (most recent call last): File "<string>", line 1, in <module> UnicodeEncodeError: 'ascii' codec

Re: Make Unicode bugs release critical?

2011-02-14 Thread Adam Borowski
On Mon, Feb 14, 2011 at 06:10:37PM -0600, Ron Johnson wrote: On 02/14/2011 04:26 PM, The Fungi wrote: You probably don't have an en_GB.utf-8 locale (maybe you have localepurge installed?). I bet en_US.utf-8 will net you different results. That's it... No localepurge, but when initially

Make Unicode bugs release critical?

2011-02-11 Thread Lars Wirzenius
On Fri, 2011-02-11 at 10:05 +0100, Vincent Fourmond wrote: On 11/02/11 09:52, Josselin Mouette wrote: On Friday 11 February 2011 at 09:47 +0100, Adam Borowski wrote: I'd say there should be no place in Debian in 2011 for software that can't do UTF-8, especially if near-identical forks

Re: Make Unicode bugs release critical?

2011-02-11 Thread Miroslav Kure
On Fri, Feb 11, 2011 at 09:37:54AM +, Lars Wirzenius wrote: However, I'm curious: is there a lot of software that is broken with Unicode, particularly with the UTF-8 encoding? I can't remember anything much in recent times. Mostly it is just the old stuff like - eterm, aterm - elvis -

Re: Make Unicode bugs release critical?

2011-02-11 Thread Andrey Rahmatullin
On Fri, Feb 11, 2011 at 11:14:42AM +0100, Miroslav Kure wrote: However, I'm curious: is there a lot of software that is broken with Unicode, particularly with the UTF-8 encoding? I can't remember anything much in recent times. Mostly it is just the old stuff like - eterm, aterm - elvis

Re: Make Unicode bugs release critical?

2011-02-11 Thread Luca Capello
Hi there! On Fri, 11 Feb 2011 11:14:42 +0100, Miroslav Kure wrote: On Fri, Feb 11, 2011 at 09:37:54AM +, Lars Wirzenius wrote: However, I'm curious: is there a lot of software that is broken with Unicode, particularly with the UTF-8 encoding? I can't remember anything much in recent

Re: Make Unicode bugs release critical?

2011-02-11 Thread Roger Leigh
On Fri, Feb 11, 2011 at 11:14:42AM +0100, Miroslav Kure wrote: On Fri, Feb 11, 2011 at 09:37:54AM +, Lars Wirzenius wrote: However, I'm curious: is there a lot of software that is broken with Unicode, particularly with the UTF-8 encoding? I can't remember anything much in recent

Re: Make Unicode bugs release critical?

2011-02-11 Thread Klaus Ethgen
Hi, On Fri, 11 Feb 2011 at 10:37, Lars Wirzenius wrote: The first Unicode standard was published in 1991. That's twenty years ago. Any software that processes text at all and is incapable of dealing with UTF-8 should be considered with

Re: Make Unicode bugs release critical?

2011-02-11 Thread Torsten Werner
On -10.01.-28163 20:59, Andrey Rahmatullin wrote: On Fri, Feb 11, 2011 at 11:14:42AM +0100, Miroslav Kure wrote: However, I'm curious: is there a lot of software that is broken with Unicode, particularly with the UTF-8 encoding? I can't remember anything much in recent times. Mostly it is

Re: Make Unicode bugs release critical?

2011-02-11 Thread Andrey Rahmatullin
On Fri, Feb 11, 2011 at 01:20:24PM +0100, Torsten Werner wrote: However, I'm curious: is there a lot of software that is broken with Unicode, particularly with the UTF-8 encoding? I can't remember anything much in recent times. Mostly it is just the old stuff like - eterm, aterm -

Re: Make Unicode bugs release critical?

2011-02-11 Thread Lars Wirzenius
On Fri, 2011-02-11 at 13:20 +0100, Torsten Werner wrote: grep, sed, awk, bash, ... grep, sed, and awk, at least, seem to work acceptably for me with UTF-8. The support can be improved, I'm sure. -- Blog/wiki/website hosting with ikiwiki (free for free software): http://www.branchable.com/ --

Re: Make Unicode bugs release critical?

2011-02-11 Thread Norbert Preining
On Fri, 11 Feb 2011, Roger Leigh wrote: XeTeX and XeLaTeX allow native UTF-8 input. Should be made the default, IMO, given how obsolete and broken the standard TeX encodings are. Being able to write in actual text rather than Please don't write rubbish if you don't know what you are talking

Re: Make Unicode bugs release critical?

2011-02-11 Thread Faidon Liambotis
On 02/11/11 14:20, Torsten Werner wrote: grep, sed, awk, bash, ... ? $ echo αβγ | sed 's/./a/' aβγ Regards, Φαίδων :-) -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive:

Re: Make Unicode bugs release critical?

2011-02-11 Thread Vincent Lefevre
On 2011-02-11 21:46:29 +0900, Norbert Preining wrote: On Fri, 11 Feb 2011, Roger Leigh wrote: XeTeX and XeLaTeX allow native UTF-8 input. Should be made the default, IMO, given how obsolete and broken the standard TeX encodings are. Being able to write in actual text rather than Please

Re: Make Unicode bugs release critical?

2011-02-11 Thread Roger Leigh
On Fri, Feb 11, 2011 at 09:46:29PM +0900, Norbert Preining wrote: On Fri, 11 Feb 2011, Roger Leigh wrote: XeTeX and XeLaTeX allow native UTF-8 input. Should be made the default, IMO, given how obsolete and broken the standard TeX encodings are. Being able to write in actual text rather

Re: Make Unicode bugs release critical?

2011-02-11 Thread Vincent Lefevre
On 2011-02-11 15:33:49 +0500, Andrey Rahmatullin wrote: On Fri, Feb 11, 2011 at 11:14:42AM +0100, Miroslav Kure wrote: However, I'm curious: is there a lot of software that is broken with Unicode, particularly with the UTF-8 encoding? I can't remember anything much in recent times.

Re: Make Unicode bugs release critical?

2011-02-11 Thread Adam Borowski
On Fri, Feb 11, 2011 at 12:59:46PM +0100, Klaus Ethgen wrote: On Fri, 11 Feb 2011 at 10:37, Lars Wirzenius wrote: The first Unicode standard was published in 1991. That's twenty years ago. Any software that processes text at all and is incapable of dealing with UTF-8 should be

Re: Make Unicode bugs release critical?

2011-02-11 Thread Norbert Preining
On Fri, 11 Feb 2011, Roger Leigh wrote: Um, no need to be rude. Well, you started with throw TeX into the bin! (cum grano salis) The only possible answer to that is mine. Or shutting up and ignoring that kind of rants from your side. insults is a step too far. I haven't said anything that

Re: Make Unicode bugs release critical?

2011-02-11 Thread Torsten Werner
On 11.02.2011 14:02, Faidon Liambotis wrote: $ echo αβγ | sed 's/./a/' aβγ Okay. But... $ echo αβγ | busybox sed 's/./a/' a�βγ :)
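
The two sed results above differ in whether `.` matches one character or one byte. A rough Python 3 sketch of that difference follows (an illustration only, not how either sed is implemented; the variable names are hypothetical):

```python
# Illustrative sketch only: emulate replacing the first "unit" of the input,
# once treating the text as characters and once as raw UTF-8 bytes.
text = "αβγ"

# Character-aware substitution: the first character (α) is replaced whole.
char_level = "a" + text[1:]

# Byte-oriented substitution: only the first byte of α (0xCE) is replaced,
# leaving its second byte (0xB1) stranded as an invalid sequence.
raw = text.encode("utf-8")
byte_level = b"a" + raw[1:]
decoded = byte_level.decode("utf-8", errors="replace")

print(char_level)  # aβγ
print(decoded)     # a, then U+FFFD (replacement character), then βγ
```

The stranded continuation byte is what a terminal renders as the replacement character in the byte-oriented output.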

Re: Make Unicode bugs release critical?

2011-02-11 Thread Adam Borowski
On Fri, Feb 11, 2011 at 02:30:24PM +0100, Vincent Lefevre wrote: On 2011-02-11 15:33:49 +0500, Andrey Rahmatullin wrote: On Fri, Feb 11, 2011 at 11:14:42AM +0100, Miroslav Kure wrote: However, I'm curious: is there a lot of software that is broken with Unicode, particularly with the

Re: Make Unicode bugs release critical?

2011-02-11 Thread Roger Leigh
On Fri, Feb 11, 2011 at 10:43:38PM +0900, Norbert Preining wrote: On Fri, 11 Feb 2011, Roger Leigh wrote: Um, no need to be rude. Well, you started with throw TeX into the bin! (cum grano salis) The only possible answer to that is mine. Or shutting up and ignoring that kind of rants from

Re: Make Unicode bugs release critical?

2011-02-11 Thread Norbert Preining
On Fri, 11 Feb 2011, Roger Leigh wrote: read what I actually wrote, rather than what you thought I wrote. So *what* is your proposal, instead of discussing uselessly and wasting bytes? Is it: ln -sf tex xetex Best wishes Norbert

Re: Make Unicode bugs release critical?

2011-02-11 Thread Vincent Lefevre
On 2011-02-11 15:02:02 +0100, Adam Borowski wrote: On Fri, Feb 11, 2011 at 02:30:24PM +0100, Vincent Lefevre wrote: On 2011-02-11 15:33:49 +0500, Andrey Rahmatullin wrote: On Fri, Feb 11, 2011 at 11:14:42AM +0100, Miroslav Kure wrote: However, I'm curious: is there a lot of software

Re: Make Unicode bugs release critical?

2011-02-11 Thread Joey Hess
Lars Wirzenius wrote: However, I'm curious: is there a lot of software that is broken with Unicode, particularly with the UTF-8 encoding? I can't remember anything much in recent times. We chose an 80% quickfix to get where we are, and so now we have the other 80% to go. It's been whittled

Re: Make Unicode bugs release critical?

2011-02-11 Thread Marco Túlio Gontijo e Silva
Excerpts from Joey Hess's message of Fri Feb 11 13:39:08 -0200 2011: (...) It can be as simple as software written trusting language documentation that says strings are processed in unicode and doesn't point out all the exceptions that can let non-unicode data in. For example, this simple

Re: Make Unicode bugs release critical? (was: Re: RFA: all my packages)

2011-02-11 Thread Axel Beckert
Hi, Adam Borowski wrote: Speaking of rxvt... shouldn't this clusterϫϫck become the only rxvt in Debian? Both rxvt and rxvt-beta, completely dead upstream for 10 and 8 years respectively, besides having terrible support for terminal codes lack even such a tiny detail as UTF-8 support. I'd

Re: Make Unicode bugs release critical?

2011-02-11 Thread Kurt Roeckx
On Fri, Feb 11, 2011 at 09:37:54AM +, Lars Wirzenius wrote: However, I'm curious: is there a lot of software that is broken with Unicode, particularly with the UTF-8 encoding? I can't remember anything much in recent times. ispell, aspell. I think hunspell got fixed recently. Kurt

Re: Make Unicode bugs release critical?

2011-02-11 Thread Henrique de Moraes Holschuh
On Fri, 11 Feb 2011, Lars Wirzenius wrote: However, I'm curious: is there a lot of software that is broken with Unicode, particularly with the UTF-8 encoding? I can't remember anything much in recent times. 1. Stuff that cannot do one of UTF-8, UTF-16 or UCS-4. 2. Anything that cannot deal

Re: Make Unicode bugs release critical?

2011-02-11 Thread Ron Johnson
On 02/11/2011 07:36 AM, Adam Borowski wrote: [snip] UTF-16 is never, ever useful. It is a sad trap for win32 and Java developers, due to a bad engineering decision suggested, as I was told, by [snip] No, there is only one encoding left, as long as you don't have to talk to Windows. Never

Re: Make Unicode bugs release critical?

2011-02-11 Thread Peter Samuelson
[Ron Johnson] Never useful except for 90% of the market? (I wonder how SAMBA deals with it...) I don't think you really want to know. There's a 'unicode' flag in much of the CIFS protocol that means filenames and such are in UTF-16 (I think UTF-16LE) instead of

Re: Make Unicode bugs release critical?

2011-02-11 Thread Adam Borowski
On Fri, Feb 11, 2011 at 08:16:54PM -0200, Henrique de Moraes Holschuh wrote: On Fri, 11 Feb 2011, Lars Wirzenius wrote: However, I'm curious: is there a lot of software that is broken with Unicode, particularly with the UTF-8 encoding? I can't remember anything much in recent times. 2.

Re: Make Unicode bugs release critical?

2011-02-11 Thread Henrique de Moraes Holschuh
On Sat, 12 Feb 2011, Adam Borowski wrote: On Fri, Feb 11, 2011 at 08:16:54PM -0200, Henrique de Moraes Holschuh wrote: 2. Anything that cannot deal with Supplementary planes. This includes the use of UCS-2 instead of UTF-16, as it cannot represent the Supplementary planes. python
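
The point about supplementary planes can be made concrete with a small Python 3 sketch (an illustration under stated assumptions, not code from the thread; `to_surrogates` and the choice of U+1D11E as the example character are hypothetical). UCS-2 stores each character in a single 16-bit unit, so any code point above U+FFFF does not fit, whereas UTF-16 splits it into a surrogate pair:

```python
# Illustrative sketch only; the helper name and example character are assumptions.
def to_surrogates(code_point: int) -> tuple[int, int]:
    """Split a supplementary-plane code point into a UTF-16 surrogate pair."""
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


clef = "\U0001D11E"                   # MUSICAL SYMBOL G CLEF, outside the BMP
fits_ucs2 = ord(clef) <= 0xFFFF       # False: one 16-bit unit is not enough
high, low = to_surrogates(ord(clef))
utf16 = clef.encode("utf-16-be")      # Python's own UTF-16 encoder

print(fits_ucs2)                      # False
print(hex(high), hex(low))            # 0xd834 0xdd1e
print(utf16.hex())                    # d834dd1e
```

The hand-computed pair matches the bytes produced by the built-in encoder, which is the representation a UCS-2-only implementation has no way to express.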