Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-08 Thread Jan Engelhardt
On Jan 8 2007 14:17, Tim Pepper wrote: > On 1/8/07, Pavel Machek <[EMAIL PROTECTED]> wrote: >> On Sun 2007-01-07 22:30:55, Alan wrote: >> > I think that would be a good idea - and add it to the coding/docs >> > specs >> > that documentation is UTF-8. Code should IMHO say 7bit though. >> >> Yes, y

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-08 Thread Eberhard Moenkeberg
Hi, On Tue, 9 Jan 2007, Jan Engelhardt wrote: > On Jan 8 2007 22:00, Ken Moffat wrote: > > Looks nicely done, but I query the postal address changes in > >Documentation/cdrom/sbpcd - that seems to be a change of address > >(without anything to explain it). > > Eberhard [cc], please attach an Ack

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-08 Thread Jan Engelhardt
On Jan 8 2007 22:00, Ken Moffat wrote: > Looks nicely done, but I query the postal address changes in >Documentation/cdrom/sbpcd - that seems to be a change of address >(without anything to explain it). Eberhard [cc], please attach an Acked-by: YourName keep Ccs, thanks ;-) [thread/patch: http

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-08 Thread Tim Pepper
On 1/8/07, Pavel Machek <[EMAIL PROTECTED]> wrote: On Sun 2007-01-07 22:30:55, Alan wrote: > I think that would be a good idea - and add it to the coding/docs specs > that documentation is UTF-8. Code should IMHO say 7bit though. Yes, yes, please. I have been flamed when someone tried to do 8bi

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-08 Thread Ken Moffat
On Mon, Jan 08, 2007 at 09:17:06PM +0100, Jan Engelhardt wrote: > > On Jan 8 2007 02:22, Jan Engelhardt wrote: > >On Jan 7 2007 22:30, Alan wrote: > >> > >>> >The kernel maintainers/help/config pretty consistently use UTF8 > >>> > >>> I've seen a lot of places that don't do so. Want a patch? > >>

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-08 Thread Pavel Machek
On Sun 2007-01-07 22:30:55, Alan wrote: > > >The kernel maintainers/help/config pretty consistently use UTF8 > > > > I've seen a lot of places that don't do so. Want a patch? > > I think that would be a good idea - and add it to the coding/docs specs > that documentation is UTF-8. Code should IMH

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-08 Thread Valdis . Kletnieks
On Mon, 08 Jan 2007 01:38:57 +0100, Willy Tarreau said: > it's clearly the proof of a flaw in the initial design. And I'm not even > discussing the stupidity which requires that you read a whole text to get > its number of characters ! It's no more stupid than the *current* situation with Linux ke

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-08 Thread Nicolas Mailhot
On Mon, 8 January 2007 11:44, Alan wrote: >> (case in point: Russell's system. I was ROTFL when he proudly announced >> he >> was running a full iso-8859-1 system after dissing UTF-8. Last I've seen >> the official 8bit EU encoding was iso-8859-15, and UK is part of the EU) > > There is no correc

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-08 Thread Alan
> (case in point: Russell's system. I was ROTFL when he proudly announced he > was running a full iso-8859-1 system after dissing UTF-8. Last I've seen > the official 8bit EU encoding was iso-8859-15, and UK is part of the EU) There is no correct UK encoding. You need -14 or -15 depending upon lang

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-08 Thread Nicolas Mailhot
>> How would you do this technically in a way that it's significantly >> easier than simply finishing the UTF-8 transition? > In how many decades do you think the transition will be finished ? Right now it looks like it will be finished way earlier than apps bother supporting the later 8-bit enco

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-08 Thread Nicolas Mailhot
> elinks is one such program. It now assumes UTF-8 _only_ displays. > That's no better than programs which assume ISO-8859-1 only or US-ASCII > only. That's way better than programs: - which assume an encoding you can't write most world languages in (BTW ISO-8859-1 & US-ASCII are broken by design

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-08 Thread Adrian Bunk
On Mon, Jan 08, 2007 at 07:52:48AM +0100, Jan Engelhardt wrote: > > On Jan 8 2007 02:03, Adrian Bunk wrote: > > > >The only major MUA not supporting UTF-8 is Eudora. > > > >And if you are talking about buggy old pine, in the latest development > >version [1] it does not only become open source, i

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Jan Engelhardt
On Jan 8 2007 02:03, Adrian Bunk wrote: > >The only major MUA not supporting UTF-8 is Eudora. > >And if you are talking about buggy old pine, in the latest development >version [1] it does not only become open source, it also got some >working Unicode support. Uhm, just for the record, I run pi

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread David Woodhouse
On Sun, 2007-01-07 at 15:05 -0500, Dave Jones wrote: > This has been bugging me for a while. > Viewing the mail I applied in mutt shows his name correctly as Rafał > Applying it with git-applymbox and viewing the log on master.kernel.org > with git log shows Rafa And then later when put into emai

Re: OT: character encodings

2007-01-07 Thread Adrian Bunk
On Mon, Jan 08, 2007 at 02:32:42AM +0100, Tilman Schmidt wrote: > On 08.01.2007 01:38, Willy Tarreau wrote: >... > > And I'm not even > > discussing the stupidity which requires that you read a whole text to get > > its number of characters ! > > Personally I find the requirement to know the numb

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Adrian Bunk
On Mon, Jan 08, 2007 at 02:14:41AM +0100, Willy Tarreau wrote: > On Mon, Jan 08, 2007 at 02:03:37AM +0100, Adrian Bunk wrote: > > On Mon, Jan 08, 2007 at 01:38:57AM +0100, Willy Tarreau wrote: > > > On Mon, Jan 08, 2007 at 12:37:50AM +0100, Adrian Bunk wrote: > > > > On Sun, Jan 07, 2007 at 09:48:3

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Horst H. von Brand
Russell King <[EMAIL PROTECTED]> wrote: [...] > All that UTF-8 has done is added to the "which charset is this data" > problem rather than actually solving any proper real life problem. It solves real-world problems, the pain is that it is not (yet) universally used. The charset problems today a

Re: OT: character encodings

2007-01-07 Thread Tilman Schmidt
On 08.01.2007 01:38, Willy Tarreau wrote: > I'm not blaming UTF-8 per se, but people who still believe in encoding > *whole documents*. Copy-paste, text insertion, git output, etc... everything > has a good reason not to be in the same encoding as what your MUA believes. > If major MUAs still have

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Jan Engelhardt
On Jan 7 2007 22:30, Alan wrote: > >> >The kernel maintainers/help/config pretty consistently use UTF8 >> >> I've seen a lot of places that don't do so. Want a patch? > >I think that would be a good idea - and add it to the coding/docs specs >that documentation is UTF-8. Code should IMHO say 7bit

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Willy Tarreau
On Mon, Jan 08, 2007 at 02:03:37AM +0100, Adrian Bunk wrote: > On Mon, Jan 08, 2007 at 01:38:57AM +0100, Willy Tarreau wrote: > > On Mon, Jan 08, 2007 at 12:37:50AM +0100, Adrian Bunk wrote: > > > On Sun, Jan 07, 2007 at 09:48:34PM +0100, Willy Tarreau wrote: > > > > On Sun, Jan 07, 2007 at 08:11:3

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Adrian Bunk
On Mon, Jan 08, 2007 at 01:38:57AM +0100, Willy Tarreau wrote: > On Mon, Jan 08, 2007 at 12:37:50AM +0100, Adrian Bunk wrote: > > On Sun, Jan 07, 2007 at 09:48:34PM +0100, Willy Tarreau wrote: > > > On Sun, Jan 07, 2007 at 08:11:38PM +0100, Jan Engelhardt wrote: > > > > > > > > On Jan 7 2007 17:06

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Willy Tarreau
On Mon, Jan 08, 2007 at 12:37:50AM +0100, Adrian Bunk wrote: > On Sun, Jan 07, 2007 at 09:48:34PM +0100, Willy Tarreau wrote: > > On Sun, Jan 07, 2007 at 08:11:38PM +0100, Jan Engelhardt wrote: > > > > > > On Jan 7 2007 17:06, Russell King wrote: > > > >On Mon, Jan 08, 2007 at 12:29:05AM +0800, Da

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Adrian Bunk
On Sun, Jan 07, 2007 at 09:48:34PM +0100, Willy Tarreau wrote: > On Sun, Jan 07, 2007 at 08:11:38PM +0100, Jan Engelhardt wrote: > > > > On Jan 7 2007 17:06, Russell King wrote: > > >On Mon, Jan 08, 2007 at 12:29:05AM +0800, David Woodhouse wrote: > > > > > >$ git log | head -n 1000 | tail -n 200

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Alan
> >The kernel maintainers/help/config pretty consistently use UTF8 > > I've seen a lot of places that don't do so. Want a patch? I think that would be a good idea - and add it to the coding/docs specs that documentation is UTF-8. Code should IMHO say 7bit though. Alan

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Xavier Bestel
On Sunday, 07 January 2007 at 21:40 +0100, Jan Engelhardt wrote: > >On Sun, 7 Jan 2007 15:05:53 -0500 > >Dave Jones <[EMAIL PROTECTED]> wrote: > > > >> If there's something I should be doing when I commit that I'm not, > >> I'll be happy to change my scripts. My $LANG is set to en_US.UTF-8 > >>

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Willy Tarreau
On Sun, Jan 07, 2007 at 08:11:38PM +0100, Jan Engelhardt wrote: > > On Jan 7 2007 17:06, Russell King wrote: > >On Mon, Jan 08, 2007 at 12:29:05AM +0800, David Woodhouse wrote: > > > >$ git log | head -n 1000 | tail -n 200 > o > >$ file -i o > >o: text/plain; charset=us-ascii > >$ git log | head -

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Jan Engelhardt
>On Sun, 7 Jan 2007 15:05:53 -0500 >Dave Jones <[EMAIL PROTECTED]> wrote: > >> If there's something I should be doing when I commit that I'm not, >> I'll be happy to change my scripts. My $LANG is set to en_US.UTF-8 >> which should DTRT to the best of my knowledge, but clearly, that isn't >> the

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Robin Rosenberg
On Sunday, 07 January 2007 20:17, Russell King wrote: [...] > clearly not UTF-8. I doubt whether any of the commits I do on my > en_GB ISO-8859-1 systems end up being UTF-8 encoded. They don't. Git doesn't convert, with the exception of two mail-related tools, which is the reason the commit being dis

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Sean
On Sun, 7 Jan 2007 15:05:53 -0500 Dave Jones <[EMAIL PROTECTED]> wrote: Including the Git list... > On Sun, Jan 07, 2007 at 07:17:30PM +, Russell King wrote: > > > commit 24ebead82bbf9785909d4cf205e2df5e9ff7da32 > > tree 921f686860e918a01c3d3fb6cd106ba82bf4ace6 > > parent 264166e604a7e14c

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Dave Jones
On Sun, Jan 07, 2007 at 07:17:30PM +, Russell King wrote: > commit 24ebead82bbf9785909d4cf205e2df5e9ff7da32 > tree 921f686860e918a01c3d3fb6cd106ba82bf4ace6 > parent 264166e604a7e14c278e31cadd1afb06a7d51a11 > author Rafa³ Bilski <[EMAIL PROTECTED]> 1167691774 +0100 > committer Dave Jones <
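The "Rafa³" in the quoted commit header is consistent with an ISO-8859-2 byte being displayed as ISO-8859-1. That reading is an assumption; the excerpt above does not say which encoding the author line was stored in. A minimal Python sketch of the effect, using the spelling "Rafał" that appears elsewhere in this thread:

```python
# Assumption: the author name was stored as ISO-8859-2 bytes and later
# displayed by something that treats every byte as ISO-8859-1.
name = "Rafa\u0142"                       # "Rafał", with U+0142 (l with stroke)

stored = name.encode("iso-8859-2")        # the stroke-l becomes the single byte 0xB3
shown = stored.decode("iso-8859-1")       # 0xB3 is SUPERSCRIPT THREE in ISO-8859-1

print(stored)
print(shown)
```

The byte never changes; only the charset label applied to it does, which is why the damage shows up on display rather than in storage.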

Re: OT: character encodings

2007-01-07 Thread Tilman Schmidt
On 07.01.2007 18:06, Russell King wrote: > > $ git log | head -n 1000 | tail -n 200 > o > $ file -i o > o: text/plain; charset=us-ascii > $ git log | head -n 1000 | tail -n 300 > o > $ file -i o > o: text/plain; charset=us-ascii > $ git log | head -n 1000 | tail -n 400 > o > $ file -i o > o: text
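The quoted `file -i` runs report a charset guessed from whatever bytes happen to be in the slice, so a window of the log without any 8-bit byte comes out as us-ascii. A rough Python sketch of that kind of guess follows; `guess_charset` is a hypothetical helper for illustration and is not how file(1) is implemented:

```python
def guess_charset(data: bytes) -> str:
    """Guess a charset label from the bytes alone (hypothetical helper)."""
    if all(b < 0x80 for b in data):
        return "us-ascii"            # no 8-bit byte present at all
    try:
        data.decode("utf-8")         # strict decode: fails on stray high bytes
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown-8bit"        # some legacy 8-bit encoding, cannot tell which


plain = b"commit message without accents\n"
utf8_line = "Norrg\u00e5rd\n".encode("utf-8")
latin1_line = "Norrg\u00e5rd\n".encode("iso-8859-1")

print(guess_charset(plain))
print(guess_charset(plain + utf8_line))
print(guess_charset(plain + utf8_line + latin1_line))
```

A single line in a legacy 8-bit encoding is enough to make the whole buffer fail UTF-8 validation, which matches the mixed-charset log described in the thread.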

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Russell King
On Sun, Jan 07, 2007 at 08:11:38PM +0100, Jan Engelhardt wrote: > > On Jan 7 2007 17:06, Russell King wrote: > >On Mon, Jan 08, 2007 at 12:29:05AM +0800, David Woodhouse wrote: > > > >$ git log | head -n 1000 | tail -n 200 > o > >$ file -i o > >o: text/plain; charset=us-ascii > >$ git log | head -

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Russell King
On Sun, Jan 07, 2007 at 06:21:51PM +, Alan wrote: > > So, in short, UTF-8 is all fine and dandy if your _entire_ universe > > is UTF-8 enabled. If you're operating in a mixed charset environment > > it's one bloody big pain in the butt. > > Net ASCII is 7bit and is 1:1 mapped with UTF-8 unico

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Jan Engelhardt
On Jan 7 2007 18:21, Alan wrote: > >> So, in short, UTF-8 is all fine and dandy if your _entire_ universe >> is UTF-8 enabled. If you're operating in a mixed charset environment >> it's one bloody big pain in the butt. > >Net ASCII is 7bit and is 1:1 mapped with UTF-8 unicode. It's just old >brok

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Jan Engelhardt
On Jan 7 2007 17:06, Russell King wrote: >On Mon, Jan 08, 2007 at 12:29:05AM +0800, David Woodhouse wrote: > >$ git log | head -n 1000 | tail -n 200 > o >$ file -i o >o: text/plain; charset=us-ascii >$ git log | head -n 1000 | tail -n 300 > o >$ file -i o >o: text/plain; charset=us-ascii >$ git lo

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Alan
> So, in short, UTF-8 is all fine and dandy if your _entire_ universe > is UTF-8 enabled. If you're operating in a mixed charset environment > it's one bloody big pain in the butt. Net ASCII is 7bit and is 1:1 mapped with UTF-8 unicode. It's just old broken 8bit encodings that are problematic. T
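The point that 7-bit ASCII maps 1:1 onto UTF-8 can be checked directly. A short Python sketch, using only the standard codecs; the sample character is chosen for illustration:

```python
# Every 7-bit value decodes to the same character under ASCII and UTF-8,
# and encodes back to the same single byte.
seven_bit = bytes(range(128))
as_ascii = seven_bit.decode("ascii")
as_utf8 = seven_bit.decode("utf-8")
same_text = as_ascii == as_utf8
same_bytes = as_utf8.encode("utf-8") == seven_bit

# Outside the 7-bit range the encodings diverge: the same character has
# different byte sequences in ISO-8859-1 and in UTF-8.
ch = "\u00e5"                             # a with ring above
latin1 = ch.encode("iso-8859-1")          # one byte
utf8 = ch.encode("utf-8")                 # two bytes

print(same_text, same_bytes, latin1, utf8)
```

Pure 7-bit text is therefore unaffected by a switch to UTF-8; only bytes at 0x80 and above need a known charset to be interpreted.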

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Russell King
On Mon, Jan 08, 2007 at 12:29:05AM +0800, David Woodhouse wrote: > On Sun, 2007-01-07 at 15:38 +, Russell King wrote: > > When a text file is stored on disk, there's no way to tell what > > character set the characters in that file belong to. As a result, > > ISO-8859-1 folk assume that all te

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread David Woodhouse
On Sun, 2007-01-07 at 15:38 +, Russell King wrote: > On Sun, Jan 07, 2007 at 11:13:57PM +0800, David Woodhouse wrote: > > On Sun, 2007-01-07 at 14:06 +0100, Tilman Schmidt wrote: > > > Russell King wrote: > > > > Welcome to the mess which the UTF-8 charset creates. > > > > Utter bollocks. >

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Russell King
On Sun, Jan 07, 2007 at 11:13:57PM +0800, David Woodhouse wrote: > On Sun, 2007-01-07 at 14:06 +0100, Tilman Schmidt wrote: > > Russell King wrote: > > > Welcome to the mess which the UTF-8 charset creates. > > Utter bollocks. Wrong. The problem is partly caused by not everything understanding

Re: OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread David Woodhouse
On Sun, 2007-01-07 at 14:06 +0100, Tilman Schmidt wrote: > Russell King wrote: > > Welcome to the mess which the UTF-8 charset creates. Utter bollocks. > The problem of different character encodings coexisting on the same > platform, and the resulting occasional messing-up, far predates Unicode

OT: character encodings (was: Linux 2.6.20-rc4)

2007-01-07 Thread Tilman Schmidt
Russell King wrote: [Leonard NorrgÃ¥rd (1):] > That is an å if you look at the raw message in UTF-8. However, Linus > sends mail in with a charset of ISO-8859-1, and if you place UTF-8 > encoded text in such a message body, you will see A¥. Only if the mechanism used for placing it there ignore
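The garbled name in the quoted changelog line is what UTF-8 bytes look like when a reader labels them ISO-8859-1. A minimal Python sketch reproducing it; the variable names are illustrative only:

```python
# The name as intended, with U+00E5 (a with ring above).
name = "Leonard Norrg\u00e5rd"

utf8_bytes = name.encode("utf-8")             # U+00E5 becomes the two bytes C3 A5
misread = utf8_bytes.decode("iso-8859-1")     # each byte shown as its own character

print(utf8_bytes)
print(misread)

# Decoding with the charset the bytes were actually written in recovers the name.
recovered = utf8_bytes.decode("utf-8")
```

The bytes in the message body are intact in both cases; the visible result depends entirely on the charset declared for them.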