Bug#99933: second attempt at more comprehensive unicode policy

2003-01-20 Thread Radovan Garabik
On Fri, Jan 17, 2003 at 05:11:32PM -0600, Manoj Srivastava wrote: > Hi, > > Just because you are using a UTF-8 capable terminal does not > mean you can actually see a UTF encoded string. ሰው እንደቤቱ እንጅ እንደ > ጉረቤቱ አይተዳደርም።, though encoded in UTF, is hard for me to display. If > you are ab

Re: Bug#99933: second attempt at more comprehensive unicode policy

2003-01-18 Thread Colin Walters
On Sat, 2003-01-18 at 03:38, Manoj Srivastava wrote: > Actually, if we must take a stance, I would say that while > unicode does remain the only sane choice in the future, at this > point the only sane choice is pure ascii; for reasons that have come > up often in this thread. I think th

Re: Bug#99933: second attempt at more comprehensive unicode policy

2003-01-18 Thread Manoj Srivastava
Colin Walters <[EMAIL PROTECTED]> writes: > On Fri, 2003-01-17 at 17:49, Manoj Srivastava wrote: > >> perhaps we should stick to pure ascii file names, if we >> must have policy take a stance about file names at all? > > First of all, I strongly believe policy should have a stance about file > name

Re: Bug#99933: second attempt at more comprehensive unicode policy

2003-01-18 Thread Jérôme Marant
Manoj Srivastava <[EMAIL PROTECTED]> writes: > Hi, > > Just because you are using a UTF-8 capable terminal does not > mean you can actually see a UTF encoded string. ሰው እንደቤቱ እንጅ እንደ > ጉረቤቱ አይተዳደርም።, though encoded in UTF, is hard for me to display. If > you are able to see this, would

Re: Bug#99933: second attempt at more comprehensive unicode policy

2003-01-17 Thread Colin Walters
On Fri, 2003-01-17 at 17:49, Manoj Srivastava wrote: > Hi, > > Sorry for the late entry into the discussion. I am > comfortable with making the changelog UTF-8 only, but file names in > pure UTF-8 perhaps is premature. (मनोज्.conf, anyone?). Please see my second proposal (the third in

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-17 Thread Manoj Srivastava
Hi, Just because you are using a UTF-8 capable terminal does not mean you can actually see a UTF encoded string. ሰው እንደቤቱ እንጅ እንደ ጉረቤቱ አይተዳደርም።, though encoded in UTF, is hard for me to display. If you are able to see this, would you please share what fontset you are using?

Re: Bug#99933: second attempt at more comprehensive unicode policy

2003-01-17 Thread Manoj Srivastava
Hi, Sorry for the late entry into the discussion. I am comfortable with making the changelog UTF-8 only, but file names in pure UTF-8 perhaps is premature. (मनोज्.conf, anyone?). Indeed, until we have a wider deployment of a font that has a decent coverage of UTF-8 glyphs (how many of

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-15 Thread starner
>On Tue, 2003-01-14 at 21:50, [EMAIL PROTECTED] wrote: > >> And? A POSIX filename is not a string of characters, it's a string >> of bytes. You have no technical need to differentiate between the >> two. > >If you do any sort of character-oriented manipulation on those names, >you will. Like what

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-15 Thread Colin Watson
On Wed, Jan 15, 2003 at 01:17:51AM -0500, Colin Walters wrote: > On Tue, 2003-01-14 at 21:50, [EMAIL PROTECTED] wrote: > > Are you volunteering to write patches for every program in Debian, and > > maintain them (since the upstream author probably won't be interested > > in this Debian-only scheme)

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-15 Thread Colin Watson
On Wed, Jan 15, 2003 at 04:41:57PM +0900, Junichi Uekawa wrote: > > > Not all of the statements made in that thread are not quite true, > > > and I seem to remember seeing some hacks done by Ukai-san on that > > > respect, for UTF-8. > > > > Hmmm...could you elaborate? > > I think our man-db and

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-15 Thread Denis Barbier
On Wed, Jan 15, 2003 at 12:28:43PM +0900, Junichi Uekawa wrote: > > > But the current situation is *already* broken! For example, for a > > > Chinese person, an ISO-8859-1 system simply cannot encode, nor display, > > > their language. I am aware that for people entrenched in legacy > > > charset

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-15 Thread Junichi Uekawa
> > Not all of the statements made in that thread are not quite true, > > and I seem to remember seeing some hacks done by Ukai-san on that > > respect, for UTF-8. > > Hmmm...could you elaborate? I think our man-db and groff have been hacked in two ways: 1) to special-case japanese locale (ja_J

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-15 Thread Colin Walters
On Tue, 2003-01-14 at 21:50, [EMAIL PROTECTED] wrote: > And? A POSIX filename is not a string of characters, it's a string > of bytes. You have no technical need to differentiate between the > two. If you do any sort of character-oriented manipulation on those names, you will. > Good. It remind

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-14 Thread Colin Walters
On Tue, 2003-01-14 at 22:28, Junichi Uekawa wrote: > The point is, we have working "iconv", and > changing changelog will work. Yep, definitely. > man may need some hacking or other, I am not sure. I hear the other Colin is on the job :) > Not all of the statements made in that thread are not

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-14 Thread Junichi Uekawa
> > Sorry, we have to start somewhere. Unicode is the way of the future, > > and if we wait until every vendor of some random terminal updates it > > with support for UTF-8, we will never start. > > I don't disagree that we should move to Unicode. I disagree that such > a move must inherently

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-14 Thread Junichi Uekawa
> > But the current situation is *already* broken! For example, for a > > Chinese person, an ISO-8859-1 system simply cannot encode, nor display, > > their language. I am aware that for people entrenched in legacy > > charsets like ISO-8859-1, the transition may introduce > > incompatibilities.

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-14 Thread starner
>Moreover, say the system administrator does something like 'find >/home'. The resulting stream will be a mixture of ISO-8859-X and BIG5, >and impossible to reliably differentiate. And? A POSIX filename is not a string of characters, it's a string of bytes. You have no technical need to differ

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-14 Thread starner
>1) A multiuser machine, with users using different charsets. > Who decides which one is "local"? > >2) The sysadmin/user changes the charset, e.g. from iso-8859-1 > to iso-8859-15 to get the Euro character. > How should the filenames stay in the local charset when > this changes? Would the

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-14 Thread Colin Walters
On Tue, 2003-01-14 at 02:23, [EMAIL PROTECTED] wrote: > Not acceptable. Filenames are and must be in the locale charset. > There is no other sane option [...] Heh. I will quote from a previous message of mine about filenames in the locale charset, which, since you joined the discussion later, y

Re: Bug#99933: second attempt at more comprehensive unicode policy

2003-01-14 Thread Jochen Voss
Hello Lars, On Tue, Jan 14, 2003 at 12:30:28PM +0200, Lars Wirzenius wrote: > On Tue, 14-01-2003 at 10:23, Jochen Voss wrote: > > On Tue, Jan 14, 2003 at 01:23:51AM -0600, [EMAIL PROTECTED] wrote: > > > Not acceptable. Filenames are and must be in the locale charset. There is > > > no other san

Re: Bug#99933: second attempt at more comprehensive unicode policy

2003-01-14 Thread Lars Wirzenius
On Tue, 14-01-2003 at 10:23, Jochen Voss wrote: > On Tue, Jan 14, 2003 at 01:23:51AM -0600, [EMAIL PROTECTED] wrote: > > Not acceptable. Filenames are and must be in the locale charset. There is > > no other sane option [...] > No, this does not work, either. Imagine two scenarios: 3) Floppies, C

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-14 Thread Jochen Voss
Hello, On Tue, Jan 14, 2003 at 01:23:51AM -0600, [EMAIL PROTECTED] wrote: > Not acceptable. Filenames are and must be in the locale charset. There is > no other sane option [...] No, this does not work, either. Imagine two scenarios: 1) A multiuser machine, with users using different charsets. W

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-14 Thread starner
>But what if the program *knows* the data is UTF-8 internally? Like all >GNOME programs do, and my patch for dpkg tries to do? Then it should be easy to convert it. You can't not convert and expect a reasonable response - among other things, innocent UTF-8 characters can include C1 bytes, and scr

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-13 Thread Colin Walters
On Sat, 2003-01-11 at 06:21, David Starner wrote: > You can input any Unicode character you want, but you probably have > to go out of your way to input something outside your charset (i.e. probably > not on your keyboard or standard IM.) Ok, that is probably going to be true. > If I receive a fil

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-11 Thread Jakob Bohm
On Wed, Jan 08, 2003 at 01:30:09AM -0500, Colin Walters wrote: > On Tue, 2003-01-07 at 03:07, Jakob Bohm wrote: > > > I agree, this is the only way to go. Naive, simple, classic > > UNIX-style programming should continue to "just work", > > Naïve, simple, classic UNIX-style programs are ASCII-on

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-11 Thread David Starner
At 11:55 PM 1/9/2003 -0500, Colin Walters wrote: What do you expect GNOME programs to do? Since they fully support UTF-8, you can input any Unicode character you want. Also, a program like Evolution may receive a file in mail whose name uses Unicode characters. And a lot of locale charsets (li

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-09 Thread Colin Walters
On Thu, 2003-01-09 at 23:05, David Starner wrote: > Not anything written up that I know of. Debian-i18n has a large cross > membership, which was part of the reason this should be on debian-i18n. Ok, if people want to move this discussion that's fine by me. > >Are you saying that programs should

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-09 Thread David Starner
At 10:29 PM 1/9/2003 -0500, Colin Walters wrote: Right. Did the people on that list come up with any general plan for how GNU/Linux vendors should transition? Not anything written up that I know of. Debian-i18n has a large cross membership, which was part of the reason this should be on debian

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-09 Thread Colin Walters
On Thu, 2003-01-09 at 20:57, David Starner wrote: > A Posix filename is a null terminated byte string (sans '/'). Any > widescale conversion is going to cause aliasing issues and other > bugs, whether or not we stay Posix compatible. > Just as important, conversion is not an issue for debian-pol

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-09 Thread David Starner
>I agree that it would be a good idea to store filenames as UTF-8 >in the filesystem. But I (being a part of "everyone") do not >agree, that we should even try to switch every terminal in the >world to UTF-8. We do need conversion of file names somewhere >between the filesystem level and output.

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-09 Thread Colin Walters
On Thu, 2003-01-09 at 13:28, Jochen Voss wrote: > Hello, > > On Wed, Jan 08, 2003 at 01:00:19AM -0500, Colin Walters wrote: > > Seriously, I didn't mean it that way; I just meant that I think everyone > > has generally accepted that UTF-8 is the way of the future; we're just > > debating when, whe

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-09 Thread Jochen Voss
Hello, On Wed, Jan 08, 2003 at 01:00:19AM -0500, Colin Walters wrote: > Seriously, I didn't mean it that way; I just meant that I think everyone > has generally accepted that UTF-8 is the way of the future; we're just > debating when, where, and how. I want to challenge the "everyone" in your sent

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-08 Thread David Starner
At 05:03 PM 1/8/2003 -0600, John Goerzen wrote: Yes, there are UTF-8 versions available. Does everyone have them? Do we enable them by default? Everyone who has the most recent version. They're enabled by default if you're running a UTF-8 locale, like they should be. Do all other vendors sh

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-08 Thread Colin Walters
On Wed, 2003-01-08 at 18:03, John Goerzen wrote: > Colin was advocating what amounted to exactly that. He was advocating > removing all support for non-UTF8 terminals. Um, woah there. The key word is *eventually*. Again: the only "must" my present policy proposal introduces is for filenames in

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-08 Thread John Goerzen
On Wed, Jan 08, 2003 at 02:54:43PM -0800, David Starner wrote: > At 02:32 PM 1/8/2003 -0600, John Goerzen wrote: > >It's not just physical terminals we're talking about here. We're talking > >about the vast majority of the state of the art terminal emulators *today*. > > I'd have a hard time des

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-08 Thread David Starner
At 02:32 PM 1/8/2003 -0600, John Goerzen wrote: >It's not just physical terminals we're talking about here. We're talking >about the vast majority of the state of the art terminal emulators *today*. I'd have a hard time describing a terminal emulator that doesn't support UTF-8 as "start of the a

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-08 Thread John Goerzen
On Tue, Jan 07, 2003 at 03:50:45PM -0800, David Starner wrote: > If you're using a terminal that can't support UTF-8, you always have the > option of running > something like GNU screen to translate the system charset to the terminal > charset. > It seems more important to get a systemwide encodi

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-08 Thread John Goerzen
On Wed, Jan 08, 2003 at 01:30:09AM -0500, Colin Walters wrote: > > I like > > the idea that I can download any old program written in a past > > decade and just type make. > > Yay for broken software. Unicode did not exist until fairly recently. Lots of useful software was written prior to its i

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-08 Thread Radovan Garabik
On Wed, Jan 08, 2003 at 01:10:36AM -0500, Colin Walters wrote: > On Tue, 2003-01-07 at 18:50, David Starner wrote: > > If you're using a terminal that can't support UTF-8, you always have the > > option of running > > something like GNU screen to translate the system charset to the terminal > > c

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-08 Thread Colin Walters
On Tue, 2003-01-07 at 03:07, Jakob Bohm wrote: > I agree, this is the only way to go. Naive, simple, classic > UNIX-style programming should continue to "just work", Naïve, simple, classic UNIX-style programs are ASCII-only. Then someone got the idea to bolt this huge "locale" kludge on top of

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-08 Thread David Starner
At 01:10 AM 1/8/2003 -0500, Colin Walters wrote: >That is interesting advice. I am not sure I understand exactly how it >would work though. Would you just tell screen that all input is in >UTF-8? It seems like this would not be true if the user has legacy >filenames, and they do something simple

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-08 Thread Colin Walters
On Mon, 2003-01-06 at 16:15, Jochen Voss wrote: > Hello Colin, > > On Fri, Jan 03, 2003 at 09:50:26PM -0500, Colin Walters wrote: > > In summary, UTF-8 is the *only* sane character set to use for > > filenames. > At least I agree to this :-) Cool. > I think that we need filename conversion betwe

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-08 Thread Colin Walters
On Tue, 2003-01-07 at 18:50, David Starner wrote: > If you're using a terminal that can't support UTF-8, you always have the > option of running > something like GNU screen to translate the system charset to the terminal > charset. > It seems more important to get a systemwide encoding working, t

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-08 Thread Colin Walters
On Tue, 2003-01-07 at 15:10, John Goerzen wrote: > Colin Walters <[EMAIL PROTECTED]> writes: > > > On Tue, 2003-01-07 at 13:50, John Goerzen wrote: > > > > Sorry, we have to start somewhere. Unicode is the way of the future, > > and if we wait until every vendor of some random terminal updates it

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-07 Thread David Starner
If you're using a terminal that can't support UTF-8, you always have the option of running something like GNU screen to translate the system charset to the terminal charset. It seems more important to get a systemwide encoding working, then worry about the minority who use physical terminals.
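
As a rough illustration of the translation step described above (a minimal sketch only, not how GNU screen itself is implemented), the function below recodes UTF-8 output into a legacy terminal charset. The function name `to_terminal` and the default charset `iso-8859-1` are assumptions chosen for the example; characters the terminal charset cannot represent are replaced with `?`.

```python
def to_terminal(data: bytes, terminal_charset: str = "iso-8859-1") -> bytes:
    """Recode UTF-8 system output for a terminal that uses a legacy charset.

    Hypothetical helper for illustration: invalid UTF-8 input and characters
    missing from the terminal charset are both replaced rather than raising.
    """
    # Decode the system-wide encoding (assumed here to be UTF-8).
    text = data.decode("utf-8", errors="replace")
    # Encode for the terminal; unrepresentable characters become "?".
    return text.encode(terminal_charset, errors="replace")


latin_ok = to_terminal("Grüße".encode("utf-8"))   # representable in ISO-8859-1
cjk_lost = to_terminal("日本".encode("utf-8"))     # not representable, degraded to "??"
```

The lossy second case is the trade-off under discussion: a wrapper can keep a legacy terminal usable, but it cannot display characters outside that terminal's charset.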

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-07 Thread Denis Barbier
On Tue, Jan 07, 2003 at 01:31:57PM -0500, Colin Walters wrote: > On Tue, 2003-01-07 at 10:22, John Goerzen wrote: > > > Then your solution is broken. Seriously, this would be a huge problem > > for many people. > > But the current situation is *already* broken! For example, for a > Chinese pers

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-07 Thread John Goerzen
Colin Walters <[EMAIL PROTECTED]> writes: > On Tue, 2003-01-07 at 13:50, John Goerzen wrote: > > Sorry, we have to start somewhere. Unicode is the way of the future, > and if we wait until every vendor of some random terminal updates it > with support for UTF-8, we will never start. I don't di

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-07 Thread Colin Walters
On Tue, 2003-01-07 at 13:50, John Goerzen wrote: > I don't disagree. I'm saying that your solution is worse than the problem. Sorry, we have to start somewhere. Unicode is the way of the future, and if we wait until every vendor of some random terminal updates it with support for UTF-8, we will

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-07 Thread John Goerzen
Colin Walters <[EMAIL PROTECTED]> writes: >> Then your solution is broken. Seriously, this would be a huge problem >> for many people. > > But the current situation is *already* broken! For example, for a I don't disagree. I'm saying that your solution is worse than the problem. > Chinese per

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-07 Thread Colin Walters
On Tue, 2003-01-07 at 10:22, John Goerzen wrote: > Then your solution is broken. Seriously, this would be a huge problem > for many people. But the current situation is *already* broken! For example, for a Chinese person, an ISO-8859-1 system simply cannot encode, nor display, their language.

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-07 Thread John Goerzen
Colin Walters <[EMAIL PROTECTED]> writes: >> I think that this would be a really bad idea, because it would be a too >> severe restriction on the set of supported terminal types. Think of >> remote logins from non-Debian machines: we cannot control the program >> at the other end of the line. And

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-07 Thread Marco d'Itri
On Jan 06, Jochen Voss <[EMAIL PROTECTED]> wrote: >Because a lot of programs are affected, it would gain us much, if we >could move this as deep as into libc or even into the kernel. I >remember there are some questions about character sets in the kernel >configuration. Are there file-systems

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-07 Thread Jakob Bohm
Hello everybody, On Mon, Jan 06, 2003 at 10:15:24PM +0100, Jochen Voss wrote: > Hello Colin, > > On Fri, Jan 03, 2003 at 09:50:26PM -0500, Colin Walters wrote: > > In summary, UTF-8 is the *only* sane character set to use for > > filenames. > At least I agree to this :-) > > I think that we need

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-06 Thread Colin Walters
[ CC's trimmed, since mail to the bug will reach -policy ] On Mon, 2003-01-06 at 16:07, Jason Gunthorpe wrote: > Fixing programs that handle terminal input is a different matter IMHO, it's > something that should be decided on a more case by case basis, and a lot of > cases might be effortless han

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-06 Thread Colin Walters
On Mon, 2003-01-06 at 16:01, Jochen Voss wrote: > On Mon, Jan 06, 2003 at 12:21:27AM -0500, Colin Walters wrote: > > After we have a "sufficient" number of programs supporting UTF-8 > > natively in this way, we change the policy on filenames to a "must", > > drop support for legacy terminals and e

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-06 Thread Jochen Voss
Hello Colin, On Fri, Jan 03, 2003 at 09:50:26PM -0500, Colin Walters wrote: > In summary, UTF-8 is the *only* sane character set to use for > filenames. At least I agree to this :-) I think that we need filename conversion between UTF-8 and the user's character set, because we cannot ban all non-

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-06 Thread Jochen Voss
Hello, On Mon, Jan 06, 2003 at 12:21:27AM -0500, Colin Walters wrote: > After we have a "sufficient" number of programs supporting UTF-8 > natively in this way, we change the policy on filenames to a "must", > drop support for legacy terminals and encodings, and switch everyone to > a UTF-8 termin

Re: Bug#99933: second attempt at more comprehensive unicode policy

2003-01-06 Thread Jason Gunthorpe
On 6 Jan 2003, Colin Walters wrote: > Since we will have to change programs anyways, we might as well fix them > to decode filenames as well. The shell is kind of tempting as a "quick > fix", but I don't think it will really help us. Fixing programs that handle terminal input is a different matt

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-06 Thread Colin Walters
On Mon, 2003-01-06 at 02:46, Jason Gunthorpe wrote: > I think you'd need to have all of argv be converted to utf-8 by the shell. Besides Sebastien's reply, there is another good reason not to do recoding in the shell: for any program which actually manipulates filenames, we will need to add Unico

Re: Bug#99933: second attempt at more comprehensive unicode policy

2003-01-06 Thread Jason Gunthorpe
On Mon, 6 Jan 2003, Sebastian Rittau wrote: > > I think you'd need to have all of argv be converted to utf-8 by the shell. > > This wouldn't work, since you're not able to handle files that are not > in UTF-8 encoding, then. This is especially bothersome if you have some > old non-UTF-8 files ly

Re: Bug#99933: second attempt at more comprehensive unicode policy

2003-01-06 Thread Sebastian Rittau
On Mon, Jan 06, 2003 at 12:46:46AM -0700, Jason Gunthorpe wrote: > I think you'd need to have all of argv be converted to utf-8 by the shell. This wouldn't work, since you're not able to handle files that are not in UTF-8 encoding, then. This is especially bothersome if you have some old non-UTF-

Re: Bug#99933: second attempt at more comprehensive unicode policy

2003-01-06 Thread Jason Gunthorpe
On Mon, 6 Jan 2003, Richard Braakman wrote: > I guess this conversion should be done by the user's shell, and all > filename arguments on the command line should be encoded in UTF-8. > Umm, except that the shell doesn't know which arguments are filenames. > How should this be done? I think you'd

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-06 Thread Colin Walters
On Sun, 2003-01-05 at 22:00, Richard Braakman wrote: > Hmm. Remember the far more common case of a program that takes a > filename on the command line and then tries to open it. The user > would have typed it in the local encoding, so it needs conversion. > On the other hand, if the program was

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-05 Thread Colin Walters
On Sun, 2003-01-05 at 22:00, Richard Braakman wrote: > On Sun, Jan 05, 2003 at 09:12:36PM -0500, Colin Walters wrote: > > However, if these programs display > > them to the user on a tty, it will be necessary to convert them to the > > user's locale encoding > > Hmm. Remember the far more common

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-05 Thread Richard Braakman
On Sun, Jan 05, 2003 at 09:12:36PM -0500, Colin Walters wrote: > However, if these programs display > them to the user on a tty, it will be necessary to convert them to the > user's locale encoding Hmm. Remember the far more common case of a program that takes a filename on the command line and

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-05 Thread Colin Walters
On Sun, 2003-01-05 at 15:13, Denis Barbier wrote: > Consider a program written in C, which creates new files with open(2); > if I understand your proposal right, when a filename is not UTF-8 > encoded, it should be converted into UTF-8 according to user's locale. Well, broadly speaking, there are

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-05 Thread Denis Barbier
On Sun, Jan 05, 2003 at 12:09:09PM -0500, Colin Walters wrote: > On Sun, 2003-01-05 at 09:23, Denis Barbier wrote: > > On Sat, Jan 04, 2003 at 12:10:42PM -0500, Colin Walters wrote: > > [...] > > > What *is* debatable is when and how to make the transition, which is > > > what we're doing now. > >

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-05 Thread Colin Walters
On Sun, 2003-01-05 at 09:23, Denis Barbier wrote: > On Sat, Jan 04, 2003 at 12:10:42PM -0500, Colin Walters wrote: > [...] > > What *is* debatable is when and how to make the transition, which is > > what we're doing now. > [...] > > So how to implement your proposal? > The main issue is to patch

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-05 Thread Denis Barbier
On Sat, Jan 04, 2003 at 12:10:42PM -0500, Colin Walters wrote: [...] > What *is* debatable is when and how to make the transition, which is > what we're doing now. [...] So how to implement your proposal? The main issue is to patch glibc API so that filenames are supposed to be UTF-8 encoded. Has

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-05 Thread Tollef Fog Heen
* Colin Walters | On Sat, 2003-01-04 at 19:22, Tollef Fog Heen wrote: | | > And any hard coded scripts using -d norsk (or -d bokmal) for getting | > Norwegian ispell output. | | Hm, but if the filename is already UTF-8, what is the problem? It isn't in stable, which means that I want to keep co

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-04 Thread Colin Walters
On Sat, 2003-01-04 at 21:17, Marco d'Itri wrote: > On Jan 04, Colin Walters <[EMAIL PROTECTED]> wrote: > > >> We may want a BOM, at the start, though. > > > >We don't need one for UTF-8. That's another one of the great things > >about it. > What do you know about international environments? M

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-04 Thread Colin Walters
On Sat, 2003-01-04 at 20:01, Marco d'Itri wrote: > The same applies to bash. There has been a patch in the BTS for a very > long time but it has never been applied. Hm, the latest bash appears to work for me at least. I've been using it when I want to do UTF-8 file manipulation until zsh is fixed.

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-04 Thread Colin Walters
On Sat, 2003-01-04 at 19:22, Tollef Fog Heen wrote: > Actually, the file names are in UTF8 already. :) Well, hey, so they are. Don't know why it didn't look like it before... > And any hard coded scripts using -d norsk (or -d bokmal) for getting > Norwegian ispell output. Hm, but if the filena

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-04 Thread Richard Braakman
On Sun, Jan 05, 2003 at 03:17:17AM +0100, Marco d'Itri wrote: > I propose a new policy amendment: developers whose native language is > english should not discuss i18n-related policy matters. That would make sure that i18n is always an afterthought. You need to work *with* developers, not *agains

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-04 Thread Marco d'Itri
On Jan 04, Colin Walters <[EMAIL PROTECTED]> wrote: >> We may want a BOM, at the start, though. > >We don't need one for UTF-8. That's another one of the great things >about it. What do you know about international environments? Maybe you do not need a BOM because your native language needs j

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-04 Thread Marco d'Itri
On Jan 04, Robert Bihlmeyer <[EMAIL PROTECTED]> wrote: >Considering old standards broken because a newer one exists is just >ridiculous. Agreed. >> I've noticed that UTF-8 sometimes makes zsh unhappy, [...] > >That's quite an understatement. The commandline editor can't deal with >multibyte

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-04 Thread Tollef Fog Heen
* Colin Walters | On Sat, 2003-01-04 at 13:15, Tollef Fog Heen wrote: | > * Colin Walters | > | > | Note that in my proposal UTF-8 filenames are only mandatory (a "must") | > | for files *included directly* in Debian packages or created by | > | maintainer scripts. Since I don't think we have

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-04 Thread Clint Adams
> > and then deleting it gets you in trouble, because zsh does not handle > > the two byte sequence as one character. > > Ok. Well, this should not be impossible to fix, I hope. No, just difficult to fix without a nasty kludge.
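
The problem quoted above, a line editor treating a two-byte UTF-8 sequence as two characters, can be sketched as follows. This is an illustrative Python sketch, not zsh's actual code; the helper names are hypothetical. A byte-oriented backspace leaves a dangling lead byte, while a UTF-8-aware one steps back over continuation bytes (bit pattern `10xxxxxx`) to remove the whole character.

```python
def delete_last_byte(raw: bytes) -> bytes:
    """Naive backspace: drop one byte, ignoring character boundaries."""
    return raw[:-1]


def delete_last_char(raw: bytes) -> bytes:
    """UTF-8-aware backspace: drop the final character as a unit."""
    end = len(raw) - 1
    # Continuation bytes look like 10xxxxxx; walk back to the lead byte.
    while end > 0 and (raw[end] & 0xC0) == 0x80:
        end -= 1
    return raw[:end]


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


name = "caf\u00e9".encode("utf-8")   # "é" is the two-byte sequence C3 A9
broken = delete_last_byte(name)      # leaves a stray C3 lead byte
fixed = delete_last_char(name)       # removes both bytes of "é"
```

The sketch assumes the input is well-formed UTF-8 to begin with; a real editor would also have to cope with invalid sequences and with display width, which is part of why the fix was described as difficult.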

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-04 Thread Colin Walters
On Sat, 2003-01-04 at 16:33, Robert Bihlmeyer wrote: > Don't you think this is a common case? I'd even say more common than > your scenarios. At least common enough that it should be acknowledged. I agree, it is common enough. But previously people had no choice but to use a broken hack; now we

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-04 Thread Robert Bihlmeyer
Colin Walters <[EMAIL PROTECTED]> writes: > I don't think so. I have put forth many real-world scenarios in which > using national charsets for filenames simply breaks, in ways that are > basically impossible to fix. You may be able to get away with using a > national charset on a machine where

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-04 Thread Wichert Akkerman
Previously Colin Walters wrote: > Opinions? I second this proposal. Wichert. -- Wichert Akkerman <[EMAIL PROTECTED]> http://www.wiggy.net/ A random hacker

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-04 Thread Colin Walters
On Sat, 2003-01-04 at 13:15, Tollef Fog Heen wrote: > * Colin Walters > > | Note that in my proposal UTF-8 filenames are only mandatory (a "must") > | for files *included directly* in Debian packages or created by > | maintainer scripts. Since I don't think we have any packages including > | any

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-04 Thread Tollef Fog Heen
* Colin Walters | Note that in my proposal UTF-8 filenames are only mandatory (a "must") | for files *included directly* in Debian packages or created by | maintainer scripts. Since I don't think we have any packages including | anything but ASCII filenames, this will not change a thing. You ar

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-04 Thread Colin Walters
On Sat, 2003-01-04 at 10:55, Robert Bihlmeyer wrote: > Colin Walters <[EMAIL PROTECTED]> writes: > > > > As I see it, the current (broken ?) behaviour is, to use the user's > > > locale setting (LC_CTYPE) to encode file names. > > > > It appears so, and yes, this behavior is completely and fund

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-04 Thread Colin Walters
On Sat, 2003-01-04 at 06:10, Marco d'Itri wrote: > On Jan 04, Colin Walters <[EMAIL PROTECTED]> wrote: > > >In summary, UTF-8 is the *only* sane character set to use for > >filenames. > True, but does not work in reality for too many people, so this cannot > be made mandatory. Note that in my p

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-04 Thread Robert Bihlmeyer
Colin Walters <[EMAIL PROTECTED]> writes: > > As I see it, the current (broken ?) behaviour is, to use the user's > > locale setting (LC_CTYPE) to encode file names. > > It appears so, and yes, this behavior is completely and fundamentally > broken. Whether or not this is broken is debatable.

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-04 Thread Marco d'Itri
On Jan 04, Colin Walters <[EMAIL PROTECTED]> wrote: >In summary, UTF-8 is the *only* sane character set to use for >filenames. True, but does not work in reality for too many people, so this cannot be made mandatory. > Major upstream software for Debian like GNOME is moving >towards requiring

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-03 Thread Colin Walters
On Fri, 2003-01-03 at 18:11, Jochen Voss wrote: > Is this meant to apply to programs like "ls", "bash", "touch", and > "emacs"? Yes. > I imagine that the transition period could be a hard time > for users who (like me) use non-ASCII characters in file-names. That is probably true. But we real

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-03 Thread Jochen Voss
Hello, On Thu, Jan 02, 2003 at 05:25:15PM -0500, Colin Walters wrote: > + Programs should expect filenames in general (whether from > + a Debian package or created by the user) to be encoded > + with UTF-8, although it is recommended for programs to try > + graceful

Bug#99933: second attempt at more comprehensive unicode policy

2003-01-02 Thread Colin Walters
On Thu, 2003-01-02 at 13:57, Colin Walters wrote: > #99933 goes a lot farther than #174982. I have a counter-proposal to #99933, which I have attached. I believe it fixes the problems I raised with your proposal, and should also cover some new areas (like filenames). I also hopefully fixed Jame