Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread Edward Lam
Corinna Vinschen wrote: On May 29 17:21, Edward Lam wrote: I think the problem I'm running into is: - I give cygwin 1.7's bash a string that is in my system default code page. - cygwin 1.7 thinks the string is actually UTF-8 and tries to convert it as UTF-8 into UTF-16, resulting in a

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread Corinna Vinschen
On Jun 3 09:18, Edward Lam wrote: Corinna Vinschen wrote: The question is, what do you expect? [...] [...] Wikipedia has several suggestions on how to handle invalid UTF-8 byte sequences (http://en.wikipedia.org/wiki/UTF-8). Personally, I favor the rule that uses the replacement
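
For readers following the thread, the replacement-character strategy for invalid UTF-8 can be illustrated roughly as follows. This is only a sketch in Python, not Cygwin's code: `decode_lenient` is a hypothetical helper name, and Python's built-in `errors="replace"` handler stands in for whatever rule the Cygwin DLL actually applies.

```python
def decode_lenient(raw: bytes) -> str:
    """Decode bytes as UTF-8, substituting U+FFFD for invalid sequences.

    Hypothetical helper for illustration; it relies on Python's own
    'replace' error handler rather than any Cygwin routine.
    """
    return raw.decode("utf-8", errors="replace")


# 0xA9 is a lone continuation byte, so it is not valid UTF-8 by itself.
argument = b"abc\xa9def"
text = decode_lenient(argument)

# The surrounding characters survive; only the bad byte is substituted.
print(repr(text))
```

The point of this approach is that the rest of the argument is preserved instead of the conversion stopping at the first invalid byte.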

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread Edward Lam
Corinna Vinschen wrote: What's left as questionable is the LANG=C default case. Due to the discussion from the last month we now use UTF-8 as default encoding, because it's the only encoding which covers all (valid) characters. Sure, we could also convert the command line using the current ANSI

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread IWAMURO Motonori
Hi. How about the addition of the setting of the locale environment variable (like LANG) to the Cygwin installer? 2009/6/3 Corinna Vinschen corinna-cyg...@cygwin.com: On Jun  3 09:18, Edward Lam wrote: Corinna Vinschen wrote: The question is, what do you expect?  [...] [...] Wikipedia has

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread Corinna Vinschen
On Jun 3 10:43, Edward Lam wrote: Corinna Vinschen wrote: What's left as questionable is the LANG=C default case. Due to the discussion from the last month we now use UTF-8 as default encoding, because it's the only encoding which covers all (valid) characters. Sure, we could also convert

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread Corinna Vinschen
http://cygwin.com/acronyms/#PCYMTNQREAIYR http://cygwin.com/acronyms/#TOFU On Jun 4 00:03, IWAMURO Motonori wrote: 2009/6/3 Corinna Vinschen What's left as questionable is the LANG=C default case.  Due to the discussion from the last month we now use UTF-8 as default encoding, because

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread Edward Lam
Corinna Vinschen wrote: No. I'm suggesting to convert the command line always using the default ANSI codepage, same as Windows when calling CreateProcessA. This only affects non-Cygwin processes anyway since Cygwin uses another mechanism to send the command line arguments to the child process.
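
The suggestion of converting the command line through the ANSI codepage can be sketched as below. This is an illustration only: `ansi_to_utf16` is a hypothetical name, and the codepage is passed in as a parameter (defaulting to cp1252 as an assumption) because the real value would come from the Windows system setting, which this sketch does not query.

```python
def ansi_to_utf16(raw: bytes, codepage: str = "cp1252") -> bytes:
    """Interpret raw bytes in the given ANSI codepage, return UTF-16LE.

    'codepage' is an assumed stand-in for the system default ANSI
    codepage; cp1252 is only an example default.
    """
    text = raw.decode(codepage)
    return text.encode("utf-16-le")


# The same byte maps to different characters depending on the codepage.
print(ansi_to_utf16(b"\xa9", "cp1252").hex())  # copyright sign in cp1252
print(ansi_to_utf16(b"\xe9", "cp1252").hex())  # e-acute in cp1252
print(ansi_to_utf16(b"\xe9", "cp1251").hex())  # Cyrillic short i in cp1251
```

Every single byte has some interpretation in a single-byte codepage such as cp1252 or cp1251 (a few cp1252 slots are unassigned in Python's codec), which is why this route does not hit the invalid-sequence problem for a byte like 0xA9.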

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread Christopher Faylor
On Wed, Jun 03, 2009 at 04:27:55PM +0200, Corinna Vinschen wrote: On Jun 3 09:18, Edward Lam wrote: Corinna Vinschen wrote: The question is, what do you expect? [...] [...] Wikipedia has several suggestions on how to handle invalid UTF-8 byte sequences

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread Corinna Vinschen
On Jun 3 12:02, Christopher Faylor wrote: On Wed, Jun 03, 2009 at 04:27:55PM +0200, Corinna Vinschen wrote: On Jun 3 09:18, Edward Lam wrote: Corinna Vinschen wrote: The question is, what do you expect? [...] [...] Wikipedia has several suggestions on how to handle invalid UTF-8 byte

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread Corinna Vinschen
On Jun 3 12:01, Edward Lam wrote: Corinna Vinschen wrote: No. I'm suggesting to convert the command line always using the default ANSI codepage, same as Windows when calling CreateProcessA. This only affects non-Cygwin processes anyway since Cygwin uses another mechanism to send the

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread Corinna Vinschen
On Jun 3 18:12, Corinna Vinschen wrote: On Jun 3 12:01, Edward Lam wrote: Corinna Vinschen wrote: No. I'm suggesting to convert the command line always using the default ANSI codepage, same as Windows when calling CreateProcessA. This only affects non-Cygwin processes anyway since

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread IWAMURO Motonori
I think that this problem is caused by the locale environment variable not being set. Therefore, I think that the problem can be solved by compelling the setting with setup.exe. 2009/6/4 Corinna Vinschen corinna-cyg...@cygwin.com: http://cygwin.com/acronyms/#PCYMTNQREAIYR

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread IWAMURO Motonori
And I think that UTF-8 is the best solution when the LC_CTYPE category is set to C. 2009/6/4 IWAMURO Motonori deenhe...@gmail.com: I think that this problem is caused by the locale environment variable not being set. Therefore, I think that the problem can be solved by compelling the

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread Edward Lam
Corinna Vinschen wrote: On Jun 3 12:02, Christopher Faylor wrote: On Wed, Jun 03, 2009 at 04:27:55PM +0200, Corinna Vinschen wrote: On Jun 3 09:18, Edward Lam wrote: Corinna Vinschen wrote: The question is, what do you expect? [...] [...] Wikipedia has several suggestions on how to

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread Edward Lam
IWAMURO Motonori wrote: And I think that UTF-8 is the best solution when the LC_CTYPE category is set to C. 2009/6/4 IWAMURO Motonori deenhe...@gmail.com: I think that this problem is caused by the locale environment variable not being set. Therefore, I think that the problem can be solved

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread Christopher Faylor
On Wed, Jun 03, 2009 at 12:55:57PM -0400, Edward Lam wrote: Corinna Vinschen wrote: On Jun 3 12:02, Christopher Faylor wrote: On Wed, Jun 03, 2009 at 04:27:55PM +0200, Corinna Vinschen wrote: On Jun 3 09:18, Edward Lam wrote: Corinna Vinschen wrote: The question is, what do you expect?

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread Edward Lam
Christopher Faylor wrote: As Corinna said above: Chris implemented using the invalid code point solution That's what is in Cygwin's CVS and in the latest snapshot. I see, you silently committed a fix while this discussion was ongoing?

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread Alexey Borzenkov
On Wed, Jun 3, 2009 at 6:27 PM, Corinna Vinschen corinna-cyg...@cygwin.com wrote: What's left as questionable is the LANG=C default case.  Due to the discussion from the last month we now use UTF-8 as default encoding, because it's the only encoding which covers all (valid) characters. Sure,

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread Christopher Faylor
On Wed, Jun 03, 2009 at 01:19:45PM -0400, Edward Lam wrote: Christopher Faylor wrote: As Corinna said above: Chris implemented using the invalid code point solution That's what is in Cygwin's CVS and in the latest snapshot. I see, you silently committed a fix while this discussion was ongoing?

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread Larry Hall (Cygwin)
On 06/03/2009, Christopher Faylor wrote: If you really must know, I was waiting for Corinna to return from vacation before advertising the fix because I wanted her to look at it first before it was exposed to the sound and fury of the cygwin mailing list. Looks like you got your wish. ;-) -- Larry

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-02 Thread Corinna Vinschen
On May 29 17:21, Edward Lam wrote: Alexey Borzenkov wrote: No, the bug is not that it gets wrong number of arguments. In fact, Windows has no concept of arguments, only C runtime does, which parses the command line. If command line is truncated, then C runtime will have missing arguments

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-30 Thread Edward Lam
I'm reposting since I didn't mean to send this privately. On Fri, May 29, 2009 17:22, Alexey Borzenkov wrote: Here, when I use Russian Windows and I don't have LANG set (or when I have LANG=en_US.UTF-8), the filename will be a UTF-8 multibyte string. So both Russian and European/Chinese/Japanese

[Fwd: Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line]

2009-05-30 Thread Edward Lam
Repost for mailing list. On Sat, May 30, 2009 at 6:03 PM, Edward Lam edw...@sidefx.com wrote: Here, when I use Russian Windows and I don't have LANG set (or when I have LANG=en_US.UTF-8), the filename will be a UTF-8 multibyte string. So both Russian and European/Chinese/Japanese filenames will be

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-30 Thread Edward Lam
Ok, so where's the bug tracker so I can log a bug? Isn't this mailing list serving as bug tracker? I just hope that whoever can fix this is reading our emails and will come up with the right solution. Given the lack of developer acknowledgment (or refutation), I'm not getting my hopes up.

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-30 Thread Christopher Faylor
On Sat, May 30, 2009 at 04:23:22PM -0400, Edward Lam wrote: Ok, so where's the bug tracker so I can log a bug? Isn't this mailing list serving as bug tracker? I just hope that whoever can fix this is reading our emails and will come up with the right solution. Given the lack of developer

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-30 Thread Charles Wilson
Edward Lam wrote: Ok, so where's the bug tracker so I can log a bug? Isn't this mailing list serving as bug tracker? I just hope that whoever can fix this is reading our emails and will come up with the right solution. Given the lack of developer acknowledgment (or refutation), I'm not

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-29 Thread IWAMURO Motonori
Hi. The encoding of C locale is ASCII, and not ISO-8859-1. I don't think ASCII is the same as ISO-8859-1. Does it work on LANG=en_US.ISO-8859-1? 2009/5/29 Edward Lam edw...@sidefx.com: Alexey Borzenkov wrote: On Thu, May 28, 2009 at 7:28 PM, Edward Lam edw...@sidefx.com wrote: PS. In case

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-29 Thread Edward Lam
IWAMURO Motonori wrote: The encoding of C locale is ASCII, and not ISO-8859-1. I don't think ASCII is the same as ISO-8859-1. Does it work on LANG=en_US.ISO-8859-1? No, it doesn't. Mind you though, I haven't managed to get piconv to recognize any of my LANG settings other than C in cygwin

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-29 Thread IWAMURO Motonori
I think that you should set export LANG=en_US.ISO-8859-1 instead of export LANG=LANG=en_US.ISO-8859-1. 2009/5/30 Edward Lam edw...@sidefx.com: IWAMURO Motonori wrote: The encoding of C locale is ASCII, and not ISO-8859-1. I don't think ASCII is the same as ISO-8859-1. Does it work on

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-29 Thread Edward Lam
IWAMURO Motonori wrote: I think that you should set export LANG=en_US.ISO-8859-1 instead of export LANG=LANG=en_US.ISO-8859-1. Ah, sorry, copy/paste error. Yes, that finally works. Thank you! I think there is still a bug here, though. If I set LANG=C, shouldn't it just NOT be doing any encoding, and thus

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-29 Thread Alexey Borzenkov
On Fri, May 29, 2009 at 8:22 PM, Edward Lam edw...@sidefx.com wrote: I think there is still a bug here, though. If I set LANG=C, shouldn't it just NOT be doing any encoding, and thus work? If I do this on Linux, it works. If I use a cygwin compiled app, it also works. On Linux, internally, the system uses

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-29 Thread Edward Lam
Hi Alexey, Thanks for explaining the UTF8 changes in cygwin 1.7. However, the decision to use UTF-8 for the C locale is questionable. It seems to me that it would be much safer to use the SYSTEM DEFAULT code page (ie. the return value of the system GetACP() function) for CYGWIN instead,

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-29 Thread Alexey Borzenkov
On Sat, May 30, 2009 at 12:10 AM, Edward Lam edw...@sidefx.com wrote: Thanks for explaining the UTF8 changes in cygwin 1.7. However, the decision to use UTF-8 for the C locale is questionable. Not at all, because utf-8, as far as I understand, is used for communication with the system in this

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-29 Thread Edward Lam
Alexey Borzenkov wrote: It might be safe for you, but not for other people. If you have a Russian default codepage and ever need to work with Chinese/Japanese filenames and cygwin uses the default codepage for filesystem operations (as in 1.5 right now), then you are really screwed. In my
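
The codepage concern raised here can be made concrete with a small check. This is a hedged sketch: `can_encode` is a hypothetical helper, the filename is an invented example, and cp1251 is assumed to stand in for a Russian default codepage.

```python
def can_encode(name: str, encoding: str) -> bool:
    """Return True if every character of 'name' exists in 'encoding'."""
    try:
        name.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# An invented Japanese filename, used purely as an example.
japanese_name = "ファイル.txt"

# A Cyrillic single-byte codepage has no slots for katakana,
# whereas UTF-8 can represent any valid Unicode character.
for encoding in ("cp1251", "utf-8"):
    print(encoding, can_encode(japanese_name, encoding))
```

This is the trade-off the thread keeps returning to: a single default codepage covers only its own repertoire, while UTF-8 covers every valid character but rejects bytes that are not well-formed UTF-8.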

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-29 Thread Edward Lam
Alexey Borzenkov wrote: No, the bug is not that it gets wrong number of arguments. In fact, Windows has no concept of arguments, only C runtime does, which parses the command line. If command line is truncated, then C runtime will have missing arguments when it tries to parse it. Sorry, I
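
The point that arguments only exist once the C runtime parses a single command-line string can be shown with a toy parser. This is a deliberately simplified sketch: `split_command_line` is a hypothetical function that only understands whitespace and double quotes, and it does not reproduce the backslash and quote-escaping rules of any real C runtime.

```python
def split_command_line(cmdline: str) -> list[str]:
    """Split one command-line string into arguments (simplified rules).

    Whitespace separates arguments; double quotes group text and are
    dropped. Real C runtimes apply additional escaping rules that are
    intentionally left out here.
    """
    args = []
    current = []
    in_quotes = False
    have_token = False
    for ch in cmdline:
        if ch == '"':
            in_quotes = not in_quotes
            have_token = True
        elif ch in " \t" and not in_quotes:
            if have_token:
                args.append("".join(current))
                current, have_token = [], False
        else:
            current.append(ch)
            have_token = True
    if have_token:
        args.append("".join(current))
    return args


full = 'prog.exe first "second arg" third'
truncated = "prog.exe first"  # the same string cut short

print(split_command_line(full))
print(split_command_line(truncated))
```

If the string handed to the child is cut short, the parser simply sees fewer arguments, which matches the symptom of "missing arguments" described in the message.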

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-29 Thread Alexey Borzenkov
On Sat, May 30, 2009 at 1:04 AM, Edward Lam edw...@sidefx.com wrote: Alexey Borzenkov wrote: It might be safe for you, but not for other people. If you have a Russian default codepage and ever need to work with Chinese/Japanese filenames and cygwin uses the default codepage for filesystem

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-29 Thread Alexey Borzenkov
On Sat, May 30, 2009 at 1:21 AM, Edward Lam edw...@sidefx.com wrote: Here's some more investigation: [...] So note that even with what seems to be a UNICODE-AWARE child process, I'm still getting a truncated command line. In fact, calling GetCommandLineW() directly seems to give a truncated

1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-28 Thread Edward Lam
Hi Cygwin 1.7 developers, I think I've encountered a bug in cygwin 1.7.0-48 on WinXP 32-bit. It seems that passing a character on the command line (from either ash.exe or bash.exe) that is greater than 127 to a native win32 process results in arguments being truncated. Hopefully you can

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-28 Thread Larry Hall (Cygwin)
Edward Lam wrote: Hi Cygwin 1.7 developers, I think I've encountered a bug in cygwin 1.7.0-48 on WinXP 32-bit. It seems that passing a character on the command line (from either ash.exe or bash.exe) that is greater than 127 to a native win32 process results in arguments being truncated.

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-28 Thread Edward Lam
Hi Larry, This sounds a lot like this report to me: http://cygwin.com/ml/cygwin/2009-05/msg00611.html I don't think it's the same bug because if I replace copyright.txt with a single printable character (e.g. c), then it works. Regards, -Edward Larry Hall (Cygwin) wrote: Edward Lam

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-28 Thread Edward Lam
PS. In case you haven't noticed, copyright.txt is not a long file. It consists of a single byte, 0xA9. Edward Lam wrote: Hi Larry, This sounds a lot like this report to me: http://cygwin.com/ml/cygwin/2009-05/msg00611.html I don't think it's the same bug because if I replace

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-28 Thread Alexey Borzenkov
On Thu, May 28, 2009 at 7:28 PM, Edward Lam edw...@sidefx.com wrote: PS. In case you haven't noticed, copyright.txt is not a long file. It consists of a single byte, 0xA9. Did you try utf-8 encoding copyright.txt? Perhaps your locale is utf-8 and the encoder fails.
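
Why the single byte 0xA9 matters can be checked quickly. The snippet below is a sketch in Python rather than anything from the thread: `is_valid_utf8` is a hypothetical helper, and cp1252 is only an assumed example of a Western system code page, since the messages do not say which code page was in use.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if 'data' decodes cleanly as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


raw = b"\xa9"  # the whole content of copyright.txt, per the report

# A lone 0xA9 is a continuation byte with no lead byte before it.
print(is_valid_utf8(raw))

# In cp1252 (an assumed example code page) the same byte is a character.
as_text = raw.decode("cp1252")
print(hex(ord(as_text)))

# Re-encoding that character as UTF-8 takes two bytes, and that form is valid.
utf8_form = as_text.encode("utf-8")
print(utf8_form.hex(), is_valid_utf8(utf8_form))
```

So a file holding the UTF-8 form of the copyright sign would contain two bytes, 0xC2 0xA9, not the single byte the original file has.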

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-28 Thread Edward Lam
Alexey Borzenkov wrote: On Thu, May 28, 2009 at 7:28 PM, Edward Lam edw...@sidefx.com wrote: PS. In case you haven't noticed, copyright.txt is not a long file. It consists of a single byte, 0xA9. Did you try utf-8 encoding copyright.txt? Perhaps your locale is utf-8 and the encoder fails.