Re: Cygwin programs doesn't support non-ASCII filenames

2009-05-09 Thread Corinna Vinschen
[Repeated and additional question.  I accidentally sent this as PM.
 Sorry about that.  Let's keep this on the list, please]

On May  9 11:43, Lenik wrote:
> (My system locale is zh_CN)

What ANSI codepage is that?

And what OEM codepage uses the console Window by default?

> 1, test path
> >>> set LANG=& cygpath -am .
> C:/Profiles/Shecti/??
>
> >>> set LANG=zh_CN.GBK& cygpath -am .
> C:/Profiles/Shecti/??
>
> >>> set LANG=C& cygpath -am .
> C:/Profiles/Shecti/×ÀÃæ

Can you please give us the exact name of the directory in either
UTF-8 or UTF-16 notation?

> 2, the `test' utility
> >>> set LANG=& bash -c "D=$(cygpath -am .); if [ -d $D ]; then echo  
> ok $D; else echo fail $D; fi"
> fail C:/Profiles/Shecti/??

What you're actually testing here all the time is cygpath in the first
place.  If you stop using cygpath, start a bash shell and use the Cygwin
commands with the paths in POSIX notation, you would have much less
trouble.  Cygwin is a POSIX emulation layer, after all.

If you give me the above information I'll look into fixing cygpath.

> The GB2312 charset is a subset of GBK charset, and the characters `  
> ??' is included in GB2312 charset. So in this example, GB2312 SHOULD 
> WORK.

Sorry, no.  It's documented that GBK is supported, GB2312 isn't.  From
what I read about GB2312 it's not actually a subset of GBK in terms
of character definitions, it's just a subset in terms of supported
characters.  AFAICS, GB2312 uses chars < 0x7f in multibyte sequences
which is not feasible for Cygwin.  We could support EUC-CN, which
seems to be another way to encode GB2312 chars, but I'm not exactly
willing to add that now.  I'd rather stabilize what we have now and
add further charset support in a later, official 1.7 release.

So you can use LANG=zh_CN.GBK, but not LANG=zh_CN.GB2312.  It's just
treated as invalid input.  Better: Use LANG=zh_CN.UTF-8.
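
The subset relationship discussed above can be checked with a short sketch. It uses Python's codecs as a stand-in, not Cygwin's own conversion tables; note that Python's "gb2312" codec implements the 8-bit EUC-CN form, not the 7-bit form mentioned above. The helper name `encodes` and the sample character U+9F8D are illustrative choices, not something from this thread.

```python
# Sketch: compare GB2312 (EUC-CN form) and GBK using Python's codecs.
# Assumption: Python's codec tables stand in for the charsets discussed;
# Cygwin's own behaviour is not exercised here.

NAME = "\u684c\u9762"  # the two characters of the directory name


def encodes(text, codec):
    """Return True if every character of text can be encoded with codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


gb2312_bytes = NAME.encode("gb2312")
gbk_bytes = NAME.encode("gbk")

# Characters present in both charsets are encoded to the same bytes.
print(gb2312_bytes == gbk_bytes)

# In the EUC-CN form every byte of a two-byte character is >= 0xA1.
print(all(b >= 0xA1 for b in gb2312_bytes))

# A traditional character (U+9F8D) exists in GBK but not in GB2312.
print(encodes("\u9f8d", "gbk"), encodes("\u9f8d", "gb2312"))
```

For characters that both charsets contain, a GBK locale therefore reads GB2312-encoded (EUC-CN) names unchanged, which fits the advice to use LANG=zh_CN.GBK.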


Corinna

-- 
Corinna Vinschen  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader  cygwin AT cygwin DOT com
Red Hat

--
Unsubscribe info:  http://cygwin.com/ml/#unsubscribe-simple
Problem reports:   http://cygwin.com/problems.html
Documentation: http://cygwin.com/docs.html
FAQ:   http://cygwin.com/faq/



Re: Cygwin programs doesn't support non-ASCII filenames

2009-05-09 Thread Lenik

(This mail is encoded in utf-8)

On 2009-5-9 18:02, Corinna Vinschen wrote:
> [Repeated and additional question.  I accidentally sent this as PM.
>  Sorry about that.  Let's keep this on the list, please]
>
> On May  9 11:43, Lenik wrote:
>> (My system locale is zh_CN)
>
> What ANSI codepage is that?
>
> And what OEM codepage uses the console Window by default?

`chcp' shows the codepage is 937.
I don't know the difference between the ANSI codepage and the OEM codepage.

>> 1, test path
>> >>> set LANG=& cygpath -am .
>> C:/Profiles/Shecti/??
>>
>> >>> set LANG=zh_CN.GBK& cygpath -am .
>> C:/Profiles/Shecti/??
>>
>> >>> set LANG=C& cygpath -am .
>> C:/Profiles/Shecti/×ÀÃæ
>
> Can you please give us the exact name of the directory in either
> UTF-8 or UTF-16 notation?

The two Chinese characters are encoded as:
GB2312: d7 c0 c3 e6
UTF-8: e6 a1 8c e9 9d a2
Unicode: \u684c \u9762

>> 2, the `test' utility
>> >>> set LANG=& bash -c "D=$(cygpath -am .); if [ -d $D ]; then echo
>> ok $D; else echo fail $D; fi"
>> fail C:/Profiles/Shecti/??
>
> What you're actually testing here all the time is cygpath in the first
> place.  If you stop using cygpath, start a bash shell and use the Cygwin
> commands with the paths in POSIX notation, you would have much less
> trouble.  Cygwin is a POSIX emulation layer, after all.

Well, I tested the pathnames using cygpath because I wanted an absolute
path, so that the Chinese characters would be included in the test, and I
can't type these characters in the console window.  The second reason is
that I have associated the .sh file type with bash, as:

  .sh=C:\lam\sys\cygwin-1.7\bin\bash -c "$(cygpath -u '%0') %*"

Here is a new test that doesn't use cygpath:
C:\Profiles\Shecti> set LANG=& bash -c "cat 你好"
cat: 你好: No such file or directory

C:\Profiles\Shecti> set LANG=zh_CN.GB2312& bash -c "cat 你好"
cat: 你好: No such file or directory

C:\Profiles\Shecti> set LANG=zh_CN.GBK& bash -c "cat 你好"
123

C:\Profiles\Shecti> set LANG=zh_CN.UTF-8& bash -c "cat 你好"
123

C:\Profiles\Shecti> set LANG=& bash -c "d 你好"
/mnt/c/Profiles/Shecti/你好 doesn't exist!

C:\Profiles\Shecti> set LANG=zh_CN.GBK& bash -c "d 你好"
/mnt/c/Profiles/Shecti/你好 doesn't exist!

C:\Profiles\Shecti> set LANG=zh_CN.UTF-8& bash -c "d 你好"
/mnt/c/Profiles/Shecti/你好 doesn't exist!

The result is the same every time, which shows that `cat' from coreutils
supports the locale well, while `d' doesn't.



> If you give me the above information I'll look into fixing cygpath.
>
>> The GB2312 charset is a subset of GBK charset, and the characters `
>> ??' is included in GB2312 charset. So in this example, GB2312 SHOULD
>> WORK.
>
> Sorry, no.  It's documented that GBK is supported, GB2312 isn't.  From
> what I read about GB2312 it's not actually a subset of GBK in terms
> of character definitions, it's just a subset in terms of supported
> characters.  AFAICS, GB2312 uses chars < 0x7f in multibyte sequences
> which is not feasible for Cygwin.  We could support EUC-CN, which
> seems to be another way to encode GB2312 chars, but I'm not exactly
> willing to add that now.  I'd rather stabilize what we have now and
> add further charset support in a later, official 1.7 release.
>
> So you can use LANG=zh_CN.GBK, but not LANG=zh_CN.GB2312.  It's just
> treated as invalid input.  Better: Use LANG=zh_CN.UTF-8.

Yes, GB2312 is a subset in terms of supported characters.  Is there
any way to find out the default locale of the current Cygwin
installation?  From my tests I found that `unset LANG' and
`set LANG=zh_CN.GB2312' give the same results, so I thought that GB2312
was the default locale.

And I'd like to use UTF-8 too, but I won't chcp to 65001: that would
introduce a lot of new problems when deploying to customers' machines,
since most programs and files in the real world are encoded in GB2312.


Lenik





Re: Cygwin programs doesn't support non-ASCII filenames

2009-05-09 Thread Corinna Vinschen
On May  9 23:12, Lenik wrote:
> (This mail is encoded in utf-8)
>
> On 2009-5-9 18:02, Corinna Vinschen wrote:
>> [Repeated and additional question.  I accidentally sent this as PM.
>>   Sorry about that.  Let's keep this on the list, please]
>>
>> On May  9 11:43, Lenik wrote:
>>> (My system locale is zh_CN)
>>
>> What ANSI codepage is that?
>>
>> And what OEM codepage uses the console Window by default?
> `chcp' shows codepage is 937

937?!?  Per MSDN there's no 937 codepage, rather a 936 codepage
which is used as ANSI and OEM codepage for the chinese language.
Depending on where you look, it's either called GBK or gb2312.  However,
it looks like GBK is more correct.

> I don't know what's difference between ANSI codepage and OEM codepage.

Well, basically ANSI is the codepage used by Windows GUI tools, OEM
is the codepage used by the Windows console by default.  A full
explanation is going a bit over the top in this mailing list.  And it
doesn't actually affect you since, as I wrote above, the 936 CP is used
for both areas.

>> Can you please give us the exact name of the directory in either
>> UTF-8 or UTF-16 notation?
> The two chinese characters encoding in:
> GB2312: d7 c0 c3 e6
> UTF-8: e6 a1 8c e9 9d a2
> Unicode: \u684c \u9762

Thanks, I'll use that for testing next week.
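
The byte notations quoted above can be cross-checked with a small sketch. It relies only on Python's built-in codecs; the helper name `hex_bytes` is made up for illustration. The last step shows what the GBK bytes look like when read as Latin-1, which may be where the `×ÀÃæ` in the earlier cygpath output comes from, although that is a guess.

```python
# Sketch: reproduce the byte notations for the directory name.
# Assumption: Python's "gbk" codec matches the Windows 936 codepage
# for these two characters.

name = "\u684c\u9762"  # the directory name as Unicode escapes


def hex_bytes(text, codec):
    """Encode text with codec and format the bytes as spaced hex."""
    return " ".join(f"{b:02x}" for b in text.encode(codec))


print("GBK:     ", hex_bytes(name, "gbk"))        # d7 c0 c3 e6
print("UTF-8:   ", hex_bytes(name, "utf-8"))      # e6 a1 8c e9 9d a2
print("UTF-16BE:", hex_bytes(name, "utf-16-be"))  # 68 4c 97 62

# Reading the GBK bytes with the wrong single-byte charset gives mojibake.
mojibake = name.encode("gbk").decode("latin-1")
print(mojibake)
```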

> C:\Profiles\Shecti> set LANG=& bash -c "cat 你好"
> cat: 你好: No such file or directory
>
> C:\Profiles\Shecti> set LANG=zh_CN.GBK& bash -c "cat 你好"
> 123
>
> C:\Profiles\Shecti> set LANG=zh_CN.UTF-8& bash -c "cat 你好"
> 123
>
> C:\Profiles\Shecti> set LANG=& bash -c "d 你好"
> /mnt/c/Profiles/Shecti/你好 doesn't exist!
>
> C:\Profiles\Shecti> set LANG=zh_CN.GBK& bash -c "d 你好"
> /mnt/c/Profiles/Shecti/你好 doesn't exist!
>
> C:\Profiles\Shecti> set LANG=zh_CN.UTF-8& bash -c "d 你好"
> /mnt/c/Profiles/Shecti/你好 doesn't exist!
>
> The result is the same every time, which shows that `cat' from coreutils
> supports the locale well, while `d' doesn't.

Ok, but that's not Cygwin's problem, just the d tool would need an
update at one point, perhaps.  OTOH, what you're doing is a bit
borderline.  When you start this stuff from cmd, you will have to enter
the filename in the notation valid for the locale in which the
application works.  For d, which only works in the C locale, you would
have to give the pathname using the SO/UTF-8 sequences.  Right now I
have no idea if there's a workaround for that, but keep in mind that
we're at the beginning of real native language support.  Unfortunately
it's all a bit more complicated than on non-Windows systems, given the
UTF-16-ness of the underlying system.

>> So you can use LANG=zh_CN.GBK, but not LANG=zh_CN.GB2312.  It's just
>> treated as invalid input.  Better: Use LANG=zh_CN.UTF-8.
>>
> Yes, GB2312 is a subset in terms of supported characters. Is there  
> anyway to know the default locale of current cygwin installation? From  
> the test I found that `unset LANG' and `set LANG=zh_CN.GB2312' just get  
> the same results, so I thought that GB2312 is the default locale.

The default locale is "C", as demanded by POSIX.  Everything else is
the responsibility of the application.  Please read
http://cygwin.com/1.7/cygwin-ug-net/setup-locale.html
and
http://cygwin.com/1.7/cygwin-ug-net/using-specialnames.html#pathnames-unusual

> And, I'd like to use UTF-8 too, but I won't chcp to 65001, this will  
> introduce a lot of new problems when deploy to customers' machines.  
> while most programs and files are encoded in GB2312 in the real world.

Cygwin 1.7 doesn't require you to use chcp.  Since all internal file I/O
and console I/O uses UTF-16 in Cygwin and the conversion from singlebyte
or multibyte charset to UTF-16 is done in Cygwin itself, the console
codepage has no meaning for Cygwin.  However, in your examples above
it gets a meaning since you enter the filenames while running in cmd,
and cmd of course *does* rely on the console codepage.
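
As a rough illustration of the conversion described above, the following sketch decodes a pathname from two different locale charsets and shows that both end up as the same UTF-16 code units. It uses Python's codecs as a stand-in for Cygwin's internal conversion, which is not shown in this thread; the function name `to_utf16_units` is hypothetical.

```python
# Sketch: a pathname in any supported locale charset maps to one UTF-16 form.
# Assumption: Python's decoders stand in for Cygwin's internal conversion.


def to_utf16_units(raw, charset):
    """Decode raw bytes from charset and return the UTF-16 code units."""
    data = raw.decode(charset).encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


gbk_name = bytes.fromhex("d7c0c3e6")       # name as seen in a GBK locale
utf8_name = bytes.fromhex("e6a18ce99da2")  # same name in a UTF-8 locale

units_from_gbk = to_utf16_units(gbk_name, "gbk")
units_from_utf8 = to_utf16_units(utf8_name, "utf-8")

print([hex(u) for u in units_from_gbk])   # ['0x684c', '0x9762']
print(units_from_gbk == units_from_utf8)  # True
```

Because both inputs converge on the same code units, the choice between GBK and UTF-8 only affects how the application sees the name, not which file is opened.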


Corinna

-- 
Corinna Vinschen  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader  cygwin AT cygwin DOT com
Red Hat




Re: print on cygwin command window with gfortran?

2009-05-09 Thread Dave Korn
Gus K wrote:
> I use gfortran on cygwin and i want to print in the command window (like it
> prints in windows)
> i use the usual stuff:
> 
> WRITE(6,*) 'Give a number:'
> 
> or
> 
> PRINT(6,*) 'Give a number:'
> 
> but the execution completes without any printing..
> 
> What is wrong?

  It's a problem with libgfortran as a DLL.  For now, please use static
linking; add "-static -static-libgcc" to your linker flags / command-line.

cheers,
  DaveK




Re: Cygwin programs doesn't support non-ASCII filenames

2009-05-09 Thread Lenik

On 2009-5-9 23:44, Corinna Vinschen wrote:
> On May  9 23:12, Lenik wrote:
>> (This mail is encoded in utf-8)
>>
>> On 2009-5-9 18:02, Corinna Vinschen wrote:
>>> [Repeated and additional question.  I accidentally sent this as PM.
>>>   Sorry about that.  Let's keep this on the list, please]
>>>
>>> On May  9 11:43, Lenik wrote:
>>>> (My system locale is zh_CN)
>>>
>>> What ANSI codepage is that?
>>>
>>> And what OEM codepage uses the console Window by default?
>> `chcp' shows codepage is 937
>
> 937?!?  Per MSDN there's no 937 codepage, rather a 936 codepage

Sorry, it's 936.


> Ok, but that's not Cygwin's problem, just the d tool would need an
> update at one point, perhaps.  OTOH, what you're doing is a bit
> borderline.  When you start this stuff from cmd, you will have to enter
> the filename in the notation valid for the locale in which the
> application works.  For d, which only works in the C locale, you would
> have to give the pathname using the SO/UTF-8 sequences.  Right now I
> have no idea if there's a workaround for that, but keep in mind that
> we're at the beginning of real native language support.  Unfortunately
> it's all a bit more complicated than on non-Windows systems, given the
> UTF-16-ness of the underlying system.

d is just one example; there are more.  So I guess it might be better
to resolve this in Cygwin.

Though I may be able to use UTF-8 sequences to invoke the d tool, I can't
do anything about the cwd, for example:

bash-3.2$ pwd
/mnt/c/Profiles/Shecti/桌面

bash-3.2$ ls
Gears Shortcut Sample.lnk  hello  setup.xj  worker.js
e-3.4.lnk  reply.txt  sms.xls

bash-3.2$ d
-  :  0  Jan 01  1970  桌面



> The default locale is "C", as demanded by POSIX.  Everything else is
> the responsibility of the application.  Please read

But `set LANG=C' gives a different result:

C:\Profiles\Shecti> set LANG=& bash -c "cat 你好"
cat: 你好: No such file or directory

C:\Profiles\Shecti> set LANG=C& bash -c "cat 你好"
123

So I guess the default locale isn't C.


Lenik





How to detect a cygwin thread?

2009-05-09 Thread Piotr Wyderski
My program has a built-in panic handler, which enumerates
all process threads using the CreateToolhelp32Snapshot
WinAPI function and then suspends them (except itself)
in order to freeze the entire environment in a state as close
as possible to the original error conditions. Unfortunately it
also stops the internal Cygwin thread (which seems to spend
most of its time in cygwin1.dll!toascii+0x15d0) and the entire
process hangs. Is there a way to identify those Cygwin
threads in order not to suspend them? There is a function

 extern "C" DWORD cygwin_internal()

specified as "This function gives you access to various
internal data and functions", so perhaps it could help me?
If not, then how do I achieve the goal specified above?

Best regards
Piotr Wyderski




Re: How to detect a cygwin thread?

2009-05-09 Thread Mark Geisert
Piotr Wyderski writes:
> My program has a built-in panic handler, which enumerates
> all process threads using the CreateToolhelp32Snapshot
> WinAPI function and then suspends them (except itself)
> in order to freeze the entire environment in a state as close
> as possible to the original error conditions. Unfortunately it
> also stops the internal Cygwin thread (which seems to spend
> most of its time in cygwin1.dll!toascii+0x15d0) and the entire
> process hangs. Is there a way to identify those Cygwin
> threads in order not to suspend them?

Why assume Cygwin could be the only source of extra threads?

Wouldn't it make more sense to have your program remember its own threads and
only suspend those?  Presumably you know when and where your program's threads
are created and destroyed, right?
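
The bookkeeping suggested here could look roughly like the sketch below. It is written in Python only to show the selection logic; in the real program the IDs would presumably come from GetCurrentThreadId and from the th32ThreadID field of the Toolhelp snapshot entries. The class and method names are hypothetical, and no thread is actually suspended.

```python
# Sketch: remember the program's own thread IDs and only pick those.
# Assumption: IDs are plain integers supplied by the caller; nothing here
# calls the Windows API or suspends a real thread.
import threading


class ThreadRegistry:
    """Tracks IDs of threads that the program itself created."""

    def __init__(self):
        self._lock = threading.Lock()
        self._ids = set()

    def register(self, thread_id):
        """Call at the start of every thread the program creates."""
        with self._lock:
            self._ids.add(thread_id)

    def unregister(self, thread_id):
        """Call just before such a thread exits."""
        with self._lock:
            self._ids.discard(thread_id)

    def suspend_candidates(self, enumerated_ids, self_id):
        """Return known threads from the snapshot, minus the caller."""
        with self._lock:
            known = set(enumerated_ids) & self._ids
        return sorted(known - {self_id})


registry = ThreadRegistry()
registry.register(101)  # panic handler's own thread
registry.register(102)  # a worker thread
# 999 stands for a thread the program never created (e.g. a runtime thread).
candidates = registry.suspend_candidates([101, 102, 999], self_id=101)
print(candidates)  # [102]
```

With this approach the panic handler never needs to recognise foreign threads at all, whether they belong to Cygwin or to some other library.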

..mark







Re: [ANNOUNCEMENT] [1.7] Updated: cygwin-1.7.0-47

2009-05-09 Thread Lee D. Rothstein

Christopher Faylor wrote:
> On Thu, May 07, 2009 at 02:20:18PM -0400, Andrew Schulman wrote:
>>>> By the way, I don't like that setup maximizes the window when on the package
>>>> selection step.
>>>
>>> I haven't seen it, but it certainly sounds wrong for a "wizard"-style
>>> window to change its size when you press the Next button.
>>
>> Hm, I always think it's kind of handy.  Also that it changes back again
>> after you're done with the selections.  Saves me 2 clicks every time I run
>> setup.
>
> That's why I implemented it (that + Corinna asked me to do something
> like this a while ago).  It certainly isn't particularly useful to
> ALWAYS open a window which is too small to hold the information it is
> trying to convey.  That's what setup.exe used to do.
>
> It is pretty difficult to make setup's property sheet methods do
> anything more flexible than maximize.  One alternative is to make all of
> the setup windows bigger.  I tried that and it looked much more
> intrusive than just maximizing the one window which needs it.
>
> Lots of installers take full control of the screen.  setup.exe is unique
> in that it doesn't take full control for the entire install process.
>
> cgf

For my taste (and this is only taste), it works beautifully, now!

Thank you!

Lee
