Re: [1.7] Invalid UTF8 while creating a file -> cannot delete?

2009-09-25 Thread Robert Pendell
On Wed, Sep 23, 2009 at 5:30 PM, Ross Smith wrote: > Corinna Vinschen wrote: >> >> However, if we default to UTF-8 for a subset of languages anyway, it >> gets even more interesting to ask, why not for all languages?  Isn't it >> better in the long run to have the same default for all Cygwin >> ins

Re: [1.7] Invalid UTF8 while creating a file -> cannot delete?

2009-09-23 Thread Ross Smith
Corinna Vinschen wrote: However, if we default to UTF-8 for a subset of languages anyway, it gets even more interesting to ask, why not for all languages? Isn't it better in the long run to have the same default for all Cygwin installations? I'm really wondering if we shouldn't simply default

Re: [1.7] Invalid UTF8 while creating a file -> cannot delete?

2009-09-23 Thread Corinna Vinschen
On Sep 23 14:43, Corinna Vinschen wrote: > On Sep 23 13:34, Andy Koppe wrote: > > 2009/9/23 Corinna Vinschen: > > > I have a local patch ready to use the ANSI codepage by default in the > > > "C" locale.  It appears to work nicely and has the additional positive > > > side effect to simplify the co

Re: [1.7] Invalid UTF8 while creating a file -> cannot delete?

2009-09-23 Thread Corinna Vinschen
On Sep 23 13:34, Andy Koppe wrote: > 2009/9/23 Corinna Vinschen: > > I have a local patch ready to use the ANSI codepage by default in the > > "C" locale.  It appears to work nicely and has the additional positive > > side effect to simplify the code in a few places. > > > > If I only knew that east

Re: [1.7] Invalid UTF8 while creating a file -> cannot delete?

2009-09-23 Thread Andy Koppe
2009/9/23 Corinna Vinschen: > I have a local patch ready to use the ANSI codepage by default in the > "C" locale.  It appears to work nicely and has the additional positive > side effect to simplify the code in a few places. > > If I only knew that eastern language users could happily live with > th

Re: [1.7] Invalid UTF8 while creating a file -> cannot delete?

2009-09-23 Thread Corinna Vinschen
On Sep 22 19:07, Corinna Vinschen wrote: > On Sep 22 17:12, Andy Koppe wrote: > > True, but that's an implementation issue rather than a design issue, > > i.e. the ^N conversion needs to do the UTF-8 conversion itself rather > > than invoke the __utf8 functions. Shall I look into creating a patch?
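To make the ^N idea concrete, here is a rough sketch of encoding a single UTF-16 code unit as a ^N sequence. It rests on an assumption the messages here do not spell out: that a ^N sequence is the control byte 0x0E followed by the UTF-8-style bytes of the value. The function name `encode_ctrl_n` is hypothetical and is not a Cygwin function. The sketch handles one 16-bit unit only (no surrogate pairs) and packs the bits itself instead of calling a library converter, since a strict UTF-8 converter would be expected to reject a lone 0xDC?? word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (assumed format, not Cygwin's actual code):
   write 0x0E (^N) followed by the UTF-8-style bytes of one 16-bit
   code unit into out. Returns the number of bytes written, or 0 if
   the buffer is missing or too small. */
static size_t encode_ctrl_n(uint16_t w, unsigned char *out, size_t outlen)
{
    /* One prefix byte plus 1, 2, or 3 payload bytes. */
    size_t need = (w < 0x80) ? 2 : (w < 0x800) ? 3 : 4;
    size_t n = 0;

    if (out == NULL || outlen < need)
        return 0;

    out[n++] = 0x0E; /* ^N marker */
    if (w < 0x80) {
        out[n++] = (unsigned char)w;
    } else if (w < 0x800) {
        out[n++] = (unsigned char)(0xC0 | (w >> 6));
        out[n++] = (unsigned char)(0x80 | (w & 0x3F));
    } else {
        /* Three-byte form, applied here even to lone 0xDC?? words. */
        out[n++] = (unsigned char)(0xE0 | (w >> 12));
        out[n++] = (unsigned char)(0x80 | ((w >> 6) & 0x3F));
        out[n++] = (unsigned char)(0x80 | (w & 0x3F));
    }
    assert(n == need);
    return n;
}
```

Under these assumptions, the word 0xDCF6 would come out as the four bytes 0E ED B3 B6.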

Re: [1.7] Invalid UTF8 while creating a file -> cannot delete?

2009-09-23 Thread Andy Koppe
2009/9/22 Corinna Vinschen: >> >> Therefore, when converting a UTF-16 Windows filename to the current >> >> charset, 0xDC?? words should be treated like any other UTF-16 word >> >> that can't be represented in the current charset: it should be encoded >> >> as a ^N sequence. (I started writing thi

Re: [1.7] Invalid UTF8 while creating a file -> cannot delete?

2009-09-22 Thread Corinna Vinschen
On Sep 22 17:12, Andy Koppe wrote: > 2009/9/22 Corinna Vinschen: > >> Therefore, when converting a UTF-16 Windows filename to the current > >> charset, 0xDC?? words should be treated like any other UTF-16 word > >> that can't be represented in the current charset: it should be encoded > >> as a ^N

Re: [1.7] Invalid UTF8 while creating a file -> cannot delete?

2009-09-22 Thread Andy Koppe
2009/9/22 Corinna Vinschen: >> > As you might know, invalid bytes >= 0x80 are translated to UTF-16 by >> > transposing them into the 0xdc00 - 0xdcff range by just or'ing 0xdc00. >> > The problem now is that readdir() will return the transposed characters >> > as if they are the original characters.

Re: [1.7] Invalid UTF8 while creating a file -> cannot delete?

2009-09-22 Thread Corinna Vinschen
On Sep 21 19:54, Andy Koppe wrote: > 2009/9/21 Corinna Vinschen: > > As you might know, invalid bytes >= 0x80 are translated to UTF-16 by > > transposing them into the 0xdc00 - 0xdcff range by just or'ing 0xdc00. > > The problem now is that readdir() will return the transposed characters > > as if
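The transposition described in the quoted text can be sketched in C as follows. This is a minimal illustration, not Cygwin's actual code: the helper names `transpose_invalid_byte` and `recover_invalid_byte` are hypothetical, and only the OR with 0xdc00 is taken from the message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a Cygwin function): map a byte >= 0x80 that
   is not part of a valid multibyte sequence to a UTF-16 code unit by
   OR-ing it with 0xdc00, as the message describes. */
static uint16_t transpose_invalid_byte(unsigned char b)
{
    assert(b >= 0x80);
    return (uint16_t)(0xdc00u | b);
}

/* Hypothetical inverse: recover the original byte from a transposed
   code unit by masking off the 0xdc00 marker. */
static unsigned char recover_invalid_byte(uint16_t w)
{
    assert((w & 0xff00u) == 0xdc00u);
    return (unsigned char)(w & 0xffu);
}
```

For the ISO-8859-1 bytes used in the test file names, 0xF6 would map to 0xDCF6 and 0xDF to 0xDCDF; masking with 0xff gives back the original byte.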

Re: [1.7] Invalid UTF8 while creating a file -> cannot delete?

2009-09-21 Thread Andy Koppe
2009/9/21 Corinna Vinschen:
>> % cat t.c
>> int main() {
>>     fopen("a-\xF6\xE4\xFC\xDF", "w"); //ISO-8859-1
>>     fopen("b-\xF6\xE4\xFC\xDFz", "w");
>>     fopen("c-\xF6\xE4\xFC\xDFzz", "w");
>>     fopen("d-\xF6\xE4\xFC\xDFzzz", "w");
>>     fopen("e-\xF6\xE4\xFC\xDF\xF6\xE4\xFC\xDF", "w");
>>

Re: [1.7] Invalid UTF8 while creating a file -> cannot delete?

2009-09-21 Thread Corinna Vinschen
On Sep 16 00:38, Lapo Luchini wrote: > Andy Koppe wrote: > > Hmm, we've lost the \xDF somewhere, and I'd guess it was when the > > filename got translated to UTF-16 in fopen(), which would explain what > > you're seeing > > More data: it's not simply "the last character", it's something more > compl

Re: [1.7] Invalid UTF8 while creating a file -> cannot delete?

2009-09-15 Thread Lapo Luchini
Andy Koppe wrote:
> Hmm, we've lost the \xDF somewhere, and I'd guess it was when the
> filename got translated to UTF-16 in fopen(), which would explain what
> you're seeing
More data: it's not simply "the last character", it's something more complex than that.
% cat t.c
int main() {
fopen("a-

Re: [1.7] Invalid UTF8 while creating a file -> cannot delete?

2009-09-10 Thread Andy Koppe
2009/9/10 Lapo Luchini:
> But the real problem with that test is not really what shows and how,
> the biggest problem is that it seems that filenames created with a
> "wrong" filename are quite limited in usage and can't seemingly be deleted.
>
> % export LANG=en_EN.UTF-8
> % cat t.c
> #include
>

[1.7] Invalid UTF8 while creating a file -> cannot delete?

2009-09-10 Thread Lapo Luchini
After a few problems with monotone's unit tests on Cygwin-1.7, I began searching and experimenting a bit with the new 1.7 support for wide chars. I also read the full thread about its last change: http://www.cygwin.com/ml/cygwin/2009-05/msg00344.html which really makes some sense to me (when I create