Re: Unicode in filenames support? (FAQ update needed)
On 6/7/05, Christopher Faylor wrote:
> >I've been off of the developer list for a while now, and now the
> >archives are subscriber only. :-(
>
> Joshua, any chance I could get a FAQ entry about this?

I've updated "What Cygwin mailing lists can I join?" with a better
description.

Old language was "If you are going to help develop the Cygwin library by
volunteering for the project, you will want to subscribe to the Cygwin
developers list, called cygwin-developers."

New language is "There is also a low-volume list called cygwin-developers
which is reserved for knowledgeable people who regularly contribute to the
Cygwin DLL. Please do not ask for read-only access to this mailing list."

> >However, it was NTFS-specific and Cygwin went a different
> >route (which has path length limitations, but I digress).
>
> And, Joshua could I get a FAQ entry about this, too?

OK, I added some text about managed mounts. I've never really used them
myself, but this is the example I came up with for the FAQ that seems to
work fine:

    mkdir /managed-dir
    mount -o managed c:/cygwin/managed-dir /managed-dir
    cd /managed-dir/
    touch makefile
    touch Makefile

Are managed mounts prime-time enough to be put in the --help statement and
user's guide with caveats?

--
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
Problem reports: http://cygwin.com/problems.html
Documentation: http://cygwin.com/docs.html
FAQ: http://cygwin.com/faq/
Re: Unicode in filenames support?
Christopher Faylor wrote:
> On Fri, Jun 10, 2005 at 10:08:30PM -0400, Beman Dawes wrote:
>>I've been in contact with Newlib people working on the problem in Newlib,
>>which is where the problem needs to be solved. They really need
>>encouragement that people do care about wide character support, and that
>>not having it is a black eye for an otherwise excellent and highly
>>appreciated Cygwin effort. IMO of course.
>
> I've been reading the newlib list and they do not need "encouragement".
> Newlib is like any other free software project. They need someone to do
> the work. Please do not send "me toos" to the newlib list. They are
> not required.
>
> If you want something in newlib, then please submit a patch to make it
> happen. That's how it works.

That was my original intent, but I'm holding off doing anything because I
got private email indicating others had already done part of the work,
although it is stalled at the moment. So I'm hoping that with a bit of
encouragement they might finish the job.

--Beman Dawes
Re: Unicode in filenames support?
On Fri, Jun 10, 2005 at 10:08:30PM -0400, Beman Dawes wrote:
>I've been in contact with Newlib people working on the problem in Newlib,
>which is where the problem needs to be solved. They really need
>encouragement that people do care about wide character support, and that
>not having it is a black eye for an otherwise excellent and highly
>appreciated Cygwin effort. IMO of course.

I've been reading the newlib list and they do not need "encouragement".
Newlib is like any other free software project. They need someone to do
the work. Please do not send "me toos" to the newlib list. They are
not required.

If you want something in newlib, then please submit a patch to make it
happen. That's how it works.

cgf
Re: Unicode in filenames support?
Jaeho Shin wrote:
>I'm having a problem with accessing files that have Unicode in their
>filenames.
>
>...

The Boost Filesystem library (www.boost.org/libs/filesystem) release
version does not currently support Unicode or other wide-character
filenames. The "i18n" branch in the Boost CVS does provide that support.
It can be configured to traffic externally in wide-character Unicode
filenames (on NTFS or other file systems with direct wide-character
support) or multi-byte narrow-character encodings such as UTF-8.

A mini-review of the internationalized version of Boost.Filesystem should
begin on the Boost developers' mailing list in a week or so. I also plan to
propose the library later this year to the C++ standards committee for the
second library technical report.

The library works nicely with GCC on other platforms, but there is a
problem with the C library shipped with GCC/Cygwin. It doesn't support
wide characters, and that in turn prevents C++ std::wstring from working.
And that prevents Boost.Filesystem from working.

I've been in contact with Newlib people working on the problem in Newlib,
which is where the problem needs to be solved. They really need
encouragement that people do care about wide character support, and that
not having it is a black eye for an otherwise excellent and highly
appreciated Cygwin effort. IMO of course.

--Beman Dawes
RE: Unicode in filenames support? (FAQ update needed)
I wrote:
>> [...] If a disclaimer is all that you want, I'm sure you/I can get
>> it. In fact, as long as they know about the uncopyrighted code and
>> don't do anything about it, they've given up rights to it.

Christopher Faylor wrote:
> And you prove that they don't know anything about it by...?

Realistically, probably an e-mail from somebody in the legal department
(just not necessarily a signed document). Or you could force the issue by
sending a certified letter referring to the files on SourceForge. :-)

It'd be a shame if you aren't able to use public domain files due to legal
concerns. I thought "taking the high ground" by entirely dropping the
copyright would maximize usefulness to everybody.

gsw
Re: Unicode in filenames support? (FAQ update needed)
On 6/7/05, Christopher Faylor wrote:
> On Tue, Jun 07, 2005 at 02:17:02PM -0400, Williams, Gerald S (Jerry) wrote:
> >Corinna Vinschen wrote:
> >>Not that I know of. We're discussing to convert Cygwin's path handling
> >>to use Unicode for a while now, but it will take time. Don't expect
> >>this any time soon.
> >
> >I've been off of the developer list for a while now, and now the
> >archives are subscriber only. :-(
>
> Joshua, any chance I could get a FAQ entry about this?
>
> >However, it was NTFS-specific and Cygwin went a different
> >route (which has path length limitations, but I digress).
>
> And, Joshua could I get a FAQ entry about this, too? This has got to be
> at least the fifth time that someone has felt compelled to make the
> observation that the current implementation of managed mode has path
> length limitations. Maybe a managed mode section would be useful in
> general.

Sure, though unfortunately it will be a few days since I'm moving right
now.
Re: Unicode in filenames support? (FAQ update needed)
On Thu, Jun 09, 2005 at 05:24:57PM -0400, Williams, Gerald S (Jerry) wrote:
>Christopher Faylor wrote:
>> But releasing something to the public domain doesn't help
>> Cygwin. [...] The problem is that you still have to verify
>> that the sources are truly public domain and how do you do
>> that without getting a disclaimer from a person's employer?
>[...]
>> I truly hate all of this assignment stuff that is required for
>> contributions to FSF programs and Cygwin. I think it's time
>> for someone to come up with an online way to do this.
>
>My employer authorized the release into the public domain, making the
>code explicitly not protected by copyright. The lawyer-types don't
>trust the assignments though, so online forms therefore wouldn't help
>anyway. If a disclaimer is all that you want, I'm sure you/I can get
>it. In fact, as long as they know about the uncopyrighted code and
>don't do anything about it, they've given up rights to it.

And you prove that they don't know anything about it by...?

cgf
RE: Unicode in filenames support? (FAQ update needed)
Christopher Faylor wrote:
> But releasing something to the public domain doesn't help
> Cygwin. [...] The problem is that you still have to verify
> that the sources are truly public domain and how do you do
> that without getting a disclaimer from a person's employer?
[...]
> I truly hate all of this assignment stuff that is required for
> contributions to FSF programs and Cygwin. I think it's time
> for someone to come up with an online way to do this.

My employer authorized the release into the public domain, making the
code explicitly not protected by copyright. The lawyer-types don't trust
the assignments though, so online forms therefore wouldn't help anyway.
If a disclaimer is all that you want, I'm sure you/I can get it. In fact,
as long as they know about the uncopyrighted code and don't do anything
about it, they've given up rights to it.

Of course, IANALATEIHSMBSI (http://cygwin.com/acronyms/#IANAL and
http://cygwin.com/acronyms/#YANALATEYHSMBSI).

gsw
Re: Unicode in filenames support? (FAQ update needed)
On Thu, 9 Jun 2005, Christopher Faylor wrote:
> On Thu, Jun 09, 2005 at 02:28:28PM -0400, Williams, Gerald S (Jerry) wrote:
> >>Of course we would be glad to have more people working on the DLL (and
> >>sign the copyright assignment, sigh),
> >
> >Yes, the assignment was/is a hurdle for me. It turns out to be much
> >easier to release something into the public domain (at least at my
> >company), thus my approach. I had actually made some progress with the
> >assignment, but it went back to ground zero when my old group was
> >disbanded.
>
> But releasing something to the public domain doesn't help Cygwin. I did
> ask the Red Hat lawyer if accepting public domain sources was ok and he
> said "Yes, but..." The problem is that you still have to verify that the
> sources are truly public domain and how do you do that without getting a
> disclaimer from a person's employer?
>
> I truly hate all of this assignment stuff that is required for
> contributions to FSF programs and Cygwin. I think it's time for someone
> to come up with an online way to do this. I asked a (very) technically
> savvy lawyer acquaintance about this once and he said "Hmm..." but he
> never came up with anything workable...

FWIW, here's something I proposed to our lawyers that they found
reasonable: paste the fingerprint hash of the digital signature on the
manager's e-mail message approving the copyright assignment into the
assignment form. That way, if there's ever doubt, the message can be
referred to and it could be verified to be the correct message. I don't
know how workable that is, but it could be a start...

Of course, it could be the ravings of a madman on a dark September night,
in which case feel free to ignore or drop a hippo on me (in the
appropriate list).

Igor
--
http://cs.nyu.edu/~pechtcha/
Igor Pechtchanski, Ph.D.
Re: Unicode in filenames support? (FAQ update needed)
On Thu, Jun 09, 2005 at 02:28:28PM -0400, Williams, Gerald S (Jerry) wrote:
>>Of course we would be glad to have more people working on the DLL (and
>>sign the copyright assignment, sigh),
>
>Yes, the assignment was/is a hurdle for me. It turns out to be much
>easier to release something into the public domain (at least at my
>company), thus my approach. I had actually made some progress with the
>assignment, but it went back to ground zero when my old group was
>disbanded.

But releasing something to the public domain doesn't help Cygwin. I did
ask the Red Hat lawyer if accepting public domain sources was ok and he
said "Yes, but..." The problem is that you still have to verify that the
sources are truly public domain and how do you do that without getting a
disclaimer from a person's employer?

I truly hate all of this assignment stuff that is required for
contributions to FSF programs and Cygwin. I think it's time for someone
to come up with an online way to do this. I asked a (very) technically
savvy lawyer acquaintance about this once and he said "Hmm..." but he
never came up with anything workable...

cgf
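Editorial sketch: Igor Pechtchanski's reply in this thread proposes pasting
the fingerprint hash of the digital signature on the manager's approval
e-mail into the assignment form, so the message can be checked later. One
possible reading of that idea is sketched below in Python. The choice of
SHA-256, the function names, and the sample signature bytes are all
assumptions made for illustration; the thread does not say which hash or
signature format was meant.

```python
import hashlib


def fingerprint(signature_bytes: bytes) -> str:
    """Return a hex SHA-256 digest of the signature bytes (hash choice is an assumption)."""
    return hashlib.sha256(signature_bytes).hexdigest()


def matches_form(signature_bytes: bytes, recorded: str) -> bool:
    """Check an archived signature against the fingerprint written on the form."""
    return fingerprint(signature_bytes) == recorded.strip().lower()


# Hypothetical stand-in for the detached signature on the approval e-mail.
approval_signature = (
    b"-----BEGIN PGP SIGNATURE-----\n(example bytes)\n-----END PGP SIGNATURE-----\n"
)

# Value that would be pasted into the assignment form.
recorded_on_form = fingerprint(approval_signature)

print(matches_form(approval_signature, recorded_on_form))         # True
print(matches_form(approval_signature + b"x", recorded_on_form))  # False
```

This only shows that a stored message can be matched to the form later. It
says nothing about whether such a record would satisfy the legal
requirement discussed above.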
RE: Unicode in filenames support? (FAQ update needed)
> Of course we would be glad to have more people working on
> the DLL (and sign the copyright assignment, sigh),

Yes, the assignment was/is a hurdle for me. It turns out to be much
easier to release something into the public domain (at least at my
company), thus my approach. I had actually made some progress with the
assignment, but it went back to ground zero when my old group was
disbanded.

> but what you wrote sounds somewhat like a special solution
> which requires lots of new "if (is_ntfs)" tests, roughly.

Sort of, although I approached it as a set of services that could replace
Windows file operations with extended versions that could be selected
dynamically. I had come up with the following list of functions to
replace (which perhaps might be of some use to you):

    CopyFile                      GetFileAttributes
    CopyFileEx                    GetFileAttributesEx
    CreateDirectory               GetFullPathName
    CreateDirectoryEx             GetLongPathName
    CreateFile                    GetShortPathName
    DeleteFile                    MoveFile
    FindFirstChangeNotification   MoveFileEx
    FindFirstFile                 MoveFileWithProgress
    FindFirstFileEx               RemoveDirectory
    GetBinaryType                 ReplaceFile
                                  SearchPath
                                  SetCurrentDirectory
                                  SetFileAttributes
                                  SetFileSecurity

gsw
Re: Unicode in filenames support? (FAQ update needed)
On Jun 8 18:20, Williams, Gerald S (Jerry) wrote:
> (I don't necessarily expect that there will be any interest
> in my solution, but I thought that I should mention it just
> in case. As I said, there are other ways to deal with this
> without imposing path length limitations, and I don't even
> know how much of a concern such limits are in general.)

Of course we would be glad to have more people working on the DLL (and
sign the copyright assignment, sigh), but what you wrote sounds somewhat
like a special solution which requires lots of new "if (is_ntfs)" tests,
roughly. Sure it only works on file systems supporting that (which is
NTFS, basically), but the code already contains way too many different
routes due to OS/FS differences.

Our vague ideas how to implement this are more along the lines of "always
use the fooW (on Win32 level) or _U (on NT level) functions and drop back
to something else only on 9x, if necessary". This would also
automatically cover the managed mounts in terms of path length.

Corinna

--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          mailto:cygwin@cygwin.com
Red Hat, Inc.
RE: Unicode in filenames support? (FAQ update needed)
I wrote:
>> However, it was NTFS-specific and Cygwin went a different
>> route (which has path length limitations, but I digress).

Christopher Faylor wrote:
> And, Joshua could I get a FAQ entry about this, too? This
> has got to be at least the fifth time that someone has felt
> compelled to make the observation that the current
> implementation of managed mode has path length limitations.

Sorry, poor wording choice. To be honest, I don't even know if managed
mounts still have those limitations since I don't use them, although that
was my understanding at the time.

My approach was to use underlying NT services that bypass normal Windows
naming restrictions, allowing more or less arbitrary Unicode strings as
file names. It had path length limitations, but they were no worse than
what Windows has already. It was my understanding that Cygwin managed
mounts did this by escaping such characters into multi-character
sequences, which of course would cause you to run into the Windows limits
sooner. There are other ways to accomplish this, so the mechanism may
have changed for all I know.

(I don't necessarily expect that there will be any interest in my
solution, but I thought that I should mention it just in case. As I said,
there are other ways to deal with this without imposing path length
limitations, and I don't even know how much of a concern such limits are
in general.)

gsw
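Editorial sketch: the point above, that escaping characters into
multi-character sequences makes names hit the Windows path limits sooner,
can be illustrated with a small Python example. The %XX scheme below is
hypothetical and chosen only for illustration; it is not taken from the
Cygwin sources, and the names escape_name and unescape_name are invented
here.

```python
def escape_name(name: str) -> str:
    """Escape uppercase ASCII letters, '%', and non-ASCII bytes as %XX (hypothetical scheme)."""
    out = []
    for byte in name.encode("utf-8"):
        ch = chr(byte)
        if byte >= 0x80 or ch.isupper() or ch == "%":
            out.append(f"%{byte:02X}")
        else:
            out.append(ch)
    return "".join(out)


def unescape_name(escaped: str) -> str:
    """Invert escape_name by decoding each %XX back into one byte."""
    raw = bytearray()
    i = 0
    while i < len(escaped):
        if escaped[i] == "%":
            raw.append(int(escaped[i + 1:i + 3], 16))
            i += 3
        else:
            raw.append(ord(escaped[i]))
            i += 1
    return raw.decode("utf-8")


for original in ("makefile", "Makefile", "Caf\u00e9.mp3"):
    stored = escape_name(original)
    # Each escaped byte costs three characters on disk instead of one.
    print(original, "->", stored, f"({len(original)} -> {len(stored)} characters)")
```

Under this scheme "makefile" and "Makefile" are stored under names that
still differ after case folding, which is what lets both exist on a
case-insensitive file system. The cost is that every escaped byte takes
three characters, so a path full of capitals or non-ASCII text grows
toward the limit up to three times faster.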
Re: Unicode in filenames support?
Shaddy Baddah wrote:
> Hi,
>
> Williams, Gerald S (Jerry) wrote:
>> Corinna Vinschen wrote:
>>> Not that I know of. We're discussing to convert Cygwin's path
>>> handling to use Unicode for a while now, but it will take time.
>>> Don't expect this any time soon.
>>
>> I've been off of the developer list for a while now, and
>> now the archives are subscriber only. :-(
>>
>> How are you thinking about doing this?
>
> I too am interested in the discussion. I looked into the associated code
> recently, even going so far as looking into the source to see what Sun
> Java 1.5.0 does in this area.
>
> It may be relevant to know that Unicode support was added, for at least
> the java.awt.FileDialog class, under the Windows JVM, between 1.4.2 and
> 1.5.0. Support for the Microsoft Layer for Unicode (unicows.dll) DLL was
> also added (and the DLL distributed with the binary package). Are the
> cygwin developers contemplating support for this DLL as well?
>
> On an aside, I respect the policy on the developers mailing list being
> subscriber only, but I disagree with the "there is no middle ground"
> principle when it comes to read-only access to the mailing list. I
> theorise that the reason for it is that read-only access would encourage
> an increase in "add me please" requests to the list. Is that such a big
> price to pay?
>
> Regards,
> Shaddy

Nope, that's not a big price at all. The big price is when you tire CGF,
the man behind this all.

http://www.cygwin.com/ml/cygwin/2005-05/msg00870.html

--
Carlo Florendo
Astra Philippines Inc.
www.astra.ph
Re: Unicode in filenames support?
Jaeho Shin wrote:
> I'm having a problem with accessing files that have Unicode in their
> filenames.
>
> 1. I use Windows XP Korean version (so the codepage must be 949?).
> 2. I use iTunes to listen to my music.
> 3. Files in iTunes Library have filenames in the following format:
>    "{Artist}/{Album}/{Track#} {Title}.mp3"
>    where names inside braces are values from its ID3-tag.
> 4. Some of my mp3s have Japanese or Latin characters, e.g. é (Latin
>    small letter e with acute). In ID3-tags, those characters seem to be
>    in UCS-2 encoding or so, but not in CP949 or EUC-KR.
> 5. I want to rsync those files to my other Linux machine.
> 6. But rsync complains some files (whose name contains such
>    special/Unicode characters perhaps?) have vanished! :'(
>
> With Windows Explorer, I can copy them to a Samba share (with utf-8
> encoding) without any problem. However, from the Cygwin environment, it
> seems that there is no way I can access those files.
>
> I tried the "mount -o managed" option which escapes capitals and other
> non-ascii characters in filenames. It wasn't a solution for me since
> iTunes (not Cygwin) mainly manages the files.
>
> Since I really want to use rsync, I hope Cygwin will be able to access
> Unicode filenames. It would be great if I could mount a filesystem with
> a charset or encoding specified. Is there already a nice way to solve
> this problem?

Some time ago I wrote a patch for Cygwin that converted Unicode filenames
to UTF-8 and back. Maybe you can dig that up and see if you can get it
working with the latest Cygwin code.

Chris
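Editorial sketch: the round trip Chris describes (wide-character names on
the Windows side, UTF-8 on the POSIX side, and back) can be illustrated in
a few lines of Python. This is not the patch mentioned above, which is not
shown in the thread; the function names are invented for illustration, and
the example assumes names are held as UTF-16 little-endian code units.

```python
def utf16_to_utf8(name_utf16le: bytes) -> bytes:
    """Convert a UTF-16LE file name to the UTF-8 bytes a POSIX program would see."""
    return name_utf16le.decode("utf-16-le").encode("utf-8")


def utf8_to_utf16(name_utf8: bytes) -> bytes:
    """Convert UTF-8 bytes from a POSIX program back to a UTF-16LE file name."""
    return name_utf8.decode("utf-8").encode("utf-16-le")


def representable(name: str, codec: str) -> bool:
    """Report whether every character of the name exists in the given codec."""
    try:
        name.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


title = "Caf\u00e9.mp3"
wide = title.encode("utf-16-le")
narrow = utf16_to_utf8(wide)

print(narrow)                          # b'Caf\xc3\xa9.mp3'
print(utf8_to_utf16(narrow) == wide)   # True: the conversion is lossless
print(representable(title, "ascii"))   # False
print(representable(title, "cp949"))   # depends on the code page's repertoire
```

The design point is that UTF-8 can carry any name the file system stores,
whereas a single-locale code page can only carry the characters it happens
to contain. A name with characters outside the active code page therefore
has no faithful narrow form unless an encoding like UTF-8 is used.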
Re: Unicode in filenames support?
Hi,

Williams, Gerald S (Jerry) wrote:
> Corinna Vinschen wrote:
>
>>Not that I know of. We're discussing to convert Cygwin's path
>>handling to use Unicode for a while now, but it will take time.
>>Don't expect this any time soon.
>
> I've been off of the developer list for a while now, and
> now the archives are subscriber only. :-(
>
> How are you thinking about doing this?

I too am interested in the discussion. I looked into the associated code
recently, even going so far as looking into the source to see what Sun
Java 1.5.0 does in this area.

It may be relevant to know that Unicode support was added, for at least
the java.awt.FileDialog class, under the Windows JVM, between 1.4.2 and
1.5.0. Support for the Microsoft Layer for Unicode (unicows.dll) DLL was
also added (and the DLL distributed with the binary package). Are the
cygwin developers contemplating support for this DLL as well?

On an aside, I respect the policy on the developers mailing list being
subscriber only, but I disagree with the "there is no middle ground"
principle when it comes to read-only access to the mailing list. I
theorise that the reason for it is that read-only access would encourage
an increase in "add me please" requests to the list. Is that such a big
price to pay?

Regards,
Shaddy
Re: Unicode in filenames support? (FAQ update needed)
On Tue, Jun 07, 2005 at 02:17:02PM -0400, Williams, Gerald S (Jerry) wrote:
>Corinna Vinschen wrote:
>>Not that I know of. We're discussing to convert Cygwin's path handling
>>to use Unicode for a while now, but it will take time. Don't expect
>>this any time soon.
>
>I've been off of the developer list for a while now, and now the
>archives are subscriber only. :-(

Joshua, any chance I could get a FAQ entry about this?

>However, it was NTFS-specific and Cygwin went a different
>route (which has path length limitations, but I digress).

And, Joshua could I get a FAQ entry about this, too? This has got to be
at least the fifth time that someone has felt compelled to make the
observation that the current implementation of managed mode has path
length limitations. Maybe a managed mode section would be useful in
general.

cgf
RE: Unicode in filenames support?
Corinna Vinschen wrote:
> Not that I know of. We're discussing to convert Cygwin's path
> handling to use Unicode for a while now, but it will take time.
> Don't expect this any time soon.

I've been off of the developer list for a while now, and now the archives
are subscriber only. :-(

How are you thinking about doing this?

At one point, I created a framework that supported this. Unicode support
was actually just a side-effect--my real goal was to let you use two
files whose names differ only by case or use files with otherwise illegal
names such as "aux". I even went so far as to create a project on
SourceForge so that I could release it into the public domain.

However, it was NTFS-specific and Cygwin went a different route (which
has path length limitations, but I digress). I did finally get my
company's permission to release the code, but there was little point by
then. (I also had to scramble to survive a reorg at that time and didn't
have any time at all for quite a while afterwards.)

If there is interest in my NTFS-specific solution, please let me know.
(Actually, it's not necessarily specific to NTFS, though it probably is
in practice. It definitely doesn't support FATxx or Win9x.)

gsw
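Editorial sketch: the two naming problems mentioned above (files whose
names differ only by case, and names such as "aux" that Win32 treats as
devices) can be detected with a short Python check. This only identifies
the problem names; it does not reproduce the framework described in the
message. The helper names are invented, and str.upper() is used as an
approximation of the file system's own case folding.

```python
# DOS device names that the Win32 layer reserves, with or without an extension.
RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def is_reserved_device_name(filename: str) -> bool:
    """True if the part before the first dot is a reserved DOS device name."""
    stem = filename.split(".", 1)[0].rstrip(" ").upper()
    return stem in RESERVED


def case_collisions(names):
    """Group names that differ only by case (approximated with str.upper)."""
    groups = {}
    for name in names:
        groups.setdefault(name.upper(), []).append(name)
    return [group for group in groups.values() if len(group) > 1]


print(is_reserved_device_name("aux"))            # True
print(is_reserved_device_name("aux.c"))          # True
print(is_reserved_device_name("auxiliary.txt"))  # False
print(case_collisions(["makefile", "Makefile", "README"]))
# [['makefile', 'Makefile']]
```

A POSIX source tree that trips either check cannot be unpacked faithfully
through the ordinary Win32 file calls, which is the gap both the
NT-services approach and managed mounts were trying to close.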
Re: Unicode in filenames support?
On Jun 7 16:08, Jaeho Shin wrote:
> I'm having a problem with accessing files that have Unicode in their
> filenames.
> [...]
> Since I really want to use rsync, I hope Cygwin will be able to access
> Unicode filenames. It would be great if I could mount a filesystem with
> a charset or encoding specified. Is there already a nice way to solve
> this problem?

Not that I know of. We're discussing to convert Cygwin's path handling to
use Unicode for a while now, but it will take time. Don't expect this any
time soon.

Corinna
Unicode in filenames support?
I'm having a problem with accessing files that have Unicode in their
filenames.

1. I use Windows XP Korean version (so the codepage must be 949?).
2. I use iTunes to listen to my music.
3. Files in iTunes Library have filenames in the following format:
   "{Artist}/{Album}/{Track#} {Title}.mp3"
   where names inside braces are values from its ID3-tag.
4. Some of my mp3s have Japanese or Latin characters, e.g. é (Latin small
   letter e with acute). In ID3-tags, those characters seem to be in
   UCS-2 encoding or so, but not in CP949 or EUC-KR.
5. I want to rsync those files to my other Linux machine.
6. But rsync complains some files (whose name contains such
   special/Unicode characters perhaps?) have vanished! :'(

With Windows Explorer, I can copy them to a Samba share (with utf-8
encoding) without any problem. However, from the Cygwin environment, it
seems that there is no way I can access those files.

I tried the "mount -o managed" option which escapes capitals and other
non-ascii characters in filenames. It wasn't a solution for me since
iTunes (not Cygwin) mainly manages the files.

Since I really want to use rsync, I hope Cygwin will be able to access
Unicode filenames. It would be great if I could mount a filesystem with a
charset or encoding specified. Is there already a nice way to solve this
problem?

--
신재호 | Jaeho Shin | http://netj.org/
Programming Research Laboratory, Seoul National University