And for what it's worth, I submitted this "solution" using the Win32 API calls FindFirstFileW, FindNextFileW in late January: http://aspn.activestate.com/ASPN/Mail/Message/2996684
More generally: the built-in directory and file operators don't work with directory names and filenames that contain Unicode characters. Nothing in the Win32 Perl documentation warns about this limitation, so it's easy to burn lots of time trying to figure out why a script that should work, and that has been working, doesn't work (ok, at least it was easy for me to burn lots of time on this...). The Win32API::File module doesn't include this functionality (though it would seem to be the right place for it), and I've seen nothing on the Perl5 porters list or anywhere else that indicates whether this limitation will ever be fixed.
One of these days I need to write a Perl module / CPAN distribution to do this....
Regards,
... Dewey
"Timothy Johnson"
<[EMAIL PROTECTED]>
04/17/2006 08:38 PM
If you follow the links inside the link you just posted, you’ll eventually come to this:
http://groups.google.com/group/perl.unicode/browse_thread/thread/21b8a3cde8e54b8f/86ab5af239975df7?#86ab5af239975df7
Peter Gordon wrote:
> Hi Guys.
> I need some help with a project that I have. I have to copy files using
> Perl to different places and the filenames may be in Hebrew, Chinese,
> Korean etc.
> The problem is, that filenames, when using opendir, are returned as
> question marks. In the DOS box I have set the codepage to 862. So DIR
> returns accented characters, but Perl still returns question marks. I
> have also set "use utf8", but that didn't help either.
> So the problem I have is how to proceed. Should I give up with Perl and
> use Java or C? Any suggestions gratefully received.
I don't think you have to give up using Perl.
Something like this should work:
#!/usr/bin/perl
use strict;
use warnings;
use Encode;
use IO::Dir;

# Let perl know that we want to output cp862 on STDOUT
binmode( STDOUT, ':encoding(cp862)' );

my $dir = IO::Dir->new('.')
    or die("Failed to open dir : $!");

# defined() keeps an entry named "0" from ending the loop early
while ( defined( my $entry = $dir->read ) ) {
    next if $entry =~ /^\.{1,2}$/;

    # Decode octets into perl's internal unicode encoding
    $entry = Encode::decode_utf8($entry, 1);
    printf( "%s\n", $entry );
}

$dir->close;
Regards
Christian
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of D D Allen
Sent: Monday, April 17, 2006 2:05 PM
To: Ng, Bill; Trevor Joerges
Cc: perl-win32-users@listserv.ActiveState.com
Subject: Re: Quick Q [-d test with unicode/wide directory names]
A quick word of caution about the -d directory operator under Win32 (and -e and opendir and ...). If the Win32 directory name contains Unicode / wide characters, the -d operator will always return "false". As I understand it, Win32 Perl uses the A (ANSI) version of the Win32 API directory calls -- which don't work with file and directory names that include Unicode characters. So even if you pass "-d" a string variable in Windows native UTF-16LE format that correctly matches the directory name (with Unicode characters) on disk, it will return "false". Same goes for the "-e" file existence test -- for filenames that contain Unicode characters. "opendir" suffers from the same problem.
<snip>
Apparently, there was a "USING_WIDE" macro / switch in the Perl source that attempted to deal with Win32 file and directory names that contain Unicode characters (that would make Perl use the Win32 API "W" version directory calls) but it was disabled with Perl 5.8. See: http://aspn.activestate.com/ASPN/Mail/Message/perl5-porters/2933666
[Yes, there are workarounds to this problem... all of which involve not using Perl's built-in directory functions.]
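To make the A-versus-W point concrete, here is a minimal sketch using only the core Encode module. It does not call the Win32 API at all. It only shows how a name with Hebrew characters collapses to question marks when squeezed into a single-byte code page, and what the UTF-16LE buffer for a W call would look like. The directory name is made up for illustration, cp1252 is an assumed ANSI code page, and Encode's substitution is a stand-in for the conversion Windows performs, not the same code path.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical directory name: "dir_" plus four Hebrew letters
my $name = "dir_\x{05E9}\x{05DC}\x{05D5}\x{05DD}";

# What a single-byte ANSI code page can carry (cp1252 is an assumption):
# characters missing from the code page are replaced by "?"
my $ansi = encode( 'cp1252', $name );

# What a wide ("W") call expects: UTF-16LE octets with a terminating NUL
my $wide = encode( 'UTF-16LE', $name . "\0" );

printf "ANSI form : %s\n", $ansi;
printf "wide form : %d octets\n", length $wide;
```

Once the name has been reduced to "dir_????" there is no way to get the original characters back, which fits the symptom described above: the test fails even though the string passed in looks correct.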
_______________________________________________ Perl-Win32-Users mailing list Perl-Win32-Users@listserv.ActiveState.com To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs