Hello!
Consider the small Perl script below.
It should copy a text file while removing leading and trailing whitespace from each line.
while (<>) {
    s/^\s+//; s/\s+$//;
    print $_, "\n"; # say;
}
I run it with shell redirection:
perl copy.pl <src.txt >dst.txt
It works well for Windows ANSI and UTF-8 text files. But when I tried
a Unicode (UCS-2LE) source file containing "a\nb", which is
FF FE 61 00 0D 00 0A 00 62 00 0D 00 0A 00
in hex (with byte order mark), I get these bytes in the output:
FF FE 61 00 0D 00 0D 0A 00 62 00 0D 00 0D 0A 00 0D 0A
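The extra bytes look consistent with everything happening at the byte level rather than the character level. The sketch below is my attempt to reproduce the output on any platform. It assumes that lines are split at the single byte 0A, that \s only ever matches that lone 0A byte here, and that the Windows output layer turns every 0A byte into 0D 0A. The split and the substitution on 0A are only an emulation of what the I/O layers seem to do, not the layers themselves.

```perl
use strict;
use warnings;

# The 14 input bytes from the hex dump: BOM, "a", CR LF, "b", CR LF in UCS-2LE.
my $input = pack 'H*', 'FFFE61000D000A0062000D000A00';

# Emulate the read loop on raw bytes: a "line" ends at the byte 0x0A.
my $output = '';
for my $line (split /(?<=\x0A)/, $input) {
    $line =~ s/^\s+//;   # no effect: the chunks start with 0xFF or 0x00
    $line =~ s/\s+$//;   # strips only the trailing 0x0A byte
    $output .= $line . "\n";
}

# Emulate the assumed Windows text-mode output: each 0x0A byte becomes 0x0D 0x0A.
$output =~ s/\x0A/\x0D\x0A/g;

# prints: FF FE 61 00 0D 00 0D 0A 00 62 00 0D 00 0D 0A 00 0D 0A
print uc(join ' ', unpack '(H2)*', $output), "\n";
```

The printed bytes match the broken output above, including the stray 00 0D 0A at the end, which comes from the high byte of the last LF being treated as a line of its own.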
I have attached these files, but I am not sure what mail agents do with
them.
The PERL_UNICODE environment variable is not set.
I have tried adding -CS to the command line, but got a warning about a
malformed UTF-8 character.
I have tried adding each of these pragmas to the beginning of the script:
use open ':encoding(UCS-2LE)';
use open IO => ':encoding(UCS-2LE)';
use open ':std' => ':encoding(UCS-2LE)';
but without the desired result. I also tried combining the pragma with
the -CS option.
I have tried use feature qw/say/; say; instead of print $_, "\n"; but
without correct results.
perl --version
This is perl 5, version 18, subversion 2 (v5.18.2)
built for MSWin32-x86-multi-thread-64int
What is the correct way to set stdin/out to UCS-2LE, please?
What is the correct way to print an "encoding independent" newline
character, please?
What is the correct way to say that \s should match the "UCS-2LE way",
please?
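To make the three questions above concrete, here is a minimal sketch of the behaviour I am after. It assumes that the layer stack ':raw:encoding(UTF-16LE):crlf' is a valid way to decode first and translate line ends afterwards; I have not confirmed this on my Windows build. trim_copy is a hypothetical helper, and the in-memory handles only stand in for STDIN and STDOUT, where the same layers would presumably be applied with binmode.

```perl
use strict;
use warnings;

# Hypothetical helper: copy $in to $out, trimming whitespace around each line.
sub trim_copy {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        $line =~ s/^\s+//;
        $line =~ s/\s+$//;
        print {$out} $line, "\n";
    }
    return;
}

# Assumed layer stack: drop the default layers, decode UTF-16LE,
# then translate CR LF <-> LF on decoded characters.
my $layers = ':raw:encoding(UTF-16LE):crlf';

# In-memory stand-ins for the redirected files (same bytes as src.txt).
my $src = pack 'H*', 'FFFE61000D000A0062000D000A00';
my $dst = '';
open my $in,  "<$layers", \$src or die "cannot open input: $!";
open my $out, ">$layers", \$dst or die "cannot open output: $!";
trim_copy($in, $out);
close $out or die "cannot close output: $!";

# Show the resulting bytes in hex for comparison with the input.
print uc(join ' ', unpack '(H2)*', $dst), "\n";
```

With these layers the byte order mark is read as the ordinary character U+FEFF, which \s does not match, so leading spaces directly after it on the first line would not be stripped.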
In addition, is there a standardised way to auto-detect the input encoding
(legacy 8-bit, UTF-8 or UCS-2), please?
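By auto-detection I mean something along the lines of the sketch below, which only looks at a leading byte order mark. guess_encoding_from_bom is a hypothetical helper, not an existing function. It cannot tell a legacy 8-bit file from UTF-8 without a BOM, and it ignores UTF-32, so it is a starting point at best.

```perl
use strict;
use warnings;

# Hypothetical helper: guess an encoding name from a leading byte order mark.
# Returns undef when no BOM is found.
sub guess_encoding_from_bom {
    my ($bytes) = @_;
    return 'UTF-8'    if $bytes =~ /\A\xEF\xBB\xBF/;
    return 'UTF-16LE' if $bytes =~ /\A\xFF\xFE/;
    return 'UTF-16BE' if $bytes =~ /\A\xFE\xFF/;
    return;
}

# First bytes of the UCS-2LE sample file from this message.
my $sample = pack 'H*', 'FFFE61000D000A00';
my $guess  = guess_encoding_from_bom($sample);
print defined $guess ? $guess : 'no BOM found', "\n";   # prints: UTF-16LE
```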
Best regards
Hans Ginzel