Mohammad Yaseen [EMAIL PROTECTED] writes:
Hi,
I'm using perl-5.8.7.
What is PLAN9?
An operating system.
And what is the plan9 directory in the source directory meant for?
Building perl for/on a Plan 9 system.
Thanks and Regards
Yaseen
John Delacour [EMAIL PROTECTED] writes:
use MIME::QuotedPrint;
$qp = encode_qp ($_, '');
print "=?UTF-8?Q?$qp?=" . $/;
That isn't quite right.
MIME::QuotedPrint does NOT encode space or tab.
RFC2047 says:
The Q encoding is similar to the Quoted-Printable content-
transfer-encoding defined
Wing [EMAIL PROTECTED] writes:
John Delacour [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
At 12:42 am +0800 28/12/05, wing wrote:
I need to encode the subject line in a MIME header in UTF8 (something like
Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=). I know that
this
David Olsson [EMAIL PROTECTED] writes:
What is the easiest way to install Encode for a single
user?
Same as any other CPAN module.
perl Makefile.PL PREFIX=/home/cedric/perl_modules
make
make install
then:
#!/usr/bin/perl
use lib '/home/cedric/perl_modules';
# or if script is relative to
Rajarshi Das [EMAIL PROTECTED] writes:
Hi,
The following two line script gives an error on z/OS : Unknown encoding
'iso-2022-
jp' at line ...
-
use Encode; use encoding 'iso-2022-jp';
On an EBCDIC platform like z/OS that is going to be one strange
Rajarshi Das [EMAIL PROTECTED] writes:
I run the following on an ebcdic platform
(perl-5.8.6),
$BOM = chr(0xFEFF);
open(UTF_PL, ">:raw:encoding(utf16le)", "utf.pl")
or die "utf.pl($enc,$tag): $!";
print UTF_PL $BOM;
print UTF_PL 1;
should the data that is written using PerlLIO_write,
be \xFF
Stuart Hughes [EMAIL PROTECTED] writes:
Hi everyone,
I've run into problems matching the regex [^\s] on RedHat 8/9 and the
version of perl shipped with it (5.8.0).
It isn't 5.8.0, it is 5.8.0-with-RedHatBugs :-(
To be fair to them it is some development track thing - there was
an experimental
Radoslaw Zielinski [EMAIL PROTECTED] writes:
Hello,
What's the point of lines 151 and 167 in Encode.pm? Respectively:
# sub encode
$_[1] = $string if $check;
# sub decode
$_[1] = $octets if $check;
I really can't see a point in overwriting the input value... Why only
if
Bjoern Hoehrmann [EMAIL PROTECTED] writes:
Now that we have this problem, introducing more places where one needs
to carefully check the documentation what is considered UTF-8 does not
seem like the best option, having decode_utf8() and decode(utf8 => ...)
mean something different is likely
Bjoern Hoehrmann [EMAIL PROTECTED] writes:
* Bjoern Hoehrmann wrote:
Encode 2.08, PerlIO::scalar 0.02, ActivePerl 5.8.2,
#!perl -w
use strict;
use warnings;
use Encode;
my $string = encode(UTF16 = );
for (qw/UTF-8 UTF-16LE UTF-16BE UTF-32LE UTF-32BE/)
{
my $backup =
Rick Measham [EMAIL PROTECTED] writes:
That being the case, I grab the charset and use Encode's decode function
to turn it into 'perl's internal format' .. which in 5.8.5 is utf8
right?
As it happens the answer is maybe, but since it is the _internal_ form it is
none of your business ;-) - so
Paul Bijnens [EMAIL PROTECTED] writes:
I have a program that reads and writes (among others) strings that
should be utf8 encoded. I say should, because somewhere deep
inside the dark corners of that program, sometimes, the utf8 flag on
a string is lost. (I'm still investigating where, tips to
Aaron Siladi [EMAIL PROTECTED] writes:
I have a UTF-8 string which I want to output as ascii and have the UTF8
characters converted to numeric character references.
I tried using Encode with the FB_HTMLCREF fallback option enabled, but for
the 2 byte UTF8 characters, 2 incorrect char refs
Dan Kogai [EMAIL PROTECTED] writes:
On Oct 25, 2004, at 03:01, Nick Ing-Simmons wrote:
But as Dan said at the start, \xF6 on its own (say as octet 1023
in a 0..1023 1024-octet buffer) is not a fail.
Changing that will make :encoding() layer have problems as buffer
boundaries can occur
Bjoern Hoehrmann [EMAIL PROTECTED] writes:
Hi,
What is currently the best way to resolve charset names to use them
with Encode.pm? I would have expected that e.g.
Encode::decode('ebcdic-cp-us', '')
would just work but it does not appear to know that alias. Then I've
tried to use
Dan Kogai [EMAIL PROTECTED] writes:
On Oct 23, 2004, at 01:04, Bjoern Hoehrmann wrote:
C12a in Unicode 4.0.1 notes
[...]
For example, in UTF-8 every code unit of the form 110xxxxx must be
followed by a code unit of the form 10xxxxxx. A sequence such as
110xxxxx 0xxxxxxx is ill-formed and
Dan Kogai [EMAIL PROTECTED] writes:
On Oct 24, 2004, at 06:41, Rafael Garcia-Suarez wrote:
Dan Kogai wrote:
Within less than 24hrs I resorted to release version 2.07. What the
heck. 5.8.6 is soon
I applied 2.07 to bleadperl, and looks like something is broken in
PerlIO::encoding.
More
Rafael Garcia-Suarez [EMAIL PROTECTED] writes:
Dan Kogai wrote:
This makes perl-5.8.6 happy but the problem is that I have made
Encode::utf8 so that it accepts fallback values like Encode::XS (upon
the request by Bjoern Hoehrmann via RT). Encode::utf8 used to return
immediately at partial
Dan Kogai [EMAIL PROTECTED] writes:
On Oct 24, 2004, at 18:34, Rafael Garcia-Suarez wrote:
Welcome to backward compatibility hell :)
Hell it was but seems like I came up with a way out (yay).
I just want Encode::utf8->decode() to make sure Encode::RETURN_ON_ERR
is
on when the caller is
Rick Measham [EMAIL PROTECTED] writes:
G'day Unicode Gurus and other assorted members of the perl Unicode
community.
I have a script that attempts to collect translations from Babelfish.
I've posted it below.
It uses LWP::UserAgent to turn an English phrase into Japanese (or any
other language
Rafael Garcia-Suarez [EMAIL PROTECTED] writes:
I have a problem avoiding "Malformed UTF-8 character" messages when I use the
Switch.pm module on SuSE 9.1 Professional with English or German language
settings.
Could we see a snippet of code that demonstrates the problem ?
The version of perl you
Marcin 'Qrczak' Kowalczyk [EMAIL PROTECTED] writes:
$ perl -e 'use encoding ISO-8859-2; use open :encoding(ISO-8859-2); print
ord($ARGV[0]), chr(260), $ARGV[0], \n' \x{00a1} does not map to iso-8859-
2 at -e line 1. 260\x{00a1}
I don't understand it: ord($ARGV[0]) is 260, chr(260) can be
Marcin 'Qrczak' Kowalczyk [EMAIL PROTECTED] writes:
Should strings without the UTF8 flag be interpreted in the default
encoding of the current locale or in ISO-8859-1?
This is a tricky question - and status quo is likely to remain
for compatibility reasons.
Perl treats them inconsistently. On
Marcin 'Qrczak' Kowalczyk [EMAIL PROTECTED] writes:
In a message of Mon, 16-08-2004, at 11:16 +0100, Nick Ing-Simmons wrote:
Perl treats them inconsistently. On one hand they are read from files
and used as filenames without any recoding, which implies that they are
assumed to be in some
Marcin 'Qrczak' Kowalczyk [EMAIL PROTECTED] writes:
But there is a simple workaround for that, as perluniintro would tell
you: the encoding pragma.
The encoding pragma partially works. It doesn't influence assumed
encoding of files opened without specifying the encoding, nor handling
of
Marcin 'Qrczak' Kowalczyk [EMAIL PROTECTED] writes:
In a message of Mon, 16-08-2004, at 16:54 +0300, Jarkko Hietaniemi
wrote:
The encoding pragma partially works. It doesn't influence assumed
encoding of files opened without specifying the encoding, nor handling
of filenames, and it needs to
Nicholas Clark [EMAIL PROTECTED] writes:
On Mon, Jun 21, 2004 at 08:46:07AM -0700, Jan Dubois wrote:
I think it is possible, but it requires someone to both do the work and
to argue for it on P5P. Without this champion, I don't see it
happening at all.
Nor do I. But P5P isn't big on arguing
Marco Baroni [EMAIL PROTECTED] writes:
A few days ago, I queried this list about my problems with a script
that finds the charset of Japanese web pages and translates their text
into utf-8.
The following solution, proposed by Nick Ing-Simmons, worked for my
case:
binmode STDOUT, ":utf8"
Erland Sommarskog [EMAIL PROTECTED] writes:
Jarkko Hietaniemi ([EMAIL PROTECTED]) writes:
Nick Ing-Simmons wrote:
This thread started as complaint that perl5 can't read a
script saved as UCS-2/UTF-16 or whatever Windows uses.
Uh, really? Perl 5.8+ should be able to do that, automatically
Erland Sommarskog [EMAIL PROTECTED] writes:
Nick Ing-Simmons ([EMAIL PROTECTED]) writes:
Erland Sommarskog [EMAIL PROTECTED] writes:
I would really expect someone to have done this already, but I see no
reference to such a module. Or layer-directive like :use-bom to open
the file. And then some
: Nick Ing-Simmons [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Tuesday, April 13, 2004 11:13 AM
Subject: Re: Decoding more languages
Octavian Rasnita [EMAIL PROTECTED] writes:
Hello all,
I want to transform a text that contains words in more languages (it is a
course for learning a foreign
Octavian Rasnita [EMAIL PROTECTED] writes:
I have tried the following script:
#!/perl/bin/perl -wC
use Encode;
my $text = Encode::decode('latin2', 'mta');
binmode(STDOUT, ":utf8");
print "Content-Type: text/html; Charset=UTF-8\n\n";
print Encode::encode('utf8', $text);
You have double-encoded
Erland Sommarskog [EMAIL PROTECTED] writes:
It seems that the only way out, is to first open the file in plain mode,
binmode I suspect.
look at the first three bytes, and if it is BOM, close the file, open
again with the appropriate options and discard the BOM.
You don't have to close it just
Erland Sommarskog [EMAIL PROTECTED] writes:
open (F, '>:encoding(ucs-2le)', 'rkmacka-ucs2.txt');
And one thing seems just plain wrong to me: The \n is written as
0A 0D to the file, not 000A, 000D. But maybe there is some more manual
reading I need to do to find out how to do it.
0A 0D is
Sebastian Lehmann [EMAIL PROTECTED] writes:
Hello,
I use a Perl script to search different files. The search values are given
from an HTML page, the results are displayed on this page, too. The files are
saved in the UTF16LE format, therefore I will open them with the following
open command:
Andreas Jaekel [EMAIL PROTECTED] writes:
Dear Perl Deities!
I've been trying to figure this out for myself for a couple
of hours now, but I got to the point where I gave up and decided
that I'll have to bother you. Hope you don't mind.
My task is the following, and I'm running out of ideas:
//
Guido Flohr [EMAIL PROTECTED] writes:
ALexander N. Treyner wrote:
Hello All,
I'm using utf-8 Postgres database, where I save strings in many languages.
I have to match the database with strings encoded in mime base64 or
quoted-printable format, like the following:
ALexander N. Treyner [EMAIL PROTECTED] writes:
Hi John,
Your code works perfectly.
But I found one strange thing.
For example I have the following string:
hello hello world
that is converted by the mail client to
hello =?windows-1255?Q?=F9=EC=E5=ED_hello_world?=
After
Brad Guillory [EMAIL PROTECTED] writes:
Last spring someone committed a patch to fix the tests on windows
platforms (see Change 18966 by [EMAIL PROTECTED] on 2003/03/14 04:20:51).
This broke the tests on my Redhat box. Here is a compromise patch:
--- t/enc_module.t.orig 2004-01-28
Eric Cholet [EMAIL PROTECTED] writes:
Le 1 janv. 04, 17:50, Rafael Garcia-Suarez a crit :
+(However, and as a limitation of the current implementation, using
+C<\w> or C<\W> I<inside> a C<[...]> character class will still match
+with byte semantics.)
I don't think it applies to \w, only \W. \x{df}
Jarkko Hietaniemi [EMAIL PROTECTED] writes:
Let's not 'fix' it (not carve it on a stone), but offer a few
well-thought-out options. For instance, Perl may offer (not that these
are particularly well-thought-out) 'just treat this as a sequence of
octets', 'locale', and 'unicode'. 'locale' on
Jungshik Shin [EMAIL PROTECTED] writes:
Then, he should switch to en_GB.UTF-8.
I probably will.
Besides, he implied that
he still uses ISO-8859-1 for files whose names can be covered by
ISO-8859-1, which is why I wrote about mixing up two encodings
in a single file system _under_ his
Jarkko Hietaniemi [EMAIL PROTECTED] writes:
What I wish is that the whole current locale system would curl up and
die.
As you'd agree, it's only 'encoding' part that has to die.
Oh no, there are plenty of parts in it that I wish would die :-)
(though the coupling of encoding is a major
Jungshik Shin [EMAIL PROTECTED] writes:
That will work if there's en_GB.UTF-8 available for him in his
particular Unixes and assuming using UTF-8 locales won't break other
things.
Just so we get this clear. A year or so back I - as a Unicode advocate - tried
to switch to en_GB.utf8. Within
Dana Sharvit - M [EMAIL PROTECTED] writes:
Hi ,
I am using the Encode module (perl 5.8) to convert a string from utf8 to
big5.
There is something that I do not understand that I thought you may help
with:
The input to the program is a file that contains a utf8 string,
The encoding works properly
Edward Batutis [EMAIL PROTECTED] writes:
Also each character when I view it via character
listing of IME pad, it has three hex numbers.
Seeing three hex numbers per character is a sure sign you've got utf8. You
need to convert the characters to the platform encoding before using 'open'.
In
Jarkko Hietaniemi [EMAIL PROTECTED] writes:
a year ago, there was a discussion on this list about Encode not
recognizing TIS-620 as alias for iso-8859-11:
http://nntp.x.perl.org/group/perl.unicode/1656
In the latest release of Encode::Alias (1.38 from Encode 1.9801,
included in Perl
SADAHIRO Tomoyuki [EMAIL PROTECTED] writes:
Hello.
For round-trip fidelity, Mac OS CJK encodings include many characters
with mapping a single character in a Mac OS encoding
to a sequence of standard Unicode characters.
(cf. ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/APPLE/README.TXT )
In the
Hank Tt [EMAIL PROTECTED] writes:
Hi,
I'm trying to make a UCM file to feed to enc2xs. The legacy encoding for
Taiwanese romanization *must* have its code points mapped to Unicode
character sequences, for the simple reason that the UCS lacks the
corresponding precomposed characters (and is
John Delacour [EMAIL PROTECTED] writes:
At 11:31 am +0100 16/9/03, [EMAIL PROTECTED] wrote:
Dear PERLists,
I am running Perl 5.8. and trying to filter out some invalid Unicode
characters from Unicoded texts of some South Asian languages. There
are 28 such characters in my data (all control
Jarkko Hietaniemi [EMAIL PROTECTED] writes:
On Thu, Aug 28, 2003 at 03:16:20PM +0100, [EMAIL PROTECTED] wrote:
Does the existing perl5.8.* Unicode support have a way to efficiently
determine which script(s) or block (in unicode sense) a code point belongs
to?
use Unicode::UCD
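A minimal sketch of what that suggestion might look like in practice, assuming the charscript() and charblock() functions exported by Unicode::UCD; the helper name script_and_block and the sample code point are only for illustration.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charscript charblock);

# Hypothetical helper: return the script and the block of one code point.
sub script_and_block {
    my ($code) = @_;
    return (charscript($code), charblock($code));
}

my ($script, $block) = script_and_block(0x3042);    # HIRAGANA LETTER A
printf "U+%04X script=%s block=%s\n", 0x3042, $script, $block;
```

Both calls consult the Unicode database files shipped with perl, so for bulk lookups it may be worth caching the results per code point.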
Dan Kogai [EMAIL PROTECTED] writes:
But that is not good enough for cases below because...
(Hiragana | Katakana | Han) = 'jisx0208.1990-0'
This is very wrong because jisx0208.1990-0 only contains \p{Han} that
appears in Japanese (JIS X 0208, to be exact). On the other hand,
jisx0208.1990-0
Owen Taylor [EMAIL PROTECTED] writes:
You might want to look at what we did for Pango - see
pango/modules/basic/tables-big.i in
ftp://ftp.gtk.org/pub/gtk/v2.2/pango-1.2.5.tar.gz.
[There may come a time when I just give up Tcl/Tk and implement
perl/Tk OO interface on top of gtk instead. But not
Owen Taylor [EMAIL PROTECTED] writes:
On Fri, 2003-08-29 at 11:14, Nick Ing-Simmons wrote:
We're dropping support for this code and for core X fonts
in the next release of Pango,
In favour of what? (FreeType on client side?)
Yes, using the Xft and fontconfig libraries. (http
Jungshik Shin [EMAIL PROTECTED] writes:
If you want, you can take a look at nsFontMetricsGTK.cpp file
of mozilla.
Can you pass on my admiration to the Mozilla team - its
handling of these issues in version 1.4 is so much better
than ye-olde Netscape.
You can view that huge file (over 6,000
Simon Cozens [EMAIL PROTECTED] writes:
[EMAIL PROTECTED] (Simon Cozens) writes:
Can someone give me a few quick examples of creating Encode::XS objects
to do simple transcoding, from XS?
I think I expressed myself badly. Perhaps I don't mean creating Encode::XS
objects, but instantiating them.
expect it to be in 5.8.1 as well.
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
Nick Ing-Simmons [EMAIL PROTECTED] writes:
Martin J. Evans [EMAIL PROTECTED] writes:
A socket is a file handle so :
binmode($sock, ":utf8");
should work.
I'm obviously missing something rather fundamental here.
Not you - us.
How can we have got this far without someone discovering
' encoding
uses core's SvUTF8 scheme - which is just fine if it _IS_ UTF-8
What we need for Encode::* to have its _own_ UTF-8 and UTF-EBCDIC
encode/decode independent of what core is using...
Thanks
Brian
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
email using Encode::'s euc-cn
and Unicode fonts, but as I can't read many Chinese characters
this was mainly just as an exercise.
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
Attached is a patch that implements ->decode and ->encode of
Encode::utf8 as XS code that obeys all the rules that Encode::XS does.
This allows :encoding(UTF-8) to handle partial chars at end of buffers
correctly.
Submitted as
//depot/perlio/...18032
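To illustrate the buffer-boundary case from the caller's side, here is a rough sketch (not the patch itself) that decodes a UTF-8 stream arriving in arbitrary chunks. It assumes the documented Encode::FB_QUIET behaviour, where decode() returns what it could process and leaves the unprocessed tail in the source variable; the helper name decode_chunks is made up for the example.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode UTF-8 octets that arrive in arbitrary chunks.
sub decode_chunks {
    my @chunks = @_;
    my ($pending, $text) = ('', '');
    for my $chunk (@chunks) {
        $pending .= $chunk;
        # With FB_QUIET, decode() returns the portion it could process and
        # overwrites $pending with the unprocessed remainder, for example a
        # character that was split across two chunks.
        $text .= decode('UTF-8', $pending, Encode::FB_QUIET);
    }
    return ($text, $pending);
}

# The three octets E6 A4 9C (U+691C) are split across the chunk boundary.
my ($text, $rest) = decode_chunks("ab\xE6\xA4", "\x9Ccd");
printf "decoded %d characters, %d octets left over\n", length($text), length($rest);
```

Note that genuinely malformed input would also stall at the bad octet with this check value, so a real reader would want to inspect whatever is left in the pending buffer at end of stream.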
--
Nick Ing-Simmons
http://www.ni-s.u
you to that conclusion?
how can I
flatten it to binary? I tried with unpack without success.
Nadim.
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
to be more usable (less embedded or at
least more systematic-looking punctuation, more familiar from e-mail
and HTTP headers etc.) We can revisit that if people think it would
help.
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
to their
preferred MIME names, all in lowercase. Maybe the unique ID number
(MIBenum) could also be taken into account.
I have no objection to that - and I doubt Dan will either.
Would you care to at least enumerate the cases we fail - or ideally
provide patch(es) ?
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
remains
empty and maybe we can make use of it
I probably will - there are a whole slew of Encode-oid issues with
body part of MIME.
Dan the Encode Maintainer
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
it on my machine.
To be pedantic it is not an Encoding it is a non-encoding ;-)
I would recommend using encode and decode rather than from_to
in such cases.
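As a small sketch of that recommendation, assuming ISO-8859-1 input that should end up as UTF-8 (the sample string and variable names are invented for the example), the two-step form keeps the character string around, while from_to() rewrites the octets in place:

```perl
use strict;
use warnings;
use Encode qw(decode encode from_to);

my $latin1_octets = "caf\xE9";    # "cafe" with e-acute, as ISO-8859-1 octets

# Two explicit steps: octets -> characters -> octets.
my $chars       = decode('iso-8859-1', $latin1_octets);
my $utf8_octets = encode('UTF-8', $chars);

# from_to() performs the same conversion, but in place on its first argument.
my $in_place = $latin1_octets;
from_to($in_place, 'iso-8859-1', 'UTF-8');

print length($chars), " characters, ", length($utf8_octets), " octets\n";
```

The explicit form makes it obvious at each point whether a variable holds characters or octets, which is usually where the confusion starts.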
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
($_),split(//,$string)));
print $ord;
But, this gives a 3-character string 怜 (with the decimal values
230, 164 and 156). Could anyone please point me to the right direction
on how to get the decimal number 26908 instead?
Thanks in advance.
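One way to get there, sketched under the assumption that the string holds the UTF-8 octets quoted above (230, 164, 156): decode the octets into a character string first, then take ord() of the single resulting character.

```perl
use strict;
use warnings;
use Encode qw(decode);

my $octets = "\xE6\xA4\x9C";             # the bytes 230, 164, 156 from the question
my $char   = decode('UTF-8', $octets);   # one character instead of three octets
my $ord    = ord($char);

print "$ord\n";                          # prints 26908 (U+691C)
```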
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
);
}
'
combining with i: \x{00ee}
combining with dotless i: \x{0131}\x{0302}
What do you think?
Makes sense to me. U+00EE is LATIN SMALL LETTER I WITH CIRCUMFLEX
not LATIN SMALL LETTER DOTLESS I WITH CIRCUMFLEX
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
it is UTF-8 encoded
^^^
Why is that step necessary? encode_utf8() should do that itself on the way ...
$self->{CONTENT} = Encode::encode_utf8($self->{CONTENT}); # make octets
}
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
xmodmap.
Generally this sort of thing needs to be handled below Readline or TK or
whatever.
I think it is do-able as readline or Tk - or even a PerlIO layer:
binmode(STDIN, ":via(combine_accents)");
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
in design.
I quite agree - which is why Encode works the same way :-)
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
Guido Flohr [EMAIL PROTECTED] writes:
Hi,
On Thu, Jul 11, 2002 at 12:15:30PM +0100, Nick Ing-Simmons wrote:
For my Tk application of encode the in-place form causes unnecessary
copies. e.g. I need the original and the form encoded into the encoding
required by the font, or I have to copy
had this problem?
Does your $ENV{LANG} match /utf-?8/i ?
If so then perl5.7.3+ will have assumed utf8 on your behalf...
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
something that is so generic.
Paul
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
far, but passing a variable that contains undef is more
common.
Can this be detected and silenced?
/Paraphrase
Yes it could but we don't for very good reasons.
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
::JIS2K
! lib/Encode/Guess.pm
POD fix by Miyagawa-kun
Message-Id: [EMAIL PROTECTED]
Dan the Encode Maintainer
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
in theory there can be bits for
A. Update src string
B. Use fallbacks
C. Partials as bad chars
D. Use perl QQ
E. Warn on error
F. Croak on error
H. ;-) Use HTML entities as fallbacks
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
rather than a passive bit?
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
//depot/perlio/ext/Encode/Encode.pm#64 -
/home/p4work/perl/perlio/ext/Encode/Encode.pm
Index: perlio/ext/Encode/Encode.pm
--- perlio/ext/Encode/Encode.pm.~1~ Sat Apr 20 20:36:47 2002
+++ perlio/ext/Encode
as there is no certainty that lib / archlib
relative paths work like that. Will tweak Tk's Makefile.PL configure
to hunt down encode.h.
Will do a spelling patch on the pod(s) when I get a chance.
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
--- Encode.xs.ship Fri Apr 19 19:25:26 2002
designed so that you can rely on CRLF
to split the stream.
Dan the Encode Maintainer.
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
Dan Kogai [EMAIL PROTECTED] writes:
On Friday, April 12, 2002, at 02:30 , Nick Ing-Simmons wrote:
Having hacked RFC2047 support into tkmail I have now seen some
non-latin1 characters in a real perl/Tk app.
There seem to be a few snags with mime's iso-2022-jp:
- It failed to demand load
Encode() attitude ( which is
fine, just not my style ), I guess there isn't much to go on for me. ;)
que sera sera
--d
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
a single-byte encoding, this is still
possible without bloating the UCM.
Dan the Encode Maintainer
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
have to beware of is UTF-8 encoding codepoints
in the surrogate range, rather than de-surrogating and encoding the
real code point.
The fixed UCS-2BE works for Tk - but is still a little slower than
it could be. I suggest we do UTF-16XE properly as XS code.
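For reference, the de-surrogating step mentioned above is plain arithmetic. The following is a Perl sketch of it, not the proposed XS code; the function name desurrogate and the sample pair are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: combine a UTF-16 surrogate pair into the real code point.
sub desurrogate {
    my ($hi, $lo) = @_;
    die "not a surrogate pair\n"
        unless $hi >= 0xD800 && $hi <= 0xDBFF
            && $lo >= 0xDC00 && $lo <= 0xDFFF;
    return 0x10000 + (($hi - 0xD800) << 10) + ($lo - 0xDC00);
}

my $cp   = desurrogate(0xD83D, 0xDE00);    # the pair for U+1F600
my $utf8 = encode('UTF-8', chr($cp));      # encode the real code point once

printf "U+%X -> %s\n", $cp, join ' ', map { sprintf '%02X', ord } split //, $utf8;
```

Encoding each surrogate separately would give two 3-octet sequences instead of the single 4-octet sequence produced here.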
--
Nick Ing-Simmons
http://www.ni-s.u
Perl_utf16_to_utf8_reversed(pTHX_ U8* p, U8* d, I32 bytelen, I32 *newlen)
Should be a good starting point for the XS version ;-)
which does first a byteswap and then calls the non-reversed version).
I also can see that the Perl_utf16_to_utf8 is non-EBCDIC aware...
--
Nick Ing-Simmons
http://www.ni-s.u
convert UTF-8 sequences for sequences of
characters - but .ucm would need tweaking to allow
multiple U:
UU \x
We would have to be sure that Unicode was normalized as well.
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
reason .enc's were installed was for Encode::Tcl.
Dan the Encode Maintainer
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
Dan Kogai [EMAIL PROTECTED] writes:
On Monday, April 1, 2002, at 07:33 , Nick Ing-Simmons wrote:
Dan Kogai [EMAIL PROTECTED] writes:
I think I have found the reason why some of the encodings were
missing
from Tcl's *.enc, which later turned into *.ucm.
Apple makes use of Unicode
to come.
Dan the Encode Maintainer
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
MacRumother ...
FYI, I do check83.pl before the release since 0.99 or so
Dan
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
) that the data file is invalid in some
way and that current decode mistakes that for incomplete character...
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
Autrijus Tang [EMAIL PROTECTED] writes:
And then you'll have to disambiguate between that and encoding.pm...
Why aren't we extending encoding.pm instead?
That was my thought as well - that there is overlap with Jarkko's
work on use encoding.
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
.
It would be good to have some algorithmic encodings to use as
examples. The only ones we have at present are UCS-2 (as perl code)
and UTF-8 (C but buried in perl's core).
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
directory the file name must be unique
if truncated to 8.3 - not that all file names must be 8.3
I am fairly sure that is what the check83.pl polices.
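The uniqueness rule can be sketched roughly as follows. This is not the actual check83.pl; the truncation rule (first 8 characters of the base name, first 3 of the extension, case-folded) and the helper names to_83 and collisions_83 are assumptions made for the example.

```perl
use strict;
use warnings;

# Assumed truncation rule: 8 characters of base, 3 of extension, lower-cased.
sub to_83 {
    my ($name) = @_;
    my ($base, $ext) = ($name, '');
    ($base, $ext) = ($1, $2) if $name =~ /^(.+)\.([^.]*)$/;
    my $short = lc substr($base, 0, 8);
    $short .= '.' . lc substr($ext, 0, 3) if length $ext;
    return $short;
}

# Return groups of names (from one directory) that collide after truncation.
sub collisions_83 {
    my @names = @_;
    my %seen;
    push @{ $seen{ to_83($_) } }, $_ for @names;
    return map { [ @{ $seen{$_} } ] } grep { @{ $seen{$_} } > 1 } sort keys %seen;
}

my @clashes = collisions_83(qw(encoding_a.pm encoding_b.pm Encode.pm));
print "clash: @$_\n" for @clashes;
```

Long names are allowed under this rule; only names that become identical after truncation are reported.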
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
is still the long one.
Well, I didn't really enjoy renaming files myself
Dan the Man with Too Many Files to Watch Over
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/