Mohammad Yaseen <[EMAIL PROTECTED]> writes:
>I'm trying to build IO-Tty-1.02 on z/os using perl-5.8.7, i'm getting this
>error messages
>
> Now let's see what we can find out about your system
>(logfiles of failing tests are available in the conf/ dir)...
>FSUM7332 syntax error: got (, expecti
Mohammad Yaseen <[EMAIL PROTECTED]> writes:
> Hi,
>
> I'm using perl-5.8.7.
> What is PLAN9?
An operating system.
> And what is the plan9 directory in the source directory meant for?
Building perl for/on a Plan 9 system.
>
> Thanks and Regards
> Yaseen
>
>
>
>
John Delacour <[EMAIL PROTECTED]> writes:
>use MIME::QuotedPrint;
>$qp = encode_qp ($_, '');
>print "=?UTF-8?Q?$qp?=" . $/;
That isn't quite right.
MIME::QuotedPrint does NOT encode space or tab.
RFC2047 says:
" The "Q" encoding is similar to the "Quoted-Printable" content-
transfer-encodin
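A minimal sketch of the point above, assuming a UTF-8 subject and ignoring the 75-character limit on encoded words: encode the text to octets, hex-escape everything outside a conservative safe set, and write each space as an underscore, which the RFC 2047 "Q" encoding permits. The helper name q_encode_word is hypothetical and not part of MIME::QuotedPrint or Encode.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: RFC 2047 "Q" encoding of one short piece of text.
# It does not split long input into several encoded words.
sub q_encode_word {
    my ($text) = @_;
    my $octets = encode('UTF-8', $text);
    # Escape every octet outside a conservative safe set (space handled next).
    $octets =~ s/([^A-Za-z0-9!*+\-\/ ])/sprintf('=%02X', ord $1)/ge;
    # In "Q" encoding a space may be written as "_".
    $octets =~ tr/ /_/;
    return "=?UTF-8?Q?$octets?=";
}

print q_encode_word("caf\x{e9} au lait"), "\n";   # =?UTF-8?Q?caf=C3=A9_au_lait?=
```

Literal "=", "?" and "_" in the input are escaped as well, so the result cannot be confused with the encoded-word delimiters.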
David Olsson <[EMAIL PROTECTED]> writes:
>What is the easiest way to install Encode for a single
>user?
Same as any other CPAN module.
perl Makefile.PL PREFIX=/home/cedric/perl_modules
make
make install
then:
#!/usr/bin/perl
use lib '/home/cedric/perl_modules';
# or if script is relative to in
Wing <[EMAIL PROTECTED]> writes:
>"John Delacour" <[EMAIL PROTECTED]> wrote in message
>news:[EMAIL PROTECTED]
>> At 12:42 am +0800 28/12/05, wing wrote:
>>
>>>I need to encode the subject line in a MIME header in UTF8 (something like
>>>Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=).
Rajarshi Das <[EMAIL PROTECTED]> writes:
> Hi,
>
> The following two line script gives an error on z/OS : "Unknown encoding
> 'iso-2022-jp' at line ..".
> -
> use Encode; use encoding 'iso-2022-jp';
>
On an EBCDIC platform like z/OS that is going to be one st
Rajarshi Das <[EMAIL PROTECTED]> writes:
>I run the following on an ebcdic platform
>(perl-5.8.6),
>
>$BOM = chr(0xFEFF);
>open(UTF_PL, ">:raw:encoding(utf16le)", "utf.pl")
>or die "utf.pl($enc,$tag): $!";
>print UTF_PL $BOM;
>print UTF_PL "1";
>
>
>
>should the data that is written using
Rajarshi Das <[EMAIL PROTECTED]> writes:
>Hi,
>
>I have a basic doubt regarding unicode and z/OS
>(ebcdic : ibm-1047).
>
>$a = chr(0x00A1);
>
>$b = chr(0xA1);
>
>Should $a and $b be equal or yield different results
>on ebcdic ?
As far as I know they should be the same.
chr() takes a number and t
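A small sketch of why the two should agree: 0x00A1 and 0xA1 are the same numeric literal, so chr() receives the same argument both times. The variable names are illustrative, and this has only been reasoned through for ordinary platforms, not verified on z/OS.

```perl
use strict;
use warnings;

# 0x00A1 and 0xA1 are the same number (161), so chr() is called with
# the same argument and returns the same one-character string.
my $x = chr(0x00A1);
my $y = chr(0xA1);

print "same\n" if $x eq $y;
printf "ord: %d and %d\n", ord($x), ord($y);
```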
Stuart Hughes <[EMAIL PROTECTED]> writes:
>Hi everyone,
>
>I've run into problems matching the regex [^\s] on RedHat 8/9 and the
>version of perl shipped with it (5.8.0).
It isn't 5.8.0, it is 5.8.0-with-RedHatBugs :-(
To be fair to them it is some development track thing - there was
an experimenta
Paul Bijnens <[EMAIL PROTECTED]> writes:
>Can anyone explain what I'm doing wrong?
As I recall HTML::Entities has a build-time option as to whether it handles
Unicode - do you know if yours has that turned on?
What locale are you in (i.e. is it something that has â as a native
8-bit coding (Window
Radoslaw Zielinski <[EMAIL PROTECTED]> writes:
>Hello,
>
>What's the point of lines 151 and 167 in Encode.pm? Respectively:
>
># sub encode
>$_[1] = $string if $check;
>
># sub decode
>$_[1] = $octets if $check;
>
>I really can't see a point in overwriting the input value... Why
Bjoern Hoehrmann <[EMAIL PROTECTED]> writes:
>
>>> Now that we have this problem, introducing more places where one needs
>>> to carefully check the documentation what is considered UTF-8 does not
>>> seem like the best option, having decode_utf8() and decode(utf8=>...)
>>> mean something differe
Bjoern Hoehrmann <[EMAIL PROTECTED]> writes:
>* Bjoern Hoehrmann wrote:
>> Encode 2.08, PerlIO::scalar 0.02, ActivePerl 5.8.2,
>>
>> #!perl -w
>> use strict;
>> use warnings;
>> use Encode;
>>
>> my $string = encode(UTF16 => "");
>>
>> for (qw/UTF-8 UTF-16LE UTF-16BE UTF-32LE UTF-32BE/)
Rick Measham <[EMAIL PROTECTED]> writes:
>That being the case, I grab the charset and use Encode's decode function
>to turn it into 'perl's internal format' .. which in 5.8.5 is utf8
>right?
As it happens the answer is "maybe", but as it is the _internal_ form it is
none of your business ;-) - so
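A short sketch of what that means in practice, assuming the incoming charsets are ISO-8859-1 and UTF-8: once octets have been passed through decode(), the program works with characters and need not care how perl stores them.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The same character (U+00E9) arriving in two different charsets.
my $from_latin1 = decode('iso-8859-1', "\xE9");
my $from_utf8   = decode('UTF-8',      "\xC3\xA9");

# After decoding, both are one-character strings and compare equal,
# regardless of the internal representation.
printf "lengths: %d %d\n", length($from_latin1), length($from_utf8);
print  "equal\n" if $from_latin1 eq $from_utf8;
```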
Paul Bijnens <[EMAIL PROTECTED]> writes:
>I have a program that reads and writes (among others) strings that
>should be utf8 encoded. I say "should", because somewhere deep
>inside the dark corners of that program, sometimes, the utf8 flag on
>a string is lost. (I'm still investigating where, tips
Piyush Shourie <[EMAIL PROTECTED]> writes:
>Hi,
>
>
>
>I am not able to compile Encode module, as one of the pre-requisites of
>Encode module, Text::Iconv does not compile on Windows platform.
When did that happen?
>I am
>currently using ActiveState Perl 5.6.1, and cannot upgrade to the newer
>
Aaron Siladi <[EMAIL PROTECTED]> writes:
>I have a UTF-8 string which I want to output as ascii and have the UTF8
>characters converted to numeric character references.
>
>
>
>I tried using Encode with the FB_HTMLCREFS fail back option enabled, but for
>the 2 byte UTF8 characters, 2 incorrect char
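One possible cause, offered as an assumption since the rest of the message is cut off: if the UTF-8 octets are never decoded, each octet is treated as a separate character and gets its own reference. A sketch that decodes first and then uses the fallback Encode documents as FB_HTMLCREF:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $octets = "caf\xC3\xA9";               # UTF-8 bytes as read from a file or socket
my $chars  = decode('UTF-8', $octets);    # now a 4-character string

# Characters that ASCII cannot represent become decimal references.
my $ascii  = encode('ascii', $chars, Encode::FB_HTMLCREF);

print "$ascii\n";    # caf&#233;
```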
Martin 'Kingpin' Thurn <[EMAIL PROTECTED]> writes:
> It seems to me that the main problem is that Encode does not use IANA
>registered names.
It is supposed to have IANA names as aliases.
>And ebcdic-cp-us didn't work because of a bug in
>I18N::Charset (sorry about that).
> The proper solutio
Dan Kogai <[EMAIL PROTECTED]> writes:
>On Oct 25, 2004, at 03:01, Nick Ing-Simmons wrote:
>> But as Dan said at the start \xF6 on its own (say as octet 1023
>> in a 0..1023 1024-octet buffer) is not a fail.
>> Changing that will make :encoding() layer have problems as buf
Dan Kogai <[EMAIL PROTECTED]> writes:
>On Oct 24, 2004, at 18:34, Rafael Garcia-Suarez wrote:
>> Welcome to backward compatibility hell :)
>
>Hell it was but seems like I came up with a way out (yay).
>
>>> I just want Encode::utf8->decode() to make sure Encode:RETURN_ON_ERR
>>> is
>>> on when the
Rafael Garcia-Suarez <[EMAIL PROTECTED]> writes:
>Dan Kogai wrote:
>> This makes perl-5.8.6 happy but the problem is that I have made
>> Encode::utf8 so that it accepts fallback values like Encode::XS (upon
>> the request by Bjoern Hoehrmann via RT). Encode::utf8 used to return
>> immediately a
Dan Kogai <[EMAIL PROTECTED]> writes:
>On Oct 24, 2004, at 06:41, Rafael Garcia-Suarez wrote:
>> Dan Kogai wrote:
>>> Within less than 24hrs I resorted to release version 2.07. What the
>>> heck. 5.8.6 is soon
>>
>> I applied 2.07 to bleadperl, and looks like something is broken in
>> PerlIO:
Dan Kogai <[EMAIL PROTECTED]> writes:
>On Oct 23, 2004, at 01:04, Bjoern Hoehrmann wrote:
>> C12a in Unicode 4.0.1 notes
>>
>> [...]
>> For example, in UTF-8 every code unit of the form 110xxxxx must be
>> followed by a code unit of the form 10xxxxxx. A sequence such as
>> 110xxxxx 0xxxxxxx is
Bjoern Hoehrmann <[EMAIL PROTECTED]> writes:
>Hi,
>
> What is currently the best way to resolve charset names to use them
>with Encode.pm? I would have expected that e.g.
>
> Encode::decode('ebcdic-cp-us', '')
>
>would just work but it does not appear to know that alias. Then I've
>tried to use I
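Where a charset name is not recognised, one workaround is to register it by hand with Encode::Alias. This sketch assumes that ebcdic-cp-us should resolve to Encode's cp37 encoding (IBM037); that mapping is an assumption of the example, not something stated in the thread.

```perl
use strict;
use warnings;
use Encode qw(decode find_encoding);
use Encode::Alias;    # exports define_alias

# Assumption: ebcdic-cp-us is meant to be Encode's cp37.
define_alias('ebcdic-cp-us' => 'cp37');

my $enc = find_encoding('ebcdic-cp-us') or die "alias not resolved";
print $enc->name, "\n";                            # cp37
print decode('ebcdic-cp-us', "\xC8\x89"), "\n";    # EBCDIC octets for "Hi"
```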
Rick Measham <[EMAIL PROTECTED]> writes:
>G'day Unicode Gurus and other assorted members of the perl Unicode
>community.
>
>I have a script that attempts to collect translations from Babelfish.
>I've posted it below.
>
>It uses LWP::UserAgent to turn an English phrase into Japanese (or any
>other l
Rafael Garcia-Suarez <[EMAIL PROTECTED]> writes:
>> I have a problem avoiding "Malformed UTF-8 character" messages when I use the
>> Switch.pm module on SuSE 9.1 Professional with English or German language
>> settings.
>
>Could we see a snippet of code that demonstrates the problem ?
>The version
Marcin 'Qrczak' Kowalczyk <[EMAIL PROTECTED]> writes:
>$ perl -e 'use encoding "ISO-8859-2"; use open ":encoding(ISO-8859-2)"; print ord($ARGV[0]), chr(260), $ARGV[0], "\n"' Ą
>"\x{00a1}" does not map to iso-8859-2 at -e line 1.
>260Ą\x{00a1}
>
>I don't understand it: ord($ARGV[0]) is 260, chr(260
Jarkko Hietaniemi <[EMAIL PROTECTED]> writes:
>Nick Ing-Simmons wrote:
>> Once we had
>>
>> use encoding qw(locale);
>>
>> But it did not work well as not all locale implementations
>> give the API to return the encoding.
>> (And even en_GB
Marcin 'Qrczak' Kowalczyk <[EMAIL PROTECTED]> writes:
>On Mon, 16-08-2004, at 16:54 +0300, Jarkko Hietaniemi wrote:
>
>> > The encoding pragma partially works. It doesn't influence assumed
>> > encoding of files opened without specifying the encoding, nor handling
>> > of filenames, a
Marcin 'Qrczak' Kowalczyk <[EMAIL PROTECTED]> writes:
>> But there is a simple workaround for that, as perluniintro would tell
>> you: the encoding pragma.
>
>The encoding pragma partially works. It doesn't influence assumed
>encoding of files opened without specifying the encoding, nor handling
>o
Marcin 'Qrczak' Kowalczyk <[EMAIL PROTECTED]> writes:
>On Mon, 16-08-2004, at 11:16 +0100, Nick Ing-Simmons wrote:
>
>> >Perl treats them inconsistently. On one hand they are read from files
>> >and used as filenames without any recoding, which
Dominic Mitchell <[EMAIL PROTECTED]> writes:
>Marcin 'Qrczak' Kowalczyk wrote:
>> This leaves chr() ambiguous, so there should be some other function for
>> making Unicode code points, as chr should probably be kept for
>> compatibility to mean the default encoding.
>
>In the past when I've needed
Marcin 'Qrczak' Kowalczyk <[EMAIL PROTECTED]> writes:
>Should strings without the UTF8 flag be interpreted in the default
>encoding of the current locale or in ISO-8859-1?
This is a tricky question - and status quo is likely to remain
for compatibility reasons.
>
>Perl treats them inconsistently
Erland Sommarskog <[EMAIL PROTECTED]> writes:
>Jean-Michel Hiver ([EMAIL PROTECTED]) writes:
>> Erland Sommarskog wrote:
>>>I am working with an XS module that passes queries to MS SQL Server and
>>>returns data back using SQLOLEDB. MS SQL Server stores Unicode data
>>>as UTF-16. Also, all metadata is
Frank Krout <[EMAIL PROTECTED]> writes:
>I'm trying to support a legacy multilingual website that has been upgraded
>to perl58 and now using PERLIO to properly encode html output. (STDOUT is
>mapped via binmode)
I have had this marked as needing a detailed/reasoned reply now for over a year,
so I
Nicholas Clark <[EMAIL PROTECTED]> writes:
>On Mon, Jun 21, 2004 at 08:46:07AM -0700, Jan Dubois wrote:
>
>> I think it is possible, but it requires someone to both do the work and
>> to argue for it on P5P. Without this "champion", I don't see it
>> happening at all.
>
>Nor do I. But P5P isn't big
Erland Sommarskog <[EMAIL PROTECTED]> writes:
>Jarkko Hietaniemi ([EMAIL PROTECTED]) writes:
>> Nick Ing-Simmons wrote:
>>> This thread started as complaint that perl5 can't read a
>>> script saved as UCS-2/UTF-16 or whatever Windows uses.
>>
>>
Marco Baroni <[EMAIL PROTECTED]> writes:
>A few days ago, I queried this list about my problems with a script
>that finds the charset of Japanese web pages and translates their text
>into utf-8.
>
>The following solution, proposed by Nick Ing-Simmons, worked for my
&
Marco Baroni <[EMAIL PROTECTED]> writes:
>Thanks for your advice... the output does look different, this time,
>but it still doesn't look like utf8... (I get the same error with
>recode).
>
>If somebody could suggest a way to convert to another encoding, or a
>better way to identify the encodin
Jarkko Hietaniemi <[EMAIL PROTECTED]> writes:
>Nick Ing-Simmons wrote:
>
>> Larry Wall <[EMAIL PROTECTED]> writes:
>>
>>>Right now, the meaning of "text" is subject to severe distortions
>>>due to legacy issues. But in the long run, &qu
Larry Wall <[EMAIL PROTECTED]> writes:
>
>Right now, the meaning of "text" is subject to severe distortions
>due to legacy issues. But in the long run, "text" is going to mean
>Unicode, and that probably means a UTF-8 file encoding at least in
>the western world,
Microsoft seem to be somewhat fo
Erland Sommarskog <[EMAIL PROTECTED]> writes:
>I have to admit that I have not completely researched what the documentation
>has to say, but this is not only a question on how, but also on which way
>to take.
>
>I'm working on an XS module that will interact with the SQL Server OLE DB
>Provider, th
Erland Sommarskog <[EMAIL PROTECTED]> writes:
>Nick Ing-Simmons ([EMAIL PROTECTED]) writes:
>> Erland Sommarskog <[EMAIL PROTECTED]> writes:
>>>I would really expect someone to have done this already, but I see no
>>>reference to such a module. Or layer-di
ote that you cannot (in general) "print" the combined string as
either 8859-1 or 8859-2
>
>Thank you.
>
>
>- Original Message -
>From: "Nick Ing-Simmons" <[EMAIL PROTECTED]>
>To: <[EMAIL PROTECTED]>
>Sent: Tuesday, April 13, 2004 11:13 AM
>Subje
Octavian Rasnita <[EMAIL PROTECTED]> writes:
>I have tried the following script:
>
>#!/perl/bin/perl -wC
>
>use Encode;
>
>my $text = Encode::decode('latin2', 'mÃta');
>
>binmode(STDOUT, ":utf8");
>
>print "Content-Type: text/html; Charset=UTF-8\n\n";
>print Encode::encode('utf8', $text);
>
You ha
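The reply is cut off here. One thing the quoted script does, shown below as a sketch rather than as the original answer, is encode twice: once by hand with Encode::encode and once more through the :utf8 layer on STDOUT. In-memory filehandles stand in for STDOUT so the byte counts can be compared.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $text = decode('iso-8859-2', "\xB1");    # one character, U+0105

# Let the :utf8 layer do the encoding.
open my $once, '>:utf8', \my $good or die $!;
print {$once} $text;
close $once;

# Double encoding: encode() by hand, then the layer encodes again.
open my $twice, '>:utf8', \my $bad or die $!;
print {$twice} encode('utf8', $text);
close $twice;

printf "layer only: %d bytes, encode + layer: %d bytes\n",
    length($good), length($bad);
```

Using either the layer or the explicit encode(), but not both, yields the expected two-byte sequence.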
Cremers LMG <[EMAIL PROTECTED]> writes:
>I've used your clear description in 'Encode.html' to convert from cp1252 to
>utf8,
>using the lines:
>
>use Encode;
>open (INPUT,"<:encoding(cp1252)","$in")|| die "FileOpen fail: $in $!\n";
>open (OUT,">:utf8","$out") || die "FileOpen 1 failed: $out : $!\n";
Erland Sommarskog <[EMAIL PROTECTED]> writes:
>
>It seems that the only way out, is to first open the file in plain mode,
binmode I suspect.
>look at the first three bytes, and if it is BOM, close the file, open
>again with the appropriate options and discard the BOM.
You don't have to close it
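A sketch of that approach, assuming only UTF-16 BOMs need to be recognised: open the file raw, read two bytes, and push the matching :encoding layer onto the handle that is already open. The temporary file exists only to make the example self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Build a small UTF-16LE file with a BOM so the sketch is self-contained.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} "\xFF\xFE", "h\0i\0";
close $tmp or die $!;

open my $fh, '<:raw', $path or die "open: $!";
read($fh, my $bom, 2);

# Switch layers on the open handle; the BOM has already been consumed.
if    ($bom eq "\xFF\xFE") { binmode($fh, ':encoding(UTF-16LE)') }
elsif ($bom eq "\xFE\xFF") { binmode($fh, ':encoding(UTF-16BE)') }
else                       { seek($fh, 0, 0) }    # no BOM: rewind, read as bytes

my $text = <$fh>;
close $fh;
print "$text\n";
```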
Erland Sommarskog <[EMAIL PROTECTED]> writes:
>
> open (F, '<:encoding(ucs-2le)', 'räkmacka-ucs2.txt');
>
>And one thing seems just plain wrong to me: The "\n" is written as
>0A 0D to the file, not 000A, 000D. But maybe there is some more manual
>reading I need to do to find out how to do it.
0
Larry Wall <[EMAIL PROTECTED]> writes:
>On Wed, Feb 25, 2004 at 06:19:02PM +0100, Sebastian Lehmann wrote:
>: For this example the search value will be "Ibaïez". Because of the search
>: isn't case-sensitive, all letters should be uppercased, using the uc method.
>
>I don't think this is your probl
Sebastian Lehmann <[EMAIL PROTECTED]> writes:
>Hello,
>
>i use a perl script to search different files. The search values are given
>from a HTML page, the results are displayed on this page, too. The files are
>saved in the UTF16LE format, therefore i will open them with the following
>open command
Andreas Jaekel <[EMAIL PROTECTED]> writes:
>Dear Perl Deities!
>
>I've been trying to figure this out for myself for a couple
>of hours now, but I got to the point where I gave up and decided
>that I'll have to bother you. Hope you don't mind.
>
>My task is the following, and I'm running out of ide
ALexander N. Treyner <[EMAIL PROTECTED]> writes:
>Hi John,
>Your code works perfect.
>But I found one strange thing.
>For example I have next string:
>
> hello hello world
>
>that converted by the mail client to
>
> hello =?windows-1255?Q?=F9=EC=E5=ED_hello_world?=
>
Guido Flohr <[EMAIL PROTECTED]> writes:
>ALexander N. Treyner wrote:
>> Hello All,
>> I'm using utf-8 Postgres database, where I save strings in many languages.
>> I have to match the database with strings encoded in mime base64 or
>> quoted-printable format. Like next:
>> =?utf-8?B?15TXoNeUINee16
Brad Guillory <[EMAIL PROTECTED]> writes:
>Last spring someone committed a patch to fix the tests on windows
>platforms (see Change 18966 by [EMAIL PROTECTED] on 2003/03/14 04:20:51).
>This broke the tests on my Redhat box. Here is a compromise patch:
>
>--- t/enc_module.t.orig 2004-01-28 11:34
Eric Cholet <[EMAIL PROTECTED]> writes:
>On 1 Jan 04, at 17:50, Rafael Garcia-Suarez wrote:
>
>> +(However, and as a limitation of the current implementation, using
>> +C<\w> or C<\W> in a C<[...]> character class will still match
>> +with byte semantics.)
>
>I don't think it applies to \w, only
Jungshik Shin <[EMAIL PROTECTED]> writes:
>> That will work if there's en_GB.UTF-8 available for him in his
>> particular Unixes and assuming using UTF-8 locales won't break other
>> things.
Just so we get this clear. A year or so back I - as a Unicode advocate - tried
to switch to en_GB.utf8. Wi
Jarkko Hietaniemi <[EMAIL PROTECTED]> writes:
>>> What I wish is that the whole current locale system would curl up and
>>> die.
>>
>> As you'd agree, it's only 'encoding' part that has to die.
>
>Oh no, there are plenty of parts in it that I wish would die :-)
>(though the coupling of encoding i
Jungshik Shin <[EMAIL PROTECTED]> writes:
>
> Then, he should switch to en_GB.UTF-8.
I probably will.
>Besides, he implied that
>he still uses ISO-8859-1 for files whose names can be covered by
>ISO-8859-1, which is why I wrote about mixing up two encodings
>in a single file system _under_ his
Jarkko Hietaniemi <[EMAIL PROTECTED]> writes:
>>Let's not 'fix' it (not carve it on a stone), but offer a few
>> well-thought-out options. For instance, Perl may offer (not that these
>> are particularly well-thought-out) 'just treat this as a sequence of
>> octets', 'locale', and 'unicode'. 'l
Ed Batutis <[EMAIL PROTECTED]> writes:
>
>The point I'm trying to make (agreeing with most perl 5 porters I suspect)
>is that supporting Shift-JIS in Perl5 is hopeless.
I seem to recall my Japanese colleagues at TI using it years ago...
just treating it as octets and with a 'jperl' which did a lit
Jungshik Shin <[EMAIL PROTECTED]> writes:
>On Mon, 22 Dec 2003, Jarkko Hietaniemi wrote:
>
>> (AFAIK) W2K and later _are able_ to use UTF-16LE encoded Unicode for
>> filenames,
>> but because of backward compatibility reasons using 8-bit codepages is
>> much
>> more likely.
>
> No. _Both_ NTFS (on
Ed Batutis <[EMAIL PROTECTED]> writes:
>"Jarkko Hietaniemi" <[EMAIL PROTECTED]> wrote in message
>news:[EMAIL PROTECTED]
>
>> You do know that ...
>Yes.
>
>If wctomb or mbtowc are to be used, then Perl's Unicode must be converted
>either to the locale's wide char or to its multibyte.
Locale is pe
Dana Sharvit - M <[EMAIL PROTECTED]> writes:
>Hi ,
>I am using the Encode module (perl 5.8)to convert a string from utf8 to big
>5.
>There is something that I do not understand that I thought you may help
>with:
>The input to the program is a file that contains a utf8 string,
>The encoding works pr
Edward Batutis <[EMAIL PROTECTED]> writes:
>> Also each character when I view it via character
>> listing of IME pad, it has three hex numbers.
>
>Seeing three hex numbers per character is a sure sign you've got utf8. You
>need to convert the characters to the platform encoding before using 'open'
Jarkko Hietaniemi <[EMAIL PROTECTED]> writes:
>> a year ago, there was a discussion on this list about Encode not
>> recognizing "TIS-620" as alias for "iso-8859-11":
>>
>> http://nntp.x.perl.org/group/perl.unicode/1656
>>
>> In the latest release of Encode::Alias (1.38 from Encode 1.9801,
>> inc
SADAHIRO Tomoyuki <[EMAIL PROTECTED]> writes:
>Hello.
>
>For round-trip fidelity, Mac OS CJK encodings include many characters
>with mapping a single character in a Mac OS encoding
>to a sequence of standard Unicode characters.
>(cf. ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/APPLE/README.TXT )
Sadahiro Tomoyuki <[EMAIL PROTECTED]> writes:
>> Are the Unicode character sequences in [1] normalized?
>> Can you explain what the diacritics mean I assume '`^ etc. are tone marks?
>> What do the macron and dot and dots-below signify?
>
>Apparently POJ system uses ten vowels
>(a, e, i, m, ng, o, o
Hank Tt <[EMAIL PROTECTED]> writes:
>Hi,
>
>I'm trying to make a UCM file to feed to enc2xs. The legacy encoding for
>Taiwanese romanization *must* have its code points mapped to Unicode
>character sequences, for the simple reason that the UCS lacks the
>corresponding precomposed characters (and i
John Delacour <[EMAIL PROTECTED]> writes:
>At 11:31 am +0100 16/9/03, [EMAIL PROTECTED] wrote:
>>Dear PERLists,
>>
>>I am running Perl 5.8. and trying to filter out some invalid Unicode
>>characters from Unicoded texts of some South Asian languages. There
>>are 28 such characters in my data (all
Owen Taylor <[EMAIL PROTECTED]> writes:
>On Fri, 2003-08-29 at 11:14, Nick Ing-Simmons wrote:
>> >
>> >We're dropping support for this code and for core X fonts
>> >in the next release of Pango,
>>
>> In favour of what? (FreeType on client
Jungshik Shin <[EMAIL PROTECTED]> writes:
>
> If you want, you can take a look at nsFontMetricsGTK.cpp file
>of mozilla.
Can you pass on my admiration to the Mozilla team - its
handling of these issues in version 1.4 is so much better
than ye-olde Netscape.
>You can view that huge file (over 6
Owen Taylor <[EMAIL PROTECTED]> writes:
>You might want to look at what we did for Pango - see
>pango/modules/basic/tables-big.i in
>ftp://ftp.gtk.org/pub/gtk/v2.2/pango-1.2.5.tar.gz.
[There may come a time when I just give up Tcl/Tk and implement
perl/Tk OO interface on top of gtk instead. But
Dan Kogai <[EMAIL PROTECTED]> writes:
>
>But that is not good enough for cases below because...
>
(Hiragana | Katakana | Han) => 'jisx0208.1990-0'
>
>This is very wrong because jisx0208.1990-0 only contains \p{Han} that
>appears in Japanese (JIS X 0208, to be exact). On the other hand,
>ji
Jarkko Hietaniemi <[EMAIL PROTECTED]> writes:
>On Thu, Aug 28, 2003 at 03:16:20PM +0100, [EMAIL PROTECTED] wrote:
>>
>> Does the existing perl5.8.* Unicode support have a way to efficiently
>> determine which script(s) or block (in unicode sense) a code point belongs
>> to?
>
> use Unicode::
<[EMAIL PROTECTED]> writes:
>On Wed, Aug 27, 2003 at 06:04:48PM +0200, Guido Flohr wrote:
>> Hi,
>>
>> [EMAIL PROTECTED] wrote:
>> >I'm working with a byte oriented protocol, and need to extract byte n1
>> >through
>> >byte n2 from a string.
No problem (honest;-)) (At least in perl5.8 ...)
A b
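A minimal sketch of one way to do this, assuming the protocol carries UTF-8 on the wire: turn the character string into octets with encode(), after which substr() counts bytes. If the data was read from the socket as raw bytes in the first place, the substr() step alone is enough.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $string = "caf\x{e9} ok";               # character string
my $octets = encode('UTF-8', $string);     # c a f C3 A9 space o k

# Bytes n1..n2 inclusive, counting from 0.
my ($n1, $n2) = (3, 4);
my $slice = substr($octets, $n1, $n2 - $n1 + 1);

print join(' ', map { sprintf '%02X', ord } split //, $slice), "\n";   # C3 A9
```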
Dan Kogai <[EMAIL PROTECTED]> writes:
>On Saturday, Aug 9, 2003, at 00:08 Asia/Tokyo, Simon Cozens wrote:
>> This is sad and I ought to know the answer, but...
>>
>> Can someone give me a few quick examples of creating Encode::XS objects
>> to do simple transcoding, from XS?
>
>You should check the
Simon Cozens <[EMAIL PROTECTED]> writes:
>[EMAIL PROTECTED] (Simon Cozens) writes:
>> Can someone give me a few quick examples of creating Encode::XS objects
>> to do simple transcoding, from XS?
>
>I think I expressed myself badly. Perhaps I don't mean "creating" Encode::XS
>objects, but instantia
merged my patched version into mainline (5.9.*) and I would
expect it to be in 5.8.1 as well.
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
Nick Ing-Simmons <[EMAIL PROTECTED]> writes:
>Martin J. Evans <[EMAIL PROTECTED]> writes:
>>>
>>> A socket is a file handle so :
>>>
>>> binmode($sock,":utf8");
>>>
>>> should work.
>>I'm obviously mi
Martin J. Evans <[EMAIL PROTECTED]> writes:
>Dan Kogai wrote:
>> On Tuesday, July 1, 2003, at 05:49 PM, Martin J. Evans wrote:
>>
>>> Nick Ing-Simmons wrote:
>>>
>>>> Martin J. Evans <[EMAIL PROTECTED]> writes:
>>>> A socke
;, 'anything') or
>binmode (FH, ":utf8"))?
A socket is a file handle so :
binmode($sock,":utf8");
should work.
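A self-contained sketch of the suggestion, using socketpair() in place of a real network connection (an assumption made so the example runs on its own; it needs a Unix-like system). Both ends get the :utf8 layer, and a non-ASCII character survives the round trip as a single character.

```perl
use strict;
use warnings;
use Socket;

socketpair(my $left, my $right, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

binmode($left,  ':utf8');
binmode($right, ':utf8');

print {$left} "caf\x{e9}\n";
close $left;                  # flushes the buffered line and signals EOF

my $line = <$right>;
chomp $line;
printf "%d characters, last is U+%04X\n", length($line), ord(substr($line, -1));
```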
>
>I'm using Perl 5.8.0.
>
>Thanks.
>
>Martin
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
e only thing it means these days is "my script is in UTF-8".
And even that is a potential dead-end - scripts in other encodings
don't have a unique pragma so why does UTF-8 ?
>For "all the other" things, I think there can't ever be a consensus
>for "all those
he UTF-EBCDIC stuff. The snag being that perl's 'utf8' encoding
uses core's SvUTF8 scheme - which is just fine if it _IS_ UTF-8.
What we need is for Encode::* to have its _own_ UTF-8 and UTF-EBCDIC
encode/decode independent of what core is using...
>
>Thanks
>Brian
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
e success displaying email using Encode::'s euc-cn
and Unicode fonts, but as I can't read many Chinese characters
this was mainly just as an exercise.
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
modes>
>+hash. Its keys are the caller's packages (or the second-level calling
>+package if the caller is C); the values are hash references
>+with two keys: C holds the input mode, and C for the output
>+mode.
>
> If you have a legacy encoding, you can use the C<:encoding(...)> tag.
>
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
kes NI-XS to fix the prob
The partial char stuff needs the encoding to use the same rules as Encode::XS;
I will take a look if it isn't fixed yet.
>
>Dan the Encode Maintainer
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
s undefined.
>
>Maybe I am misunderstanding Encode's conversion operations, so
>maybe it is a problem with the documentation not being clear about
>this behavior. But IMHO, what I am getting appears to be incorrect.
And IMHO you are getting what I "designed" it to produce ;-)
I strongly recommend doing conversions in two steps explicitly - that way
you can get whatever you want.
I am also willing to concede that documentation could be improved :-)
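A sketch of the two-step approach, assuming the input is ISO-8859-2 and the desired output is UTF-8: decode the octets to characters first, then encode the characters into the target encoding. Each step can be given its own error handling.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $latin2_octets = "\xB1";                               # U+0105 in ISO-8859-2

my $chars       = decode('iso-8859-2', $latin2_octets);   # step 1: octets -> characters
my $utf8_octets = encode('UTF-8', $chars);                # step 2: characters -> octets

printf "U+%04X -> %s\n", ord($chars),
    join ' ', map { sprintf '%02X', ord } split //, $utf8_octets;   # U+0105 -> C4 85
```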
>
>--ewh
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
is (almost) by design - i.e. it happened that way and
I decided it made a kind of sense. Using ASCII is considered as
asking for 7-bit ness. If you want one of 8-bit super-sets use the
one you want (iso8859-1 aka latin1 most likely, but perhaps one
of the windows ones with smart quotes, m-dash etc.)
There is a good case for a "latin-guess" or latin-superset or ...
which tries to do the right thing.
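A short sketch of the difference, assuming decode() is called without a CHECK argument: an octet above 0x7F has no meaning in 'ascii' and is replaced by the substitution character, while 'iso-8859-1' maps it to the corresponding character.

```perl
use strict;
use warnings;
use Encode qw(decode);

my $byte = "\xE9";

my $as_ascii  = decode('ascii',      $byte);   # outside 7 bits: substituted
my $as_latin1 = decode('iso-8859-1', $byte);   # maps to U+00E9

printf "ascii: U+%04X, latin1: U+%04X\n", ord($as_ascii), ord($as_latin1);
```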
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
Dan Kogai <[EMAIL PROTECTED]> writes:
>On Monday, Nov 4, 2002, at 19:17 Asia/Tokyo, Nick Ing-Simmons wrote:
>> Someone could/should write a generic test that pushes all codepoints
>> supported by a .ucm file both ways through the generated encoder
>> and checks for c
and checks for correctness. This would be a pointless thing to do
as part of perl's "make test" as once the "compiler" works it works,
but would be useful for folk working on the compile process.
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
think it would be
useful to have something which will print them out from the internal
form.
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
Dan Kogai <[EMAIL PROTECTED]> writes:
>On Sunday, Oct 20, 2002, at 22:49 Asia/Tokyo, Nick Ing-Simmons wrote:
>> Attached is patch that implements ->decode and ->encode of
>> Encode::utf8 as XS code that obeys all the rules that Encode::XS does.
>> This allows :
ke the new tables.
Tcl/Tk can fight its own battles (though once I have a solid Tk804
I will be offering them patches). I don't think cp932 is going
to affect any fonts (Windows fonts being Unicode indexed and
X11 fonts needing a fixed width encoding).
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
Attached is patch that implements ->decode and ->encode of
Encode::utf8 as XS code that obeys all the rules that Encode::XS does.
This allows :encoding(UTF-8) to handle partial chars at end of buffers
correctly.
Submitted as
//depot/perlio/...@18032
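The buffer-boundary problem can be illustrated outside PerlIO with Encode's documented FB_QUIET idiom. This is a sketch of the idea, not the patch itself: when a buffer ends in the middle of a multi-byte sequence, decode() returns the complete characters and leaves the unfinished tail in the source buffer for the next round.

```perl
use strict;
use warnings;
use Encode qw(decode);

my $buffer = "caf\xC3";     # ends halfway through a two-byte sequence
my $text   = decode('UTF-8', $buffer, Encode::FB_QUIET);
# $text holds the complete characters; the incomplete tail stays in $buffer.

$buffer .= "\xA9";          # the rest of the character arrives
$text   .= decode('UTF-8', $buffer, Encode::FB_QUIET);

printf "%d characters, %d bytes left over\n", length($text), length($buffer);
```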
--
Nick Ing-Simmons
http://www.
Nadim <[EMAIL PROTECTED]> writes:
>On Sunday 13 October 2002 14:45, Nick Ing-Simmons wrote:
>> I am using 5.6.3 on windows from activestate. I do the
>> >following.
>>
>> I don't think you are. As far as I am aware there is only perl5.6.1
>> there i
so horrible I can't recall it).
For perl5.8 this is easy - it was a major goal of perl5.8.
>3/ compare both strings and act upon the comparison
Once you have two Unicode strings this is easy.
>
>if the string I get from ole _is_ unicode (and it seems so)
What leads you to that conclusion?
>how can I
>flatten it to binary? I tried with unpack without success.
>
>Nadim.
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
ENCODE_LEAVE_SRC)
>is just what I wanted, because it LEAVEs those chars in SRC that
>ENCODE_NOREP... but unfortunately no, it leaves all source string
>untouched unconditionally.
>
>Thanks in advance for any clues.
>
>If my English and/or my question is far from clear, please tell me and
>I'll do my best to rewrite it in other words.
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
default to their
>preferred MIME names, all in lowercase. Maybe the unique ID number
>("MIBenum") could also be taken into account.
I have no objection to that - and I doubt Dan will either.
Would you care to at least enumerate the cases we fail - or ideally
provide patch(es) ?
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/