Thank you Oldes,
That's much more than I expected!
Anyway, checking the old version was a chance to improve my understanding
(I'm also new to Unicode...).
Cheers,
Alain.
----- Original Message -----
From: "rebOldes" <[EMAIL PROTECTED]>
To: "Alain Goyé" <[EMAIL PROTECTED]>
Sent: Wednesday, October 20, 2004 2:35 PM
Subject: [REBOL] Re: UTF-8
>
> Hello Alain,
>
> Sunday, October 17, 2004, 7:27:41 PM, you wrote:
>
> AG> Hi all,
>
> AG> I got interested in manipulating Unicode with REBOL and tried the
> AG> UTF-8 script by Jan Skibinski.
>
> AG> It seems there is an error in the encode function, which did not
> AG> convert my test case correctly: the first letter of the Khmer
> AG> alphabet, whose code point is U+1780, should become #{E19E80} in
> AG> UTF-8, according to my understanding (based on
> AG> http://www.zvon.org/tmRFC/RFC2279/Output/chapter2.html).
>
> AG> In case it may be helpful to someone, this version should work
> AG> (though not optimized and tested only with k=2 on U+1780 :-) :
>
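For reference, the U+1780 test case quoted above can be checked by hand. The following is only a minimal sketch in Python (not REBOL, and not the code from utf-8.r); the function name utf8_encode is made up for illustration. It follows the UTF-8 bit layout, limited to the 4-byte maximum of current UTF-8 rather than the 6-byte forms that RFC 2279 still allowed.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode a single Unicode code point as UTF-8 (illustrative helper)."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a valid Unicode scalar value: %r" % cp)
    if cp < 0x80:
        # 1 byte: 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


# U+1780 = 0001 0111 1000 0000 -> 0001 | 011110 | 000000 -> E1 9E 80
print(utf8_encode(0x1780).hex().upper())
```

This agrees with the #{E19E80} value given in the quoted message, and with Python's built-in "\u1780".encode("utf-8").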
> Hi, it looks like you were using an older version. My latest
> utf-8.r script is available here:
>
> http://oldes.multimedia.cz/rss/projects/utf-8_latest.rip (4kB)
>
> I removed the to-ucs2 function as I'm using this ucs2.r script:
>
> http://oldes.multimedia.cz/rss/projects/ucs2_latest.rip ( 2.5MB !!!)
>
> The archive is pretty large because it includes all the charmaps I
> collected, together with pre-generated Rebol parsing rules for them.
>
> I use only cp1250 and ISO-8859-2, so I'm not sure whether the others
> work properly, but they should if the included charmap sources are
> correct.
>
> So if I need to convert a text written in 'cp1250' to utf-8, I do:
>
> ucs2/load-rules "cp1250"
> utf-8/encode-2 ucs2/encode "text with special char Š"
>
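The same cp1250 to UTF-8 step can be cross-checked outside REBOL. This is just a sketch using Python's standard codecs, not the ucs2.r / utf-8.r scripts; the variable names are made up, and the only assumption is that the special character is byte 8A in cp1250, i.e. Š (U+0160).

```python
# Bytes as they would be stored in a cp1250 file; 0x8A is "Š" in cp1250.
raw = b"text with special char \x8a"

# Step 1: cp1250 bytes -> Unicode code points (the role ucs2/encode plays above).
text = raw.decode("cp1250")

# Step 2: Unicode code points -> UTF-8 bytes (the role utf-8/encode-2 plays above).
utf8 = text.encode("utf-8")

print(hex(ord(text[-1])))   # code point of the last character
print(utf8[-2:].hex())      # its UTF-8 byte sequence
```

The single byte 8A becomes the two-byte sequence C5 A0, since U+0160 falls in the two-byte UTF-8 range.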
>
> Theoretically I can also change the encoding of a text:
>
> ucs2/load-rules "cp1250"
> ucstext: ucs2/encode "text with special char Š"
> ucs2/load-rules "iso-8859-2"
> to-string ucs2/decode ucstext
>
> == "text with special char Š"   ; Š is now byte A9 (ISO-8859-2), it was 8A in cp1250
>
> (but I have never used this, so it's not tested at all, and there may
> be a problem if the text contains Unicode chars which the decoder
> rule doesn't know)
>
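The caveat about characters the target charset doesn't know is easy to demonstrate. Below is a small sketch in Python with its built-in codecs (again not the ucs2.r script); transcode is a hypothetical helper name, and the euro sign is just one example of a character that cp1250 has and ISO-8859-2 lacks.

```python
def transcode(data: bytes, src: str, dst: str, errors: str = "strict") -> bytes:
    """Re-encode bytes from charset src to charset dst via Unicode.

    errors is passed to the encoding step only: "strict" raises on
    characters missing from dst, "replace" substitutes "?".
    """
    return data.decode(src).encode(dst, errors)


# "Š" exists in both charsets: byte 8A in cp1250, byte A9 in ISO-8859-2.
print(transcode(b"\x8a", "cp1250", "iso8859_2").hex())

# The euro sign is byte 80 in cp1250 but has no slot in ISO-8859-2.
try:
    transcode(b"\x80", "cp1250", "iso8859_2")
except UnicodeEncodeError as err:
    print("cannot encode:", err.reason)

# With errors="replace" the unknown character degrades to "?".
print(transcode(b"\x80", "cp1250", "iso8859_2", errors="replace"))
```

Whether to fail or to substitute is a design choice; failing loudly is usually safer when the text must round-trip.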
> In the UCS2 archive there is also a script which creates PHP code for
> ucs2 encoding (according to the charmap you need), as I was missing
> this in my PHP build.
>
> Isn't Rebol a great tool? :)
>
> Feel free to let me know if you run into any trouble.
>
> Cheers, Oldes
>
> PS: I'm still a Unicode newbie! I just made a script which works
> the way I need it to, that's all.
>
> --
> To unsubscribe from the list, just send an email to rebol-request
> at rebol.com with unsubscribe as the subject.