[REBOL] Re: utf8-encode

2002-06-07 Thread Bohdan R. Rau

On Jun 07 at 00:04 RebOldes wrote:

> R>   nobody has utf8-encoder?
> R>   probably will have to write one by myself
[cut code]

Very clever... but it's not UTF-8. Don't you think you should first
translate the ISO-8859-2 characters into UCS-2 (Unicode code points)?

ethanak
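
The objection above can be sketched in C: an ISO-8859-2 byte is not its own Unicode code point, so it has to be mapped first and only then encoded. This is a minimal sketch, not code from the thread. The names `latin2_to_ucs` and `utf8_put2` are hypothetical, and the table lists only five Czech letters for illustration; a real converter would need the full ISO-8859-2 table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: map one ISO-8859-2 byte to a Unicode code point.
   Only five Czech letters are listed. Every other byte is passed through
   unchanged, which is correct for ASCII and for the letters that share
   their position with ISO-8859-1, but NOT for the rest of the high bytes. */
unsigned latin2_to_ucs(unsigned char b)
{
    switch (b) {
    case 0xEC: return 0x011B; /* e with caron */
    case 0xB9: return 0x0161; /* s with caron */
    case 0xE8: return 0x010D; /* c with caron */
    case 0xF8: return 0x0159; /* r with caron */
    case 0xBE: return 0x017E; /* z with caron */
    default:   return b;
    }
}

/* Encode a code point below 0x800 as UTF-8; returns bytes written (1 or 2). */
size_t utf8_put2(unsigned c, unsigned char *out)
{
    assert(c < 0x800);
    if (c < 0x80) {
        out[0] = (unsigned char)c;
        return 1;
    }
    out[0] = (unsigned char)(0xC0 | (c >> 6));
    out[1] = (unsigned char)(0x80 | (c & 0x3F));
    return 2;
}
```

With the mapping step, byte 0xE8 becomes U+010D and encodes as C4 8D. Encoding the raw byte value 0xE8 instead gives C3 A8, which is a different letter (e with grave).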


-- 
To unsubscribe from this list, please send an email to
[EMAIL PROTECTED] with "unsubscribe" in the 
subject, without the quotes.




[REBOL] Re: utf8-encode

2002-06-07 Thread Christopher Dicely


--- RebOldes <[EMAIL PROTECTED]> wrote:

> shift: func [
>     "Takes a base-2 binary string and shifts bits"
>     data [string! binary!] places [integer!] /left /right
> ][
>     data: enbase/base data 2
>     either right [
>         remove/part tail data negate places
>         data: head insert/dup head data #"0" places
>     ][
>         remove/part data places
>         insert/dup tail data #"0" places
>     ]
>     return debase/base data 2
> ]

Why convert to a binary string? Why not something
like:

shift: func [
  "shifts bits in an integer, by default to the right"
  data [integer!] places [integer!] /right /left
] [
  ; ** yields a decimal!, so truncate the product back to an integer
  return to-integer data * (2 ** either left [places] [0 - places])
]
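
For comparison, the same idea can be sketched in C with integer arithmetic only. This is an illustration under my own assumptions, not code from the thread; `power_of_two`, `shift_left` and `shift_right` are hypothetical names.

```c
#include <assert.h>

/* 2 raised to `places`, computed by repeated doubling (no shift operator). */
unsigned long power_of_two(unsigned places)
{
    unsigned long p = 1;
    assert(places < 31);   /* keep the result inside 32 bits */
    while (places-- > 0)
        p *= 2;
    return p;
}

/* Left shift as multiplication by a power of two. */
unsigned long shift_left(unsigned long data, unsigned places)
{
    return data * power_of_two(places);
}

/* Right shift as division: integer division truncates, so the bits
   shifted out are simply dropped. */
unsigned long shift_right(unsigned long data, unsigned places)
{
    return data / power_of_two(places);
}
```

One difference worth noting: as far as I know, `**` in REBOL 2 returns a decimal!, so a right shift done by multiplying with a negative power leaves a fraction (232 shifted right by 6 comes out as 3.625) and has to be truncated to get the 3 that a real shift would give.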





[REBOL] Re: utf8-encode

2002-06-07 Thread Volker Nitsch

On Friday, 7 June 2002 00:04, RebOldes wrote:
> R>   nobody has utf8-encoder?
> R>   probably will have to write one by myself
>
> ok... why is there NO native right/left shift function in Rebol?!
>

Because it can be replaced by multiplication or division by powers of 2?
It is rarely needed, and performance is not that critical. Something like
[ shift: [ 1 2 4 8 16 ..]   my-number * shift/2 ] ?
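
A table like that is enough for the two-byte UTF-8 case without any shift operator. Here is a rough C sketch of the idea, under my own assumptions: `shift_table` and `utf8_two_bytes` are hypothetical names, and the table is indexed from 0 in C, whereas the REBOL path `shift/2` picks the second element.

```c
#include <assert.h>

/* Powers of two, indexed by shift distance (index 0 means "no shift"). */
static const unsigned shift_table[] = { 1, 2, 4, 8, 16, 32, 64, 128, 256 };

/* Two-byte UTF-8 sequence for 0x80 <= c < 0x800, using only / % and +.
   The additions cannot carry: c / 64 is below 32 and c % 64 is below 64,
   so + gives the same result as | here. */
void utf8_two_bytes(unsigned c, unsigned char out[2])
{
    assert(c >= 0x80 && c < 0x800);
    out[0] = (unsigned char)(0xC0 + c / shift_table[6]); /* 0xC0 | c >> 6   */
    out[1] = (unsigned char)(0x80 + c % shift_table[6]); /* 0x80 | c & 0x3F */
}
```

Division by `shift_table[6]` plays the role of `c >> 6`, and the remainder plays the role of `c & 0x3F`.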






[REBOL] Re: utf8-encode

2002-06-06 Thread RebOldes

R>   nobody has utf8-encoder?
R>   probably will have to write one by myself

ok... why is there NO native right/left shift function in Rebol?!

here is how utf8 works:
putwchar(c)
{
  if (c < 0x80) {
    putchar (c);
  }
  else if (c < 0x800) {
    putchar (0xC0 | c>>6);
    putchar (0x80 | c & 0x3F);
  }
  else if (c < 0x10000) {
    putchar (0xE0 | c>>12);
    putchar (0x80 | c>>6 & 0x3F);
    putchar (0x80 | c & 0x3F);
  }
  else if (c < 0x200000) {
    putchar (0xF0 | c>>18);
    putchar (0x80 | c>>12 & 0x3F);
    putchar (0x80 | c>>6 & 0x3F);
    putchar (0x80 | c & 0x3F);
  }
}
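
For checking the output of any port against known byte sequences, a variant that writes into a buffer instead of calling putchar is easier to work with. The following is a sketch along the lines of the function above, not a drop-in replacement: `utf8_encode_cp` is a hypothetical name, and the upper bound of 0x110000 is my choice (the limit set by RFC 3629) rather than the 0x200000 used above.

```c
#include <assert.h>
#include <stddef.h>

/* Write the UTF-8 form of code point c into out (room for 4 bytes).
   Returns the number of bytes written, or 0 if c is above U+10FFFF.
   Surrogate code points (U+D800..U+DFFF) are not rejected in this sketch. */
size_t utf8_encode_cp(unsigned long c, unsigned char *out)
{
    assert(out != NULL);
    if (c < 0x80) {
        out[0] = (unsigned char)c;
        return 1;
    }
    if (c < 0x800) {
        out[0] = (unsigned char)(0xC0 | (c >> 6));
        out[1] = (unsigned char)(0x80 | (c & 0x3F));
        return 2;
    }
    if (c < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (c >> 12));
        out[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (c & 0x3F));
        return 3;
    }
    if (c < 0x110000) {
        out[0] = (unsigned char)(0xF0 | (c >> 18));
        out[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (c & 0x3F));
        return 4;
    }
    return 0;
}
```

For example, U+00E9 should come out as C3 A9 and U+20AC as E2 82 AC.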

and here is my Rebol version:

rebol [
title: "UTF-8 encode"
purpose: {Encodes the string data to UTF-8}
author: "oldeS"
email: [EMAIL PROTECTED]
date: 7-Jun-2002/0:03:27+2:00
usage: {
>> utf8-encode "czech chars: ìšèøžýáíé"
== "czech chars: ìšèøžýáíé"}
comment: {More info: http://czyborra.com/utf/ }
]
shift: func [
    "Takes a base-2 binary string and shifts bits"
    data [string! binary!] places [integer!] /left /right
][
    data: enbase/base data 2
    either right [
        remove/part tail data negate places
        data: head insert/dup head data #"0" places
    ][
        remove/part data places
        insert/dup tail data #"0" places
    ]
    return debase/base data 2
]

utf8-encode: func[
    "Encodes the string data to UTF-8"
    str [any-string!] "string to encode"
    /local c
][
    str: to-binary str
    forall str [
        ; only bytes above #{7F} need two bytes; ASCII passes through unchanged
        if #{7F} < c: to-binary to-char first str [
            remove str
            insert str join (#{c0} or shift/right c 6) (c and #{3F} or #{80})
            str: next str  ; skip the first inserted byte, forall steps over the second
        ]
    ]
    to-string head str
]


-- 
>>do [send to-email join 'oliva [EMAIL PROTECTED] "BESsssT REgArrrD, RebOldes"]

