Re: [racket-users] Constructing unicode surrogates

2016-10-08 Thread Jens Axel Søgaard
2016-10-08 20:00 GMT+02:00 Ryan Culpepper:

> Does one of the `string-normalize-*` functions do what you want?
>

No. I am basically implementing the part of the reader that lexes string
literals. I want to allow full Racket syntax in the string literals in my
infix package. One solution is to use (read (open-input-string lexeme)),
but I am afraid that will be slow.
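
For reference, the reader-based approach mentioned above might look like the following minimal sketch. The helper name `lexeme->string/read` is hypothetical, and the sketch assumes the lexeme still carries its surrounding double quotes. It says nothing about speed; it only shows that the reader already combines the two escapes into one character.

```racket
#lang racket/base

;; Hypothetical helper: decode a string-literal lexeme (including its
;; surrounding double quotes) by handing it to the standard reader.
(define (lexeme->string/read lexeme)
  (read (open-input-string lexeme)))

;; The lexeme is the source text "\ud800\udc00", quotes included.
(define decoded (lexeme->string/read "\"\\ud800\\udc00\""))

(string-length decoded)                 ; 1
(char->integer (string-ref decoded 0))  ; 65536, i.e. #x10000
```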

The job is therefore to turn "\\ud800\\udc00" into "\ud800\udc00" (the
latter a legal string literal).

Since

(string (integer->char (string->number "d800" 16))
  (integer->char (string->number "dc00" 16)))

produces the error

integer->char: contract violation
  expected: (and/c (integer-in 0 #x10FFFF) (not/c (integer-in #xD800
#xDFFF)))
  given: 55296

it is necessary to produce the character directly.
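
A minimal sketch of that step, assuming only surrogate-pair escapes need handling: both helper names are hypothetical, and the regular expression is a simplification that ignores a preceding escaped backslash (a real lexer would process escapes left to right). Each `\uXXXX\uXXXX` pair is combined arithmetically, so `integer->char` never sees a lone surrogate.

```racket
#lang racket/base

;; Hypothetical helper: combine a UTF-16 surrogate pair into one character.
(define (surrogate-pair->char hi lo)
  (integer->char
   (+ #x10000
      (arithmetic-shift (bitwise-and hi #x03ff) 10)
      (bitwise-and lo #x03ff))))

;; Hypothetical helper: replace every \uXXXX\uXXXX surrogate-pair escape
;; in the text with the single character it denotes. All other text,
;; including other escapes, is left untouched.
(define (decode-surrogate-escapes text)
  (regexp-replace*
   #px"\\\\u([dD][89abAB][0-9a-fA-F]{2})\\\\u([dD][c-fC-F][0-9a-fA-F]{2})"
   text
   (lambda (all hi lo)
     (string (surrogate-pair->char (string->number hi 16)
                                   (string->number lo 16))))))

;; The input is the 12 characters \ud800\udc00; the result has one character.
(string-length (decode-surrogate-escapes "\\ud800\\udc00"))
```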

/Jens Axel

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] Constructing unicode surrogates

2016-10-08 Thread Jens Axel Søgaard
Thanks, I can work with that.

/Jens Axel


2016-10-08 19:44 GMT+02:00 Jon Zeppieri:

>
>
> On Sat, Oct 8, 2016 at 1:06 PM, Jens Axel Søgaard 
> wrote:
>
>> Hi All,
>>
>> The following interaction shows how the reader can be used to construct a
>> surrogate character:
>>
>> > (string-ref "\ud800\udc00" 0)
>> #\𐀀
>>
>> Given the two hexadecimal numbers d800 and dc00 how do I
>> construct the surrogate character directly?
>>
>> /Jens Axel
>>
>>
> I think you have to do the arithmetic yourself. Something like:
>
> (define (utf-16-surrogate-pair->char hi lo)
>   (integer->char
>    (+ #x10000
>       (arithmetic-shift (bitwise-and hi #x03ff) 10)
>       (bitwise-and lo #x03ff))))
>
>
>
>


-- 
-- 
Jens Axel Søgaard



Re: [racket-users] Constructing unicode surrogates

2016-10-08 Thread Ryan Culpepper

Does one of the `string-normalize-*` functions do what you want?

Ryan


On 10/08/2016 01:06 PM, Jens Axel Søgaard wrote:

Hi All,

The following interaction shows how the reader can be used to construct
a surrogate character:

 > (string-ref "\ud800\udc00" 0)
 #\𐀀

Given the two hexadecimal numbers d800 and dc00 how do I
construct the surrogate character directly?

/Jens Axel



Re: [racket-users] Constructing unicode surrogates

2016-10-08 Thread Jon Zeppieri
On Sat, Oct 8, 2016 at 1:06 PM, Jens Axel Søgaard 
wrote:

> Hi All,
>
> The following interaction shows how the reader can be used to construct a
> surrogate character:
>
> > (string-ref "\ud800\udc00" 0)
> #\𐀀
>
> Given the two hexadecimal numbers d800 and dc00 how do I
> construct the surrogate character directly?
>
> /Jens Axel
>
>
I think you have to do the arithmetic yourself. Something like:

(define (utf-16-surrogate-pair->char hi lo)
  (integer->char
   (+ #x10000
      (arithmetic-shift (bitwise-and hi #x03ff) 10)
      (bitwise-and lo #x03ff))))
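
As a usage sketch, here is a variant of the function above with range checks, applied to the two numbers from the question. The `/checked` name and the checks themselves are additions for illustration, not part of the suggestion. Subtracting the range bases is equivalent to masking with #x03ff once the inputs are known to be in range.

```racket
#lang racket/base

;; Hypothetical variant: same arithmetic, but reject inputs that are not
;; a high surrogate followed by a low surrogate.
(define (utf-16-surrogate-pair->char/checked hi lo)
  (unless (<= #xD800 hi #xDBFF)
    (raise-argument-error 'utf-16-surrogate-pair->char/checked
                          "high surrogate in [#xD800, #xDBFF]" hi))
  (unless (<= #xDC00 lo #xDFFF)
    (raise-argument-error 'utf-16-surrogate-pair->char/checked
                          "low surrogate in [#xDC00, #xDFFF]" lo))
  (integer->char
   (+ #x10000
      (arithmetic-shift (- hi #xD800) 10)
      (- lo #xDC00))))

;; d800/dc00 encodes the first code point above the Basic Multilingual Plane.
(char->integer
 (utf-16-surrogate-pair->char/checked (string->number "d800" 16)
                                      (string->number "dc00" 16)))
;; => 65536, i.e. #x10000
```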
