Re: [Bug-apl] How do I convert a byte sequence to Unicode?

Elias Mårtenson Sun, 27 Apr 2014 22:01:17 -0700

To convert byte values to code points, you need to apply an encoding
algorithm, and that's kind of messy.

(I believe the rest of GNU APL kind of assumes that UTF-8 is the standard
encoding used, which does make things simpler).
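To make the distinction between byte values and code points concrete, here is a minimal sketch in Python (not GNU APL), assuming the file is UTF-8 encoded. The helper name bytes_to_code_points is made up for illustration; the byte values are the UTF-8 encoding of the 'foo⍉bar' string quoted further down.

```python
# Byte values as a file read might return them: the UTF-8 encoding of 'foo⍉bar'.
# The character ⍉ (U+2349, decimal 9033) occupies three bytes: 226 141 137.
byte_values = [102, 111, 111, 226, 141, 137, 98, 97, 114]


def bytes_to_code_points(values, encoding="utf-8"):
    """Decode a sequence of byte values into a list of Unicode code points.

    Hypothetical helper for illustration only; the encoding is an assumption.
    """
    text = bytes(values).decode(encoding)
    return [ord(ch) for ch in text]


code_points = bytes_to_code_points(byte_values)
text = "".join(chr(cp) for cp in code_points)
```

Nine byte values go in and seven code points come out, which is why the bytes cannot be handed to monadic ⎕UCS directly.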

I have a suggestion: make ⎕UCS support a dyadic form where the left-hand
side specifies the encoding to use, i.e.:

*'UTF-8' ⎕UCS 99 100 101 102*
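As a rough illustration of what such a dyadic form might do, here is a sketch in Python rather than APL. The function name ucs_dyadic and its behaviour are assumptions made for this example; they are not an existing GNU APL feature, and the proposal above does not pin down the exact semantics.

```python
def ucs_dyadic(encoding, byte_values):
    """Hypothetical model of the proposed dyadic form: encoding ⎕UCS bytes.

    The left argument names the encoding; the right argument is a
    sequence of byte values. The result is the decoded string.
    """
    return bytes(byte_values).decode(encoding)


# The example from the proposal: four ASCII-range bytes.
ascii_result = ucs_dyadic("UTF-8", [99, 100, 101, 102])

# The same character needs different byte sequences in different encodings,
# which is why the encoding has to be stated explicitly.
utf8_result = ucs_dyadic("UTF-8", [195, 169])
latin1_result = ucs_dyadic("ISO-8859-1", [233])
```

Both of the last two calls yield the single character é (code point 233), even though the input byte sequences differ.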

Handling multiple encodings is easily done through the *libiconv* library.
I worked on it when I made some improvements to its Common Lisp
integration. It's quite simple to use.

Regards,
Elias

On 28 April 2014 12:49, David B. Lamkins <dlamk...@gmail.com> wrote:

> That's close, but libfileio[8] returns a sequence of byte values; not
> code points.
>
> On Mon, 2014-04-28 at 12:19 +0800, Elias Mårtenson wrote:
> > Use the quad function ⎕UCS:
> >
> >
> >       ⎕UCS 'foo⍉bar'
> > 102 111 111 9033 98 97 114
> >       ⎕UCS 102 111 111 9033 98 97 114
> > foo⍉bar
> >
> >
> > Regards,
> > Elias
> >
> >
> > On 28 April 2014 12:17, David B. Lamkins <dlamk...@gmail.com> wrote:
> >         I can use lib_file_io to read a sequence of byte values from a
> >         file
> >         containing Unicode text.
> >
> >         How do I convert that sequence back to a Unicode string in GNU
> >         APL?
> >
> >
> >
> >
> >
>
>
>
