I take it that this table describes the encoding of the byte stream: http://en.wikipedia.org/wiki/UTF-8#Description
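
Read that way, the table boils down to: look at the high bits of the lead byte to find the sequence length, then append six payload bits from each continuation byte. Below is a minimal Python sketch of that reading, not anything GNU APL provides. The function name `utf8_decode` is made up for illustration. It assumes sequences of at most four bytes (the RFC 3629 limit) and does not reject overlong encodings or surrogate code points.

```python
def utf8_decode(byte_values):
    """Decode a sequence of UTF-8 byte values (ints 0..255) into code points.

    Illustrative sketch only: handles 1- to 4-byte sequences and does not
    reject overlong encodings or surrogates.
    """
    code_points = []
    i = 0
    n = len(byte_values)
    while i < n:
        lead = byte_values[i]
        if lead < 0x80:                 # 0xxxxxxx: plain ASCII
            length, cp = 1, lead
        elif lead >> 5 == 0b110:        # 110xxxxx: 2-byte sequence
            length, cp = 2, lead & 0x1F
        elif lead >> 4 == 0b1110:       # 1110xxxx: 3-byte sequence
            length, cp = 3, lead & 0x0F
        elif lead >> 3 == 0b11110:      # 11110xxx: 4-byte sequence
            length, cp = 4, lead & 0x07
        else:
            raise ValueError("invalid lead byte %d at index %d" % (lead, i))
        if i + length > n:
            raise ValueError("truncated sequence at index %d" % i)
        for b in byte_values[i + 1:i + length]:
            if b >> 6 != 0b10:          # continuation bytes are 10xxxxxx
                raise ValueError("invalid continuation byte %d" % b)
            cp = (cp << 6) | (b & 0x3F)
        code_points.append(cp)
        i += length
    return code_points


# 'foo⍉bar' as UTF-8 bytes; ⍉ (U+2349) is encoded as 226 141 137.
print(utf8_decode([102, 111, 111, 226, 141, 137, 98, 97, 114]))
```

The resulting code points are the kind of integer vector that monadic ⎕UCS turns back into characters.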

(I might actually attempt this in APL, just to see whether I can do it
while waiting for a built-in translation...)

On Sun, Apr 27, 2014 at 10:00 PM, Elias Mårtenson <loke...@gmail.com> wrote:

> To convert byte values to code points, you need to apply an encoding
> algorithm, and that's kind of messy.
>
> (I believe the rest of GNU APL kind of assumes that UTF-8 is the standard
> encoding used, which does make things simpler).
>
> I have a suggestion: Make ⎕UCS support a dyadic form where the left-hand
> side specifies the encoding to use. I.e:
>
> *'UTF-8' ⎕UCS 99 100 101 102*
>
> Handling multiple encodings is easily done through the *libiconv* library.
> I worked on it when I made some improvements to its Common Lisp
> integration. It's quite simple to use.
>
> Regards,
> Elias
>
> On 28 April 2014 12:49, David B. Lamkins <dlamk...@gmail.com> wrote:
>
>> That's close, but libfileio[8] returns a sequence of byte values; not
>> code points.
>>
>> On Mon, 2014-04-28 at 12:19 +0800, Elias Mårtenson wrote:
>> > Use the quad function ⎕UCS:
>> >
>> >       ⎕UCS 'foo⍉bar'
>> > 102 111 111 9033 98 97 114
>> >       ⎕UCS 102 111 111 9033 98 97 114
>> > foo⍉bar
>> >
>> > Regards,
>> > Elias
>> >
>> > On 28 April 2014 12:17, David B. Lamkins <dlamk...@gmail.com> wrote:
>> >         I can use lib_file_io to read a sequence of byte values from a
>> >         file containing Unicode text.
>> >
>> >         How do I convert that sequence back to a Unicode string in GNU
>> >         APL?

--
"The secret to creativity is knowing how to hide your sources."
   Albert Einstein

http://soundcloud.com/davidlamkins
http://reverbnation.com/lamkins
http://reverbnation.com/lcw
http://lamkins-guitar.com/
http://lamkins.net/
http://successful-lisp.com/