Re: [Jprogramming] RFC: unicode

Raul Miller Mon, 21 Mar 2022 17:00:50 -0700

On Mon, Mar 21, 2022 at 6:05 PM Elijah Stone <[email protected]> wrote:
> On Mon, 21 Mar 2022, Raul Miller wrote:
> >> 2. That ability would not be removed.  Functionality for interacting
> >> with OS interfaces would, generally, interpret data received as being
> >> utf8-encoded, and decode it.
> >
> > In other words, exactly what the current implementation is doing.
> > Except, that's not what you proposed.
> >
> > You have instead proposed that for interacting with OS interfaces, the
> > general case would be that literal (8 bit character) data is treated as
> > ascii+latin1 with utf-8 as a special case exception.
>
> That's not true.  That's backwards.
>
> The current implementation does no automatic encoding or decoding.


Be careful here -- "implementation" and "automatic" are most likely
synonyms unless we make very careful distinctions.

But, also, "encoding" and "encoded" need not refer to the same concept here.
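
To make the two readings under dispute concrete, here is a minimal sketch in
Python (not J, and not taken from either proposal) of the same two bytes
interpreted first as ascii+latin1 and then as utf-8. The variable names are
illustrative only.

```python
# The same byte sequence, read under two different assumptions.
data = bytes([0xC3, 0xA9])           # the UTF-8 encoding of U+00E9

as_latin1 = data.decode("latin-1")   # one character per byte
as_utf8 = data.decode("utf-8")       # two bytes decode to one character

# Show the character count and code points for each reading.
print(len(as_latin1), [hex(ord(c)) for c in as_latin1])
print(len(as_utf8), [hex(ord(c)) for c in as_utf8])
```

The bytes are identical in both cases; only the interpretation applied at the
boundary differs.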

> It might be helpful to consider that, under my proposal, the distinction
> between 1-, 2-, and 4-byte characters is purely an implementation detail
> and an optimization, like using one byte per boolean.  If the
> implementation used solely 4-byte characters, the observable behaviour
> would not change.

That's also a description of the current system. So the thing I look
for in your proposal is: what would have changed?
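
As an analogy only (this is CPython, not J, and says nothing about how J
stores characters), the sketch below shows a system in which the storage
width per character varies while comparisons behave the same. The size
comparison assumes CPython 3.3 or later; other Python implementations may
report different sizes.

```python
import sys

narrow = "a" * 100            # every code point fits in 1 byte
medium = "\u0100" * 100       # needs 2 bytes per code point
wide = "\U00010000" * 100     # needs 4 bytes per code point

# Storage grows with the widest code point in the string (CPython detail).
sizes = [sys.getsizeof(s) for s in (narrow, medium, wide)]

# The character "a" compares equal whether it came from a narrow
# string or from a string that is stored wide.
from_narrow = "abc"[0]
from_wide = "a\U0001F600"[0]
same = from_narrow == from_wide

print(sizes, same)
```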

> I am also not sure why you are talking about OS interfaces, but also say
> you are ignoring fread and fwrite.  I did not say anything about any other
> OS interfaces.

Consider bin/jconsole -- all input and output to and from bin/jconsole
goes through the operating system.

Or, consider the 15!:n family of foreigns.

> It seems to me that this thread is not going anywhere constructive, so I
> will probably desist from it now.

Ok.

-- 
Raul
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
