Thank you.

I specifically object to this:

:: Therefore I
:: suggest an alternate solution, at least for the interim: foreigns (scary
:: and obscure, per above) that will _intentionally misinterpret_ data from
:: the outside world as 'UCS-1' and represent it compactly (or do the
:: opposite).

I think that this statement is too broad. As I think you previously pointed out, interpretation is the domain of the programmer, not the language.

I did not mean to imply that it was _solely_ the domain of the programmer. It is a delicate balancing act. J places more of the burden on the programmer than many other programming languages, and I do not think that is a bad thing.

But there are limits. I do not need to carefully track which words are integers and which are floats. I can imagine a version of J in which I did; in which, for instance, + would interpret its arguments as integers and add them, and +. would interpret its arguments as floats and add them, and you would have to be very careful to always choose the correct addition routine.

That language would be more expressive than the one we have, but I think it would be easy to write buggy code in it, and it would not be much fun to use. Having to manually track how characters are encoded seems not too different from having to manually track how numbers are encoded.

And it is possible to make a software floating-point implementation in J, in effect interpreting integers as floating-point numbers. That is essentially what I propose to permit, but as a performance hack rather than for any desirable semantic properties.
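
To make the analogy concrete, here is a minimal sketch in Python (not J) of what "interpreting integers as floating-point numbers" looks like. The helper names int_bits_to_float, float_to_int_bits and soft_add are hypothetical, and soft_add leans on the host's float addition rather than implementing the arithmetic in integers, so it only illustrates the reinterpretation step, not a full software floating-point library.

```python
import struct

def int_bits_to_float(n: int) -> float:
    """Reinterpret a 64-bit unsigned integer as an IEEE 754 double."""
    return struct.unpack("<d", struct.pack("<Q", n))[0]

def float_to_int_bits(x: float) -> int:
    """Reinterpret an IEEE 754 double as a 64-bit unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def soft_add(a_bits: int, b_bits: int) -> int:
    """Add two integers *as if* they were doubles; result is again an integer."""
    return float_to_int_bits(int_bits_to_float(a_bits) + int_bits_to_float(b_bits))

ONE = 0x3FF0000000000000  # bit pattern of 1.0
TWO = 0x4000000000000000  # bit pattern of 2.0

# Choosing the "float" addition gives the bit pattern of 3.0.
float_sum = soft_add(ONE, TWO)

# Choosing the "integer" addition on the same words gives a different
# bit pattern, which happens to be the one for +infinity.
int_sum = ONE + TWO
```

The same two words give 3.0 under one addition routine and infinity under the other, which is the kind of bookkeeping the language currently does on the programmer's behalf.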


I also object to any proposal to retract language primitives where we do not have demonstrated working replacements whose merits we are judging. (I could go into more detail on how I think these issues should be approached, if you like.)

Please do.


Specifically, removing u: does not seem like it would solve any of the
problems you highlighted with your examples involving x, y and z. (I
think bill lam's post --
http://www.jsoftware.com/pipermail/programming/2022-March/060355.html
-- gives a succinct description of the *language* (as opposed to foreigns or the operating system).)

I agree that removing u: is not sufficient.  But:

1. I do think that removing u: is necessary.

2. Insofar as we are considering only language issues (rather than
   issues of interoperability), I do not see why it is desirable to
   promote routines for translating between encodings as primary.

3. The language is not consistent; ": treats text as if it were
   utf-encoded, but promotion (} , " etc.) treats text as if it were
   ucs-encoded.

4. I think that issues of interoperability _are_ important, and still
   worthy of discussion.  For instance, more consistent display (and this
   is arguably not a foreign issue, but ":'s fault) would make clear what
   was happening with x, y, and z.


Taking a step back to your opening example in this thread:

Your z was not a valid utf-32 sequence. It would represent a valid utf-8 sequence, *if* the programmer treated it as such. When treated as a utf-32 sequence, the result would be invalid.

That's part of the problem. z is a perfectly good, valid utf-32 sequence. When interpreted as a utf-32 sequence, it does not represent the same thing as x does when interpreted as a utf-8 sequence. But there is nothing wrong with aób, nor with a utf-32 representation of it.
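
A rough illustration in Python (not J) of the distinction. The names x and z are borrowed from the thread, but their contents here are an assumption: x is taken to be the UTF-8 bytes of 'aób', and z the same numeric values read as code points.

```python
text = "a\u00f3b"                  # 'aób', three code points

# x: the UTF-8 encoding of the text, four bytes (ó becomes 0xC3 0xB3).
x = text.encode("utf-8")

# z: the same four numbers, each treated as a code point in its own right.
z = [byte for byte in x]           # [97, 195, 179, 98]
z_text = "".join(chr(c) for c in z)

# z_text is a well-formed string and round-trips through UTF-32 without
# error; it is simply a different string from the one x encodes.
z_utf32 = z_text.encode("utf-32-le")
z_roundtrip = z_utf32.decode("utf-32-le")

# The original text is recovered only by choosing to read z's values
# as UTF-8 bytes again.
recovered = bytes(z).decode("utf-8")
```

Under these assumptions z_text is the four-character string 'aÃ³b': nothing in the data marks it as invalid, so only the programmer's intent distinguishes it from a mis-decoded 'aób'.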
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
