Re: Preparing a proposal for encoding a portable interpretable object code into Unicode (from Re: IUC 34 - call for participation open until May 26)

vanisaac Wed, 02 Jun 2010 04:37:46 -0700

From: William_J_G Overington ([email protected])
> On Tuesday 1 June 2010, John H. Jenkins <[email protected]> wrote: 
>   
> > First of all, as Michael says, this 
> > isn't character encoding. 
>   
> Well, it is a collection of portable interpretable object code items encoded 
> within a character encoding as if the items were characters.


There is a monumental gap between "items encoded ... as if [they] were 
characters" and actual characters. This gap is the gap between Unicode and not 
Unicode.

> > You're not interchanging plain text. 
>   
> True, but the items are interchanged as if they were plain text items within 
> the structure of the way that plain text is interchanged. 

Lots of things are interchanged. Machine code is interchanged, Scalable Vector 
Graphics are interchanged, executables are interchanged. None of these are 
plain text. They should not be interpreted as plain text. They should not be 
displayed as plain text, except for providing a way for those who understand 
the "text" as merely a representation of bytes of data that have a non-plain 
text meaning, so they can check the data. Object code is not Unicode, it is 
something else.

> > This is essentially machine language 
> > you're writing here, and there are entirely different venues 
> > for developing this kind of thing. 
>   
> Well, it is an object code for a virtual machine rather than a machine code 
> for a virtual machine as external name links can be included. Also, it has 
> high level language style constructs of while loops and repeat loops rather 
> than the jump to an address instructions of a typical machine code. Also, it 
> is relocatable in relation to the underlying memory structure of the host 
> computer: some machine codes can be relocatable as well, so I am not claiming 
> relocatablity as a distinguishing feature from machine code, I am just 
> mentioning the relocatability feature of the portable interpretable object 
> code. 

There is no difference between a virtual machine code and a physical machine 
code to a CHARACTER encoding standard. The fact that it has a high level 
language style means nothing, absolutely nothing. C code is C code, whether it 
is encoded as ASCII, Unicode, ISCII, Big 5, Shift-JIS, or anything else. The 
details of object code are immaterial to it being fundamentally a form of 
machine language, not a character.

> > Secondly, I have virtually no idea what problem this is 
> > attempting to solve unless it's attempting to embed a text 
> > rendering engine within plain text.  If so, it's both 
> > entirely superfluous (there are already projects to provide 
> > for cross-platform support for text rendering) and woefully 
> > inadequate and underspecified.  Even if this were 
> > sufficient to be able to draw a currently unencoded script, 
> > the fact of the matter is that it doesn't allow for doing 
> > anything with the script other than drawing.  
> > (Spell-checking?  Sorting?  Text-to-speech?) 
>   
> The portable interpretable object code is intended to be a system to use to 
> program software packages to solve problems of software globalization, 
> particularly in relation to systems that use software to process text. 
>    
> > Unicode and ISO/IEC 10646 are attempts to solve a basic, 
> > simply-described problem:  provide for a standardized 
> > computer representation of plain text written using existing 
> > writing systems. 
>   
> Well, that might well be the case historically,

It is the case now.

>  yet then the emoji were invented and they were encoded.

Every writing system was invented.

> The emoji existed at the time that they 
> were encoded, yet they did not exist at the time that the standards were 
> started.

Immaterial. The question is whether they ARE plain text that is used as plain 
text.

> So, if the idea of the portable interpretable object code 
> gathers support, then maybe the defined scope of the standards will 
> become extended. 

No. Unicode encodes plain text. Period. Emoji are no different. They are 
exchanged as plain text, and act as plain text. They were not encoded before 
they were exchanged as plain text, they were only encoded ONCE they were used 
as plain text. The key word here, and everywhere else, is "plain text". If 
it's not plain text, it is not, has never been, and never will be germane.

> > That's it.  Any attempt to use 
> > the two to do something different is not going to fly. 
>   
> Well, I appreciate that the use of the phrase "not going to fly" is a 
> metaphor and I could use a creative writing metaphor of it soaring on 
> thermals above olive groves, yet to what exactly are you using the 
> metaphor "not going to fly" to refer please? 

You know perfectly well what it means, seeing as you speak native, colloquial 
English. You may think it's cute, but the people who have responded to you are 
serious people who have dedicated their lives to addressing the real issues of 
globalization, and it is both disrespectful and counterproductive to make 
comments like this.

> I know of no reason to think that a person "skilled in the art" would be 
> unable to write an iPad app to receive a program written in the portable 
> interpretable object code arriving within a Unicode text message and then 
> for the program to run in a virtual machine within the app, displaying a 
> graphical result on the screen of the iPad. Could such an app be written 
> based on the information in the paper_draft_005.pdf document? 

That is the single most important piece of evidence that this is NOT plain 
text, and is completely, totally, and undeniably not Unicode.

> The Unicode Technical Committee considers proposals. If a proposal for 
> encoding a portable interpretable object code becomes placed before them, 
> then the Unicode Technical Committee will be able to assess the proposal in 
> accordance with their rules as those rules stand at the time. 

The people who have responded to you have immeasurable experience with the 
Unicode Technical Committee. They are telling you EXACTLY what will be said in 
a UTC meeting. There is no subterfuge. They aren't hazing the newbie. This is 
the knowledge of experts who know the Standard inside and out, because they 
helped write it. Many of the people on this list actually ARE members of the 
UTC, and if they disagreed in the slightest, they would instantly chime in. No 
one is coming to your side and saying "there is a small chance that this could 
happen".

> > Creating new writing systems, directly embedding language, 
> > directly embedding mathematics or machine language--all of 
> > these are entirely outside of Unicode's purview and WG2's 
> > remit.  They simply will not be adopted. 
>   
> Well, the emoji is a new writing system and that is being encoded. The 
> encoding of the emoji has made me realize that the encoding of the portable 
> interpretable object code is not an impossibility. 

Emoji were not created, then suggested to Unicode so they'd be interchanged: 
they were created and promulgated, as plain text - not as if they were plain 
text, but actually as plain text - and then encoded in Unicode. If you are 
pinning your hopes on Emoji as a precedent, you are sadly unaware of even the 
most basic tenets of the Standard.

> > Your enthusiasm may be commendable, but you're spending 
> > your energy developing something which is not appropriate 
> > for inclusion within Unicode. 
>   
> Thank you for your first remark, yet whether the portable interpretable 
> object code is or is not appropriate for inclusion within Unicode is a 
> matter that is not decided at this time. 

Actually, it is. Whether you wish to accept it is the only undecided matter.

> There was a time when emoticons were not regarded as appropriate for 
> inclusion in Unicode, yet they are now being encoded. That is an important 
> precedent that what is appropriate depends upon the circumstances at the 
> time, not on what was the policy previously. 

Emoticons (as emoji) are exchanged as plain text. The only consideration that 
changed was whether they should be considered as markup or not. Eventually, it 
became clear that they no longer classify as markup, but as plain text. 
This was not a change in policy, it was a development in evidence.

> Plane 12 is empty at present and I am unaware of any other plans for its 
> use. Rather than a phrase such as "not appropriate" being used I feel that 
> the approach could be that there is plane 12, someone is suggesting using it 
> for a futuristic idea, so let us have a look at the idea, let us study the 
> idea and try to improve it so as to get the best possible result and then, 
> as long as it is possible to demonstrate that implementing the idea will do 
> no harm, let us implement it. 

Planes 15 and 16 are Private Use planes. Nobody cares what you do there. 
That's what they're for. The only thing that Unicode has to say about them is 
that they are for Private Use, and public use by private agreement. Just 
because there are no CURRENT plans for a plane does not mean it is open for 
you to do whatever the heck you want, just as you shouldn't use the empty 
spots in the Greek block. Reserved means "don't do anything here". If you want 
to fulfill your craziest desires, like writing in Klingon pIqaD, use the 
Private Use areas.
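
For anyone who wants to check the distinction for themselves, here is a minimal sketch in Python. It assumes the standard unicodedata module, which reports general category "Co" for Private Use code points and "Cn" for unassigned (reserved) ones. The helper names plane, is_private_use and is_unassigned are made up for illustration and are not part of any standard API.

```python
import unicodedata


def plane(cp: int) -> int:
    """Return the plane number (0-16) of a code point."""
    return cp >> 16


def is_private_use(cp: int) -> bool:
    """True if the code point has general category Co (Private Use)."""
    return unicodedata.category(chr(cp)) == "Co"


def is_unassigned(cp: int) -> bool:
    """True if the code point has general category Cn (unassigned/reserved)."""
    return unicodedata.category(chr(cp)) == "Cn"


# Hypothetical sample points: one in plane 15, one in plane 16,
# one in plane 12, and one empty spot in the Greek block.
for cp in (0xF0000, 0x10FFFD, 0xC0000, 0x0378):
    print(f"U+{cp:04X} plane={plane(cp)} "
          f"private_use={is_private_use(cp)} unassigned={is_unassigned(cp)}")
```

The plane 15 and 16 samples should report as Private Use, while the plane 12 sample and the gap in the Greek block should report as unassigned, which is exactly the "reserved" status described above.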

> William Overington 
>   
> 2 June 2010 

If you wish to discuss this further, please do so by private email, not on the 
list.

Van Anderson
