Re: The Unicode Standard, Version 3.1

William Overington Sun, 01 Apr 2001 04:29:22 -0700
I am running a PC that has Windows 95, Word 97 and Internet Explorer 4.

I downloaded the zip file and unzipped it and got the font file.  I then
used Word 97, set the font to Code 2001 and the size to 24 point.

I added a letter a to make sure it was working and then used Insert Symbol
to add three of the non-everyday characters.  There were not a lot from
which to choose, so I expect that most of the characters in the font are not
available with the computer system that I am using.

I added a character that looks vaguely like a Greek theta, a character that
looks like an ornate capital N and a character that looks as if it is a
runic character something like a mirror image of a letter K.

I saved as HTML from Word 97 and looked at the source code provided.

The four characters are coded as

a&#63153;&#8469;&#63737;

Two of the characters, &#63153; (U+F6B1) and &#63737; (U+F8F9), are in the
Private Use Area, which runs from U+E000 to U+F8FF.
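
As a rough illustration of that check, here is a minimal Python sketch that
pulls the decimal references out of the saved HTML and reports which ones
fall in the Private Use Area.  The function names are made up for this
example; only the range U+E000 to U+F8FF comes from the Unicode Standard.

```python
import re

# Private Use Area of the Basic Multilingual Plane (Unicode Standard).
PUA_START, PUA_END = 0xE000, 0xF8FF


def decode_references(html):
    """Return the code points named by decimal references such as &#8469;."""
    return [int(digits) for digits in re.findall(r"&#(\d+);", html)]


def describe(code_point):
    """Format a code point as U+XXXX and note whether it is private use."""
    label = "U+%04X" % code_point
    if PUA_START <= code_point <= PUA_END:
        label += " (private use)"
    return label


# The string that Word 97 wrote into the HTML source.
for cp in decode_references("a&#63153;&#8469;&#63737;"):
    print(describe(cp))
```

The letter a is not a numeric reference, so only the three inserted symbols
are reported.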

I wonder if I might put forward a concept that I have temporarily called
nanoconformance.  The idea is that an individual sitting at one particular PC
or other computer is trying to establish whether or not that system has
certain Unicode facilities available.  For example, please consider that
someone has obtained the Code 2001 font and wishes to establish whether the
system that he or she is using can handle Gothic, or can be modified so that
it handles Gothic.  He or she tries this and that and is looking for a
result.

Would it be helpful to have what might be called "nanoconformance criteria",
such that one might, given conforming items such as a font file obtained from
elsewhere, be able to say that the local computer system is conforming?  I
mean conforming at an everyday working level, not in the sense of a testing
centre testing formal conformance to standards.  For example, the criteria
might specify one character from the Gothic script, one that looks rather
distinctive yet can be roughly described in everyday terms, so that if one
can get that character on the screen then one has managed to get Gothic
working properly.

The nanoconformance criteria might be published on a website showing the
character in an illustration, its U+ value, its value as a decimal number and
the character expressed as a surrogate pair of 16-bit values, so that the
information would be readily available to people testing whether Gothic is
available.  For example, one of the nanoconformance criteria for Gothic might
test whether a browser can handle Gothic by suggesting that one add &# and
some particular decimal number, followed by a semicolon, to some HTML code
and try to display it in the browser.  If it works, the browser will display
as a character the image shown in the gif file accompanying the
nanoconformance criteria.
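
To make the idea concrete, here is a minimal Python sketch of what one entry
in such a set of criteria might contain.  The choice of U+10330 (the first
code point of the Gothic block) and the field names are only assumptions for
illustration; the surrogate pair arithmetic is the standard UTF-16 rule.

```python
def to_surrogate_pair(code_point):
    """Split a supplementary code point (U+10000..U+10FFFF) into two 16-bit values."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top ten bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom ten bits of the offset
    return high, low


def criteria_entry(code_point):
    """Build the pieces of information suggested for one test character."""
    high, low = to_surrogate_pair(code_point)
    return {
        "u_value": "U+%04X" % code_point,
        "decimal": code_point,
        "surrogate_pair": "%04X %04X" % (high, low),
        "html_test": "&#%d;" % code_point,   # text to paste into an HTML page
    }


# Hypothetical test character: the first code point of the Gothic block.
entry = criteria_entry(0x10330)
for field in ("u_value", "decimal", "surrogate_pair", "html_test"):
    print(field, "=", entry[field])
```

Someone testing a browser would paste the html_test value into a page and
compare what appears with the published illustration.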

I have added further comments below within the copy of the previous posting.

William Overington

1 April 2001

-----Original Message-----
From: James Kass <[EMAIL PROTECTED]>
To: Unicode List <[EMAIL PROTECTED]>
Cc: Peter Constable <[EMAIL PROTECTED]>; Michael (michka) Kaplan
<[EMAIL PROTECTED]>
Date: Saturday, March 31, 2001 9:59 AM
Subject: Re: The Unicode Standard, Version 3.1


>
>Michael (michka) Kaplan wrote:
>
>...
>> It is not
>> obsolete in practical terms until there is widespread support in the way of
>> fonts, keyboards, IMEs, and the other important items that help bring
>> characters to the user.
>>

What is an IME please?

What are the other important items please?

>
>Here is a freeware Plane One font for testing:
>http://home.att.net/~jameskass/code2001.htm
>
>Included are Old Italic, Deseret, and Gothic, as well as a few other
>items extrapolated from the Roadmap and preliminary proposals.
>Constructive comments are welcome.  (I know that the math
>letter variants are incomplete.)  It works in W2K with the
>word processor, but haven't been able to display any
>supplementary characters in the HTML browser yet.
>
>As far as keyboards/IME, if anyone has a notion of what a Deseret
>or Gothic keyboard should look like (and a need for one), please
>let me know.

You might like to have a look at my softboard toolbar system that I use as a
demonstration example in my documents on 1456 object code at
www.users.globalnet.co.uk/~ngo which is our family webspace in England.

In the event that someone might like to have a Deseret or Gothic keyboard,
perhaps such a technique might be useful.

The specific documents are www.users.globalnet.co.uk/~ngo/14561500.htm and
www.users.globalnet.co.uk/~ngo/14561600.htm and the documents
www.users.globalnet.co.uk/~ngo/14561900.htm and
www.users.globalnet.co.uk/~ngo/14562000.htm might also be of interest.

Various matters arise before a Deseret or Gothic keyboard could be produced.

Deseret and Gothic are in the Supplementary Multilingual Plane, that is,
Plane 1.  So although 1456 object code could handle them internally and
produce a surrogate pair for each character, the matter arises as to what
will happen if an attempt is made to display them by initiating calls to
Java system service commands.  Firstly, I have thus far only been able to
get my program to access sans-serif, serif and monospaced fonts, so the first
problem is whether the Code 2001 font can be accessed by my system.
Maybe I can learn some part of Java that will enable me to do this, or maybe
Java needs to alter the facilities available for fonts; I do not know at
present.

The next question is what will happen if a Windows 95 system running
Internet Explorer 4 receives from a Java applet a request to print a string
of characters and there is a surrogate pair within that string.  Will the
system respond as if to two unknown characters or try to find the correct
21-bit character?

If Internet Explorer tries to find the correct character, will the Windows
95 operating system be able to resolve the character from the font file?

I may not have asked the right questions in the right order, and indeed if
anyone cares to say something about these issues in detail then that would
be helpful.

I feel that setups such as Windows 95 running Internet Explorer 4 and Word
95 or Word 97 are likely to be in use for many years, maybe not in the
cutting edge front offices of big corporations, but elsewhere.  The trend is
not to throw older yet still usable computers in the bin but to pass them to
colleges, training centres, libraries and so on.  It might therefore be
helpful if there were a centralized collection of information as to what
updates are possible and what is totally impossible, together with a
collection of the software updates where these are available.

In a later version of the 1456 Engine that I have not yet published I have
included a facility for introducing 21-bit Unicode characters from the
software stream using =uhhhhhh, where hhhhhh is six hexadecimal digits.  I
have also added a command to convert a 21-bit Unicode character to either one
or two 16-bit Unicode characters as appropriate and to produce a 21-bit
Unicode character from one or two 16-bit Unicode characters as
appropriate.  The link flag is set or reset to indicate what has happened.
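
The conversion in both directions can be sketched in Python as below.  This
is not 1456 object code and says nothing about how the Engine implements the
command; the function names are hypothetical, and the boolean returned
alongside each result merely stands in for the link flag by recording
whether a surrogate pair was involved.

```python
def to_16_bit(code_point):
    """Return (values, used_pair): one 16-bit value, or a surrogate pair."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if code_point < 0x10000:
        return [code_point], False            # fits in one 16-bit value
    offset = code_point - 0x10000
    return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)], True


def from_16_bit(values):
    """Return (code_point, used_pair), reading from the start of values."""
    first = values[0]
    if (0xD800 <= first <= 0xDBFF and len(values) > 1
            and 0xDC00 <= values[1] <= 0xDFFF):
        # High surrogate followed by low surrogate: rebuild one character.
        code_point = 0x10000 + ((first - 0xD800) << 10) + (values[1] - 0xDC00)
        return code_point, True
    return first, False                       # an ordinary 16-bit character


values, used_pair = to_16_bit(0x10330)        # a Gothic code point
print(["%04X" % v for v in values], used_pair)
print("U+%04X" % from_16_bit(values)[0])
```

A character below U+10000 passes through unchanged in both directions with
the flag clear.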

I feel that 1456 object code is an interesting test case for whether 21-bit
Unicode can be used in practice.  The 1456 object code, with the new =u
command and the associated commands, can handle 21-bit Unicode characters.
So 1456 object code could accept Gothic characters from its software stream
and store them and manipulate them.  I have available the Code 2001 font
that contains the Gothic characters.  So the question arises: is it
presently possible, on the system that I am using or on some other system,
for a 1456 object code program using Java system services to display the
Gothic characters from the Code 2001 font on the screen of a computer?

Although the 1456 Engine currently available in our family webspace does not
have the =uhhhhhh facility and the associated commands, anyone who would like
to try to get Gothic working from 1456 object code can still do so by
entering the character as a surrogate pair using two 'uhhhh commands.
Nothing is lost that way, because even when using the 21-bit form with
=uhhhhhh the intention would be to first convert a Gothic character to a
surrogate pair of 16-bit characters before attempting to display the
character.

On another aspect of the move to 21-bit Unicode, does anyone know what Sun
intends to do, or perhaps has already done, about entering 21-bit Unicode
characters into Java source code please?

>
>Best regards,
>
>James Kass.
>
>
>
>