Hi William,
I don't fully understand your proposed encoding scheme (e.g., Is there a
namespace each encoding scheme is bound to? How do namespaces get
encoded? How are syntax strictures encoded?), but even presuming it's
sound, you said in your previous message that this encoding space
will enhance interoperability. What mechanism is in place to make my
encoding space interoperable with yours? Perhaps, independent of each
other, you bind !123 to a character semantically identical to one I've
bound to !234. What rules are in place to allow interchangeability? What
about one-to-many or many-to-many or vague or ambiguous mappings across
encoding schemes, or mappings that we might reasonably contest?
Or maybe you're not so much concerned about interoperability as you
are with extending the PUA block beyond its current limits? Something
like SGML/XML entities? Couldn't you simply capitalize on the rules that
already exist for entities?
Best wishes,
jk
--
Joel Kalvesmaki
Director, Text Alignment Network
http://textalign.net
On 2020-02-14 15:52, wjgo_10...@btinternet.com via Unicode wrote:
The solution is to invent my own encoding space. This sits on top of
Unicode, could be (perhaps?) called markup, but it works!
It may be perilous, because some software may enforce the strict
official code point limits.
I have now realized that what I wrote before is ambiguous.
When I wrote "sits on top of Unicode" I was not meaning at some code
points above U+10FFFF in the Unicode map, though I accept that it
could quite reasonably be read as meaning that.
My encoding space sits on top of Unicode in the sense that it uses a
sequence of regular Unicode characters for each code point in my
encoding space.
For example
∫⑦⑧①
or
!781
or
a character sequence of a base character, followed by a tag
exclamation mark followed by three tag digits and a cancel tag.
All three examples above have the same meaning.
∫⑦⑧① is useful because it is less likely than !123 to occur in text
with some other meaning, though !123 is easier to use and could be
used in a GS1-128 barcode.
The tag sequence has the potential to become incorporated into Unicode
for universal standardization of unambiguous interoperability
everywhere. That is a long term goal for me.
The example above uses a three-digit code number. My encoding space
allows for various numbers of digits, with a minimum of three digits
and a much larger theoretical maximum. The most digits in use at
present in my research project in any one code number is six.
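As a rough illustration only, the three spellings described above could be reduced to a single code number along the following lines. This is a sketch under assumptions, not the scheme's actual definition: the function name decode is hypothetical, circled digit zero (U+24EA) is assumed to stand for 0, the base character of the tag form is assumed to be any single character, and no upper limit on the number of digits is enforced because the message does not state one.

```python
import re

# Unicode tag characters (Tags block); these code points are standard.
TAG_BANG = "\U000E0021"     # TAG EXCLAMATION MARK
TAG_CANCEL = "\U000E007F"   # CANCEL TAG
TAG_DIGIT_ZERO = 0xE0030    # TAG DIGIT ZERO; TAG DIGIT NINE is U+E0039

INTEGRAL = "\u222B"         # the integral sign used in the first spelling

# Circled digits: U+2460..U+2468 are one to nine.
CIRCLED = {chr(0x2460 + i): str(i + 1) for i in range(9)}
CIRCLED["\u24EA"] = "0"     # ASSUMPTION: circled zero stands for digit 0


def _check(digits):
    # The message gives a minimum of three digits per code number.
    if len(digits) < 3:
        raise ValueError("a code number needs at least three digits")
    return digits


def decode(seq):
    """Return the code number (as a digit string) for one spelling.

    Hypothetical helper: accepts the circled-digit, ASCII, and tag forms.
    """
    # Form 1: integral sign followed by circled digits.
    if seq.startswith(INTEGRAL):
        body = seq[1:]
        if body and all(c in CIRCLED for c in body):
            return _check("".join(CIRCLED[c] for c in body))

    # Form 2: ASCII exclamation mark followed by ASCII digits.
    if re.fullmatch(r"![0-9]+", seq):
        return _check(seq[1:])

    # Form 3: base character, tag exclamation mark, tag digits, cancel tag.
    # ASSUMPTION: the base character is exactly one code point.
    if len(seq) >= 3 and seq[1] == TAG_BANG and seq[-1] == TAG_CANCEL:
        body = seq[2:-1]
        if body and all(0 <= ord(c) - TAG_DIGIT_ZERO <= 9 for c in body):
            digits = "".join(str(ord(c) - TAG_DIGIT_ZERO) for c in body)
            return _check(digits)

    raise ValueError("not a recognised spelling of a code number")
```

Under those assumptions, decode("∫⑦⑧①") and decode("!781") both give "781", as does the tag form built on any base character. The result is kept as a string so that leading zeros, if the scheme uses them, are not lost.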
William Overington
Friday 14 February 2020