Re: Private Use areas

Mark E. Shoulson via Unicode Mon, 27 Aug 2018 17:46:57 -0700

But there's nothing wrong with proposing a higher-level protocol; indeed, that's what Ken Whistler was saying: you need a protocol to transmit this information. It's metadata, so it will perforce be a higher-level protocol of some kind, whether transmitting actually out-of-band or reserving a piece of the file for metadata. That's fine. I'm not sure what the advantage is of using circled characters instead of plain old ascii. You have to set off your reserved area somehow, and I don't think using circled chars is the least obtrusive way to do it. You could use XML; that would be pretty well-suited to the task, but maybe it's overkill. If all you need is to reference some "standard" PUA interpretation (per James Kass' take on this, not William Overington's), then just a header like "[PUA00001]" would work just fine. (Compare emacs with things like "-*- encoding: utf-8 -*-" or whatever.)
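As a rough illustration of how little machinery such a header would need, here is a minimal Python sketch. The exact syntax is an assumption: the message only gives "[PUA00001]" as an example, so the five-digit width, the first-line placement, and the names HEADER_RE and split_pua_header are all hypothetical.

```python
import re

# Hypothetical header syntax: "[PUA" + five digits + "]" alone on the first line.
# The digit count and placement are assumptions, not part of any standard.
HEADER_RE = re.compile(r"^\[PUA(\d{5})\]$")


def split_pua_header(text):
    """Return (interpretation_id, body); the id is None if no header is found."""
    first, _, rest = text.partition("\n")
    match = HEADER_RE.match(first.strip())
    if match is None:
        # No recognizable header: treat the whole input as ordinary text.
        return None, text
    return match.group(1), rest


pua_id, body = split_pua_header("[PUA00001]\nSome text using \ue000 here.\n")
```

An application that does not know the protocol simply shows the bracketed line as text, which is easy for a reader to skip.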

For larger chunks of meta-info, XML might be a good choice, but even then, it could be an XML *header* to an otherwise ordinary text file. Yes, you'd have to delimit it somehow, and probably have a top header (a "magic number") to signal the protocol, but that's doable. For applications not supporting this protocol, such a setup is probably easier for the eye to skip past (even if it's long) than a bunch of circled letters.
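To make the "magic number plus delimited XML header" idea concrete, here is a hedged Python sketch. Everything about the format is invented for illustration: the "%PUAMETA 1" magic line, the "%END" delimiter, the element and attribute names, and the function read_with_header are assumptions, not an existing protocol.

```python
import xml.etree.ElementTree as ET

MAGIC = "%PUAMETA 1"  # hypothetical magic line signalling the protocol
END = "%END"          # hypothetical delimiter closing the XML header


def read_with_header(text):
    """Return (xml_root, body); xml_root is None if there is no valid header."""
    lines = text.split("\n")
    if lines[0] != MAGIC:
        return None, text
    try:
        end = lines.index(END, 1)
    except ValueError:
        # Magic line without a closing delimiter: fall back to plain text.
        return None, text
    meta = ET.fromstring("\n".join(lines[1:end]))
    body = "\n".join(lines[end + 1:])
    return meta, body


sample = (
    "%PUAMETA 1\n"
    '<pua><char cp="E000" name="EXAMPLE GLYPH"/></pua>\n'
    "%END\n"
    "Ordinary text follows.\n"
)
meta, body = read_with_header(sample)
```

Everything after the delimiter is left untouched, so the body stays an ordinary text file as far as any other tool is concerned.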

A protocol like that is outside of Unicode's scope (just like XML is), but it's certainly something you could write up and try to standardize and get used, with or without the support of ISO. People are coming up with file formats all the time (and if you really want to use circled characters, go ahead. That's something for you to consider in the design phase of the project).


~mark


On 08/27/2018 05:20 PM, Rebecca Bettencourt via Unicode wrote:

            > That sounds like a non-conformant use of characters in
            the U+24xx block.

            Well, you are an expert on these things and I do not
            understand as to with what it would be non-conformant.
A conformant process must interpret ⓅⓊⒶⒹⒶⓉⒶ as the characters ⓅⓊⒶⒹⒶⓉⒶ and not as a signal to process what follows as anything other than plain text.
What you are proposing is a higher-level protocol, whether you realize it or not. Unfortunately your higher-level protocol has a serious flaw in that it cannot represent the string "ⓅⓊⒶⒹⒶⓉⒶ". Also, seeing a bunch of circled alphanumeric characters in a document ⓘⓢ◯ⓕⓐⓡ◯ⓕⓡⓞⓜ◯ⓤⓝⓞⓑⓣⓡⓤⓢⓘⓥⓔ.
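The representability flaw can be shown with a small Python sketch. The parsing rule is an assumption made for illustration (a line beginning with the marker is treated as metadata); the details of the actual proposal are not given in this message, and naive_read is a hypothetical name.

```python
MARKER = "ⓅⓊⒶⒹⒶⓉⒶ"


def naive_read(text):
    """Split text into (metadata_lines, plain_lines) using the in-band marker.

    Assumed rule: any line that starts with MARKER is metadata.
    """
    meta, plain = [], []
    for line in text.split("\n"):
        if line.startswith(MARKER):
            meta.append(line[len(MARKER):])
        else:
            plain.append(line)
    return meta, plain


# A sentence that merely mentions the marker is swallowed as metadata,
# because there is no escape mechanism for the literal string.
meta, plain = naive_read("ⓅⓊⒶⒹⒶⓉⒶ is just seven circled letters.")
```

Here the whole sentence ends up in meta and plain is empty, which is the kind of misreading a conformant plain-text process must not make.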
There are plenty of already-existing higher-level protocols (you mentioned one: XML) that could be used to provide information about PUA characters, and they are all much better suited to that purpose than what you are proposing.
