Hi Ken (and all),

Thanks for your time and patience with this.
On Thu, 19 Jul 2018 18:10:49 -0700
Ken Whistler via Unicode <unicode@unicode.org> wrote:

> On 7/19/2018 12:38 AM, Shai Berger via Unicode wrote:
> > If I cannot trust that
> > people I communicate with make the same choices I make, plain text
> > cannot be used.
>
> Here is a counterexample [a table rendered in plain text, which is
> only truly legible using a fixed-width font].
>
> It isn't that "plain text cannot be used" to convey this content. The
> content is certainly "legible" in the minimal sense required by the
> Unicode Standard, and it is interchangeable without data corruption.
> The problem is that for optimal display and interpretation as
> intended, I also need to convey (and/or have the reader guess) the
> higher-level protocol requirement that this particular plain text
> needs to be displayed with a monowidth font.

If I understand correctly, you are rejecting my claim that
directionality is an issue of content, and claiming that, just like the
crumbling-down of your table, it is an issue of display. But that
argument is clearly disproved by the mere presence of the
directionality-setting characters (RLM, LRE, etc.) in the Unicode
character set; in other words, your example would be convincing if
Unicode included characters like "start table row" and "close table
cell", AND there was an annex saying that your lines (for whatever
reason) are to be treated as table rows unless a higher-level-protocol
said otherwise. I believe this is not the case.

> > If the Unicode standard does not impose a
> > universal default, it does not define interchangeable plain text.
>
> And that is simply not the case. If your text is <a, b, c, !> (<L, L,
> L, ON>), that will display as {abc!} in a LTR paragraph directional
> context and as {!abc} in a RTL paragraph directional context.
> [...] if plain text doesn't forcefully carry with it and
> require how it must be displayed, well, then it isn't really
> interchangeable.
>
> But that isn't what the Unicode Standard means by plain text. And
> isn't what it requires for interchangeability of plain text.

If I understood your argument correctly, it amounts to a claim that
Unicode defines plain text as a component in a data format, but not to
be used as a full document. If that is correct, then there is much to
fix -- I think that quite a lot of existing technology assumes the
opposite (e.g. the use of "Content-Type: text/plain; charset=UTF-8" in
MIME should be strongly discouraged, if the people who designed Unicode
and UTF-8 think it is not appropriate for full documents). If I
misunderstood, please correct me.

> > My main point, whose rejection baffles me to no end, is that it
> > should.
>
> Well, I'm not expecting that I can make you feel good about the
> situation. ;-) But perhaps the UTC position will seem a little less
> baffling.

As I hope I've shown above, there's plenty of reason for bafflement.
The UTC defines code points to encode directionality, but then refuses
to treat directionality as content when it comes to paragraph
directionality; it defines a higher-level-protocol as an agreement, and
then turns around and says the word "agreement" actually means
"decision". I can guess reasons for why the things are the way they
are, but not justifications.

I stay baffled.

Thanks,
Shai.