On Mon, 16 Jul 2018 17:40:50 -0700
Ken Whistler via Unicode <unicode@unicode.org> wrote:

> So your complaint seems to boil down to the claim that if you
> transmit "Hello, world!" to a process which then renders it
> conformantly according to the Unicode Standard (including
> UBA), then that process must somehow know *and honor*
> your intent that it display in a LTR directional context. That
> information, however, is explicitly *not* contained in
> the plain text string there, and has to be conveyed by means of a
> higher-level protocol.
> (E.g. HTML markup as dir="ltr", etc.)

I believe this is an inaccurate description, but indeed the discrepancy
is at the root of the issue here.
The UBA defines a default algorithm for determining the directionality
of plain text paragraphs. My claim is that in the absence of an agreed
or conveyed higher-level protocol, this default must be respected.

> If the receiving process, by whatever means, has raised its hand and
> says, effectively, "I assume a RTL context for all text display",
> that is its right. You can't complain if it displays your "Hello,
> world!" as shown above. Well, you *can* complain, but you wouldn't be
> correct. Basically, you and the receiving process do not share the
> same assumptions about the higher-level protocol involved which
> specifies paragraph direction.

This, essentially, boils down to a claim that the default is not really
a default, but must itself be the subject of agreement between the two
sides. My view is the one expressed by FAQ #bidi7: a higher-level
protocol is an agreement. It can be explicit (e.g. HTML) or implicit
(e.g. the convention that log files are to be read LTR), but it cannot
be applied in a void, or else interoperability is lost.

> OR, you are just unhappy about the bidirectional
> rendering conundrums
> of some edge cases for the UBA.

I wish they were edge cases. While the "Hello, World!" example is
somewhat contrived, the "SESU RETHO DNA email ROF plaintext REFERP I"
example is quite central to the UBA and represents an extremely common
case: Hebrew paragraphs with embedded English words make up at least a
few percent of all paragraphs written in Hebrew about technology, for
example.

On Mon, 16 Jul 2018 21:51:32 -0700
Asmus Freytag via Unicode <unicode@unicode.org> wrote:

> [The Unicode Standard's] conformance clause is written to allow
> implementations to solve real-world issues without becoming formally
> non-conformant.

I accept that this was the intention; I claim that, as things are
currently written, they cause more real-world issues than they solve.
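
To make concrete what I mean by "the UBA defaults": in the absence of a
higher-level protocol, rules P2 and P3 of the UBA take the paragraph
direction from the first strong character (L, R or AL), falling back to
LTR when there is none. Below is a minimal Python sketch of that rule.
It is a simplification, not a conformant implementation: it ignores the
part of P2 that skips characters inside isolates, it assumes the input
is already a single paragraph, and the function name is my own.

```python
import unicodedata


def default_paragraph_direction(paragraph: str) -> str:
    """Simplified first-strong rule (UBA P2/P3) for one paragraph.

    Hypothetical helper for illustration only; it does not skip
    characters between isolate initiators and their matching PDI.
    """
    for ch in paragraph:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    # P3: no strong character found, so the paragraph level is 0 (LTR).
    return "ltr"
```

Under this rule "Hello, world!" comes out LTR, and a Hebrew sentence
with an embedded English word comes out RTL, with no agreement needed
between sender and receiver beyond the standard itself.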
The only example given here of a real-world issue served by abolishing
the UBA defaults is performance degradation on some special files,
which are just as easy to treat specially, as Eli described in the case
of Emacs and logs. One other consideration raised boils down to, "it's
better to make some texts completely unreadable than to present some
other texts readably, but with the wrong alignment".

The trade-off you seem to prefer is to make the "plain text is
universally readable" idea from the core Unicode definition not
applicable to BiDi text. Why?

Thanks,
Shai