On 10/31/2018 10:32 AM, Janusz S. Bień via Unicode wrote:
Let me remind what plain text is according to the Unicode glossary:

This definition becomes tautological only when you try to invoke it in making encoding decisions, that is, if you couple it with the statement that only "elements of plain text" are ever encoded.

For that purpose, you need a number of other definitions of "plain text", including the definition that plain text is the "backbone" to which you apply formatting and layout information. I personally believe that there are more 2D notations where it's quite obvious to me that what is "placed" is a text element: more like maps and music, and less like a circuit diagram, where the elements are less text-like. (I deliberately include symbols in the definition of text, but not any random graphical line art.)

Another definition of plain text is that which contains the "readable content" of the text. As we've discussed here, this definition has edge cases; some content is traditionally left to styling. Example: some of the small words in some Scandinavian languages are routinely italicized to disambiguate their reading. Other languages use accents for this purpose, sometimes without recognizing either the accented letter as part of the alphabet or the accented form as a dictionary entry. This nicely shows that this level of disambiguation is intuitively viewed as less orthographic, and the same applies to the cases where italics are used for the same purpose.

In some contexts (Western math) the scope of readable content is different from that of ordinary text. Therefore, this definition of "plain text" isn't universal.

In principle, you could argue that your definition of readable content should apply; however, as a standard, Unicode will insist on limiting the encoding to text elements required by some common, widely shared and reasonably agreed-upon definition of plain text, one corresponding to a particular division between text elements and styling.

So far, we have ordinary text, math and phonetics, but we don't have an agreement that reproducing all variations in manuscripts is in scope.

A./

