On Tue, 27 Nov 2018 at 20:47, Grant Taylor via cctalk
<cctalk@classiccmp.org> wrote:
>
> I don't think that HTML can reproduce fixed page layout like PostScript
> and PDF can. It can make a close approximation. But I don't think HTML
> can get there. Nor do I think it should.
There is a wider panoply of options to consider. For instance, Display
PostScript, and come to that, arguably, NeWS. Also, modern
document-specific markups. I work in DocBook XML, which I dislike
intensely.

There's also, at another extreme, AsciiDoc (and Markdown, in various
"flavours"), reStructuredText, and similar "lightweight" MLs:

http://hyperpolyglot.org/lightweight-markup

But there are, of course, rivals. DITA is also widely used. And of
course there are things like LyX/LaTeX/TeX, which some find readable. I
am not one of them. But I get paid to do DocBook; I don't get paid to do
TeX.

Neal Stephenson's highly enjoyable novel /Seveneves/ contains some
interesting speculations on the future of the Roman alphabet and what
close contact with Cyrillic over a period will do to it.

Aside: [[

> I'm not personally aware of any cases where ASCII limits programming
> languages. But my ignorance does not preclude that situation from
> existing.

APL and ColorForth, as others have pointed out.

> I have long wondered if there are computer languages that aren't rooted
> in English / ASCII.

https://en.wikipedia.org/wiki/Qalb_(programming_language)

More generally:

https://en.wikipedia.org/wiki/Non-English-based_programming_languages

Personally I am more interested in non-*textual* programming languages.
A trivial candidate is Scratch:

https://scratch.mit.edu/

But ones that entirely subvert the model of using linear files
containing characters that are sequentially interpreted are more
interesting to me. I blogged about one family I just discovered last
week:

https://liam-on-linux.livejournal.com/60054.html

The videos are more or less _necessary_ here, because trying to describe
this in text will fail _badly_. Well worth a couple of hours of anyone's
time.

]]

Anyway. To return to text encodings. Again I wish to refer to a novel:
Kim Stanley Robinson's "Mars trilogy" -- /Red Mars/, /Green Mars/ and
/Blue Mars/.
Or, as a friend called them, "RGB Mars" or even "Technicolor Mars".

A character presents an argument that if you try to summarise many
things on a single scale -- e.g., for text encodings, from simplicity
and readability to complexity and capability -- you can't encapsulate
any sophisticated system. He urges a four-cornered system instead, using
the example of the "four humours": phlegm, bile, choler and blood. The
opposed corners of the diagram are as important as the sides of the
square; characteristics form the corners, but the intersections between
them are what define us.

So. There is more than one scale here.

At one extreme, we could have the simplest possible text encoding.
Something like Morse code or Braille, which omits almost all "syntax":
almost no punctuation, no carriage returns or anything like that. Those
are _metadata_ -- information about how to display the content, not
content itself. Not even case is encoded: no capitals, no minuscule
letters. But of course a number of alphabets don't have that
distinction, and it's not essential in the Roman alphabet.

Slightly richer, but littered with historical baggage from its origins
in teletypes: ASCII.

Much richer, but still not rich enough for all the
Roman-alphabet-using languages: ANSI.

Insanely rich, but still not rich enough for all the written languages:
Unicode. (Which plane? Which encoding? Which version, even?)

At the other extreme, there are markup languages that either weren't
really intended for humans but often are written by them -- e.g. the
SGML/XML family -- or are only usable by relatively few humans -- e.g.
the TeX family -- or that are almost never written by humans, e.g.
PostScript or HP PCL.

And there is what I find a fairly happy medium: AsciiDoc, say. Perfectly
readable by untrained people as plain ASCII, it can be written with mere
hours of study, if that, but can also be processed and rendered into
something much prettier.
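The "which encoding?" question about Unicode is easy to demonstrate. A
minimal Python sketch (the string is my own example, not from the
thread): one short piece of Latin-alphabet text produces three different
byte sequences under three Unicode encoding forms, and plain ASCII
cannot represent it at all.

```python
# One string, three Unicode encoding forms, three different sizes.
s = "naïve café"  # 10 characters, two of them outside ASCII

print(len(s.encode("utf-8")))     # 12 bytes: ï and é take 2 bytes each
print(len(s.encode("utf-16-le"))) # 20 bytes: 2 bytes per BMP character
print(len(s.encode("utf-32-le"))) # 40 bytes: 4 bytes per character

# ASCII simply has no code points for ï or é:
try:
    s.encode("ascii")
except UnicodeEncodeError:
    print("not representable in ASCII")
```

The same logical text, four incompatible byte-level representations --
which is exactly the kind of richness-versus-simplicity trade-off under
discussion.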
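To illustrate how mechanical that "render into something prettier" step
can be, here is a toy Python sketch -- a hypothetical converter of my
own, not any real AsciiDoc or Markdown processor -- that turns the
traditional email emphasis conventions (*bold*, /italic/, _underline_)
into HTML:

```python
import re

def render(text):
    """Toy renderer for traditional email emphasis markers.

    *word* -> <b>word</b>, /word/ -> <i>word</i>, _word_ -> <u>word</u>.
    Real lightweight-markup processors handle far more edge cases;
    this only shows that the plain-ASCII source stays readable while
    still being machine-renderable.
    """
    text = re.sub(r"\*(\w[\w ]*?)\*", r"<b>\1</b>", text)
    text = re.sub(r"/(\w[\w ]*?)/", r"<i>\1</i>", text)
    text = re.sub(r"_(\w[\w ]*?)_", r"<u>\1</u>", text)
    return text

print(render("a *bold* and /italic/ and _underlined_ word"))
# -> a <b>bold</b> and <i>italic</i> and <u>underlined</u> word
```

The untreated input is still perfectly legible as plain ASCII, which is
the whole appeal of the lightweight-markup family.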
The richer the encoding, the harder it is for *humans* to read, and the
more complex the software to handle it needs to be.

So, yes, ASCII is perhaps too minimal. ANSI is just a superset. But I'd
argue that there _should_ be a separation between at least two, maybe
three levels, and arguably more:

#1 Plain text encoding. Ideally able to handle all the characters in
all forms of the Latin alphabet, and single-byte based. Drop ASCII
legacy baggage such as backspace, bell, etc.

#2 Richer text, with simple markup, but human-readable and
human-writable without needing much skill or knowledge. Along the lines
of Markdown or *traditional* /email/ _formatting_, perhaps.

#3 Formatted text, with embedded control codes. The Oberon OS does
this.

#4 Full 1980s word-processor-style document, with control codes,
formatting, font and page layout features, etc.

#5 Number 4, plus embedded objects and graphics. I'm thinking of PDF or
the like as my model.

Try to collapse all these into one and you're doomed.

--
Liam Proven - Profile: https://about.me/liamproven
Email: lpro...@cix.co.uk - Google Mail/Hangouts/Plus: lpro...@gmail.com
Twitter/Facebook/Flickr: lproven - Skype/LinkedIn: liamproven
UK: +44 7939-087884 - ČR (+ WhatsApp/Telegram/Signal): +420 702 829 053