>[EMAIL PROTECTED] (ewitness - Ben Fowler) wrote:
>[snip]
>> I don't mind admitting that as an outsider to the XML standard, this
>> looks like a bad, even a really bad, idea.
>>
>> My reading of your commentary is "Whitespace is sometimes respected,
>> and only a language lawyer can tell you when".
>
>Well, in some sense you are right, there are a lot of "really
>bad ideas" hidden in this area. However, you have to see this
>in context.

I most certainly am looking at it in context. I was trying to do something simple and intuitive and it turned out gnarly and difficult. XML is meant to build on other things such as SGML, DSSSL and HTML by avoiding their mistakes.

>A *real* typesetter doesn't care about whitespace and line feeds,
>he thinks in paragraphs and columns and pages of flowing text,
>with various indentations and margins and such.

Exactly so, and he thinks of leading and line height, and he thinks of paragraphs with 'space before' and 'space after'. I am prepared to argue that FO is a 'real' typesetter here, and should 'think' the same way.

>TeX was practically written to support this view, and this is
>the default how FO processors work.

Quoting from the XML-dev list, a gentleman wanted to play space cadets and we got Unix, another gentleman wanted to distribute his phone list and we got the World Wide Web. Pretty much every worthwhile advance in the computer field has come from one person with a problem to solve. TeX came about because Professor Knuth <URL: http://www-cs-faculty.stanford.edu/~knuth/ > knew that computers could aid typesetting: it was written with one practical aim, rather than to support a view.

I don't see how you can argue that because TeX has \newline and \par it follows that FO should not have a semantic <br /> or forced line-break.

>The problem: not everybody is a typesetter, many people don't
>even know about how to set indents and hanging indents and margins
>and this stuff, but they have a space and an enter key sitting
>squarely on their keyboard.

I may have misread you, but I think that you have intertwined two, possibly three things.

1. Not everybody is a typesetter ...

Exactly, this is why there is a division of skill or labour. Authors write and typesetters mark up and set text.
This is TeX 101, exempli gratia <URL: http://www.ideography.co.uk/library/seybold/WYS_intro.html >, and <URL: http://www.ecn.wfu.edu/~cottrell/wp.html >

The author of a text should, at least in the first instance, concentrate entirely on the first of these sets of tasks. That is the author's business. Adam Smith famously pointed out the great benefits that flow from the division of labor. Composition and logical structuring of text is the author's specific contribution to the production of a printed text. Typesetting is the typesetter's business. This division of labour was of course fulfilled in the traditional production of books and articles in the pre-computer age. The author wrote, and indicated to the publisher the logical structure of the text by means of various annotations. The typesetter translated the author's text into a printed document, implementing the author's logical design in a concrete typographical design. One only has to imagine, say, Jane Austen wondering in what font to put the chapter headings of Pride and Prejudice to see how ridiculous the notion is. Jane Austen was a great writer; she was not a typesetter.

You may be thinking this is beside the point. Jane Austen's writing was publishable; professional typesetters were interested in laying it out and printing it. You and I are not so lucky; if we want a printed article we will have to do it ourselves (and besides, we want it done much faster than via traditional typesetting). Well, yes and no. We will in a sense have to do it ourselves (on our own computers), but we have a lot of help at our disposal. In particular we have a professional-quality typesetting program available. This program (or set of programs) will in effect do for us, for free and in a few seconds or fractions of a second, the job that traditional typesetters did for Shakespeare, Jane Austen, Sir Walter Scott and all the rest.
We just have to supply the program with a suitably marked-up text, as the traditional author did.

I am suggesting, therefore, that there should be two distinct ``moments'' in the production of a printed text using a computer. First one types one's text and gets its logical structuration right, indicating this structuration in the text via simple annotations. This is accomplished using a text editor, a piece of software not to be confused with a word processor. (I will explain this distinction more fully below.) Then one ``hands over'' one's text to a typesetting program, which in a very short time returns beautifully typeset copy.

2. If misuse of the tab/space/enter keys is the problem, then always ignoring (unescaped) whitespace is part of the solution.

Typists can add spaces and tabs as they think fit, see <URL: http://ricardo.ecn.wfu.edu/~cottrell/emacs-screen.jpg > (as if you need to), without this getting anywhere near the layout 'engine'. My problem is that whitespace is sometimes significant; I am amazed that nobody else sees this as a problem too.

3. Sometimes when typing a document one needs to end a line, and sometimes a paragraph.

The return key is used to end a paragraph. It follows that some other means is needed to end a line, exempli gratia [SHIFT][RETURN] (perhaps you remember WordPerfect...), and this should be stored properly in the file.

>The correct way to express
>
>procedure foo();
>  begin
>    dostuff:=false;
>  end
>
>would be something like:
><fo:block>
>  <fo:block>foo();</fo:block>
>  <fo:block margin-left="1em">
>    <fo:block>begin</fo:block>
>    <fo:block margin-left="2em">
>      <fo:block>dostuff:=false;</fo:block>
>    </fo:block>
>    <fo:block>end</fo:block>
>  </fo:block>
></fo:block>
>but chances are you'll get it space- or even (shudder!) tab-indented.
>(Take a postal address block for another, less IT-related example)

(I did). Please don't think I intend to cause offence, but I think that you have stumbled across a trap, and possibly into it.
XML, whilst written as, and intended for, structural mark-up, is presentation-neutral. By which I mean that incorporating presentational details in FO does not harm its XMLness, and it can still be handled with normal XML tools.

There is a possible 'error of conflation' in the example you gave, in the sense that, to use your phrase, the correct way to express a code procedure is something like

<proc>
  <name>foo</name>
  <body>
    <statement>
      <expression>dostuff:=false</expression>
    </statement>
  </body>
</proc>

and wouldn't involve fo: at all at any stage where you are thinking in terms of "getting it space or tab indented". (Note that, plain text being XML without the tags, a compiler capable of generating a parse tree could generate that XML, and do better, marking up the identifiers and operators, for example.)

When we come to generate a FO file, we will definitely need a lot of line breaks for the short lines (the expressions and statements), and a lot of paragraph breaks at the end of procedures and control structures, and particularly at the end of comments, which flow differently from the source itself. Your example was far too short to bring out the points under discussion.

There is already a considerable body of expertise in storing code in XML format, see, for example, the Ant project, <URL: http://jakarta.apache.org/ant/index.html >, and <URL: http://craigc.com/pg/chap11.html >. There are probably better examples as well, but I don't have URLs to hand.

I really don't see how code can be marked up for formatting (which is how I see FO) without using both line breaks and paragraph breaks, unless you remove all structure and emit only presentation, glyphs - which is the paper equivalent of the single pixel gif for 'marking up' web pages - what you see is all you get. This is a PostScript/PDF type of solution, where I am looking for a TeX type of solution. If I am SOL, I will just have to pipe down.

Is this how you see FO? I can't prove you wrong.
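
For what it is worth, the structure-first route can be sketched in a few lines. This is only an illustration, assuming Python and its standard xml.etree.ElementTree module; the element names are the ones from my <proc> example, the margins follow the FO you quoted, and the function name proc_to_fo is made up for the sketch. It is not how any particular FO tool chain works.

```python
import xml.etree.ElementTree as ET

# The XSL-FO namespace; register the conventional "fo" prefix for output.
FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)

# Structural mark-up only: no indentation, no presentation (illustrative input).
SOURCE = """<proc>
  <name>foo</name>
  <body>
    <statement><expression>dostuff:=false</expression></statement>
  </body>
</proc>"""


def fo(tag):
    """Return the namespace-qualified name of an FO element."""
    return "{%s}%s" % (FO_NS, tag)


def proc_to_fo(proc):
    """Hypothetical helper: one fo:block per source line, margins for nesting."""
    outer = ET.Element(fo("block"))
    ET.SubElement(outer, fo("block")).text = "procedure %s();" % proc.findtext("name")
    body = ET.SubElement(outer, fo("block"), {"margin-left": "1em"})
    ET.SubElement(body, fo("block")).text = "begin"
    stmts = ET.SubElement(body, fo("block"), {"margin-left": "2em"})
    for stmt in proc.iterfind("body/statement"):
        # Each statement becomes its own block, i.e. its own line.
        ET.SubElement(stmts, fo("block")).text = stmt.findtext("expression") + ";"
    ET.SubElement(body, fo("block")).text = "end"
    return outer


fo_text = ET.tostring(proc_to_fo(ET.fromstring(SOURCE)), encoding="unicode")
print(fo_text)
```

The indentation here is decided once, by the transformation, and never typed by anyone; the price is that every line break has to be spelled as a nested block.
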
I am asking for structure (lines and paragraphs) to co-exist with presentation. I have no refutation of a proposal that, like Dante, we should abandon structure when we enter FO; FO will certainly work like that, just as galleys do. I merely think that you are rejecting the XMLness of FO, and I can't see why, and I certainly can't see any benefit. Cui bono? You seem to be making it difficult for the human reader and user (if any there be) of FO for no actual benefit.

>[If i'd get a chance to correct the past, i probably kill the
>inventor of the tab character before he commits his crime :-]

That is a bit extreme. The tab or tabulator key enables one to create tables using tab stops instead of the space bar. When I used a typewriter (or WordPerfect) I found the tab key (or its successor, tables) extremely useful.

>There is a lot of whitespace formatted data out there,

Which is why I attempt to claim that whitespace formatted data is XML without the tags. When I need to process that data, I would like to insert the tags, for the benefit of XSLT processing, and for the benefit of the human reader. This is why I need a line break tag.

>You might have noted that in HTML+CSS <br> actually *is* redundant,

I am not complaining about redundancy. Given that the 'respect whitespace' mechanism is in the specifications, all I can ask is that we have two mechanisms (also redundancy).

>it is just heavily (ab)used because it produces predictable results

If the results are predictable, how can it be abuse? About the 7th of the commandments of usability is never to force one's users to choose between the easy way and the right way of doing something.

>without fumbling with gnarly CSS settings.

CSS settings???

>Especially if you have to bring already whitespace formatted data online
>*quickly*.

But we all do this, all the time.
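
As a rough sketch of what I mean by "inserting the tags" into whitespace formatted data: the following assumes Python with only the standard library, treats a blank line as end of paragraph and a single newline as a forced line break, and throws away leading spaces and tabs. The element names doc, p and br, the function name tag_plain_text and the sample address are all invented for the illustration.

```python
import re
import xml.etree.ElementTree as ET


def tag_plain_text(text):
    """Hypothetical helper: blank line ends a paragraph, newline ends a line."""
    doc = ET.Element("doc")
    for para in re.split(r"\n\s*\n", text.strip()):
        # Leading and trailing spaces or tabs are typist's habit, not content.
        lines = [ln.strip() for ln in para.splitlines() if ln.strip()]
        if not lines:
            continue
        p = ET.SubElement(doc, "p")
        p.text = lines[0]
        for line in lines[1:]:
            # A semantic line break, with the next line as its tail text.
            br = ET.SubElement(p, "br")
            br.tail = line
    return doc


# A made-up address block, space- and tab-indented, plus a second paragraph.
SAMPLE = "A. N. Author\n  1 Example Street\n\tExampletown\n\n\nA second paragraph.\n"

tagged = ET.tostring(tag_plain_text(SAMPLE), encoding="unicode")
print(tagged)
```

Once the breaks are explicit tags, the spaces and tabs the typist added never get near the layout engine, and an XSLT stylesheet can match on the br and p elements directly.
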
One good method is to ask authors with this background to use StarOffice to create their pages their way (note the congruence of the easy way and the right way), save them as XML and convert to HTML (XHTML), PDF, RTF with XSLT. A perfect division of labour. (This came up on the WYLUG list recently. It works.) <URL: http://wylug.org.uk/pipermail/wylug-discuss/2002-February/001846.html >

>Typewriter habits are hard to get rid of, regardless how enraged
>professionals are about this.

Which is why I say [RETURN] for end of paragraph - </p>, and [SHIFT][RETURN] for end of line - <br />; to make the easy way the right way.

Ben.