On Thu, Oct 24, 2013 at 7:30 AM, Sandro Magi <[email protected]> wrote:
> On 24/10/2013 10:10 AM, Jonathan S. Shapiro wrote:
>
>> One consequence is that HTML is not inherently a line-oriented format.
>> You /can/ maintain a file in line-oriented form if you do so by hand, but
>> the HTML editors generally won't.
>
> Seems silly, considering an LF doesn't take any more storage than a space
> char, and the output is then human readable, and by other tools (like diff
> as you mentioned).

There's nothing *wrong* with emitting NL's, but the tool has no way to
know when to insert them. Paragraph flow/fill is a consequence of
rendering; it isn't part of the data model. You're walking a DOM tree,
possibly without access to a DTD, so you don't necessarily know which
elements have significant white space and which don't. And even if you
do, the rendering properties of the space can be altered by the CSS
white-space property. In XML, the standard specifically says that the
lexer/parser *can't* remove white space. I'm not sure what the rule was
in SGML.

What I think is happening with the editing applications is that they are
"cleaning" the input so that the editor can behave in a sane fashion. If
NL is insignificant for rendering purposes, it can be a little mind
bending to do cursor management properly. Did I click before or after
the NL? Should the current selection be able to include something I
can't see is there? For WYSIWYG purposes, does an NL render as a line
break (which is what <br/> is for) or not? If it doesn't, how does the
user know where they are? The smart move here is "show white space" for
NL and TAB, but that confuses a lot of users.

From the editor perspective, you also don't want CR to be an input
character. In most contexts, you want CR to mean "end all current
elements out to the most closely containing vertical element, start a
new vertical element of the same type, and insert any non-conditional
elements that the DTD requires for that vertical element".
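To make the CR rule concrete, here is a rough Python sketch of the first
two steps only. It assumes the editor already has the list of open
elements at the cursor and has somehow been handed the set of vertical
elements; the names handle_return and VERTICAL are made up for
illustration, and the step that inserts DTD-required children is left
out.

```python
# Hypothetical set of "vertical" elements. In a real editor this would
# have to come from the active style sheet, not from a fixed table.
VERTICAL = {"body", "div", "p", "li", "blockquote"}


def handle_return(open_elements, vertical=VERTICAL):
    """Return (to_close, to_open) for a Return keypress.

    open_elements lists element names from the root down to the cursor,
    e.g. ["body", "p", "em"].
    """
    # Find the most closely containing vertical element.
    for i in range(len(open_elements) - 1, -1, -1):
        if open_elements[i] in vertical:
            break
    else:
        raise ValueError("no containing vertical element")

    # End all current elements out to (and including) that element,
    # innermost first.
    to_close = list(reversed(open_elements[i:]))
    # Start a new vertical element of the same type.
    to_open = [open_elements[i]]
    return to_close, to_open
```

With the cursor inside <body><p><em><strong>, this closes strong, em and
p, then opens a fresh p. Whether the inline elements should be reopened
inside the new paragraph is a separate design choice that the sketch
does not address.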
That rule is actually pretty tricky to implement, since the DTD doesn't
tell you which elements are vertical and which are horizontal. That's
determined by CSS, and no two style sheets need to agree about the
answer (possibly with reason).

All of which is a very long-winded way of saying that WYSIWYG editing of
[X]HTML/XML is a very hard problem with a lot of context ambiguity. It
isn't an accident that useful free HTML editors still don't exist.

shap
