Re: [whatwg] HTML spec incorrectly suggests that br can have its rendering changed with CSS
Daniel Holbert dholb...@mozilla.com wrote:
So: to reflect reality, it might be better to specify br in a way that doesn't suggest it's as customizable with CSS. (for the white-space property in particular, but probably others as well) For reference, here's a page with a few testcases: http://people.mozilla.org/~dholbert/tests/br-tests.html The browsers that I tested[1] all agree on the rendering (basically, not honoring any of the br styling), with one minor exception[2]. Thanks, ~Daniel
[1] I tested the following browsers: Firefox 26, Opera 12.16, Chrome 34.0.1788.0 dev, IE 11
[2] I only noticed one rendering difference -- IE11 honors border on br, unlike the other browsers that I tested. (It still doesn't honor e.g. display/width/height, though.)
I get different results on your test case for the bottom two tests. In Chrome 33 and Opera 12.16 (Linux), there is a line break; in Firefox 26 there isn't. This matches a fault report that we had from a customer a few years ago about a page that didn't lay out properly in our browser (but did in Opera). I tracked it down to the fact that we permitted br elements to be styled, just like Firefox (26.0) does. I've put a suitably anonymised version of the test case on my own website: http://www.metahusky.net/~stewart/css/br/br-rendering.html
And yes, the real page really did have the first line of its stylesheet as:
* { position: absolute; margin: 0px; float: left }
-- Stewart Brodie Team Leader - ANT Galio Browser ANT Software Limited
Re: [whatwg] [Notifications] Constructor should not have side effects
Glenn Maynard gl...@zewt.org wrote: On Tue, Jan 29, 2013 at 7:36 AM, Charles McCathie Nevile cha...@yandex-team.ru wrote: Really? This doesn't seem like a good idea, so I'd be interested to know why. Is there an explanation laid out somewhere? Just to ask from another perspective: why doesn't it seem like a good idea? All the information required to activate the object may not be available at the point at which the object is constructed. For example, how can you add event listeners to something that doesn't exist yet? Particularly if you want to use a closure with the new object. Having objects that begin their job when constructed simply avoids an extra step for the user (telling it to start), and reduces the number of possible states (eg. eliminating the UNSENT state), which generally simplifies things. Supporting reuse of objects is generally not a useful optimization (in my experience), so not supporting it also simplifies things a bit. Reducing the number of different-but-equivalent ways of doing the same thing is also generally good API design. I agree - but I don't see what this has to do with separating construction from activation. -- Stewart Brodie Team Leader - ANT Galio Browser ANT Software Limited
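The point about attaching listeners before activation can be illustrated with a minimal sketch. DeferredTask, its start() method and the "show" event are hypothetical names chosen for illustration; this is not the Notifications API, only a model of keeping construction and activation as separate steps.

```javascript
// Hypothetical class (not the Notifications API): construction only sets up
// state, and activation is a separate, explicit step.
class DeferredTask extends EventTarget {
  constructor(label) {
    super();
    this.label = label;
    this.started = false;
  }

  // Nothing observable happens until start() is called, so listeners can be
  // registered on the finished object first.
  start() {
    this.started = true;
    this.dispatchEvent(new Event("show"));
  }
}

const seen = [];
const task = new DeferredTask("demo");

// The closure refers to `task`, which already exists at this point.
task.addEventListener("show", () => seen.push(`${task.label} shown`));

task.start();
console.log(seen);
```

If the constructor dispatched the event itself, the listener above could not have been registered in time unless dispatch were deferred.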
Re: [whatwg] A plea to Hixie to adopt main
Steve Faulkner faulkner.st...@gmail.com wrote: Hi Cory, I don't know if this is relevant at all, but according to the spec (section 4.4.1), The body element represents the main content of the document. What would you say is the relation between this use of the term main and your use of the term here? Might it perhaps be more accurate to state, The body element represents the *entire* content of the document (or something like that). I don't really know -- just asking. I filed a bug about this recently: https://www.w3.org/Bugs/Public/show_bug.cgi?id=19967 It doesn't necessarily. I've come across pages that expect the head to be displayed too. e.g. tests at http://meyerweb.com/eric/css/tests/css3/ like http://meyerweb.com/eric/css/tests/css3/show.php?p=caption-side -- Stewart Brodie Team Leader - ANT Galio Browser ANT Software Limited
Re: [whatwg] A plea to Hixie to adopt main
Steve Faulkner faulkner.st...@gmail.com wrote:
Stewart wrote: It doesn't necessarily. I've come across pages that expect the head to be displayed too. e.g. tests at http://meyerweb.com/eric/css/tests/css3/ like http://meyerweb.com/eric/css/tests/css3/show.php?p=caption-side
Is this a common mark up pattern?
I've not gone looking for any other real-world examples - that's the only one I've seen. However, I can't think of any reason why it shouldn't work, as it's just a block box like the body element (usually) is.
-- Stewart Brodie Team Leader - ANT Galio Browser ANT Software Limited
Re: [whatwg] Proposal for window.DocumentType.prototype.toString
Johan Sundström oyas...@gmail.com wrote: Hi everybody! Serializing a complete HTML document DOM to a string is surprisingly hard in javascript. Does XMLSerializer().serializeToString(document) not meet your requirement? -- Stewart Brodie Team Leader - ANT Galio Browser ANT Software Limited
Re: [whatwg] [html5] Question on the structured cloning algorithm
Ian Hickson i...@hixie.ch wrote: On Tue, 24 May 2011, Stewart Brodie wrote: Do getters need to be called to obtain a value which can be stored (after being cloned itself) in the result? I'm not sure I follow the question. Can you elaborate? Are getters called during cloning? i.e. what do I get if I clone this: { get a() { return 1 } } Do I get { a: 1 } or do I get {} ? -- Stewart Brodie Team Leader - ANT Galio Browser ANT Software Limited
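The question above can be checked empirically in a runtime that exposes a global structuredClone() function. That availability is an assumption of this sketch (the function is a later addition to the platform than this thread), and the result shows what one engine does rather than what the 2011 draft required.

```javascript
// Assumes a runtime that provides the global structuredClone() function.
let getterCalls = 0;

const original = {
  get a() {
    getterCalls += 1; // count how often cloning reads the property
    return 1;
  },
};

const clone = structuredClone(original);
const descriptor = Object.getOwnPropertyDescriptor(clone, "a");

console.log(clone.a);               // the value that the getter returned
console.log(typeof descriptor.get); // no getter on the clone: a plain data property
console.log(getterCalls > 0);       // the getter was invoked while cloning
```

In other words, the result is { a: 1 } rather than {}: the getter is called to obtain the value, but the getter itself is not copied.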
[whatwg] [html5] Question on the structured cloning algorithm
The section on the structured cloning algorithm has a Note that says: "Property descriptors, setters, getters, and analogous features are not copied in this process." Is this note part of the normative definition of the algorithm, or just a non-normative helpful explanatory note? The typographic convention description set out in section 1.8.2 doesn't say either way. Do getters need to be called to obtain a value which can be stored (after being cloned itself) in the result?
-- Stewart Brodie Team Leader - ANT Galio Browser ANT Software Limited
Re: [whatwg] Media elements statistics
Steve Lacey s...@chromium.org wrote: [Media elements] Another open question: what are sensible values if the information is not available. Zero seems wrong. This is a question that I have considered for some time for all the properties in HTMLMediaElement interface, not especially for your new proposed statistics. I have not come to any particular conclusions as yet. One option is to try to invent plausible values for the property values. e.g. you can seek anywhere. For us, with a third-party black box media player sitting at several levels of abstraction away (and sometimes on a separate processor), we do not have access to most of the information that is necessary to drive the algorithms. For example, network usage (if indeed there is any at all), which frames you have, seekable ranges, buffering statistics - all of these are unavailable. -- Stewart Brodie Team Leader - ANT Galio Browser ANT Software Limited
Re: [whatwg] [html5] Attaching option elements to select elements in different documents
Boris Zbarsky bzbar...@mit.edu wrote: On 3/3/10 12:11 PM, Stewart Brodie wrote: As far as I can tell, this affects: HTMLSelectElement.add(), HTMLOptionsCollection.add(), Node.appendChild(), Node.replaceChild(), Node.insertBefore(). Is it option-specific, though? Last I checked, various browsers implicitly adopted on append/insert/replace, period. Since when? I was sure that they didn't used to do this. DOM Core is extremely clear on this issue (both in level 2 and level 3). You appear to be correct: Firefox and Opera both just ignore the standard and get this wrong. Chrome just seems to get confused. -- Stewart Brodie Team Leader - ANT Galio Browser ANT Software Limited
Re: [whatwg] [html5] Attaching option elements to select elements in different documents
Anne van Kesteren ann...@opera.com wrote:
On Thu, 04 Mar 2010 11:27:23 +0100, Stewart Brodie stewart.bro...@antplc.com wrote:
Boris Zbarsky bzbar...@mit.edu wrote:
On 3/3/10 12:11 PM, Stewart Brodie wrote: As far as I can tell, this affects: HTMLSelectElement.add(), HTMLOptionsCollection.add(), Node.appendChild(), Node.replaceChild(), Node.insertBefore().
Is it option-specific, though? Last I checked, various browsers implicitly adopted on append/insert/replace, period.
Since when? I was sure that they didn't use to do this. DOM Core is extremely clear on this issue (both in level 2 and level 3). You appear to be correct: Firefox and Opera both just ignore the standard and get this wrong. Chrome just seems to get confused.
This changed a while ago due to compatibility problems. Consensus at the time was to change DOM Core.
Is this documented anywhere? By "compatibility problems", presumably you mean bugs in Firefox that were then exploited by content authors who didn't know better? From Maciej's description of WebKit's behaviour, it looks like either they didn't know about this consensus or they didn't implement it compatibly. This definitely needs to be documented in HTML5. Are there any more retrospective changes to fundamental behaviour specified in DOM Core in the pipeline that I need to know about? I already know about the one in DOM Events about capturing listeners being called in the target phase.
That leaves the issue of how adoptNode() affects the [[Prototype]] of the node objects, which is currently inconsistent between desktop browsers. Opera and Chrome agree with each other (the [[Prototype]] is unchanged); Firefox disagrees (it changes the [[Prototype]] to what it would have been if the node had been created anew in the destination document).
-- Stewart Brodie Team Leader - ANT Galio Browser ANT Software Limited
[whatwg] [html5] Attaching option elements to select elements in different documents
The algorithm in the HTML5 specification for attaching an option element to a select element is incomplete, because it doesn't describe how to handle the case where the option element does not belong to the same document as the select element. It seems that HTMLOptionElement objects are immune to WRONG_DOCUMENT_ERR exceptions on any tree modifications. Thus the HTML5 specification also needs to note that it is overriding the rules from DOM Core about what may be attached to what. I've written some proposed changes further below. As far as I can tell, this affects: HTMLSelectElement.add(), HTMLOptionsCollection.add(), Node.appendChild(), Node.replaceChild(), Node.insertBefore(). My tests show that this isn't even confined to the cases where the new parent node is an HTML select element - any cross-document attachment of option elements operates as though the same-document check has been bypassed. In fact, the behaviour I'm seeing looks very much like an implicit adoptNode() call has occurred. I'm basing that suspicion on a comparison of the (equally inconsistent) behaviour of adoptNode() in different browsers[*] I'm testing this from ECMAScript in my test page which is at: http://www.metahusky.net/~stewart/css/html-options/ In all browsers, if the insertion of the option succeeds, then the inserted option element compares strictly equal to the option in the receiving select element. i.e. the option tree has not been cloned. In some browsers, the [[Prototype]] of the HTMLOptionElement is reset to be HTMLOptionElement.prototype of the receiving document's script context; in others, it does not get changed. However, in all browsers, all the nodes in the option's subtree are affected similarly (i.e. if the option's [[Prototype]] changes, so does the text node's) In some browsers, you can only insert the option element if the option element is not currently attached to anything else. 
In some browsers, the option isn't inserted at the right index into the receiving select, but I think that must just be a different bug. I propose the following changes to the specification: Change 1: Renumber existing step 7 to step 8 and insert a new step 7 in http://www.whatwg.org/specs/web-apps/current-work/multipage/urls.html#htmloptionscollection 7. If _element_ does not belong to the same document as _parent_, then act as if the DOM Core adoptNode() method was invoked on the _parent_ node's ownerDocument with _element_ as the parameter. [Aside: whilst in the vicinity, shouldn't step 3 be using node rather than element i.e. If _before_ is a *node*, but that *node* ...? Otherwise, I could legitimately insert it before any text node anywhere in the document. Should it require that _before_ has to be an option or optgroup?] Change 2: Append some text to section 2.2.1 (Conformance Requirements - Dependencies) to indicate the change to DOM Core, and include a link to the text added by change 3: Some requirements in this specification are a wilful violation of constraints imposed by the DOM Core specification [DOMCORE]: * attaching _option_ elements to different documents is permitted Change 3: append explanatory text, linked from change 2's text to: http://www.whatwg.org/specs/web-apps/current-work/multipage/the-button-element.html#the-option-element If any attempt is made to attach an _option_ element to a node in a Document other than the Document of the _option_ element, then the user agent must not throw a _WRONG_DOCUMENT_ERR_ exception. If the tree change would otherwise succeed, then the user agent must behave as if a call to the DOM Core adoptNode() method has been made so that the Document of the _option_ element is updated. This affects the DOM Core appendChild(), insertBefore() and replaceChild() methods. Actually, all of these changes might have to say _option_ or _optgroup_. 
[*] Opera 10.10, Chrome 5.0.307.11 beta, Firefox 3.5.8, and our own ANT Galio 3.1.0 -- Stewart Brodie Team Leader - ANT Galio Browser ANT Software Limited
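The proposed step 7 can be sketched as a small helper. insertWithImplicitAdopt is a hypothetical name; the sketch uses only the DOM Core members already mentioned above (ownerDocument, adoptNode(), insertBefore()) and models the proposed wording, not what any particular browser does internally.

```javascript
// Hypothetical helper modelling proposed step 7: if the element belongs to a
// different document, act as if adoptNode() had been called on the parent's
// ownerDocument, then perform the ordinary insertion.
function insertWithImplicitAdopt(parent, element, before = null) {
  const targetDocument = parent.ownerDocument;
  if (element.ownerDocument !== targetDocument) {
    // After this call the same node object belongs to targetDocument;
    // it is moved, not cloned.
    targetDocument.adoptNode(element);
  }
  return parent.insertBefore(element, before);
}
```

In a page, parent would be the receiving select element and element an option created in another document, for example one belonging to a frame. Because the node is adopted rather than cloned, the inserted option still compares strictly equal to the original, matching the observation above.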
Re: [whatwg] [html5] Attaching option elements to select elements in different documents
Darin Adler da...@apple.com wrote:
Was your testing done with option elements created with document.createElement("option") or new Option? I ask because I seem to recall the behavior being different for at least some types of elements.
That's a good idea - I forgot to test that. I've updated my test so that it tries both. The behaviour seems to be the same, regardless of how the option is created.
-- Stewart Brodie Team Leader - ANT Galio Browser ANT Software Limited
[whatwg] [EventSource] Garbage collection rules
I've been reviewing the new EventSource draft. I'm very pleased to see it converted into a separate object, rather than being tacked onto everything that implements EventTarget. This is a huge improvement. However, there are some issues that I think need to be addressed, specifically in the area of lifetime management. The GC rules in section 9 seem overly permissive - if there is a listener for message events but the script forgets to call close() when the user navigates away, then the resources it is consuming cannot be reclaimed. There is a small chance that it may be reclaimed if the server terminates the connection and a GC occurs before the UA is able to re-establish the connection (i.e. during the reconnection delay or the reconnection), but I don't think it's wise to rely on this as it would allow malicious scripts to consume resource with no way for the user agent to recover. The simplest way to prevent this would be to modify the condition in section 9 slightly to insist that the event listener is callable, drawing on the text from HTML5's Calling scripts section 6.5.3.2#1. i.e. modify the text to say: An EventSource object with an open connection must not be garbage collected if there are any event listeners registered for message events and at least one of those listeners' global object is a Window object whose Document object is fully active. In other words, the automatic marking of the EventSource now requires that at least one of the event listeners must be callable. The only difference that this makes, I *think*, is that pages in the history lose unreferenced EventSource objects. Is this true and would it actually be a problem? -- Stewart Brodie Software Engineer ANT Software Limited
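For comparison, the explicit clean-up that a well-behaved page would have to perform under the current rules might look like the sketch below. closeOnPageHide is a hypothetical helper; the choice of the pagehide event is an assumption of this sketch, and the only thing relied on from the draft is that the EventSource object has a close() method.

```javascript
// Hypothetical helper: close the event source when the page is hidden or
// navigated away from, so the connection is not kept alive by its listeners.
function closeOnPageHide(target, source) {
  target.addEventListener("pagehide", () => source.close(), { once: true });
}
```

In a browser this would be called as closeOnPageHide(window, eventSource). The proposed change to the garbage collection rule is aimed at pages that omit this step.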
Re: [whatwg] Exposing EventTarget to JavaScript
Alex Russell slightly...@google.com wrote:
But if you addEventListener, you can have multiple listeners for a given event. The only caveat is that dispatch order is undefined.
Also a bug. It's not *actually* undefined, it's triangulated by libraries.
Actually, it is defined. They are called in registration order, from oldest to newest. This is stated in both the latest D3E working draft, and the older versions dating back to 2003 (at least - I didn't go back any further).
-- Stewart Brodie Software Engineer ANT Software Limited
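The registration-order rule is easy to observe. The sketch below assumes a runtime with the standard EventTarget and Event constructors; the event name "ping" is arbitrary.

```javascript
// Listeners on the same target for the same event type run in the order
// in which they were registered.
const target = new EventTarget();
const order = [];

target.addEventListener("ping", () => order.push("first"));
target.addEventListener("ping", () => order.push("second"));
target.addEventListener("ping", () => order.push("third"));

target.dispatchEvent(new Event("ping"));
console.log(order.join(", ")); // first, second, third
```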
Re: [whatwg] getElementsByClassName case sensitivity
Anne van Kesteren ann...@opera.com wrote: On Tue, 13 Jan 2009 11:17:08 +0100, Anne van Kesteren ann...@opera.com wrote: Since my initial e-mail did not seem to have done it, could you please take a look at the source code of the respective test and tell me if you see a problem there? http://tc.labs.opera.com/apis/getElementsByClassName/014.htm To be perfectly clear, there is no discrepancy between CSS handling and the getElementsByClassName method and the test is testing that there is not. Wow, epic fail. I missed it should match two elements. The test is indeed out of date. * updates the test now. Excellent - thanks! -- Stewart Brodie Software Engineer ANT Software Limited
Re: [whatwg] getElementsByClassName case sensitivity
Anne van Kesteren ann...@opera.com wrote: On Mon, 12 Jan 2009 15:25:33 +0100, Stewart Brodie stewart.bro...@antplc.com wrote: Ian Hickson i...@hixie.ch wrote (on 25 July 2008): I've made [getElementsByClassName] consistent with how classes work in CSS (case-insensitive for quirks and case-sensitive otherwise). I was looking for some tests for this API and found some from Opera (found at http://tc.labs.opera.com/apis/getElementsByClassName/) but given the dates on them predate the latest spec changes (which causes some to fail now), I was wondering if up to date versions are now kept somewhere else instead? The tests already take this change into account. It was agreed upon way earlier prolly over IRC or so, but the specification hadn't catched up with reality yet. I'm not sure what other tests you might believe to be out of date (and why) and would be interested in knowing being the author and all :-) Specifically: test 14 - tests for case-sensitivity in a document that is in quirks mode. Are you saying that this change has now been reversed and the comparisons are always case-sensitive, thus reintroducing the discrepancy between CSS's handling of classes and this new method? -- Stewart Brodie Software Engineer ANT Software Limited
Re: [whatwg] getElementsByClassName case sensitivity
Ian Hickson i...@hixie.ch wrote (on 25 July 2008): I've made [getElementsByClassName] consistent with how classes work in CSS (case-insensitive for quirks and case-sensitive otherwise). I was looking for some tests for this API and found some from Opera (found at http://tc.labs.opera.com/apis/getElementsByClassName/) but given the dates on them predate the latest spec changes (which causes some to fail now), I was wondering if up to date versions are now kept somewhere else instead? -- Stewart Brodie Software Engineer ANT Software Limited
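The rule quoted above (case-insensitive in quirks mode, case-sensitive otherwise) can be sketched as a pure function. classMatches and asciiLower are hypothetical helpers; treating "case-insensitive" as ASCII-only folding and splitting the class attribute on HTML whitespace are assumptions of this sketch, not quotations from the specification.

```javascript
// Lowercase only A-Z, leaving all other characters untouched.
function asciiLower(text) {
  return text.replace(/[A-Z]/g, (ch) => ch.toLowerCase());
}

// Does an element whose class attribute is `classAttr` match the single
// class name `wanted`? `quirksMode` selects the comparison rule.
function classMatches(classAttr, wanted, quirksMode) {
  const tokens = classAttr.split(/[ \t\n\f\r]+/).filter(Boolean);
  if (quirksMode) {
    const folded = asciiLower(wanted);
    return tokens.some((token) => asciiLower(token) === folded);
  }
  return tokens.includes(wanted);
}

console.log(classMatches("Foo bar", "foo", true));  // true: quirks mode ignores case
console.log(classMatches("Foo bar", "foo", false)); // false: exact match required
```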
Re: [whatwg] Revised Plan for Server-sent DOM events
Henry Mason [EMAIL PROTECTED] wrote:
There's recently been some talk about completely removing HTML 5 section 6.2, Server-sent DOM events. I propose that rather than remove, we revise. The major concerns I've heard about section 6.2 include:
- Unnecessary dependency on DOM Events
Why is this a problem? It's a facility to cause DOM events to be dispatched.
- Redundancy with already existing techniques, especially XMLHttpRequest
There are quite a lot of additional behavioural requirements for server-sent DOM events that do not apply to XMLHttpRequest, specifically the automatic binding to event-source elements, plus the automatic reconnection algorithms.
- Complicated parsing of event fields
The major problem is determining the type information for the fields of arbitrary events. In the end, I gave up on this and simply stuffed the data into the JS Event objects as strings and allowed the interpreter to worry about the numeric conversions, provided that the field name was validated.
- Inability to support cross-domain events (without the as-of-yet unimplemented and untested Access-Control HTTP header mechanism)
I don't see this as a particular problem - other facilities are waiting for that to be done too. I'd rather use the same mechanism everywhere.
- Continued problems of the 2 connection limit on HTTP server scalability
This might be alleviated somewhat, but not resolved, by moving the connections to other servers. Does anybody implement the 2 connection limit in desktop browsers anyway? Last time I actually tested, most of them appeared to be using at least 4.
I propose that we remove support for non-message events; that is, allow only events with MessageEvent interface. This will make implementations easier, as UAs will only need to parse the Bubbles, Cancelable, and data fields. The only existing implementation ... that you know of ... (Opera) seems to only use the message event part of the interface anyway.
In the few rare instances where general DOM Event bindings are needed, JavaScript parsing of the data field of the message events could be used.
I have an implementation - it does precisely that, as I mention above.
The critically cool part, however, is that since MessageEvents store their domain and URI origin, it will be safe to allow for cross-domain messaging through these server-sent events. Section 6.1 already uses this system for this very purpose. Opera has already implemented it and it has been in WebKit's trunk for about a week. The removal of the same-origin restriction actually makes this interface dramatically more useful for developers. It provides a capability (messaging with a foreign host) which is not already available to XMLHttpRequest-using applications. It also makes it easier for web developers to offload the long-running HTTP connections needed for event streams to separate servers. This aids application scalability and circumvents potential problems with the 2 HTTP connection limit.
Not really - it's still possible for applications to cause problems by trying to create 3 event streams. My implementation permits 2 event streams to any given host in addition to any used for normal accesses. Additionally, we have a class of privileged applications for which all the usual restrictions (cross-domain scripting, same origin checks, connection limit, etc.) are relaxed, precisely because we sometimes require things like cross-domain XHRs in our embedded environments.
This change would make server-sent events easier to implement for both UA implementers and web application developers and would make the developers more likely to use it.
Removing the requirement to support anything other than the MessageEvent class of events would certainly be a tremendous simplification. I'm not sure whether or not it is a good idea - it would leave us needing to perform all sorts of string parsing in our JS if we wanted to issue other types of event.
In fact, if this simplification were to be made, I'd probably have to retain this ability for compatibility with our existing applications. -- Stewart Brodie Software Engineer ANT Software Limited
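The "store everything as strings" approach described above can be sketched as follows. The "name: value" line format, the ALLOWED_FIELDS allow-list and the applyFields helper are all simplifying assumptions for illustration; they are not the wire syntax of the draft, nor the ANT implementation.

```javascript
// Hypothetical allow-list standing in for "the field name was validated".
const ALLOWED_FIELDS = new Set(["data", "detail", "screenX", "screenY"]);

// Copy validated fields from a block of "name: value" lines onto an
// event-like object. Values are stored as strings; no type information is
// inferred, so any numeric conversion is left to the script that reads them.
function applyFields(eventLike, block) {
  for (const line of block.split("\n")) {
    const colon = line.indexOf(":");
    if (colon <= 0) continue; // skip blank or malformed lines
    const name = line.slice(0, colon).trim();
    const value = line.slice(colon + 1).trim();
    if (ALLOWED_FIELDS.has(name)) {
      eventLike[name] = value;
    }
  }
  return eventLike;
}

const received = applyFields({}, "screenX: 120\nbogus: 1\ndata: hello");
console.log(typeof received.screenX, received.screenX * 2); // string 240
```

The multiplication works because the language converts the string operand on use, which is the behaviour the approach relies on. The unvalidated "bogus" field is dropped.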
Re: [whatwg] more discussion regarding codecs (Was: whatwg Digest, Vol 45, Issue 16)
Ian Hickson [EMAIL PROTECTED] wrote: There is no way we can ever guarantee that there are no covering patents. Whether a patent covers a technology or not really has more to do with what the courts say than with what the patents say. If Apple say they don't want to implement Ogg, then we have to find another solution. (Similarly -- Opera, Mozilla, et al, don't want to implement H.264. So we have to find a solution other than H.264.) Is there any codec that would satisfy everybody? I doubt it, to be honest. -- Stewart Brodie Software Engineer ANT Software Limited
Re: [whatwg] Parsing: Greater-than characters in doctype
Simon Pieters [EMAIL PROTECTED] wrote:
All browsers terminate the doctype at the first > character, even if it's inside the public identifier or system identifier.
I see this sort of comment a lot - I think it would be really helpful if people could state which browsers they have actually tested, because you clearly cannot have tested all browsers. IE, Firefox, Safari and Opera aren't all browsers (especially if you only test one specific version).
-- Stewart Brodie Software Engineer ANT Software Limited
Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?
Robert Sayre [EMAIL PROTECTED] wrote:
On 11/29/06, Lachlan Hunt [EMAIL PROTECTED] wrote:
I do not think it's a good idea to make the trailing slash conforming. Although it is harmless, it provides no additional benefit at all and it creates the false impression that the syntax actually does something.
It does do something, in systems that think they are using XML (whether they actually are is another matter). It's possible it will prevent many information-free validation errors, and give the HTML5 more credibility as a result. Warning people about <img /> in the validator is a waste of their time.
It's not a good idea to confuse them any more by giving the impression that it works for some elements but not others. It's better to just say it doesn't work at all and forbid it in all cases.
Better? This is an opinion, and it's not backed up by data. So far, it looks like Sam has the data on his side. People do it, and it tends to work interoperably.
Except when it doesn't. For example, here's a fragment of hotmail.com's signup page, served as text/html. It's the only example I've come across to date:
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" dir="ltr">
    ...
    <select id="iRegion" name="pff010004" />
    <script>...</script>
    </select>
    ...
The script just document.write's loads of option tags (it's the country menu). It's hard to know what the author thought was going on. Did they think it was XHTML and just got stymied by the server configuration? I'm still in favour of permitting the trailing slash, personally.
-- Stewart Brodie Software Engineer ANT Software Limited
[whatwg] [WA1] Missing step in formatting element algorithm
I've hit another issue in trying to get the Adoption Agency Algorithm working. The issue arises with the following fragments:
    <body><b> ab <em><div> cd </b>
    <body><b> ab <u> xy <em><div> cd </b>
The algorithm attempts to insert nodes that already have a parent node into another node. Does the instruction to insert a node carry an implication that you must first detach the node from its parent node, if it has one? If so, that should either be documented at the start of the whole section, or additional steps put into the algorithm to make it so. Steps 5 and 10 are much more explicit about detaching nodes from their parents.
The problem is triggered by the EM having a single child - that being the DIV (the furthest block) that gets detached by step 5. Consequently, in step 7.4, 'node' (EM) does not have any children. Thus in step 7.5, the DIV is re-inserted into the original EM and we carry on around the steps in step 7 until we hit 7.2, where the two examples' behaviours diverge.
In the first example, 7.2 terminates the loop and we hit step 8, where: last node is the EM; node is the B; and furthest block is the DIV. The condition holds, so we are instructed to insert the EM into the BODY. However, the EM is still a child node of the B at this point.
In the second example, 7.4 clones the U and then (in 7.5) tries to insert the EM into that clone whilst the EM is still a child of the original U. Either the EM has to be cloned, or it has to be detached first. Detaching first works better, I think; otherwise you end up with a useless EM leaf node, which I guess is what 7.4's condition is trying to avoid in the first place.
Also, what is "nearest block" for? It's not used anywhere. It looks like it could be a remnant of an earlier version of the algorithm that didn't have a step 14 that looped the whole algorithm, perhaps? I think you can strike the first paragraph of step 3, move the second paragraph of step 3 into step 4, remove the sentence "There will always be one ..." from step 4, and then delete step 3 completely.
-- Stewart Brodie Software Engineer ANT Software Limited
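The "detach first" reading of the insert instruction can be sketched on a toy tree. ToyNode, detach and insert are hypothetical names for illustration; this models only the parent/child bookkeeping at issue, not the Adoption Agency Algorithm itself.

```javascript
// Toy tree node: just a name, a parent pointer and an ordered child list.
class ToyNode {
  constructor(name) {
    this.name = name;
    this.parent = null;
    this.children = [];
  }
}

// Remove a node from its current parent, if it has one.
function detach(node) {
  if (node.parent) {
    const siblings = node.parent.children;
    siblings.splice(siblings.indexOf(node), 1);
    node.parent = null;
  }
}

// "Insert" under the detach-first reading: a node can never end up in the
// child lists of two parents at once.
function insert(parent, node) {
  detach(node);
  parent.children.push(node);
  node.parent = parent;
}

// First example, step 8: the EM is still a child of the B when the
// algorithm says to insert it into the BODY.
const body = new ToyNode("body");
const b = new ToyNode("b");
const em = new ToyNode("em");
insert(body, b);
insert(b, em);
insert(body, em);

console.log(b.children.length);                        // 0
console.log(body.children.map((n) => n.name).join()); // b,em
```

Without the detach() call inside insert(), the EM would appear in the child lists of both the B and the BODY, which is the inconsistency described above.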
Re: [whatwg] [WA1] Formatting elements
Ian Hickson [EMAIL PROTECTED] wrote: On Mon, 17 Jul 2006, Stewart Brodie wrote: I tried dry-running the algorithm for handling mis-nested formatting elements, but I ended up with a tree that looked very odd. I can't believe that the output I ended up with is what the desired result of the algorithm is, so there is a mistake somewhere: either in my execution of the algorithm or in the algorithm itself. I took the following fragment of HTML: DIV abc B def I ghi P jkl /B mno /I pqr /P stu the result I ended up with was equivalent to: DIV abc B def I ghi /I /B I /I P I B jkl /B mno /I pqr /P stu /DIV Looks right. With that as input, my implementation outputs: 5: Parse error: missing document type declaration. 38: Parse error: mismatched b element end tag (misnested tags). 47: Parse error: mismatched i element end tag (misnested tags). 57: Parse error: mismatched body element end tag (premature end of file?). htmlhead/headbodydiv abc b def i ghi /i/bi/ipib jkl /b mno /i pqr /p stu/div/body/html Good - we do end up with exactly the same thing. I know it's hard to see when written out textually, but note that for the text node 'jkl', the I and B elements are the wrong way around! Wrong way with respect to what? They're the right way if you look at the end tags: /b closes first, so it must be innermost! ;-) I disagree because the 'jkl' is the bit I'm interested in here. Are you saying that the desirable tree order in defined in terms only of the closing tags rather than the open tags? In the original source, there haven't been any close tags at all at the time the 'jkl' is parsed, ignoring the other text nodes, the tree is: DIV B I P jkl (I don't really like the P being there, though, to be honest). At this point, jkl has a logical element hierarchy above it in the DOM tree that matches what was in the original HTML source. In CSS selector terms, DIV B I. 
The subsequent processing of the /B token causes such a selector to no longer match (it has now changed to DIV I B): DIV B I /I /B P I B jkl Surely it is reasonable to expect the jkl to retain its ancestry - i.e. be a child of the cloned I, which is a child of the cloned B, regardless of the tag closure (of the B) that's about to occur, which would convert it to ... DIV B I /I /B P B I jkl /I /B I (mno...) I suppose the root of my concern is how to apply CSS selector matching in a reasonable looking manner to the DOM tree if the parser has reversed the parentage of the formatting elements. The point is this is error-correction logic, there is no right way (well, until the spec is a standard, I guess). Indeed I suspect that it may not be possible to define the one true way in such a way that satisfies all content. It all seems to start going wrong for me in step 7 of the algorithm. During the handling of the /B tag, the clone of I gets created and that's the node that ends up being the childless I node that has the DIV as its parent (during step 5 of handling the /I tag when the I is cloned for a second time to be the child of the P and adopt the original children of the P) Firefox generates what I think I would expect and prefer: DIV abc B def I ghi /I /B P B I jkl /I /B I mno /I pqr /P stu /DIV It's the same number of tags, in this case. It gets more obviously bad to do what Mozilla does when you consider a case like: bp...p...p...p...p...p... ...which is very common. With that exact markup, Safari, IE7, and the spec all end up with the exact same DOM tree (from the body down, at least), and with the same number of element nodes (from body down, 8). Mozilla ends up with 13 nodes (from the body down). That doesn't scale -- there are pages with hundreds of nodes like this. And it gets much worse if it was all wrapped in a u and em too. 
The key is, as you mention in one of the blog entries linked below, that the behaviour differs depending on whether or not the content is well-formed in terms of matching order of start and end tags, or not. For comparison, Internet Explorer 6 on the other hand treats the P no differently to the B or I and ends up with: DIV abc B def I ghi P jkl /P /I /B I P mno /P /I P pqr /P stu /DIV Actually IE has only one P element (and only one B and only one I). Look closer and you'll find that the P element isn't closed -- it's just that the mno and pqr text nodes' parentNodes point to the P, while the DIV element's childNodes array actually also mentions those text nodes. Yes, IE generates DOM trees that aren't trees. See also: http://ln.hixie.ch/?start=1037910467&count=1 http://ln.hixie.ch/?start=1138169545&count=1 http://ln.hixie.ch/?start=1137740632&count=1 http://ln.hixie.ch/?start=1026485588&count=1 http://ln.hixie.ch/?start=1137799947&count=1 Yes, I have already read many of your blog entries on this topic. I got
[whatwg] [WA1] Formatting elements
I tried dry-running the algorithm for handling mis-nested formatting elements, but I ended up with a tree that looked very odd. I can't believe that the output I ended up with is what the desired result of the algorithm is, so there is a mistake somewhere: either in my execution of the algorithm or in the algorithm itself. I took the following fragment of HTML: DIV abc B def I ghi P jkl /B mno /I pqr /P stu The DIV is chosen to provide a suitable context for testing everything else. B and I were chosen as formatting elements with short names, P was chosen as it has no special behaviour as an open tag when in the "in body" state (possibly a mistake? I'm not certain). One filled whiteboard later, the result I ended up with was equivalent to: DIV abc B def I ghi /I /B I /I P I B jkl /B mno /I pqr /P stu /DIV I know it's hard to see when written out textually, but note that for the text node 'jkl', the I and B elements are the wrong way around! It all seems to start going wrong for me in step 7 of the algorithm. During the handling of the /B tag, the clone of I gets created and that's the node that ends up being the childless I node that has the DIV as its parent (during step 5 of handling the /I tag when the I is cloned for a second time to be the child of the P and adopt the original children of the P). Firefox generates what I think I would expect and prefer: DIV abc B def I ghi /I /B P B I jkl /I /B I mno /I pqr /P stu /DIV This behaviour would be consistent with disallowing non-phrasing and non-formatting elements on the stack of open elements when there are phrasing/formatting elements on the bottom of the stack. IOW, the P implicitly closes the B and I elements, leaving them in the list of active formatting elements, and then NOT executing "reconstruct the active formatting elements" before appending the new P element, leaving that for when the 'jkl' text node is encountered. 
For comparison, Internet Explorer 6 on the other hand treats the P no differently to the B or I and ends up with: DIV abc B def I ghi P jkl /P /I /B I P mno /P /I P pqr /P stu /DIV The problem here may simply be that appending any node due to opening any non-formatting/non-phrasing open tag when in "in body" should cause any formatting/phrasing elements to be popped off the stack of open elements, and then NOT execute "reconstruct the active formatting elements" (because it'll be executed automatically when opening the next formatting/phrasing element or text node anyway). -- Stewart Brodie Software Engineer ANT Software Limited
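The behaviour proposed in this message can be tried out with a deliberately simplified sketch. This is neither the WA1 algorithm nor Firefox's implementation: it assumes the bracket-less notation of this thread, treats only B and I as formatting elements, tracks active formatting elements by tag name rather than by element, and handles an end tag by simply popping to the nearest matching open element. All names (Node, parse, serialise, FORMATTING) are hypothetical.

```python
FORMATTING = {"B", "I"}  # assumption: the only formatting elements in this sketch


class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.children = []  # child Nodes and text strings, in document order
        if parent is not None:
            parent.children.append(self)


def serialise(node):
    """Write a tree back out in the thread's notation."""
    if isinstance(node, str):
        return node
    inner = " ".join(serialise(child) for child in node.children)
    return f"{node.name} {inner} /{node.name}" if inner else f"{node.name} /{node.name}"


def parse(source):
    tokens = source.split()
    root = Node(tokens[0])
    stack = [root]  # stack of open elements
    active = []     # tag names of active formatting elements, oldest first

    def reconstruct():
        # Reopen every active formatting element that is not currently open.
        open_names = {node.name for node in stack}
        for name in active:
            if name not in open_names:
                stack.append(Node(name, stack[-1]))

    for token in tokens[1:]:
        if token.startswith("/"):
            name = token[1:]
            if any(node.name == name for node in stack):
                # Pop up to and including the nearest matching open element.
                while stack.pop().name != name:
                    pass
            if name in active:
                active.remove(name)
        elif token in FORMATTING:
            reconstruct()
            stack.append(Node(token, stack[-1]))
            active.append(token)
        elif token.isupper():
            # Proposed rule: a non-formatting start tag pops open formatting
            # elements (they stay in `active`) and does NOT reconstruct them.
            while stack[-1].name in FORMATTING:
                stack.pop()
            stack.append(Node(token, stack[-1]))
        else:
            reconstruct()  # text nodes trigger reconstruction
            stack[-1].children.append(token)
    return root


result = serialise(parse("DIV abc B def I ghi P jkl /B mno /I pqr /P stu"))
print(result)
```

Under these assumptions the sketch yields the same serialisation as the Firefox result quoted above for this one fragment; it says nothing about how the rule behaves on other content.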
Re: [whatwg] [wa1] Status of tree construction section
Ian Hickson [EMAIL PROTECTED] wrote: On 7/7/06, Stewart Brodie [EMAIL PROTECTED] wrote: I thought I'd have a go at implementing the parsing algorithms, specifically the tree construction algorithms, to see what effect it had on the DOM trees that our parser creates. Has anybody else here actually implemented this tree construction algorithm? I'm finding one or two issues that I think may be (minor) mistakes, and I'd like to compare notes to see whether I've just misunderstood it or whether it is a mistake. I've been implementing it (to test the spec); I'd be quite happy to compare notes (either on this list or off-list, as you wish). Note that I'd definitely not consider that part of the spec done yet. I'm happy to post to the list. The first few issues are quite trivial, I think: In the main phase, section 'If the insertion mode is in row', the last option for 'anything else' says "process ... as if ... in table". I think that should say "as if ... in table body" instead. That case will re-throw the token out to "in table" in any case if it doesn't handle it. The case immediately above that is "An end tag whose tag name is one of: body, caption, col, colgroup, html, td, th, tr". The /tr case is already handled by the second case, so 'tr' should be removed from the list here. In 'If the insertion mode is in cell', the absence of a case for an end tag for CAPTION looks odd. All the other table-related tags are handled here explicitly, so why is CAPTION so different (that it should be handled in the 'treat it as in body' way)? I've come to the conclusion that you need pictures to accompany the adoption agency algorithm. However, I'm not an artist. Indeed, I'm so bad at drawing pictures that, in the past, users often sent me replacement bitmap graphics for my programs because they found my attempts so distressing :-) With reference to that algorithm, I think that the text in point 1 should be re-organised somewhat after the second paragraph to make it a little clearer. 
I've re-organised it and I think it says exactly the same now, but simpler and with less potential for misunderstanding: If there is a _formatting element_, proceed immediately to step 2. Otherwise, there is no _formatting element_. If there is an element in the _list of active formatting elements_ that: o [same three steps, but with , and appended to the top one] then remove the last such element from the _list of active formatting elements_. In any case, abort these steps. In the various places where a given operation has to be described multiple times, you've macroed it (e.g. insert an HTML element, clear the list of active formatting elements up to the last marker). I suggest adding another one that can be used during the Adoption Agency algorithm (I'm sure that I found I needed to perform this search in other places too - hence defining it separately - although I can't quite recall exactly where for the time being, ho hum): The _list of active formatting elements_ is said to *have an element in active formatting scope* when the following algorithm terminates in a match state:
1. If the _list of active formatting elements_ is empty, terminate in a failure state.
2. Initialise _entry_ to be the last (most recently added) entry in the _list of active formatting elements_.
3. If _entry_ is a marker, terminate in a failure state.
4. If _entry_ is an element with a tag name matching the target element name, terminate in a match state.
5. If there are further elements in the _list of active formatting elements_, set _entry_ to the previous entry and return to step 3.
6. Terminate in a failure state (there are no more entries).
Step 6 in the original 14-step algorithm: relative position of the formatting element. Relative to what? The parsing quirks box lists several issues that I think are important. The script one in particular is so very common. Unfortunately, I had to cave in eventually and support that because it broke some customers' own sites. 
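The six-step "have an element in active formatting scope" search proposed above might be sketched as follows. This is a minimal illustration, assuming the list is a Python list ordered oldest first, that each entry is either a tag-name string or a sentinel marker, and that MARKER and the function name are hypothetical stand-ins for whatever an implementation actually stores.

```python
MARKER = object()  # hypothetical sentinel standing in for a marker entry


def has_element_in_active_formatting_scope(active_formatting_elements, target_name):
    """Search backwards from the most recently added entry.

    Returns True (match state) if an entry named `target_name` is found before
    any marker, and False (failure state) if a marker is reached first or the
    list is exhausted. An empty list falls straight through to failure.
    """
    for entry in reversed(active_formatting_elements):
        if entry is MARKER:
            return False  # step 3: a marker ends the scope
        if entry == target_name:
            return True   # step 4: matching tag name
    return False          # steps 1 and 6: empty list or no more entries


print(has_element_in_active_formatting_scope(["b", MARKER, "i"], "i"))
print(has_element_in_active_formatting_scope(["b", MARKER, "i"], "b"))
```

Iterating over the reversed list folds steps 2 and 5 into the loop itself, so no explicit index bookkeeping is needed.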
I have come across never-opened /br and /p too. I've never heard of % ... % before. Sometimes, it's really quite depressing the rubbish that people (and programs!) write out. I spent a long time trying to work out what I needed to store for each entry on both the stack of open elements and the list of active formatting elements. I think it should be stated up front because this is often an area of confusion, in my experience. I frequently get upset with co-workers over misuse of the terms element, tag and node, for example :-) Finally (for now ;-), right at the beginning of the tree construction section, it says that DOM Mutation events must not fire for changes caused by the UA parsing the document. I cannot decide whether or not I agree with that statement. My experimentation appears to show that this is indeed what happens in Firefox, at least. I put a script in the head of my document
[whatwg] [wa1] Status of tree construction section and SVN interface
I thought I'd have a go at implementing the parsing algorithms, specifically the tree construction algorithms, to see what effect it had on the DOM trees that our parser creates. Has anybody else here actually implemented this tree construction algorithm? I'm finding one or two issues that I think may be (minor) mistakes, and I'd like to compare notes to see whether I've just misunderstood it or whether it is a mistake. With the spec changing so frequently, I wanted to make sure I'm catching any updates to the relevant parts of the document, so I followed the link in WA1 to the http://svn.whatwg.org but I just get web pages with the text (literally) of the current revision of the specs, rather than access to the history logs that WA1 implies I should find there. Neither web browsers nor svn itself can talk to that URI. Am I doing something wrong or is it broken? -- Stewart Brodie Software Engineer ANT Software Limited