Re: [whatwg] Proposal for window.DocumentType.prototype.toString
On Mon, 29 Oct 2012, Johan Sundström wrote:
>
> Serializing a complete HTML document DOM to a string is surprisingly
> hard in javascript. As a fairly seasoned javascript hacker I figured
> this might do it:
>
>   document.doctype + document.documentElement.outerHTML
>
> It doesn't. No browser has a useful window.DocumentType.prototype that
> returns either the original document's doctype before parsing –
> or a semantically equivalent post-parsing one.

If you know the document is always going to be in the no-quirks mode,
then you can just stick "<!DOCTYPE html>" at the start. If you need to
be able to tell what the mode is but are ok with ignoring the "limited
quirks" mode, then you can use document.compatMode to pick whether to
use that string or none, as in:

   (document.compatMode == 'CSS1Compat' ? '<!DOCTYPE html>' : '') +
   document.documentElement.outerHTML

That will drop any comment nodes around the root element, in case that
matters.

If you want to get the actual DOCTYPE strings, you can make a simple
serialisation function for doctype nodes that uses the three attributes
on that object to string together the full thing (much as you do in the
polyfill you mentioned).

> I believe only Firefox implements "internalSubset" today

Since the "internal subset" has no meaning in text/html, that's ok if
your goal is just to be semantically equivalent.

> The most useful implementation would IMO be a native one that
> reproduces the doctype, as it was formatted in the source document.

What's your use case, exactly?

On Mon, 29 Oct 2012, Boris Zbarsky wrote:
>
> I thought there were plans to put innerHTML on Document. Did that go
> nowhere?

Lack of implementor interest killed it a while ago.

On Mon, 29 Oct 2012, Ojan Vafai wrote:
> On Mon, Oct 29, 2012 at 6:17 PM, Boris Zbarsky wrote:
> >
> > I thought there were plans to put innerHTML on Document. Did that go
> > nowhere?
>
> There were plans to put it on DocumentFragment.

That was a different plan, but yes, there have also been proposals to do
that. This was in the context of templates; a better solution to which
has since been worked on in public-webapps.

-- 
Ian Hickson               U+1047E                )\._.,--,'``.    fL
http://ln.hixie.ch/       U+263A                /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
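
A minimal sketch of the doctype serialisation function described in the
message above, assuming the three attributes meant are name, publicId
and systemId on the DocumentType node. The function name
serializeDoctype is made up for illustration, and the sketch does not
try to handle identifiers that themselves contain double quotes.

```javascript
// Hypothetical helper: build a DOCTYPE string from a DocumentType-like
// object. Only reads .name, .publicId and .systemId.
function serializeDoctype(doctype) {
  // document.doctype is null when the document has no doctype node.
  if (!doctype) return '';
  let out = '<!DOCTYPE ' + doctype.name;
  if (doctype.publicId) {
    out += ' PUBLIC "' + doctype.publicId + '"';
    // A system identifier may follow the public identifier.
    if (doctype.systemId) out += ' "' + doctype.systemId + '"';
  } else if (doctype.systemId) {
    out += ' SYSTEM "' + doctype.systemId + '"';
  }
  return out + '>';
}
```

In a browser this would be called as
serializeDoctype(document.doctype) + document.documentElement.outerHTML,
which still drops any comment nodes around the root element.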
Re: [whatwg] Proposal for window.DocumentType.prototype.toString
On Mon, Oct 29, 2012 at 6:17 PM, Boris Zbarsky wrote:
> On 10/29/12 8:58 PM, Johan Sundström wrote:
>
>> Serializing a complete HTML document DOM to a string is surprisingly
>> hard in javascript.
>
> I thought there were plans to put innerHTML on Document. Did that go
> nowhere?

There were plans to put it on DocumentFragment. But IIRC no other
browser vendors voiced an interest and Hixie was opposed because he
thought it would encourage people to do more string-based DOM building.
The WebKit patch for this floundered as a result. I still think it's a
good idea.
Re: [whatwg] Proposal for window.DocumentType.prototype.toString
On 10/29/12 8:58 PM, Johan Sundström wrote:
> Serializing a complete HTML document DOM to a string is surprisingly
> hard in javascript.

I thought there were plans to put innerHTML on Document. Did that go
nowhere?

> As a fairly seasoned javascript hacker I figured this might do it:
>
>   document.doctype + document.documentElement.outerHTML

This seems lossy in many cases (most obviously: when the HTML uses
conditional comments, though there are also various XHTML-specific
issues).

> The most useful implementation would IMO be a native one that
> reproduces the doctype, as it was formatted in the source document.

That might be worth doing independent of the serialization issue.

-Boris
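
One way to reduce the lossiness discussed in this thread is to walk
every top-level child of the document instead of only the root element,
so that comments before and after the root survive. The following is a
rough sketch under that assumption, not a complete serializer: the name
serializeTopLevel is hypothetical, the doctype is emitted from its name
only (public and system identifiers are ignored), and other node types
such as XML processing instructions are skipped.

```javascript
// Numeric node types as defined by the DOM (Node.ELEMENT_NODE, etc.).
const ELEMENT_NODE = 1;
const COMMENT_NODE = 8;
const DOCUMENT_TYPE_NODE = 10;

// Hypothetical helper: serialize each top-level child of a document-like
// object, keeping comments that sit outside the root element.
function serializeTopLevel(doc) {
  const parts = [];
  for (const node of Array.from(doc.childNodes)) {
    switch (node.nodeType) {
      case DOCUMENT_TYPE_NODE:
        // Simplification: name only, no PUBLIC/SYSTEM identifiers.
        parts.push('<!DOCTYPE ' + node.name + '>');
        break;
      case COMMENT_NODE:
        parts.push('<!--' + node.data + '-->');
        break;
      case ELEMENT_NODE:
        parts.push(node.outerHTML);
        break;
      // Anything else is silently dropped in this sketch.
    }
  }
  // Newlines between top-level nodes are added here for readability.
  return parts.join('\n');
}
```

In a browser, serializeTopLevel(document) would be the call. The result
is still post-parsing output, so it does not reproduce the source
formatting of the doctype.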
[whatwg] Proposal for window.DocumentType.prototype.toString
Hi everybody!

Serializing a complete HTML document DOM to a string is surprisingly
hard in javascript. As a fairly seasoned javascript hacker I figured
this might do it:

  document.doctype + document.documentElement.outerHTML

It doesn't. No browser has a useful window.DocumentType.prototype that
returns either the original document's doctype before parsing – or a
semantically equivalent post-parsing one. Google Chrome shows one in its
devtools, but seems not to expose any way for programmers to get at it.

My proposal is that we specify this more useful behaviour for
javascript-running browsers, so it does become as simple as above. A
rough sketch of how a polyfill might implement the latter
window.DocumentType.prototype.toString:

  https://gist.github.com/3977584

Even as a polyfill, the above is rather limited, though: I believe only
Firefox implements "internalSubset" today, and probably only in XML
contexts.

The most useful implementation would IMO be a native one that reproduces
the doctype, as it was formatted in the source document.

Thoughts?

-- 
 / Johan Sundström, http://ecmanaut.blogspot.com/
Re: [whatwg] URL: file: URLs
On 10/29/12 10:53 AM, Anne van Kesteren wrote:
> But at that point in a URL you cannot have a path. A path starts with
> a slash after the host.

The point is that on Windows, Gecko parses file://c:/something as
file:///c:/something

As in, it's an exception to the general "if there are two slashes after
the "file:" then the next thing is a host" rule.

> I suppose, I would hate it though for new URL(...) to depend on the
> platform.

I'm not sure there are great solutions here. :(

-Boris
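
The exception described above can be illustrated with a small string
rewrite. This is only a sketch of the behaviour as stated in the
message, not Gecko's actual code: the function name
normalizeFileDriveUrl is made up, and the assumption is that the
rewrite applies only when "file://" is followed directly by a single
ASCII letter and a colon.

```javascript
// Hypothetical helper: treat "file://c:/..." as "file:///c:/...", i.e.
// read the drive letter as the start of the path rather than as a host.
function normalizeFileDriveUrl(url) {
  // Match "file://", one drive letter plus ":", then "/..." or the end.
  const match = /^file:\/\/([a-zA-Z]:)(\/.*)?$/.exec(url);
  if (!match) return url; // e.g. file://test/ keeps "test" as the host
  return 'file:///' + match[1] + (match[2] || '');
}
```

With this sketch, file://c:/something becomes file:///c:/something,
while file://test/ and file:///c:/something are returned unchanged.
Whether such a rewrite should depend on the platform is exactly the
open question in the thread.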
Re: [whatwg] URL: file: URLs
On Mon, Oct 29, 2012 at 3:13 PM, Boris Zbarsky wrote:
> On 10/29/12 5:00 AM, Anne van Kesteren wrote:
>> Maybe I should introduce a "file host state" that supports colons in
>> the host name (or special case the "host state" further, but the
>> former seems cleaner).
>
> I don't think that's particularly desirable. The "c:" is totally part of
> the path; treating it otherwise would just be confusing. Imo.

But at that point in a URL you cannot have a path. A path starts with a
slash after the host. Especially if you want file://test/ to parse with
test being the host.

>> Most browsers seem to fail currently on input
>> such as "file://c:/" but this is on a Mac
>
> Yes, doing that on a Mac would just be wrong

I suppose, I would hate it though for new URL(...) to depend on the
platform.

-- 
http://annevankesteren.nl/
Re: [whatwg] URL: file: URLs
On 10/29/12 5:00 AM, Anne van Kesteren wrote:
>> But note that it would be a bit odd if file://c:/ claimed to have a
>> host of "c" with a default port or some such...
>
> Maybe I should introduce a "file host state" that supports colons in
> the host name (or special case the "host state" further, but the
> former seems cleaner).

I don't think that's particularly desirable. The "c:" is totally part
of the path; treating it otherwise would just be confusing. Imo.

> Most browsers seem to fail currently on input such as "file://c:/"
> but this is on a Mac

Yes, doing that on a Mac would just be wrong

> I would prefer having the parsing be consistent though.

You mean across Windows and non-Windows? I'm not sure that's viable.

-Boris
Re: [whatwg] URL: file: URLs
On Sun, Oct 28, 2012 at 6:51 PM, Boris Zbarsky wrote:
> Same as the comment I quoted? Or same as something else?

Same as you quoted.

> Well, the Gecko parser preserves the host at this stage assuming the URI was
> correctly formatted with a host. Again:
>
>   blah://foo/bar => blah://foo/bar
>
> The interesting things happen when you have 0, 1, or 3 slashes between ':'
> and "foo". The handling of "foo" after this point is a separate issue.

Those are handled the same as in Gecko (this also matches Safari, I
think; Chrome strips all starting slashes, for instance if you have
four, but I did not copy that).

> In Gecko, it's part of URL parsing. More precisely, it's part of the
> normalization performed as part of constructing a "URL" object from a
> string. Since this is also how we parse URLs, it's effectively all part of
> the package.
>
> But note that it would be a bit odd if file://c:/ claimed to have a host of
> "c" with a default port or some such...

Maybe I should introduce a "file host state" that supports colons in
the host name (or special case the "host state" further, but the
former seems cleaner). Most browsers seem to fail currently on input
such as "file://c:/" but this is on a Mac so maybe that's the
difference. I would prefer having the parsing be consistent though.

> 7 and 8 are not, though at some point we'll need to define equality
> comparisons anyway.

Yeah, I guess at some point someone would need to write a specification
for processing file: URLs (for post-parsing operations). On the other
hand, it's not entirely clear to me that this needs to be interoperable.

-- 
http://annevankesteren.nl/