On Jan 13, 2010, at 5:22 AM, Henri Sivonen wrote:

The HTML5 parser in Gecko loads all streams very asynchronously. That is, loading a stream never finishes from the same event queue task that starts the load. This is fine for loading HTTP streams, since the general expectation is that the process of loading something from the network makes multiple trips through the event loop.

This has turned out to be a test suite compatibility problem with about:blank. Mozilla's Mochitest test suite has tests that depend on about:blank in an iframe having a document.body immediately upon insertion of the iframe into the document, without a trip through the event loop.

At first look, this seems like a clear case: the spec says that about:blank is navigated to synchronously. However, this is not what Gecko does (with the old parser).

Gecko (with the old parser) has these two characteristics:
1) If a browsing context that has no document object is asked to return its document object, an about:blank-like DOM is generated into the browsing context synchronously.
2) When a browsing context is navigated to about:blank, a task is posted to the task queue. When that task is run, about:blank is parsed to completion during the single task queue task.

As a result, in Gecko (with the old parser enabled), asking for document.body of an iframe never returns null even if navigation to about:blank isn't complete. If the navigation hasn't completed yet, a body element generated by #1 above is returned. If navigation has completed, a body element generated by #2 above is returned. Since #2 happens as a single task, it's never possible to see a browsing context that is being navigated to about:blank in an intermediate state of the parse. (The HTML5 parser breaks this by making observable the state where the document object has been created but nothing has been tokenized yet.)
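The two characteristics above can be sketched as a toy model. This is not Gecko code: EventLoop, BrowsingContext, and the "synthesized"/"parsed" labels are hypothetical names invented for illustration, and the model assumes that the document built by the navigation task takes the place of the one generated on early access.

```typescript
// Toy model only, not Gecko code. Every name here is hypothetical.
type Task = () => void;

class EventLoop {
  private queue: Task[] = [];

  post(task: Task): void {
    this.queue.push(task);
  }

  // Run exactly one queued task, if there is one.
  runOne(): boolean {
    const task = this.queue.shift();
    if (task === undefined) return false;
    task();
    return true;
  }
}

interface Doc {
  body: { origin: "synthesized" | "parsed" };
}

class BrowsingContext {
  private doc: Doc | null = null;
  private loop: EventLoop;

  constructor(loop: EventLoop) {
    this.loop = loop;
  }

  // Characteristic 1: asking for the document generates an
  // about:blank-like DOM synchronously if none exists yet.
  get document(): Doc {
    if (this.doc === null) {
      this.doc = { body: { origin: "synthesized" } };
    }
    return this.doc;
  }

  // Characteristic 2: navigation only posts a task. The whole "parse"
  // happens inside that single task, so no half-parsed state is visible.
  // Assumption: the parsed document replaces the synthesized one.
  navigateToAboutBlank(): void {
    this.loop.post(() => {
      this.doc = { body: { origin: "parsed" } };
    });
  }
}

const loop = new EventLoop();
const frame = new BrowsingContext(loop);
frame.navigateToAboutBlank();

// Same task as the navigation request: a body from #1 is returned.
const before = frame.document.body.origin;

// One trip through the event loop completes the navigation.
loop.runOne();
const after = frame.document.body.origin;

console.log(before, after);
```

In this model the body is never null, and a script sees either the body from #1 or the body from #2, never an intermediate state.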

Question: if you generate a document on the fly via early access, does it get replaced when the about:blank task actually completes?

It seems like if Gecko truly wanted to make about:blank synchronous, it should be possible simply by special-casing its load and performing a series of DOM calls right then and there, without ever involving the parser.


Now, consider the following demo:
http://hsivonen.iki.fi/test/bz-about-blank-data.html

This makes it look like Opera and Safari were doing what the spec says and navigating the iframe synchronously to about:blank. (The use of the data: URL scheme makes the demo not work in IE.)

However, if the data: URL is changed to an http: URL, Safari no longer appears to navigate to about:blank synchronously:
http://hsivonen.iki.fi/test/bz-about-blank.html

I think your test case demonstrates something that we would consider a bug. Though I am not sure what exactly is happening internally that causes it. We certainly make our best effort to load about:blank synchronously, though there may be unusual circumstances where that doesn't happen.


Let's take a more careful look:
http://hsivonen.iki.fi/test/bz-about-blank-check-body.html

Opera indeed navigates to about:blank synchronously.

IE doesn't support window.stop, so let's try testing without it:
http://hsivonen.iki.fi/test/bz-about-blank-check-body-no-stop.html

Neither IE7 nor IE8 appears to actually navigate synchronously.

So it appears that only Opera is doing what the spec requires. Since IE, Firefox and Safari aren't doing what the spec requires, what the spec requires can't be strictly necessary for Web compat.

What's the actual Web compat constraint when it comes to navigating to about:blank (including loading about:blank as the initial page into a newly-inserted iframe)?

I am not sure what the exact constraints are, but I believe the following are required:

- Accessing the document of a frame with missing, empty or about:blank src has to always give you an HTML document with a body, even if there hasn't been a chance for the event loop to run.
- A newly created iframe with missing, empty or about:blank src has to have an accessible document right away, without even cycling the event loop.

There are at least three particular scenarios that are relevant here:

1) Some sites document.write or otherwise poke at the DOM of their about:blank frames or iframes in inline script, without waiting for the load event or anything.

2) Some sites load multiple frames, yet one expects to poke at the other's DOM during its load. Since load order is not guaranteed, this would be a race condition, if the not-yet-loaded frame had no DOM, but synchronous about:blank lets such sites muddle on through. Before we had sufficiently synchronous loading of the initial empty frame document, we actually encountered sites like this that broke in Safari but not IE or Firefox.

3) Some sites make a new iframe element using DOM calls in an event handler, and expect it to have an empty document that's immediately ready for DOM manipulation, without any intervening returns to the event loop.
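Scenario 3 amounts to a check like the following minimal sketch. DocumentLike, FrameLike, frameIsImmediatelyScriptable, and makeStubDocument are hypothetical names; the interfaces are cut-down stand-ins for the browser's real Document and iframe element so that the snippet also runs outside a browser, and the stub merely imitates the two possible behaviors.

```typescript
// Hypothetical minimal shapes standing in for the real DOM types.
interface BodyLike {
  appendChild(node: unknown): unknown;
}
interface DocumentLike {
  body: BodyLike | null;
  createElement(tagName: string): unknown;
}
interface FrameLike {
  contentDocument: DocumentLike | null;
}

// Creates an iframe, inserts it, and looks at its document without
// returning to the event loop (no await, no timer, no load handler).
function frameIsImmediatelyScriptable(doc: DocumentLike): boolean {
  if (doc.body === null) return false;
  const frame = doc.createElement("iframe") as FrameLike;
  doc.body.appendChild(frame);
  const inner = frame.contentDocument;
  return inner !== null && inner !== undefined && inner.body !== null;
}

// Stub for running outside a browser. synchronousBlank selects whether
// the new frame already has a body at insertion time.
function makeStubDocument(synchronousBlank: boolean): DocumentLike {
  const inner: DocumentLike = {
    body: synchronousBlank ? { appendChild: (n) => n } : null,
    createElement: () => ({}),
  };
  return {
    body: { appendChild: (n) => n },
    createElement: (): FrameLike => ({ contentDocument: inner }),
  };
}

const withSyncBlank = frameIsImmediatelyScriptable(makeStubDocument(true));
const withoutSyncBlank = frameIsImmediatelyScriptable(makeStubDocument(false));
console.log(withSyncBlank, withoutSyncBlank);
```

In a real page the same function would be called with the page's own document (possibly with a type cast), and sites that follow the pattern in scenario 3 depend on it returning true.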

Regards,
Maciej




