Frame construction runs recursive algorithms along the depth of the DOM tree. The depth is currently capped at 200, because deeper trees caused stack overflow crashes on Windows in the Windows 95 era. There is a corresponding limit in the HTML parser. The parser limit exists only to avoid exposing the frame constructor to deep trees; nothing in the parser itself requires it. The parser tries to preserve text nodes even after the limit has been reached. This works in simple cases, but apparently fails in more complex ones.
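To make the failure mode concrete, here is a minimal sketch of a depth-capped recursive descent. The `Node` type, `ConstructFrames` name, and visit counter are illustrative inventions, not Gecko's actual types or APIs; only the cap value of 200 comes from the description above.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical minimal node type for illustration; not Gecko's real content types.
struct Node {
  std::vector<std::unique_ptr<Node>> children;
};

// A depth-capped recursive descent in the spirit of the frame constructor's
// cap. kMaxDepth mirrors the current limit of 200; any subtree beyond the
// cap is silently dropped, which is the failure mode described above.
constexpr int kMaxDepth = 200;

void ConstructFrames(const Node& aNode, int aDepth, int& aVisitedCount) {
  if (aDepth > kMaxDepth) {
    return;  // Subtree beyond the cap never reaches layout.
  }
  ++aVisitedCount;
  for (const auto& child : aNode.children) {
    ConstructFrames(*child, aDepth + 1, aVisitedCount);
  }
}
```

With a 300-deep chain of nodes, only the first 201 (depths 0 through 200) are visited; the rest never reach layout.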
For many years, my standard response to people who complained was to point the finger at layout and mark bug reports as duplicates of https://bugzilla.mozilla.org/show_bug.cgi?id=256180 . This seemed acceptable, because the failures were on sites deep in the long tail.

Recently, it has come to my attention (https://bugzilla.mozilla.org/show_bug.cgi?id=1188731) that the depth limit is a problem in a more serious case. Apparently, some email clients generate unreasonably nested HTML as a side effect of rich text editing, and there are notable webmail clients (at least Yahoo!, apparently, but I have a vague recollection of seeing a complaint about Hotmail, too, when browsing duplicates) that don't restructure such HTML email, so the deep trees are exposed to Firefox. When the HTML parser fails to keep text nodes visible while rewriting the tree before it reaches layout, parts of emails go missing from layout. That can lead to notable badness from users misinterpreting what the email says, and then to users switching browsers once they realize that other browsers don't hide these parts of emails.

There are three areas where changes could be made:
1) We could re-calibrate the depth limit.
2) The HTML parser could try harder to make the DOM rewrites keep text nodes visible.
3) The frame constructor could switch from a full-featured recursive algorithm to an iterative text node-only traversal near the depth limit.

I'd expect text node recovery (flattening out elements and just considering text nodes) in the frame constructor to be a more robust solution than trying to address the problem in the HTML parser. What's the feasibility of such a frame constructor change?
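The idea behind option 3 can be sketched as follows. Instead of native recursion, an explicit heap-allocated stack walks the subtree, flattens out elements, and recovers only the text, so no text is lost regardless of tree depth. The `Node` type and the `RecoverTextIteratively` name are illustrative assumptions, not existing Gecko code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical minimal node type: text nodes carry data, elements carry
// children. Gecko's real content classes are of course richer than this.
struct Node {
  std::string text;  // non-empty only for text nodes
  std::vector<std::unique_ptr<Node>> children;
};

// Sketch of option 3: near the depth limit, stop the full-featured recursion
// and switch to an iterative walk using an explicit stack on the heap. The
// walk ignores element structure entirely and concatenates text node data,
// so stack depth stays constant no matter how deep the DOM is.
std::string RecoverTextIteratively(const Node& aRoot) {
  std::string result;
  std::vector<const Node*> stack;
  stack.push_back(&aRoot);
  while (!stack.empty()) {
    const Node* node = stack.back();
    stack.pop_back();
    if (!node->text.empty()) {
      result += node->text;
    }
    // Push children in reverse so they pop in document order.
    for (auto it = node->children.rbegin(); it != node->children.rend(); ++it) {
      stack.push_back(it->get());
    }
  }
  return result;
}
```

Because the traversal state lives in a `std::vector` rather than on the call stack, a text node buried thousands of elements deep is still recovered.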
In the meantime, considering that this is a problem that can result in users switching browsers, I think we should change the depth limit and make it different depending on operating system, so that Mac and Linux users don't need to switch browsers due to Windows limitations. (Or users of 64-bit Windows due to 32-bit Windows limitations.)

My findings from testing with a very high limit so far are as follows:
* Firefox opt build on x86_64 Linux crashes when the DOM depth is between 3000 and 3100.
* Mac: between 3900 and 4000.
* Windows 64-bit: between 740 and 750.
* 32-bit Firefox on 64-bit Windows 10: between 500 and 510.
* 32-bit Firefox on 64-bit Windows 7: between 510 and 520.

As for other browsers:
* I didn't find a depth limit for Chrome. (Tried up to 16000.)
* On 64-bit Windows 10, Edge's content process crashes when the depth is between 800 and 1000.
* On 64-bit Windows 10, IE11's content process crashes when the depth is between 1000 and 1100.

OK if I change the limit on a per-OS basis according to these numbers? Can we ask Windows to give us more stack space?

(I'd appreciate testing help on 32-bit Windows. Build with the limit set so large that the stack overflows first: https://queue.taskcluster.net/v1/task/TUMHVvq1QpG0qmsfREuDZw/runs/0/artifacts/public/build/target.zip ; test cases: https://hsivonen.com/test/moz/deeptree/ )

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/

_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform