Hi folks, as far as I can see, there are at least two different methods of parsing an (X)HTML document: a "tag soup" parser, which is very error-tolerant and makes the best of invalid websites, and an XML parser, which is only activated if a document is served with the MIME type application/xhtml+xml. IMHO an XML parser should be a lot faster, because it simply stops at the first well-formedness error and does not have to concern itself with error recovery.
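To make the difference concrete, here is a minimal sketch of the two behaviours. It uses lxml (libxml2) as a stand-in, since that exposes both an error-recovering HTML parser and a strict XML parser; it is just an illustration, not Gecko's actual parsing code:

# Sketch only: lxml/libxml2 standing in for Gecko's two parsers.
from lxml import etree

broken = b"<html><body><p>unclosed paragraph<b>bold</body></html>"

# Tag-soup mode: the HTML parser recovers and builds a tree anyway.
soup_tree = etree.fromstring(broken, etree.HTMLParser())
print(etree.tostring(soup_tree))  # repaired document tree

# XML mode: the parser bails out at the first well-formedness error.
try:
    etree.fromstring(broken, etree.XMLParser())
except etree.XMLSyntaxError as e:
    print("XML parser gave up:", e)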
I would be very interested in a performance comparison between the commonly used tag-soup parser and the XML parser for XHTML 1.x documents that are correctly served as application/xhtml+xml. Can you maybe give me any metrics? Or, better, is it possible to extract the Gecko document parser and benchmark it standalone with various documents? The answer to this question is relevant for deciding whether it is preferable to build valid XHTML documents served as XML or to just stick with HTML 4.01. Thank you very much!
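In case it is useful, a rough standalone benchmark along those lines could look like the following. Again this uses lxml rather than an extracted Gecko parser, and "sample.xhtml" is only a placeholder for any well-formed XHTML 1.x document, so the numbers would only show the general shape of such a comparison, not Gecko's actual performance:

# Rough benchmark sketch: parse the same well-formed XHTML document
# with an error-tolerant HTML parser and a strict XML parser, then
# compare average wall-clock time per parse. lxml only; "sample.xhtml"
# is a hypothetical input file.
import time
from lxml import etree

with open("sample.xhtml", "rb") as f:
    data = f.read()

def bench(parser, runs=1000):
    start = time.perf_counter()
    for _ in range(runs):
        etree.fromstring(data, parser)
    return (time.perf_counter() - start) / runs

print("tag-soup parser: %.6f s/parse" % bench(etree.HTMLParser()))
print("xml parser:      %.6f s/parse" % bench(etree.XMLParser()))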

