Hi folks, as far as I can see, there are at least two different methods of parsing an (X)HTML document: a "tag soup" parser, which is very error-tolerant and makes the best of invalid websites, and an XML parser, which is only activated if a document is served with the MIME type application/xhtml+xml. IMHO an XML parser should be a lot faster, because it simply stops at the first well-formedness error and does not have to concern itself with error recovery.
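To make the difference concrete, here is a minimal sketch of the two behaviours. It uses lxml (libxml2) as a stand-in, since that exposes both an error-recovering HTML parser and a strict XML parser; it is just an illustration, not Gecko's actual parsing code:

# Sketch only: lxml/libxml2 standing in for Gecko's two parsers.
from lxml import etree

broken = b"<html><body><p>unclosed paragraph<b>bold</body></html>"

# Tag-soup mode: the HTML parser recovers and builds a tree anyway.
soup_tree = etree.fromstring(broken, etree.HTMLParser())
print(etree.tostring(soup_tree))  # repaired document tree

# XML mode: the parser bails out at the first well-formedness error.
try:
    etree.fromstring(broken, etree.XMLParser())
except etree.XMLSyntaxError as e:
    print("XML parser gave up:", e)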
I would be very interested in a performance comparison between the commonly used tag-soup parser and the XML parser for XHTML 1.x documents that are correctly served as application/xhtml+xml. Can you maybe give me any metrics? Or, better, is it possible to extract the Gecko document parser and benchmark it standalone with various documents? The answer to this question is relevant for deciding whether it is preferable to build valid XHTML documents served as XML or to just stick with HTML 4.01. Thank you very much!
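In case it is useful, a rough standalone benchmark along those lines could look like the following. Again this uses lxml rather than an extracted Gecko parser, and "sample.xhtml" is only a placeholder for any well-formed XHTML 1.x document, so the numbers would only show the general shape of such a comparison, not Gecko's actual performance:

# Rough benchmark sketch: parse the same well-formed XHTML document
# with an error-tolerant HTML parser and a strict XML parser, then
# compare average wall-clock time per parse. lxml only; "sample.xhtml"
# is a hypothetical input file.
import time
from lxml import etree

with open("sample.xhtml", "rb") as f:
    data = f.read()

def bench(parser, runs=1000):
    start = time.perf_counter()
    for _ in range(runs):
        etree.fromstring(data, parser)
    return (time.perf_counter() - start) / runs

print("tag-soup parser: %.6f s/parse" % bench(etree.HTMLParser()))
print("xml parser:      %.6f s/parse" % bench(etree.XMLParser()))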

