Found it! It's forbidden to start an HTML 4.0 page with a UTF-8 BOM. Proof:
1. Open the main page of Unicode. You can see that the HTML header says:

<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"><html>

So, we are talking about HTML 4.0 here. The reference for HTML 4.0 is:

http://www.w3.org/TR/1998/REC-html40-19980424/

The section about the HTML header is Section 7.1, Introduction to the structure of an HTML document:

http://www.w3.org/TR/1998/REC-html40-19980424/struct/global.html#h-7.1

which mentions:

"An HTML 4.0 document is composed of three parts:
1. a line containing HTML version information,
2. a declarative header section (delimited by the HEAD element),
3. a body, which contains the document's actual content. The body may be implemented by the BODY element or the FRAMESET element.
White space (spaces, newlines, tabs, and comments) may appear before or after each section. Sections 2 and 3 should be delimited by the HTML element."

So "White space" is allowed before the line containing HTML version information. But what is white space? It is defined in Section 9.1, White space:

"The document character set includes a wide variety of white space characters. Many of these are typographic elements used in some applications to produce particular visual spacing effects. In HTML, only the following characters are defined as white space characters:
* ASCII space (&#x0020;)
* ASCII tab (&#x0009;)
* ASCII form feed (&#x000C;)
* Zero-width space (&#x200B;)
Line breaks are also white space characters."

So, we need to know what a line break is! Well, section 9.3.2 defines that:

"A line break is defined to be a carriage return (&#x000D;), a line feed (&#x000A;), or a carriage return/line feed pair."

That's all. So the only characters that are allowed in an HTML 4.0 web page before the HTML header are U+0009, U+000A, U+000C, U+000D, U+0020, and U+200B. The BOM, U+FEFF, is not among them. QED.

roozbeh

PS: UTF-16 is an exception to that, since the BOM is not part of the document and should be removed for processing.
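
For anyone who wants to check a page against the argument above, here is a minimal sketch in Python. It assumes the page is served as UTF-8 and simply inspects everything before the first "<" (which covers both the version line and any leading comment). The names ALLOWED_LEADING and disallowed_leading_chars are made up for this illustration and do not come from any specification or library.

```python
# Characters that, per the reading of HTML 4.0 sections 9.1 and 9.3.2 above,
# may appear before the line containing HTML version information.
ALLOWED_LEADING = {"\u0009", "\u000A", "\u000C", "\u000D", "\u0020", "\u200B"}


def disallowed_leading_chars(data: bytes) -> list:
    """Return code points found before the first '<' that are not allowed.

    Assumption: the bytes are UTF-8. Python's plain "utf-8" codec keeps a
    leading BOM as U+FEFF instead of stripping it, which is what we want here.
    """
    text = data.decode("utf-8")
    # Everything before the first '<' precedes the doctype (or a comment).
    head, _, _ = text.partition("<")
    return [f"U+{ord(ch):04X}" for ch in head if ch not in ALLOWED_LEADING]


doctype = b'<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"><html>'

print(disallowed_leading_chars(b" \r\n\t" + doctype))      # []
print(disallowed_leading_chars(b"\xef\xbb\xbf" + doctype))  # ['U+FEFF']
```

The second call is the UTF-8 BOM case: the three bytes EF BB BF decode to U+FEFF, which is not in the allowed set.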