From: "Roozbeh Pournader" <[EMAIL PROTECTED]>

> PS: UTF-16 is an exception to that, since the BOM is not part of the
> document and should be removed for processing.
And to whatever extent UTF-8 has a BOM, it would fall under the same category. Certainly that is how processors that understand the UTF-8 BOM deal with it.

Rather than treating HTML like the SQL standard (lofty goals that no one company completely supports because it would be insane to do it!), they can bend to the actual usage out there and just move on, right?

Even if you ignore the BOM as a BOM, the notion that a zero width space (U+200B) is legal but a zero width no-break space (U+FEFF) is not just smacks of silliness. Either way, at the beginning of an HTML page you are not going to show it: either you stripped it as a BOM, or there is no visible representation for it.

How many browsers plan to refuse to show pages that do not follow HTML 4.0 rules? :-)

Of course, if I had a penny for every byte that has been used discussing these three bytes (EF BB BF) sometimes found at the beginning of a UTF-8 document, I would not be working this weekend; I'd be somewhere really warm and sunny.

MichKa