On 10/19/07, Cliff Hirsch <[EMAIL PROTECTED]> wrote:
> Thanks. This is helpful. Here's another interesting puzzle. Why does the
> page info in FireFox say encoding: UTF-8 while the Content-Type is
> charset=iso-8859-1.

Note that even if you set the Content-Type header to ISO-8859-1, it's still very possible to generate page content in some other encoding. For example, databases pretty much put out what you put in. So if you have an "admin.php" putting data in as UTF-8 and all your other pages are doing ISO-8859-1 with data from the DB, the data from the DB will still be in UTF-8 and non-ASCII characters will not be rendered properly.

> Ah, I think I see it. The encoding is how the page was saved. And as usual,
> Microsoft butchers everything.

Actually, if you right-click on the page in FF and select "page info", I think it should tell you the real encoding emitted by the server (the Content-Type header). Note that the META content-type tag in a page is basically ignored when the server sends a charset in the Content-Type header; it mostly matters when the page is read from disk.

Also, note that if you're serving static content, the web server may have a default Content-Type charset, which is often UTF-8. So if you have some pages that are encoded in ISO-8859-1 and the web server default is UTF-8, the web server will send charset=UTF-8 but the page will actually still be encoded in ISO-8859-1 and will not be rendered properly.

> But this is php -- the page is dynamically generated. So is the encoding
> picked up from my php script, index.php, or the template file index.tpl?

The browser will interpret the page based on the Content-Type encoding. Period. But it's up to *you* to make sure the page is really encoded in that encoding.

It sounds like the script files are actually in the wrong encoding or contain funky characters. The easiest way to determine if that is the case is to run hexdump or a hex editor and look at one of the script files with a non-ASCII character in it. If that character is encoded with one byte, then the encoding is probably ISO-8859-x. If the character is encoded with two or more bytes, it's probably UTF-8.

Note that if you have a lot of pages in the wrong encoding, you might want to look into the iconv utility found on *nix machines.

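If it helps, here is a rough sketch of the one-byte versus two-byte check and an iconv-style conversion. It's in Python rather than PHP just to keep it short, and the function names (looks_like_utf8, guess_encoding, convert) are made up for illustration, not from any library. The guess is only a heuristic: it assumes the data is either UTF-8 or ISO-8859-1 and nothing else.

```python
# Sketch only: the helper names below are illustrative assumptions.

def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def guess_encoding(data: bytes) -> str:
    """Rough guess: 'ascii', 'utf-8', or 'iso-8859-1' as the fallback."""
    if all(b < 0x80 for b in data):
        return "ascii"  # no non-ASCII bytes, so there is nothing to tell apart
    return "utf-8" if looks_like_utf8(data) else "iso-8859-1"


def convert(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode bytes, roughly what `iconv -f src -t dst` does."""
    return data.decode(src).encode(dst)


text = "caf\u00e9"                   # "cafe" with an acute accent on the e
latin1 = text.encode("iso-8859-1")   # accented e is the single byte 0xE9
utf8 = text.encode("utf-8")          # accented e is the two bytes 0xC3 0xA9

print(latin1.hex(" "))               # hex dump of the ISO-8859-1 bytes
print(utf8.hex(" "))                 # hex dump of the UTF-8 bytes
print(guess_encoding(latin1), guess_encoding(utf8))

# UTF-8 bytes labelled as ISO-8859-1: the browser sees two characters
# where one was meant, which is the classic garbled output.
mislabelled = utf8.decode("iso-8859-1")
print(len(text), len(mislabelled))

# Fixing a file that was saved in the wrong encoding.
assert convert(latin1, "iso-8859-1", "utf-8") == utf8
```

The fallback to ISO-8859-1 is a judgment call: any byte sequence is valid ISO-8859-1, so that branch can never fail, while real ISO-8859-1 text with non-ASCII characters only rarely happens to be valid UTF-8.
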
Mike

--
Michael B Allen
PHP Active Directory SPNEGO SSO
http://www.ioplex.com/
_______________________________________________
New York PHP Community Talk Mailing List
http://lists.nyphp.org/mailman/listinfo/talk
