On Thu, 10 Jul 2008 15:25:13 +0100, Barney Carroll wrote:
> Thanks for your swift responses, all.
>
>
> The validator gives me an unconditional pass after putting in Kevin's
> properly-formed tag:
> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
>
> Notepad++ was really nice, but I suspected it was being a bit silly. On a
> Mac I'd use BBEdit (loved that) but PCs seem to be short on really
> head-above-the-crowd open source code editors.
>
>
> Nikita, that's worth knowing... And yes it is ending up US-ASCII, but I'd
> just like to be sure I'm sticking to the lowest common denominator...
>
>
> Regards,
> Barney
>
> On 7/10/08, Nikita The Spider The Spider <[EMAIL PROTECTED]> wrote:
>> On Thu, Jul 10, 2008 at 8:27 AM, Barney Carroll <[EMAIL PROTECTED]> wrote:
>>> Hello all,
>>>
>>> I've got a problem with character set encoding I'd like to rectify. I use
>>> UTF-8 as a matter of convenience and ideology, and don't believe it should
>>> be that much of a problem. My editor (Notepad++) is set to create new files
>>> in UTF-8 without a byte order mark, but when I retrieve files from my
>>> server it tells me that they're ANSI.
>>
>>
>> Does "ANSI" mean US-ASCII? The most popular single-byte encodings
>> (ISO-8859-X, Win-1252) and UTF-8 are supersets of US-ASCII, so a US-ASCII
>> file is also valid UTF-8 (and ISO-8859-X and Win-1252) all at the same
>> time. It's pretty easy to write English-language pages that are 100% pure
>> US-ASCII, so this might be your situation. Notepad++ has saved the file as
>> UTF-8, but in this situation that doesn't look any different from US-ASCII
>> (i.e. "ANSI").
>>
[...]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

FWIW - The META content-type only matters when no charset arrives
in the HTTP headers: for example, when someone saves your page to
disk and opens it from a local file.

For files served from the web, browsers look first for encoding
information in the response headers. Many servers are set up to send
ISO-8859-1 (or the more recent ISO-8859-15). If you want to include
characters outside that encoding's repertoire, you must write them
as HTML entities or numeric character references. If you use things
like "curly quotes" a lot, your source text quickly becomes
unreadable. UTF-8 encoding lets you add these as regular text.
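
To make the difference concrete, here is a small Python sketch (the
sample strings are made up for illustration). It checks that plain
ASCII text comes out as the same bytes in every encoding mentioned in
this thread, and shows what happens to curly quotes when the target
is ISO-8859-1 versus UTF-8:

```python
# Hypothetical sample strings, chosen only for this illustration.
text_ascii = "Plain English text"
text_curly = "\u201cquoted\u201d"  # left and right curly double quotes

# Pure ASCII text encodes to identical bytes in all of these encodings,
# which is why an editor cannot tell them apart for such a file.
encodings = ["ascii", "utf-8", "iso-8859-1", "cp1252"]
ascii_bytes = {enc: text_ascii.encode(enc) for enc in encodings}
same_bytes = len(set(ascii_bytes.values())) == 1

# ISO-8859-1 has no curly quotes, so they must become numeric
# character references; UTF-8 stores them directly as multi-byte text.
as_latin1 = text_curly.encode("iso-8859-1", errors="xmlcharrefreplace")
as_utf8 = text_curly.encode("utf-8")

print(same_bytes)
print(as_latin1)
print(as_utf8)
```

The ISO-8859-1 version is what you would have to type (or generate)
in the page source; the UTF-8 version is just the characters
themselves, stored as three bytes each.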

It's not just the editor that needs to encode things properly. You
also have to make sure the file is uploaded in "binary" mode rather
than the usual ASCII mode, which converts the file in transit. Your
server must also send the correct encoding header.

The W3C has information on how to do that using your .htaccess file:

 <http://www.w3.org/International/questions/qa-htaccess-charset>

The Firefox developer tools or Opera's "Info" sidebar can show you
what the response headers are.
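
If you would rather check from a script, something along these lines
should work. It is only a sketch using the Python standard library;
the function names are my own, and the URL in the usage note is a
placeholder, not a real test page:

```python
from email.message import Message
import urllib.request


def charset_from_content_type(value):
    """Return the charset named in a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() returns the charset parameter in lowercase,
    # or None when the header carries no charset at all.
    return msg.get_content_charset()


def fetch_charset(url):
    """Fetch a URL and report the charset the server declares for it."""
    with urllib.request.urlopen(url) as response:
        content_type = response.headers.get("Content-Type", "")
    return charset_from_content_type(content_type)


print(charset_from_content_type("text/html; charset=UTF-8"))
print(charset_from_content_type("text/html"))
```

Calling fetch_charset("http://www.example.com/") would report what
that server declares. A result of None means the server sends no
charset, which is the case where the META tag gets consulted.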

Hoping this helps.


Cordially,
David
--



