Re: [WSG] Character Encoding Mismatch

2008-04-06 Thread Nikita The Spider
On Sun, Apr 6, 2008 at 12:11 AM, David Hucklesby <[EMAIL PROTECTED]> wrote:
> > On Fri, Apr 4, 2008 at 4:16 PM, Kristine Cummins
>  > <[EMAIL PROTECTED]> wrote:
>  >
>  >> Can someone tell me how to fix this W3C warning – I'm new to 
> understanding this part.
>  >> 
>  >>
>
>  On Fri, 4 Apr 2008 20:15:19 -0400, Nikita The Spider replied:
>  > Kristine,
>  > If your server is already specifying the character set (a.k.a. encoding) 
> then you don't
>  > need to do so in your HTML. In fact, I'd recommend against doing so, ...
>
>  The META tag is needed when serving the page from the hard drive -
>  for example, when the page is saved for viewing later. (The hard drive
>  does not send HTTP headers.)

That's a good point that I should have mentioned, and I'm glad you
brought it up. However, IMO this need is often overstated. Browsers
are pretty good at guessing the encoding when they need to. I wouldn't
rely on browsers guessing correctly for public pages, but I think the
clutter of having duplicate encoding declarations usually outweighs
the benefit.

Of course, ideally one looks at one's pages using a local Web server.
I think Windows & Linux come with one preinstalled and I know that OS
X does, so this should be within the reach of most folks.
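For anyone without a server already set up, a few lines of Python can stand in for one. This is only a sketch and not something from the thread: the `Utf8Handler` and `serve` names, the port number and the choice of UTF-8 are all assumptions made for illustration.

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler


class Utf8Handler(SimpleHTTPRequestHandler):
    """Serve files from the current directory, labelling HTML as UTF-8."""

    # Copy the stock mapping and override only the HTML extensions, so the
    # Content-Type header carries an explicit charset as a real server would.
    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".html": "text/html; charset=utf-8",
        ".htm": "text/html; charset=utf-8",
    }


def serve(port=8000):
    """Block and serve http://localhost:<port>/ until interrupted."""
    HTTPServer(("localhost", port), Utf8Handler).serve_forever()
```

Calling `serve()` from the directory that holds the pages should make them reachable at http://localhost:8000/ with an HTTP charset declaration, so no META tag is needed for local previewing.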

Cheers

-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more


***
List Guidelines: http://webstandardsgroup.org/mail/guidelines.cfm
Unsubscribe: http://webstandardsgroup.org/join/unsubscribe.cfm
Help: [EMAIL PROTECTED]
***



Re: [WSG] Character Encoding Mismatch

2008-04-05 Thread David Hucklesby
> On Fri, Apr 4, 2008 at 4:16 PM, Kristine Cummins
> <[EMAIL PROTECTED]> wrote:
>
>> Can someone tell me how to fix this W3C warning – I'm new to understanding 
>> this part.
>> 
>>

On Fri, 4 Apr 2008 20:15:19 -0400, Nikita The Spider replied: 
> Kristine,
> If your server is already specifying the character set (a.k.a. encoding) then 
> you don't
> need to do so in your HTML. In fact, I'd recommend against doing so, ...

The META tag is needed when serving the page from the hard drive -
for example, when the page is saved for viewing later. (The hard drive
does not send HTTP headers.)

Cordially,
David
--







Re: [WSG] Character Encoding Mismatch

2008-04-04 Thread Nikita The Spider
On Fri, Apr 4, 2008 at 4:16 PM, Kristine Cummins
<[EMAIL PROTECTED]> wrote:
>
> Can someone tell me how to fix this W3C warning – I'm new to understanding
> this part.
>  

Kristine,
If your server is already specifying the character set (a.k.a.
encoding) then you don't need to do so in your HTML. In fact, I'd
recommend against doing so, and the problem you've experienced is
exactly why. If you specify the encoding in two (or more) places, they
can get out of synch. You might *think* you're specifying ISO-8859-1
because that's what your HTML META tag says, but if the server says
something else, that's what takes priority.
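The precedence described above can be sketched in a few lines of Python. This is an illustration only, not the validator's actual code: the function names are made up, and the regular expression is a deliberately loose assumption about how a charset declaration looks.

```python
import re

# Matches charset=... in either an HTTP header value or a <meta> tag.
_CHARSET = r"charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)"


def charset_from_header(content_type):
    """Return the charset named in a Content-Type header value, or None."""
    match = re.search(_CHARSET, content_type or "", re.IGNORECASE)
    return match.group(1).lower() if match else None


def charset_from_meta(html_bytes):
    """Return the charset named in the first <meta> declaration, or None."""
    # The declaration itself is plain ASCII, so a lossy decode of the first
    # kilobyte is enough to find it.
    head = html_bytes[:1024].decode("ascii", errors="replace")
    match = re.search(r"<meta[^>]*?" + _CHARSET, head, re.IGNORECASE)
    return match.group(1).lower() if match else None


def effective_charset(content_type, html_bytes):
    """Return (charset_to_use, mismatch); the HTTP header wins if present."""
    from_header = charset_from_header(content_type)
    from_meta = charset_from_meta(html_bytes)
    mismatch = bool(from_header and from_meta and from_header != from_meta)
    return (from_header or from_meta), mismatch
```

For a page whose META tag says utf-8 while the server sends `text/html; charset=ISO-8859-1`, `effective_charset` returns `("iso-8859-1", True)`, which is the situation the W3C warning reports.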

It's important to understand that the encoding tells browsers (and
other user agents, like Googlebot) how to interpret non-ASCII
characters in your page. It's a common mistake to think that these are
restricted to accented characters that we generally don't use in
English, but content pasted in from Microsoft Word (for instance) is
likely to contain non-ASCII as well. In other words, you might be
using them without realizing it. If you are, and you get the encoding
wrong, then what you see as quote marks (for instance) might look like
this to others: â€
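That garbage is easy to reproduce. The sketch below assumes the page was saved as UTF-8 and then read as Windows-1252, which is how browsers have generally treated pages labelled ISO-8859-1; the function name is hypothetical.

```python
def misread_as_windows_1252(text):
    """Encode text as UTF-8, then decode those bytes as Windows-1252."""
    # A few byte values are undefined in Windows-1252, so some characters
    # (the right curly quote among them) raise UnicodeDecodeError here.
    return text.encode("utf-8").decode("cp1252")


# U+201C is the left curly quote that Word likes to insert.
print(misread_as_windows_1252("\u201cHello"))  # prints â€œHello
```

The single curly quote becomes three unrelated characters because UTF-8 stores it as three bytes and the wrong decoder reads each byte as a character of its own.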

Whatever tool you're using to save files should give you a choice of
which encoding/character set to use. You can use ISO-8859-1 to write
in English and most Western European languages. Since your Web server
is already identifying your pages as such, it might be a good choice.
Others have suggested UTF-8 which can represent anything under the
sun. That's great, but you'll have to find some way to cajole your
server into telling the world that your pages are UTF-8, not
ISO-8859-1. If you can't, you'll have to stick to the latter.
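If you do stay with ISO-8859-1, it helps to know whether your text actually fits in it. A minimal check might look like the following; both function names are invented for this sketch.

```python
def fits_iso_8859_1(text):
    """Return True if every character can be stored as ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


def offending_characters(text):
    """Return the characters that fall outside ISO-8859-1, sorted."""
    # ISO-8859-1 maps exactly to code points U+0000 through U+00FF.
    return sorted({ch for ch in text if ord(ch) > 0xFF})
```

Accented Western European letters pass, while curly quotes pasted from Word are reported by `offending_characters`, so they would need a character reference or a switch to UTF-8.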

-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more





RE: [WSG] Character Encoding Mismatch

2008-04-04 Thread Andrew Cunningham
 

The advice below is sufficient if your content is limited to characters in
the ISO-8859-1 repertoire. If you are using any characters outside this
repertoire on the site, then I wouldn't use this approach.

As indicated in a previous email, you could ask your webmaster to change the
default configuration of the Apache server. That is unlikely to happen if
other sites are hosted on the server, since it may negatively affect them.

An alternative would be to use a .htaccess file, if the administrators allow
you to do this.

Info on this approach is available at
http://www.w3.org/International/questions/qa-htaccess-charset


Andrew


On Sat, April 5, 2008 6:52 am, Kepler Gelotte wrote:
>> Can someone tell me how to fix this W3C warning - I'm new to
>> understanding this part.
>
> Change this tag in your <head> section:
>
> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
>
> To:
>
> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
>
> Best regards,
> Kepler Gelotte
> Neighbor Webmaster, Inc.
> 156 Normandy Dr., Piscataway, NJ 08854
> www.neighborwebmaster.com
> phone/fax: (732) 302-0904


-- 
Andrew Cunningham
Research and Development
Coordinator
Vicnet
State Library of Victoria
Australia

[EMAIL PROTECTED]




RE: [WSG] Character Encoding Mismatch

2008-04-04 Thread Kristine Cummins
FIXED. The URL below will not show any warnings now.
 
Thanks again.
 
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of Tim Offenstein
Sent: Friday, April 04, 2008 1:42 PM
To: wsg@webstandardsgroup.org
Subject: Re: [WSG] Character Encoding Mismatch
 
At 1:16 PM -0700 4/4/08, Kristine Cummins wrote:
Can someone tell me how to fix this W3C warning - I'm new to understanding
this part.
<http://validator.w3.org/check?uri=http%3A%2F%2Fwww.beverlywilson.com%2F>
 
Thanks!
 
 
In the header of your HTML should be a line like this -
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />.
Your server is sending an HTTP header that tells browsers to use the
ISO-8859-1 character set, hence the mismatch. You can fix it by changing the
line in your HTML to charset=iso-8859-1. However I always recommend instead
using utf-8 because it's broader. The ISO-8859-1 character repertoire is
actually a subset of what utf-8 can represent. You'll have to talk to your
server admin to change the HTTP header, I believe.
 
-Tim
-- 
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
   Tim Offenstein  ***  Campus Accessibility Liaison  ***  (217) 244-2700
CITES Departmental Services  ***  www.uiuc.edu/goto/offenstein



RE: [WSG] Character Encoding Mismatch

2008-04-04 Thread Kepler Gelotte
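> Can someone tell me how to fix this W3C warning - I'm new to understanding
> this part.

Change this tag in your <head> section:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

To:

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />

Best regards,
Kepler Gelotte
Neighbor Webmaster, Inc.
156 Normandy Dr., Piscataway, NJ 08854
www.neighborwebmaster.com
phone/fax: (732) 302-0904

Thanks!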





Re: [WSG] Character Encoding Mismatch

2008-04-04 Thread Tim Offenstein

At 1:16 PM -0700 4/4/08, Kristine Cummins wrote:
Can someone tell me how to fix this W3C warning - I'm new to 
understanding this part.

<http://validator.w3.org/check?uri=http%3A%2F%2Fwww.beverlywilson.com%2F>

Thanks!



In the header of your HTML should be a line like this -
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />.
Your server is sending an HTTP header that tells browsers to use the
ISO-8859-1 character set, hence the mismatch. You can fix it by
changing the line in your HTML to charset=iso-8859-1. However I
always recommend instead using utf-8 because it's broader. The
ISO-8859-1 character repertoire is actually a subset of what utf-8 can
represent. You'll have to talk to your server admin to change the HTTP
header, I believe.
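One nuance on the subset relationship: it holds for the set of characters, not for the bytes on disk. The short Python sketch below (the helper name is made up) shows that only plain ASCII is stored identically in both encodings.

```python
def byte_forms(ch):
    """Return (ISO-8859-1 bytes or None, UTF-8 bytes) for one character."""
    try:
        latin1 = ch.encode("iso-8859-1")
    except UnicodeEncodeError:
        latin1 = None  # character is outside the ISO-8859-1 repertoire
    return latin1, ch.encode("utf-8")


# A plain letter, e-acute, and a left curly quote.
for ch in ("A", "\u00e9", "\u201c"):
    print(f"U+{ord(ch):04X}", byte_forms(ch))
```

The letter A is one identical byte either way, e-acute is one byte in ISO-8859-1 but two in UTF-8, and the curly quote has no ISO-8859-1 form at all. That is why a file must be re-saved, not just relabelled, when switching encodings.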


-Tim
--
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
   Tim Offenstein  ***  Campus Accessibility Liaison  ***  (217) 244-2700
CITES Departmental Services  ***  www.uiuc.edu/goto/offenstein



Re: [WSG] Character encoding mismatch

2005-11-23 Thread Paul Collins



Hi Richard, 
 
Thanks for that info, the guy who runs the server 
here has fixed the server to run UTF-8, so no problems there. 
 
The XHTML reference was really good. I had started
using the &apos; XHTML entity for ' not realising that it wouldn't work
for browsers that don't read XHTML (such as IE5). Glad I got to read that one
before we went live!  I have now changed it to &#8217;

What's your opinion on using Character Entities
over Hexadecimal values? I can't seem to get a clear response on which is
better.
 
Thanks again.
Paul
 



RE: [WSG] Character encoding mismatch

2005-11-22 Thread Richard Ishida
Thanks, Susan, for pointing to that stuff.

Paul, if you're using Apache you may also find this particularly useful:
"Setting 'charset' information in .htaccess"
http://www.w3.org/International/questions/qa-htaccess-charset

That would allow you to continue using utf-8, which I think is a good move.

Also, you may find the following useful wrt using character references:
"Using character entities and NCRs"
http://www.w3.org/International/questions/qa-escapes

Hope that helps,
RI



Richard Ishida
Internationalization Lead
W3C

http://www.w3.org/People/Ishida/
http://www.w3.org/International/
http://people.w3.org/rishida/blog/
http://www.flickr.com/photos/ishida/
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Susanne Jäger
> Sent: 10 November 2005 12:21
> To: wsg@webstandardsgroup.org
> Subject: Re: [WSG] Character encoding mismatch
> 
> Paul Collins wrote, On 10.11.2005 12:44:
> 
> > I thought this was the correct way to add special characters for 
> > XHTML, but what I am reading now seems to contradict this. 
> This is the 
> > part of standards where I get a bit confused. Does anyone have any 
> > advice or know of some good articles where they explain 
> this in simple 
> > terms??
> 
> Have a look at the material in W3Cs 
> internationalization-Section W3C I18N Topic Index 
> <http://www.w3.org/International/resource-index.html#charset>
> 
> I like the Tutorial: Character sets & encodings in XHTML, 
> HTML and CSS 
> <http://www.w3.org/International/tutorials/tutorial-char-enc/#choosing>
> At least they try to explain the rather complicated stuff for 
> everyone. ;-)
> 
> HTH
> Susanne
> 
> 
> --
> http://sujag.de - Webentwicklung und -beratung 
> [EMAIL PROTECTED] Lottumstr. 22, 10119 Berlin, Tel: 030 - 440 483 47
> **
> The discussion list for  http://webstandardsgroup.org/
> 
>  See http://webstandardsgroup.org/mail/guidelines.cfm
>  for some hints on posting to the list & getting help
> **
> 

**
The discussion list for  http://webstandardsgroup.org/

 See http://webstandardsgroup.org/mail/guidelines.cfm
 for some hints on posting to the list & getting help
**



Re: [WSG] Character encoding mismatch

2005-11-10 Thread Paul Collins



That seems to work, thanks heaps 
Rimantas



Re: [WSG] Character encoding mismatch

2005-11-10 Thread Paul Collins



Thanks Susanne, that's a really good 
reference.
 
Cheers,Paul



Re: [WSG] Character encoding mismatch

2005-11-10 Thread Susanne Jäger
Paul Collins wrote, On 10.11.2005 12:44:

> I thought this was the correct way to add special
> characters for XHTML, but what I am reading now seems to contradict
> this. This is the part of standards where I get a bit confused. Does
> anyone have any advice or know of some good articles where they explain
> this in simple terms??

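Have a look at the material in W3C's internationalization section,
W3C I18N Topic Index
<http://www.w3.org/International/resource-index.html#charset>

I like the Tutorial: Character sets & encodings in XHTML, HTML and CSS
<http://www.w3.org/International/tutorials/tutorial-char-enc/#choosing>
At least they try to explain the rather complicated stuff for everyone. ;-)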

HTH
Susanne


-- 
http://sujag.de - Webentwicklung und -beratung
[EMAIL PROTECTED]
Lottumstr. 22, 10119 Berlin, Tel: 030 - 440 483 47



Re: [WSG] Character encoding mismatch

2005-11-10 Thread Lloyd
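Instead of:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Try:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />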

This will match what your web server is sending, otherwise change your
web server config if you can :-)

Lloyd

On 11/10/05, Paul Collins <[EMAIL PROTECTED]> wrote:
>
> I am getting the following warning when I validate my pages:
>
> --
> Character Encoding mismatch!
>
> The character encoding specified in the HTTP header (iso-8859-1) is
> different from the value in the <meta> element (utf-8). I will use the value
> from the HTTP header (iso-8859-1) for this validation.
>
> --
>
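> My header code looks like this, which should validate fine:
>
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
>  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
> <head>
>  <title>title</title>
>  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />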
>
> I have just started reading more about character encoding and special
> characters, is my problem that I have used decimal character references? For
> example
>
> - as -
>
> ' as '
>
> and so on. I thought this was the correct way to add special characters for
> XHTML, but what I am reading now seems to contradict this. This is the part
> of standards where I get a bit confused. Does anyone have any advice or know
> of some good articles where they explain this in simple terms??
>
> Cheers



Re: [WSG] Character encoding mismatch

2005-11-10 Thread Rimantas Liubertas
2005/11/10, Paul Collins <[EMAIL PROTECTED]>:
>
> I am getting the following warning when I validate my pages:
>
> --
> Character Encoding mismatch!
>
> The character encoding specified in the HTTP header (iso-8859-1) is
> different from the value in the <meta> element (utf-8). I will use the value
> from the HTTP header (iso-8859-1) for this validation.
<...>
> and so on. I thought this was the correct way to add special characters for
> XHTML, but what I am reading now seems to contradict this. This is the part
> of standards where I get a bit confused. Does anyone have any advice or know
> of some good articles where they explain this in simple terms??

The problem is not with your XHTML but with your server. Most likely
you are running Apache with AddDefaultCharset in configuration. If you
have access to httpd.conf you
should just comment out this directive, or change it to utf-8.
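After a configuration change like this, it is worth confirming what the server really sends. The following Python sketch uses only the standard library; `declared_charset` and `check_url` are names invented here, and the URL you pass is up to you.

```python
from email.message import Message
from urllib.request import urlopen


def declared_charset(headers):
    """Return the charset parameter of Content-Type, lower-cased, or None."""
    return headers.get_content_charset()


def check_url(url):
    """Fetch url and return the charset the server declares for it."""
    with urlopen(url) as response:
        return declared_charset(response.headers)


# Offline demonstration with a hand-built header set.
sample = Message()
sample["Content-Type"] = "text/html; charset=ISO-8859-1"
print(declared_charset(sample))  # prints iso-8859-1
```

If `check_url` returns `None` for one of your pages, the server is sending no charset at all and the META declaration in the page is what browsers will fall back on.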

Regards,
Rimantas



RE: [WSG] Character encoding

2005-06-10 Thread Richard Ishida
Hello Joshua, all,

Here is the advice from the W3C Internationalization Activity:

http://www.w3.org/International/tutorials/tutorial-char-enc/en/all.html#Slide0420

(See in particular the subsection "When to use escapes".)

In summary, use characters rather than escapes when you can, except for a 
handful of syntax-significant characters, and for ambiguous or invisible 
characters. (Note that we also suggest using hex numbers rather than decimal, 
since most charts or people dealing with character code points refer to them 
that way - but that's not essential.)
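As a rough illustration of that summary, here is a Python sketch (helper names are hypothetical) that escapes only the syntax-significant characters and builds a hexadecimal reference for the occasional character you do want to escape.

```python
import html


def escape_markup(text):
    """Escape only the characters that are significant to HTML syntax."""
    # html.escape handles &, < and >, plus both quote characters by default.
    return html.escape(text)


def hex_ncr(ch):
    """Return the hexadecimal numeric character reference for ch."""
    return f"&#x{ord(ch):X};"


print(escape_markup('a < b & "c"'))  # prints a &lt; b &amp; &quot;c&quot;
print(hex_ncr("\u2019"))             # prints &#x2019;
```

Decimal and hexadecimal references are interchangeable: `html.unescape("&#8217;")` and `html.unescape("&#x2019;")` both give the same right single quotation mark. The hex form simply matches the U+2019 notation used in code charts.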

Hope that helps.
RI



Richard Ishida
W3C

contact info:
http://www.w3.org/People/Ishida/ 

W3C Internationalization:
http://www.w3.org/International/ 

Publication blog:
http://people.w3.org/rishida/blog/
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Joshua Street
> Sent: 04 June 2005 03:52
> To: Web Standards Group mailing list
> Subject: [WSG] Character encoding
> 
> I've always thought that characters should be marked up with 
> appropriate entity codes (for example, accented letters, 
> etc.) in (X)HTML, rather than simply pasted in and left for 
> character encoding and the user agent to take care of.  I've 
> written a plugin for the WordPress weblog software that does 
> this for most characters ( 
> http://www.joahua.com/blog/2005/06/04/curlyenc-03 - any 
> discussion regarding this email to me offlist or post as 
> comments, please, because it's software-related ), but I'm 
> still not sure if it's required.  It's just always felt dirty 
> seeing certain characters not written in their appropriate 
> entity codes.
> 
> Could someone shed any light on this?  Are entity codes 
> redundant, or should we be using them where possible?
> 
> Kind Regards,
> Joshua Street
> 
> base10solutions
> Website:
> http://www.base10solutions.com.au/
> Phone: (02) 9898-0060  Fax: (02)
> 8572-6021
> Mobile: 0425 808 469
> 
> Multimedia  Development  Agency
> 
> 



Re: [WSG] Character encoding

2005-06-05 Thread Dmitry Baranovskiy

Vaska.WSG wrote:


For some reason, I feel I have to escape every character that is not a
letter or number.



I was feeling the same, and working on it, when this thread arrived.  
At the time it appeared I was looking up numeric entity lists in 
Cyrillic and adapting them to a conversion_map function (for PHP).  I 
was arriving at the conclusion that it was completely crazy to go this 
route...because Chinese was on the horizon.


Thanks...v


I think it could be done easily. That's what I am trying to do in my tiny 
typograph (http://siter.com.au/dmitry/typo)




Re: [WSG] Character encoding

2005-06-05 Thread Vaska . WSG

For some reason, I feel I have to escape every character that is not a
letter or number.



I was feeling the same, and working on it, when this thread arrived.  
At the time it appeared I was looking up numeric entity lists in 
Cyrillic and adapting them to a conversion_map function (for PHP).  I 
was arriving at the conclusion that it was completely crazy to go this 
route...because Chinese was on the horizon.
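For what it's worth, no hand-built conversion map is needed for this kind of escaping. The sketch below is in Python, not the PHP mentioned above, and the function name is made up; it relies on the standard `xmlcharrefreplace` error handler.

```python
def to_numeric_references(text):
    """Replace every non-ASCII character with a decimal character reference."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


print(to_numeric_references("Привет"))
# prints &#1055;&#1088;&#1080;&#1074;&#1077;&#1090;
```

The output also shows why the approach scales badly: six readable letters become forty-two characters of markup, and serving the page as UTF-8 makes the whole step unnecessary.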


Thanks...v




Re: [WSG] Character encoding (HTML Tidy)

2005-06-05 Thread Geoff Deering

Gene Falck wrote:



Tidy is one of the programs I have been thinking
of getting, so I would like to hear about any
bugs and bug fixes.

Regards,

Gene Falck



Tidy has evolved from its beginning with Dave Raggett.  Like many 
tools, it's great when you learn how to use it, and to work with its 
bugs.  I haven't had to use it recently, but will do so again soon.  It's 
available in many forms


http://tidy.sourceforge.net/

Here's the reference on Encoding
http://tidy.sourceforge.net/docs/quickref.html#char-encoding

Programs like Homesite and TopStyle use Tidy, but it is your 
responsibility to make sure you are using the latest version of Tidy.exe 
(from SF.net), which is located in each program's root directory.  In 
Homesite it is dated 16 June 2001, so it's pretty old and out of date.  
TopStyle is 8 Aug 2004, where the latest version on the SF.net site is 
22 May 2005.  Just download it and update it.


I remember there used to be a bug that drove me nuts where it would 
escape characters it shouldn't, but can't remember what the situation 
was.  Probably been fixed, it was a while back.  It has got better and 
better, with fewer bugs.


I find it a great tool to have in the kit.

Regards
Geoff



Re: [WSG] Character encoding

2005-06-04 Thread Gene Falck

Hi Geoff,

You wrote:

... I know I have developed sites in the past that I have felt pretty 
confident have been a good attempt at best of practice, but age sure shows 
their vintage, and I am not talking about the CSS, just thinking of the 
(X)HTML.


LOL--that's quite nice compared to what I think of
my older work!

Tidy can help with transforming characters, but it does screw some pages 
up (don't know if those bugs have been fixed).


Tidy is one of the programs I have been thinking
of getting, so I would like to hear about any
bugs and bug fixes.

Regards,

Gene Falck
[EMAIL PROTECTED]





Re: [WSG] Character encoding

2005-06-04 Thread Gene Falck

Hi Matt

You wrote:


Lea, I'm not sure why I always escape the dash - perhaps because I can??? :)

I am assuming the dash will someday cause me problems, so I just
escape it now, to avoid a lot of re-work.


I don't expect an unescaped dash to cause trouble
as it has, AFAIK, no meanings in code. However, I
type on an English-language keyboard so if I want
a dash, curly quotes, a reference to the keyboard
" mark, or proper spelling of various place and
personal names, the unfriendly &#number; escapes
are my only current method. Of course, if I were
working from sources that were already in proper
characters, I could try a little Copy and Paste
experimenting to see how far UTF-8 would go.


Other than that, I escape a lot of "usual characters," such as single
quotes, double quotes, and ampersands.


Those all have meanings in writing code and may
not validate, as you have seen, even if they
are in a running text context; I think only
linking to a separate file will work very well to
escape the quotes and ampersands in JavaScript.

Regards,

Gene Falck
[EMAIL PROTECTED]





Re: [WSG] Character encoding

2005-06-04 Thread Lea de Groot
On Sat, 4 Jun 2005 18:56:16 -0500, Matt Thommes wrote:
> For some reason, I feel I have to escape every character that is not a
> letter or number.

OK, I'm always up for new Best Practices, but I do need some basis for 
adopting changes.
I escape double quotes and ampersands because of the HTML issues.
I can't see a reason to escape single quotes, hyphens et al - love to 
see one :)

Lea
-- 
Lea de Groot
Elysian Systems - I Understand the Internet 
Search Engine Optimisation, Usability, Information Architecture, Web 
Design
Brisbane, Australia



Re: [WSG] Character encoding

2005-06-04 Thread Paul Novitski

At 04:36 PM 6/4/2005, Lea de Groot wrote:

On Sat, 4 Jun 2005 18:07:48 -0500, Matt Thommes wrote:
> For instance, I always escape a dash (-) with &#8211; when
> using it in a normal sentence.

That's interesting - I escape such entities as ampersands (&) and double
quotes ("), but not things such as hyphens.



Lea,

&#8211; is not a hyphen, it's an en-dash.

See for instance http://www.ascii.cl/htmlcodes.htm

Also:
The Trouble With EM 'n EN (and Other Shady Characters)
by Peter K. Sheerin
http://www.alistapart.com/articles/emen/
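A quick way to tell these look-alike characters apart is to ask for their Unicode names. This Python sketch is only an illustration, and `describe` is an invented helper.

```python
import unicodedata


def describe(ch):
    """Return code point, Unicode name and decimal reference for ch."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} &#{ord(ch)};"


# The keyboard hyphen, the en dash and the em dash, written as escapes.
for ch in ("-", "\u2013", "\u2014"):
    print(describe(ch))
```

The three lines printed are `U+002D HYPHEN-MINUS &#45;`, `U+2013 EN DASH &#8211;` and `U+2014 EM DASH &#8212;`, which confirms that &#8211; refers to the en dash and not to the hyphen on the keyboard.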

Paul





Re: [WSG] Character encoding

2005-06-04 Thread Damian Sweeney
>> What benefits or problems avoided do you perceive by doing this and
>> what other characters are you escaping?
>
> Lea, I'm not sure why I always escape the dash - perhaps because I can???
> :)
>

While preparing my recent post about image replacement I was playing with
fangs and noticed that I have a dash in my . Fangs says "dash" for
this. I replaced it with &mdash; and Fangs rendered the mdash. Can anyone
enlighten me as to the behaviour of screen readers for these entities?

tia,

Damian




Re: [WSG] Character encoding

2005-06-04 Thread Geoff Deering

Matt Thommes wrote:
>> What benefits or problems avoided do you perceive by doing this and
>> what other characters are you escaping?
>
> Lea, I'm not sure why I always escape the dash - perhaps because I can??? :)
>
> I am assuming the dash will someday cause me problems, so I just
> escape it now, to avoid a lot of re-work.
>
> Other than that, I escape a lot of "usual characters," such as single
> quotes, double quotes, and ampersands.
>
> For some reason, I feel I have to escape every character that is not a
> letter or number.

I think this is quite an interesting discussion, and I'm sure some of 
the members of this list can shed more light on this, but I do think 
developing with the best-practice foresight of the day does at least 
help to future-proof web sites to address evolving technology.  We never 
know if the search engines or parsers of the future are going to have a 
hard time or easy time making sense of our sites.  I know I have 
developed sites in the past that I have felt pretty confident have been 
a good attempt at best of practice, but age sure shows their vintage, 
and I am not talking about the CSS, just thinking of the (X)HTML.


Tidy can help with transforming characters, but it does screw some pages 
up (don't know if those bugs have been fixed).


So what about  or  (XHTML2)?  Who bothers to use it? 


References
http://www.w3.org/TR/charmod-norm/#sec-WhyNormalization
http://www.w3.org/TR/charmod/

---
Geoff



Re: [WSG] Character encoding

2005-06-04 Thread Matt Thommes
> What benefits or problems avoided do you perceive by doing this and
> what other characters are you escaping?

Lea, I'm not sure why I always escape the dash - perhaps because I can??? :)

I am assuming the dash will someday cause me problems, so I just
escape it now, to avoid a lot of re-work.

Other than that, I escape a lot of "usual characters," such as single
quotes, double quotes, and ampersands.

For some reason, I feel I have to escape every character that is not a
letter or number.


MATTHOM



Re: [WSG] Character encoding

2005-06-04 Thread Lea de Groot
On Sat, 4 Jun 2005 18:07:48 -0500, Matt Thommes wrote:
> For instance, I always escape a dash (-) with &#8211; when
> using it in a normal sentence.

That's interesting - I escape such entities as ampersands (&) and double 
quotes ("), but not things such as hyphens.
What benefits or problems avoided do you perceive by doing this and 
what other characters are you escaping?

warmly,
Lea
-- 
Lea de Groot
Elysian Systems - I Understand the Internet 
Search Engine Optimisation, Usability, Information Architecture, Web 
Design
Brisbane, Australia



Re: [WSG] Character encoding

2005-06-04 Thread Jan Brasna

> I've always thought that characters should be marked up with appropriate
> entity codes...
> It's just always felt dirty seeing certain characters not written in their
> appropriate entity codes.


Eh, maybe on anglo-saxon websites... The rest of the world has a 
different opinion ;)


--
Jan Brasna aka JohnyB :: www.alphanumeric.cz | www.janbrasna.com



Re: [WSG] Character encoding

2005-06-04 Thread Matt Thommes
Josh, I am also torn with this issue.

I ALWAYS escape characters with their decimal values, as Vlad
suggests, even if I am serving up UTF-8.

However, within code snippets (), I don't make as much of an
effort - for whatever reason.

For instance, I always escape a dash (-) with &#8211; when
using it in a normal sentence.

However, within code snippets, I leave the dash, as is - simply
because the dash has more meaning within code snippets (computer
talk), than it does in plain English.

For example, the code snippet of a MySQL date would be something like:
2005-06-05.

I think, in order to preserve the "code essence," I don't turn it
into: 2005–06–05.

I think the dashes are just as important as the numbers, in this case.
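
To make the point concrete, here is a small Python sketch (the helper name parses_as_date is hypothetical, and Python stands in for whatever actually consumes the snippet). It shows that a date copied with en dashes no longer parses as a date, because the parser expects the plain hyphen-minus character.

```python
from datetime import datetime


def parses_as_date(text: str) -> bool:
    """Return True if text is a YYYY-MM-DD date written with plain hyphens."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    return True


print(parses_as_date("2005-06-05"))            # hyphen-minus (U+002D): True
print(parses_as_date("2005\u201306\u201305"))  # en dash (U+2013): False
```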


I could be WAY wrong - these are just my thoughts.


MATTHOM
matthom.com/


On 6/3/05, Joshua Street <[EMAIL PROTECTED]> wrote:
> On Fri, 2005-06-03 at 23:42 -0400, Vlad Alexander wrote:
> > Hi Joshua,
> >
> > If you are serving your content as Unicode (UTF-16 or UTF-8), then there is 
> > no need to use entities. If you do need to escape characters and you are 
> > using XHTML, then it's best to use their decimal values rather than 
> > entities. This makes your markup more easily parsable by XML technologies 
> > in your CMS (on the back-end). For example, instead of &nbsp; use &#160;
> 
> Ah, okay.  The plugin is using decimal values, but WordPress also uses
> UTF-8 by default -- so perhaps it is redundant.
> 
> > >>It's just always felt dirty seeing certain characters
> > >>not written in their appropriate entity codes.
> > Hmmm...that's a very English centric view of the Web ;-)
> 
> Yeah, I thought that too, but couldn't think of another way to say it!
> *blushes whilst wishing he were bilingual!*
> 
> Thanks :)
> 
> --
> Joshua Street <[EMAIL PROTECTED]>
> base10solutions



Re: [WSG] Character encoding

2005-06-03 Thread Joshua Street
On Fri, 2005-06-03 at 23:42 -0400, Vlad Alexander wrote:
> Hi Joshua,
> 
> If you are serving your content as Unicode (UTF-16 or UTF-8), then there is 
> no need to use entities. If you do need to escape characters and you are 
> using XHTML, then it's best to use their decimal values rather than entities. 
> This makes your markup more easily parsable by XML technologies in your CMS 
> (on the back-end). For example, instead of &nbsp; use &#160;

Ah, okay.  The plugin is using decimal values, but WordPress also uses
UTF-8 by default -- so perhaps it is redundant.

> >>It's just always felt dirty seeing certain characters
> >>not written in their appropriate entity codes.
> Hmmm...that's a very English centric view of the Web ;-)

Yeah, I thought that too, but couldn't think of another way to say it!
*blushes whilst wishing he were bilingual!*

Thanks :)

-- 
Joshua Street <[EMAIL PROTECTED]>
base10solutions



Re: [WSG] Character encoding

2005-06-03 Thread XStandard
Hi Joshua,

If you are serving your content as Unicode (UTF-16 or UTF-8), then there is no 
need to use entities. If you do need to escape characters and you are using 
XHTML, then it's best to use their decimal values rather than entities. This 
makes your markup more easily parsable by XML technologies in your CMS (on the 
back-end). For example, instead of &nbsp; use &#160;
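
As a rough illustration of that advice, here is a Python sketch that rewrites named entities as decimal character references. The function name named_to_decimal is made up for this example, and it assumes the HTML 4 entity table that ships with Python (html.entities.name2codepoint). The five entities predefined by XML are left alone, since XML parsers already understand them.

```python
import re
from html.entities import name2codepoint

# Entities every XML parser knows without a DTD; these need no conversion.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def named_to_decimal(markup: str) -> str:
    """Replace named entities (&nbsp;) with decimal references (&#160;)."""

    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave XML-safe or unknown names untouched
        return f"&#{name2codepoint[name]};"

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, markup)


print(named_to_decimal("A&nbsp;B &ndash; C &amp; D"))
# A&#160;B &#8211; C &amp; D
```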

>>It's just always felt dirty seeing certain characters
>>not written in their appropriate entity codes.
Hmmm...that's a very English centric view of the Web ;-)

Regards,
-Vlad
http://xstandard.com
Standards-compliant XHTML WYSIWYG editor


Joshua Street wrote:
> I've always thought that characters should be marked up with appropriate
> entity codes (for example, accented letters, etc.) in (X)HTML, rather
> than simply pasted in and left for character encoding and the user agent
> to take care of.  I've written a plugin for the WordPress weblog
> software that does this for most characters
> ( http://www.joahua.com/blog/2005/06/04/curlyenc-03 - any discussion
> regarding this email to me offlist or post as comments, please, because
> it's software-related ), but I'm still not sure if it's required.  It's
> just always felt dirty seeing certain characters not written in their
> appropriate entity codes.
>
> Could someone shed any light on this?  Are entity codes redundant, or
> should we be using them where possible?
>
> Kind Regards,
> Joshua Street
>
> base10solutions
> Website:
> http://www.base10solutions.com.au/
> Phone: (02) 9898-0060  Fax: (02)
> 8572-6021
> Mobile: 0425 808 469
>
> Multimedia  Development  Agency
>
>



Re: [WSG] Character Encoding Mismatch

2004-07-02 Thread Vincent De Baere
On Friday 02 July 2004 05:03, Ben Bishop wrote:
> Hi Sage,
>
> > When I validate my page, I get the following message
> > The character encoding specified in the HTTP header (utf-8)  is
> > different from the value in the <meta> element (iso-8859-1).
> > I'd like to keep the iso-8859-1 value, just because it seems to work
>
> Your web server (eg Apache) sends the character encoding HTTP header.
> In order to match up your HTTP header to your meta-equiv you would
> need to make the change server-side, something you might not have
> access to do.
>
> The simplest way to match them would be changing your meta tags.

Or, depending on your web server setup, use a .htaccess file: 

http://httpd.apache.org/docs/mod/core.html.en#adddefaultcharset

AddDefaultCharset On should do according to the docs... 

grtz

Vincent
-- 
Vincent De Baere



Re: [WSG] Character Encoding Mismatch

2004-07-02 Thread Ben Bishop
> ...if the server is sending the character encoding...Is there
> any other reason to include it, client-side?

Did you read the W3C link posted? ;)

I can't speak with any authority on this matter, and not meaning to
break the unwritten rule of not answering unless you know the answer,
but:

Some servers can be configured to set the HTTP header from the meta
http-equiv or examining the first few bytes of the document.

In the case of server or configuration limitations, the meta
http-equiv can provide user agents with the encoding.

From http://www.w3.org/TR/REC-html40/charset.html#h-5.2.2

...conforming user agents must observe the following priorities when
determining a document's character encoding (from highest priority to
lowest):
   1. An HTTP "charset" parameter in a "Content-Type" field.
   2. A META declaration with "http-equiv" set to "Content-Type" and a
value set for "charset".
   3. The charset attribute set on an element that designates an
external resource.
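
The priority order quoted above can be sketched in Python roughly as follows. This is only an illustration under simple assumptions: the names charset_param, MetaCharsetFinder and resolve_encoding are hypothetical, the charset is pulled out with a small regular expression, and no byte-level sniffing is attempted.

```python
import re
from html.parser import HTMLParser

CHARSET_RE = re.compile(r"charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.IGNORECASE)


def charset_param(content_type):
    """Return the lower-cased charset parameter of a Content-Type value, or None."""
    match = CHARSET_RE.search(content_type or "")
    return match.group(1).lower() if match else None


class MetaCharsetFinder(HTMLParser):
    """Record the charset of the first META http-equiv="Content-Type" element."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        is_content_type = (attributes.get("http-equiv") or "").lower() == "content-type"
        if tag == "meta" and is_content_type and self.charset is None:
            self.charset = charset_param(attributes.get("content"))


def resolve_encoding(http_content_type, markup, link_charset=None):
    """Pick the encoding: HTTP header first, then META, then the link's charset."""
    finder = MetaCharsetFinder()
    finder.feed(markup)
    return charset_param(http_content_type) or finder.charset or link_charset


page = '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">'
print(resolve_encoding("text/html; charset=utf-8", page))  # utf-8 (header wins)
print(resolve_encoding("text/html", page))                 # iso-8859-1 (META is used)
```

The first call is the mismatch case from this thread: the header says utf-8, so the iso-8859-1 in the META declaration is ignored.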


-ben



Re: [WSG] Character Encoding Mismatch

2004-07-02 Thread Anders Nawroth
Kay Smoljak wrote:
>I was under the impression - please correct me if I'm wrong - that if
>the server is sending the character encoding, there is no need to also
>have the meta tag. Is there any other reason to include it,
>client-side?
Take a look at:


   * An in-document encoding allows the document to be read correctly 
when not on a server. This applies not only to static documents read 
from disk or CD, but also dynamic documents that are saved by the reader.
   * An in-document declaration of this kind helps developers, testers, 
or translation production managers who want to perform a visual check of 
a document.


When using an XML declaration, the encoding should go there, and not in a 
<meta> tag.

/AndersN



Re: [WSG] Character Encoding Mismatch

2004-07-02 Thread Kay Smoljak
On Fri, 2 Jul 2004 13:03:34 +1000, Ben Bishop <[EMAIL PROTECTED]> wrote:
> Your web server (eg Apache) sends the character encoding HTTP header.
> In order to match up your HTTP header to your meta-equiv you would
> need to make the change server-side, something you might not have
> access to do.
> 
> The simplest way to match them would be changing your meta tags.

I was under the impression - please correct me if I'm wrong - that if
the server is sending the character encoding, there is no need to also
have the meta tag. Is there any other reason to include it,
client-side?

-- 
Kay Smoljak
http://kay.smoljak.com



Re: [WSG] Character Encoding Mismatch

2004-07-01 Thread Ben Bishop
Hi Sage,

> When I validate my page, I get the following message
> The character encoding specified in the HTTP header (utf-8)  is
> different from the value in the <meta> element (iso-8859-1).
> I'd like to keep the iso-8859-1 value, just because it seems to work

Your web server (eg Apache) sends the character encoding HTTP header.
In order to match up your HTTP header to your meta-equiv you would
need to make the change server-side, something you might not have
access to do.

The simplest way to match them would be changing your meta tags.

Implications? Check Anne van Kesteren's (of 10 Questions fame:
http://webstandardsgroup.org/features/anne-van-kesteren.cfm )
"Quick Guide to UTF-8"
http://annevankesteren.nl/archives/2004/06/utf-8

If you're really keen, you can find out more about specifying the
character encoding at:
http://www.w3.org/TR/REC-html40/charset.html#h-5.2.2

- ben



Re: [WSG] Character Encoding Mismatch

2004-07-01 Thread Mordechai Peller
Sage Olson wrote:
> Here's my header:
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">

That's not your HTTP header. The HTTP headers are sent by the server 
before even the first byte of your document is sent. That's why in PHP, 
if you're playing with the headers there can't be so much as a blank 
space before the "

> And here's the meta tag:

Notice it says "http-equiv", as in only equivalent, but not the real thing.
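
To show where the real header lives, here is a rough Python sketch of what goes over the wire. It is not how any particular server builds its responses; build_response is a hypothetical helper. The header block, including the Content-Type charset, comes first, then a blank line, and only then the document with its meta tag.

```python
def build_response(body_text: str, charset: str = "utf-8") -> bytes:
    """Assemble a bare-bones HTTP response: header block, blank line, body."""
    body = body_text.encode(charset)
    head = (
        "HTTP/1.1 200 OK\r\n"
        f"Content-Type: text/html; charset={charset}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"  # the empty line ends the headers; the document starts after it
    ).encode("ascii")
    return head + body


raw = build_response(
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
)
head, _, body = raw.partition(b"\r\n\r\n")
print(head.decode("ascii"))  # status line and the real HTTP headers
print(body.decode("utf-8"))  # the document, where the meta tag lives
```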