[WSG] Regarding foreign languages

2005-06-02 Thread Vaska . WSG

Am I allowed to ask about non-CSS things here?

In particular, I'm trying to deal with how to handle inputs of Chinese 
characters via some forms.  What I'm wondering is...


- will utf-8 suffice?
- do I need to specify <html xmlns="http://www.w3.org/1999/xhtml"
xml:lang='en' lang='en'> as ZH?  Is it necessary?  Isn't utf-8 good
enough?


And further, I'm not sure how to handle Chinese text on the validation 
end of things, but this might be a subject for a different list 
altogether.


I'll eventually have to deal with some other languages but Chinese will 
likely be one of the more difficult ones.


???

I'll see what happens when I send this...v

**
The discussion list for  http://webstandardsgroup.org/

See http://webstandardsgroup.org/mail/guidelines.cfm
for some hints on posting to the list & getting help
**



Re: [WSG] Regarding foreign languages

2005-06-02 Thread Ben Ward
The language in your html element should be the language of the page.
If you have a section of the page (be that a paragraph, form,
anything) which uses a different language then you can add a lang and
xml:lang attribute to that as well. HTML is generally rather good at
doing multi-lingual documents.

I could do this on a page (this is condensed down and is missing some
attributes, but I just want to show the xml:lang/lang behaviour):








  






The language declaration doesn't restrict the characters you can use
in forms, regardless. So you don't need to add a language attribute to
your sub-elements unless you are explicitly requiring Chinese input.
Obviously if it's an all-Chinese site then it would make sense to
change the language value in the <html> element itself.
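
A minimal sketch of the idea in Python, assuming a hypothetical page with an English <html> element and a Chinese textarea. The markup and the effective_languages helper are illustrative only, not the example from this message. It shows that an element without its own xml:lang takes the language of its nearest ancestor:

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
XHTML = "{http://www.w3.org/1999/xhtml}"

# Hypothetical page: English by default, Chinese expected in the textarea.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <body>
    <p>Please enter your comment:</p>
    <form action="/comment" method="post">
      <textarea name="comment" xml:lang="zh" lang="zh" rows="4" cols="40"></textarea>
    </form>
  </body>
</html>"""


def effective_languages(element, inherited=None):
    """Yield (tag, language) pairs, inheriting xml:lang from the parent."""
    lang = element.get(XML_LANG, inherited)
    yield element.tag.replace(XHTML, ""), lang
    for child in element:
        yield from effective_languages(child, lang)


languages = dict(effective_languages(ET.fromstring(PAGE)))
for tag, lang in languages.items():
    print(tag, lang)
```

Only the textarea carries its own language; everything else falls back to the value on the root element.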

Ben



Re: [WSG] Regarding foreign languages

2005-06-02 Thread Vaska . WSG
It's for a multilanguage site and base language will be English.  
Everything on the form will be English except the actual input 
(textarea).  Would it hurt anything if I just kept the lang declaration 
as EN in the header?  Or, since the input will be Chinese should it be
ZH?  Or, do I need to be more specific and declare lang=ZH on the
textarea itself?


I was wondering though...since it's ALL utf-8 it might not be necessary 
to declare lang=whatever at all?


Out of curiosity, I'm not sure why we need to declare lang and
xml:lang since utf-8 (I believe) is all we really need?






Re: [WSG] Regarding foreign languages

2005-06-02 Thread Ben Ward
You need both because the language of the page and the encoding of the
characters in the document are different things.
UTF-8 does not tell you which language you're using, and the language
attributes do not exist for the purpose of rendering characters
correctly.

A page in UTF-8 could be in any language, it doesn't tell you which.
But the language attribute(s) are used for other things. Since you can
select them with CSS, a web browser can apply regionalised quotation
marks to blocks of a document if you've declared the language.  A
screen reader will use different libraries to read different
languages, too. There are a variety of 'beyond the browser' uses for
the attribute.
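
A small Python sketch of the distinction, assuming three made-up sample strings. The language tags are kept alongside the text because nothing in the UTF-8 bytes records them:

```python
# Text samples keyed by language tag; the tags live outside the text itself.
samples = {"en": "Hello", "fr": "é", "zh": "你好"}

# UTF-8 turns each string into bytes, whatever the language.
encoded = {lang: text.encode("utf-8") for lang, text in samples.items()}

# Decoding needs only the charset; the language tag plays no part.
decoded = {lang: raw.decode("utf-8") for lang, raw in encoded.items()}

for lang, raw in encoded.items():
    print(lang, len(raw), raw)
```

The encoding answers "which bytes stand for which characters"; the lang attribute answers "which language is this text in". Neither can be derived from the other.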

For further reference: You also use both "lang" and "xml:lang" in
XHTML transitional for backward compatibility with HTML4, whilst in
strict mode "xml:lang" is all you need.

In your case, since the page is mostly in English I would have
lang="en" in the <html> element, and as you suggest, put lang="zh" in
the relevant form elements or parent containers as necessary.

Ben



-- 
http://www.ben-ward.co.uk



Re: [WSG] Regarding foreign languages

2005-06-02 Thread Juergen Auer
On 2 Jun 2005 at 16:49, Vaska.WSG wrote:

> It's for a multilanguage site and base language will be English.  
> Everything on the form will be English except the actual input 
> (textarea).  

Hello Vaska,

I think you are mixing two things which should be separated.

The first problem is the language of the page (defined in the header) 
or the language of a block (defined like http://www.sql-und-xml.de/




RE: [WSG] Regarding foreign languages

2005-06-02 Thread Peter Firminger
Hi Vaska,

> Am I allowed to ask about non-CSS things here?

WSG is not just a CSS list. Your question is entirely appropriate as it is
dealing with firm Web Standards.

> In particular, I'm trying to deal with how to handle inputs
> of Chinese
> characters via some forms.  What I'm wondering is...

One thing you need to watch is what the application or web server is
expecting from the form. In ColdFusion MX there are times when you have to
tell the ColdFusion server to expect a certain encoding from form posts.

E.g. setencoding("form", "UTF-8");

I don't know whether PHP and other server-side languages have this need or
whether they work it out themselves. Not that I want to start that
discussion here (as server-side technology is off topic) but as a concept,
it may well be part of the problem and I just wanted to add it to your debug
process.
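
The same concept sketched in Python rather than ColdFusion, assuming a hypothetical form field named "comment" submitted from a UTF-8 page. It only illustrates why the server has to decode with the charset the form was sent in; it is not a statement about how any particular server product behaves:

```python
from urllib.parse import parse_qs, urlencode

# What a browser would send for a textarea named "comment" on a UTF-8 page:
# the UTF-8 bytes of the text, percent-encoded.
body = urlencode({"comment": "你好"})

# Decoding with the charset the form was submitted in recovers the text.
as_utf8 = parse_qs(body, encoding="utf-8")["comment"][0]

# Decoding with a wrong assumption (here ISO-8859-1) gives unreadable output.
as_latin1 = parse_qs(body, encoding="iso-8859-1")["comment"][0]

print(body)
print(as_utf8)
print(repr(as_latin1))
```

If the Chinese input arrives as a run of accented Latin characters, a mismatch like the second case is a likely cause.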

Peter





Re: [WSG] Regarding foreign languages

2005-06-02 Thread Johan Steenkamp
Vaska

> - will utf-8 suffice?

Yes - however as Peter has pointed out you may need to consider server side 
aspects.

> - do I need to specify <html xmlns="http://www.w3.org/1999/xhtml"
> xml:lang='en' lang='en'> as ZH?  Is it necessary?  Isn't utf-8 good
> enough?

You should specify the lang in the html element/s containing the text. For 
example this page is in English but contains divs with other languages (FF will 
get XHTML 1.1 and MSIE XHTML 1.0):

http://www.orbital.co.nz/anx/index.cfm/1,10,96,html


Johan





Re: [WSG] Regarding foreign languages

2005-06-03 Thread Vaska . WSG

Believe me, I'm listening to what y'all are telling me.

It is a tricky business because for a French typist I can use entities 
and change an é into &eacute; but with Chinese everything comes up
unreadable (as you've mentioned).  I think this is going to end up 
being a case by case scenario - is this how it's done?


Unfortunately, even though I worked in a translation office for some 
time, I don't have much experience at this end of things (I only did 
design back then, no programming).  While I can smoothly transition the 
insertion of xml:lang and lang attributes into my form elements as 
needed, it just doesn't feel like it's the right thing to do (of 
course, this feeling has no basis in reality).


There will be a situation where one page will have the header encoding 
in ZH and an input/text field as EN-US.  I'm pretty sure that the field 
itself won't establish the language parameters that go into the field - 
the operating system will.  I'm confused, aside from hoping that the 
user will understand what needs to go into the field, how this will 
work.  Or perhaps this is purely a design/usability issue.


One thing I don't understand though, is at what point does the computer 
actually use the xml:lang attribute?  At the input (client-side)?  When 
it gets to the server/table (server-side)?  I can type any language I 
want into the textarea, but what comes out can vary...


And one more thing, my language declaration (in the header)...I've seen 
so many different kinds and read a few articles on the subject but I 
don't know exactly where to go on this:


en
en-us
en-gb
zh
zh-hans
etc.

What, where, which formats do I use and stick with if the idea is to 
support just about any language that's out there (theoretically)?


Thanks for the help...v






Re: [WSG] Regarding foreign languages

2005-06-03 Thread Jan Brasna

Vaska, you're still mixing those:

> I think you are mixing two things which should be separated.
>
> The first problem is the language of the page (defined in the header)
> The second problem is how to create a non-ascii character

He is right.

> It is a tricky business because for a French typist I can use entities
> and change an é into &eacute;

It's wise to use a code page that contains this character, or better, UTF.

> but with Chinese everything comes up unreadable (as you've mentioned)

Even when using Unicode?

> There will be a situation where one page will have the header encoding
> in ZH and an input/text field as EN-US.  I'm pretty sure that the field
> itself won't establish the language parameters that go into the field -
> the operating system will.


No, the browser will. It will send the characters in the encoding 
(charset, not language!) of the page.
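
A rough Python sketch of what that means for the bytes on the wire. It assumes the browser falls back to numeric character references when the page charset cannot represent a character, which is common behaviour but worth checking in the browsers you target:

```python
text = "你好"

# On a UTF-8 page every character can be encoded directly.
sent_from_utf8_page = text.encode("utf-8")

# ISO-8859-1 has no Chinese characters; xmlcharrefreplace mimics the
# assumed browser fallback of sending numeric character references.
sent_from_latin1_page = text.encode("iso-8859-1", errors="xmlcharrefreplace")

print(sent_from_utf8_page)
print(sent_from_latin1_page)
```

The lang attribute on the field appears nowhere in this process; only the page charset decides which bytes are sent.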


> One thing I don't understand though, is at what point does the computer
> actually use the xml:lang attribute?  At the input (client-side)?  When
> it gets to the server/table (server-side)?  I can type any language I
> want into the textarea, but what comes out can vary...

The 'lang' attribute is mostly for screen readers, CSS language tools and
some processing applications. It doesn't determine how characters are
input, printed or transferred. That's the job of the charset.

> What, where, which formats do I use and stick with if the idea is to
> support just about any language that's out there (theoretically)?


Some Unicode - I don't know how it works with Asian/Arabic/Hebrew - 
whether UTF8, 16 or 32, what about the Endians etc. ...
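
For what it's worth, UTF-8, UTF-16 and UTF-32 all cover the whole Unicode range, so Asian, Arabic and Hebrew text works in any of them; they differ in byte layout. A short Python sketch with one sample character, chosen arbitrarily:

```python
char = "你"  # U+4F60

utf8 = char.encode("utf-8")          # 3 bytes, no byte-order question
utf16_le = char.encode("utf-16-le")  # 2 bytes, low byte first
utf16_be = char.encode("utf-16-be")  # 2 bytes, high byte first
utf32_le = char.encode("utf-32-le")  # 4 bytes, low byte first

for name, raw in [("utf-8", utf8), ("utf-16-le", utf16_le),
                  ("utf-16-be", utf16_be), ("utf-32-le", utf32_le)]:
    print(name, raw.hex())
```

Endianness only matters for the 16- and 32-bit forms, which is one reason UTF-8 is the simpler choice for web pages and form data.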


--
Jan Brasna aka JohnyB :: www.alphanumeric.cz | www.janbrasna.com



Re: [WSG] Regarding foreign languages

2005-06-03 Thread Vaska . WSG

> Vaska, you're still mixing those:
>
> > I think you are mixing two things which should be separated.
> >
> > The first problem is the language of the page (defined in the header)
> > The second problem is how to create a non-ascii character
>
> He is right.

I've already identified that I will be using utf-8.  And I've accepted
use of xml:lang/lang in both the header and on the individual form
elements (as necessary) - what am I still mixing on this issue?  Am I
missing something more obvious?

> No, the browser will. It will send the characters in the encoding
> (charset, not language!) of the page.


Thanks, I understand what's going on with this now.  I was really just 
curious how it was dealt with - I don't believe it changes anything on 
the server-end (and didn't think it would).


You mention the use of Unicode...perhaps I'm way out there on this 
point but am I not allowed to assume that the user will be using 
unicode to input their data?  I know it's a web browser, but is there 
some way I can restrict their input to unicode (the page xml:lang that 
is)?  If they enter something else, it likely won't work.  Perhaps this 
is where I'm still 'mixing' things up?


v



Re[2]: [WSG] Regarding foreign languages

2005-06-02 Thread Martin Heiden
Ben,

On Thursday, 2 June 2005 at 17:23:57 you wrote:

> For further reference: You also use both "lang" and "xml:lang" in
> XHTML transitional for backward compatibility with HTML4, whilst in
> strict mode "xml:lang" is all you need.

I agree with you on all points but this one. Even in XHTML 1.0 the
lang-Attribute is needed. It is dropped in XHTML 1.1.

See: http://www.w3.org/TR/xhtml1/dtds.html#a_dtd_XHTML-1.0-Strict




Martin.




Re[4]: [WSG] Regarding foreign languages

2005-06-03 Thread Martin Heiden
Patrick!

On Thursday, 2 June 2005 at 18:11:30 you wrote:

>> I agree with you in all points but this one. Even in XHTML 1.0 the
>> lang-Attribute is needed.

> At the risk of splitting very fine hairs even further: *needed* or
> *allowed* ? I'd tend to think the latter...

You are right! *needed* was not meant to say mandatory. Maybe we should say
*allowed* and *recommended*?

Martin.




RE: Re[2]: [WSG] Regarding foreign languages

2005-06-02 Thread Patrick Lauke
> Martin Heiden

> I agree with you in all points but this one. Even in XHTML 1.0 the
> lang-Attribute is needed.

At the risk of splitting very fine hairs even further: *needed* or
*allowed* ? I'd tend to think the latter...

Patrick

Patrick H. Lauke
Webmaster / University of Salford
http://www.salford.ac.uk