Hmmm... some interesting points you've brought out.

Q: Does anyone know how many languages in a similar category can be
identified by character set alone?  I realize there's no way to cover
every single case using the character set alone, so this is clearly an
area that needs considerable thought.

Unless, of course, someone already knows the answer.  Or, better said:
how does one properly handle the situations where the character set in
and of itself doesn't provide enough information?  Can a parent element
with an xml:lang attribute be enough?  In the case of name, email and
uri, the containing author element does allow xml:lang, as long as I'm
not mistaken, and that should be enough to assume that the child
elements are to be treated as being in the language given by that
attribute's value.
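
To make that assumption concrete, here is a rough sketch (Python,
standard library only) of how a consumer might work out the xml:lang
in scope for each child of an author element.  The helper name
effective_langs and the sample document are my own illustration, not
anything taken from the Atom spec or from an existing parser.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
ATOM = "{http://www.w3.org/2005/Atom}"


def effective_langs(root):
    """Map every element to the xml:lang value in scope for it (or None).

    Hypothetical helper: a child without its own xml:lang takes the value
    of its nearest ancestor that has one.
    """
    langs = {}

    def walk(elem, inherited):
        # An element's own xml:lang overrides whatever it inherited.
        lang = elem.get(XML_LANG, inherited)
        langs[elem] = lang
        for child in elem:
            walk(child, lang)

    walk(root, None)
    return langs


# Made-up sample: xml:lang appears only on the containing author element.
doc = ET.fromstring(
    '<author xmlns="http://www.w3.org/2005/Atom" xml:lang="hu">'
    '<name>Bartók Béla</name>'
    '<uri>http://example.org/bartok</uri>'
    '</author>'
)
langs = effective_langs(doc)
name = doc.find(ATOM + "name")
print(langs[name])
```

With xml:lang="hu" on the author element only, the name child is
reported as "hu" as well, which is the inheritance I'm assuming above.
Whether a uri or email should be treated as being "in" that language is
a separate question; the sketch only shows what value is in scope.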

The only element (I think) that might be of concern is the name
element, since the email and uri values should be covered by the
character encoding alone.

Does any of this even sound remotely on target?

On 3/14/06, Eric Scheid <[EMAIL PROTECTED]> wrote:
>
> On 15/3/06 2:21 PM, "Martin Duerst" <[EMAIL PROTECTED]> wrote:
>
> >> Not sure if this is a known bug, but I just noticed that the RelaxNG
> >> grammar doesn't accept "atomCommonAttributes" (eg xml:lang) on the
> >> "atom:name" and "atom:uri" and "atom:email" elements used within
> >> Person constructs.
> >
> > For atom:uri and atom:email at least, not having xml:lang may
> > be seen as a feature. While these often contain pieces from one
> > language or another, they are not really in a language.
>
> Since the original discussion I've stumbled across something extra that
> makes xml:lang relevant for atom:name.
>
> Seems that in writing Hungarian names, the pattern is always surname
> followed by forename - e.g. Bartók Béla, where Béla is the personal name and
> Bartók is the family name.
>
> While common western names (eg. Eric Scheid) would be indexed as Scheid,
> Eric; a comma is instead simply added between the Hungarian surname and
> forename, making Hungarian names indistinguishable from other Western-style
> names. For example: Bartók Béla is indexed as Bartók, Béla.
>
> Icelandic names are another game altogether.
>
> e.
>
>
>


--
<M:D/>

M. David Peterson
http://www.xsltblog.com/
