Hi,
here are my personal preferences for the I18N model[0].
- Folder-based localization: A2 (editor's choice)
I don't feel strongly about this choice. I prefer it because it is
more flexible and more powerful, but I would understand if
implementers felt it did a bit more than necessary (I believe it is
possible if needed to start with A1 and transition to A2 in future).
Caveat: within the A2 option, I do not understand whether the following:
locales/x/
locales/x-dahut/
locales/x-nessie/
causes x-dahut and x-nessie to "inherit" the content from x. Since
"x-" is used for private codes, I don't think that they should
(neither should "i-", any primary subtag in the "qaa-" to "qtz-"
range, nor any primary subtag that is four letters long or longer than
eight letters). The BCP47 lookup algorithm handles this (or at least
part of it), but I'm unsure that it's enough for our purposes; we
should ask.
- The user agent's locale: B1 (editor's choice)
I *strongly* support this option. It may be slightly more complex but
it matches how this is resolved over HTTP. Applications that act as if
users were only able to use one language are always wrong and badly
broken (I'm looking at you, Apple spell checker, Nokia T9, etc.).
- Deriving the widget's locale: C1 (editor's choice)
The way in which the BCP47 lookup algorithm is expected to work could
be slightly clarified; i.e. if I'm looking for "en-us" and in document
(or directory) order I have "en" then "en-us", then "en-us" will match
because it is tried first for the entire list ("en" would match on a
second pass, if "en-us" were not defined).
Note that the BCP47 lookup algorithm specifies that a default value is
selected and returned if nothing matches. We should make it clear
that in our case the default value returned for the locale when
nothing matches should be "", as that matches xml:lang (and sort of
makes sense in relationship to the locale directories). I point this
out because one of the options is returning "i-default", which could
confusingly lead to a locales/i-default/ directory being selected by
some implementations.
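
To make the two points above concrete, here is a rough sketch of the
lookup as I read it, not normative text. The function name lookup, its
parameters, and the idea that the UA hands us a priority list of ranges
are my own assumptions for illustration; wildcard ("*") ranges are
deliberately ignored.

```python
def lookup(priority_list, available, default=""):
    """Return the first available tag matching any range, else `default`.

    Sketch of truncation-based lookup: each range is tried against the
    whole list of available tags before it is shortened.
    """
    # Compare case-insensitively, but return the tag as the widget spells it.
    by_lower = {tag.lower(): tag for tag in available}
    for language_range in priority_list:
        subtags = language_range.lower().split("-")
        while subtags:
            candidate = "-".join(subtags)
            if candidate in by_lower:
                return by_lower[candidate]
            subtags.pop()  # drop the last subtag
            # A single-character subtag (such as "x") cannot end a tag,
            # so it is removed together with the subtag that followed it.
            while subtags and len(subtags[-1]) == 1:
                subtags.pop()
    # Nothing matched: "" (as with xml:lang), not "i-default".
    return default


print(repr(lookup(["en-us"], ["en", "en-us"])))  # order of the list is irrelevant
print(repr(lookup(["en-us"], ["en"])))           # falls back on the second pass
print(repr(lookup(["de"], ["en", "fr"])))        # no match gives the default
```

Under this reading "x-dahut" never truncates to a bare "x", which is the
part of the private-use question that the algorithm already seems to
cover.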
- Possible representations of the widget's locale: D2 (editor's choice)
I could live with D3 too, but it seems too complicated for real-world
use. D1 makes it more painful to produce localised content.
- Dereferencing URIs in Configuration Documents: something else
I don't think that either of the provided "E" options is good. I think
that the base URI for a widget resource should not include locale
information.
Assuming:
widget.wgt
config.xml
index.html
locales/en/index.html
locales/fr/index.html
and three locales en, fr, es, the start content resource would
*always* have a document URL of "widget://UUID/index.html". If the
locale changes (even at runtime) the document URL does not. What that
content is *resolved* to inside the archive changes with the locale,
but not the URL.
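
A small sketch of what I mean, using the archive above. The resolve
function and the ARCHIVE set are hypothetical names, and falling back
to the root file when a locale has no folder (es here) is my
assumption of the intended behaviour.

```python
# Archive entries from the example above (paths inside widget.wgt).
ARCHIVE = {
    "config.xml",
    "index.html",
    "locales/en/index.html",
    "locales/fr/index.html",
}


def resolve(path, locale, archive=ARCHIVE):
    """Return the archive entry serving a locale-free widget path, or None."""
    localized = f"locales/{locale}/{path}"
    if localized in archive:
        return localized
    if path in archive:
        return path  # assumption: unlocalized root content is the fallback
    return None


document_url = "widget://UUID/index.html"  # never contains a locale
for locale in ("en", "fr", "es"):
    # Same URL every time; only the archive entry behind it changes.
    print(document_url, "->", resolve("index.html", locale))
```

The document URL is the same string on every iteration, including
after a runtime locale switch; only the entry read from the archive
differs.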
The justification behind this approach is that: a) locales should be
transparent, and b) there is no requirement to have the widget URI map
*directly* onto the widget's structure. In fact, it is probably best
if it's not possible inside locales/en/index.html to go
"<a href='../fr/index.html'>Frog version</a>".
We can put that in the widget URI document, or somewhere else (I'm not
sure where it fits).
- Finding missing localized content: something else
I think that the "F" options are wrong for reasons similar to those
expressed in the previous section. I think that for all intents and
purposes, the user agent should behave as if content from the most
specific locale had been copied into each less specific locale in turn
(overwriting files of the same name), and finally into the root, after
which the locales directory is discarded.
For instance, assuming:
widget.wgt
config.xml
index.html
d.svg
locales/en-gb-xx/a.svg
locales/en-gb-xx/b.svg
locales/en-gb/b.svg
locales/en-gb/c.svg
locales/en/index.html
After applying the above algorithm with a UA locale of en-gb-xx we
would have widget content equivalent to:
widget.wgt
config.xml
index.html [from locales/en]
a.svg [from locales/en-gb-xx]
b.svg [from locales/en-gb-xx]
c.svg [from locales/en-gb]
d.svg [from the root]
The way this is defined is that a URI space is created that has the
following resolution in place:
widget://UUID/index.html -> locales/en/index.html
widget://UUID/a.svg -> locales/en-gb-xx/a.svg
widget://UUID/b.svg -> locales/en-gb-xx/b.svg
widget://UUID/c.svg -> locales/en-gb/c.svg
widget://UUID/d.svg -> d.svg
I believe that this approach is both cleaner and more powerful.
Besides, it makes specifying widget URI resolution easier :) We just
have to agree on whether that's defined in the widgets URI
specification or in P+C. I'm happy to put it in the former (but I'll
need to reference the I18N model from P+C so it may be easier to put
it there).
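
For what it's worth, the overlay described in this section can be
sketched in a few lines. This is only an illustration under my own
assumptions: the names fallback_chain and build_uri_space are made up,
folder names are assumed to be lowercase, and the fallback chain is
plain right-to-left truncation of the UA locale.

```python
def fallback_chain(locale):
    """'en-gb-xx' -> ['en-gb-xx', 'en-gb', 'en'] (most specific first)."""
    subtags = locale.lower().split("-")
    return ["-".join(subtags[:n]) for n in range(len(subtags), 0, -1)]


def build_uri_space(archive, locale, base="widget://UUID/"):
    """Map locale-free widget URIs to the archive entries that serve them."""
    space = {}
    # Start with the root content, which is the least specific layer.
    for entry in archive:
        if not entry.startswith("locales/"):
            space[base + entry] = entry
    # Overlay locale folders from least to most specific, so that a more
    # specific file replaces a less specific one of the same name.
    for tag in reversed(fallback_chain(locale)):
        prefix = f"locales/{tag}/"
        for entry in archive:
            if entry.startswith(prefix):
                space[base + entry[len(prefix):]] = entry
    return space


archive = [
    "config.xml", "index.html", "d.svg",
    "locales/en-gb-xx/a.svg", "locales/en-gb-xx/b.svg",
    "locales/en-gb/b.svg", "locales/en-gb/c.svg",
    "locales/en/index.html",
]
for uri, entry in sorted(build_uri_space(archive, "en-gb-xx").items()):
    print(uri, "->", entry)
```

Note that nothing under locales/ is reachable through its own path in
the resulting space, which is the point: the locale folders are an
implementation detail of the package, not part of the URI space.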
- HTML browsing contexts: something else
Same issue, same proposal. The example given in the I18N proposal
document is:
widget.wgt
config.xml
index.html
a.gif
b.gif
c.gif
hello/d.gif
locales/en-us-xx/a.gif
locales/en-us/a.gif
locales/en-gb/a.gif
locales/en-gb/index.html
locales/en/a.gif
locales/en/c.gif
Based on the algorithm defined above (and the fact that the UA locale
is "en-us-xx"), this generates the following URI space:
widget://UUID/index.html -> index.html
widget://UUID/a.gif -> locales/en-us-xx/a.gif
widget://UUID/b.gif -> b.gif
widget://UUID/c.gif -> locales/en/c.gif
widget://UUID/hello/d.gif -> hello/d.gif
Thoughts?
[0] http://dev.w3.org/cvsweb/~checkout~/2006/waf/widgets/i18n.html?rev=1.29&content-type=text/html;%20charset=utf-8
--
Robin Berjon - http://berjon.com/
Feel like hiring me? Go to http://robineko.com/