Hi,

here are my personal preferences for the I18N model[0].

- Folder-based localization: A2 (editor's choice)

I don't feel strongly about this choice. I prefer it because it is more flexible and more powerful, but I would understand if implementers felt it did a bit more than necessary (I believe it is possible, if needed, to start with A1 and transition to A2 in the future).

Caveat: within the A2 option, I do not understand if the following:

 locales/x/
 locales/x-dahut/
 locales/x-nessie/

causes x-dahut and x-nessie to "inherit" the content from x. Since "x-" is used for private codes, I don't think that they should (neither should "i-", any primary subtag in the "qaa-" to "qtz-" range, nor any primary subtag that is four letters long or longer than eight letters). The BCP47 lookup algorithm handles this (or at least part of it), but I'm unsure that it's enough for our purposes — we should ask.
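
For illustration, here is a minimal Python sketch of the truncation step of the BCP47 lookup algorithm (RFC 4647, section 3.4), in which a single-character subtag such as "x" or "i" is removed together with the subtag that follows it. The function name fallback_chain is made up for this example and the code is only my reading of the rule, not anything taken from the I18N document. Under that rule "x-dahut" never falls back to "x", but nothing special happens for the "qaa" to "qtz" range, which may be the part that is not covered.

```python
def fallback_chain(tag):
    """Return the tags tried by lookup truncation, most specific first.

    Hypothetical helper: a sketch of the RFC 4647 truncation rule only.
    """
    subtags = tag.lower().split("-")
    chain = []
    while subtags:
        chain.append("-".join(subtags))
        # Drop the last subtag ...
        subtags.pop()
        # ... and any single-character subtag ("x", "i", extension
        # singletons) that would otherwise be left dangling at the end.
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return chain


print(fallback_chain("x-dahut"))    # ['x-dahut']
print(fallback_chain("en-gb-xx"))   # ['en-gb-xx', 'en-gb', 'en']
print(fallback_chain("qaa-dahut"))  # ['qaa-dahut', 'qaa']
```

So with this reading, locales/x/ would not be picked up for x-dahut, while locales/qaa/ would be picked up for qaa-dahut unless we say otherwise.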


- The user agent's locale: B1 (editor's choice)

I *strongly* support this option. It may be slightly more complex but it matches how this is resolved over HTTP. Applications that act as if users were only able to use one language are always wrong and badly broken (I'm looking at you, Apple spell checker, Nokia T9, etc.).


- Deriving the widget's locale: C1 (editor's choice)

The way in which the BCP47 lookup algorithm is expected to work could be slightly clarified; i.e. if I'm looking for "en-us" and in document (or directory) order I have "en" then "en-us", then "en-us" will match because it is tried first for the entire list ("en" would match on a second pass, if "en-us" were not defined).

Note that the BCP47 lookup algorithm specifies that a default value is returned if nothing matches, and leaves the choice of that default to the application. We should make it clear that in our case the default value returned for the locale when nothing matches should be "", as that matches xml:lang (and sort of makes sense in relation to the locale directories). I point this out because one of the options is returning "i-default", which could confusingly lead to a locales/i-default/ directory being selected by some implementations.
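
To make the two points above concrete, here is a rough Python sketch of a lookup over a prioritised list of UA locales, with "" as the default. The name lookup and its parameters are assumptions made for this example; the truncation rule is my reading of RFC 4647 section 3.4, and the "*" range is simply skipped.

```python
def lookup(ranges, available, default=""):
    """Pick the widget locale from `available` for the UA's `ranges`.

    Hypothetical sketch: `ranges` is the UA's locale list in priority
    order, `available` is the list of locales the widget provides.
    """
    available_set = {tag.lower() for tag in available}
    for language_range in ranges:
        if language_range == "*":
            continue  # the wildcard carries no information for lookup
        subtags = language_range.lower().split("-")
        while subtags:
            candidate = "-".join(subtags)
            # The whole available list is checked for each candidate,
            # so the order of `available` does not matter.
            if candidate in available_set:
                return candidate
            subtags.pop()
            # Single-character subtags go with their trailing subtag.
            while subtags and len(subtags[-1]) == 1:
                subtags.pop()
    # Nothing matched: return "" rather than something like "i-default".
    return default


print(lookup(["en-us"], ["en", "en-us"]))  # en-us
print(lookup(["en-us"], ["en"]))           # en
print(repr(lookup(["fr"], ["en"])))        # ''
```

The first call shows the case described above: "en-us" wins even though "en" comes first in document (or directory) order.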


- Possible representations of the widget's locale: D2 (editor's choice)

I could live with D3 too, but it seems too complicated for real-world use. D1 makes it more painful to produce localised content.


- Dereferencing URIs in Configuration Documents: something else

I don't think that either of the provided "E" options is good. I think that the base URI for a widget resource should not include locale information.

Assuming:

  widget.wgt
    config.xml
    index.html
    locales/en/index.html
    locales/fr/index.html

and three locales en, fr, es, the start content resource would *always* have a document URL of "widget://UUID/index.html". If the locale changes (even at runtime) the document URL does not. What that content is *resolved* to inside the archive changes with the locale, but not the URL.

The justification behind this approach is that: a) locales should be transparent, and b) there is no requirement to have the widget URI map *directly* onto the widget's structure. In fact, it is probably best if it's not possible inside locales/en/index.html to go "<a href='../fr/index.html'>Frog version</a>".

We can put that in the widget URI document, or somewhere else (I'm not sure where it fits).


- Finding missing localized content: something else

I think that the "F" options are wrong for reasons similar to those expressed in the previous section. I think that for all intents and purposes, the user agent should behave as if content from the most specific locale had been copied into less specific locales, recursively, until it reaches the root, after which the locales directory is discarded.

For instance, assuming:

  widget.wgt
    config.xml
    index.html
    d.svg
    locales/en-gb-xx/a.svg
    locales/en-gb-xx/b.svg
    locales/en-gb/b.svg
    locales/en-gb/c.svg
    locales/en/index.html

After applying the above algorithm with a UA locale of en-gb-xx we would have widget content equivalent to:

  widget.wgt
    config.xml
    index.html [from locales/en]
    a.svg      [from locales/en-gb-xx]
    b.svg      [from locales/en-gb-xx]
    c.svg      [from locales/en-gb]
    d.svg      [from the root]

This is defined by creating a URI space that has the following resolution in place:

  widget://UUID/index.html -> locales/en/index.html
  widget://UUID/a.svg      -> locales/en-gb-xx/a.svg
  widget://UUID/b.svg      -> locales/en-gb-xx/b.svg
  widget://UUID/c.svg      -> locales/en-gb/c.svg
  widget://UUID/d.svg      -> d.svg
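
As a sanity check on the algorithm, here is a small Python sketch that builds such a URI space from a list of archive paths. The function name build_uri_space, the "widget://UUID/" base and the assumption that locale folder names are lowercase are all mine, for illustration only; root files such as config.xml are mapped as well, which may or may not be what we want.

```python
def build_uri_space(paths, ua_locale, base="widget://UUID/"):
    """Map widget URIs to archive paths for one UA locale (sketch)."""
    # Fallback chain, most specific first (e.g. en-gb-xx, en-gb, en).
    subtags = ua_locale.lower().split("-")
    chain = []
    while subtags:
        chain.append("-".join(subtags))
        subtags.pop()
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()

    space = {}
    # The root content is the least specific layer.
    for path in paths:
        if not path.startswith("locales/"):
            space[base + path] = path
    # Overlay locale folders from least to most specific, so that the
    # most specific version of each file wins.
    for locale in reversed(chain):
        prefix = "locales/" + locale + "/"
        for path in paths:
            if path.startswith(prefix):
                space[base + path[len(prefix):]] = path
    return space


archive = [
    "config.xml",
    "index.html",
    "d.svg",
    "locales/en-gb-xx/a.svg",
    "locales/en-gb-xx/b.svg",
    "locales/en-gb/b.svg",
    "locales/en-gb/c.svg",
    "locales/en/index.html",
]
for uri, target in sorted(build_uri_space(archive, "en-gb-xx").items()):
    print(uri, "->", target)
```

Note that no URI under widget://UUID/locales/ exists in the resulting space, so content cannot link across locales.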

I believe that this approach is both cleaner and more powerful. Besides, it makes specifying widget URI resolution easier :) We just have to agree on whether that's defined in the widgets URI specification or in P+C. I'm happy to put it in the former (but I'll need to reference the I18N model from P+C so it may be easier to put it there).


- HTML browsing contexts: something else

Same issue, same proposal. The example given in the I18N proposal document is:

  widget.wgt
    config.xml
    index.html
    a.gif
    b.gif
    c.gif
    hello/d.gif
    locales/en-us-xx/a.gif
    locales/en-us/a.gif
    locales/en-gb/a.gif
    locales/en-gb/index.html
    locales/en/a.gif
    locales/en/c.gif

Based on the algorithm defined above (and the fact that the UA locale is "en-us-xx"), this generates the following URI space:

  widget://UUID/index.html  -> index.html
  widget://UUID/a.gif       -> locales/en-us-xx/a.gif
  widget://UUID/b.gif       -> b.gif
  widget://UUID/c.gif       -> locales/en/c.gif
  widget://UUID/hello/d.gif -> hello/d.gif


Thoughts?


[0] http://dev.w3.org/cvsweb/~checkout~/2006/waf/widgets/i18n.html?rev=1.29&content-type=text/html;%20charset=utf-8

--
Robin Berjon - http://berjon.com/
    Feel like hiring me? Go to http://robineko.com/





