Jumping in late, so top posting.

I think being able to load language data dynamically is a good idea. I don't see a reason why this should be tied to a language pack, though. The other way around is a different question, i.e.:

language data doesn't include UI localization
UI localization should include language data

We have several multi-language products by now, and those should work, Firefox OS in particular. We're already doing quite a few things there that duplicate language data. Much of that lives in /shared, which isn't actually shared but copied into many apps. Having that data inside Gecko would actually get it shared.

I think much of the ICU data (which is technically mostly CLDR data packaged in ICU) falls along the same lines as our hyphenation dictionaries. The web should just work, independent of which UI locale you're using.

I wonder how far we can get by doing something along the lines of what we do for webfonts: do the best we can with the data we already have, and improve once the right data is available locally. I'm personally OK if this is a notification bar to reload, even.
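
A minimal sketch of that webfont-style approach, assuming the page can ask the runtime which locales it has collation data for: sort with whatever collator is available now, and remember whether a re-sort would help once better data arrives. `chooseCollator` and `hasRequestedData` are hypothetical names for illustration; only `Intl.Collator` and `Intl.Collator.supportedLocalesOf` are real API.

```typescript
// Hypothetical helper, not part of any Gecko or Gaia API.
interface CollatorChoice {
  collator: Intl.Collator;
  // True when the runtime reports data for the requested locale.
  hasRequestedData: boolean;
}

function chooseCollator(requested: string): CollatorChoice {
  // supportedLocalesOf returns the subset of the requested tags that the
  // runtime can serve without falling back to its default locale.
  const hasRequestedData =
    Intl.Collator.supportedLocalesOf([requested]).length > 0;
  // With no data for the requested locale, use the runtime default instead.
  const collator = hasRequestedData
    ? new Intl.Collator(requested)
    : new Intl.Collator();
  return { collator, hasRequestedData };
}

const { collator, hasRequestedData } = chooseCollator("de");
const names = ["Zimmer", "Apfel", "Öl"].sort(collator.compare);
if (!hasRequestedData) {
  // Assumption: this is where an app would offer a reload or a re-sort
  // once the missing locale data has been fetched.
  console.log("sorted with fallback data:", names);
}
```

How the missing data would actually be fetched is left open here; the sketch only covers the "do the best we can now" half.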

Axel

PS: ICU is driven by the JS globalization API. That API was driven by MS and Google to get the data into their HTML app platforms. For Mozilla, IMHO, the driver for the g11n API should be Firefox OS, where we're struggling to work around the lack of data for sorting, time zones, and language data all around.

On 10/15/13 6:06 PM, Benjamin Smedberg wrote:
With the landing of bug 853301, we are now shipping ICU in desktop
Firefox builds. This costs us about 10% in both download and on-disk
footprint: see https://bugzilla.mozilla.org/show_bug.cgi?id=853301#c2.
After a discussion with Waldo, I'm going to post some details here about
how much this costs in terms of disk footprint, to discuss whether there
are things we can remove from this footprint, and whether the footprint
is actually worth the cost. This is particularly important because our
user research team has identified Firefox download weight as an
important factor affecting Firefox adoption and update rates in some
markets.

On-disk, ICU data breaks into the following categories:

* collation tables - 3.3MB

These are rules for sorting strings in multiple languages and
situations. See http://userguide.icu-project.org/collation for basic
background. These tables are necessary for implementing Intl.Collator.

The Intl.Collator API has methods to expose a subset of languages. It is
not clear from my reading of the specification whether it is expected
that browsers will normally ship with the full set of languages or only
the subset of the browser locale.
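
The method in question is `Intl.Collator.supportedLocalesOf`. A small sketch of how a page might probe for collation data follows; the locale list is arbitrary and `probeCollation` is a hypothetical name. The result depends entirely on the data the browser shipped with, so no particular output is assumed.

```typescript
// Hypothetical helper: report which requested locales the runtime has
// collation data for. Assumes the input contains no duplicate tags.
function probeCollation(
  requested: string[]
): { supported: string[]; missingCount: number } {
  // Tags come back canonicalized (e.g. "en-us" becomes "en-US").
  const supported = Intl.Collator.supportedLocalesOf(requested);
  return { supported, missingCount: requested.length - supported.length };
}

const report = probeCollation(["en-US", "de", "ja", "sv"]);
console.log(report.supported, report.missingCount);
```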

* currency tables - 1.9 MB

These are primarily the localized name of each currency in each
language. This is used by the Intl.NumberFormat API to format
international currencies.

* timezone tables - 1.7MB

Primarily the name of every time zone in each language. This data is
necessary for implementing Intl.DateTimeFormat.

* language data - 2.1 MB

This is a bunch of other data associated with displaying information for
a particular language: number formatting in various long and short
formats, calendar formats and names for the various world calendar systems.
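
For a sense of what the currency and timezone tables feed, here is a sketch of the two formatting calls that consume localized names. The locale, currency code, and time zone are arbitrary examples; the exact strings depend on which locale data the build includes, so none are shown.

```typescript
// Currency names come from the currency tables.
const currencyFormat = new Intl.NumberFormat("de", {
  style: "currency",
  currency: "JPY",
  currencyDisplay: "name", // ask for the localized name, not the symbol
});
const price: string = currencyFormat.format(1500);

// Time zone names come from the timezone tables.
const dateFormat = new Intl.DateTimeFormat("de", {
  timeZone: "UTC",
  timeZoneName: "long", // ask for the localized long zone name
  hour: "numeric",
  minute: "numeric",
});
const stamp: string = dateFormat.format(new Date(Date.UTC(2013, 9, 15, 18, 6)));

console.log(price, stamp);
```

If the "de" data is not present, both calls still succeed and fall back to the runtime's default locale, which is the behavior the questions below are really about.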

==

Do we need this data for any language other than the language Firefox
ships in? Can we just include the relevant language data in each
localized build of Firefox, and allow users to get other language data
via downloadable language packs, similarly to how dictionaries are handled?

Is it possible that some of this data (the collation tables?) should be
in all Firefox locales, but other data (currency and timezone names) is
not as important and we can ship it only in one language?

As far as I can tell, the spec allows user agents to ship whatever
languages they need; the real question is what users and site authors
actually need and expect out of the API. (I'm reading the spec out of
http://wiki.ecmascript.org/doku.php?id=globalization:specification_drafts)

I am still working to get better numbers to quantify the costs in terms
of lost adoption for additional download weight.

Also, we are currently duplicating the data tables on Mac universal
builds, because they are compiled-in symbols. We should clearly use a
separate file for these tables to avoid unnecessary download/install
weight. This is now filed as bug 926980.

--BDS



_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
