Re: RE: Stupid i18n use cases question

Erik Corry Sun, 30 Jan 2011 10:48:53 -0800

If downloading the data is insufficient, then download the code too (in JS).


Responsibility for correctness lies with the webpage authors.

The alternative is a new way to make Windows-only webpages, or even pages
that only work in one version of Windows. These are exactly the same issues
as with fonts, which also contain code and data. If they can be made
downloadable then so can locale data.

Windows and the others don't even agree on the names of the locales. And it
goes downhill from there.

On Jan 30, 2011 7:32 PM, "Shawn Steele" <shawn.ste...@microsoft.com> wrote:
> Downloading the data is insufficient for collation; you'd also have to
> ensure that the code processing the data is v1.0 or 1.1 or X.X. And that
> there weren't any errors or discrepancies between implementations. I think
> you'd quickly discover that that isn't possible to guarantee. Even if
> everyone agreed to use ICU and the UCA there'd be lots of differences.
> Also: who's going to collect (& provide) the data to be downloaded? What's
> the fallback when the data isn't available?
>
>
>
> I'm still trying to grok "word processing in JavaScript" (beyond the
> simple case), however for sorting I think it's way better to provide an
> architecture that works with an understanding that collation can't be
> consistent between machines, at least for the foreseeable future.
>
>
> -Shawn
>
> http://blogs.msdn.com/shawnste
>
> ________________________________
> From: es-discuss-boun...@mozilla.org [es-discuss-boun...@mozilla.org] on
> behalf of Erik Corry [erik.co...@gmail.com]
> Sent: Sunday, January 30, 2011 12:32 AM
> To: Mark Davis ☕
> Cc: Mads Ager; Shawn Steele; es-discuss@mozilla.org
> Subject: Re: Stupid i18n use cases question
>
>
> On Jan 29, 2011 8:37 PM, "Mark Davis ☕" <m...@macchiato.com> wrote:
>>
>> There are really 5 cases at issue:
>>  1. Code point breaks
>>  2. Grapheme-cluster breaks (with three possible variants: legacy,
>>     extended, and aksara)
>>  3. Word breaks
>>  4. Line breaks
>>  5. Sentence breaks
>> Notes:
>> #1 is pretty trivial to do right in ES.
>> The others can be done in ES, but the code is more complicated -- the
>> biggest issue is that they require a download of a possibly substantial
>> amount of data. For certain languages, #3 requires considerable code and
>> data.
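
To illustrate why case #1 is described as trivial: a minimal sketch of
finding code point breaks in plain ES, assuming the only work needed is
pairing UTF-16 surrogates. The function name codePoints is hypothetical and
not taken from the thread.

```javascript
// Split a string into code points by pairing UTF-16 surrogates by hand.
// No locale data is needed for this case.
function codePoints(str) {
  const out = [];
  for (let i = 0; i < str.length; i++) {
    const hi = str.charCodeAt(i);
    // A high surrogate followed by a low surrogate forms one code point.
    if (hi >= 0xd800 && hi <= 0xdbff && i + 1 < str.length) {
      const lo = str.charCodeAt(i + 1);
      if (lo >= 0xdc00 && lo <= 0xdfff) {
        out.push(str.slice(i, i + 2));
        i++; // skip the low surrogate that was just consumed
        continue;
      }
    }
    // BMP characters and unpaired surrogates are emitted as-is.
    out.push(str.charAt(i));
  }
  return out;
}
```

The other four cases cannot be handled this way, because their break rules
depend on character property tables and, for some languages, dictionaries.
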
>
> The argument that large amounts of data must be downloaded for one
> language can't be used to argue that users should be forced to download
> that data for all languages in the world. The alternative, that the
> browser make use of data from the OS, is a fragmentation and testability
> nightmare.
>
> Fonts have similar issues. In that case we are moving to downloadable
> fonts. That seems like the right way to go for I18n data too. Issues of
> the cacheability of large font and i18n data are important but not in the
> scope of ES.
>
> Moving to a downloadable I18n data architecture also solves the collation
> order issues mentioned by Shawn recently where the front end and back end
> disagree on collation due to all the issues he mentioned. All those issues
> apply to testing and the homogeneity and testability of the web platform.
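
A rough sketch of the downloadable-data idea for collation: the page ships
both the weight table and the comparison code, so every client (and a back
end running the same script) sorts identically. The collationData object,
its weights, and makeComparator are hypothetical illustrations; the table is
not real CLDR or UCA data, and real collation is multi-level and far larger.

```javascript
// Hypothetical collation data that a page would download with its code.
// The weights are illustrative only, not real CLDR or UCA data.
const collationData = {
  version: "example-1",
  weights: { a: 1, "\u00e4": 2, b: 3, o: 4, "\u00f6": 5, z: 6 },
};

// Build a comparator that depends only on the shipped data, never on the
// host's locale tables.
function makeComparator(data) {
  const weightOf = (ch) =>
    Object.prototype.hasOwnProperty.call(data.weights, ch)
      ? data.weights[ch]
      : 0x10000 + ch.charCodeAt(0); // unknown characters sort last
  return function compare(x, y) {
    const n = Math.min(x.length, y.length);
    for (let i = 0; i < n; i++) {
      const d = weightOf(x.charAt(i)) - weightOf(y.charAt(i));
      if (d !== 0) return d;
    }
    return x.length - y.length; // shorter string first on a shared prefix
  };
}

const compare = makeComparator(collationData);
const sorted = ["zb", "\u00f6a", "ab", "\u00e4b", "oa"].sort(compare);
```

Because the order comes entirely from collationData, two machines can only
disagree if they load different versions of the table or the code, which is
the versioning concern Shawn raises.
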
>
>> Word-breaks are different than linebreaks; the latter are the points
>> where you can wrap a line, which may include more than a word or come in
>> the middle of a word.
>> For examples, see http://unicode.org/cldr/utility/breaks.jsp.
>>
>> I don't know about the specific use cases that Jungshik had in mind, but
>> if you are doing client-side word-processing in ES (which various
>> software does, including ours), then you want all of these, except
>> perhaps #5. For example, a double-click uses #3.
>>
>> There are other use cases for #4 besides word processing; for example,
>> to break up long SMS messages, we break at line boundaries. I'm not
>> saying that someone has to do this in ES; just giving an example outside
>> of the word-processing domain.
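
A simplified sketch of the SMS example, assuming break opportunities can be
approximated by ASCII spaces. That assumption fails for languages written
without spaces, which is exactly where real line-break data is needed. The
function name splitMessage and the limit parameter are hypothetical.

```javascript
// Split text into segments of at most `limit` UTF-16 code units, breaking
// only at ASCII spaces. Spaces are a stand-in for real line-break
// opportunities, which require per-language rules and data.
function splitMessage(text, limit) {
  const segments = [];
  let current = "";
  for (const word of text.split(" ")) {
    if (word === "") continue; // collapse runs of spaces
    const candidate = current === "" ? word : current + " " + word;
    if (candidate.length <= limit) {
      current = candidate;
    } else {
      if (current !== "") segments.push(current);
      current = word; // a word longer than `limit` is kept whole
    }
  }
  if (current !== "") segments.push(current);
  return segments;
}
```

For instance, splitMessage("the quick brown fox", 10) gives the two
segments "the quick" and "brown fox".
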
>>
>> Mark
>>
>> — Il meglio è l’inimico del bene —
>>
>>
>> On Sat, Jan 29, 2011 at 10:25, Shawn Steele
>> <shawn.ste...@microsoft.com> wrote:
>>>
>>> On the phone yesterday we mentioned word/line breaking and grapheme
>>> clusters. It didn't occur to me to ask about the use cases.
>>>
>>>
>>>
>>> Why does someone need word/line breaking in js? It seems like that
>>> would better be done by my rendering engine, like the HTML layout engine
>>> or my edit control or something?
>>>
>>>
>>>
>>> -Shawn
>>>
>>>
>>>
>>>  
>>>
>>> http://blogs.msdn.com/shawnste
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> es-discuss mailing list
>>> es-discuss@mozilla.org
>>> https://mail.mozilla.org/listinfo/es-discuss
>>>
>>

