Downloading the data is insufficient for collation; you'd also have to ensure 
that the code processing the data is v1.0 or 1.1 or X.X, and that there 
weren't any errors or discrepancies between implementations.  I think you'd 
quickly discover that that isn't possible to guarantee.  Even if everyone agreed 
to use ICU and the UCA there'd be lots of differences.  Also: who's going to 
collect (& provide) the data to be downloaded?  What's the fallback when the 
data isn't available?



I'm still trying to grok "word processing in JavaScript" (beyond the simple 
case); however, for sorting I think it's way better to provide an architecture 
that works with an understanding that collation can't be consistent between 
machines, at least for the foreseeable future.
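
One way to read "an architecture that works with inconsistent collation" is that the ordering travels with the data instead of being recomputed on each machine. The following is only a minimal sketch of that idea; the `RankedRow` shape, the `rank` field, and the `orderByProvidedRank` function are all hypothetical names invented for illustration, not part of any proposal discussed here.

```typescript
// Hypothetical row shape: `rank` is assumed to be assigned by whichever
// machine performed the collation (for example, the back end).
interface RankedRow {
  label: string;
  rank: number;
}

// Order rows by the rank supplied with the data rather than by comparing
// labels locally, so the result does not depend on this machine's
// collation tables or library version.
function orderByProvidedRank(rows: RankedRow[]): RankedRow[] {
  // slice() copies the array so the caller's data is left untouched.
  return rows.slice().sort((a, b) => a.rank - b.rank);
}
```

With this approach the front end never calls a locale-sensitive comparison on data it did not collate itself, which avoids the front-end/back-end disagreement at the cost of needing a round trip when new items must be placed.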


-Shawn

 
http://blogs.msdn.com/shawnste

________________________________
From: es-discuss-boun...@mozilla.org [es-discuss-boun...@mozilla.org] on behalf 
of Erik Corry [erik.co...@gmail.com]
Sent: Sunday, January 30, 2011 12:32 AM
To: Mark Davis ☕
Cc: Mads Ager; Shawn Steele; es-discuss@mozilla.org
Subject: Re: Stupid i18n use cases question


On Jan 29, 2011 8:37 PM, "Mark Davis ☕" 
<m...@macchiato.com> wrote:
>
> There are really 5 cases at issue:
> 1. Code point breaks
> 2. Grapheme-cluster breaks (with three possible variants: 'legacy', extended, 
>    and aksara)
> 3. Word breaks
> 4. Line breaks
> 5. Sentence breaks
> Notes:
> #1 is pretty trivial to do right in ES.
> The others can be done in ES, but the code is more complicated -- the biggest 
> issue is that they require a download of a possibly substantial amount of 
> data. For certain languages, #3 requires considerable code and data.
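
As a rough illustration of why case #1 needs no downloaded data, here is a minimal sketch of code point break detection over a UTF-16 string. The function name `codePointBreaks` is hypothetical, and treating a lone surrogate as its own unit is an assumption of this sketch rather than something specified in the thread.

```typescript
// Returns the UTF-16 indices at which code point boundaries occur,
// including 0 and text.length.
function codePointBreaks(text: string): number[] {
  const breaks: number[] = [0];
  let i = 0;
  while (i < text.length) {
    const unit = text.charCodeAt(i);
    const isHighSurrogate = unit >= 0xd800 && unit <= 0xdbff;
    const next = i + 1 < text.length ? text.charCodeAt(i + 1) : 0;
    const isLowSurrogate = next >= 0xdc00 && next <= 0xdfff;
    // A well-formed surrogate pair is one code point (two code units);
    // anything else, including a lone surrogate, advances by one unit.
    i += isHighSurrogate && isLowSurrogate ? 2 : 1;
    breaks.push(i);
  }
  return breaks;
}
```

Only fixed numeric ranges are involved, which is what distinguishes this case from grapheme, word, line, and sentence breaking, where per-character property tables are required.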

The argument that large amounts of data must be downloaded for one language 
can't be used to argue that users should be forced to download that data for 
all languages in the world.  The alternative, that the browser make use of data 
from the OS, is a fragmentation and testability nightmare.

Fonts have similar issues. In that case we are moving to downloadable fonts. 
That seems like the right way to go for I18n data too.  Issues of the 
cacheability of large font and i18n data are important but not in the scope of 
ES.

Moving to a downloadable I18n data architecture also solves the collation order 
issues mentioned by Shawn recently where the front end and back end disagree on 
collation due to all the issues he mentioned.  All those issues apply to 
testing and the homogeneity and testability of the web platform.

> Word-breaks are different than linebreaks; the latter are the points where 
> you can wrap a line, which may include more than a word or come in the middle 
> of a word.
> For examples, see http://unicode.org/cldr/utility/breaks.jsp.
>
> I don't know about the specific use cases that Jungshik had in mind, but if 
> you are doing client-side word-processing in ES (which various software does, 
> including ours), then you want all of these, except perhaps #5. For example, 
> a double-click uses #3.
>
> There are other use cases for #4 besides word processing; for example, to 
> break up long SMSes, we break at line boundaries. I'm not saying that someone 
> has to do this in ES; just giving an example outside of the word-processing 
> domain.
>
> Mark
>
> — Il meglio è l’inimico del bene —
>
>
> On Sat, Jan 29, 2011 at 10:25, Shawn Steele 
> <shawn.ste...@microsoft.com> wrote:
>>
>> On the phone yesterday we mentioned word/line breaking and grapheme 
>> clusters.   It didn't occur to me to ask about the use cases.
>>
>>
>>
>> Why does someone need word/line breaking in js?  It seems like that would 
>> better be done by my rendering engine, like the HTML layout engine or my 
>> edit control or something?
>>
>>
>>
>> -Shawn
>>
>>
>>
>>  
>>
>> http://blogs.msdn.com/shawnste
>>
>>
>>
>>
>> _______________________________________________
>> es-discuss mailing list
>> es-discuss@mozilla.org
>> https://mail.mozilla.org/listinfo/es-discuss
>>