Re: [Json] JSON: remove gap between Ecma-404 and IETF draft

2013-11-13 Thread Mark Davis
On Wed, Nov 13, 2013 at 3:51 PM, Joe Hildebrand (jhildebr) < jhild...@cisco.com> wrote: > that all software implementations > which receive the un-prefixed text will not generate parse errors." > perhaps: ...​all conformant software ...​ Mark *— Il meglio è l’

Re: Internationalization: Support for IANA time zones

2013-03-02 Thread Mark Davis
On Sat, Mar 2, 2013 at 5:11 PM, Shawn Steele wrote: > I’m uncomfortable using the CLDR names, although perhaps they could be > aliases, because other standards use the tzdb names and we have to be able > to look up the tzdb names. It might be nice to get more stability for the > tzdb names, like

Re: Internationalization: Support for IANA time zones

2013-03-02 Thread Mark Davis
UTC. > > > On Feb 28, 2013, at 16:13 , Shawn Steele wrote: > > > For #5 I might prefer falling back to English or something. I don't > think UTC offset is a good idea because that doesn't really represent a > Timezone very well. (If a meeting gets moved to a follo

Re: Internationalization: Support for IANA time zones

2013-03-02 Thread Mark Davis
PM, Norbert Lindenberg < ecmascr...@lindenbergsoftware.com> wrote: > The identifier issues first: > > On Mar 1, 2013, at 7:40 , Mark Davis ☕ wrote: > > > > These names are canonicalized to the corresponding Zone name in the > casing used > > > > Because the

Re: Internationalization: Support for IANA time zones

2013-03-01 Thread Mark Davis
> These names are canonicalized to the corresponding Zone name in the casing used Because the Zone names are unstable, in CLDR we adopted the same convention as in BCP47. That is, our canonical form never changes, no matter what happens to Zone names. I'd strongly recommend using those as the cano

Re: String.fromCodePoint and surrogate pairs?

2013-01-14 Thread Mark Davis
There was a long discussion of this on the Unicode list recently. A surrogate code point is not illegal Unicode. It is illegal *in* a UTF string, but is not illegal in a Unicode String (http://www.unicode.org/glossary/#unicode_string). I don't want to repeat that whole long discussion here. Mark

Re: Flexible String Representation - full Unicode for ES6?

2012-12-21 Thread Mark Davis
The main complication for compatibility is indexing. See http://macchiati.blogspot.com/2012/07/unicode-string-models-many-programming.html If you look back about a year in this list's archive you'll find a long discussion. On Dec 21, 2012 9:34 PM, "Chris Angelico" wrote: > On Sat,

Re: API for text editing

2012-10-18 Thread Mark Davis
ip/editing.html#selections > > > >> - detecting whether a glyph is available for a given character > > > > I don't believe there is anything for that right now. > > > It might be a good idea to define more clearly what's needed and then > approac

Re: Minutes from 10/5 internationalization ad-hoc meeting

2012-10-15 Thread Mark Davis
I added the following for discussion: https://bugs.ecmascript.org/show_bug.cgi?id=798 https://bugs.ecmascript.org/show_bug.cgi?id=797 Mark * * *— Il meglio è l’inimico del bene —* ** On Mon, Oct 15, 2012 at 5:51 PM, Gillam, Richard wrote: > Hi

Re: Calendar issues

2012-09-13 Thread Mark Davis
we introduce a year 0 for them? > > Norbert > > > On Sep 13, 2012, at 13:31 , Mark Davis ☕ wrote: > > > In ICU, we are using Gregorian eras (AD/BC) as customarily interpreted, > and there is no year zero. There isn't a simple way to get non-era > years—and that

Re: Calendar issues

2012-09-13 Thread Mark Davis
In ICU, we are using Gregorian eras (AD/BC) as customarily interpreted, and there is no year zero. There isn't a simple way to get non-era years—and that form is mostly interesting to techies, not normal people, which is why we support the era form. (If someone wanted to do it, you could probably

Re: Calendar issues

2012-09-12 Thread Mark Davis
+Peter, since he has an interest in these issues. Mark <https://plus.google.com/114199149796022210033> * * *— Il meglio è l’inimico del bene —* ** On Wed, Sep 12, 2012 at 9:37 PM, Mark Davis ☕ wrote: > > > Mark <https://plus.google.com/114199149796022210033> >

Re: Calendar issues

2012-09-12 Thread Mark Davis
Mark * * *— Il meglio è l’inimico del bene —* ** On Wed, Sep 12, 2012 at 8:43 PM, Norbert Lindenberg < ecmascr...@norbertlindenberg.com> wrote: > ES5 section 15.9.1 specifies a number of operations to map time values > (measured in milliseconds fr

Re: General comment on ES 402 test suite (i18n)

2012-09-11 Thread Mark Davis
Can you reformulate the table attached to http://unicode.org/cldr/trac/ticket/5302? In particular, if a currency is not in the LDML table, it gets the default values (see below). So you need to compare on that basis. It is much better for comparison if you attach a tab- or comma-delimited file, s

Re: ECMAScript collation question

2012-09-05 Thread Mark Davis
ormalization is > turned on by default. > > Norbert > > > On Sep 4, 2012, at 13:23 , Mark Davis ☕ wrote: > > > In view of the schedule, I suggest that we make your first, minimal > change right now, and plan to correct it along one of the other lines in > the next editio

Re: ECMAScript collation question

2012-09-04 Thread Mark Davis
CP 47. > The Internationalization API would use this, if the normalization property > of options is undefined, to map to the appropriate boolean value. > > This can't happen today, and I'm not sure it's really required. Turning > off normalization is primarily an op

Re: ECMAScript collation question

2012-09-02 Thread Mark Davis
thoughts? Mark <https://plus.google.com/114199149796022210033> * * *— Il meglio è l’inimico del bene —* ** On Sun, Sep 2, 2012 at 8:15 AM, Markus Scherer wrote: > On Sat, Sep 1, 2012 at 4:19 PM, Mark Davis ☕ wrote: > >> Your proposal looks reasonable, except I'm not sure how s

Re: ECMAScript collation question

2012-09-01 Thread Mark Davis
ire that canonical equivalence -> 0 > unless the client explicitly turns off normalization (i.e., normalization > is on by default, independent of locale). Support for the normalization > property in options and the kk key would become mandatory. > > Norbert > > > On Aug

Re: ECMAScript collation question

2012-08-31 Thread Mark Davis
I think we could go either way. It depends on the usage mode. 1. The case where performance is crucial is where you are comparing gazillions of strings, such as records in a database. 2. If the number of strings to be compared is relatively small, and/or there is enough overhead anyway

Re: ECMAScript collation question

2012-08-30 Thread Mark Davis
ICU *is* always able to compare them as being equal, just by setting the parameter. Even if the parameter isn't set, it uses an FCD sort (see http://unicode.org/notes/tn5/) and canonical closure, which handles most cases of canonical equivalence. The default is turned on for languages where the no
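As a rough illustration of canonical equivalence (this is not ICU's own API), the sketch below compares a precomposed and a decomposed spelling of the same character. `String.prototype.normalize` and `Intl.Collator` are standard ECMAScript features that postdate this thread, and the sample strings are assumptions chosen for illustration.

```javascript
// Two canonically equivalent spellings of "é".
const precomposed = "\u00E9"; // U+00E9 LATIN SMALL LETTER E WITH ACUTE
const decomposed = "e\u0301"; // U+0065 followed by U+0301 COMBINING ACUTE ACCENT

// A plain code-unit comparison sees two different strings.
const sameCodeUnits = precomposed === decomposed;

// Normalizing both sides to NFC makes them identical.
const sameAfterNfc =
  precomposed.normalize("NFC") === decomposed.normalize("NFC");

// A collator is expected to treat the two as equal. The result depends on the
// implementation's collation support, so it is only printed here.
const collatorResult = new Intl.Collator("en").compare(precomposed, decomposed);

console.log(sameCodeUnits, sameAfterNfc, collatorResult);
```

Normalizing before every comparison is the expensive path, which is why the choice of default matters when very large numbers of strings are compared.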

Re: Unicode support in new ES6 spec draft

2012-07-17 Thread Mark Davis
8581.html > > Norbert > > > On Jul 17, 2012, at 14:49 , Brendan Eich wrote: > > > Allen Wirfs-Brock wrote: > >> On Jul 16, 2012, at 2:57 PM, Mark Davis ☕ wrote: > >> > >>> In order to support backwards iteration (which is sometimes used), we >

Re: Unicode support in new ES6 spec draft

2012-07-16 Thread Mark Davis
In order to support backwards iteration (which is sometimes used), we should have codePointBefore. -- Mark * * *— Il meglio è l’inimico del bene —* ** On Mon, Jul 16, 2012 at 2:54 PM, Gillam, Richard wrote: > Why is i
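A minimal sketch of what such a helper could look like. `codePointBefore` here is a hypothetical free function written for illustration, not an API from the draft; it reads the code point that ends just before a given code-unit offset.

```javascript
// Hypothetical helper: returns the code point that ends just before `index`
// (a code-unit offset), or undefined if `index` is out of range.
function codePointBefore(s, index) {
  if (index <= 0 || index > s.length) return undefined;
  const low = s.charCodeAt(index - 1);
  if (low >= 0xDC00 && low <= 0xDFFF && index >= 2) {
    const high = s.charCodeAt(index - 2);
    if (high >= 0xD800 && high <= 0xDBFF) {
      // Well-formed surrogate pair: combine into a supplementary code point.
      return (high - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000;
    }
  }
  return low; // BMP code unit, or an unpaired surrogate returned as-is
}

// Walk a string from the end to the start, one code point at a time.
function codePointsBackward(s) {
  const result = [];
  let i = s.length;
  while (i > 0) {
    const cp = codePointBefore(s, i);
    result.push(cp);
    i -= cp > 0xFFFF ? 2 : 1; // supplementary code points span two code units
  }
  return result;
}

const backward = codePointsBackward("a\uD800\uDC00b").map(cp => cp.toString(16));
console.log(backward); // [ '62', '10000', '61' ]
```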

Re: Quasi-literals and localization

2012-07-12 Thread Mark Davis
I didn't pay enough attention to the whole quasi structure, so I can't pretend to speak intelligently about that. We do support a number of different mechanisms for string translation, many of them extracting strings from source files (including templating source languages like jsps, soy (aka clos

Re: Internationalization: Additional values in API

2012-06-26 Thread Mark Davis
I tend to agree with your proposal. Some caveats below. -- Mark * * *— Il meglio è l’inimico del bene —* ** On Tue, Jun 26, 2012 at 3:22 PM, Norbert Lindenberg < ecmascr...@norbertlindenberg.com> wrote: > The TC 39 m

Re: Unicode normalization

2012-05-29 Thread Mark Davis
This is for v2, right? -- Mark * * *— Il meglio è l’inimico del bene —* ** On Tue, May 29, 2012 at 5:34 PM, Norbert Lindenberg < ecmascr...@norbertlindenberg.com> wrote: > The ECMAScript Language Specification 5.1 make

Re: Internationalization API issues and updates

2012-04-16 Thread Mark Davis
Lgtm On Mar 26, 2012 4:59 PM, "Norbert Lindenberg" < ecmascr...@norbertlindenberg.com> wrote: > While everybody is reviewing the draft specification of the ECMAScript > Internationalization API [1] in preparation for this week's TC 39 meeting, > here are a few issues that have come up, with propos

Re: Full Unicode based on UTF-16 proposal

2012-03-27 Thread Mark Davis
nes code point properties which would not entail > interpreting as an abstract character, e.g., IsSurrogate, IsNonCharacter, > but where does one draw the line? > > > On Tue, Mar 27, 2012 at 11:15 AM, Mark Davis ☕ wrote: > >> The point of C1 is that you can't interpret the surroga

Re: Full Unicode based on UTF-16 proposal

2012-03-27 Thread Mark Davis
22210033> * * *— Il meglio è l’inimico del bene —* ** On Tue, Mar 27, 2012 at 08:56, Glenn Adams wrote: > This begs the question of what is the point of C1. > > > On Tue, Mar 27, 2012 at 9:13 AM, Mark Davis ☕ wrote: > >> That would not be practical, nor predictable. And

Re: Full Unicode based on UTF-16 proposal

2012-03-27 Thread Mark Davis
com/114199149796022210033> * * *— Il meglio è l’inimico del bene —* ** On Tue, Mar 27, 2012 at 08:02, Glenn Adams wrote: > > > On Tue, Mar 27, 2012 at 8:39 AM, Mark Davis ☕ wrote: > >> That, as Norbert explained, is not the intention of the standard. Take a >> look at the

Re: Full Unicode based on UTF-16 proposal

2012-03-27 Thread Mark Davis
That, as Norbert explained, is not the intention of the standard. Take a look at the discussion of "Unicode 16-bit string" in chapter 3. The committee recognized that fragments may be formed when working with UTF-16, and that destructive changes may do more harm than good. x = a.substring(0, 5) +
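The point about fragments can be sketched as follows. The sample string and the U+FFFD "repair" step are assumptions made for illustration; the repair stands in for the kind of destructive change the message warns against.

```javascript
// U+1F600 occupies code units 3 and 4 of this string.
const a = "abc\uD83D\uDE00def";

// Splitting at offset 4 cuts the surrogate pair in half.
const head = a.substring(0, 4); // ends with a lone high surrogate
const tail = a.substring(4);    // starts with a lone low surrogate

// Because the fragments are left untouched, concatenation restores the original.
const x = head + tail;
console.log(x === a); // true

// If each fragment had been "repaired" by substituting U+FFFD for the lone
// surrogate, the rejoined string would no longer match the original.
const loneSurrogate = /[\uD800-\uDFFF]/g;
const repaired =
  head.replace(loneSurrogate, "\uFFFD") + tail.replace(loneSurrogate, "\uFFFD");
console.log(repaired === a); // false
```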

Re: Full Unicode based on UTF-16 proposal

2012-03-16 Thread Mark Davis
Whew, a lot of work, Norbert. Looks quite good. My one question is whether it is worth having a mechanism for iteration. OLD CODE for (int i = 0; i < s.length(); ++i) { var x = s.charAt(i); // do something with x } Using your mechanism, one would write: NEW CODE for (int i = 0; i < s.length()
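For comparison, a sketch of forward iteration by code point in present-day JavaScript. It uses `String.prototype.codePointAt`, which was standardized later (ES2015) and may differ from the mechanism in the proposal under review; `forEachCodePoint` is a hypothetical helper name.

```javascript
// Visit each code point of `s`, passing the code point and its code-unit offset.
function forEachCodePoint(s, callback) {
  for (let i = 0; i < s.length; ) {
    const cp = s.codePointAt(i);
    callback(cp, i);
    i += cp > 0xFFFF ? 2 : 1; // step over both halves of a surrogate pair
  }
}

const seen = [];
forEachCodePoint("a\uD800\uDC00b", (cp, i) => seen.push([i, cp]));
console.log(seen); // [ [ 0, 97 ], [ 1, 65536 ], [ 3, 98 ] ]
```

The manual step of one or two code units is exactly the bookkeeping a built-in iteration mechanism would hide.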

Re: New full Unicode for ES6 idea

2012-02-19 Thread Mark Davis
First, it would be great to get full Unicode support in JS. I know that's been a problem for us at Google. Secondly, while I agree with Addison that the approach that Java took is workable, it does cause problems. Ideally someone would be able to loop (a very common construct) with: for (codepoin

Re: Question about the “full Unicode in strings” strawman

2012-01-25 Thread Mark Davis
(oh, and I agree with your other points) Mark *— Il meglio è l’inimico del bene —* * * * [https://plus.google.com/114199149796022210033] * On Wed, Jan 25, 2012 at 11:11, Mark Davis ☕ wrote: > You can't use \u10 as syntax, because that could be \u10FF followed by > literal F

Re: Question about the “full Unicode in strings” strawman

2012-01-25 Thread Mark Davis
You can't use \u10FFFF as syntax, because that could be \u10FF followed by literal FF. A better syntax is \u{...}, with 1 to 6 digits, values from 0 .. 10FFFF. Mark *— Il meglio è l’inimico del bene —* * * * [https://plus.google.com/114199149796022210033] * On Wed, Jan 25, 2012 at 10:59, Gillam
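The ambiguity is easy to see in an engine that supports the braced form, which was later standardized in ES2015. The variable names below are for illustration only.

```javascript
// The four-digit escape consumes exactly four hex digits, so the trailing
// "FF" is ordinary text.
const fourDigit = "\u10FFFF";
console.log(fourDigit === "\u10FF" + "FF"); // true
console.log(fourDigit.length);              // 3

// The braced escape names a single code point, up to U+10FFFF.
const braced = "\u{10FFFF}";
console.log(braced.codePointAt(0).toString(16)); // 10ffff
console.log(braced.length);                      // 2 (one surrogate pair)
```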

Re: Globalization API holiday summary

2011-12-09 Thread Mark Davis
Mark *— Il meglio è l’inimico del bene —* * * * [https://plus.google.com/114199149796022210033] * On Thu, Dec 8, 2011 at 10:25, Nebojša Ćirić wrote: > There are couple of threads going on and I wanted to wrap up current state > before the holidays... > > API: > 1. Use built in toLocaleString

Re: Globalization API: supportedLocalesOf vs. getSupportedLocales

2011-11-28 Thread Mark Davis
Here's the problem. The very same collator for "de" is valid for "de-DE", "de-AT", and "de-CH". In ICU you actually get a functionally-equivalent object back, no matter which of these you ask for. However, that collator is *also* valid for other countries where 'de' is official: de-LU, de-BE, de-

Re: Globalization API Feedback - moar!

2011-11-28 Thread Mark Davis
Some feedback on the API. This is a bit of stream-of-consciousness response, but figured it would be better to get it out than to delay & clean it up. The internationalization issues that people may not be used to are: - *Big data requirements. *A collation sequence for Chinese, for example

Regex

2011-11-17 Thread Mark Davis
Regex has not been part of the scope of the Globalization API work. I wanted to find out whether any improvements from an internationalization point of view are being planned, separately. Some of the problems include: - Regexes fail on supplementary characters (above U+FFFF). Most of these are
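The supplementary-character problem can be reproduced with a one-character pattern. The `u` flag used for contrast was added to the language later (ES2015) and was not available when this message was written.

```javascript
const supplementary = "\uD800\uDC00"; // U+10000, stored as two code units

// Without the u flag, "." matches a single code unit, so a pattern that
// expects exactly one character rejects the string.
const withoutU = /^.$/.test(supplementary);

// With the u flag, "." matches a whole code point.
const withU = /^.$/u.test(supplementary);

console.log(withoutU, withU); // false true
```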

Fwd: Slide show: Survey of current programming language support for Unicode

2011-08-01 Thread Mark Davis
FYI About the new BCP47 support in Java: http://download.oracle.com/javase/tutorial/i18n/locale/extensions.html The following, comparing Unicode support in programming languages, including ES. -- Forwarded message -- From: Karl Williamson Date: Sat, Jul 30, 2011 at 13:01 Subje

Re: i18n meeting mid August @ Google

2011-08-01 Thread Mark Davis
Works for me. (I would need to be out from 11:00-12:30.) Mark *— Il meglio è l’inimico del bene —* On Mon, Aug 1, 2011 at 09:29, Nebojša Ćirić wrote: > So far we have Monday and Tuesday off the table, and some people hinting > that Wednesday would work best for them. Anybody has a conflict wit

Re: Comments on internationalization API

2011-07-22 Thread Mark Davis
l language, region, and > options of a Collator, NumberFormat, or DateTimeFormat. E.g., if I request > ar-MA-u-ca-islamic, did I get exactly what I requested, or > ar-MA-u-ca-islamicc, ar-MA-u-ca-gregory, ar-u-ca-gregory, or yet something > else? > > Best regards, > Norbert > &

Re: Comments on internationalization API

2011-07-20 Thread Mark Davis
I have comments on some of these. Mark *— Il meglio è l’inimico del bene —* On Tue, Jul 19, 2011 at 01:29, Norbert Lindenberg < ecmascr...@norbertlindenberg.com> wrote: > Hi all, > > I'm sorry for not having been able to contribute to the > internationalization API earlier. I finally have revie

Fwd: Full Unicode strings strawman

2011-05-19 Thread Mark Davis
Markus isn't on es-discuss, so forwarding -- Forwarded message -- From: Markus Scherer Date: Wed, May 18, 2011 at 22:18 Subject: Re: Full Unicode strings strawman To: Allen Wirfs-Brock Cc: Shawn Steele , Mark Davis ☕ < m...@macchiato.com>, "es-discuss@mozill

Re: Full Unicode strings strawman

2011-05-18 Thread Mark Davis
Yes, one of the options for the internal storage of the string class is to use different arrays depending on the contents. 1. uint8's if all the code point values are <= FF 2. uint16's if all the code point values are <= FFFF 3. uint32's otherwise That way the internal storage always corresponds direct
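A minimal sketch of that storage choice, written in JavaScript with typed arrays purely for illustration; an engine would do this internally in native code. `chooseStorage` is a hypothetical name, and the 16-bit threshold is taken to be FFFF.

```javascript
// Pick the narrowest array type that can hold every code point of `s`,
// storing one code point per element.
function chooseStorage(s) {
  // Array.from iterates a string by code point, not by code unit.
  const codePoints = Array.from(s, ch => ch.codePointAt(0));
  const max = codePoints.reduce((m, cp) => Math.max(m, cp), 0);
  if (max <= 0xFF) return Uint8Array.from(codePoints);
  if (max <= 0xFFFF) return Uint16Array.from(codePoints);
  return Uint32Array.from(codePoints);
}

console.log(chooseStorage("abc").constructor.name);             // Uint8Array
console.log(chooseStorage("a\u0416").constructor.name);         // Uint16Array
console.log(chooseStorage("a\uD800\uDC00").constructor.name);   // Uint32Array
```

With one element per code point, indexing is constant-time in all three representations, at the cost of widening the whole string when a single supplementary character appears.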

Re: Full Unicode strings strawman

2011-05-18 Thread Mark Davis
the hair-pulling we had in the *80's* over those charsets ;-) > > On 17 May 2011 21:55, Mark Davis ☕ wrote:In the past, > I have read it thus, pseudo BNF: > > >>> UnicodeString => CodeUnitSequence // D80 >>> CodeUnitSequence => CodeUnit | CodeUnitSequen

Re: Full Unicode strings strawman

2011-05-17 Thread Mark Davis
That is incorrect. See below. Mark *— Il meglio è l’inimico del bene —* On Tue, May 17, 2011 at 18:33, Wes Garland wrote: > On 17 May 2011 20:09, Boris Zbarsky wrote: > >> On 5/17/11 5:24 PM, Wes Garland wrote: >> >>> Okay, I think we have to agree to disagree here. I believe my reading of >

Re: Full Unicode strings strawman

2011-05-17 Thread Mark Davis
The wrong conclusion is being drawn. I can say definitively that for the string "a\uD800b". - It is a valid Unicode string, according to the Unicode Standard. - It cannot be encoded as well-formed in any UTF-x (it is not 'well-formed' in any UTF). - When it comes to conversion, the bad
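The distinction can be observed directly in JavaScript, whose strings are sequences of 16-bit code units. `encodeURIComponent` is used below only as a convenient standard function that must convert to UTF-8; it is an assumption chosen for the demonstration and not something the message mentions.

```javascript
const s = "a\uD800b";

// The string itself is perfectly usable: three code units, the middle one
// being an unpaired surrogate.
console.log(s.length);                      // 3
console.log(s.charCodeAt(1).toString(16));  // d800

// Converting it to UTF-8 is where the problem appears: encodeURIComponent
// throws a URIError because a lone surrogate has no well-formed encoding.
let conversionFailed = false;
try {
  encodeURIComponent(s);
} catch (e) {
  conversionFailed = e instanceof URIError;
}
console.log(conversionFailed); // true
```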

Re: Full Unicode strings strawman

2011-05-16 Thread Mark Davis
Mark *— Il meglio è l’inimico del bene —* On Mon, May 16, 2011 at 15:27, Allen Wirfs-Brock wrote: > See the section of the proposal about String.prototype.charCodeAt > > On May 16, 2011, at 2:20 PM, Mike Samuel wrote: > > > Allen, could you clarify something. > > > > When the strawman says with

Re: Full Unicode strings strawman

2011-05-16 Thread Mark Davis
In practice, the supplemental code points don't really cause problems in Unicode strings. Most implementations just treat them as if they were unassigned. The only important issue is that *when* they are converted to UTF-xx for storage or transmission, they need to be handled; typically by converti

Re: Full Unicode strings strawman

2011-05-16 Thread Mark Davis
A correction. U+D800 is indeed a code point: http://www.unicode.org/glossary/#Code_Point. It is defined for usage in Unicode Strings (see http://www.unicode.org/glossary/#Unicode_String) because often it is useful for implementations to be able to allow it in processing. It does, however, have a

Re: Full Unicode strings strawman

2011-05-16 Thread Mark Davis
:* es-discuss-boun...@mozilla.org [mailto: > es-discuss-boun...@mozilla.org] *On Behalf Of *Jungshik Shin (???, ???) > *Sent:* Monday, May 16, 2011 2:24 PM > *To:* Mark Davis ☕ > *Cc:* Markus Scherer; es-discuss@mozilla.org > > *Subject:* Re: Full Unicode strings strawman > &

Re: Full Unicode strings strawman

2011-05-16 Thread Mark Davis
I'm quite sympathetic to the goal, but the proposal does represent a significant breaking change. The problem, as Shawn points out, is with indexing. Before, the strings were defined as UTF16. Take a sample string "\ud800\udc00\u0061" = "\u{10000}\u{61}". Right now, the 'a' (the \u{61}) is at offs
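The indexing difference for that sample string can be checked as follows. Code-point indexing is simulated here with `Array.from`, which is an assumption standing in for what the strawman would make the native behavior.

```javascript
const s = "\uD800\uDC00\u0061"; // U+10000 followed by "a"

// Current behavior: offsets count 16-bit code units.
const codeUnitOffset = s.indexOf("a");

// Simulated proposal: offsets count code points.
const codePointOffset = Array.from(s).indexOf("a");

console.log(codeUnitOffset, codePointOffset); // 2 1
```

Any existing code that stores or computes offsets under the first model would silently point at the wrong position under the second.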

Re: Collation API not complete for search

2011-03-28 Thread Mark Davis
> Similarly, if one is using Turkish I, we expect all of them to do so. > > > > - Shawn > > > > *From:* Nebojša Ćirić [mailto:c...@google.com] > *Sent:* Monday, March 28, 2011 1:36 PM > *To:* Mark Davis ☕ > *Cc:* es-discuss@mozilla.org; Shawn Steele; Phillip

Re: Collation API not complete for search

2011-03-25 Thread Mark Davis
I think an iterator is a cleaner interface; we were just trying to minimize new API. In general, collation is context sensitive, so searching on substrings isn't a good idea. You want to search from a location, but have the rest of the text available to you. For the iterator, you would need to be

Re: Stupid i18n use cases question

2011-01-29 Thread Mark Davis
There are really 5 cases at issue: 1. Code point breaks 2. Grapheme-Cluster breaks (with three possible variants: 'legacy', extended, and aksha ) 3. Word breaks 4. Line breaks 5. Sentence breaks Notes: - #1 is pretty trivial to do rig

Re: i18n objects

2011-01-26 Thread Mark Davis
lization is not a feature. > > It is an architecture. > > > > *From:* Nebojša Ćirić [mailto:c...@google.com] > *Sent:* Wednesday, January 26, 2011 1:02 PM > *To:* Shawn Steele > *Cc:* Phillips, Addison; Mark Davis ☕; Gillam, Richard; > es-discuss@mozilla.org > *Subj

Re: i18n objects

2011-01-24 Thread Mark Davis
I don't understand. - If you want the explicit value, you call .region. - NB, the value will be undefined iff it is not set explicitly. - If you want the (possibly) inferred value, you call .inferRegion(). - NB, the value is never undefined. What is the problem? Mark *— Il meg
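A sketch of the two accessors being discussed. The `LocaleInfo` class, its constructor shape, and the small likely-region table are all hypothetical, made up here to show the explicit versus inferred split; none of it is the proposal's actual API.

```javascript
// Hypothetical LocaleInfo with an explicit field and a separate inference call.
class LocaleInfo {
  constructor({ language, region } = {}) {
    this.language = language;
    this.region = region; // stays undefined unless set explicitly
  }

  // Always returns a value: the explicit region if present, otherwise a guess.
  inferRegion() {
    if (this.region !== undefined) return this.region;
    const likelyRegions = { en: "US", de: "DE", ja: "JP" }; // made-up sample data
    return likelyRegions[this.language] || "ZZ"; // "ZZ" denotes an unknown region
  }
}

const implicit = new LocaleInfo({ language: "de" });
console.log(implicit.region);        // undefined
console.log(implicit.inferRegion()); // DE

const explicit = new LocaleInfo({ language: "de", region: "AT" });
console.log(explicit.region);        // AT
console.log(explicit.inferRegion()); // AT
```

Keeping the two calls separate means a caller can always tell whether a region was stated or guessed.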

Re: i18n objects

2011-01-24 Thread Mark Davis
As stated before, I think that this approach is more error prone; that it would be better to explicitly call the other function. Here would be the difference between the two alternatives for the API: A and B, under the two common scenarios: *Scenario 1 "I don't care"* A. x = myLocaleInfo.region;

Re: 2nd day meeting comments on the latest i18n API proposal

2011-01-21 Thread Mark Davis
The problem I see is that if I hand you a LocaleInfo, and there is only one API to get the region, then it (in your words) **is easy** to "accidentally make the wrong choice, or not realize they need to make a choice". - x.region may be an explicit value or may be computed: I have to call so

Re: 2nd day meeting comments on the latest i18n API proposal

2011-01-21 Thread Mark Davis
I would actually rather not have it be a construction argument, because it is easier for people to make mistakes that way. When I look this over, there are relatively few fields that need this. So what about having API like: // get an explicitly-set region, or null if there was no region paramete

Re: i18n collator options

2011-01-20 Thread Mark Davis
We could do either. Mark *— Il meglio è l’inimico del bene —* On Thu, Jan 20, 2011 at 16:14, Shawn Steele wrote: > For UTF-16 order do you use like the Turkish casing if it was a turkish > locale? > > > > -Shawn > > > > *From:* mark.edward.da...@gmail.com [mailto:mark.edward.da...@gmail.com]

Re: EcmaScript i18n API proposal

2010-06-10 Thread Mark Davis
*Re the following message:* * * It is clearly expected that the number of locales available on any particular device may be limited; a smartphone, for example, might have very few installed, or have limited services for those it does have installed. With the locale model, implementations are expect