php-i18n Digest 4 Jun 2008 11:41:42 -0000 Issue 398

Topics (messages 1234 through 1238):

Re: intl 1.0.0RC1
        1234 by: Stanislav Malyshev
        1235 by: Lars Strojny
        1236 by: Lars Strojny
        1237 by: Stanislav Malyshev
        1238 by: David Zülke

----------------------------------------------------------------------
--- Begin Message ---
Hi!

> What if there is a Number extension down the road. Or a Collator extension. Or what if people already have classes called NumberFormatter

Well, what if they have classes named IntlNumberFormatter? There can be
all kinds of classes, and until we have a standard namespace for internal
classes, we have to live with names being global and make them
sufficiently distinct to avoid collisions. I think NumberFormatter is
sufficiently distinct.

> But why are those internal differences exposed through the API. I think

Because there are cases when either one of them can be used, IMO.

> At least, there should be Collator::asortWithKeys(). But I really

The current implementation doesn't make it easy to do asort with keys, but
this can be improved. Contributions welcome, btw ;)
Also, asort seems to be a less frequent use case for data that requires
collation. That doesn't mean it shouldn't be done; it is just a matter of priority.

> idea. I'm failing to get the extension compiled here on OS X, but will

What's the problem on OS X? I'd like it to build on every OS
supported by PHP, so could you provide more info on this?

> [EMAIL PROTECTED];collation=traditional;calendar=thai-buddhist is what I could come up with right now... 77 characters.

That's really not a frequent case - especially taking into account that
there's no function that needs currency, collation and calendar at the
same time. But for the main reason see below.

> The other question is what happens if the string is longer than that? Does it get cut off or something?

No, a function that is given an overlong locale name would fail.

> paintings. Or whatever. So locale identifier strings can be of any length.

Please tell that to the ICU library developers. A 98-byte-long locale
provably crashes the ICU libraries. I didn't want to take chances, so I chose
the smallest "round" number for the limit that works reliably. I'd be happy to raise it if I could be sure ICU would work OK with it.
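
A minimal sketch of the "fail instead of truncate" behaviour described above. The constant MAX_LOCALE_LEN, its value of 64, and the helper checkLocaleLength() are assumptions made for illustration; the thread does not state the actual limit, and none of these names are part of ext/intl.

```php
<?php
// Hypothetical limit: the thread does not say which number ext/intl uses.
const MAX_LOCALE_LEN = 64;

/**
 * Returns the locale unchanged if it fits the limit, or null if it is too long.
 * Nothing is cut off: an overlong identifier is rejected outright.
 */
function checkLocaleLength(string $locale, int $max = MAX_LOCALE_LEN): ?string
{
    return strlen($locale) <= $max ? $locale : null;
}

var_dump(checkLocaleLength('en_US@currency=USD'));  // the string comes back as-is
var_dump(checkLocaleLength(str_repeat('x', 98)));   // NULL: rejected, not truncated
```

Rejecting the whole identifier keeps the caller from silently working with a different locale than the one requested, which is what truncation would risk.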

> Maybe ext/intl should do this:
>
> - Accept locale strings of arbitrary length
> - Parse them and throw out any keywords ICU cannot handle (i.e. everything except "collation", "currency" and "calendar", AFAIK)
> - Hand the resulting string over to ICU

Well, maybe, but not in 1.0 :) Note that this will also significantly
slow down the functions and introduce a dependency in the PHP code on locale
formats.
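
The three proposed steps could look roughly like this in userland PHP. This is only a sketch under assumptions: filterLocaleKeywords() is a hypothetical helper, not ext/intl API; the allowed keyword list is the one named in the proposal; "paintings=oil" is a made-up keyword; and the last step (handing the string to ICU) is left out.

```php
<?php
/**
 * Keeps only the keywords listed in $allowed; everything before "@" passes through.
 * The default list follows the proposal in this thread.
 */
function filterLocaleKeywords(
    string $locale,
    array $allowed = ['collation', 'currency', 'calendar']
): string {
    $parts = explode('@', $locale, 2);
    $base = $parts[0];
    if (!isset($parts[1]) || $parts[1] === '') {
        return $base; // no keyword section at all
    }

    $kept = [];
    foreach (explode(';', $parts[1]) as $pair) {
        $kv = explode('=', $pair, 2);
        $key = strtolower(trim($kv[0]));
        // Drop malformed pairs and keywords that are not on the allowed list.
        if (isset($kv[1]) && in_array($key, $allowed, true)) {
            $kept[] = $key . '=' . trim($kv[1]);
        }
    }

    return $kept ? $base . '@' . implode(';', $kept) : $base;
}

echo filterLocaleKeywords('sr_RS@currency=USD;paintings=oil;calendar=thai-buddhist'), "\n";
// sr_RS@currency=USD;calendar=thai-buddhist
```

The extra parse on every call is the slowdown mentioned in the reply, and the hard-coded keyword list is the format dependency.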

> What confuses me, in general, is why locales are not implemented as objects. Why do I have to pass a locale string to every locale-aware function?

Because a locale is essentially a string. There's nothing in the locale that isn't in the string, so you don't need a specific object for that - it wouldn't give you any value.

> Also... having uloc_acceptLanguageFromHTTP exposed in the API would be pretty neat ;) Since apparently, that does a mapping of e.g. "en-GB" to "en_UK" etc

Feature request on pecl.php.net? ;) It'd be much easier to keep track of it that way.

> And.. is there going to be Resources support in the future? AFAIK, the

Yes, it's planned.
--
Stanislav Malyshev, Zend Software Architect
[EMAIL PROTECTED]   http://www.zend.com/
(408)253-8829   MSN: [EMAIL PROTECTED]



--- End Message ---
--- Begin Message ---
Hi Stas,

On Monday, 02.06.2008, at 08:36 +0300, Stanislav Malyshev wrote:
> Well, what if they have classes named IntlNumberFormatter? There can
> be all kinds of classes, and until we have a standard namespace for
> internal classes, we have to live with names being global and make them
> sufficiently distinct to avoid collisions. I think NumberFormatter is
> sufficiently distinct.

There might be classes like IntlNumberFormatter, and someone's business
could be put at risk by a nuke, but neither is very likely, so people
normally ignore the nuke scenario. In the same way, I think an existing
IntlNumberFormatter is not very realistic, whereas an existing
NumberFormatter is much more so. Would you accept a patch that adds an
Intl prefix to everything?

cu, Lars



--- End Message ---
--- Begin Message ---
Hi Stas,

(sorry for the cluttered reply)

On Monday, 02.06.2008, at 08:36 +0300, Stanislav Malyshev wrote:
> Because a locale is essentially a string. There's nothing in the
> locale that isn't in the string, so you don't need a specific object
> for that - it wouldn't give you any value.

Except that it is easier to validate with object setters than to parse a
locale string. 

$locale = new IntlLocale();
$locale->setCurrency(IntlCurrency::USD);
$locale->setCollation(IntlCollation::TRADITIONAL);

or 

$locale = new IntlLocale('en_US', IntlCurrency::USD,
    IntlCollation::TRADITIONAL, IntlCalendar::THAI_BUDDHIST);

Creating the locale programmatically, which might be a common use case
for the web, is much easier with a well-defined object than with an
ad hoc concatenated string.
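
For comparison, here is a sketch of validated, programmatic construction that still ends up as a plain string. buildLocaleId() and its list of known keywords are hypothetical and exist only for this illustration; they are not part of ext/intl, and the keyword names are simply the ones mentioned in this thread.

```php
<?php
/**
 * Builds "base@key=value;..." and rejects keywords it does not know.
 */
function buildLocaleId(string $base, array $keywords = []): string
{
    $known = ['currency', 'collation', 'calendar']; // keywords named in this thread
    $pairs = [];
    foreach ($keywords as $key => $value) {
        if (!in_array($key, $known, true)) {
            throw new InvalidArgumentException("Unknown locale keyword: $key");
        }
        $pairs[] = $key . '=' . $value;
    }

    return $pairs ? $base . '@' . implode(';', $pairs) : $base;
}

echo buildLocaleId('en_US', [
    'currency'  => 'USD',
    'collation' => 'traditional',
    'calendar'  => 'thai-buddhist',
]), "\n";
// en_US@currency=USD;collation=traditional;calendar=thai-buddhist
```

A helper like this catches a mistyped keyword at construction time, which is the validation benefit argued for above, without requiring a dedicated locale class.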

cu, Lars



--- End Message ---
--- Begin Message ---
Hi!

> $locale = new IntlLocale();
> $locale->setCurrency(IntlCurrency::USD);
> $locale->setCollation(IntlCollation::TRADITIONAL);

You have functions for that in the Locale class.
--
Stanislav Malyshev, Zend Software Architect
[EMAIL PROTECTED]   http://www.zend.com/
(408)253-8829   MSN: [EMAIL PROTECTED]

--- End Message ---
--- Begin Message ---
On 02.06.2008, at 07:36, Stanislav Malyshev wrote:

> Hi!
>
>> What if there is a Number extension down the road. Or a Collator extension. Or what if people already have classes called NumberFormatter
>
> Well, what if they have classes named IntlNumberFormatter? There can be
> all kinds of classes, and until we have a standard namespace for internal
> classes, we have to live with names being global and make them
> sufficiently distinct to avoid collisions. I think NumberFormatter is
> sufficiently distinct.

Obviously, it is less likely that someone who wrote a Collator called their class "IntlCollator" instead of "Collator".

Consistency is important. The lack of it is also often what PHP users complain about. This should really be changed to be consistent across the board, Stas.


>> But why are those internal differences exposed through the API. I think
>
> Because there are cases when either one of them can be used, IMO.

I don't think that's a valid argument. Why does the API have to expose the internal implementation differences?


>> At least, there should be Collator::asortWithKeys(). But I really
>
> The current implementation doesn't make it easy to do asort with keys, but
> this can be improved. Contributions welcome, btw ;)
> Also, asort seems to be a less frequent use case for data that requires collation. That doesn't mean it shouldn't be done; it is just a matter of priority.

Again, I think it should be consistent :) Why not just toss out sort(), rename sortWithKeys() to sort(), and then optimize asort() later?
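
One way to picture a key-preserving sort that uses precomputed sort keys is sketched below. The free function asortWithKeys() is hypothetical and is not the Collator method discussed here; the key function is passed in, and strtolower is only a stand-in for whatever collation sort key a real implementation would obtain from ICU.

```php
<?php
/**
 * Sorts $items by a sort key computed once per value, preserving array keys.
 * $makeKey is a stand-in for a collation sort-key function.
 */
function asortWithKeys(array $items, callable $makeKey): array
{
    // With a single input array, array_map keeps the original keys.
    $sortKeys = array_map($makeKey, $items);

    // Order the sort keys as strings; asort keeps the key association.
    asort($sortKeys, SORT_STRING);

    // Rebuild the original values in sorted order.
    $sorted = [];
    foreach (array_keys($sortKeys) as $k) {
        $sorted[$k] = $items[$k];
    }

    return $sorted;
}

print_r(asortWithKeys(['b' => 'pear', 'a' => 'Apple', 'c' => 'fig'], 'strtolower'));
// Order of the result: a => Apple, c => fig, b => pear
```

Each value is converted to its sort key exactly once, so the comparison step works on plain strings instead of calling the collator for every pair.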


>> idea. I'm failing to get the extension compiled here on OS X, but will
>
> What's the problem on OS X? I'd like it to build on every OS
> supported by PHP, so could you provide more info on this?

Can't quite remember, but will let you know ASAP. Probably a problem on my side.


>> [EMAIL PROTECTED];collation=traditional;calendar=thai-buddhist is what I could come up with right now... 77 characters.
>
> That's really not a frequent case - especially taking into account that
> there's no function that needs currency, collation and calendar at the
> same time. But for the main reason see below.

Waitwaitwait. The entire point of doing internationalization properly, using ICU and the CLDR, is that even seemingly obscure cases are possible.

I mean who are you/we to decide that someone from the Republic of Serbia, who speaks Serbian, may not view sales numbers for last week's month in the thai-buddhist calendar in USD and sorted "traditionally"?

You think that is unrealistic? Maybe. Then what about this:

[EMAIL PROTECTED];collation=traditional;calendar=gregorian

China, simplified Han, List of quarterly sales, Gregorian calendar, normal collation for sorting person names.

65 characters. And this is not unrealistic.

Stas, internationalization is not about neglecting edge cases. It has to be done properly. That's the whole point of it.


>> The other question is what happens if the string is longer than that? Does it get cut off or something?
>
> No, a function that is given an overlong locale name would fail.
>
>> paintings. Or whatever. So locale identifier strings can be of any length.
>
> Please tell that to the ICU library developers. A 98-byte-long locale
> provably crashes the ICU libraries. I didn't want to take chances, so I chose the smallest "round" number for the limit that works reliably. I'd be happy to raise it if I could be sure ICU would work OK with it.

97? ;)


>> Maybe ext/intl should do this:
>> - Accept locale strings of arbitrary length
>> - Parse them and throw out any keywords ICU cannot handle (i.e. everything except "collation", "currency" and "calendar", AFAIK)
>> - Hand the resulting string over to ICU
>
> Well, maybe, but not in 1.0 :) Note that this will also significantly
> slow down the functions and introduce a dependency in the PHP code on locale
> formats.
>
>> What confuses me, in general, is why locales are not implemented as objects. Why do I have to pass a locale string to every locale-aware function?
>
> Because a locale is essentially a string. There's nothing in the locale that isn't in the string, so you don't need a specific object for that - it wouldn't give you any value.

But you have to parse the locale string each time, right? That is overhead. It would be much more logical to pass it around as a resource/object. ICU does it the same way.


>> Also... having uloc_acceptLanguageFromHTTP exposed in the API would be pretty neat ;) Since apparently, that does a mapping of e.g. "en-GB" to "en_UK" etc
>
> Feature request on pecl.php.net? ;) It'd be much easier to keep track of it that way.

That was more of a joke ;) But I will add all those to the issue tracker, yes.

>> And.. is there going to be Resources support in the future? AFAIK, the
>
> Yes, it's planned.

Awesome.


David

--- End Message ---
