php-i18n Digest 14 Jun 2008 06:34:59 -0000 Issue 401

Topics (messages 1249 through 1266):

Re: intl 1.0.0RC1
        1249 by: Darren Cook
        1253 by: Stanislav Malyshev
        1254 by: Darren Cook
        1255 by: Stanislav Malyshev
        1256 by: Darren Cook
        1257 by: Stanislav Malyshev
        1258 by: Guillaume Rossolini
        1259 by: Jan Schneider
        1260 by: Texin, Tex
        1261 by: David Zülke
        1262 by: Stanislav Malyshev
        1263 by: Texin, Tex
        1264 by: Stanislav Malyshev
        1266 by: Norbert Lindenberg ☮

Re: Unicode Transliteration & ICU
        1250 by: Darren Cook
        1251 by: Stanislav Malyshev
        1252 by: Darren Cook
        1265 by: Andrei Zmievski


----------------------------------------------------------------------
--- Begin Message ---
>> Finally, I would like to point you to
>> http://www.php.net/manual/en/userlandnaming.rules.php:
>> "PHP will prefix any global symbols of an extension with the name of
>> the extension."
> 
> That's not what the real coding standard says:
> 
>  If they are part of a "parent set" of functions, that parent should
>     be included in the user function name, and should be clearly related
>     to the parent program or function family. This should be in the form
>     of ``parent_*``

That seems to be item #2 in the Naming Conventions section of:
http://cvs.php.net/viewvc.cgi/php-src/CODING_STANDARDS?view=co

but #7 says: "The class name should be prefixed with the name of the
'parent set' (e.g. the name of the extension)"

What are the good reasons for not changing the names?

Darren

-- 
Darren Cook, Software Researcher/Developer
http://dcook.org/mlsn/ (English-Japanese-German-Chinese-Arabic
                        open source dictionary/semantic network)
http://dcook.org/work/ (About me and my work)
http://dcook.org/work/charts/  (My flash charting demos)

--- End Message ---
--- Begin Message ---
Hi!

> What are the good reasons for not changing the names?

This was already discussed on the list, so reading the archives would help. But OK, here it goes again:

1. We want the extension to be in 5.2 to 6.x
2. We want PHP source code compatibility between 5.x and 6.x code
3. We want PHP 6 users use class Collator when they need collator, not some weird long name, same with Locale etc.
4. intl extension, unlike many others, do not have one functional module, but multiple ones, related only by underlying library. Thus prefixing all modules with extension name makes as much sense as prefixing all functions using libc or libm with common prefix. Each functional module, of course, does have its own prefix.
--
Stanislav Malyshev, Zend Software Architect
[EMAIL PROTECTED]   http://www.zend.com/
(408)253-8829   MSN: [EMAIL PROTECTED]

--- End Message ---
--- Begin Message ---
>> What are the good reasons for not changing the names?
> 
> 1. We want the extension to be in 5.2 to 6.x
> 2. We want PHP source code compatibility between 5.x and 6.x code

I don't understand why the choice between IntlCollator and Collator
affects compatibility between 5.x and 6.x. Can you explain this further?

> 3. We want PHP 6 users use class Collator when they need collator, not
> some weird long name, same with Locale etc.

I don't think IntlCollator is either long, or weird. Actually I feel
prefixing the extension name aids code clarity (as it tells me which
extension the class belongs to, and I can search on "Intl" to find all
places I'm using anything from the extension).

> 4. intl extension, unlike many others, do not have one functional
> module, but multiple ones, related only by underlying library.

I think the users will see it as a single extension:
  a) All the classes are based on ICU classes of the same name;
  b) I type "pecl install intl"; (I cannot just install one part)
  c) All classes are for i18n, which most people will see as a single task.


The arguments for prefixing all classes with "Intl" are:
 * Consistency with IntlDateFormatter

 * Following PHP naming guidelines

 * Avoid clashes with existing classes (someone having written a class
called Collator is more likely than someone having written one called
IntlCollator)

(I'd also add "code clarity", but that is more of a subjective, personal
thing.)
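
To make the clash argument concrete, here is a minimal sketch of application code that already ships its own class called Collator. The class body is hypothetical and only stands in for whatever such an application might contain; class_exists() is standard PHP. If intl is loaded, the global name is already taken, so the guard skips the userland declaration and the intl class is used instead.

```php
<?php
// Hypothetical legacy application code; none of this comes from ext/intl.
// When intl is loaded, the global class name "Collator" is already taken,
// so an unconditional "class Collator { ... }" would be a fatal error.
// The guard below only declares the userland class when the name is free.
if (!class_exists('Collator', false)) {
    class Collator
    {
        private $locale;

        public function __construct($locale)
        {
            $this->locale = $locale; // stored but unused in this sketch
        }

        // Plain byte-wise sort; a stand-in, not real locale-aware collation.
        public function sort(array &$items)
        {
            return sort($items, SORT_STRING);
        }
    }
}

$words = array('banana', 'apple', 'cherry');
$collator = new Collator('en_US');
$collator->sort($words);
print_r($words);
```

For these plain ASCII words the order is apple, banana, cherry with either class. The point is only that the two definitions cannot both own the name Collator in one process, whereas a prefixed name such as IntlCollator would be less likely to collide.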


Darren


--- End Message ---
--- Begin Message ---
Hi!

> I don't understand why the choice between IntlCollator and Collator
> affects compatibility between 5.x and 6.x. Can you explain this further?

Because if you want Collator in 6, and you want same code to run both 5.x and 6.x, then it's Collator in 5.x. If A==B, then B==A.

> I don't think IntlCollator is either long, or weird. Actually I feel

Well, of course it would be strange for you to say "I think it's weird but I still advocate its usage". I however think that here usability is more important than uniformity for the sake of uniformity. There would be no problem for people to know which class name to use for collator.

> prefixing the extension name aids code clarity (as it tells me which
> extension the class belongs to, and I can search on "Intl" to find all
> places I'm using anything from the extension).

Now, why would it be important to you which directory contains the source file that produced your function? So important that you would want to sacrifice usability for it? I think most users couldn't care less once it works for them.

>  * Following PHP naming guidelines

PHP naming guidelines do not say it must always be extension name. They say it has to be by functional group.

>  * Avoid clashes with existing classes (someone having written a class
> called Collator is more likely than someone having written one called
> IntlCollator)

That's because IntlCollator is so awkward a name nobody would use it in the code unless forced to. Why would I want to force all PHP users to use awkward names? To show off how "consistent" we are? To hell with the consistency if the consistency means being consistently hard to use.
--
Stanislav Malyshev, Zend Software Architect

--- End Message ---
--- Begin Message ---
>> I don't understand why the choice between IntlCollator and Collator
>> affects compatibility between 5.x and 6.x. Can you explain this further?
> 
> Because if you want Collator in 6, and you want same code to run both
> 5.x and 6.x, then it's Collator in 5.x. If A==B, then B==A.

Yes. Why can't it be IntlCollator in both 5.x and 6.x? In other words
(unless I'm missing something) compatibility between 5.x and 6.x doesn't
affect the naming choice.

> I however think that here usability is
> more important than uniformity for the sake of uniformity. 

I'm advocating uniformity for the sake of usability :-). (for reasons
already given). (I'm a freelance programmer, working on real-world
i18n-related code, not an ivory tower academic.)

Darren



--- End Message ---
--- Begin Message ---
Hi!

> Yes. Why can't it be IntlCollator in both 5.x and 6.x? In other words
> (unless I'm missing something) compatibility between 5.x and 6.x doesn't
> affect the naming choice.

The thing is that having collators in PHP called IntlCollator and locales in PHP called IntlLocale sucks. It looks so awkward and artificial.

> I'm advocating uniformity for the sake of usability :-). (for reasons

I don't see how IntlLocale is more usable than Locale. If you'd be total stranger to all the things discussed here and I'd tell you - there's a thing called "locale", what do you think would be the name of the class that works with it - would you answer "Locale" or "IntlLocale"?

> already given). (I'm a freelance programmer, working on real-world
> i18n-related code, not an ivory tower academic.)

OK, great - and you'd prefer all your functions and classes to have prefix Intl because of that? I know I wouldn't. I would like to have class named Locale to work with locales.
--
Stanislav Malyshev, Zend Software Architect

--- End Message ---
--- Begin Message ---
Hi,

Here's my user point of view, if you don't mind.

I think your argument is about namespaces, which will be in 6 but not
5.2...  So you might want to have IntlCollator and IntlLocale in 5.2 to avoid
possible collisions, but you would rather have something like Intl::Collator
and Intl::Locale from 5.3 and 6 onward.  I guess the question is: can you
bear with Collator and Locale until we have namespaces?  I know I can.

Guillaume Rossolini


On Tue, Jun 10, 2008 at 11:17 AM, Stanislav Malyshev wrote:

> Hi!
>
>  Yes. Why can't it be IntlCollator in both 5.x and 6.x? In other words
>> (unless I'm missing something) compatibility between 5.x and 6.x doesn't
>> affect the naming choice.
>>
>
> The thing is that having collators in PHP called IntlCollator and locales
> in PHP called IntlLocale sucks. It looks so awkward and artificial.
>
>  I'm advocating uniformity for the sake of usability :-). (for reasons
>>
>
> I don't see how IntlLocale is more usable than Locale. If you'd be total
> stranger to all the things discussed here and I'd tell you - there's a thing
> called "locale", what do you think would be the name of the class that works
> with it - would you answer "Locale" or "IntlLocale"?
>
>  already given). (I'm a freelance programmer, working on real-world
>> i18n-related code, not an ivory tower academic.)
>>
>
> OK, great - and you'd prefer all your functions and classes to have prefix
> Intl because of that? I know I wouldn't. I would like to have class named
> Locale to work with locales.

--- End Message ---
--- Begin Message ---
Quoting "Stanislav Malyshev":

> Hi!
>
>> Yes. Why can't it be IntlCollator in both 5.x and 6.x? In other words
>> (unless I'm missing something) compatibility between 5.x and 6.x doesn't
>> affect the naming choice.
>
> The thing is that having collators in PHP called IntlCollator and locales in PHP called IntlLocale sucks. It looks so awkward and artificial.

What kind of attitude is that? You personally think it sucks, so it's not going to happen? It's not artificial or awkward at all, it's poor man's namespacing that we use in PHP since ages, because we don't have namespacing yet. Everybody is using it, especially in PHP internally.

>> I'm advocating uniformity for the sake of usability :-). (for reasons
>
> I don't see how IntlLocale is more usable than Locale. If you'd be total stranger to all the things discussed here and I'd tell you - there's a thing called "locale", what do you think would be the name of the class that works with it - would you answer "Locale" or "IntlLocale"?

If I would be a stranger, I wouldn't try guessing class names at all. Instead I would look into the manual. And I would easily spot that in the manual: extension == manual chapter == function/class prefix. THAT is usability.

>> already given). (I'm a freelance programmer, working on real-world
>> i18n-related code, not an ivory tower academic.)
>
> OK, great - and you'd prefer all your functions and classes to have prefix Intl because of that? I know I wouldn't. I would like to have class named Locale to work with locales.

Maybe because you DO live in the ivory tower and have the luxury to be able to start each project from scratch with the newest and freshest code. Nobody else does. Anybody in the real-world is dealing with large portions of existing code, slowly integrating new features where they see fit. If anybody who is currently using i18n in his code (and only those are interested in ext/intl anyway), wants to use functionality from ext/intl, chances are high they have to rewrite their whole code, because they happen to use Collator or Locale already. Not very unlikely in i18n related code, isn't it? This is not different from trying to introduce the Date class a few years back.

Jan.

--
Do you need professional PHP or Horde consulting?
http://horde.org/consulting/


--- End Message ---
--- Begin Message ---
1) I don't see any reason for name calling or personal attacks. 
Besides, Stas's ivory tower is more of a penthouse than a tower.

2) I don't understand your argument that it takes a major code rewrite to use
the collator. That has not been my experience.
Why do you make that claim, or what did you have to do to introduce the Date class
that is relevant?

tex

> -----Original Message-----
> From: Jan Schneider [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, June 10, 2008 3:33 AM
> To: [EMAIL PROTECTED]
> Subject: Re: [PHP-I18N] intl 1.0.0RC1
> 
> Zitat von "Stanislav Malyshev" <[EMAIL PROTECTED]>:
> 
> > Hi!
> >
> >> Yes. Why can't it be IntlCollator in both 5.x and 6.x? In 
> other words 
> >> (unless I'm missing something) compatibility between 5.x and 6.x 
> >> doesn't affect the naming choice.
> >
> > The thing is that having collators in PHP called IntlCollator and 
> > locales in PHP called IntlLocale sucks. It looks so awkward and 
> > artificial.
> 
> What kind of attitude is that? You personally think it sucks, 
> so it's not going to happen?
> It's not artificial or awkward at all, it's poor man's 
> namespacing that we use in PHP since ages, because we don't 
> have namespacing yet.  
> Everybody is using it, especially in PHP internally.
> 
> >> I'm advocating uniformity for the sake of usability :-). 
> (for reasons
> >
> > I don't see how IntlLocale is more usable than Locale. If you'd be 
> > total stranger to all the things discussed here and I'd tell you - 
> > there's a thing called "locale", what do you think would be 
> the name 
> > of the class that works with it - would you answer "Locale" or 
> > "IntlLocale"?
> 
> If I would be a stranger, I wouldn't try guessing class names 
> at all.  
> Instead I would look into the manual. And I would easily spot 
> that in the manual: extension == manual chapter == 
> function/class prefix. THAT is usability.
> 
> >> already given). (I'm a freelance programmer, working on real-world 
> >> i18n-related code, not an ivory tower academic.)
> >
> > OK, great - and you'd prefer all your functions and classes to have 
> > prefix Intl because of that? I know I wouldn't. I would 
> like to have 
> > class named Locale to work with locales.
> 
> Maybe because you DO live in the ivory tower and have the 
> luxury to be able to start each project from scratch with the 
> newest and freshest code.
> Nobody else does. Anybody in the real-world is dealing with 
> large portions of existing code, slowly integrating new 
> features where see fit.
> If anybody who is currently using i18n in his code (and only 
> those are interested in ext/intl anyway), wants to use 
> functionality from ext/intl, chances are high they have to 
> rewrite their whole code, because they happen to use Collator 
> or Locale already. Not very unlikely in i18n related code, 
> isn't it? This is not different from trying introduce the 
> Date class a few years back.
> 
> Jan.

--- End Message ---
--- Begin Message ---
On 10.06.2008 at 09:02, Stanislav Malyshev wrote:

> Hi!
>
>> What are the good reasons for not changing the names?
>
> This was already discussed on the list, so reading the archives would help. But OK, here it goes again:
>
> 1. We want the extension to be in 5.2 to 6.x

I don't. Maybe you do. Or, rather, Zend does. So the Zend Framework can use it, I guess.

The reasonable and responsible thing to do would be making it 5.3+ only.


David

--- End Message ---
--- Begin Message ---
Hi!

> What kind of attitude is that? You personally think it sucks, so it's not going to happen?

Not only myself, but myself too. Why do you think that your opinion matters but mine and everybody's that agrees with me doesn't?

> It's not artificial or awkward at all, it's poor man's namespacing that

Poor man's namespacing is not exactly what we want. I just call you to think wider than narrow focus on "all names should start with the same letter".

> we use in PHP since ages, because we don't have namespacing yet. Everybody is using it, especially in PHP internally.

No, not everybody. Exception class is not called StandardException, and ArrayIterator is not called SplArrayIterator, Directory is not called StandardDirectory, etc.

> Maybe because you DO live in the ivory tower and have the luxury to be

I'm with PHP since 1998 at least, maybe earlier (hard to remember 10 years back). I don't think it's especially smart to try and accuse me in not knowing what happens with real-world projects in PHP.

> ext/intl, chances are high they have to rewrite their whole code, because they happen to use Collator or Locale already. Not very unlikely

And they should. If they are going to use ICU extension, why would they need to use their own Collator? It makes little sense to both have your own collators and ICU collator and use those in the same application.
--
Stanislav Malyshev, Zend Software Architect

--- End Message ---
--- Begin Message ---
It wasn't Zend. Other companies wanted it in 5.2.
tex 

> -----Original Message-----
> From: David Zülke [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, June 10, 2008 4:14 AM
> To: Stanislav Malyshev
> Cc: Darren Cook; [EMAIL PROTECTED]
> Subject: Re: [PHP-I18N] intl 1.0.0RC1
> 
> On 10.06.2008 at 09:02, Stanislav Malyshev wrote:
> 
> > Hi!
> >
> >> What are the good reasons for not changing the names?
> >
> > This was already discussed on the list, so reading the 
> archives would 
> > help. But OK, here it goes again:
> >
> > 1. We want the extension to be in 5.2 to 6.x
> 
> I don't. Maybe you do. Or, rather, Zend does. So the Zend 
> Framework can use it, I guess.
> 
> The reasonable and responsible thing to do would be making it 
> 5.3+ only.
> 
> 
> David

--- End Message ---
--- Begin Message ---
Hi!

> I don't. Maybe you do. Or, rather, Zend does. So the Zend Framework can use it, I guess.

Do you seriously argue that I have to redesign the extension, throwing out all the original requirements for all the participating parties in the project and all people who contributed to it and already use it - and make it only what *you* personally want and need? Please tell me you were joking.
--
Stanislav Malyshev, Zend Software Architect

--- End Message ---
--- Begin Message ---
I like Guillaume's proposal - use Collator and Locale as is for 5.2, make them Intl::Collator and Intl::Locale when namespaces are available, i.e., from 5.3.

Would that work for everybody? How much effort is it to implement this?

Norbert


On Jun 10, 2008, at 02:44, Guillaume Rossolini wrote:

> Hi,
>
> Here's my user point of view, if you don't mind.
>
> I think your argument is about namespaces, which will be in 6 but not
> 5.2... So you might want to have IntlCollator and IntlLocale in 5.2 to avoid
> possible collisions, but you would rather have something like Intl::Collator
> and Intl::Locale from 5.3 and 6 onward. I guess the question is: can you
> bear with Collator and Locale until we have namespaces?  I know I can.
>
> Guillaume Rossolini


-------------------------------------
Norbert Lindenberg
Yahoo! Internationalization Architect



--- End Message ---
--- Begin Message ---
> Does anyone know anything about transliteration in PHP6. I have noticed
> that PHP has an ICU extension. ICU has a very comprehensive
> transliteration/transform module that is not documented.

It is documented here:
  http://www.icu-project.org/userguide/Transform.html

(But I don't think Transform is in the PHP intl extension?)

No Arabic support, which is the transliteration code I'm working on at
the moment (in native PHP; it'll be in the next (MIT open-source) fclib
library release).

I'm also not sure the Japanese one will be useful, as it sounds like
they do things slightly differently from normal Hepburn romaji to allow
the conversions to be reversible. (which also suggests they don't
transliterate the katakana long vowel but keep it as a hyphen??)

> Currently I am using iconv and a PECL extension to transliterate, but
> they are neither comprehensive nor widely supported.

Which languages are you trying to transliterate for?

Darren



--- End Message ---
--- Begin Message ---
Hi!

> It is documented here:
>   http://www.icu-project.org/userguide/Transform.html
>
> (But I don't think Transform is in the PHP intl extension?)

No, not yet.
--
Stanislav Malyshev, Zend Software Architect

--- End Message ---
--- Begin Message ---
> Thanks for your very informative reply, Darren. I guess that maybe
> PHP6 has implemented this from ICU. I was told by a PECL developer
> that there is something in PHP6 but he didn't elaborate.

The intl extension:
  http://pecl.php.net/package/intl/
You can use it from PHP 5.2.4 onwards (or 5.2.3 with some
modifications). Also see php|a magazine, Mar 2008.

> The one I am using at the moment is: 
> http://derickrethans.nl/translit.php

Thanks, I'd not heard of that. The Chinese conversion seems to be done
by a huge lookup table, which is interesting.

> Your work sounds interesting. I have downloaded your library, but am 
> having trouble navigating through it.

Yes, fclib is quite informal :-).

> What files should I be looking at for the transliteration?

utf8.inc, e.g. fclib_katakana_to_hepburn_romaji().
See also my articles in php|a, Aug and Sep 2007.

> I would like to be able to transliterate absolutely everything in 
> unicode. I have no idea if that is unreasonable as I am just getting 
> into character sets. I want them to make a bulletproof string to url 
> function for search engine friendliness and I also believe it is not 
> really a good thing to have high unicode in the url. For example
> 
> Héllo Thìs is a URL Ælfred => hello-this-is-a-url-aelfred

If URLs are the only concern I think I'd do this using urlencode(). What
does a transliteration approach gain you?
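
For comparison, here is a rough sketch of the two approaches on the example string. The helper name slug_sketch() and its tiny lookup table are made up for illustration and cover only the characters in the example; a real table would need to be far larger. It assumes the source file and the input are UTF-8.

```php
<?php
// Hypothetical helper: the lookup table is a tiny illustrative subset.
function slug_sketch($text)
{
    $map = array(
        'é' => 'e', 'è' => 'e', 'ì' => 'i', 'í' => 'i',
        'Æ' => 'Ae', 'æ' => 'ae',
    );
    // Replace known accented characters with ASCII equivalents.
    $ascii = strtolower(strtr($text, $map));
    // Collapse every run of other bytes (spaces, punctuation, and any
    // unmapped non-ASCII bytes) into a single hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $ascii);
    return trim($slug, '-');
}

echo slug_sketch('Héllo Thìs is a URL Ælfred'), "\n"; // hello-this-is-a-url-aelfred

// urlencode() keeps the original characters but percent-encodes their bytes.
echo urlencode('Héllo'), "\n"; // H%C3%A9llo
```

The transliterated form stays readable in the address bar, while the urlencode() form is lossless but harder on the eye, which is presumably the trade-off behind the question.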

> Another thing that I started working on was a strtoupper, strtolower
> and ucfirst function for cyrillic and anything else that can be upper
> and lower case. However, being new to character set and unicode I am
> having trouble converting the hex codes to actual character and
> cannot get preg_replace to work with high unicode.

See fclib_utf8_chr() and uniord() in utf8.inc, which are UTF-8 versions
of PHP's chr() and ord() functions.
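
As an illustration of the "hex code to actual character" step, here is a minimal sketch of a UTF-8 chr(). It is not the fclib code; utf8_chr_sketch() is a made-up name, and it does no validation of surrogates or out-of-range values. The upper-casing line assumes the mbstring extension is installed.

```php
<?php
// Hypothetical helper (not fclib's implementation): encode one Unicode
// code point as a UTF-8 byte string using the standard bit layout.
function utf8_chr_sketch($cp)
{
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6))
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

echo bin2hex(utf8_chr_sketch(0x0628)), "\n"; // d8a8 (Arabic BEH)
echo bin2hex(utf8_chr_sketch(0x0416)), "\n"; // d096 (Cyrillic capital ZHE)

// Upper-casing Cyrillic small ZHE (U+0436), only if mbstring is available.
if (function_exists('mb_strtoupper')) {
    echo bin2hex(mb_strtoupper("\xD0\xB6", 'UTF-8')), "\n"; // d096
}
```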

I'm not sure about using preg as I'm not sure I've done it that way. The
manual http://jp2.php.net/manual/en/regexp.reference.php has a section
on unicode, but still doesn't seem to support giving a 4-character hex
code. Perhaps you just use \x twice in a row? E.g.
  \x06\x28
to match U+0628 (Arabic BEH).
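
A quick check of that guess, as a sketch: PCRE in PHP accepts the \x{hhhh} form for a code point when the pattern carries the /u (UTF-8) modifier, whereas \x06\x28 is read as the two separate bytes 0x06 and 0x28. The sample string below is made up and is written with byte escapes so it does not depend on the file encoding.

```php
<?php
// UTF-8 bytes for the three code points U+0628 U+0627 U+0628.
$text = "\xD8\xA8\xD8\xA7\xD8\xA8";

// With the /u modifier, \x{0628} names the code point U+0628.
$hits = preg_match_all('/\x{0628}/u', $text, $matches);
echo $hits, "\n"; // 2

// Without /u, \x06\x28 means the two bytes 0x06 and 0x28, which do not
// occur in the UTF-8 encoding of U+0628 (0xD8 0xA8).
$byteHits = preg_match_all('/\x06\x28/', $text, $ignored);
echo $byteHits, "\n"; // 0
```

So for UTF-8 subjects the braces form with /u looks like the one to use, assuming PCRE was built with UTF-8 support.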

Darren




--- End Message ---
--- Begin Message ---
The full text transformation support is not there yet, but there is a simple transliteration function - str_transliterate().

-Andrei

Darren Cook wrote:
>> Does anyone know anything about transliteration in PHP6. I have noticed
>> that PHP has an ICU extension. ICU has a very comprehensive
>> transliteration/transform module that is not documented.
>
> It is documented here:
>   http://www.icu-project.org/userguide/Transform.html
>
> (But I don't think Transform is in the PHP intl extension?)
>
> No Arabic support, which is the transliteration code I'm working on at
> the moment (in native PHP; it'll be in the next (MIT open-source) fclib
> library release).
>
> I'm also not sure the Japanese one will be useful, as it sounds like
> they do things slightly differently from normal Hepburn romaji to allow
> the conversions to be reversible. (which also suggests they don't
> transliterate the katakana long vowel but keep it as a hyphen??)
>
>> Currently I am using iconv and a PECL extension to transliterate, but
>> they are neither comprehensive nor widely supported.
>
> Which languages are you trying to transliterate for?
>
> Darren



--- End Message ---
