php-i18n Digest 2 Mar 2006 17:47:26 -0000 Issue 315

Topics (messages 951 through 956):

Re: Code updates?
        951 by: Andrei Zmievski
        952 by: l0t3k
        953 by: l0t3k
        954 by: Andrei Zmievski
        955 by: l0t3k

Re: [PHP-DEV] Unicode string literals and casting
        956 by: Andrei Zmievski

----------------------------------------------------------------------
--- Begin Message ---
Never mind, I seem to have found it at http://cvs.iworks.at/cvs.php/php-i18n.

-Andrei

On Feb 27, 2006, at 10:45 AM, Andrei Zmievski wrote:

Clayton,

Can you please make your code available somewhere so we can take a look at it?

-Andrei

--
PHP Unicode & I18N Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

--- End Message ---
--- Begin Message ---
Andrei,
  keep in mind that this is still very alpha level. It's a reconstruction of
the most recent backup i could find after my machine crashed last year.

Clayton

"Andrei Zmievski" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Never mind, I seem to have found it at 
> http://cvs.iworks.at/cvs.php/php-i18n.
>
> -Andrei
>
> On Feb 27, 2006, at 10:45 AM, Andrei Zmievski wrote:
>
>> Clayton,
>>
>> Can you please make your code available somewhere so we can take a look 
>> at it?
>>
>> -Andrei
>>
>> -- 
>> PHP Unicode & I18N Mailing List (http://www.php.net/)
>> To unsubscribe, visit: http://www.php.net/unsub.php 

--- End Message ---
--- Begin Message ---
Okay. I do want to mention that your code seems to be very object oriented and has a complete separate class hierarchy (starting with I18NBase?). I am not sure that we can use it as is in PHP. I think we discussed making most of the ICU API procedural, with some OO parts. At the very least we need a thorough review before we decide on anything.

-Andrei

On Feb 27, 2006, at 2:08 PM, l0t3k wrote:

Andrei,
keep in mind that this is still very alpha level. It's a reconstruction of
the most recent backup i could find after my machine crashed last year.

Clayton


--- End Message ---
--- Begin Message ---
"Andrei Zmievski" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Okay. I do want to mention that your code seems to be very object oriented 
> and has a complete separate class hierarchy (starting with I18NBase?). I 
> am not sure that we can use it as is in PHP. I think we discussed making 
> most of the ICU API procedural, with some OO parts. At the very least we 
> need a thorough review before we decide on anything.

Not a problem. When i started, my goal was to adhere to the ICU C++ API, 
mainly for documentation and Java compatibility reasons (there's lots of 
code out there that would be easily adaptable).

I personally prefer the OOP approach, but i'll work on whatever is decided.

BTW - the I18NBase thing was simply a way of dealing with some nested object 
lifetime management issues, which i've found an easier way to resolve.

Clayton


> -Andrei
>
> On Feb 27, 2006, at 2:08 PM, l0t3k wrote:
>
>> Andrei,
>>   keep in mind that this is still very alpha level. It's a reconstruction of
>> the most recent backup i could find after my machine crashed last year.
>>
>> Clayton
>> 

--- End Message ---
--- Begin Message ---
[moving the discussion to php-i18n list]

Will a program always be able to change the runtime_encoding setting?

Yes.

Some hosts like to lock off everything and disable ini_set etc. If the host has hardlocked it at something terrible, can my portable program still declare that it needs to work with UTF-8?

You can change anything but unicode_semantics from your own .ini file or from within the script (using the declare() pragma instead of script_encoding).

Which brings to mind; if the input in $_REQUEST etc has been misconverted by a bad setting, how do I get at the unconverted data to fix it? The (outdated ;) README says this will be possible but I didn't see any reference to how.

Yes, that part has not been implemented yet.

I do find the FATAL ERRORS on using the 'wrong' string type a bit odd though; most other types in PHP will coerce silently (string . int), and the wildly incompatible ones usually cause mere NOTICE or WARNING-level messages.

Was this change from PHP's regular behavior a conscious decision to make people think harder about what kind of strings they're using? From the original design document I got the impression that it was meant to be specific to special binary-only strings, which would be used relatively rarely (eg for binary file I/O) while more typical strings would transparently "just work" most of the time. Now the binary strings have replaced the native strings and the whole behavior has changed.

The only difference between binary and native strings in the original design was that binary ones did not participate in implicit or some explicit conversions. Now that these two types have been conflated, we may have to adjust the semantics, which is why I proposed that casting operators (explicit conversions) always work. We could make implicit conversions work also if we work out the details, such as what encoding to use for converting binary strings to Unicode (script or runtime). I kind of like your idea of allowing only ASCII characters in binary string literals.
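
The encoding question raised here can be illustrated outside PHP. The following is a minimal Python sketch, not PHP 6 code: the byte values, the two encodings, and the helper name is_ascii_literal are assumptions chosen for illustration. It shows that non-ASCII bytes decode to different text depending on the assumed encoding, while ASCII-only bytes decode identically under ASCII-compatible encodings such as UTF-8 and Latin-1, which is one reason an ASCII-only rule for binary literals may be attractive.

```python
# Illustrative only: byte values and encodings are assumptions for this example.
raw = b"caf\xc3\xa9"  # the UTF-8 byte sequence for "café"

# The same bytes produce different text under different assumed encodings.
as_utf8 = raw.decode("utf-8")      # four characters, ending in e-acute
as_latin1 = raw.decode("latin-1")  # five characters, the last two are mojibake

# ASCII-only bytes decode to the same text under both encodings.
ascii_only = b"cafe"
same = ascii_only.decode("utf-8") == ascii_only.decode("latin-1")


def is_ascii_literal(data: bytes) -> bool:
    """Hypothetical check: True if every byte is in the 7-bit ASCII range."""
    return all(byte < 0x80 for byte in data)


print(len(as_utf8), len(as_latin1), same)
print(is_ascii_literal(ascii_only), is_ascii_literal(raw))
```

A literal that passes such a check can be converted implicitly without choosing between script and runtime encoding, because the result does not depend on that choice.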

(A comparison with other languages; Python is normally very strict about typing and won't even let you concatenate a string with an integer without an explicit conversion. But it will let you concatenate a byte string with a Unicode string, with an automatic coercion to Unicode.)
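
For readers comparing against current Python: the behaviour described above matches Python 2, which was current in 2006. Python 3 still rejects string + integer, and it also rejects byte string + text string, so an explicit decode is required. A minimal sketch, assuming Python 3; the helper name try_concat is hypothetical.

```python
def try_concat(left, right):
    """Return left + right, or None if Python refuses to mix the two types."""
    try:
        return left + right
    except TypeError:
        return None


# A string and an integer cannot be concatenated without an explicit conversion.
print(try_concat("n=", 1))
print(try_concat("n=", str(1)))

# In Python 3, bytes and str do not mix either; decode the bytes explicitly.
print(try_concat(b"abc", "def"))
print(try_concat(b"abc".decode("ascii"), "def"))
```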

I guess they don't worry about script encoding?

Personally I have no use for non-ASCII identifiers.

Anything that needs to get used for referring to identifiers, though, needs to be able to operate consistently in some fashion...
* array_map("some_function_name", $data);
* $GLOBALS["myConfigVar"] = $newval;
etc

These probably need to either 'just work' when passed the other kind of string, or have some kind of consistent cast available.

Yes, of course. One solution would be to make our class/function tables always be Unicode instead of depending on the unicode_semantics switch. But that might slow down the non-Unicode mode.

(Life would be a lot simpler if there weren't two different modes, of course. :)


Absolutely. Derick, Rasmus, and I have been discussing the possibility of having only one mode. The main issue is performance (portability issues should not be that great) so Derick was going to run some tests and see how much slower PHP 6 in unicode mode really is compared to non-unicode mode and to 5.1.

-Andrei

--- End Message ---