php-i18n Digest 24 Mar 2006 03:18:19 -0000 Issue 321
Topics (messages 987 through 991):
setlocale returns FALSE sometimes
987 by: Patrick Savelberg
988 by: Patrick Savelberg
Re: is_string()
989 by: M. Sokolewicz
990 by: Andrei Zmievski
Re: Hash api change
991 by: Andi Gutmans
Administrivia:
To subscribe to the digest, e-mail:
[EMAIL PROTECTED]
To unsubscribe from the digest, e-mail:
[EMAIL PROTECTED]
To post to the list, e-mail:
[email protected]
----------------------------------------------------------------------
--- Begin Message ---
Hi list,
I have a few locales installed on my system:
locale -a
C
POSIX
dutch
nl_BE
nl_BE.iso88591
[EMAIL PROTECTED]
nl_BE.utf8
[EMAIL PROTECTED]
[EMAIL PROTECTED]
nl_NL
nl_NL.iso88591
[EMAIL PROTECTED]
nl_NL.utf8
[EMAIL PROTECTED]
[EMAIL PROTECTED]
In my PHP code I have the following lines:
if (!setlocale(LC_TIME, 'nl_NL')) {
    die("Failed setting locale");
}
Now when I load this script a few times, sometimes I get the message "Failed
setting locale", but not every time.
Why does this happen or what can I do to overcome this?
--- End Message ---
--- Begin Message ---
Pierre wrote:
> On Wed, 15 Mar 2006 12:26:53 -0800
> [EMAIL PROTECTED] (Andrei Zmievski) wrote:
>> I guess I can live with this. It's just a little strange that we have
>> only 2 string types, but 3 casts..
> I find that confusing too. Given the commit log, it is even more
> confusing:
> - Updated is_string():
>   If Unicode semantics is turned on, return "true" for Unicode strings
>   only. If Unicode semantics is turned off, return "true" for native
>   strings only.
> It makes is_string basically useless. I can imagine two solutions:
> - keep it and always return true (unicode and binary strings are
>   strings anyway)
> - deprecate it in php 6.0; it will still return true. At least
>   people will know that they should use the new functions
> There is maybe a third solution, but certainly too drastic: remove
> is_string :)
> --Pierre
I think, to keep BC, it would be best to do as you say in point #1:
- is_string() returns true on unicode, native AND binary strings
- is_unicode() returns true only on unicode strings
- is_binary() returns true only on binary strings
which would mean that string = unicode|binary.
- tul
--- End Message ---
--- Begin Message ---
> I find that confusing too. Given the commit log, it is even more
> confusing:
> - Updated is_string():
>   If Unicode semantics is turned on, return "true" for Unicode strings
>   only. If Unicode semantics is turned off, return "true" for native
>   strings only.
> It makes is_string basically useless. I can imagine two solutions:
> - keep it and always return true (unicode and binary strings are
>   strings anyway)
You have a good point here. is_string() should treat both types as
the "string" type.
> - deprecate it in php 6.0; it will still return true. At least
>   people will know that they should use the new functions
Ideally we'd have is_text() that would return true for both and
deprecate is_string(), but that seems like a big change for not a
whole lot of pay-off.
-Andrei
--- End Message ---
--- Begin Message ---
Generally speaking I think this would be a good move.
So what's the more common case? Key buckets or integer buckets? I'd
guess the former, so probably going down Dmitry's route and keeping
the [1] would make most sense (and I'm trying to stay unbiased even
though I did that :)
But in any case, I think the rest of the change is the more important
one and positive.
Andi
At 01:02 AM 3/13/2006, Marcus Boerger wrote:
Hello Dmitry,
Monday, March 13, 2006, 7:43:05 AM, you wrote:
> Hi Marcus,
> I agree that we can allocate the Bucket and the string in one block.
> But we will need to initialize the pointer in Bucket.key.zstr and then
> dereference it every time we access the string part of the key.
> This will be a slowdown.
It is a simple add and store operation. This usually comes down to three
assembler ops. That is much faster than having to create a new struct for
each apply function's callback of which we have a few.
> Of course we can access it directly as arKey, but for what reason do we
> need this change?
I didn't say that hash internal operations should use it. But the apply
stuff can simply bypass the address of the zend_hash_key if we go with that
layout.
> Also each bucket will eat more RAM.
wow 4 bytes.
> One more thing: having "h" as the first field of Bucket allows faster access
> to "h" or more compact code, as the C compiler doesn't need to add an offset.
On an X86 this is only a single byte more in the assembler opcodes.
> I didn't understand what you mean by "not alignable memory block in the
> integer index case".
In the current struct we force the string container to hold at least one
byte even if we don't use it. That means that three bytes are unused.
If the compiler doesn't automatically align the data, the next block is
unaligned. Also, if we don't force the use of any byte, you get back the
4 bytes I added above in the case of integer indices.
best regards
marcus
>> -----Original Message-----
>> From: Marcus Boerger [mailto:[EMAIL PROTECTED]
>> Sent: Sunday, March 12, 2006 1:59 PM
>> To: Dmitry Stogov
>> Cc: [email protected]
>> Subject: Re: [PHP-I18N] Hash api change
>>
>>
>> Hello Dmitry,
>>
>> you are right about the string allocation, we have this in
>> 4 and 5 as well. But a minor change would even capture that too:
>>
>> typedef struct bucket {
>>     ulong h; /* Used for numeric indexing */
>>     uint nKeyLength;
>>     void *pData;
>>     void *pDataPtr;
>>     struct bucket *pListNext;
>>     struct bucket *pListLast;
>>     struct bucket *pNext;
>>     struct bucket *pLast;
>>     HashKey key;
>>     union {
>>         char  s[]; /* Must be last element */
>>         UChar u[]; /* Must be last element */
>>     } arKey;
>> } Bucket;
>>
>> Also you had that member being [1] though the language allows
>> it to be an incomplete type ([]). I prefer the latter
>> because that does not force us to allocate a non-alignable
>> memory block in the integer index case. If we proceed this
>> way there is no need to allocate two blocks and instead we
>> can easily fill in the pointer for the string key by a simple
>> addition:
>>
>> // for the integer index case
>> Bucket *p = pemalloc(sizeof(Bucket), persistent);
>>
>> // for the string index case: key bytes follow the struct in one block
>> Bucket *p = pemalloc(sizeof(Bucket) + key_len, persistent);
>> p->key.arKey.s = (char *)p + offsetof(Bucket, arKey);
>> memmove(p->key.arKey.s, key_value, key_len);
>>
>> My opinion here is anyway that having to deal with only one
>> struct is much easier and also faster. Being able to change
>> the malloc to a single pointer add operation is a nice addon.
>>
>> best regards
>> marcus
>>
>> Sunday, March 12, 2006, 11:11:32 AM, you wrote:
>>
>> > Hi Marcus,
>>
>> > The idea makes sense, but note that this modification will need two
>> > malloc() calls for each bucket with a string index: the first for the
>> > bucket itself and the second for "arKey" (now they are done in one call).
>> > Also we will lose some performance in dereferencing the pointer to arKey
>> > in each ZendHash operation.
>>
>> > Thanks. Dmitry.
>>
>>
>> >> -----Original Message-----
>> >> From: Marcus Boerger [mailto:[EMAIL PROTECTED]
>> >> Sent: Saturday, March 11, 2006 3:22 PM
>> >> To: [email protected]
>> >> Subject: [PHP-I18N] Hash api change
>> >>
>> >>
>> >> Hello php-i18n,
>> >>
>> >> i think we should change the api a bit:
>> >>
>> >> From:
>> >>
>> >> typedef struct _key {
>> >>     zend_uchar type;
>> >>     union {
>> >>         char  s[1]; /* Must be last element */
>> >>         UChar u[1]; /* Must be last element */
>> >>     } arKey;
>> >> } HashKey;
>> >>
>> >> typedef struct bucket {
>> >>     ulong h; /* Used for numeric indexing */
>> >>     uint nKeyLength;
>> >>     void *pData;
>> >>     void *pDataPtr;
>> >>     struct bucket *pListNext;
>> >>     struct bucket *pListLast;
>> >>     struct bucket *pNext;
>> >>     struct bucket *pLast;
>> >>     HashKey key; /* Must be last element */
>> >> } Bucket;
>> >>
>> >> To:
>> >>
>> >> typedef struct _key {
>> >>     ulong h; /* Used for numeric indexing */
>> >>     uint nKeyLength;
>> >>     zend_uchar type;
>> >>     zstr arKey; /* Must be last element */
>> >> } HashKey;
>> >>
>> >> typedef struct bucket {
>> >>     void *pData;
>> >>     void *pDataPtr;
>> >>     struct bucket *pListNext;
>> >>     struct bucket *pListLast;
>> >>     struct bucket *pNext;
>> >>     struct bucket *pLast;
>> >>     HashKey key; /* Must be last element */
>> >> } Bucket;
>> >>
>> >> So now HashKey matches zend_hash_key just by pure reordering.
>> >>
>> >> Also we should probably change the apply_with_arguments stuff
>> >> to pass the tsrm key. That is we'd change from:
>> >>
>> >> typedef int (*apply_func_args_t)(void *pDest, int num_args,
>> >>     va_list args, zend_hash_key *hash_key);
>> >> ZEND_API void zend_hash_apply_with_arguments(HashTable *ht,
>> >>     apply_func_args_t apply_func, int, ...);
>> >>
>> >>
>> >> To:
>> >>
>> >> typedef int (*apply_func_args_t)(void *pDest TSRMLS_DC,
>> >>     int num_args, va_list args, zend_hash_key *hash_key);
>> >> ZEND_API void zend_hash_apply_with_arguments(HashTable *ht TSRMLS_DC,
>> >>     apply_func_args_t apply_func, int, ...);
>> >>
>> >> Furthermore, our add_assoc functions currently only allow
>> >> handling of native string indexes. This should be changed to
>> >> allow unicode indexes as well:
>> >>
>> >> ZEND_API int add_assoc_long_ex(zval *arg, char *key, uint
>> >> key_len, long n);
>> >>
>> >> To:
>> >>
>> >> ZEND_API int add_assoc_long_ex(zval *arg, zend_uchar type,
>> >> zst *key, uint key_len, long n);
>> >>
>> >> Though it may be worth considering just adding more functions,
>> >> I don't think flooding the API is a good idea, and I prefer to
>> >> have only one version that can easily deal with both.
--
PHP Unicode & I18N Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
--- End Message ---