php-i18n Digest 27 Feb 2006 18:45:26 -0000 Issue 314
Topics (messages 948 through 950):
Re: Ideas for a portable string api
948 by: Tex Texin
949 by: Dmitry Stogov
Code updates?
950 by: Andrei Zmievski
----------------------------------------------------------------------
--- Begin Message ---
I wouldn't change it for string length.
For representing characters, there were subtle problems on 64-bit
architectures when the int size changed. That is why ICU went with fixing
the int size at 32 bits, if I recall correctly, not to mention the waste of
memory.
Tex Texin
Internationalization Architect, Yahoo! Inc.
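
To illustrate the point about fixed-width types: the width of plain int and long is implementation-defined, while int32_t from <stdint.h> is exactly 32 bits wherever it is provided. The following is only a minimal sketch; the names code_point_t, code_point_bits and code_point_in_range are hypothetical and are not taken from ICU or PHP.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical alias for illustration only; ICU's own code point type is UChar32. */
typedef int32_t code_point_t;

/* Width of the code point type in bits. Because int32_t is an exact-width
 * type, this is 32 on 32-bit and 64-bit targets alike. */
int code_point_bits(void)
{
    int bits = (int)(sizeof(code_point_t) * CHAR_BIT);
    assert(bits == 32);
    return bits;
}

/* Returns 1 if cp lies in the Unicode code point range U+0000..U+10FFFF. */
int code_point_in_range(code_point_t cp)
{
    return cp >= 0 && cp <= 0x10FFFF;
}
```

A type whose width follows the platform, such as long, would give different storage sizes on different ABIs, and a 64-bit type would spend twice the memory per character for no extra range.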
> -----Original Message-----
> From: Marcus Boerger [mailto:[EMAIL PROTECTED]
> Sent: Thursday, February 23, 2006 1:25 AM
> To: Andi Gutmans
> Cc: Dmitry Stogov; 'Andrei Zmievski'; [email protected]
> Subject: Re: [PHP-I18N] Ideas for a portable string api
>
>
> Hello Andi,
>
> Yes, I don't see a reason for a change either.
>
> marcus
>
> Thursday, February 23, 2006, 3:02:46 AM, you wrote:
>
> > I think int is fine for this and we don't need to be super-accurate
> > on the unsigned for string length. This is how we work today and it's
> > a big change for no real gain.
>
> > At 01:15 AM 2/21/2006, Marcus Boerger wrote:
> >>Hello Dmitry,
> >>
> >>Tuesday, February 21, 2006, 7:45:17 AM, you wrote:
> >>
> >> > I don't like int32_t because
> >>
> >> > 1) it is defined in ICU.
> >>
> >> int32_t is defined in <stdint.h> by ISO C99.
> >>
> >> > 2) I will need to rewrite a lot of existing functions to use
> >> > int32_t instead of int.
> >>
> >>3) Both int and int32_t are wrong. The only correct choice would be
> >>size_t.
> >>
> >>best regards
> >>marcus
> >>
> >> > Thanks. Dmitry.
> >>
> >> >> -----Original Message-----
> >> >> From: Andi Gutmans [mailto:[EMAIL PROTECTED]
> >> >> Sent: Tuesday, February 21, 2006 9:26 AM
> >> >> To: Andrei Zmievski; Dmitry Stogov
> >> >> Cc: [email protected]
> >> >> Subject: Re: [PHP-I18N] Ideas for a portable string api
> >> >>
> >> >>
> >> >> Is there a reason why?
> >> >> We usually just use int/uint almost anywhere... Then again I don't
> >> >> really care because 32bit should be plenty (famous last
> >> >> words) for strings....
> >> >>
> >> >> Andi
> >> >>
> >> >> At 10:16 PM 2/20/2006, Andrei Zmievski wrote:
> >> >> >I prefer to use fixed integer type, int32_t.
> >> >> >
> >> >> >-Andrei
> >> >> >
> >> >> >
> >> >> >On Feb 20, 2006, at 11:11 AM, Dmitry Stogov wrote:
> >> >> >
> >> >> >>Hi Andrei,
> >> >> >>
> >> >> >>We decided to use the same type for str.len (now int) and
> >> >> >>ustr.len (now int32_t, which comes from ICU).
> >> >> >>
> >> >> >>I prefer to make both of them int.
> >> >> >>Any objections? I plan to do it on Thursday.
> >> >> >>
> >> >> >>Thanks. Dmitry.
> >> >> >>
> >> >> >>>-----Original Message-----
> >> >> >>>From: Dmitry Stogov [mailto:[EMAIL PROTECTED]
> >> >> >>>Sent: Thursday, February 16, 2006 12:12 PM
> >> >> >>>To: [email protected]
> >> >> >>>Subject: RE: [PHP-I18N] Ideas for a portable string api
> >> >> >>>
> >> >> >>>
> >> >> >>>Hi,
> >> >> >>>
> >> >> >>>After reviewing Marcus's ideas, some experiments, and speaking
> >> >> >>>with Andrei, I propose the following solutions:
> >> >> >>>
> >> >> >>>1) We will not use any kind of Unicode literals in C code (no L"foo",
> >> >> >>>no "f\0o\0o\0\0"), because L"" is not portable and "f\0.." looks too
> >> >> >>>ugly.
> >> >> >>>
> >> >> >>>2) We will change the "zval" structure to make "zval.value.str.len"
> >> >> >>>and "zval.value.ustr.len" the same type. This will allow us to
> >> >> >>>optimize the Z_UNISTR() and Z_UNILEN() macros. They will become
> >> >> >>>
> >> >> >>>#define Z_UNISTR(z) ((void*)(Z_STRVAL(z)))
> >> >> >>>#define Z_UNILEN(z) (Z_STRLEN(z))
> >> >> >>>
> >> >> >>>Instead of
> >> >> >>>
> >> >> >>>#define Z_UNISTR(z) Z_TYPE(z)==IS_UNICODE?(char*)Z_USTRVAL(z):Z_STRVAL(z)
> >> >> >>>#define Z_UNILEN(z) Z_TYPE(z)==IS_UNICODE?(int)Z_USTRLEN(z):Z_STRLEN(z)
> >> >> >>>
> >> >> >>>3) I don't want to break source compatibility by modifying the
> >> >> >>>"zval" layout as Marcus suggested. We will pass string/unicode values
> >> >> >>>in nearly the same way as we do today, as three values: zend_uchar type,
> >> >> >>>void* str, int len. But we will create the following set of macros
> >> >> >>>to do it with less overhead.
> >> >> >>>
> >> >> >>>#define S_TYPE(x) _type_##x
> >> >> >>>#define S_UNIVAL(x) _val_##x
> >> >> >>>#define S_UNILEN(x) _len_##x
> >> >> >>>#define S_STRVAL(x) ((char*)S_UNIVAL(x))
> >> >> >>>#define S_USTRVAL(x) ((UChar*)S_UNIVAL(x))
> >> >> >>>#define S_STRLEN(x) S_UNILEN(x)
> >> >> >>>#define S_USTRLEN(x) S_UNILEN(x)
> >> >> >>>
> >> >> >>>#define S_ARG(x) zend_uchar S_TYPE(x), void *S_UNIVAL(x), int S_UNILEN(x)
> >> >> >>>
> >> >> >>>#define S_PASS(x) S_TYPE(x), S_UNIVAL(x), S_UNILEN(x)
> >> >> >>>
> >> >> >>>#define Z_STR_PASS(x) Z_TYPE(x), Z_UNIVAL(x), Z_UNILEN(x)
> >> >> >>>#define Z_STR_PASS_P(x) Z_TYPE_P(x), Z_UNIVAL_P(x), Z_UNILEN_P(x)
> >> >> >>>#define Z_STR_PASS_PP(x) Z_TYPE_PP(x), Z_UNIVAL_PP(x), Z_UNILEN_PP(x)
> >> >> >>>
> >> >> >>>Then most zend_u_... functions must be rewritten with these
> >> >> >>>macros.
> >> >> >>>
> >> >> >>>For example:
> >> >> >>>
> >> >> >>>ZEND_API int zend_u_lookup_class(S_ARG(name), zend_class_entry ***ce TSRMLS_DC)
> >> >> >>>{
> >> >> >>>    return zend_u_lookup_class_ex(S_PASS(name), 1, ce TSRMLS_CC);
> >> >> >>>}
> >> >> >>>
> >> >> >>>Instead of
> >> >> >>>
> >> >> >>>ZEND_API int zend_u_lookup_class(zend_uchar type, void *name, int name_length, zend_class_entry ***ce TSRMLS_DC)
> >> >> >>>{
> >> >> >>>    return zend_u_lookup_class_ex(type, name, name_length, 1, ce TSRMLS_CC);
> >> >> >>>}
> >> >> >>>
> >> >> >>>Any objections, additions?
>
> --
> PHP Unicode & I18N Mailing List (http://www.php.net/)
> To unsubscribe, visit: http://www.php.net/unsub.php
>
>
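
As an illustration of the S_ARG()/S_PASS() proposal quoted above, here is a minimal, self-contained sketch of how the token-pasting macros expand into three parameters and forward them again. Everything prefixed with sketch_ or SKETCH_ is hypothetical and stands in for engine code; the zend_uchar typedef is a local stand-in, and the 2-byte code unit size is an assumption made only for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char zend_uchar;   /* local stand-in for the engine's typedef */

/* Placeholder type tags for this sketch; not the engine's real constants. */
enum { SKETCH_IS_STRING = 1, SKETCH_IS_UNICODE = 2 };

/* Macros as proposed in the quoted message. */
#define S_TYPE(x)   _type_##x
#define S_UNIVAL(x) _val_##x
#define S_UNILEN(x) _len_##x
#define S_ARG(x)    zend_uchar S_TYPE(x), void *S_UNIVAL(x), int S_UNILEN(x)
#define S_PASS(x)   S_TYPE(x), S_UNIVAL(x), S_UNILEN(x)

/* Hypothetical worker: S_ARG(name) expands to the three parameters
 * _type_name, _val_name and _len_name. Returns the buffer size in bytes. */
int sketch_byte_size_ex(S_ARG(name), int unit_size)
{
    assert(S_UNIVAL(name) != NULL && S_UNILEN(name) >= 0);
    return S_TYPE(name) == SKETCH_IS_UNICODE
        ? S_UNILEN(name) * unit_size
        : S_UNILEN(name);
}

/* Hypothetical wrapper: S_PASS(name) forwards all three values at once. */
int sketch_byte_size(S_ARG(name))
{
    return sketch_byte_size_ex(S_PASS(name), 2 /* assumed code unit size */);
}
```

A caller still passes three separate arguments, for example sketch_byte_size(SKETCH_IS_STRING, buf, 3), so existing call sites keep compiling; only the function definitions get shorter.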
--- End Message ---
--- Begin Message ---
I have already switched from "int32_t" to "int" and fixed most places in ZE
(except zend_unicode.c), but not in the PHP extensions.
I also see that some ZE functions use "unsigned int" and "size_t" for string
length.
Thanks. Dmitry.
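
A side note on why mixing "int" with "unsigned int" or "size_t" lengths needs care: when a signed and an unsigned value are compared, C converts the signed one to unsigned first, so a negative length turns into a very large number. The sketch below is only an illustration under that rule; the function names are hypothetical and do not exist in ZE.

```c
#include <assert.h>
#include <stddef.h>

/* Naive comparison: the cast makes explicit what C does implicitly when an
 * int is compared with a size_t. A negative len wraps to a huge value. */
int less_than_naive(int len, size_t limit)
{
    return (size_t)len < limit;
}

/* Checked comparison: a negative len is smaller than any size_t, so handle
 * it before entering the unsigned domain. */
int less_than_checked(int len, size_t limit)
{
    return len < 0 || (size_t)len < limit;
}

/* Converting an int length to size_t is only meaningful when it is
 * non-negative; the assert documents that precondition. */
size_t length_to_size(int len)
{
    assert(len >= 0);
    return (size_t)len;
}
```

With len = -1 and limit = 10, the naive version reports "not less" while the checked version reports "less", which is the kind of discrepancy that can hide at the boundary between functions using different length types.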
> -----Original Message-----
> From: Andi Gutmans [mailto:[EMAIL PROTECTED]
> Sent: Thursday, February 23, 2006 5:03 AM
> To: Marcus Boerger; Dmitry Stogov
> Cc: 'Andrei Zmievski'; [email protected]
> Subject: Re: [PHP-I18N] Ideas for a portable string api
>
>
> I think int is fine for this and we don't need to be super-accurate
> on the unsigned for string length. This is how we work today and it's
> a big change for no real gain.
--- End Message ---
--- Begin Message ---
Clayton,
Can you please make your code available somewhere so we can take a look
at it?
-Andrei
--- End Message ---