php-i18n Digest 18 Feb 2006 23:10:45 -0000 Issue 311

Topics (messages 925 through 935):

Re: Ideas for a portable string api
        925 by: Tex Texin
        926 by: Andi Gutmans
        927 by: Dmitry Stogov
        928 by: Derick Rethans
        929 by: Derick Rethans

remaining tasks
        930 by: Tex Texin
        931 by: Marcus Boerger
        932 by: Derick Rethans
        933 by: Tex Texin
        934 by: Dmitry Stogov
        935 by: Andi Gutmans

Administrivia:

To subscribe to the digest, e-mail:
        [EMAIL PROTECTED]

To unsubscribe from the digest, e-mail:
        [EMAIL PROTECTED]

To post to the list, e-mail:
        [email protected]


----------------------------------------------------------------------
--- Begin Message ---
It is not clear to me that the overhead of supporting both native and
Unicode strings is worth the effort.
The developers of the API will still need to make changes to their
implementation, and making the code look like it supports either form
disguises errors that will become bugs.

It might be more efficient in the long run to be Unicode internally, and
offer a thunk layer to interface to the unmodified functions.
Old functions will go through a conversion of string vars to native and back
on call/return, and perhaps some other conversions for length. As functions
are upgraded to be native Unicode, they will avoid the conversions.

There is a performance cost, but the most frequently called functions will
be upgraded first, so the majority of the performance issues can be
addressed. This approach has the benefit that we don't carry the baggage of
dual support around throughout the PHP core going forward.

The thunk layer could include a way to specify the conversion details for
each API so there wouldn't be "guessing" as to what is needed.
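
To make the thunk idea concrete, here is a rough sketch only. The names
UChar16, legacy_upper and thunk_call are hypothetical, and the one-byte
conversion is a deliberate simplification; a real layer would use a proper
converter and per-API conversion details.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint16_t UChar16;                      /* hypothetical UTF-16 code unit */
typedef void (*legacy_fn)(char *s, int len);   /* shape of an unmodified function */

/* An unmodified "native" function that only understands bytes. */
static void legacy_upper(char *s, int len)
{
    for (int i = 0; i < len; i++) {
        s[i] = (char)toupper((unsigned char)s[i]);
    }
}

/* Thunk: Unicode in, Unicode out, native call in the middle.
 * Assumption: every code unit must fit in one byte (<= 0xFF);
 * anything else is reported as a conversion failure (-1). */
static int thunk_call(legacy_fn fn, UChar16 *u, int len)
{
    assert(fn != NULL && u != NULL && len >= 0);

    char *tmp = malloc((size_t)len + 1);
    if (tmp == NULL) {
        return -1;
    }
    for (int i = 0; i < len; i++) {            /* Unicode -> native */
        if (u[i] > 0xFF) {
            free(tmp);
            return -1;
        }
        tmp[i] = (char)u[i];
    }
    tmp[len] = '\0';

    fn(tmp, len);                              /* call the old function */

    for (int i = 0; i < len; i++) {            /* native -> Unicode */
        u[i] = (unsigned char)tmp[i];
    }
    free(tmp);
    return 0;
}
```

Once a function is upgraded to take Unicode directly, the caller would skip
thunk_call and the two conversion loops disappear for that function.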

Tex Texin
Internationalization Architect,   Yahoo! Inc.
 
 


> -----Original Message-----
> From: Dmitry Stogov [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, February 16, 2006 1:12 AM
> To: [email protected]
> Subject: RE: [PHP-I18N] Ideas for a portable string api
> 
> 
> Hi,
> 
> After reviewing Marcus's ideas, some experiments, and speaking 
> with Andrei, I propose the following solutions:
> 
> 1) We will not use any kind of Unicode literals in C code (no 
> L"foo", no "f\0o\0o\0\0"), because L"" is not portable and 
> "f\0.." looks too ugly.
> 
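
On point 1: wchar_t is 16 bits on Windows but usually 32 bits on Linux, so
L"foo" does not reliably give UTF-16. One alternative is to keep plain ASCII
literals in C and widen them at run time. A minimal sketch, where UChar16 and
ascii_to_u16 are hypothetical names (the engine would use ICU's UChar and its
own helpers):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t UChar16;   /* stand-in for ICU's 16-bit UChar; an assumption */

/* Widen an ASCII literal into a caller-supplied UTF-16 buffer.
 * Returns the number of code units written (excluding the terminator),
 * or -1 if the buffer is too small or a byte is not 7-bit ASCII. */
static int ascii_to_u16(const char *src, UChar16 *dst, size_t cap)
{
    size_t i = 0;

    assert(src != NULL && dst != NULL);
    for (; src[i] != '\0'; i++) {
        if (i + 1 >= cap || (unsigned char)src[i] > 0x7F) {
            return -1;
        }
        dst[i] = (UChar16)(unsigned char)src[i];
    }
    if (i >= cap) {
        return -1;          /* no room for the terminator (cap == 0) */
    }
    dst[i] = 0;
    return (int)i;
}
```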
> 2) We will change the "zval" structure to make 
> "zval.value.str.len" and "zval.value.ustr.len" the same 
> type. This will allow us to optimize the Z_UNISTR() and Z_UNILEN() 
> macros. They will become
> 
> #define Z_UNISTR(z)  ((void*)(Z_STRVAL(z)))
> #define Z_UNILEN(z)  (Z_STRLEN(z))
> 
> instead of
> 
> #define Z_UNISTR(z)  (Z_TYPE(z)==IS_UNICODE?(char*)Z_USTRVAL(z):Z_STRVAL(z))
> #define Z_UNILEN(z)  (Z_TYPE(z)==IS_UNICODE?(int)Z_USTRLEN(z):Z_STRLEN(z))
> 
> 3) I don't want to break source compatibility by 
> modifying the "zval" layout as Marcus suggested. We will 
> pass string/unicode values in nearly the same way as we do today, 
> as three values: zend_uchar type, void* str, int len. But we 
> will create the following set of macros to do it with less overhead.
> 
> #define S_TYPE(x)             _type_##x
> #define S_UNIVAL(x)           _val_##x
> #define S_UNILEN(x)           _len_##x
> #define S_STRVAL(x)           ((char*)S_UNIVAL(x))
> #define S_USTRVAL(x)          ((UChar*)S_UNIVAL(x))
> #define S_STRLEN(x)           S_UNILEN(x)             
> #define S_USTRLEN(x)          S_UNILEN(x)
> 
> #define S_ARG(x)              zend_uchar S_TYPE(x), void *S_UNIVAL(x), int S_UNILEN(x)
> 
> #define S_PASS(x)             S_TYPE(x), S_UNIVAL(x), S_UNILEN(x)
> 
> #define Z_STR_PASS(x)         Z_TYPE(x), Z_UNIVAL(x), Z_UNILEN(x)
> #define Z_STR_PASS_P(x)       Z_TYPE_P(x), Z_UNIVAL_P(x), Z_UNILEN_P(x)
> #define Z_STR_PASS_PP(x)      Z_TYPE_PP(x), Z_UNIVAL_PP(x), Z_UNILEN_PP(x)
> 
> Then most zend_u_... functions must be rewritten with these macros.
> 
> For example:
> 
> ZEND_API int zend_u_lookup_class(S_ARG(name), zend_class_entry ***ce TSRMLS_DC)
> {
>       return zend_u_lookup_class_ex(S_PASS(name), 1, ce TSRMLS_CC);
> }
> 
> instead of
> 
> ZEND_API int zend_u_lookup_class(zend_uchar type, void *name, int name_length, zend_class_entry ***ce TSRMLS_DC)
> {
>       return zend_u_lookup_class_ex(type, name, name_length, 1, ce TSRMLS_CC);
> }
> 
> Any objections, additions?
> 
> Thanks. Dmitry.
> 
> -- 
> PHP Unicode & I18N Mailing List (http://www.php.net/)
> To unsubscribe, visit: http://www.php.net/unsub.php
> 
> 

--- End Message ---
--- Begin Message ---
I think this is a good approach (although debugging will not be easy, but
that's already a given for PHP :)
+1 from me.

Andi


--- End Message ---
--- Begin Message ---
We already discussed these solutions with Andrei and Marcus.
I'll start implementation on Monday.

Thanks. Dmitry.

> -----Original Message-----
> From: Andi Gutmans [mailto:[EMAIL PROTECTED] 
> Sent: Friday, February 17, 2006 6:09 AM
> To: Dmitry Stogov; [email protected]
> Subject: RE: [PHP-I18N] Ideas for a portable string api
> 
> 
> I think this is a good approach (although debugging will not be easy, 
> but that's already a given for PHP :)
> +1 from me.
> 
> Andi
> 

--- End Message ---
--- Begin Message ---
On Thu, 16 Feb 2006, Andi Gutmans wrote:

> I think this is a good approach (although debugging will not be easy, but
> that's already a given for PHP :)

Yeah, because the engine isn't documented ;-) Actually, I don't find 
debugging that hard... unless you're in TSRM mode, of course :)

regards,
Derick

-- 
Derick Rethans
http://derickrethans.nl | http://ez.no | http://xdebug.org

--- End Message ---
--- Begin Message ---
On Fri, 17 Feb 2006, Dmitry Stogov wrote:

> We already discussed these solutions with Andrei and Marcus.
> I'll start implementation on Monday.

Great, I like this too.

Derick

-- 
Derick Rethans
http://derickrethans.nl | http://ez.no | http://xdebug.org

--- End Message ---
--- Begin Message ---
I would like to discuss how we might partition the remaining work so we can
accelerate the schedule.
Is there a good day and time for a meeting/teleconference?

How is Feb. 22 lunchtime?
If that day and time are not good, please suggest some good times.

The agenda would be to clarify the remaining work and to identify tasks
which others can contribute to.

Tex Texin
Internationalization Architect,   Yahoo! Inc.
 
 

--- End Message ---
--- Begin Message ---
Hello Tex,

  Generally I am not available until 2030 GMT, and on Wednesday I am not
available before 2100 GMT. GMT evening hours are no problem for me. However,
I am not that important for the i18n meeting, I guess.

best regards
marcus

Friday, February 17, 2006, 9:59:55 AM, you wrote:

> I would like to discuss how we might partition the remaining work so we can
> accelerate the schedule.
> Is there a good day and time for a meeting/teleconference?

> How is Feb. 22 lunchtime?
> If that day and time are not good, please suggest some good times.

> The agenda would be to clarify the remaining work and to identify tasks
> which others can contribute to.

> Tex Texin
> Internationalization Architect,   Yahoo! Inc.
>  
>  




-- 
Best regards,
 marcus

--- End Message ---
--- Begin Message ---
On Fri, 17 Feb 2006, Marcus Boerger wrote:

>   Generally I am not available until 2030 GMT, and on Wednesday I am not
> available before 2100 GMT. GMT evening hours are no problem for me. However,
> I am not that important for the i18n meeting, I guess.

2100 GMT on Wednesday would work for me too (that is 1300 PST, 1600 EST). 

regards,
Derick

-- 
Derick Rethans
http://derickrethans.nl | http://ez.no | http://xdebug.org

--- End Message ---
--- Begin Message ---
Thanks guys, 2100 GMT is fine. Let's see if that is good for others.

Tex Texin
Internationalization Architect,   Yahoo! Inc.
 
 


> -----Original Message-----
> From: Derick Rethans [mailto:[EMAIL PROTECTED] 
> Sent: Friday, February 17, 2006 1:13 AM
> To: Marcus Boerger
> Cc: Tex Texin; 'Dmitry Stogov'; 'Andi Gutmans'; [email protected]
> Subject: Re: [PHP-I18N] remaining tasks
> 
> 
> On Fri, 17 Feb 2006, Marcus Boerger wrote:
> 
> >   Generally I am not available until 2030 GMT, and on Wednesday I am not
> > available before 2100 GMT. GMT evening hours are no problem for me.
> > However, I am not that important for the i18n meeting, I guess.
> 
> 2100 GMT on Wednesday would work for me too (that is 1300 PST, 1600 EST).
> 
> regards,
> Derick
> 
> -- 
> Derick Rethans
> http://derickrethans.nl | http://ez.no | http://xdebug.org
> 
> 

--- End Message ---
--- Begin Message ---
February 22 after 18:00 GMT+3 and February 23 are bad for me. (February 23
is a holiday in Russia).

Thanks. Dmitry.

> -----Original Message-----
> From: Tex Texin [mailto:[EMAIL PROTECTED] 
> Sent: Friday, February 17, 2006 12:00 PM
> To: 'Dmitry Stogov'; 'Andi Gutmans'; [email protected]
> Subject: remaining tasks
> 
> 
> I would like to discuss how we might partition the remaining 
> work so we can accelerate the schedule. Is there a good day 
> and time for a meeting/teleconference?
> 
> How is Feb. 22 lunchtime?
> If that day and time are not good, please suggest some good times.
> 
> The agenda would be to clarify the remaining work and to 
> identify tasks which others can contribute to.
> 
> Tex Texin
> Internationalization Architect,   Yahoo! Inc.
>  
>  
> 
> 
> 

--- End Message ---
--- Begin Message ---
I probably won't be able to make it until 1:30pm, although you can start
and I'll join.
Dmitry, can you make that time?

Andi

At 01:37 AM 2/17/2006, Tex Texin wrote:
Thanks guys, 2100 GMT is fine. Let's see if that is good for others.

Tex Texin
Internationalization Architect,   Yahoo! Inc.





--- End Message ---
