> From: morrison.l...@gmail.com [mailto:morrison.l...@gmail.com] On behalf of
> 
> Just to chime in with my $0.02: I feel that using macros as an
> abstraction in this case is bad practice. I believe that in *most*
> cases using macros as an abstraction is bad practice. Furthermore, there
> isn't any reason that `zend_string_*` functions cannot act as an
> abstraction layer, since zend_strings are passed by pointer.

Agreed. That's why ZSTR_VAL() and ZSTR_LEN() are functions now. Macros don't 
provide enough isolation. Renaming 'zend_string_' to 'ZSTR_' is just a 
question of naming consistency. The most important point is that these are 
functions.

zend_string values are passed by pointer and, in theory, this type should be 
opaque: (void *), for instance.

@Bob - I remember an idea I had, which I should discuss with Dmitry, and which 
can be implemented without any change to the proposed API. The idea is to 
return the address of the string data instead of the address of the struct. 
This would allow using the same address with the zend_string API and with any 
other function expecting a plain (char *) address. Z_STRVAL() and '->val' would 
become useless, of course. The other struct elements would sit just below the 
returned address (as many malloc() implementations do). In this case, the 
calling code must treat the address either as an opaque value that can be 
passed to the zend_string API, or as the address of an allocated memory buffer 
that can be read and written, up to the declared size. This is possible only 
through an encapsulated API. Compare it with malloc(), as both would use a 
similar mechanism. When you call malloc() on a system, you don't care about the 
underlying allocated structure, and it may be very different on different 
systems. This is the same: a malloc() with a couple of additional features. You 
don't have to know more about the implementation.
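
To make the idea concrete, here is a minimal sketch of such a layout. All the 
names (xstr_header, xstr_init, xstr_len, xstr_free) are hypothetical and exist 
only for illustration; this is not the proposed zend_string API, and plain 
malloc() stands in for the engine allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header stored just below the address handed to callers. */
typedef struct {
    size_t len;          /* string length, excluding the trailing NUL */
    unsigned long hash;  /* cached hash, 0 = not computed yet */
} xstr_header;

/* Step back from the character data to the hidden header. */
static xstr_header *xstr_hdr(char *s)
{
    return (xstr_header *)(void *)(s - sizeof(xstr_header));
}

/* Allocate header + data in one block, return the address of the data. */
char *xstr_init(const char *src, size_t len)
{
    xstr_header *h = malloc(sizeof(xstr_header) + len + 1);
    if (h == NULL) {
        return NULL;
    }
    h->len = len;
    h->hash = 0;
    char *s = (char *)(h + 1);   /* data starts right after the header */
    memcpy(s, src, len);
    s[len] = '\0';
    return s;
}

/* The length comes from the header, not from scanning for a NUL byte. */
size_t xstr_len(const char *s)
{
    assert(s != NULL);
    const xstr_header *h =
        (const xstr_header *)(const void *)(s - sizeof(xstr_header));
    return h->len;
}

/* The block must be released through the API, never with a bare free(s). */
void xstr_free(char *s)
{
    if (s != NULL) {
        free(xstr_hdr(s));
    }
}
```

With this layout, the value returned by xstr_init() can be passed unchanged to 
strlen(), strcmp() or printf(), and also to xstr_len(). The caller never sees 
the header, so its content can change without touching the calling code.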

Working on the allocation scheme just requires storing a new 'allocated size' 
element, which does not cost much and can avoid a lot of costly [e]realloc() 
calls. New functions may then be defined, if needed, to control the allocation 
policy. None of this requires changing the existing API; these are just 
additions. I really don't understand why you are so sure that any change to the 
internal representation will require changes to the API. New functions can be 
added, yes, but we can improve a lot of things while keeping the same API.
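
As a rough illustration of the 'allocated size' point, the sketch below keeps a 
capacity next to the length and only calls realloc() when an append does not 
fit. The type and function names (cap_string, cap_string_alloc, 
cap_string_append) and the doubling policy are assumptions made for this 
example, not part of any existing API; the counter is there only to make the 
effect visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string type with an extra 'allocated size' element. */
typedef struct {
    size_t len;   /* bytes in use, excluding the trailing NUL */
    size_t cap;   /* bytes available in val, excluding the trailing NUL */
    char val[];   /* character data */
} cap_string;

/* Counts calls to realloc(); for demonstration only. */
unsigned long cap_string_realloc_count = 0;

/* Create an empty string with room for 'cap' bytes. */
cap_string *cap_string_alloc(size_t cap)
{
    cap_string *s = malloc(sizeof(cap_string) + cap + 1);
    if (s == NULL) {
        return NULL;
    }
    s->len = 0;
    s->cap = cap;
    s->val[0] = '\0';
    return s;
}

/* Append n bytes; reallocate only when the spare capacity is exhausted.
 * Overflow checks are omitted to keep the sketch short. */
cap_string *cap_string_append(cap_string *s, const char *src, size_t n)
{
    assert(s != NULL && src != NULL);
    size_t needed = s->len + n;
    if (needed > s->cap) {
        size_t new_cap = s->cap ? s->cap : 16;
        while (new_cap < needed) {
            new_cap *= 2;   /* assumed policy: double until it fits */
        }
        cap_string *grown = realloc(s, sizeof(cap_string) + new_cap + 1);
        if (grown == NULL) {
            return NULL;    /* the original block is still valid */
        }
        s = grown;
        s->cap = new_cap;
        cap_string_realloc_count++;
    }
    memcpy(s->val + s->len, src, n);
    s->len = needed;
    s->val[s->len] = '\0';
    return s;
}
```

Appending one byte a hundred times to an empty string triggers four 
reallocations here (16, 32, 64, 128 bytes) instead of a hundred, and the caller 
uses the same append function either way.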

You base all your examples on the fact that zend_string represents a structure. 
I don't assume anything at this level. Maybe we'll find that performance is 
better with a 2-level storage, storing fixed-length information in a 
pre-allocated array, for example, and storing the strings elsewhere. I leave it 
as open as possible, while you prefer constraints just because you cannot 
imagine today how it can evolve tomorrow.

About hash values, nobody said we should automatically reset the hash value 
every time something is written. And you're wrong: we don't end up controlling 
the hash value manually. We control it through two well-defined methods. This 
is not low-level control, and not the same as using 'zstr->h', for instance. It 
is part of the API, nothing shocking there. And nothing says that someone won't 
find some way to make hash management 'smarter', without doing millions of 
useless operations. There may be new operations but, once again, the existing 
ones will remain unchanged.
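
To show what controlling the hash through the API rather than through the field 
can look like, here is a small sketch. The two functions, hstr_hash() and 
hstr_forget_hash(), are hypothetical names chosen for this example and are not 
necessarily the two methods of the proposal; the DJB2-style hash is also just a 
convenient stand-in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string type; callers are not supposed to touch 'hash'. */
typedef struct {
    unsigned long hash;  /* cached hash, 0 = not computed */
    size_t len;
    char val[];
} hstr;

hstr *hstr_init(const char *src, size_t len)
{
    hstr *s = malloc(sizeof(hstr) + len + 1);
    if (s == NULL) {
        return NULL;
    }
    s->hash = 0;
    s->len = len;
    memcpy(s->val, src, len);
    s->val[len] = '\0';
    return s;
}

/* Method 1: return the hash, computing and caching it on first use. */
unsigned long hstr_hash(hstr *s)
{
    assert(s != NULL);
    if (s->hash == 0) {
        unsigned long h = 5381;
        for (size_t i = 0; i < s->len; i++) {
            h = h * 33 + (unsigned char)s->val[i];
        }
        s->hash = (h != 0) ? h : 1;  /* keep 0 reserved for "not computed" */
    }
    return s->hash;
}

/* Method 2: drop the cached value after the caller has modified the data. */
void hstr_forget_hash(hstr *s)
{
    assert(s != NULL);
    s->hash = 0;
}
```

Writing to the buffer does not reset anything automatically: the cached value 
stays until the caller invokes hstr_forget_hash(). How the cache is stored, or 
whether it becomes smarter later, remains hidden behind these two calls.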

In conclusion, the zend_string API I propose provides some isolation, but you 
will be glad to know that it is not as advanced as I'd like, mostly for 
historical reasons. As an example, I am sure we will be annoyed by the 
'persistent' argument to init/alloc/realloc. For init and alloc, it would be 
better to have a flag mask, allowing other flags to be defined in the future. 
We'll need this, for instance, if we define different switchable allocation 
policies. So, we'll probably need to define new functions where it could have 
been done right from the beginning. Another issue is that this argument is 
theoretically useless in the realloc/extend/truncate functions, as reallocation 
can be done only in the memory space where the zend_string was created. So, 
this value is totally constrained, which is bad for an API. It is justified 
by the fact that it allows compiler optimizations which generally save a couple 
of CPU cycles. I think that argument is not sufficient, but it is too late for 
7.0, as we cannot reach an agreement on this without serious performance testing.
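
A sketch of the flag-mask alternative follows. Everything in it is 
hypothetical: the fstr type, the FSTR_* flags and the function names are 
invented for this example, and malloc()/realloc() stand in for both the 
persistent and the request-bound allocators.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical creation flags; new bits can be added without new functions. */
enum {
    FSTR_PERSISTENT     = 1 << 0,  /* outlives the request */
    FSTR_GROW_GEOMETRIC = 1 << 1   /* example of a future allocation policy */
};

typedef struct {
    unsigned flags;  /* recorded once, at creation */
    size_t len;
    char val[];
} fstr;

/* Creation takes a mask instead of a single 'persistent' boolean. */
fstr *fstr_alloc(size_t len, unsigned flags)
{
    /* A real implementation would choose its allocator from FSTR_PERSISTENT. */
    fstr *s = malloc(sizeof(fstr) + len + 1);
    if (s == NULL) {
        return NULL;
    }
    s->flags = flags;
    s->len = len;
    s->val[0] = '\0';
    s->val[len] = '\0';
    return s;
}

/* No 'persistent' argument: the memory space is already known from the
 * string itself, so the caller cannot pass an inconsistent value. */
fstr *fstr_realloc(fstr *s, size_t len)
{
    assert(s != NULL);
    fstr *r = realloc(s, sizeof(fstr) + len + 1);
    if (r == NULL) {
        return NULL;
    }
    r->len = len;
    r->val[len] = '\0';
    return r;
}

int fstr_is_persistent(const fstr *s)
{
    assert(s != NULL);
    return (s->flags & FSTR_PERSISTENT) != 0;
}
```

The cost is one stored field and a runtime test instead of a compile-time 
constant, which is exactly the trade-off that would need performance testing.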

PS: About PHP 7, it was clear from the beginning that preserving the C APIs was 
not the priority. When I asked, I was told there are so few extension developers 
that preserving the C API isn't worth the effort. We all know that the real 
reason is that, after wasting years fighting over pointless subjects, the PHP 
community, and especially Zend, is so afraid of HHVM that they switched to 
panic mode and decided to propose something 'quick-and-dirty' as soon as 
possible. IMO, that's a wrong, defensive strategy, but the community was naive 
enough to approve a carte blanche RFC. Actually, I don't even know if we can 
regain control from Zend someday. I hope we can, but it will be very hard 
because we've clearly switched from IT to politics and marketing. The biggest 
issue, IMO, is that I absolutely don't trust them to lead a strategy against 
HHVM and Facebook.

Regards

François



--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php