Hi, > -----Message d'origine----- > De : Bob Weinand [mailto:bobw...@hotmail.com] > > Hey, looks like I'm a bit late to the party, but I'd like to object against > the > accessor macros on zend_string. > > What does it help to replace very nice ->val, ->len etc. with macros?
If you find it 'very nice', probably nothing. I suggest you stop using macros everywhere. After all, why did we define Z_STRVAL() instead of writing something 'very nice' like 'zv.value.str->val' ? Just one reason : if we had written 'zv.value.val' everywhere, zend_strings couldn't exist today. In a few words, we are improving encapsulation : publishing an API instead of a C structure. > ZSTR_VAL(str)[ZSTR_LEN(str)] = 0; or str->val[str->len] = 0; > zend_string_set_len(str, ZSTR_LEN(str) - 2); or str->len -= 2; > Also, that particular function suggests to me that the string is expanded if > lengths don't match. I agree it would be quite natural and I'd like to implement it in a future version. We just need to store the allocated size in the struct, which also allows to over-allocate memory in advance. Anyway, performance is critical there, so the code must remain simple and fast. > I much prefer direct accessors which indicate a) no side-effects happening > here, b) less functions to remember (you just look at the struct and know > everything about it) and c) end up being more readable. a) You deal with an API, not a structure. So, you call methods instead of accessing variables. The user does not have to care about side effects of the methods he's using. Don't think about C macros, think about methods. b) Getting documentation from the code is something we should forget. We still need to do it because PHP lacks a usable core API documentation, especially for PHP 7. So, getting API usage from the code is a workaround, not the appropriate way to publish an API. The same with some recently-published articles about new PHP 7 structures. These articles should *never* include the underlying structures. Then, we complain about people bypassing the 'official' APIs but, if they use such documentation, how can they know what they should access or not ? IMO, the rules are 'never publish structure details' and 'never allow direct access to structure elements'. c) Using ZSTR_VAL/ZSTR_LEN macros can also be considered as more readable. I personally consider it as more readable because other unrelated structures may also define 'len' and 'val' elements. > Yes, I'm totally aware that when we ever want to change the API (I don't see > coming it in near future anyway), it could help having wrappers. But > wrappers don't help against every API change… let's say for some reason > we'd want to go back to the old char*/int; would wrappers help? no. (Yes, > that's maybe a bad example, but just trying to show that, while it has maybe > a theoretical gain, I don't see a real practical benefit.) The fact that you don't imagine the API changing in the near future is not an argument. First, wrappers don't protect against API changes, they allow to maintain the same API when the backend implementation is modified. I see at least two reasons that could bring changes in the zend_string API, and it could be quite soon. The first one is unifying zend_string with smart_str, including in zend_string the smart_str allocation scheme. I don't know exactly the form it would take but reading and/or writing the string length could involve more than just writing a value in a structure element. The other subject that could require implementation changes is Unicode. It failed for a number of reasons in PHP6, but it doesn't mean we'll avoid the question forever. If we want to include Unicode at a low level, we may have to work at the zend_string level. Once again, in this case, reading and writing the string length will probably involve different internal operations. So, the question is not to blindly apply the theory and say 'tight coupling is evil'. There are objective reasons to encapsulate APIs. Zend_string is just the first one. It was choosen because no API existed to access the 'val' and 'len' elements, and because it was relatively easy to fix, allowing for an integration in 7.0. But the next one is HashTable, which is *much* more complex. If you look at the implementation and usage of HashTables in the code, you'll see how a combination of growing complexity and tight coupling can make code unmanageable. Actually, we are at a point where changing the HashTable internal implementation would be extremely heavy and complex. If we don't fix it now, it can only become worse over time. > zend_string is really a core structure and extremely simple. Please don't > trade simplicity for a small other benefit. > > Also, we still leave zend_string struct exposed (presumably for allowing > inlining). At the point where we are now, many extension authors probably > already are also accessing zend_string directly. We're going to have a > colorful > mix of code using the APIs and not using it, which is the worst we can get > now. Providing an API when it was missing allows to start 'cleaning' the existing code. It will take time and, during this time, we'll have a mix of code, yes. As long as the code is not 100 % clean, we cannot modify the API, yes. But the most important, IMO, is to know where we are going to. What alternative do we have ? Provide no encapsulation at all and get blocked when we *need* to change implementation in 1 or 2 years ? It will be too late. > To end up, I very much like unifying prefixes (ZSTR_*), but I strongly dislike > hurting readability and consistency so much for such little benefit. You will be satisfied. I just sent a new patch to Dmitry where the whole zend_string API is prefixed with 'ZSTR_', providing a better consistency (with compatibility macros for old names). Regards François -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php