Re: [PHP-DEV] Asymmetric visibility Reflection API problems
Hi Valentin Thank you for bringing up your concerns. On Thu, Oct 3, 2024 at 3:32 PM Valentin Udaltsov wrote: > > We are going to talk about the asymmetric visibility RFC [2] and the > new `ReflectionProperty::is(Private|Protected)Set()` > methods. Consider a class and reflected facts about it's properties [3]: > > ```php > class A > { > // isPrivateSet() = true (OK) > public private(set) mixed $public_private_set; > > // isPrivateSet() = false (I would expect true) > private private(set) mixed $private_private_set; > > // isPrivateSet() = false (I would expect true) > private mixed $private; > > // isProtectedSet() = true (OK, explained in the RFC) > public readonly mixed $public_readonly; > > // isProtectedSet() = false (I would expect true) > protected readonly mixed $protected_readonly; > > // isProtectedSet() = false (I would expect true) > protected protected(set) readonly mixed $protected_protected_set_readonly; > > // isPrivateSet() = false (OK), isProtectedSet() = false (OK) > public bool $virtual_no_set_hook { get => true; } > } > ``` As mentioned on GitHub, I do understand and agree that this may seem confusing without more context. However, this behavior for flags is really not unique to asymmetric visibility. For example: ```php interface I { public function test(); } var_dump((new ReflectionMethod('I', 'test'))->isAbstract()); //> bool(true) ``` The `abstract` flag is not (and cannot) be specified explicitly, but it is added implicitly. Similarly, the `protected(set)` in `protected protected(set)` is redundant, so the compiler removes it to omit additional visibility checks. For the same reason, we don't even have an internal flag (at runtime) for `public(set)`, as it is always removed at compile time. As mentioned, these methods are generally quite "dumb" and simply expose these internal flags to userland. This is reflected by `getModifiers()`, which returns a bitmask over all flags. `getModifiers()` should stay consistent with `isPublic()`, as well as all the other methods. This will be increasingly difficult if we start tweaking the implementation of these methods. I have proposed to instead solve this by introducing new functions, `ReflectionProperty::isReadable()` and `isReadable()`. They would not only handle visibility and asymmetric visibility, but also scope, virtual-ness, hooks and readonly (including __clone). Additionally, if some new concept is ever added, the functions may adapt. I have created a simple PoC [1]. > Here's what I propose: > > - `ReflectionProperty::isPublic()` returns `true` only if the property > is symmetrically public. For `public readonly $prop` > it will return `false`, because it's implicitly `public protected(set) > readonly $prop`. This is a BC break, but a > smaller one, given that libraries now take "readonly" into account. > - `ReflectionProperty::isProtected()` returns `true` only if the > property is symmetrically protected. For `protected readonly $prop` > it will return ``true``, because it's implicitly `protected > protected(set) readonly $prop`. Fully backward compatible. > - `ReflectionProperty::isPrivate()` returns `true` only if the > property is symmetrically private. Fully backward compatible. > - Add `ReflectionProperty::isPublicGet()`: returns `true` if property > is `public` or `public XXX(set)`. > - Add `ReflectionProperty::isProtectedGet()`: returns `true` if > property is `protected` or `protected XXX(set)`. > - Add `ReflectionProperty::isPrivateGet()`: returns `true` if property > is `private` or `private private(set)`. > - Add `ReflectionProperty::isPublicSet()`: returns `true` if property > is symmetrically public (asymmetric public > set is not possible). > - Change `ReflectionProperty::isProtectedSet()`: returns `true` if > property is symmetrically protected or has an > asymmetric protected set. > - Change `ReflectionProperty::isPrivate()`: returns `true` if property > is symmetrically private or has an asymmetric > private set. I very much disagree with changing `isPublic` and friends. It would be a considerable BC break which isn't acceptable in the RC phase. I also don't think it's necessarily more correct. With this RFC, properties have two visibilities: 1. The visibility of the entire property, which we are all used to. 2. The visibility of the set operation, which was introduced with asymmetric visibility. Essentially, the get operation continues to inherit the visibility from the property. This is what `isPublic()` and friends are referring to. Of course, you may argue that an equally valid mental model is that the property no longer has a visibility, but an additional get visibility. However, this does not accurately reflect the implementation, nor, more importantly, the syntax. > The most difficult question is whether we can have these or similar > changes in 8.4 after RC1... Does a BC break > in the current implementation (if we agree that it exists
Re: [PHP-DEV] VCS Account Request: ayesh
On Thu, Sep 26, 2024 at 1:45 PM Tim Düsterhus wrote: > > Am 2024-09-25 21:26, schrieb Ayesh Karunaratne: > > I see the account is already approved. > > Unless you did not yet accept the invitation to the GitHub organization, > it appears that you were not invited there yet. If that is the case, > someone should send the invite. Thanks for the hint, I sent Ayesh an invite to the php-src team. Ilija
Re: [PHP-DEV] Which IDE do you recommend for php-src development?
Hi Carlos On Sat, Sep 14, 2024 at 11:44 PM Barel wrote: > > For C/C++ development I usually use CLion from Jetbrains but I tried to use > it with php-src and was unable to get it to work properly. CLion really > insists on using CMake and has only quite limited support for makefiles. > After trying to get it to work unsuccessfully I am ready to try something > else. > > So which IDE would you recommend for php-src development? I understand that > people probably have many different preferences but I wondered if there was > something that most php internals developers used. I tried to like CLion, but it was extremely laggy for me every time I tried it. php-src contains some rather large files, and even some dsl (zend_vm_def.h) that didn't work well at all. I also didn't find a way to add simple highlighting support for PHP test files. VSCode works surprisingly well, I've used it for C development for years. I created a simple guide for setting it up, including gdb. It shouldn't take you more than 10 minutes. [1] Zed is also already a decent option that essentially just works out of the box. Unfortunately, it's currently lacking debugging support. [2] > One important feature would be to easily work with the project running on a > docker container Can't help you there, I don't use Docker for C development. Ilija [1] http://php.github.io/php-src/introduction/ides/visual-studio-code.html [2] https://zed.dev/
Re: [PHP-DEV] json_encode() and Generators / RFC Karma
On Fri, Aug 30, 2024 at 12:19 PM Philip Hofstetter wrote: > > Username is pilif RFC karma was granted. Good luck with the RFC!
Re: [PHP-DEV] json_encode() and Generators / RFC Karma
Hi Philip On Fri, Aug 30, 2024 at 9:37 AM Philip Hofstetter wrote: > > If you think this is a worthwhile thing to do an RFC for, I would kindly ask > for somebody to grant me RFC karma, so I can start working on one. To grant you RFC karma, I need to know your wiki.php.net username. Ilija
Re: [PHP-DEV] [RFC] Default expression
Hi Rowan On Sun, Aug 25, 2024 at 6:06 PM Rowan Tommins [IMSoP] wrote: > > On 25/08/2024 16:29, Bilge wrote: > > You can write, `include(1 + 1);`, because `include()` accepts an > > expression. You will get: "Failed opening '2' for inclusion". Should > > we restrict that? No, because that's just how expressions work in any > > context where they're allowed. > > > I think a better comparison might be the "new in initializers" and > "fetch property in const expressions" RFCs, which both forbid uses which > would naturally be allowed by the grammar. The rationale in those cases > was laid out in > https://wiki.php.net/rfc/new_in_initializers#unsupported_positions and > https://wiki.php.net/rfc/fetch_property_in_const_expressions#supporting_all_objects I don't agree with that. Constant expressions in PHP already only support a subset of operations that expressions do. However, default is proposed to be a true expression, i.e. one that compiles to opcodes. Looking at the `expr` nonterminal [1] I can't see any productions that are restricted in the context they can be used in, even though plenty of them are nonsensical (e.g. exit(1) + 2). Furthermore, new in initializers was disallowed in some contexts not because it would be nonsensical, but because it posed technical difficulties. I also believe some of the rules you've laid out would be hard to enforce. > 1) The expression should be reasonably guaranteed to produce the same type as > the actual default. Even the simple cases of ??, ?: can easily break this rule. Furthermore, context restriction is easily circumvented. E.g. foo((int) default); // This is not allowed foo((int) match (true) { default => default }); // Let me just do that I'm not sure context restriction is worthwhile, if 1. we can't do it properly anyway and 2. there are no technical reasons to do so. Ilija [1] https://github.com/php/php-src/blob/3f4028d3d9d63e1dae012a9c350141493b30825f/Zend/zend_language_parser.y#L1198
Re: [PHP-DEV] [Concept] Flip relative function lookup order (global, then local)
Hi Stephen On Sat, Aug 24, 2024 at 1:54 PM Stephen Reay wrote: > > Thanks for clarifying. Out of curiosity, how much optimisation do you imagine > would be possible if the lookups were done the same was as classes (ie no > fallback, names must be local, qualified or imported with `use`)? I haven't measured this case specifically, but if unqualified calls to local functions are indeed rare (which the last analysis seems to indicate), then it should make barely any difference. Of course, if your code makes lots of use of them, then the story might be different. That said, the penalty of an ambiguous internal call is much higher than that of a user, local call, given that internal calls sometimes have special optimizations or can even be entirely executed at compile time. For local calls, it will simply lead to a double lookup on first execution. > I am aware this is a BC break. But if it's kosher to discuss introducing a > never ending BC break I don't see why this isn't a valid discussion either. > It would give *everyone* that elusive 2-4% performance boost, would resolve > any ambiguity about which function a person intended to call (the claimed > security issue) and would bring consistency with the way classes/etc are > referenced. >From my analysis, there were 2 967 unqualified calls to local functions in the top 1 000 repositories. (Disclaimer: There might be a "use function" at the top for some of these, the analysis isn't that sophisticated.) I also ran the script to check for unqualified calls to global functions (or at least functions that weren't statically visible in that scope in any of the repositories files), and there were ~139 000 of them. It seems like this is quite a different beast. To summarize: 1. Flipping lookup order: ~a few dozens of changes 2. Global only: ~3 000 changes 3. Local only: ~139 000 changes While much of this can be automated, huge diffs still require reviewing time, and can lead to many merge conflicts which also take time to resolve. I would definitely prefer to go with 1. or potentially 2. Ilija
Re: [PHP-DEV] [Concept] Flip relative function lookup order (global, then local)
Stephen On Sat, Aug 24, 2024 at 2:00 PM Stephen Reay wrote: > > When I said this thread reads like an April fools joke that wasn't a > challenge you know. We *just* had somebody temporarily banned for ad-hominem attacks like a week ago. Please familiarize yourself with the mailing list rules. They apply to everyone. https://github.com/php/php-src/blob/master/docs/mailinglist-rules.md Most significantly: > a. Make everybody happier, ... and > 1. Do not post when you are angry. Any post can wait a few hours. Review your > post after a good breather, or a good nights sleep. Ilija
Re: [PHP-DEV] [Concept] Flip relative function lookup order (global, then local)
On Fri, Aug 23, 2024 at 9:41 PM Rowan Tommins [IMSoP] wrote: > > On 23 August 2024 18:32:41 BST, Ilija Tovilo wrote: > >IMO, 1. is too drastic. As people have mentioned, there are tools to > >automate disambiguation. But unless we gain some other benefit from > >dropping the lookup entirely, why do it? > > I can think of a few disadvantages of "global first": > > - Fewer code bases will be affected, but working out which ones is harder. > The easiest migration will probably be to make sure all calls to namespaced > functions are fully qualified, as though it was "global only". To talk about more concrete numbers, I now also analyzed how many relative calls to local functions there are in the top 1000 composer packages. https://gist.github.com/iluuu1994/9d4bbbcd5f378d221851efa4e82b1f63 There were 4229 calls to local functions that were statically visible. Of those, 1534 came from thecodingmachine/safe, which I'm excluding again for a fair comparison. The remaining 2695 calls were split across 210 files and 27 repositories, which is less than I expected. The calls that need to be fixed by swapping the lookup order are a subset of these calls, namely only the ones also clashing with some global function. Hence, the process of identifying them doesn't seem fundamentally different. Whether the above are "few enough" to justify the BC break, I don't know. > - The engine won't be able to optimise calls where the name exists locally > but not globally, because a userland global function could be defined at any > time. When relying on the lookup, the lookup will be slower. But if the hypothesis is that there are few people relying on this in the first place, it shouldn't be an issue. It's also worth noting that many of the optimizations don't apply anyway, because the global function is also unknown and hence a user function, with an unknown signature. > - Unlike with the current way around, there's unlikely to be a use case for > shadowing a namespaced name with a global one; it will just be a gotcha that > trips people up occasionally. Indeed. But this is a downside of both these approaches. > None of these seem like showstoppers to me, but since we can so easily go one > step further to "global only", and avoid them, why wouldn't we? > > Your answer to that seems to be that you think "global only" is a bigger BC > break, but I wonder how much difference it really makes. As in, how many > codebases are using unqualified calls to reference a namespaced function, but > *not* shadowing a global name? I hope this provides some additional insight. Looking at the analysis, I'm not completely opposed to your approach. There are some open questions. For example, how do we handle functions declared and called in the same file? namespace Foo; function bar() {} bar(); Without a local fallback, it seems odd for this call to fail. An option might be to auto-use Foo\bar when it is declared, although that will require a separate pass over the top functions so that functions don't become order-dependent. Ilija
Re: [PHP-DEV] [Concept] Flip relative function lookup order (global, then local)
On Fri, Aug 23, 2024 at 8:19 PM John Coggeshall wrote: > > 1. People who don't think BC is a problem, and would like to drop > either the global or local lookup entirely, requiring disambiguation. > > There is also an option of swapping the priority, making local lookups > secondary to global lookups -- and to override that behavior you would > require disambiguation. It doesn't have to mean drop the lookup entirely. Right, that's my proposal. My point was that few (nobody except you and me, I believe) showed particular excitement for this approach.
Re: [PHP-DEV] [Concept] Flip relative function lookup order (global, then local)
On Fri, Aug 23, 2024 at 8:11 PM Stephen Reay wrote: > > On 24 Aug 2024, at 00:32, Ilija Tovilo wrote: > > > I obviously also disagree with 3. as I wouldn't have sent this > > proposal otherwise. :) Performance improvements are hard to come by > > nowadays. It was measured on real codebases (Symfony and Laravel). > > Just to make sure I'm not misunderstanding something here - Symphony and > Laravel (or anyone else) can get the exact same performance benefit you're > talking about, if they use either a leafing `\` or `use function...` in their > own code, today, correct? As I wrote in my e-mail: > For some context: Before proposing this change, I asked Symfony if > they were interested in disambiguating calls to improve performance, > given I did some work to make fully qualified calls faster. But they > were not, stating that the change would be too verbose for their > liking. > > Making unqualified calls to mean local would force Symfony into making > the change, which is not the approach I'm interested in taking. Yes, the full performance benefits can be achieved by prefixing your entire codebase, which includes not only your own code but also the code of your framework. How much you benefit from just converting your own code of course depends on how much you rely on vendor code. > I'm not going to pretend to know how many files either project has that uses > global functions, but neither of those two existing options seems > particularly "hard to come by" in my mind. The "hard to come by" part is referring to the engine, which is quite optimized for the current semantics. Some of PHPs quirky semantics make it hard to improve it further, this being one of them.
Re: [PHP-DEV] [Concept] Flip relative function lookup order (global, then local)
On Fri, Aug 23, 2024 at 5:49 PM Rowan Tommins [IMSoP] wrote: > > Other proposals aim to shift that balance - leaving some inconsistency, but > less compatibility break. > > And most users don't object to using a leading backslash, they just (quite > reasonably) have no idea what impact it has on the ability of the engine to > optimise their code. For some context: Before proposing this change, I asked Symfony if they were interested in disambiguating calls to improve performance, given I did some work to make fully qualified calls faster. But they were not, stating that the change would be too verbose for their liking. Making unqualified calls to mean local would force Symfony into making the change, which is not the approach I'm interested in taking. Making them global would likely reduce breakage by much, but not nearly as much as keeping the fallback. >From reading the responses, it seems we have three primary camps: 1. People who don't think BC is a problem, and would like to drop either the global or local lookup entirely, requiring disambiguation. 2. People who do think BC is a problem, and would like some opt-in mechanism to explicitly pick global or local scope. 3. People who aren't convinced that the performance improvements are worth it to begin with, or that the developers themselves are responsible for disambiguation. IMO, 1. is too drastic. As people have mentioned, there are tools to automate disambiguation. But unless we gain some other benefit from dropping the lookup entirely, why do it? Consistency with class lookups is a factor, but is it enough to break a large portion of codebases? The summed up time of every maintainer installing and running a tool that modifies a large portion of the codebase, and then dealing with conflicts in existing branches is not miniscule. Fixing local calls will also require context from other files to correctly disambiguate. I'm not aware if any tools actually consider context, or just take the naive approach of making known, internal calls global, and leaving the rest. 2. misses the point of the immediate performance gains without modifications to the codebase. Even if the disambiguation itself is a one-liner, it still needs to be added to every codebase and every file, and still requires fixing actual local calls that may be made within the same file. I obviously also disagree with 3. as I wouldn't have sent this proposal otherwise. :) Performance improvements are hard to come by nowadays. It was measured on real codebases (Symfony and Laravel). Ilija
Re: [PHP-DEV] [Concept] Flip relative function lookup order (global, then local)
Hi John On Wed, Aug 21, 2024 at 8:02 PM John Coggeshall wrote: > > This is an attack vector for every application and I would argue should be a > real concern for the vast majority of applications out there -- any which > rely on namespace-based frameworks and composer packages from untrustworthy > sources. It's not just Wordpress -- literally every single PHP application > that uses a publicly available framework and consumes external composer > packages should be FQing their internal function calls. The natural behavior > of the language shouldn't be the insecure way of doing things for the sake of > maintaining BC compatibility with existing, insecure, code. Including a malicious composer package already allows for arbitrary code execution, do you really need more than that? Ilija
Re: [PHP-DEV] [Concept] Flip relative function lookup order (global, then local)
Hi Levi On Tue, Aug 20, 2024 at 5:14 PM Levi Morrison wrote: > > I have long been in favor of a larger BC break with better language > consistency. Class lookup and function lookup with respect to > namespaces should be treated the same. The difficulty is getting a > majority of people to vote yes for this. Keep in mind that qualifying > every global function is annoying but probably can be somewhat > automated, and will bring better performance. So again, this improves > the existing code even without upgrading. > > Yes, there would be complaints about it. Yes, there are probably some > people or projects who wouldn't upgrade. I don't particularly care, as > there are increasingly more operating systems and companies providing > LTS support for long periods of time. Probably Zend.com will offer LTS > support for the last PHP 8.X release, and possibly there will be some > distro which also has it. I believe it's the right thing to do > because: > > 1. It's faster. > 2. It enables function autoloading in a similar manner to class autoloading. > 3. It's more consistent, and simpler to teach and maintain. > > It's rare that you get all of these together, often you have to make > tradeoffs within them. The approach I originally proposed also solves 1. and 2. (mostly) with very little backwards incompatibility. Consistency is absolutely something to strive for, but not at the cost of breaking most PHP code. To clarify on 2.: The main issue with function autoloading today is that the engine needs to trigger the autoloader for every unqualified call to global functions, given that the autoloader might declare the function in local scope. As most unqualified calls are global calls, this adds a huge amount of overhead. Gina solved this in part by aliasing the local function to the global one after the first lookup. However, that still means that the autoloader will trigger for every new namespace the function is called in, and will also pollute the function table. Reversing the lookup order once again avoids local lookup when calling global functions in local scope, which also means dodging the autoloader. The caveat is that calling local functions in local scope triggers the autoloader on first encounter, but at least it can be marked as undeclared in the symbol table once, instead of in every namespace, which also means triggering the autoloader only once. Ilija
Re: [PHP-DEV] Require C11 in PHP 8.4
Hi Levi On Tue, Aug 13, 2024 at 5:00 PM Levi Morrison wrote: > > On Tue, Aug 13, 2024 at 8:17 AM Ilija Tovilo wrote: > > > > Just to state it officially: You object to switching to C11 in 8.4? In > > that case, we'll have to postpone. > > I don't object. I think it's smarter to wait, and do more than just > redeclare a few typedefs. But I won't block it. Thank you for the clarification. In that case, I'd say let's go forward with the proposal. I agree that we should be making use of more C11 features in the future, once the opportunity presents itself. > If only documentation changes, then I think the risk is lower. We > definitely should not be doing configure checks in 8.4 with how little > time there is, except maybe _one_ that attempts to redeclare a typedef > to see if it compiles, and give a nicer error message. I agree. I will create a PR to provide a ./configure time error message if the compiler doesn't support typedef redeclarations. Adopting C11 does mean we can start using other C11 features too, though that should naturally happen only on (future) master rather than 8.4, given that ABI freeze is only a little over a month away. Ilija
Re: [PHP-DEV] Require C11 in PHP 8.4
Hi Levi On Mon, Aug 12, 2024 at 5:56 PM Levi Morrison wrote: > > Given the timetable, I wouldn't change the C std requirements for 8.4. Just to state it officially: You object to switching to C11 in 8.4? In that case, we'll have to postpone. > I would stop relying on the typedef and forward declare only the > struct, and that. Note that although Windows supports C11, it does not > support all features including atomics. Someone chimed in to say that > they are available, but this doesn't match the information I got from > a coworker who did a similar test. Given conflicting information and > the short timetable, I think we should lean towards being cautious. I > hope for 8.5/9.0 we can move to C11/C17 which can improve the typedef > situation, simplify our atomics handling, and more. I'm a bit confused about the relevance of C11 atomics. As I'm sure you're aware, they remain optional in C17/C23. So, we'll need to support a fallback, my suggestion was not to remove the fallback. Essentially, code-wise, nothing would change if we adopt C11, except being allowed to redeclare typedefs. Apart from that, only the documentation would change. Ilija
Re: [PHP-DEV] [DISCUSSION] C++ Enhancements in Zend API
On Mon, Aug 12, 2024 at 9:00 PM Lanre wrote: > > I didn’t realize this was an open mic for Rust devs to flaunt their > ignorance, but since you’ve decided to chime in, let me spell it out for you. > Rust has absolutely nothing to do with this discussion, so try to stay on > topic. Nowhere did I mention a move to C++; perhaps reading comprehension > isn’t your strong suit. May I remind you again that we expect all participants on this list to be polite. Ad-hominem attacks are not tolerated. Please freshen up on the mailing list rules. https://github.com/php/php-src/blob/master/docs/mailinglist-rules.md Ilija
Re: [PHP-DEV] Require C11 in PHP 8.4
> Giovanni's remark that this would impact many people was challenged by > Jakub [1] which didn't get a response. I believe it's safe to assume > that this isn't the case. I somehow forgot to link this reference, sorry about that. [1] https://externals.io/message/124706#124717
Re: [PHP-DEV] Require C11 in PHP 8.4
Hi Christoph On Sat, Aug 10, 2024 at 2:19 PM Christoph M. Becker wrote: > > On 01.08.2024 at 23:57, Ilija Tovilo wrote: > > So skimming the whole discussion[1] it seems that most are generally > fine with bumping the requirements to C11, except for Giovanni Giacobbi > (whose draft PR[2] had no further discussion so far), and maybe for some > uncertainties regarding some less used compilers. Giovanni's remark that this would impact many people was challenged by Jakub [1] which didn't get a response. I believe it's safe to assume that this isn't the case. > And if we're going with C11, figuring out the details (which > configuration check to use, or only documenting the requirement) can > still be decided somehwat later, in my opinion. The consensus seems to be that bumping to C11 is ok. While keeping C99 compatibility wouldn't be a big hassle, it doesn't seem necessary anymore. I would agree with Christoph that documentation is enough for now. If your compiler happens to support all the C11 features we require, without full C11 compliance, then that's ok. If we want to reduce bug reports, a configure check for typedef redeclarations, nudging people in the right direction with an appropriate message might help. Verifying C11 support might be tricky, given that the compiler flags are not standardized. But I'm not a build system expert, there might be an approach I'm not aware of. Ilija
Re: [PHP-DEV] [RFC] Add Directive to Make All Namespaced Function Calls Global
On Mon, Aug 5, 2024 at 3:33 PM Nick Lockheart wrote: > > This is a different problem that could be solved by a sandbox API. Not sure which case we were talking about then. ClockMock is what I've been referencing all along. > > Well, ok. But then we're back to prefixing global calls, which > > defeats the purpose of the proposal. > > Global functions would only need a prefix `\` in the *very rare* cases > where local functions are set as the default. For most people, the \ > would be omitted, as globals would be set as default for unqualified > function names. Right. But apart from mocking, what are these cases? If performance were no longer an issue, "using global functions" just makes the language harder to use, removing the local fallback. "using local functions" may be useful in namespaced code making many calls to functions within the same namespace. In that case, it would probably be more useful to switch the lookup order back instead. If you want to pay zero performance penalty, you can prefix global calls with \. You'd need to do that with "using local functions" anyway. As for mocking: If the code needs to change either way, why not make it testable in the first place, e.g. through dependency injection for time()? At least this only requires changing the calls that are mocked, instead of all the calls that aren't. The main benefit of the approach from ClockMock is that your code (probably) doesn't need to change. I do think that the entire approach is hacky, and probably worth solving on a language-level, at least if possible without adding limitations to the engine. A good first start would be to know what functions are commonly mocked with this approach. Ilija
Re: [PHP-DEV] [Concept] Flip relative function lookup order (global, then local)
On Mon, Aug 5, 2024 at 1:23 PM Derick Rethans wrote: > > On Fri, 2 Aug 2024, Ilija Tovilo wrote: > > > As for providing a migration path: One approach might be to introduce > > an INI setting that performs the function lookup in both local and > > global scope at run-time, and informs the user about the behavioral > > change in the future. > > That INI setting would control the *warning*, and not the > *functionlity*, right? Yes, that was my suggestion. First, in a future minor version, an INI option could be added that would warn when finding both a local and global function when performing an unqualified function call. The only reason to hide this behind a setting is to avoid the cost of a double lookup in production code when one isn't necessary, i.e. when calling local functions in some namespace. > I am surprised that it is that much of a performance benefit as well, > but I am also concerned about the BC impact. But if that isn't too much, > then I guess we need to consider this, but only for a major version. Not > something I believe we can change in a 8.x version. Sure, waiting for 9.0 sounds reasonable if we were to choose this approach. Ilija
Re: [PHP-DEV] [RFC] Add Directive to Make All Namespaced Function Calls Global
On Mon, Aug 5, 2024 at 2:27 PM Nick Lockheart wrote: > > > > I'm not sure your proposal solves the mocking problem. If the engine > > is to interpret all non-fq calls as global or local, how would a > > library include your file while switching this configuration, when it > > is implemented as some directive in the file? > > I'm not sure I understand this question. Consider this example: > Also, how would only singular functions be mocked when there is no > > fallback to the global scope for the rest of the functions used > > within the file? That would necessitate mocking all functions, even > > the unmodified ones. > > If a developer needed to override built-in functions (in a specific > file) with local ones, and still use some built-in functions, they > would then need to use the fully-qualified `\function();` to call the > built-in. Well, ok. But then we're back to prefixing global calls, which defeats the purpose of the proposal. I think it's much preferable to look for a different, more robust mocking solution that also works for unnamespaced code and fully qualified calls. Ilija
Re: [PHP-DEV] [RFC] Add Directive to Make All Namespaced Function Calls Global
On Mon, Aug 5, 2024 at 10:22 AM Nick Lockheart wrote: > > > Sorry, my language was not precise enough. Your proposal suggests > > making unqualified calls global when the directive is present, > > whereas my proposal suggests keeping local scope as a fallback, hence > > the two not being compatible. > > Keeping local scope as a fallback would still require an NS lookup > opcode. In the general sense: Yes, we would need to keep the NS lookup opcode. However, functions that are known to exist in global scope would not compile to it. The same goes for your solution, as code that doesn't opt-into disambiguated lookups will still require the ZEND_INIT_NS_FCALL_BY_NAME opcode. > My proposal is that there still be an NS lookup as the default > behavior, but allow an optional override. > > That is, users would have three options: > > (1) resolve all unqualified function names to global (no NS lookup) > (2) resolve all unqualified function names to local (no NS lookup) > (3) Use the default behavior, which at present, is local first, then > global, with a namespace lookup. (no change to user code, keeps BC) > > Your proposal reverses the order of the default NS lookup from local > first to global first. > > That's not mutually exclusive. Sure. But it's questionable whether the motivation of introducing one when the other already exists is still there. > If my proposal *and* yours were *both* accepted, users would have these > options: > > (1) resolve all unqualified function names to global (no NS lookup) > (2) resolve all unqualified function names to local (no NS lookup) > (3) Use the default behavior, which would now be global first, then > local, with a namespace lookup. (no change to user code, keeps BC > working, but changes the default automatic lookup order) > > I believe that my propsal also helps yours be more viable, too. Because > if your proposal was accepted alone, and the NS lookup order changed, > then it would no longer be possible to do unit testing with stubs that > replace built-in functions, UNLESS there was also an option for > developers to instruct the compiler to use local functions over global > ones. I'm not sure your proposal solves the mocking problem. If the engine is to interpret all non-fq calls as global or local, how would a library include your file while switching this configuration, when it is implemented as some directive in the file? Also, how would only singular functions be mocked when there is no fallback to the global scope for the rest of the functions used within the file? That would necessitate mocking all functions, even the unmodified ones. > > What I'm saying is that: > > > > 1. If the vote fails, it might have been because some people don't > > want opt-in behavior. Niels just voiced this opinion. > > Every token added to a PHP file is opt-in behavior that changes how the > parser works. I don't understand what you mean. What Niels was suggesting is that adding more context dependency is undesired. I.e. language semantics should not be dependent on configuration. > I think something needs to be at least as formal as an RFC or people > won't take it seriously enough to contribute feedback. > > I'm trying to start a discussion on this and get some momentum so that > we can hopefully implement something in a near future version. Fair enough. But it should be made clearer what is being voted on. >From reading the RFC, it is not clear that there even are alternative options, which ones have been proposed before, what points have been raised in the discussion, etc. Ilija
Re: [PHP-DEV] [RFC] Add Directive to Make All Namespaced Function Calls Global
On Sun, Aug 4, 2024 at 8:41 PM Nick Lockheart wrote: > > > > > > But that's not quite what the RFC says: > > > > > I am asking that we discuss and vote on the following question: > > > > > > “Should there be some way for developers to signal to the parser at > > > compile time that all unqualified function names found in a > > > namespace context are global, without a namespace lookup?” > > > > Which implies: > > > > 1. There is _some_ change to syntax. > > 2. All unqualified calls become global. > > I did not intend for all unqualified calls to become global, unless the > new directive is present. Sorry, my language was not precise enough. Your proposal suggests making unqualified calls global when the directive is present, whereas my proposal suggests keeping local scope as a fallback, hence the two not being compatible. What I'm saying is that: 1. If the vote fails, it might have been because some people don't want opt-in behavior. Niels just voiced this opinion. 2. If the vote is accepted, it would void my suggestion, because it is not compatible with the conditions laid out in your proposal. If this were a straw poll, rather than an RFC vote, it would be clearer that there is no mandatory approach or policy being accepted. A vote with multiple options to pick an approach (opt-in through some directive, hard BC break, flipping lookup order, nothing at all) might be more appropriate. Ilija
Re: [PHP-DEV] [RFC] Add Directive to Make All Namespaced Function Calls Global
Hi Nick On Sun, Aug 4, 2024 at 7:29 PM Nick Lockheart wrote: > > So I wanted to get a "yes" from the people who need to say yes, then > discuss all of those things you mentioned: > > 1. File level vs global > 2. Syntax > 3. Alternative options > > But I think we need a "yes" for the concept first. Otherwise the vote > will fail on the details, as they have in the past. But that's not quite what the RFC says: > I am asking that we discuss and vote on the following question: > > “Should there be some way for developers to signal to the parser at compile > time that all unqualified function names found in a namespace context are > global, without a namespace lookup?” Which implies: 1. There is _some_ change to syntax. 2. All unqualified calls become global. So, it doesn't seem like there's room for alternative approaches within the definition of your RFC. Both of these points are not compatible with my proposal. Ilija
Re: [PHP-DEV] [Concept] Flip relative function lookup order (global, then local)
Hi Claude On Fri, Aug 2, 2024 at 9:02 PM Claude Pache wrote: > > I propose the following alternative approach: > > * establish a restricted whitelist of global functions for which the > performance gain would be noteworthy if there wasn’t any need to look at > local scope first; > > * for those functions, disallow to define a function of same name in any > namespace, e.g.: https://3v4l.org/RKnZt > > That way, those functions could be optimised, but the current semantics of > namespace lookup would remain unchanged. That would be an improvement over the status quo. However, if you look at the bullet points in my original email, while some of the optimizations apply only to some functions (CTE, custom opcodes and frameless calls), others apply to all internal, global functions (double-lookup, lookup by offset, specialized argument passing). Hence, we may only get a fraction of the benefits by restricting the optimization to a handful of functions. I also wonder if the impact is actually bigger, as then there's no workaround for redeclaring the function, requiring much bigger refactoring. Ilija
Re: [PHP-DEV] [Concept] Flip relative function lookup order (global, then local)
Hi Rob On Fri, Aug 2, 2024 at 7:10 PM Rob Landers wrote: > > So, what you’re saying is that symfony and laravel can get a performance > increase by simply adding a \ in the right places? Why don’t they do that > instead of changing the language? Nothing, of course. However, a Symfony maintainer has expressed uninterest in prefixing all internal function calls, including automated use statements at the top of the file. Even if they did, most users will not. > For functions/classes in the same exact namespace, you don’t need a use > statement. But after this change, you do in certain cases? > > namespace Foo; > > function array_sum($bar) {} > > function baz($bar) { > return array_sum($bar); > } > > So, how do you use that function in the same file? Yes. But I'm not sure how that's different from today? If there's a local and global function declared with the same name, and you intend to call the global one, you'll already need to disambiguate the call with a \. With this change, your two options would be to: * Prefix your calls with namespace\. That's quite ugly, but is the syntax we currently offer. * Add a `use array_sum;` to the top of the file. An explicit use has upsides too. It makes it much more obvious that the global function is shadowed. > We can only see open source code when doing impact analysis. This means > picking even a slightly “popular” name could go very poorly. Yes, and there are many more than 10 000 composer repositories. An impact analysis can give you an approximation for breakage, not absolute numbers. Ilija
Re: [PHP-DEV] [RFC] Add Directive to Make All Namespaced Function Calls Global
Hi Nick I find it a bit unfortunate that you gave my thread barely any time to be discussed. On Sun, Aug 4, 2024 at 1:42 PM Nick Lockheart wrote: > > I would like to discuss and then vote on this proposal as a feature, > without getting into any specifics of syntax. > > I propose that we vote yes/no on if there should be some way, whatever > that way ends up being, to tell the parser to always treat unqualified > function names as global. This can be achieved in various ways. For example: * Per-file, via declare(), new use syntax, or whatnot. * Globally, through an INI setting. * Via hard cut in a new PHP version, where a relative calls just stop looking up local scope. * By flipping lookup order, as proposed in my last thread. One might be in favor of one approach but not others. Of course, everybody will be in favor of a "free" 2-4% speedup. Hence, a yes vote won't mean much. Ilija
[PHP-DEV] [Concept] Flip relative function lookup order (global, then local)
Hi everyone As you probably know, a common performance optimization in PHP is to prefix global function calls in namespaced code with a `\`. In namespaced code, relative function calls (meaning, not prefixed with `\`, not imported and not containing multiple namespace components) will be looked up in the current namespace before falling back to the global namespace. Prefixing the function name with `\` disambiguates the called function by always picking the global function. Not knowing exactly which function is called at compile time has a couple of downsides to this: * It leads to the aforementioned double-lookup. * It prevents compile-time-evaluation of pure internal functions. * It prevents compiling to specialized opcodes for specialized internal functions (e.g. strlen()). * It requires branching for frameless functions [1]. * It prevents an optimization that looks up internal functions by offset rather than by name [2]. * It prevents compiling to more specialized argument sending opcodes because of unknown by-value/by-reference passing. All of these are enabled by disambiguating the call. Unfortunately, prefixing all calls with `\`, or adding a `use function` at the top of every file is annoying and noisy. We recently got a feature request to change how functions are looked up [3]. The approach that appears to cause the smallest backwards incompatibility is to flip the order in which functions are looked up: Check in global scope first, and only then in local scope. With this approach, if we can find a global function at compile-time, we know this is the function that will be picked at run-time, hence automatically enabling the optimizations above. I created a PoC implementing this approach [4]. Máté has kindly benchmarked the patch, measuring an improvement of ~3.9% for Laravel, and ~2.1% for Symfony (https://gist.github.com/kocsismate/75be09bf6011630ebd40a478682d6c17). This seems quite significant, given that no changes were required in either of these two codebases. There are a few noteworthy downsides: * Unqualified calls to functions in the same namespace would be slightly slower, because they now involve checking global scope first. I believe that unqualified, global calls are much more common, so this change should still result in a net positive. It's also possible to avoid this cost by adding a `use function` to the top of the file. * Introducing new functions in the global namespace could cause a BC break for unqualified calls, if the function happens to have the same name. This is unfortunate, but likely rare. Since new functions are only introduced in minor/major versions, this should be manageable, but must be considered for every PHP upgrade. * Some mocking libraries (e.g. Symfony's ClockMock [5]) intentionally declare functions called from some file in the files namespace to intercept these calls. This use-case would break. That said, it is somewhat of a fragile approach to begin with, given that it wouldn't work for fully qualified calls, or unnamespaced code. I performed a small impact analysis [6]. There are 484 namespaced functions shadowing global, internal functions in the top 1000 composer packages. However, the vast majority (464) of these functions come from thecodingmachine/safe, whose entire purpose is offering safer wrappers around internal functions. Excluding this library, there are only 20 shadowing functions, which is surprisingly little. Furthermore, the patch would have no impact on users of thecodingmachine/safe, only on the library code itself. As for providing a migration path: One approach might be to introduce an INI setting that performs the function lookup in both local and global scope at run-time, and informs the user about the behavioral change in the future. To mitigate it, an explicit `use function` would need to be added to the top of the file, or the call would need to be prefixed with `namespace\`. The impact analysis [6] also provides a script that looks for shadowing functions in your project. It does not identify uses of these functions (yet), just their declarations. Lastly, I've already raised this idea in the PHP Foundations internal chat but did not receive much positive feedback, mostly due to fear of the potential BC impact. I'm not particularly convinced this is an issue, given the impact analysis. Given the surprisingly large performance benefits, I was inclined to raise it here anyway. It also sparked some related ideas, like providing modules that lock namespaces and optimize multiple files as a singular unit. That said, such approaches would likely be significantly more complex than the approach proposed here (~30 lines of C code). Anyway, please let me know about possible concerns, broken use-cases, or any alternative approaches that may come to mind. I'm looking forward to your feedback. Ilija [1] https://github.com/php/php-src/pull/12461 [2] https://github.com/php/php-src/pull/13634 [3] https://github.com/php/php-src/issues/
[PHP-DEV] Require C11 in PHP 8.4
Hi everyone We've gotten a bug report about compile errors in the PHP 8.4 alpha on old C99 compilers (and on some modern compilers when passing the -std=c99 flag). Specifically, C99 does not support typedef redeclarations, which is a C11 feature. ```c // some_header.h typedef struct _zval_struct zval; void some_func(zval *zv); // zend_types.h struct _zval_struct { ... }; typedef struct _zval_struct zval; ``` Some headers might want to forward declare zval so that zend_types.h doesn't have to be included. The two primary reasons you might want to avoid that are 1. to reduce compile times if the header is very large and 2. to prevent recursive header dependencies. However, in C99 this code is actually illegal if both of these headers end up being included, because redeclarations of typedefs are not allowed. To fix it, the typedef must be removed and the signatures must refer to the struct directly. ```c // some_header.h struct _zval_struct; void some_func(struct _zval_struct *zv); ``` I started fixing these in a PR [1] which required more changes than expected. After a short discourse, we were wondering whether it might be better to switch to a newer C standard instead. Our coding standards [2] currently specify that compiling php-src requires C99. The Unix installation page on php.net [3] claims it is ANSI C, which is certainly outdated. There have been suggestions to require C11 for a while, which should be well supported by all compilers shipped with maintained distributions. GCC gained support for C11 in 4.7 [4], LLVM Clang in 3.1, both released in 2012. For context, Debian Bullseye comes with GCC 10.2 [5], Ubuntu Focal Fossa comes with GCC 9.3 [6], RHEL 8 comes with GCC 8.x [7]. The biggest blocker in the past was MSVC, which has finally added C11 support in Visual Studio 2019 [8]. Technically, Visual Studio 2015 and 2017 are still supported by Microsoft [9], but according to Christoph they are no longer used for PHP builds since version 8. Hence, it seems like it would be ok to bump our C compiler requirement to C11. We'd like to make this change before beta 1 if there are no objections. There are no immediate plans to make non-optional use of other C11 features, although that is conceivable at some point. Another benefit brought up by Christoph is that C11 relegated the requirement for the compiler to support VLAs (variable-length arrays) which was never implemented by MSVC, technically not complying with the currently required standard. If you do have any concerns, please let me know. Ilija [1] https://github.com/php/php-src/pull/15079 [2] [https://github.com/php/php-src/blob/8b6f14a9a221599b076e4bc3570f013a7e65aa21/CODING_STANDARDS.md?plain=1#L12](https://github.com/php/php-src/blob/8b6f14a9a221599b076e4bc3570f013a7e65aa21/CODING_STANDARDS.md?plain=1#L12) [3] https://www.php.net/manual/en/install.unix.php [4] https://stackoverflow.com/a/16256596 [5] https://packages.debian.org/bullseye/gcc [6] https://packages.ubuntu.com/focal/gcc [7] https://access.redhat.com/solutions/19458 [8] https://devblogs.microsoft.com/cppblog/c11-and-c17-standard-support-arriving-in-msvc/ [9] https://learn.microsoft.com/en-us/visualstudio/productinfo/vs-servicing
Re: [PHP-DEV] Request for opinions: bug vs feature - change intokenization of yield from
Hi Juliette On Sat, Jul 20, 2024 at 6:40 PM Juliette Reinders Folmer wrote: > > On 20-7-2024 18:04, Tim Düsterhus wrote: > > As I've said: I agree that the current situation is unfortunate. But the > correct solution is not "disallow comments", but "split the T_YIELD_FROM into > T_YIELD T_WHITESPACE T_FROM_FOR_YIELD_FROM". > > Tim, you're making my point for me. This is *exactly* why the current change > should be reverted. > > I'm not against changing the tokenization of "yield from" and the GH ticket > thread also contains an alternative proposal for this from Bob [1], but like > Matthew also said [2]: if that's something we want to do, let's have a proper > discussion about it and let it go through an RFC and be a documented change. > > And not, like it is now, an undocumented, random change creating an > inconsistency in the Tokenizer. Thank you for raising this issue. Let me share my viewpoint as the person who has made the change. First of all, while allowing comments between yield and from was obviously intentional, the side-effect of embedding the comment in the yield_from token was something I did not sufficiently consider. That's my bad. I agree that it's not very likely that many people started placing comments between yield and from, but certainly not impossible. The main problem with reverting this change is that such code will completely stop working. Patch updates are routinely installed on servers without additional testing, so it's very much possible that we would brick peoples production environments. To compare the impact on PHP_CodeSniffer: From what I understand (please correct me if I'm wrong), there are really only two possible scenarios when encountering comments between yield and from: * PHP_CodeSniffer doesn't see the comment and acts as if it weren't there, and potentially removes it when fixing errors automatically. * PHP_CodeSniffer thinks this is a syntax error and thus reports it. While that's not optimal, it is limited to a subset of the already small set of people using this new feature, namely the ones also using PHP_CodeSniffer. It is also limited to development environments, where PHP_CodeSniffer is executed. What I'd suggest instead is fixing this for master in a way that makes it simple for PHP_CodeSniffer to add support, or to just revert it there. If it is indeed true that nobody uses this feature after 8 months, then waiting another 4 for a proper fix doesn't seem like an issue. In the likely case of not finding consensus on this issue: Setting up a vote doesn't seem straight-forward to me either. Would we consider 8.3 the baseline, with reverting the change requiring the 2/3 majority, or do we consider 8.2 the baseline? Ilija
Re: [PHP-DEV] [RFC] Asymmetric Visibility, v2
Hi Rob On Sat, Jul 20, 2024 at 3:47 PM Rob Landers wrote: > > On Sat, Jul 20, 2024, at 03:14, Larry Garfield wrote: > > On Wed, May 29, 2024, at 2:15 PM, Larry Garfield wrote: > > > > https://wiki.php.net/rfc/asymmetric-visibility-v2 > > Hi folks. After a side quest to polish off hooks, we're nearly ready to > bring aviz to a vote. > > We've made one change since we last discussed it: Specifically, Ilija > realized that __set's behavior is already inconsistent, so supporting it for > aviz properties with invisible set would make it even more inconsistent, not > less. For that reason, we've changed the __set behavior such that a > non-readonly aviz property will not trigger __set. Further details are in > the RFC, but in short, all of the use cases for that behavior now have better > alternatives, such as property types, hooks, and aviz itself. So there's > really no point to falling back to __set in edge cases. > > https://wiki.php.net/rfc/asymmetric-visibility-v2#interaction_with_set_and_unset > > Hmm, > > There’s a lot of existing code that relies on this behavior, it’s pretty much > the best way to create proxies without requiring code generation. > > This is a pretty massive breaking change. There was a miscommunication between Larry and me. The change is not to any existing behavior. __get/__set are currently called under two circumstances: * The properties _full_ visibility is not met. * The property was explicitly unset. We're not changing this behavior. Instead, we decided not calling __set for asymmetric visibility, when only the set visibility isn't met. Before making this decision, implicitly implying protected(set) for a readonly property would have led to __set being called (because the scope protection now comes from asymmetric visibility, rather than readonly itself), which would have been a change to the current behavior. So, in short: If only the set visibility isn't met, we're now throwing an error. This is consistent with readonly today, and with get-only hooks. If somebody wants to change this in the future, they can do so without any BC breaks, and most likely should make the behavior consistent across all of these three cases. Ilija
Re: [PHP-DEV] Sync CODEOWNERS
Hi Christoph On Thu, Jul 18, 2024 at 2:09 PM Christoph M. Becker wrote: > > So I suggest to sync CODEOWNERS for all active branches (and maybe even > security branches). > > What do you think? I think back when it was introduced it wasn't clear (at least to me) that PRs would request reviews based on the CODEOWNERS file of the target branch, rather than the default branch. Given that's how it works, syncing it in all active branches makes sense to me. Ilija
Re: [PHP-DEV] [RFC] Property Hook improvements
Hi Derick On Wed, Jul 10, 2024 at 11:32 AM Derick Rethans wrote: > > On Mon, 1 Jul 2024, Larry Garfield wrote: > > > > > https://wiki.php.net/rfc/hook_improvements > > 1. Remove the proactive guard against recursive backing value access > > "which would eventually trigger a stack overflow" > > Would it though? PHP 8.4 has an actual protection against this now, > instead of getting the white-screen-of-death. > > Not that it matters much, but it's probably good to be precise in the > language. Yes, this excerpt is referring to the PHP stack overflow detection, which will throw an exception rather than segfault. That is, on all the architectures that already support stack overflow detection. We're just reusing the same mechanism already used for normal function calls. --- Hi Claude > 1. Removing the guard against recursion is sensible to me, as the sort of bug > it is supposed to prevent is not specific to property accessors. IMO, a > better solution to the issue is to implement a global upper limit on the call > stack size. Currently, I am able to generate a call stack of more than 10 > millions items before an OOM error occurs; PHP should have thrown a stack > overflow error long before that. But this is entirely orthogonal to property > hooks. PHP allows using recursion for algorithms on arbitrarily large data because function calls do not cause the C stack to grow. Limiting the PHP stack size would thus require these algorithms to be rewritten to iterative ones, which can make them more complicated. What we could do is check for stack size on OOM and indicate that the OOM was most likely caused by infinite recursion. Anyway, This does not really apply to hooks, because it will (usually) cause VM reentry for recursion, growing the stack size and thus triggering the "fast" stack overflow detection error. Ilija
Re: [PHP-DEV] Re: VCS Account Request: youkidearitai
Hi Yuya! On Sat, Jun 29, 2024 at 11:11 PM youkidearitai wrote: > > 2024年4月9日(火) 7:55 youkidearitai : > > > > Mainly review and approve pull request to mbstring extension. > > (probably everything related to Unicode and other character encoding) > > Alex Dowad (alexdowad) suggested that give to me. > > https://github.com/php/php-src/pull/13906#issuecomment-2041585626 > > Hello, Internals > I want to continue to contribute to mbstring extension. > And also I want become to code owner to mbstring extension. > Could I become grant write access? > > Please check below: > https://github.com/php/php-src/commit/0bcc1e613bc8b62859fc75afeeb83651555fb44a Sorry for the delay. I granted you access to the php-src repository. Please familiarize yourself with our merging process [1]. Note that I unfortunately cannot (and wouldn't know how to) create main.php.net accounts. They are pretty much unused nowadays except for the wiki. I can ask Derick to create one once he's back from vacation. Feel free to ping me if you have any questions. Thank you for contributing to PHP! Ilija [1] https://wiki.php.net/vcs/gitworkflow
Re: [PHP-DEV] [Early Feedback] Pattern matching
Hi Rob On Wed, Jun 26, 2024 at 10:36 PM Rob Landers wrote: > >> On Wed, Jun 26, 2024, at 21:50, Morgan wrote: >> >> So the issue has nothing to do with this hypothetical infinity of >> unobservable nulls, and comes entirely down to the fact that with this >> pattern a variable may pass >> a) because it does not have a key named 'foo', or >> b) because it has a key named 'foo' with a string value. >> >> In other words, "this key is optional, but if it is defined it must > match this pattern". > > Seriously, write out using it both ways. I asked in the beginning for someone > to give a realistic example showing a practical difference in the final > implementation and I haven’t seen one. I will gracefully eat my hat. The main > issue is that key-existence-check is logically inconsistent with itself (if > you say the key shouldn’t exist or be a string, you’d be surprised to get > null from that key!) function test($value) { if ($value is ['foo' => ?string]) { $foo = $value['foo']; } } test([]); With your approach, the example above would emit a warning, even though the context within test() doesn't look like it should. If ?string means that the index might or might not exist, all code that accesses them must check for existence, even when not needing to handle null itself. That doesn't seem desirable to me. I also think the issue goes further. If anything|null means that the offset might not exist, does that include mixed? That makes ['foo' => mixed] essentially useless. Ilija
Re: [PHP-DEV] [Early Feedback] Pattern matching
Hi Rob On Tue, Jun 25, 2024 at 9:05 AM Rob Landers wrote: > > On Tue, Jun 25, 2024, at 01:20, Ilija Tovilo wrote: > > On Mon, Jun 24, 2024 at 9:54 PM Robert Landers > wrote: > > > > To be honest, this is one of the smaller concerns I have with the new > > syntax. There might be some misunderstanding here, though. A > > non-existent key is NULL, always has been, and always will be. > > This is just not accurate. Inexistent indexes are not null in PHP, > they are undefined. PHP implicitly coerces undefined to null, because > undefined is not a value accessible to users. The same occurs when > accessing $undefinedVariable. For arrays, this fact is observable > through `foreach`, warnings when accessing the index, and likely > others. > > This is a bit like telling someone who fell off a ladder that they didn’t > “technically” fall, instead the Earth and them pulled at each other until > they collided and the ground + body absorbed the energy. > > While yes, you are “technically” correct, what you describe is essentially > unobservable from the context of the running code (unless you turn the > warning into an error/exception). For all direct accesses of array values > ($arr['key']) an array is infinitely full of nulls (I have actually depended > on this property at one point for a bloom filter). If null array values were indeed unobservable, then [] would be === to [null] (or at least ==), and a foreach over [null] would result in 0 iterations. But neither of those are the case. > So yes, `[?'foo' => string]` and `['foo' => ?string]` are indeed > different. The former accepts `[]`, while the latter accepts `['foo' > => null]`. > > Are they actually different in practice though? That was my point. After the > “is” in both cases, you’ll have to use null-coalescence to retrieve the > value. For all intents, they are the same resulting code. If you can show a > difference in the resulting code and how it is an improvement, I may be > inclined to agree, but I can’t think of one. Sure. If a null value were to mean "not set", then ['foo' => string] should accept ['foo' => 'foo', 'bar' => null], which is absolutely observable if the code assumes it won't see any additional indexes. Ilija
Re: [PHP-DEV] [Early Feedback] Pattern matching
On Mon, Jun 24, 2024 at 9:54 PM Robert Landers wrote: > > > The first means b is an optional key, but if it’s there, can only be a > > string. The second says b is a required key, but it may be a string or > > null. If there were a binding involved, that determines the type of the > > binding in incompatible ways. I’m fine with requiring bindings to be > > nullable for optional keys, but it strikes me as strictly less flexible and > > not consistent with the rest of PHP’s behavior, at least not under E_ALL. > > > > To be honest, this is one of the smaller concerns I have with the new > syntax. There might be some misunderstanding here, though. A > non-existent key is NULL, always has been, and always will be. This is just not accurate. Inexistent indexes are not null in PHP, they are undefined. PHP implicitly coerces undefined to null, because undefined is not a value accessible to users. The same occurs when accessing $undefinedVariable. For arrays, this fact is observable through `foreach`, warnings when accessing the index, and likely others. So yes, `[?'foo' => string]` and `['foo' => ?string]` are indeed different. The former accepts `[]`, while the latter accepts `['foo' => null]`. > $arr = ['a' => 'a string']; > $arr is ['a' => string, ?'b' => $value, ...]; > > This syntax implies that a non-existent key is a special case, and if > it passes as-is, it will be. If there is a binding and the key is > missing, what happens to that binding? This is the same problem as `|`. Variable bindings within optional keys must be forbidden. I already mentioned that to Larry when we thought about this idea. Ilija
Re: [PHP-DEV] Re: RFC karma request
Hi Bilage On Thu, Jun 20, 2024 at 7:55 PM Bilge wrote: > > > > Re: Static class, I would like to propose an RFC for this feature. My > > wiki account is: bilge. > > > > Cheers, > > Bilge > > > If there is some reason why I should not be able to do this, please let > it be known. Sorry for the delay. I granted you RFC karma. Good luck! Ilija
Re: [PHP-DEV] [RFC] Static Constructor
Hi Erick On Wed, Jun 19, 2024 at 2:34 PM Erick de Azevedo Lima wrote: > > You can read the RFC here: > https://wiki.php.net/rfc/static_constructor I see that you're using zend_class_init_statics() as a hook to call __static_construct(). This makes the initialization order unpredictable, because static properties are initialized lazily when the class is first used (when instantiated, when accessing constants, etc.). Essentially, this recreates the same problem described in the "new in initializer" RFC: https://wiki.php.net/rfc/new_in_initializers#unsupported_positions > New expressions continue to not be supported in (static and non-static) > property initializers and class constant initializers. The reasons for this > are twofold: > [snip] > For static property initializers and class constant initializers a different > evaluation order issue arises. Currently, these initializers are evaluated > lazily the first time a class is used in a certain way (e.g. instantiated). > Once initializers can contain potentially side-effecting expressions, it > would be preferable to have a more well-defined evaluation order. However, > the straightforward approach of evaluating initilizers when the class is > declared would break certain existing code patterns. In particular, > referencing a class that is declared later in the same file would no longer > work. Lazy evaluation might be ok if order is clearly defined. Making the order undefined makes it hard (or impossible) to understand which symbols declared in the current file may be used from __static_construct(). The alternative mentioned above (calling __static_construct() at class-declaration-time) likely breaks too much existing code (because it would also require calling static initializers just before that, which may reference symbols declared later on), and is further complicated by early-binding. I'm not sure what the best approach is here. Ilija
Re: [PHP-DEV] Wiki registration and request RFC karma
Hi Erick On Mon, Jun 17, 2024 at 11:46 PM Erick de Azevedo Lima wrote: > > I've been reading the internals mailing list for some time and giving my 2 > cents here and there. > Now I have created a new wiki account (erickcomp) and I'd like to request RFC > karma for a new RFC I worked on during my recent studies of the php-src. I have granted you RFC karma. Good luck! Note that the RFC process suggests first introducing the concept on the list first, before writing an RFC. You might save yourself some time if the idea turns out not to be desired. Ilija
Re: [PHP-DEV] Property hooks, and &get-only write behavior
Hi Gina On Thu, Jun 13, 2024 at 3:24 AM Gina P. Banyard wrote: > > On Wednesday, 12 June 2024 at 16:06, Ilija Tovilo > wrote: > > > Hi everyone > > > > [...] > > > > I hope that these semantics are acceptable for everyone. > > > > Ilija > > Hello Ilija, > > I might know what sort of technical issues you've been running into, as I > have some similar issues correctly handling references with my > Container/Offset RFC implementation. > The way I'm handling this is by splitting the "get"/read handler/hook to > *just* be a by-value return, and for fetching creating a new "fetch" handler > that is called in the respective situations. > Therefore, I am wondering if it wouldn't possibly make more sense to add a > "fetch" hook, and ban returning by-ref for the get hook. I don't know exactly what issue you encountered, but it sounds unrelated to mine. The issue was not related to the design of &get, but the way object handlers work for properties. In particular, get_property_ptr_ptr does not provide zval space for temporary values. There are three object handlers relevant: * get_property_ptr_ptr, which returns a direct pointer to the property zval. This is usually used for read+write operations such as ++, or when assigning by-ref, like =&. * read_property * write_property read_property and write_property are also used as fallbacks when get_property_ptr_ptr returns NULL. This is the case for hooks. Essentially, for hooks, we want to make sure the engine does not bypass the set hook by writing to the zval directly. Pure assignments aren't really an issue, I was able to implement them using write_property. When a value is assigned to a &get-only property, we call &get and assign the value to the resulting reference. However, for read+write operations, the flow works like this: * get_property_ptr_ptr returns NULL, indicating that a direct modification of the zval is not possible. * The engine calls read_property instead. * It modifies the resulting value. * It calls write_property with the new value. With this workflow, &get gets called twice, once for read_property and once for write_property, which is the unexpected part. I would expect &get only to be called once, and for this value to be modified in-place. To implement this properly, get_property_ptr_ptr should return the reference instead. However, the way get_property_ptr_ptr is built doesn't allow for this, because unlike read_property, it doesn't have a `zval *rv` parameter the caller must supply that the callee may use in the case of temporary values (such as &__get, or here, &get). So, rather than causing a big internal API change to add the `zend *rv` parameter (this API is very old), I opted to go with the original plan. I hope that helps. Ilija
[PHP-DEV] Property hooks, and &get-only write behavior
Hi everyone While reviewing the property hooks patch, Bob found a discrepancy in the implementation and the specification. There is an unfortunate mistake in one of the examples that suggests specific by-reference behavior that we did not intend to support. Specifically, from https://wiki.php.net/rfc/property-hooks#references (you'll have to scroll down a bit): ```php class C { public string $a { &get { $b = $this->a; return $b; } } } $c = new C(); $c->a = 'beep'; // $c is unchanged. ``` This example implies that the assignment `$c->a = 'beep';` will invoke `&get` to write to the contents of the reference. However, this is not the case. Direct assignments will, in the absence of `set`, fall back to the builtin write operation for backed properties, or throw for virtual properties. Instead of `$c->a = 'beep';`, the example _should_ have said: ```php $ref = &$c->a; $ref = 'beep'; ``` Here, `$ref = &$c->a;` _will_ call `&get`. What the example intended to show is that the reference returned by `&get` is detached from `$c->a` itself. That said, I did give this idea some thought. I think it would not have been unreasonable to automatically call `&get` and assign to the returned reference when there is a `&get` hook but _no_ `set` hook. However, when trying to implement this, I encountered some technical issues that make this currently unfeasible. So, we'd like to move forward with the original semantics: If there is no `set` hook, then the assignment will trigger a builtin write for backed properties, or an error for virtual properties. If you wish to achieve the behavior described above (using `&get` to write to the property), you can implement `set` to perform the write to the reference yourself. I hope that these semantics are acceptable for everyone. Ilija
Re: [PHP-DEV] [RFC] Asymmetric Visibility, v2
Hi Tim On Tue, Jun 4, 2024 at 7:54 PM Tim Düsterhus wrote: > > One thing that would get pretty wonky would be private-read properties: > Private property names are currently internally "mangled" to include the > class name. This allows to define the same private property in multiple > classes of an inheritance chain, without those classes needing to know > about the private properties of each other and making the addition and > removal of a private property not a BC break. For all intents and > purposes those private properties to not exist, unless you are the class > itself. > > I have no idea what the semantics of a public-write, private-read > property should be - and this problem is pretty similar to the > sibling-discussion about making private-set properties implicitly final, > because otherwise the semantics get wonky. > > I believe that the case of making a property public-write, private-read > is best left to a virtual set-only hook. Indeed. A private property with a more permissible set operation is the wrong approach. What we'd want here is a public property with a restricted get operation. This is not quite expressible with the current syntax. We'd need something like `public private(get)`, or `public $prop { private get; }` with the C# equivalent. However, this is quite an edge case, and since it requires additional syntax I don't think it's something we should support without a specific use-case. You can emulate it with a set-only virtual property, if you really want to. Ilija
Re: [PHP-DEV] Fwd: Request for RFC Karma to Propose any_empty and all_empty Methods
Hi Elminson! On Mon, May 27, 2024 at 6:51 PM Elminson De Oleo Baez wrote: > > I hope this message finds you well. I am writing to request RFC karma for my > wiki account in order to propose a new RFC. > > My proposal involves the introduction of two new methods, any_empty and > all_empty, for working with arrays. These methods are designed to provide > boolean outputs indicating whether any of the elements in an array are empty, > or if all elements are empty, respectively. I believe these methods will be > valuable additions to PHP’s array manipulation functionalities. > > Below is a brief overview of the proposed methods: > > any_empty(array $array): bool - This method will return true if any element > in the provided array is empty, and false otherwise. > all_empty(array $array): bool - This method will return true if all elements > in the provided array are empty, and false otherwise. > These methods aim to simplify common array checks and improve code > readability and efficiency. > > I look forward to your approval and any guidance you can provide on moving > forward with this proposal. I'm skeptical personally about these functions. empty() doesn't have the best semantics, and it's going to be rare that all your input types exactly follow these semantics. I think a deeper dive into common validation requirements might be good, to see whether they can be abstracted in some way. It's also worth noting that validation entails much more than just yielding true or false, e.g. coercion or graceful errors. Anyway, I granted you RFC karma. Good luck! Ilija
Re: [PHP-DEV] [RFC] [Vote] #[\Deprecated] attribute
Hi Matthew On Mon, Jun 3, 2024 at 3:15 PM Matthew Weier O'Phinney wrote: > > On Wed, May 22, 2024 at 2:24 AM Benjamin Außenhofer > wrote: >> >> The vote for the RFC #[\Deprecated] attribute is now open: >> >> https://wiki.php.net/rfc/deprecated_attribute >> >> Voting will close on Wednesday 5th June, 08:00 GMT. > > I have voted no for a few reasons: > > - Ideally, I'd like to be able to mark _anything_ as deprecated. In > particular, not being able to mark a _class/interface/enum/etc_ as deprecated > makes this far less useful. While it's true that extending #[Deprecated] to classes would be useful, deprecation already exists as a language concept, and it can be extended to class-like structures without a BC break. > - The "since" parameter is basically worthless to me. It's very easy to find > out the last version that wasn't deprecated. What would be far more useful to > a consumer is an argument indicating when something will be removed (e.g. > $toRemoveInVersion, $versionForRemoval, etc.). This helps me as a user plan > for the future. Did you vote yes in the secondary vote by accident? I voted no on the $since parameter for the same reason: * "How long have I not fixed this?" is not a particularly useful question to ask. "When do I have to fix this?" is more relevant. * The format of $since is intentionally left unstandardized, and it's unclear (to me?) what it refers to. For example, some packages are split into multiple, smaller ones (e.g. Doctrine) with diverging version numbers. The sub-package version number may not be useful to the end-user, who never requires it directly. Similarly, referencing the main package version may be confusing, especially if the ranges of recent main and sub-package versions overlap. Ilija > - The "since" parameter is basically worthless to me. It's very easy to find > out the last version that wasn't deprecated. What would be far more useful to > a consumer is an argument indicating when something will be removed (e.g. > $toRemoveInVersion, $versionForRemoval, etc.). This helps me as a user plan > for the future. > > -- > Matthew Weier O'Phinney > mweierophin...@gmail.com > https://mwop.net/ > he/him
Re: [PHP-DEV] [RFC] Transform exit() from a language construct into a standard function
On Tue, May 28, 2024 at 2:10 PM Gina P. Banyard wrote: > > On Monday, 27 May 2024 at 02:31, Ilija Tovilo wrote: > > > > On Wednesday, 8 May 2024 at 14:40, Gina P. Banyard intern...@gpb.moe > > > wrote: > > > > > > > I would like to formally propose my idea for exit() as a function > > > > brought up to the list on 2024-02-24 [1] with the following RFC: > > > > https://wiki.php.net/rfc/exit-as-function > > > > > > As mentioned early on in private, I don't see a convincing reason to > > remove tokenizer/parser support for exit/die. I'd rather see this > > handled in the parser directly, by converting the standalone keywords > > to function calls. This avoids any backwards incompatibility, and > > avoids special handling in zend_compile_const. > > I must be honest, I don't really understand how parsers work, so I went with > what I could understand and implement. > But I'm struggling to see a reason for keeping the token in the first place, > and what issues hooking into zend_compile_const() has. Mostly because I think handling exit and die as constants is misleading. exit; isn't a constant any more than yield; is. Instead, you could turn exit; into the same AST as exit(); (i.e. a function call), which will make the compiler handle it automatically. If you wish, I can have a quick look. > > Another thing that's probably not too important: The PR likely breaks > > dead code elimination for exit() and die(). This could be re-added by > > checking for the never return type instead. > > Checking for a never return type seems more robust if it wasn't already > supported by DCE. > I will see if I can do this. That would be great! I agree that this can be done in retrospect. Ilija
Re: [PHP-DEV] [RFC] Transform exit() from a language construct into a standard function
Hi Gina On Sun, May 26, 2024 at 11:47 PM Gina P. Banyard wrote: > > On Wednesday, 8 May 2024 at 14:40, Gina P. Banyard wrote: > > > > > I would like to formally propose my idea for exit() as a function brought > > up to the list on 2024-02-24 [1] with the following RFC: > > https://wiki.php.net/rfc/exit-as-function > > As there haven't been any comments for nearly two weeks, I'm planning on > opening the vote for the RFC on Tuesday. As mentioned early on in private, I don't see a convincing reason to remove tokenizer/parser support for exit/die. I'd rather see this handled in the parser directly, by converting the standalone keywords to function calls. This avoids any backwards incompatibility, and avoids special handling in zend_compile_const. Another thing that's probably not too important: The PR likely breaks dead code elimination for exit() and die(). This could be re-added by checking for the never return type instead. You'd also need to special-case the lookup of exit/die in namespaced code, where it will always refer to the global function (as they cannot be declared in userland). Ilija
Re: [PHP-DEV] [DISCUSSION] Checking uninitialized class properties
On Sat, May 18, 2024 at 4:41 PM Rowan Tommins [IMSoP] wrote: > > On 18/05/2024 11:52, Luigi Cardamone wrote: > > Are there any downsides in adding a > > specific syntax to check if a property > > is initialized with any value? > > In my opinion - and I stress that others may not share this opinion - > the entire concept of "uninitialized properties" is a wart on the > language, which we should be doing our best to eliminate, not adding > more features around it. I fully agree. The reason NULL was deemed the "billion-dollar mistake" is not because NULL isn't a useful value, it's because it is implicitly part of types that are commonly never NULL. For example, in C, there's no way through the type system to convey whether a pointer returned from a function may or may not be NULL (ignoring hints through attributes). As such, the user must look at documentation (which may be inaccurate), or even guess whether NULL must be handled. A failure to do so will not result in any compilation errors, but crashes at runtime. Luckily, PHP does not suffer from this issue. null is only allowed when the type says so. However, if you leave your class properties uninitialized, you're essentially recreating this issue. How do you know whether it is safe to access class properties? May they be uninitialized under certain conditions? The type system has no way of conveying this information. There's no way to know, without looking at the implementation. Static analysis won't be able to catch it for you either. So, as Rowan suggested, it is better to hint at this "uninitialized" value that must be handled through the type system, currently most conveniently through single-case enums and union types. Ilija
Re: [PHP-DEV] [Discussion] Why can't I do "{$a::class}"?
On Sun, May 19, 2024 at 1:15 PM Ilija Tovilo wrote: > > Hi Peter > > On Sun, May 19, 2024 at 10:30 AM Peter Stalman wrote: > > > > echo " {A::$static_property} \n"; // doesn't work (unless $static_property > > is a variable) > > echo " {$a::$static_property} \n"; // works > > > > echo " {A::static_method()} \n"; // doesn't work (just text) > > echo " {$a::static_method()} \n"; // works > > > > echo " {A::constant} \n"; // doesn't work > > echo " {$a::constant} \n"; // doesn't work either, but why? > > There were also suggestions to extend strings in a more generic way, > akin to JavaScripts template strings [3], but I didn't have any use > for this myself. > > [3] > https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals What I actually wanted to reference were "tagged templates": https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals#tagged_templates Sorry for the confusion. Ilija
Re: [PHP-DEV] [Discussion] Why can't I do "{$a::class}"?
Hi Peter On Sun, May 19, 2024 at 10:30 AM Peter Stalman wrote: > > echo " {A::$static_property} \n"; // doesn't work (unless $static_property is > a variable) > echo " {$a::$static_property} \n"; // works > > echo " {A::static_method()} \n"; // doesn't work (just text) > echo " {$a::static_method()} \n"; // works > > echo " {A::constant} \n"; // doesn't work > echo " {$a::constant} \n"; // doesn't work either, but why? It would be straightforward to allow all expressions that start with a `$` in string interpolation, as I've noticed a couple of years ago [1]. This restriction seems rather arbitrary. I think I held off proposing this because I was planning on proposing a more complete form of string interpolation [2]. However, the backwards-compatible syntax was largely disliked, so I withdrew the RFC. I wasn't particularly fond of introducing more forms of strings (e.g. $"", f"", etc.) to avoid the BC break, because there are already plentiful: * '' * "" * `` (yes, these allow interpolation) * <
Re: [PHP-DEV] [DISCUSSION] Checking uninitialized class properties
Hi Luigi On Fri, May 17, 2024 at 11:40 PM Luigi Cardamone wrote: > > Here is an example to describe my problem. Imagine a simple > DTO like this: > > class MyDTO{ > public ?int $propA; > public ?int $propB; > } > > Imagine that a Form processor or a generic mapper fill some of > these fields with a null value: > > $dto = new MyDTO(); > $dto->propA = null; > > Sometimes we use DTOs to handle PATCH requests and not all > the properties are mapped with a value. In a scenario like this, > "null" is often a valid value. > > At this point, I need a way to find if a property was initialized or > not but unfortunately "isset" is not a solution. When I write: > > echo isset($dto->propA) ? 'init' : 'not-init'; IMO, "uninitialized" should not be treated as a value. Code outside of the constructor should generally not have to deal with uninitialized properties. Instead, make sure the constructor leaves your object fully initialized, in this case likely by setting it to null. If you need some additional "undefined" state, I think that's better modeled in other ways, maybe by wrapping it in an object. ADTs [1] should help with that in the future. enum Age { case Unknown; case Unborn; case Years(int $years); } public Age $age; Or: enum Age { case Unborn; case Years(int $years); } public ?Age $age; Silly example, but you get the point. Ilija [1] https://wiki.php.net/rfc/tagged_unions
Re: [PHP-DEV] [RFC] [Vote] Type Guards for Classes
Hi Patrik On Thu, May 16, 2024 at 10:31 PM Patrik Václavek wrote: > > Introduce a new type guard syntax for classes: > > ```php > (Foo) $variable; > ``` > > This syntax will internally perform the following operations: > > 1. Check if `$variable` is an instance of `Foo`. > 2. If the check fails, throw a `TypeError` with a message indicating the > expected and actual types. Note that this feature is covered under the pattern matching RFC with the "Throwing alternative" extension. https://wiki.php.net/rfc/pattern-matching#throwing_alternative In addition to just class types, it would support all other patterns, including scalar types, union types, array shapes, etc. Ilija
[PHP-DEV] Introduce ReflectionConstant
Hi everyone We recently received a feature request to allow reflection of global constants, primarily to check whether they are deprecated. https://github.com/php/php-src/issues/13570 I created a simple PR to introduce a minimal `ReflectionConstant` class. https://github.com/php/php-src/pull/13669 If there are no objections, I'd like to merge this PR in a few days. Ilija
Re: [PHP-DEV] [RFC][Vote announcement] Property hooks
Hi Matthew We're going to skip over the reiterations of Juliettes arguments, as they have already been responded to. On Thu, Apr 11, 2024 at 12:08 AM Matthew Weier O'Phinney wrote: > > On Mon, Apr 8, 2024 at 4:41 PM Ilija Tovilo wrote: >> >> https://externals.io/message/122445#122667 >> > > 2. I'm not a huge fan of the short syntax, but the improvements in the most > recent draft are _mostly_ ones I can live with. The part that's still unclear > is when and where hooks need a `;` termination, and _why_ the `;` is used, > instead of `,`. When using `match()` expressions, you use `,` to separate the > expressions, but for some reason, the proposal uses `;` ... but only when > using short expressions. And if you have a full-form mixed with a short form, > the `;` is only needed for the short-form expression. This feels arbitrary, > and it will be easy to get it wrong for people comfortable with `match()` > statements. I agree that the distinction of `,` and `;` isn't clear-cut. I would categorize hooks as declarations, because they are really just functions attached to the property. Declarations are `;`-terminated, or have a body (`{}`) (properties, (non-)abstract methods, class consts, etc). `,`, on the other hand, is used for lists of expressions and other things (array elements, match cases, function arguments). The other argument for the syntax we chose is that it's already used in C#. > (Larry tells me that it's `match()` being weird here, but considering that > for many developers, their only point of reference for this sort of syntax IS > `match()`, making it feel like the language is ignoring its own syntax when > creating new syntax.) Given my description from above, I don't believe this is true either. Match arms most closely relate to the array element syntax, which already uses `,`. > 3. While I'd likely prefer Marco's approach to references (just don't allow > them), the fact that they mirror how `__get()` and `__set()` _currently_ work > gives a migration path for users who are familiar with that paradigm's > gotchas. In other words, it's consistent with the current language, and will > make migrating from `__set/__get` to hooks easier. It's a lot of complexity, > but the table you created helps with that. That table MUST make it to the > docs for the feature! Precisely. Our decision to support references (as much as possible) comes from the desire to maintain compatibility (as much as possible), not only with `__get`/`__set` but also plain props. Ilija
Re: [PHP-DEV] [RFC][Vote announcement] Property hooks
Hi Juliette On 9-4-2024 16:03, Juliette Reinders Folmer wrote: > On 8-4-2024 23:39, Ilija Tovilo wrote: >> >> https://wiki.php.net/rfc/property-hooks >> > > I realize it is late in the discussion period to speak up, but for months > I've been trying to find the words to express my concerns in a polite and > constructive way and have failed. First of all, thank you for taking the time to analyze the RFC and form your thoughts. I know how much time and effort it takes. Nonetheless, I hope you understand that receiving feedback of this sort literally minutes before a vote is planned is close to the worst way to go about it, both from a practical standpoint, but also RFC author and reviewer morale. Many of the points you raise have been there for many months to a year, and many have been raised and discussed many times over. I will try to address each, and give people some time to respond. > And not just one syntax, but five and the differences in the semantics of the > function syntaxes are significant. I think this is somewhat of a misrepresentation. For reference, you can find the grammar here. It's only about <50 lines. https://github.com/php/php-src/blob/bf390b47c7522fd4d130be26b8ee97520f985275/Zend/zend_language_parser.y#L1088-L1131 Notably, `get` and `set` hooks don't actually have a different grammar. Instead, hooks have names, optional parameter lists, and an optional short (`=>`) or long (`{}`) body. Whether all possible combinations are considered different syntax is open to interpretation. It's true that additional rules apply to each hook, e.g. whether they are allowed to declare a parameter list, and how the return value is used. I could see an argument being made for allowing an empty parameter list (`()`) for `get`, for consistency. Let us know if you have any other ideas to make them more congruent. In any case, I don't believe PHP_CodeSniffer would need to handle each combination separately. What would your optimal solution look like? I feel this discussion cannot progress unless we have more concrete discussion points. > * The implicit "set" parameter, which does not have a explicit parameter, > does not have parentheses and has the magically created $value variable. To be a bit more exact, the parameter list is simply inferred to `( $value)` so that the user does not need to spell it out. This is not akin to other magic variables we had previously, like `$http_response_header`. > TL;DR: this RFC tries to do too much in one go and introduces a huge amount > of cognitive complexity with all the exceptions and the differences in > behaviour between virtual and backed properties. This cognitive complexity is > so high that I expect that the feature will catch most developers out a lot > of the time. Many people still say that this RFC doesn't do enough because it doesn't support x/y/z. This is including you: > * The arrow function variant for `set` with all the above differences + the > differences inherent to arrow functions with the above mentioned exceptions + > the implicit assignment, which breaks the expected behaviour of arrow > functions by assigning the result of the expression instead of returning it > (totally understandable, but still a difference). > * Properties _may_ or _may not_ have a default value anymore, depending on > whether the hooks cause it to be a backed or a virtual property. > * Properties with hooks can _be_ readonly, but cannot be declared as readonly. > * Only available for object properties, static properties are excluded. > * Readonly properties which are not denoted as readonly, but still are due to > their virtual nature (get without access to $this->prop), which can only be > figured out by, again, studying the contents of the hook function. There are more below that I will be commenting on in more detail. I understand that, in a perfect world, each language feature would automatically work with every other. In reality, this is either not possible, or complex and time consuming. We opted for the middle ground, allowing things that were workable and omitting things that weren't, or required unrelated changes. If this is not acceptable, then I genuinely don't know how to approach non-trivial RFCs. > * Properties can now be declared as abstract, but only with explicit hook > requirements, otherwise the abstract keyword is not allowed. > * Properties can now be declared on interfaces, but only with explicit hook > requirements, otherwise they are not allowed. The semantics of plain interface properties are not clear. Intuitively, `public $prop;` looks equivalent to `public $prop { get; set; }`, but due to reference semantics it is not. It's not even equivalent to `{ &get; set; }`, because by-reference hooked properties still have some inherent limita
Re: [PHP-DEV] [RFC][Vote announcement] Property hooks
Hi Robert On Tue, Apr 9, 2024 at 9:34 PM Robert Landers wrote: > > On Tue, Apr 9, 2024 at 8:56 PM Larry Garfield wrote: > > > > The Aviz RFC was put to a vote last year but didn't pass. > > It would be really nice if votes weren't just a yes/no vote, but > yes/needs-more-work/no vote, where needs-more-work and no are > effectively the same in terms of passing the RFC, but needs-more-work > just means there is more to do (either addressing ugly syntax or the > idea is sound, but as it says, it needs more work), and can thus be > simply revoted on after concerns are addressed -- instead of creating > a whole new RFC that needs to pass the "not too similar to other RFCs > rule." The asymmetric visibility RFC did include a poll for no votes. https://wiki.php.net/rfc/asymmetric-visibility#proposed_voting_choices > I got the impression from the Aviz discussions that most people were > against Aviz due to the syntax, not the feature itself. It would be > absolutely tragic if this failed to pass simply because people > expected Aviz here. According to the poll, syntax was one, but not the primary reason for its rejection. The primary reason was that some people don't believe the feature is necessary at all. IIRC, people were arguing that readonly covers 80% of use-cases, because it protects against writes to the property both publicly and privately. I don't agree with this viewpoint, because I think readonly is bad for ergonomics. In fact, we already had an RFC that attempted to fix clone for readonly (https://wiki.php.net/rfc/readonly_amendments) but this fix was not complete (because it's still not possible to pass values from clone to __clone). "Clone with" is another thing needed to fix this, and at this point it just feels like applying more band-aids. For DTOs, I believe value types (i.e. data classes, https://externals.io/message/122845) solve the problem of "spooky actions at a distance" in a cleaner and more ergonomic way. For services and other intentional reference types, readonly often isn't the right choice either, just to make the property not publicly writable. Asymmetric visibility would be a much more fitting choice. Anyway, we didn't include asymmetric visibility in this RFC because: * We wanted to avoid getting rejected by people who fundamentally dislike asymmetric visibility. * We didn't feel it was fair to "sneak" the feature back in through some other RFC, when it was explicitly rejected. Instead, we are planning to re-propose asymmetric visibility once property hooks are merged, as it may become more apparent why it is useful. Ilija
[PHP-DEV] [RFC][Vote announcement] Property hooks
Hi everyone Heads-up: Larry and I would like to start the vote of the property hooks RFC tomorrow: https://wiki.php.net/rfc/property-hooks We have worked long and hard on this RFC, and hope that we have found some middle-ground that works for the majority. One last concern we have not officially clarified on the list: https://externals.io/message/122445#122667 >> I personally do not feel strongly about whether asymmetric types make it >> into the initial implementation. Larry does, however, and I think it is not >> fair to exclude them without providing any concrete reasons not to. [snip] > > My concern is more about the external impact of what is effectively a change > to the type system of the language: [snip] will tools like PhpStan and Psalm > require complex changes to analyse code using such properties? In particular, this paragraph is referencing the ability to widen the accepted $value parameter type of the set hook, described at the bottom of https://wiki.php.net/rfc/property-hooks#set. I have talked to Ondřej Mirtes, the maintainer of PHPStan, and he confirmed that this should not be complex to implement in PHPStan. In fact, PHPStan already offers the @property-read and @property-write class annotations, which can be used to describe "virtual" properties handled within __get/__set, already providing asymmetric types of sorts. Hence, this concern should be a non-issue. Thank you to everybody who has contributed to the discussion! Ilija
Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)
Hi Rowan On Fri, Apr 5, 2024 at 12:28 AM Rowan Tommins [IMSoP] wrote: > > On 03/04/2024 00:01, Ilija Tovilo wrote: > > Regardless of the implementation, there are a lot of interactions we will > want to consider; and we will have to keep considering new ones as we add to > the language. For instance, the Property Hooks RFC would probably have needed > a section on "Interaction with Data Classes". That remark was implying that data classes really are just classes with some additional tweaks. That gives us the ability to handle them differently when desired. However, they will otherwise behave just like classes, which makes it not so different from your suggestion. > On a practical note, a few things I've already thought of to consider: > > - Can a data class have readonly properties (or be marked "readonly data > class")? If so, how will they behave? Yes. The CoW semantics become irrelevant, given that nothing may trigger a separation. However, data classes also include value equality, and hashing in the future. These may still be useful for immutable data. > - Can you explicitly use the "clone" keyword with an instance of a data > class? Does it make any difference? Manual cloning is not useful, but it's also not harmful. So I'm leaning towards allowing this. This way, data classes may be handled generically, along with other non-data classes. > - Tied into that: can you implement __clone(), and when will it be called? Yes. `__clone` will be called when the object is separated, as you would expect. > - If you implement __set(), will copy-on-write be triggered before it's > called? Yes. Separation happens as part of the property fetching, rather than the assignment itself. Hence, for `$foo->bar->baz = 'baz';`, once `Bar::__set('baz', 'baz')` is called, `$foo` and `$foo->bar` will already have been separated. > - Can you implement __destruct()? Will it ever be called? Yes. As with any other object, this will be called once the last reference to the object goes away. There's nothing special going on. It's worth noting that CoW makes `__clone` and `__destruct` somewhat nondeterministic, or at least non-obvious. > > Consider this example, which would > work with the current approach: > > > > $shapes[0]->position->zero!(); > > I find this concise example confusing, and I think there's a few things to > unpack here... I think you're putting too much focus on CoW. CoW should really be considered an implementation detail. It's not _fully_ transparent, given that it is observable through `__clone` and `__destruct` as mentioned above. But it is _mostly_ transparent. Conceptually, the copy happens not when the method is called, but when the variable is assigned. For your example: ```php $shape = new Shape(new Position(42,42)); $copy = $shape; // Conceptually, a recursive copy happens here. $copy->position->zero!(); // $shape is already detached from $copy. The ! merely indicates that the value is modified. ``` > The array access doesn't need any special marker, because there's no > ambiguity. This is only true if you ignore ArrayAccess. `$foo['bar']` does not necessarily indicate that `$foo` is an array. If it were a `Vector`, then we would absolutely need an indication to separate it. It's true that `$foo->bar` currently indicates that `$foo` is a reference type. This assumption would break with this RFC, but that's also kind of the whole point. > What is going to be CoW cloned, and what is going to be modified in place? I > can't actually know without knowing the definition behind both $item and > $item->shape. It might even vary depending on input. For the most part, data classes should consist of other value types, or immutable reference types (e.g. DateTimeImmutable). This actually makes the rules quite simple: If you assign a value type, the entire data structure is copied recursively. The fact that PHP delays this step for performance is unimportant. The fact that immutable reference types aren't cloned is also unimportant, given that they don't change. Ilija
Re: [PHP-DEV] Proposal: retrieve line, filename and if user defined for ReflectionAttribute
Hi Joel On Fri, Apr 5, 2024 at 3:10 PM Joel Wurtz wrote: > > Like a lot of libraries, we offer the possibility to configure behaviors with > Attributes. However in some cases it's wrongly configured by the user and > this wrong configuration cannot be detected on the attribute constructor but > afterwards. > > In this case we may want to pinpoint which attribute (in which file and at > which line) cause this bad configuration. Since there was no method to > retrieve those information in the ReflectionAttribute I proposed a PR > https://github.com/php/php-src/pull/13889 to add those informations. > > I do believe this will allow better DX for end user when correctly used, I would propose negating and renaming isUserDefined() to isInternal(), since we already have several such methods. As hinted in the PR, I believe the implementation is wrong, or rather doesn't do what you would expect it to do. I'm also wondering whether it may be more useful to expose the attribute class as getClass(), without instantiating the attribute itself. This would allow you to check for isInternal() there, along with all the other class reflection information. Ilija
Re: [PHP-DEV] RFC idea: using the void type to control maximum arity of user-defined functions
On Thu, Apr 4, 2024 at 5:58 PM Tim Düsterhus wrote: > > On 4/4/24 16:36, Pablo Rauzy wrote: > > I strongly agree in theory, but this could break existing code, and > > moreover such a proposal was already rejected: > > https://wiki.php.net/rfc/strict_argcount > > The RFC is 9 years old by now. My gut feeling is be that using an actual > variadic parameter for functions that are variadic is what people do, > because it makes the function signature much clearer. Actually variadic > parameters are available since PHP 5.6, which at the time of the > previous RFC was the newest version. Since then we had two major > releases, one of which (7.x) is already out of support. > > I think it would be reasonable to consider deprecating passing extra > arguments to a non-variadic function. IIRC one of the bigger downsides of this change are closure calls that may provide arguments that the callee does not care about. https://3v4l.org/0QdoS ``` function filter($array, callable $c) { $result = []; foreach ($array as $key => $value) { if ($c($value, $key)) { $result[$key] = $value; } } return $result; } var_dump(filter(['foo', '', 'bar'], function ($value) { return strlen($value); })); // Internal functions already throw on superfluous args var_dump(filter(['foo', '', 'bar'], 'strlen')); ``` The user may currently choose to omit the $key parameter of the closure, as it is never used. In the future, this would throw. We may decide to create an exemption for such calls, but I'm not sure replacing one inconsistency with another is a good choice. Ilija
Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)
Hi Larry On Wed, Apr 3, 2024 at 12:03 AM Larry Garfield wrote: > > On Tue, Apr 2, 2024, at 6:04 PM, Ilija Tovilo wrote: > > > I think you misunderstood. The intention is to mark both call-site and > > declaration. Call-site is marked with ->method!(), while declaration > > is marked with "public mutating function". Call-site is required to > > avoid the engine complexity, as previously mentioned. But > > declaration-site is required so that the user (and IDEs) even know > > that you need to use the special syntax at the call-site. > > Ah, OK. That's... unfortunate, but I defer to you on the implementation > complexity. As I've argued, I believe the different syntax is a positive. This way, data classes are known to stay unmodified unless: 1. You're explicitly modifying it yourself. 2. You're calling a mutating method, with its associated syntax. 3. You're creating a reference from the value, either explicitly or by passing it to a by-reference parameter. By-reference argument passing is the only way that mutations of data classes can be hidden (given that they look exactly like normal by-value arguments), and its arguably a flaw of by-reference passing itself. In all other cases, you can expect your value _not_ to unexpectedly change. For this reason, I consider it as an alternative approach to readonly classes. > > Disallowing ordinary by-ref objects is not trivial without additional > > performance penalties, and I don't see a good reason for it. Can you > > provide an example on when that would be problematic? > > There's two aspects to it, that I see. > > data class A { > public function __construct(public string $name) {} > } > > data class B { > public function __construct( > public A $a, > public PDO $conn, > ) {} > } > > $b = new B(new A(), $pdoConnection); > > function stuff(B $b2) { > $b2->a->name = 'Larry'; > // This triggers a CoW on $b2, separating it from $b, and also creating a > new instance of A. What about $conn? > // Does it get cloned? That would be bad. Does it not get cloned? That > seems weird that it's still the same on > // a data object. > > $b2->conn->beginTransaction(); > // This I would say is technically a modification, since the state of the > connection is changing. But then > // should this trigger $b2 cloning from $b1? Neither answer is obvious to > me. > } IMO, the answer is relatively straight-forward: PDO is a reference type. For all intents and purposes, when you're passing B to stuff(), B is copied. Since B::$conn is a "reference" (read pointer), copying B doesn't copy the connection, only the reference to it. B::$a, however, is a value type, so copying B also copies A. The fact that this isn't _exactly_ what happens under the hood due to CoW is an implementation detail, it doesn't need to change how you think about it. From the users standpoint, $b and $b2 can already separate values once stuff() is called. This is really no different from arrays: ```php $b = ['a' => ['name' => 'Larry'], 'conn' => $pdoConnection]; $b2 = $b; // $b is detached from $b2, $b['conn'] remains a shared object. ``` > The other aspect is, eg, serialization. People will come to expect > (reasonably) that a data class will have certain properties (in the abstract > sense, not lexical sense). For instance, most classes are serializable, but > a few are not. (Eg, if they have a reference to PDO or a file handle or > something unserializable.) Data classes seem like they should be safe to > serialize always, as they're "just data". If data classes are limited to > primitives and data classes internally, that means we can effectively > guarantee that they will be serializable, always. If one of the properties > could be a non-serializable object, that assumption breaks. I'm not sure that's a convincing argument to fully disallow reference types, especially since it would prevent you from storing DateTimeImmutables and other immutable values in data classes and thus break many valid use-cases. That would arguably be very limiting. > There's probably other similar examples besides serialization where "think of > this as data" and "think of this as logic" is how you'd want to think, which > leads to different assumptions, which we shouldn't stealthily break. I think your assumption here is that non-data classes cannot contain data. This doesn't hold, and especially will not until data classes become more common. Readonly classes can be considered strict versions of data classes in terms of mutability, minus some of the other semantic changes (e.g. identity). Ilija
Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)
Hi Rowan On Tue, Apr 2, 2024 at 10:10 PM Rowan Tommins [IMSoP] wrote: > > On 02/04/2024 01:17, Ilija Tovilo wrote: > > I'd like to introduce an idea I've played around with for a couple of > weeks: Data classes, sometimes called structs in other languages (e.g. > Swift and C#). > > I'm not sure if you've considered it already, but mutating methods should > probably be constrained to be void (or maybe "mutating" could occupy the > return type slot). Otherwise, someone is bound to write this: > > $start = new Location('Here'); > $end = $start->move!('There'); > > Expecting it to mean this: > > $start = new Location('Here'); > $end = $start; > $end->move!('There'); > > When it would actually mean this: > > $start = new Location('Here'); > $start->move!('There'); > $end = $start; I think there are some valid patterns for mutating methods with a return value. For example, Set::add() might return a bool to indicate whether the value was already present in the set. > I seem to remember when this was discussed before, the argument being made > that separating value objects completely means you have to spend time > deciding how they interact with every feature of the language. Data classes are classes with a single additional zend_class_entry.ce_flags flag. So unless customized, they behave as classes. This way, we have the option to tweak any behavior we would like, but we don't need to. Of course, this will still require an analysis of what behavior we might want to tweak. > Does the copy-on-write optimisation actually require the entire class to be > special, or could it be triggered by a mutating method on any object? To > allow direct modification of properties as well, we could move the call-site > marker slightly to a ->! operator: > > $foo->!mutate(); > $foo->!bar = 42; I suppose this is possible, but it puts the burden for figuring out what to separate onto the user. Consider this example, which would work with the current approach: $shapes[0]->position->zero!(); The left-hand-side of the mutating method call is fetched by "read+write". Essentially, this ensures that any array or data class is separated (copied if RC >1). Without such a class-wide marker, you'll need to remember to add the special syntax exactly where applicable. $shapes![0]!->position!->zero(); In this case, $shapes, $shapes[0], and $shapes[0]->position must all be separated. This seems very easy to mess up, especially since only zero() is actually known to be separating and can thus be verified at runtime. > The main drawback I can see (outside of the implementation, which I can't > comment on) is that we couldn't overload the === operator to use value > semantics. In exchange, a lot of decisions would simply be made for us: they > would just be objects, with all the same behaviour around inheritance, > serialization, and so on. Right, this would either require some other marker that switches to this mode of comparison, or operator overloading. Ilija
Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)
Hi Niels On Tue, Apr 2, 2024 at 8:16 PM Niels Dossche wrote: > > On 02/04/2024 02:17, Ilija Tovilo wrote: > > Hi everyone! > > > > I'd like to introduce an idea I've played around with for a couple of > > weeks: Data classes, sometimes called structs in other languages (e.g. > > Swift and C#). > > As already hinted in the thread, I also think inheritance may be dangerous in > a first version. > I want to add to that: if you extend a data-class with a non-data-class, the > data-class behaviour gets lost, which is logical in a sense but also > surprised me in a way. Yes, that's definitely not intended. I haven't implemented any inheritance checks yet. But if inheritance is allowed, then it should be restricted to classes of the same kind (by-ref or by-val). > Also, FWIW, I'm not sure about the name "data" class, perhaps "value" class > or something alike is what people may be more familiar with wrt semantics, > although dataclass is also a known term. I'm happy with value class, struct, record, data class, what have you. I'll accept whatever the majority prefers. > I do have a question about iterator behaviour. Consider this code: > ``` > data class Test { > public $a = 1; > public $b = 2; > } > > $test = new Test; > foreach ($test as $k => &$v) { > if ($k === "b") > $test->a = $test; > var_dump($k); > } > ``` > > This will reset the iterator of the object on separation, so we will get an > infinite loop. > Is this intended? > If so, is it because the right hand side is the original object while the > left hand side gets the clone? > Is this consistent with how arrays separate? That's a good question. I have not really thought about iterators yet. Modification of an array iterated by-reference does not restart the iterator. Actually, by-reference capturing of the value also captures the array by-reference, which is not completely intuitive. My initial gut feeling is to handle data classes the same, i.e. capture them by-reference when iterating the value by reference, so that iteration is not restarted. Ilija
Re: [PHP-DEV] Requiring GPG Commit Signing
On Tue, Apr 2, 2024 at 9:43 PM Rowan Tommins [IMSoP] wrote: > > Similarly, if you discover a compromised key or signing account, you can look > for uses of that key or account, which might be a tiny number from a non-core > contributor; if you discover a compromised account pushing unsigned commits, > you have to audit every commit in the repository. Right, that and what Jakub mentioned are fair arguments. > I agree it's not a complete solution, but no security measure is; it's always > about reducing the attack surface or limiting the damage. Right. That was the original intention of my e-mail: To point out that we might also want to consider other mitigations. Not that we shouldn't do commit signing. Ilija
Re: [PHP-DEV] Requiring GPG Commit Signing
Hi Rowan On Tue, Apr 2, 2024 at 8:48 PM Rowan Tommins [IMSoP] wrote: > > In fact, you don't need to compromise anybody's key: you could socially > engineer a situation where you have push access to the repository, or break > the security in some other way. As I understand it, this is exactly what > happened 3 years ago: someone gained direct write access to the git.php.net > server, and added commits "authored by" Nikita and others to the history in > the repository. Right, but I would like to believe that attaining push access _without gaining access to a maintainers account_ should be substantially harder on GitHub than our self-hosted git server. :) > If all commits are signed, a compromised key or account can only be used to > sign commits with that specific identity: your GitHub account can't be used > to sign commits as Derick or Nikita, only as you. The impact is limited to > one identity, not the integrity of the entire repository. But, does it matter? I'm not sure we look at some commits closer than others, based on its author. It's true that it might be easier to identify malicious commits if they all come from the same user, but it wouldn't prevent them. To be clear: I'm not against commit signing, I've been doing it for years. I'm just unsure if it's a sufficient solution (apart from releases, which are a whole different can of worms). Ilija
Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)
Hi Larry On Tue, Apr 2, 2024 at 5:31 PM Larry Garfield wrote: > > On Tue, Apr 2, 2024, at 12:17 AM, Ilija Tovilo wrote: > > Hi everyone! > > > > I'd like to introduce an idea I've played around with for a couple of > > weeks: Data classes, sometimes called structs in other languages (e.g. > > Swift and C#). > > > > * Data classes are ordinary classes, and as such may implement > > interfaces, methods and more. I have not decided whether they should > > support inheritance. > > What would be the reason not to? As you indicated in another reply, the main > reason some languages don't is to avoid large stack copies, but PHP doesn't > have large stack copies for objects anyway so that's a non-issue. > > I've long argued that the fewer differences there are between service classes > and data classes, the better, so I'm not sure what advantage this would have > other than "ugh, inheritance is such a mess" (which is true, but that ship > sailed long ago). One issue that just came to mind is object identity. For example: class Person { public function __construct( public string $firstname, public string $lastname, ) {} } class Manager extends Person { public function bossAround() {} } $person = new Person('Boss', 'Man'); $manager = new Manager('Boss', 'Man'); var_dump($person === $manager); // ??? Equality for data objects is based on data, rather than the object handle. How does this interact with inheritance? Technically, Person and Manager represent the same data. Manager contains additional behavior, but does that change identity? I'm not sure what the answer is. That's just the first thing that came to mind. I'm confident we'll discover more such edge cases. Of course, I can invest the time to find the questions before deciding to disallow inheritance. > > * Mutating method calls on data classes use a slightly different > > syntax: `$vector->append!(42)`. All methods mutating `$this` must be > > marked as `mutating`. The reason for this is twofold: 1. It signals to > > the caller that the value is modified. 2. It allows `$vector` to be > > cloned before knowing whether the method `append` is modifying, which > > hugely reduces implementation complexity in the engine. > > As discussed in R11, it would be very beneficial if this marker could be on > the method definition, not the method invocation. You indicated that would > be Hard(tm), but I think it's worth some effort to see if it's surmountably > hard. (Or at least less hard than just auto-detecting it, which you > indicated is Extremely Hard(tm).) I think you misunderstood. The intention is to mark both call-site and declaration. Call-site is marked with ->method!(), while declaration is marked with "public mutating function". Call-site is required to avoid the engine complexity, as previously mentioned. But declaration-site is required so that the user (and IDEs) even know that you need to use the special syntax at the call-site. > So to the extent there is a consensus, equality, stringifying, and a hashcode > (which we don't have yet, but will need in the future for some things I > suspect) seem to be the rough expected defaults. I'm just skeptical whether the default __toString() is ever useful. I can see an argument for it for quick debugging in languages that don't provide something like var_dump(). In PHP this seems much less useful. It's impossible to provide a default implementation that works everywhere (or pretty much anywhere, even). Equality is already included. Hashing should be added separately, and probably not just to data classes. > > * In the future, it should be possible to allow using data classes in > > `SplObjectStorage`. However, because hashing is complex, this will be > > postponed to a separate RFC. > > Would data class properties only be allowed to be other data classes, or > could they hold a non-data class? My knee jerk response is they should be > data classes all the way down; the only counter-argument I can think of it > would be how much existing code is out there that is a "data class" in all > but name. I still fear someone adding a DB connection object to a data class > and everything going to hell, though. :-) Disallowing ordinary by-ref objects is not trivial without additional performance penalties, and I don't see a good reason for it. Can you provide an example on when that would be problematic? Ilija
Re: [PHP-DEV] Requiring GPG Commit Signing
Hi Derick On Tue, Apr 2, 2024 at 4:15 PM Derick Rethans wrote: > > What do y'all think about requiring GPG signed commits for the php-src > repository? Let me repost my internal response for visibility. I'm currently struggling to understand what kind of attack signing commits prevents. If your GitHub account is compromised, GitHub allows the attacker to commit via web interface and will happily sign their commits with a gpg key auto-generated for your account. See: https://docs.github.com/en/authentication/managing-commit-signature-verification/about-commit-signature-verification > GitHub will automatically use GPG to sign commits you make using the web > interface. Commits signed by GitHub will have a verified status. You can > verify the signature locally using the public key available at > https://github.com/web-flow.gpg. Even if this wasn't the case, the attacker may simply register their own gpg key in your account, with the commits appearing as verified. If your ssh key is compromised instead, and you use ssh to sign your commits, the attacker may sign their malicious commits with that same key they may use to push. The only thing this really seems to prevent is pushing commits via a compromised ssh key, while commits need to be signed with gpg. If that's the intention, we should require using gpg rather than ssh for signing (or using a different ssh key, I suppose). Additionally, it may help for people who push via HTTP+auth token, but that's probably not advisable in the first place. Something that may also help is restricting pushes to patch branches (PHP-x.y.z) to release managers. These branches are not commonly looked at by the public, and so it may be easier to sneak malicious commits into them. In addition, we should keep GitHub privileges narrow, especially branch protection configuration. As mentioned by others, this does not prevent the xz issue. But paired with an auto-deployment solution, it could definitely help. It would be even better if release managers cannot change CI, and CI maintainers cannot create releases, as this essentially enforces the 4-eyes principle. The former may be hard to enforce, as CI lives in the same repository. Another solution might be to require PRs, and PR verifications. But this will inevitably create overhead for maintainers. Ilija
Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)
Hi Alexander On Tue, Apr 2, 2024 at 4:53 AM Alexander Pravdin wrote: > > On Tue, Apr 2, 2024 at 9:18 AM Ilija Tovilo wrote: > > > > I'd like to introduce an idea I've played around with for a couple of > > weeks: Data classes, sometimes called structs in other languages (e.g. > > Swift and C#). > > While I like the idea, I would like to suggest something else in > addition or as a separate feature. As an active user of readonly > classes with all promoted properties for data-holding purposes, I > would be happy to see the possibility of cloning them with passing > some properties to modify: > > readonly class Data { > function __construct( > public string $foo, > public string $bar, > public string $baz, > ) {} > } > > $data = new Data(foo: 'A', bar: 'B', baz: 'C'); > > $data2 = clone $data with (bar: 'X', baz: 'Y'); What you're asking for is part of the "Clone with" RFC: https://wiki.php.net/rfc/clone_with This issue is valid and the RFC would improve the ergonomics of readonly classes. However, note that it really only addresses a small part of what this RFC tries achieve: > Some APIs further exacerbate the issue by requiring multiple copies for multiple modifications (e.g. `$response->withStatus(200)->withHeader('X-foo', 'foo');`). Readonly works fine for compact data structures, even if it is copied more than it needs. For large data structures, like large lists, a copy for each modification would be detrimental. https://3v4l.org/GR6On See how the performance of an insert into an array tanks if a copy of the array is performed in each iteration (due to an additional reference to it). Readonly is just not viable for data structures such as lists, maps, sets, etc. Ilija
Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)
Hi Marco On Tue, Apr 2, 2024 at 2:56 AM Deleu wrote: > > > > On Mon, Apr 1, 2024 at 9:20 PM Ilija Tovilo wrote: >> >> I'd like to introduce an idea I've played around with for a couple of >> weeks: Data classes, sometimes called structs in other languages (e.g. >> Swift and C#). >> >> snip >> >> Some other things to note about data classes: >> >> * Data classes are ordinary classes, and as such may implement >> interfaces, methods and more. I have not decided whether they should >> support inheritance. > > I'd argue in favor of not including inheritance in the first version. Taking > inheritance out is an impossible BC Break. Not introducing it in the first > stable release gives users a chance to evaluate whether it's something we > will drastically miss. I would probably agree. I believe the reasoning some languages don't support inheritance for value types is because they are stored on the stack. Inheritance encourages large structures, but copying very large structures over and over on the stack may be slow. In PHP, objects always live on the heap, and due to CoW we don't have this problem. Still, it may be beneficial to disallow inheritance first, and relax this restriction if it is necessary. >> * Mutating method calls on data classes use a slightly different >> syntax: `$vector->append!(42)`. All methods mutating `$this` must be >> marked as `mutating`. The reason for this is twofold: 1. It signals to >> the caller that the value is modified. 2. It allows `$vector` to be >> cloned before knowing whether the method `append` is modifying, which >> hugely reduces implementation complexity in the engine. > > I'm not sure if I understood this one. Do you mean that the `!` modifier here > (at call-site) is helping the engine clone the variable before even diving > into whether `append()` has been tagged as mutating? Precisely. The issue comes from deeper nested values: $circle->position->zero(); Imagine that Circle is a data class with a Position, which is also a data class. Position::zero() is a mutating method that sets the coordinates to 0:0. For this to work, not only the position needs to be copied, but also $circle. However, the engine doesn't yet know ahead of time whether zero() is mutating, and as such needs to perform a copy. One idea was to evaluate the left-hand-side of the method call, and repeat it with a copy if the method is mutating. However, this is not trivially possible, because opcodes consume their operands. So, for an expression like `getCircle()->position->zero()`, the return value of `getCircle()` is already gone. `!` explicitly distinguishes the call from non-mutating calls, and knows that a copy will be needed. But as mentioned previously, I think a different syntax offers additional benefits for readability. > From outside it looks odd that a clone would happen ahead-of-time while > talking about copy-on-write. Would this syntax break for non-mutating methods? If by break you mean the engine would error, then yes. Only mutating methods may (and must) be called with the $foo->bar!() syntax. Ilija
[PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)
Hi everyone! I'd like to introduce an idea I've played around with for a couple of weeks: Data classes, sometimes called structs in other languages (e.g. Swift and C#). In a nutshell, data classes are classes with value semantics. Instances of data classes are implicitly copied when assigned to a variable, or when passed to a function. When the new instance is modified, the original instance remains untouched. This might sound familiar: It's exactly how arrays work in PHP. ```php $a = [1, 2, 3]; $b = $a; $b[] = 4; var_dump($a); // [1, 2, 3] var_dump($b); // [1, 2, 3, 4] ``` You may think that copying the array on each assignment is expensive, and you would be right. PHP uses a trick called copy-on-write, or CoW for short. `$a` and `$b` actually share the same array until `$b[] = 4;` modifies it. It's only at this point that the array is copied and replaced in `$b`, so that the modification doesn't affect `$a`. As long as a variable is the sole owner of a value, or none of the variables modify the value, no copy is needed. Data classes use the same mechanism. But why value semantics in the first place? There are two major flaws with by-reference semantics for data structures: 1. It's very easy to forget cloning data that is referenced somewhere else before modifying it. This will lead to "spooky actions at a distance". Having recently used JavaScript (where all data structures have by-reference semantics) for an educational IR optimizer, accidental mutations of shared arrays/maps/sets were my primary source of bugs. 2. Defensive cloning (to avoid issue 1) will lead to useless work when the value is not referenced anywhere else. PHP offers readonly properties and classes to address issue 1. However, they further promote issue 2 by making it impossible to modify values without cloning them first, even if we know they are not referenced anywhere else. Some APIs further exacerbate the issue by requiring multiple copies for multiple modifications (e.g. `$response->withStatus(200)->withHeader('X-foo', 'foo');`). As you may have noticed, arrays already solve both of these issues through CoW. Data classes allow implementing arbitrary data structures with the same value semantics in core, extensions or userland. For example, a `Vector` data class may look something like the following: ```php data class Vector { private $values; public function __construct(...$values) { $this->values = $values; } public mutating function append($value) { $this->values[] = $value; } } $a = new Vector(1, 2, 3); $b = $a; $b->append!(4); var_dump($a); // Vector(1, 2, 3) var_dump($b); // Vector(1, 2, 3, 4) ``` An internal Vector implementation might offer a faster and stricter alternative to arrays (e.g. Vector from php-ds). Some other things to note about data classes: * Data classes are ordinary classes, and as such may implement interfaces, methods and more. I have not decided whether they should support inheritance. * Mutating method calls on data classes use a slightly different syntax: `$vector->append!(42)`. All methods mutating `$this` must be marked as `mutating`. The reason for this is twofold: 1. It signals to the caller that the value is modified. 2. It allows `$vector` to be cloned before knowing whether the method `append` is modifying, which hugely reduces implementation complexity in the engine. * Data classes customize identity (`===`) comparison, in the same way arrays do. Two data objects are identical if all their properties are identical (including order for dynamic properties). * Sharing data classes by-reference is possible using references, as you would for arrays. * We may decide to auto-implement `__toString` for data classes, amongst other things. I am still undecided whether this is useful for PHP. * Data classes protect from interior mutability. More concretely, mutating nested data objects stored in a `readonly` property is not legal, whereas it would be if they were ordinary objects. * In the future, it should be possible to allow using data classes in `SplObjectStorage`. However, because hashing is complex, this will be postponed to a separate RFC. One known gotcha is that we cannot trivially enforce placement of `modfying` on methods without a performance hit. It is the responsibility of the user to correctly mark such methods. Here's a fully functional PoC, excluding JIT: https://github.com/php/php-src/pull/13800 Let me know what you think. I will start working on an RFC draft once work on property hooks concludes. Ilija
Re: [PHP-DEV] GitHub milestones
Hi Jakub On Wed, Mar 27, 2024 at 4:55 PM Jakub Zelenka wrote: > > We actually decided not to do it for 8.3 because it would be just waste of > time to set all PR's with that milestone. The thing is that PR should just > get merged when it's ready and we won't be delaying release because some PR's > in that milestone are not ready so it does not have any meaning. Thank you, I was unaware that this was a conscious decision. I agree that it's not particularly useful for the next minor version. If they are ready, nothing is blocking a merge to master. > I'm not really sure if there's any point to have non-draft PR's targeting > next major version because they cannot be merged to master until it is > decided the next version will be the major one. So those PR's should be draft > but it might make sense to create milestone for them to show quickly why they > are in draft. Draft PRs that target the next major version can make sense if they are part of an RFC. I generally believe that every RFC should have at least a proof-of-concept. In my experience, the implementation is the only reliable way to reveal conceptual issues. But then again, such RFCs should be in the RFC listing, as mentioned in my previous message. So I agree that there's not a big need to milestones. > So I don't think there is much point in adding 8.4 milestone but 9.0 might be > useful. That sounds reasonable to me, as it shouldn't cost much. Milestones don't need to be complete either, it can be added where it makes sense, for things that might be forgotten otherwise. Ilija
Re: [PHP-DEV] Request for RFC karma
Hi! On Wed, Mar 27, 2024 at 12:59 PM 하늘아부지 wrote: > > I request RFC karma to discuss the issue at this link. > https://github.com/php/php-src/issues/13813 > > Wiki account : daddyofsky Either somebody beat me to it, or you already had RFC privileges. Note that step 1 in the RFC process suggests introducing the RFC idea to the mailing list first. This way, you can get an initial feel whether your idea is desired by other developers, and thus whether it's worth your time. Good luck with the RFC! Ilija
Re: [PHP-DEV] GitHub milestones
Hi Peter On Wed, Mar 27, 2024 at 9:44 AM Peter Kokot wrote: > > I was wondering if it would be useful to add GitHub milestones for the > PHP-8.4 and PHP-9.0 (or PHP-next or something like this)? > https://github.com/php/php-src/milestones > > Because some pull requests might target versions after the PHP-8.4 and it > might be useful to have them additionally sorted to not forget about them. > Not to tag all PRs of course but only those which are meant to go into some > of the future branches. Milestones have already been used in the past (https://github.com/php/php-src/milestones?state=closed), so there's no reason not to do the same for 8.4 and 9.0. Most likely, we just forgot. It's probably not documented as part of the release process. RFCs that have actionable tasks for the next major version should be listed under "Pending Implementation / Landing" on the RFC page: https://wiki.php.net/rfc#pending_implementationlanding Adding them to a GitHub Milestone would make it a bit more explicit. As for PRs that don't have an RFC, adding them to a milestone definitely makes sense. However, PRs that may only be merged in the next major version, and _don't_ require an RFC are extremely rare. Ilija
Re: [PHP-DEV] Proposal: AS assertions
Hi Rowan On Tue, Mar 19, 2024 at 8:39 PM Rowan Tommins [IMSoP] wrote: > > As well pattern matching, which Ilija mentioned, another adjacent feature is > a richer set of casting operators. Currently, we can assert that something is > an int; or we can force it to be an int; but we can't easily say "make this > an int if safe, but throw otherwise" or "make this an int if safe, but > substitute null/$someValue otherwise". > > I've been considering how we can improve that for a while, but not settled on > a firm proposal - there's a lot of different versions we *could* support, so > choosing a minimal set is hard. I've thought about this in the context of pattern matching a while back. I was thinking about something like `$x is ~int`, where the pattern match is successful iff `$x` is coercible to `int` without loss of information. Given that patterns may be nested, `array<~int>` could check that all elements of an array are coercible to `int`. The same could work for literal patterns, e.g. `~5`, where `5`, `5.0` and `'5'` are all accepted. This can potentially be combined with the variable binding pattern, `$var @ pattern`. The syntax looks a bit confusing at first, but it basically makes sure that the matched value conforms to `pattern`, and then binds it to `$var`. Hence, something like `$foo as Foo { $bar @ ~int }` would 1. make sure `$foo` is an instance of `Foo`, 2. make sure `$foo->bar` is coercible to `int`, and then assigned the coerced value to `$bar`. (It gets more complicated, because the assignment must be delayed until the entire pattern matches.) If the pattern matching fails at any point, it throws. This is just an idea, neither the `as` operator nor the `~` pattern have been implemented. I don't know whether they are feasible. Anyway, we're probably going off-topic. :) Ilija
Re: [PHP-DEV] Proposal: AS assertions
Hi Marco On Tue, Mar 19, 2024 at 7:04 PM Marco Aurélio Deleu wrote: > > > On 19 Mar 2024, at 14:51, Ilija Tovilo wrote: > > > > Hi Robert > > > >> On Tue, Mar 19, 2024 at 5:24 PM Robert Landers > >> wrote: > >> > > See https://wiki.php.net/rfc/pattern-matching#throwing_alternative. I > > believe this idea would combine nicely with pattern matching. It has > > many more uses there than just simple class type matching, and could > > even be used for things like destructuring. > > That looks like a PHP dream. Has there been any work regarding that? https://github.com/iluuu1994/php-src/pull/102/files The implementation is mostly complete (it might slightly diverge from the current specification. Bob has called for a different implementation approach that might be more complex but potentially easier to optimize, I'll have to play around with it. There are also still some design decisions that we aren't completely sure about. For now, Larry and I are just trying to get property hooks over the finish line. Ilija
Re: [PHP-DEV] Proposal: AS assertions
Hi Robert On Tue, Mar 19, 2024 at 5:24 PM Robert Landers wrote: > > I've been thinking about this as an RFC for awhile, but with generics > being far off (if at all), I'd like to propose a useful idea: reusing > the AS keyword in a different context. > > Example: > > $x = $attributeReflection->newInstance() as MyAttribute; > > This would essentially perform the following code: > > assert(($x = $attributeReflection->newInstance()) instanceof MyAttribute); See https://wiki.php.net/rfc/pattern-matching#throwing_alternative. I believe this idea would combine nicely with pattern matching. It has many more uses there than just simple class type matching, and could even be used for things like destructuring. Ilija
Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2
Hi Rowan On Sun, Mar 17, 2024 at 3:41 PM Rowan Tommins [IMSoP] wrote: > > The remaining difference I can see in the current RFC which seems to be > unnecessary is that combining &get with set is only allowed on virtual > properties. Although it may be "virtual" in the strict sense, any &get > hook must actually be referring to some value stored somewhere - that > might be a backed property, another field on the current class, a > property of some other object, etc: > > public int $foo { &get => $this->foo; set { $this->foo = $value; } } > > public int $bar { &get => $this->_bar; set { $this->_bar = $value; } } > > public int $baz { &get => $this->delegatedObj->baz; set { > $this->delegatedObj->baz = $value; } } > > This sentence from the RFC applies equally to all three of these examples: > > > That is because any attempted modification of the value by reference > would bypass a |set| hook, if one is defined. > > I suggest that we either trust the user to understand that that will > happen, and allow combining &get and set on any property; or we do not > trust them, and forbid it on any property. I'm indeed afraid that people will blindly make their array properties by-reference, without understanding the implications. Allowing by-reference behavior for virtual read/write properties is a tradeoff, for cases where it may be necessary. Exposing private properties by-reference is already possible outside of hooks (https://3v4l.org/VNhf7), that's not something we can prevent for secondary backing properties. However, we can at least make sure that a reference to the baking value of a hooked property doesn't escape. I realize this is somewhat inconsistent, but I believe it is reasonable. If you want to expose the underlying property by-reference, you need to jump through some additional hoops. > > Apart from the things already mentioned, it's unclear to me whether, > > with such `set;` declarations, a `get`-only backed property should > > even be legal. With the complete absence of a write operation, the > > assignment within the `set` itself would fail. To make this work, the > > absence of `set;` would need to mean something like "writable, but > > only within another hook", which introduces yet another form of > > asymmetric visibility. > > Any write inside the get hook already by-passes the set hook and refers > to the underlying property, so there would be no need for any default > set behaviour other than throwing an error. > > It's not likely to be a common scenario, but the below works with the > current implementation https://3v4l.org/t7qhR/rfc#vrfc.property-hooks > > class Example { > public int $nextNumber { > get { > $this->nextNumber ??= 0; > return $this->nextNumber++; > } > // Mimic the current behaviour of a virtual property: > https://3v4l.org/cAfAI/rfc#vrfc.property-hooks > set => throw new Error('Property Example::$nextNumber is > read-only'); > } > } Again, it depends on how you think about it. As you have argued, for a get-only property, the backing value should not be writable without an explicit `set;` declaration. You can interpret `set;` as an auto-generated hook, or as a marker that indicates that the backing value is accessible without a hook. As mentioned in my previous e-mail, auto-generated hooks is something we'd really like to avoid. So, if the absence of `set;` means that the backing value is not writable, the hook itself must be exempt from this rule. Another thing to consider: The current implementation inherits the backing value and all hooks from its parent. If the suggestion is to add an explicit `set;` declaration to make it more obvious that the property is writable, how does this help overridden properties? ```php class P { public $prop { get => strtolower($this->prop); set; } } class C extends P { public $prop { get => strtoupper(parent::$prop::get()); } } ``` Even though `P::$prop` signals that it is writable, there is no such indication in `C::$prop`. You may suggest to also add `set;` to the child, but then what if the parent adds a custom implementation for `set;`? ```php class P { public $prop { get => strtolower($this->prop); set { echo $value, "\n"; $this->prop = $value; } } } class C extends P { public $prop { get => strtoupper(parent::$prop::get()); set; } } ``` The meaning for `set;` is no longer clear. Does it mean that there's a generated hook that accesses the backing field? Does it mean that the backing field is accessible without a hook? Or does it mean that it accesses the parent hook? The truth is, with inheritance there's no way to look at the property declaration and fully understand what's going on, unless all hooks must be spelled out for the sake of clarity (e.g. `get => parent::$prop::get()`). > We are already allowing more than Kotlin by lett
Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2
Hi Rowan On Sat, Mar 16, 2024 at 8:23 PM Rowan Tommins [IMSoP] wrote: > > I still think there will be a lot of users coming from other languages, or > from using __get and __set, who will look at virtual properties first. Making > things less surprising for those people seems worth some effort, but I'm not > asking for a complete redesign. For clarity, you are asking for a way to make the "virtualness" of properties more explicit, correct? We touch on a keyword and why we think it's suboptimal in the FAQ section. Unfortunately, I cannot think of many alternatives. The `$field` variable made it a bit more obvious, but only marginally. I do believe that, for the most part, the user should not have to think much about whether the property is backed or virtual. The behavioral differences are mostly intuitive. For example: ```php class Test { // This property has a set hook that writes to the backing value. Since // we're using the backing value, it makes sense for there to be a way to // retrieve it. Without that, it wouldn't be useful. public $prop { set { $this->prop = strtoupper($value); } } // Similarly, a property with only a get hook that accesses the backing // value would need a way to write to the property for the get to be useful. public $prop { get => strtoupper($this->prop); } // A property with a get hook that does not use the backing value does not // need an implicit set operation, as writing to the backing value would be // useless, given that nobody will read it. public $prop { get => 42; } // Similarly, in the esoteric write-only case that does not use the backing // value, having an implicit get operation would always lead to a // "uninitialized property" error, and is not useful as such. public $prop { set { echo "Prop set\n"; } } } ``` Furthermore, `serialize`, `var_dump` and the other functions operating on raw property values will include the property only if it is backed. This also seems intuitive to me: If you never use the backing value, the backing value would always be uninitialized, so there's no reason to include it. One case that is not completely obvious is lazy-initialized properties. ```php class Test { public $prop { get => $this->prop ??= expensiveOperation(); } } ``` It's not immediately obvious that there is a public set operation here. The correct way to fix this would be with asymmetric visibility, which was previously declined. Either way, I don't consider this case alone enough to completely switch our approach. Please let me know if you are aware of any other potentially non-intuitive cases. I will admit that it is unfortunate that a user of the property has to look through the hook implementation to understand whether a property is writable. As you have previously suggested, one option might be to add an explicit `set;` declaration. Maybe it's a bit more obvious now, after my previous e-mail, why we are trying to avoid this. Apart from the things already mentioned, it's unclear to me whether, with such `set;` declarations, a `get`-only backed property should even be legal. With the complete absence of a write operation, the assignment within the `set` itself would fail. To make this work, the absence of `set;` would need to mean something like "writable, but only within another hook", which introduces yet another form of asymmetric visibility. > > Dynamic properties are not particularly relevant today. The point was > > not to show how similar these two cases are, but to explain that > > there's an existing mechanism in place that works very well for hooks. > > We may invent some new mechanism to access the backing value, like > > `field = 'value'`, but for what reason? This would only make sense if > > the syntax we use is useful for something else. However, given that > > without guards it just leads to recursion, which I really can't see > > any use for, I don't see the point. > > I can think of several reasons we *could* explore other syntax: > > 1) To make it clearer in code whether a particular line is accessing via the > hooks, or by-passing them 2) To make the code in the hooks shorter (e.g. > `$field` is significantly shorter than `$this->someDescriptiveName`) 3) To > allow code to by-pass the hooks at will, rather than only when called from > the hooks (e.g. having a single method that resets the state of several > lazy-loaded properties) > > Those reasons are probably not enough to rule out the current syntax; but > they show there are trade-offs being made. Fair enough. 1 and 2 are reasons why we added the `$field` macro as an alternative syntax in the original draft. I don't quite understand point 3. In Kotlin, `field` is only usable within its associated hook. Other languages I'm aware of do not provide a way to access the backing value directly, neither inside nor ou
Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2
Hi Rowan On Sat, Mar 16, 2024 at 9:32 AM Rowan Tommins [IMSoP] wrote: > > On 16 March 2024 00:19:57 GMT, Larry Garfield wrote: > > >Well, reading/writing from within a set/get hook is an obvious use case to > >support. We cannot do cached properties easily otherwise: > > > >public string $expensive { > > get => $this->expensive ??= $this->compute(); > > set { > >if (strlen($value) < 50) throw new Exception(); > >$this->expensive = $value; > > } > >} > > > To play devil's advocate, in an implementation with only virtual properties, > this is still perfectly possible, just one declaration longer: > > private string $_expensive; > public string $expensive { > get => $this->_expensive ??= $this->compute(); > set { > if (strlen($value) < 50) throw new Exception(); > $this->_expensive = $value; > } > } > > Note that in this version there is an unambiguous way to refer to the raw > value from anywhere else in the class, if you wanted a clearAll() method for > instance. > > I can't stress enough that this is where a lot of my thinking comes from: > that backed properties are really the special case, not the default. Anything > you can do with a backed property you can do with a virtual one, but the > opposite will never be true. > > > The minimum version of backed properties is basically just sugar for that - > the property is still essentially virtual, but the language declares the > backing property for you, leading to: > > public string $expensive { > get => $field ??= $this->compute(); > set { > if (strlen($value) < 50) throw new Exception(); > $field = $value; > } > } > > I realise now that this isn't actually how the current implementation works, > but again I wanted to illustrate where I'm coming from: that backed > properties are just a convenience, not a different type of property with its > own rules. That's not really how we think about it. Our design decisions have been guided by a few factors: 1. The RFC intentionally makes plain properties and properties with hooks as fully compatible as possible. A subclass can override a plain property by adding hooks to it. Many other languages only allow doing so if the parent property already has generated accessors (`{ get; set; }`). For many of them, switching from a plain property to one with accessors is actually an ABI break. One requires generating assembly/IR instructions that access a field in some structure, the other one is a method call. This is not relevant in our case. In most languages, a consequence of `{ get; set; }` is that such properties cannot be passed by reference. This part _is_ relevant to PHP, because PHP makes heavy use of explicit by-reference passing for arrays, but not much else. However, as outlined in the RFC, arrays are not a good use-case for hooks to begin with. So instead of fragmenting the entirety of all PHP code bases into plain and `{ get; set; }` properties where it doesn't actually make a semantic difference, and then not even using them when it would matter (arrays), we have decided to avoid generated hooks altogether. The approach of making plain and hooked properties compatible also immediately means that a property can have both a "backing value" (inherited from the parent property) and hooks (from the child property). This goes against your model that backed properties are really just two properties, one for the backing value and a virtual one for the hooks. Our approach has the nice side effect of properties only containing hooks when they actually do something. We don't need to deal with optimizations like "the hook is auto-generated, revert to accessing the property directly to make it faster", or even just having the generated hook taking up unnecessary memory. You can think of our properties this way: ```php class Property { public ?Data $storage; public ?callable $getHook; public ?callable $setHook; public function get() { if ($hook = $this->getHook) { return $hook(); } else if ($storage) { return $storage->get(); } else { throw new Error('Property is write-only'); } } public function set($value) { if ($hook = $this->setHook) { $hook($value); } else if ($storage) { $storage->set($value); } else { throw new Error('Property is read-only'); } } } ``` Properties can inherit both storage and hooks from their parent. Hopefully, that helps with the mental model. Of course, in reality it is a bit more complicated due to guards and references. 2. Although you say backed properties are just syntactic, they really are not. For example, renaming a public property, making it private and replacing it with a new passthrough virtual property breaks serialization, as serialization works on the object's raw values. On the other hand, adding a hook to an existing property doesn't influence its backing va
Re: [PHP-DEV] automatic formatting checks for pull requests?
On Sun, Feb 18, 2024 at 4:11 PM Gina P. Banyard wrote: > > On Saturday, 17 February 2024 at 22:18, Ilija Tovilo > wrote: > > > * The new code style should be applied only to newly added sections or > > changed code, not entire files. Otherwise, we'll have many changes in > > large files, with endless merge conflicts when merging up from lower > > branches. > > Surely the best way is to apply the formatting tool on all branches that are > supported (even in security support). > Have them be merged upwards, and then add the revisions of the commits to a > .git-blame-ignore-revs file so that git blame doesn't care about them. > > This should resolve the issue of making future merges difficult. Presumably, this would lead to merge conflicts in every open pull request. Maybe a resolution strategy could be automated, not ideal nonetheless. Additionally, given that the PR has discovered a clang-format bug that changes behavior (https://github.com/php/php-src/pull/13417#issuecomment-1950920114), I'd be wary of applying the formatting blindly to our stable branches.
Re: [PHP-DEV] automatic formatting checks for pull requests?
Hi Hans On Sat, Feb 17, 2024 at 3:31 PM Gina P. Banyard wrote: > > On Saturday, 17 February 2024 at 11:24, Hans Henrik Bergan > wrote: > > > Can we add automatic formatting checks for pull requests? > > Made a PR: https://github.com/php/php-src/pull/13417 > > It would be nice to have some formatting rules to harmonize the codebase as > it is somewhat the wild west, > but as far as my understanding goes is that Clang format struggles to > understand our codebase (namely macros) and is difficult to set-up for > php-src. Right. Consistent code style is nice, but what we have now is really not that bad. There are a couple things I'd want if we enforce code style: * Fixing the style should be easy, running a single command without first pushing to CI. * It should be fast too, so that I can easily run it for every commit, preferably even on-save in my editor. * The new code style should be applied only to newly added sections or changed code, not entire files. Otherwise, we'll have many changes in large files, with endless merge conflicts when merging up from lower branches. * The formatting tool should work for all php-src code, not just plain C code. We don't want to be forced to refactor old macros just because we need to add a single line to some long-standing code. Last time I tried clang-format, it utterly failed with our macros. I haven't looked at your PR in detail, so I'm not sure which of these points it satisfies. It would be great if you could quickly describe how it works, and what the goals are. Essentially, I'm just sceptical that this isn't more trouble than it's worth. Ilija
Re: [PHP-DEV] RE: Testing new list server
Hi On Sat, Feb 17, 2024 at 12:17 AM Jorg Sowa wrote: > > Hello Derick, > there is something wrong. I don't get all of the emails from the new setup, > only part. Examples of emails I didn't receive: > - https://externals.io/message/122391 > - https://externals.io/message/122390 > - https://externals.io/message/122388 > > I'm using Gmail and Spam doesn't contain any of them. Same here. I'm using Gmail and have not received various e-mails. Ilija
Re: [PHP-DEV] Requesting RFC karma
Hi Hans On 16.02.24 13:05, Hans Henrik Bergan wrote: My name is "Hans Henrik Bergan", usually go by the nickname "divinity76", I've contributed to OSS (including PHP) for years, and am currently involved in 3 things that might require an RFC, and requesting RFC karma for wiki account "divinity76". RFC karma was granted, good luck! Ilija
Re: [PHP-DEV] Registration apply for php wiki
Hi JaeHan! On Fri, Feb 16, 2024 at 10:53 AM 하늘아부지 wrote: > > Hi, My name is JaeHan Seo. (wiki username is daddyofsky) > > I hope I can make a proposal by registering on the php wiki. I can give you karma. Usually, RFC karma (which is what I'm guessing you're asking for) is granted for specific RFC ideas. Could you quickly summarize what your idea is? See step 1 in the RFC howto: https://wiki.php.net/rfc/howto Thank you! Ilija
Re: [PHP-DEV] [Discussion] Thoughts on casting to null
Hi Robert On Wed, Feb 14, 2024 at 1:29 AM Robert Landers wrote: > > I won't be the first to say this, at first glance, casting to null > sounds silly, but short arrow functions must always return something, > by design. That's when casting to null makes any sense at all (that I > can think of): you want to write a succinct, short function but > guarantee the result is discarded. Instead, if you really must use a > short array function, you have to do something even weirder: > > EventLoop::repeat($pingInterval, fn() => $client->ping() ? null : null); While (void) $client->ping() would solve your problem, it's not very useful outside this scenario. I think there are two issues you're implicitly referring to. 1. Arrow functions cannot be void, because they always return something. 2. Arrow functions cannot contain multiple statements. As for the former, PHP actually had a similar issue for never closures that was solved a while ago [1]. In the report I suggested that the same could be done for void, by making void arrow functions evaluate and drop the right hand side of =>, and always return nothing (i.e. null). This would solve your issue, although probably mostly by accident (because void functions return null, and your caller expects exactly null). Regardless, I think this change would be useful, if just to signal that the return value of an arrow function is not intended to be used. The latter would require some sort of block or grouped expression. Short closures have been discussed extensively in the past, so I won't get into that. There's also the comma operator in some languages like C and JavaScript that evaluates a list of expressions and returns the result of the last one, although probably not universally liked (i.e. fn () => ($client->ping(), null)). [1] https://github.com/php/php-src/commit/f957e3e7f17eac6e9195557f3fa79934f823fd38 Ilija -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] php-src docs
Hi Yuya It seems you accidentally sent your response to me instead of the list. On Sun, Feb 11, 2024 at 5:10 PM youkidearitai wrote: > > 2024年2月11日(日) 21:18 Ilija Tovilo : > > > > Hi everyone. > > > > I would like to start an initiative to centralize documentation of the > > PHP internals. > > https://github.com/php/php-src/pull/13338 > > https://iluuu1994.github.io/php-src/ (will be moved to php.github.io > > once merged) > > > > Let me know of any thoughts and suggestions you might have. > > Hi, Ilija. > Thank you for your great suggestion. > > It seems make sense to have a set of documents about the structure of > php-src in php-src. > Easily create pull requests to them. > > Although I have to learn reStructuredText, It is not seems major problem. For some context, I initially planned to go with the mdBook from the Rust project (https://github.com/rust-lang/mdBook/) as Markdown is a bit more approachable. After writing the sample zval chapter, I noticed some pain points in terms of formatting, most significantly tables. That said, reStructuredText is far from perfect itself. As mentioned previously, the other reason for choosing Sphinx was that it is quite extensible. Ilija -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] php-src docs
Hi Mönôme On Sun, Feb 11, 2024 at 2:20 PM Mönôme Epson wrote: > > > centralize documentation of the PHP internals > I'm glad to hear that you're planning to centralize the documentation for PHP > internals. > > > Let me know of any thoughts and suggestions you might have. > I have a preference for devel-docs/ instead of docs/ . This would make the > doc-en repository a PHP subproject. Can you clarify? Do you mean a separate repository called devel-docs? I don't hear the term "devel" much outside of package managers. The suggestion was to put it directly in the php-src repository, where there's not much confusion about its content (precisely the php-src repository). Ilija -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
[PHP-DEV] php-src docs
Hi everyone. I would like to start an initiative to centralize documentation of the PHP internals. As mentioned in a recent e-mail, there's a lot of useful documentation scattered across the internet, like the PHP internals book, various blogs of contributors, wiki.php.net, to name a few. Information is currently hard to discover over multiple mediums. Some of these mediums are also prone to go stale and can't be updated by internals (e.g. blog posts). After a brief discussion at the foundation, we think it's best to incorporate the documentation directly into the php-src repository. This makes it very easy to discover, contribute to, and allows updating documentation right alongside its technical changes. To make browsing the documentation easier, the documentation is built with Sphinx and published to GitHub Pages. I've prepared a PR here: https://github.com/php/php-src/pull/13338 https://iluuu1994.github.io/php-src/ (will be moved to php.github.io once merged) Let me know of any thoughts and suggestions you might have. Ilija -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] Why are serialized strings wrapped in double quotes? (s::"")
Hi Sandy On Tue, Feb 6, 2024 at 9:19 PM Sanford Whiteman wrote: > > I'd like a little background on something we've long accepted: why > does the serialization format need double quotes around a string, even > though the byte length is explicit? > > Example: > > s:5:"hello"; > > All else being equal I would think we could have just > > s:5:hello; > > Was this just to make strings look more 'stringy', even though the > format isn't meant to be human-readable? I don't have the historical context, but I'm assuming that's it. PHPs serialization format is not efficient, and I don't think that was ever the primary focus. If you need something more efficient, you can try https://github.com/igbinary/igbinary which is aimed to be a drop-in replacement. Ilija -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
[PHP-DEV] Re: [RFC][Vote] RFC1867 for non-POST HTTP verbs
Hi everyone On Mon, Jan 22, 2024 at 10:23 AM Ilija Tovilo wrote: > > I started the vote on the "RFC1867 for non-POST HTTP verbs" RFC. > https://wiki.php.net/rfc/rfc1867-non-post The RFC has been accepted with 23 yes and 1 no vote. As promised to Sara, I will be setting up a poll for the suggested, alternative function name in the coming days. Ilija -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] Discussion: making continue and break into an expression
Hi Larry On Thu, Jan 25, 2024 at 6:38 PM Larry Garfield wrote: > > On Thu, Jan 25, 2024, at 11:28 AM, Ilija Tovilo wrote: > > > > This leads to very similar issues as break/continue inside blocks. See: > > https://wiki.php.net/rfc/match_blocks#technical_implications_of_control_statements > > > I'm curious, how did `throw` expressions manage to avoid these issues? Or > was it just "Ilija did the hard work of tracking down the weirdness?" Can't really take the credit for this. This issue went over my head, as this was my first RFC. Exceptions work a bit differently, in that they use something called live-ranges. Essentially, we look at the generated opcodes and figure out which variables are "live" (i.e. valid and unfreed) during which opcodes. For something like echo foo() + bar(): Pseudo opcodes: V1 = CALL foo 0001 V2 = CALL bar 0003 V3 = ADD V1 V2 0004 ECHO V3 V1 would be live for -0003, V2 for 0001-0003, V3 for 0003-0004. If an exception is thrown (or rethrown across function boundaries) the VM checks which temporary variables are currently live and frees them. So if CALL bar were to throw, we'd see that V1 is currently live and needs to be freed. For something like foo() + throw new Exception(), if you replace the second CALL with a throw, you'll see that the live-range for V1 doesn't change, and so this "just works". There was, however, a related issue with the optimizer. echo foo() + throw new Exception(); V1 = CALL foo 0001 THROW 0003 V3 = ADD V1 false 0004 ECHO V3 Where the optimizer would remove the dead instructions after the throw, breaking live-range analysis. V1 = CALL foo 0001 THROW V1 no longer had a consuming opcode, and as such the algorithm could no longer determine the live-range of V1. This would cause V1 to leak. The solution was simply to disable dead code elimination for this case. The solution was suggested by Tyson Andre and implemented by Nikita. In theory, break/continue expressions might try to re-use live-ranges. I recall thinking about this, but I can't seem to remember if there was a reason not to do it. Ilija -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] Discussion: making continue and break into an expression
Hi Robert On Thu, Jan 25, 2024 at 10:16 AM Robert Landers wrote: > > Now that throwing is an expression, it allows for some very concise > programming. What are your thoughts on making a break/continue into an > expression as well? > > Instead of: > > while(true) { > ... > if(is_null($arr['var'])) continue; > if($something) continue; else break; > ... > } > > You could write > > while(true) { > ... > $arr['var'] ?? continue; > $something ? continue : break; > ... > } This leads to very similar issues as break/continue inside blocks. See: https://wiki.php.net/rfc/match_blocks#technical_implications_of_control_statements I'll try to explain. The VM works with temporary variables. For the expression foo() + bar() two temporary variables for the result of foo() and bar() will be created, which are then used for the + operation. Normally, + will consume both operands, i.e. use and then free them. However, with break/continue etc. being expressions, it would become possible to skip over the consuming instructions. do { echo foo() + break; } while (true); echo 'Done'; Pseudo opcodes: V1 = CALL foo 0001 JMP 0005 0002 V2 = ADD V1 false ; false is here represents a bottom value that will never actually be used 0003 ECHO V2 0004 JMP 0005 ECHO 'Done' Since JMP will skip over the ADD instruction, V1 remains unused. A similar problem already exists for break/continue in foreach itself. foreach ($foos as $foo) { foreach ($bars as $bar) { break 2; } } foreach holds a copy of $bars (in case it gets modified) that normally gets cleaned up when the loop ends. With break over multiple loop-boundaries, we can completely skip over this freeing mechanism. PHP solves this by inserting an explicit FE_FREE instruction before the break 2, which itself is essentially just a JMP to the end of the outer loop. Hopefully it's now more evident why this is a problem: while (true) { foo() && break; } foo() returns a value that would normally be consumed by the && operation. However, with break, we may skip over the && operation entirely. As such, the break itself becomes responsible for freeing these values. This requires significant changes in the compiler to track variables that are currently "live" (i.e. haven't been consumed yet), and emitting FREE opcodes for them as needed. I've implemented this for match blocks here: https://github.com/php/php-src/compare/master...iluuu1994:php-src:match-blocks-var-tracking However, note that due to complexity, I've decided to disallow using break/continue and the likes in such contexts to avoid this issue completely, which isn't possible for what you are suggesting. There's another related issue. foo(bar(), break); Function calls in PHP consist of multiple instructions, namely an INIT_CALL, 0-n SEND and a DO_CALL opcode. INIT_CALL creates a stack frame, SEND pushes arguments onto the stack frame, and DO_CALL starts the execution of the function and frees both arguments and stack frame when the function ends. If prior to a SEND opcode we break, we skip over the DO_CALL, so the stack frame needs to be freed manually. The patch linked above solves this by inserting CLEAN_UNFINISHED_CALLS opcodes that do as the name suggests. This mechanism is already used for exceptions. This should work for you, but was insufficient for match blocks, for reasons I won't get into here. All this to say: Don't expect the implementation here to be trivial. Regards, Ilija -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] Re: Wiki Access request
On Wed, Jan 24, 2024 at 1:22 PM Ilija Tovilo wrote: > > Thank you for the list. It looks more digestible, and most of it is > already congruent with CONTRIBUTING.md. > > * https://www.zend.com/resources/writing-php-extensions requires > sharing your contact infomation to obtain. I'm not sure how other > people feel about this. Oh, nevermind. That sidebar is clickable. I missed that there's an online version. Ilija -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] Re: Wiki Access request
Hi Carlos On Wed, Jan 24, 2024 at 12:29 PM Barel wrote: > > > Feel free to share this list here, on GitHub or otherwise. I'm > > skeptical whether throwing partially outdated resources at people is > > actually helpful. > > I trimmed the list that I collected so that it only included the most > significant items which have a lot of information. This list would be: > - https://www.phpinternalsbook.com/ PHP Internals book > - https://phpinternals.net/ PHP Internals web site with documentation about > a lot of the structures and macros used in the code > - https://www.npopov.com/ Nikita Popov's blog > - http://blog.jpauli.tech/ Julien Pauli's blog > - https://phpinternals.news/ Derick Rethans' podcast > - https://www.zend.com/resources/writing-php-extensions Zend's guide about > writing PHP extensions > - https://wiki.php.net/internals The internals page in the wiki > - https://www.informit.com/store/extending-and-embedding-php-9780672327049 > Sara Golemon's printed book > > Do you think this sounds good? If you do, I will create the PR to update > the contributing doc Thank you for the list. It looks more digestible, and most of it is already congruent with CONTRIBUTING.md. * Juliens blog is a bit large, but contains some very detailed posts that should still be relevant. It might make sense to list them explicitly. * https://www.zend.com/resources/writing-php-extensions requires sharing your contact infomation to obtain. I'm not sure how other people feel about this. * https://www.informit.com/store/extending-and-embedding-php-9780672327049 might be outdated, as it was published back in the PHP 5.1 era. I'll leave it up to Sara whether she thinks the book should be recommended at this time. The rest seems to be already listed. Regards, Ilija -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] [RFC][Vote] RFC1867 for non-POST HTTP verbs
Hi Sara Thank you for your feedback. On Tue, Jan 23, 2024 at 8:41 PM Sara Golemon wrote: > > On Mon, Jan 22, 2024 at 1:24 AM Ilija Tovilo wrote: > > > I started the vote on the "RFC1867 for non-POST HTTP verbs" RFC. > > https://wiki.php.net/rfc/rfc1867-non-post > > > > 1/ This function reaches into the SAPI to pull out the "special" body > data. That's great, but what about uses where providing an input string > makes sense. For that, and for point 2, I'd suggest > `http_parse_query(string $query, ?array $options = null): array|object`. The RFC previously included support for the $input_stream variable (string is not very appropriate for multipart because the input may be arbitrarily large). The implementation wasn't complex, but it required duplication of all the reads to support both a direct read from the SAPI and a read from the stream, duplication of some limit checks and special passing of the streams to avoid SAPI API breakage. As for actual use cases, I found limited evidence that this function would be useful for worker-based services _right now_. Most services handle request parsing in some other layer. For example, RoadRunner has a Go server that stores the file to disk, and then just passes the appropriate path to PHP in the $_FILES array. It seems to me that a custom input would be useful exclusively for a web server written in PHP. The one that was pointed out to me (AdapterMan) handles requests as strings, which would not scale for multipart requests. I don't mind getting back to this if AdapterMan rewrites request handling to use streams. Adding back the $input_stream parameter can be done with no BC breaks. But for the time being, I don't think the motivation is big enough to justify the added complexity. Additionally, because multipart is used exclusively as a request content type, it isn't useful in a general sense either, because a PHP request will typically only receive one request (but potentially multiple responses, in case it communicates with other servers). > 2/ `request_` represents a new psuedo-namespace, functions are easier to > find and associate if we keep them grouped. I recommend 'http_` because it > compliments the very related function `http_build_query()`, and for the > version of this function which refers directly to the request: > `http_parse_request(?array $options = null) : array|object`. That's fair. If the name bothers you I can create an amendment RFC. I think http_parse_body() would be a bit more appropriate, because request implies more than just the body. Ilija -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] Re: Wiki Access request
Hi Carlos You should now have access to the /internals sub-pages. On Mon, Jan 22, 2024 at 11:38 AM Barel wrote: > > I didn't know that there was a list of tech resources listed in the > "CONTRIBUTING.md" file. I have been researching resources that have > information about PHP internals and have found a lot, more than twenty. I > can think of three possibilities: > - List them all in the contributing document but this can be a bit > overwhelming as, like I said, the number of resources available is big Feel free to share this list here, on GitHub or otherwise. I'm skeptical whether throwing partially outdated resources at people is actually helpful. > - Add a new page in the Github repo with all these links and link to that > page from the contributing page > - Keep the list in the wiki and link to that page from the contributing page One of the reasons I'd like to move off of the wiki for these things is that there is no review process. We highly value volunteer work, but trusting new contributors to blindly make changes to official guides is obviously somewhat problematic. I'd prefer to make CONTRIBUTING.md the "official list" for now. This file is allowed to be long. And as mentioned, I suspect that the list of 20+ items may be trimmed. Ilija -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
[PHP-DEV] [RFC][Vote] RFC1867 for non-POST HTTP verbs
Hi everyone I started the vote on the "RFC1867 for non-POST HTTP verbs" RFC. https://wiki.php.net/rfc/rfc1867-non-post Ilija -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] Newly Created Wiki Account - Quick Introduction
Hi Jair! On Mon, Jan 22, 2024 at 5:14 AM Jair Humberto wrote: > > My name is Jair Humberto, I am brazilian and have been working with PHP > since 2006. I am super excited about the new things PHP has launched lately > and finally realised that I can contribute as well. I think this is a new > fase of my career, I want to learn a bit more about the php source code and > processes and I plan to start contributing with small things, but my first > main goal (long term) is to make possible generics in PHP, one day! Welcome! Note that you do not need a wiki account for contributing to PHP. Source code is managed on GitHub (https://github.com/php/php-src) and changes are made via pull requests. Nowadays, you'll only really need a wiki account to create RFCs. Check the CONTRIBUTING.md file for some good resources on the internals of PHP. https://github.com/php/php-src/blob/master/CONTRIBUTING.md#technical-resources If you need any guidance, feel free to contact me. Ilija -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] Wiki Access request
Hi Carlos! On Fri, Jan 19, 2024 at 4:56 PM Barel wrote: > > This page in the Wiki https://wiki.php.net/internals/references has a lot > of links which are outdated and should be removed or changed. If you can > provide wiki edit access for me I can work on updating them and also try to > find more recent links to add to that reference page (suggestions welcome!!) > > My username for the wiki site is "barelon" in case this is needed I can absolutely give you write access to these pages. Updating this list to reflect more up-to-date resources certainly makes sense. As you probably know, there are a number of different places where php internals are documented, and I think that, long-term, it makes sense to try to consolidate these efforts. We briefly spoke about this in the last foundation meeting. We have some documentation in the php-src repository itself, a significant amount in the php internals book (https://www.phpinternalsbook.com/), some in the wiki, some on blogs of current or previous contributors, etc. There are a number of things that are important when it comes to documentation, like convenience, access, history, discoverability, etc. While not the worst, I don't think the wiki is the best place for this work. Handling documentation directly through Git in PHPs main repository (or at least a repository in the PHP organization) would likely tick the most boxes. Providing documentation with PRs might also improve the understanding of intention of the changes for reviewers. As for links to other references, I believe the CONTRIBUTING.md file is currently the most up-to-date. https://github.com/php/php-src/blob/master/CONTRIBUTING.md#technical-resources Rather than duplicating this list on the wiki, it might make sense to reference the CONTRIBUTING.md file, and extend it as necessary. Ilija -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php