On Thu, Feb 13, 2020 at 6:06 PM Larry Garfield <[email protected]> wrote:
> On Thu, Feb 13, 2020, at 3:47 AM, Nikita Popov wrote: > > Hi internals, > > > > This has been discussed a while ago already, now as a proper proposal: > > https://wiki.php.net/rfc/token_as_object > > > > tl;dr is that it allows you to get token_get_all() output as an array of > > PhpToken objects. This reduces memory usage, improves performance, makes > > code more uniform and readable... What's not to like? > > > > An open question is whether (at least to start with) PhpToken should be > > just a data container, or whether we want to add some helper methods to > it. > > If this generates too much bikeshed, I'll drop methods from the proposal. > > > > Regards, > > Nikita > > I love everything about this. > > 1) I would agree with Nicolas that a static constructor would be better. > I don't know about polyfilling it, but it's definitely more > self-descriptive. > > 2) I'm skeptical about the methods. I can see them being useful, but also > being bikeshed material. For instance, if you're doing annotation parsing > then docblocks are not ignorable. They're what you're actually looking for. > > Two possible additions, feel free to ignore if they're too complicated: > > 1) Should it return an array of token objects, or a lazy iterable? If I'm > only interested in certain types (eg, doc strings, classes, etc.) then a > lazy iterable would allow me to string some filter and map operations on to > it and use even less memory overall, since the whole tree is not in memory > at once. > I'm going to take you up on your offer and ignore this one :P Returning tokens as an iterator is inefficient because it requires full lexer state backups and restores for each token. Could be optimized, but I wouldn't bother with it for this feature. I also personally have no use-case for a lazy token stream. (It's technically sufficient for parsing, but if you want to preserve formatting, you're going to be preserving all the tokens anyway.) > 2) Rather than provide bikesheddable methods, would it be feasible to take > a queue from PDO and let users specify a subclass of PhpToken to fetch > into? That way the properties are always there, but a user can attach > whatever methods make sense for them. > It would be technically feasible. If we go with a static method for construction, then one might even say that there's reasonable expectation that PhpToken::getAll(...) is going to return PhpToken[] and MyPhpTokenExtension::getAll() is going to return MyPhpTokenExtension[]. I'm a bit apprehensive about this though, specifically because you mention PDO... which, I think, isn't exactly a success story when it comes to this. If we do this, then the behavior would be that the object gets created, the properties populated, and *no constructor gets called*. The last part is important -- when you start calling constructors and magic methods, that's where the mess starts and you get PDO. Regards, Nikita
