Re: [PHP-DEV] [RFC] token_get_all() TOKEN_AS_OBJECT mode

Nikita Popov Fri, 14 Feb 2020 00:49:12 -0800

On Thu, Feb 13, 2020 at 6:06 PM Larry Garfield <la...@garfieldtech.com>
wrote:

> On Thu, Feb 13, 2020, at 3:47 AM, Nikita Popov wrote:
> > Hi internals,
> >
> > This has been discussed a while ago already, now as a proper proposal:
> > https://wiki.php.net/rfc/token_as_object
> >
> > tl;dr is that it allows you to get token_get_all() output as an array of
> > PhpToken objects. This reduces memory usage, improves performance, makes
> > code more uniform and readable... What's not to like?
> >
> > An open question is whether (at least to start with) PhpToken should be
> > just a data container, or whether we want to add some helper methods to
> it.
> > If this generates too much bikeshed, I'll drop methods from the proposal.
> >
> > Regards,
> > Nikita
>
> I love everything about this.
>
> 1) I would agree with Nicolas that a static constructor would be better.
> I don't know about polyfilling it, but it's definitely more
> self-descriptive.
>
> 2) I'm skeptical about the methods.  I can see them being useful, but also
> being bikeshed material.  For instance, if you're doing annotation parsing
> then docblocks are not ignorable.  They're what you're actually looking for.
>
> Two possible additions, feel free to ignore if they're too complicated:
>
> 1) Should it return an array of token objects, or a lazy iterable?  If I'm
> only interested in certain types (eg, doc strings, classes, etc.) then a
> lazy iterable would allow me to string some filter and map operations on to
> it and use even less memory overall, since the whole tree is not in memory
> at once.
>

I'm going to take you up on your offer and ignore this one :P Returning
tokens as an iterator is inefficient because it requires full lexer state
backups and restores for each token. Could be optimized, but I wouldn't
bother with it for this feature. I also personally have no use-case for a
lazy token stream. (It's technically sufficient for parsing, but if you
want to preserve formatting, you're going to be preserving all the tokens
anyway.)

> 2) Rather than provide bikesheddable methods, would it be feasible to take
> a queue from PDO and let users specify a subclass of PhpToken to fetch
> into?  That way the properties are always there, but a user can attach
> whatever methods make sense for them.
>

It would be technically feasible. If we go with a static method for
construction, then one might even say that there's reasonable expectation
that PhpToken::getAll(...) is going to return PhpToken[] and
MyPhpTokenExtension::getAll() is going to return MyPhpTokenExtension[].

I'm a bit apprehensive about this though, specifically because you mention
PDO... which, I think, isn't exactly a success story when it comes to this.
If we do this, then the behavior would be that the object gets created, the
properties populated, and *no constructor gets called*. The last part is
important -- when you start calling constructors and magic methods, that's
where the mess starts and you get PDO.

Regards,
Nikita

Re: [PHP-DEV] [RFC] token_get_all() TOKEN_AS_OBJECT mode

Reply via email to