[PHP-DEV][RFC][VOTE] Rename T_PAAMAYIM_NEKUDOTAYIM to T_DOUBLE_COLON

2020-06-25 Thread G. P. B.
Hello internals,

As the two week discussion period has elapsed the vote is now open.

We did acknowledge the suggestion of dropping the token name from the error
message directly, but in our opinion this is an orthogonal change to the
one proposed, and has the risk of not landing in PHP 8.0.

The vote will close on the 9th of July.

https://wiki.php.net/rfc/rename-double-colon-token

Best regards

George P. Banyard


Re: [PHP-DEV][RFC][VOTE] Rename T_PAAMAYIM_NEKUDOTAYIM to T_DOUBLE_COLON

2020-06-27 Thread Andrea Faulds

Hi again,

A further and perhaps more important thought: I think the token names 
are actually the least confusing part of parser errors, even for the 
famous T_PAAMAYIM_NEKUDOTAYIM. Changing it to T_DOUBLE_COLON may not 
help much, because the parser only tells you what the next token it 
expected was, not *why* it expected it, i.e. what is wrong with the 
syntax. A user might think they need to add a :: but it's not their 
actual problem.


For example, if you google for “T_PAAMAYIM_NEKUDOTAYIM” errors, one of 
the classic examples where you got such an error was:


  var_dump(empty(TRUE));

If the error had said T_DOUBLE_COLON it would still be mystifying: why 
did PHP think you needed a :: there? And just adding one won't fix the 
problem! The actual issue was that empty() used to not support arbitrary 
expressions, only variables, and the expected T_PAAMAYIM_NEKUDOTAYIM is 
because the only way TRUE could be part of a variable would be if it was 
a class name (TRUE::$foo). The way to fix it is to replace empty() with 
the ! operator, but you'd have a hard time figuring that out from the 
error. I think this is the real reason T_PAAMAYIM_NEKUDOTAYIM was 
famous: even if you knew it meant double colon, the error message is 
still cryptic.


The good news is T_PAAMAYIM_NEKUDOTAYIM is no longer quite the menace it 
once was. PHP 7's parser and syntax overhaul (thank you Nikita!) fixed 
it in some places:


  $ php -r 'var_dump(isset(TRUE));'
  Fatal error: Cannot use isset() on the result of an expression (you 
can use "null !== expression" instead)


And other places where you might have once seen T_PAAMAYIM_NEKUDOTAYIM 
now give a different unhelpful parser error, which renaming 
T_PAAMAYIM_NEKUDOTAYIM will not help with:


  $ php -r 'unset(TRUE);'
  Parse error: syntax error, unexpected ')', expecting '[' in Command 
line code on line 1


So if there was ever a time to rename T_PAAMAYIM_NEKUDOTAYIM, it would 
have been many years ago. There's much less benefit to renaming it now, 
especially given it says “'::' (T_PAAMAYIM_NEKUDOTAYIM)” if you manage 
to get an error containing it, so you don't even need to google it. The 
specific name a token is given is the least of the problems there.


As for parser errors, I don't know how easy they would be to improve… is 
it even possible for us to do so without using a hand-written parser 
instead of an auto-generated one? (I have no idea.)


Regards,
Andrea


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV][RFC][VOTE] Rename T_PAAMAYIM_NEKUDOTAYIM to T_DOUBLE_COLON

2020-06-27 Thread Rowan Tommins

On 27/06/2020 10:56, Andrea Faulds wrote:
Of course, if T_PAAMAYIM_NEKUDOTAYIM was never encountered by userland 
developers, this RFC wouldn't exist. The thing is, I don't think 
T_DOUBLE_COLON should be encountered by userland developers either — 
in my view, as an implementation detail, token names shouldn't be part 
of parser error messages at all. If we were to remove token names from 
the parser errors, we would avoid the problem this RFC seeks to solve. 
For most tokens we could simply display the characters it corresponds 
to (e.g. "::" for T_PAAMAYIM_NEKUDOTAYIM, which we already do!), and 
for those with variable content (e.g. T_STRING) we could display a 
human-readable description of what is expected (e.g. "an identifier"). 



Just to confirm, I am actively working on exactly this, and although 
slightly delayed by an outbreak of sunny weather, fully expect to have a 
patch (and, if deemed necessary, RFC) in plenty of time for 8.0.


I intend to post a new thread with examples of old and new messages once 
I've finalised the details.



Regards,

--
Rowan Tommins (né Collins)
[IMSoP]

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV][RFC][VOTE] Rename T_PAAMAYIM_NEKUDOTAYIM to T_DOUBLE_COLON

2020-06-27 Thread Dan Ackroyd
On Sat, 27 Jun 2020 at 11:20, Benjamin Eberlei  wrote:

> 1. The token name has historic value and should be preserved as such

I don't think it has historic value. It might have nostalgic value but
nostalgia for:

> I think everyone hits it once or twice, has the "wtf?" moment, googles, and
> never forgets it again - even if they can't pronounce it...

is not a great piece of nostalgia.

But regardless of the past value, it's the current and future
trade-offs we should be considering.

i.e. is the cost of changing it bigger or smaller than the benefit of
not having a token name that can only be reasoned about after
googling.

I haven't seen anyone say any cost for changing it. Even if the work
is done to improve the error messages, avoiding having a token that
you can't grok the meaning from the name of it, is a small
improvement.

So the tradeoff appears to be very small or zero cost, versus a
benefit for everyone who hasn't already stumbled into it as a blocker
for their understanding of internals.

cheers
Dan
Ack

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV][RFC][VOTE] Rename T_PAAMAYIM_NEKUDOTAYIM to T_DOUBLE_COLON

2020-06-27 Thread G. P. B.
Hello Andrea, Benjamin, and others,

Better error messages are obviously better than just replacing the name of
the token, however this argument is saying that because this isn't perfect
let's do nothing.

I wasn't aware that Rowan had made good progress on his patch, however, if
this patch needs an RFC, which it might not, there is still no guarantee on
having better error messages for PHP 8.0.

Now, this may just me, but I find the argument that "::" is present
relatively weak as when I scan a PHP parse error, the first thing I look
for is the token name as there are only 3 parse errors which have the added
information of what the token represents, namely T_SL, T_SR, and
T_PAAMAYIM_NEKUDOTAYIM. T_SL and T_SR also have this issue but I would
expect beginners to not hit this as often as I don't expect them to do
anything with bit shifts, and the only other way that they may encounter it
is by doing a bad git merge.

I find it highly frustrating, and borderline offensive, that we are being
asked to go from a simple, non BC breaking, easy to enact change, to a
semi-major overhaul of how error messages should look like. I personally
have no interest in learning Bison and how to implement that as a separate
improvement to PHP. I'm just glad that Rowan decided to take on this
challenge.

Even if better error messages come about, some people will still need to
deal with the token, such as static analyser, code sniffers, code style
tools, etc. Obviously people working on these tools aren't beginners and
deal with the weird token name just fine, or use the T_DOUBLE_COLON alias
if they can. But why should it still be like this? This change has no BC
break, makes English the consistent language for token names. Moreover,
something being historic doesn't mean it shouldn't be touched.

My perception is that most of the community finds it baffling why anyone
would be against this change.

On Sat, 27 Jun 2020 at 15:57, Andrea Faulds  wrote:

> As for parser errors, I don't know how easy they would be to improve… is
> it even possible for us to do so without using a hand-written parser
> instead of an auto-generated one? (I have no idea.)

This may be done using Bison 3.6, which got released in May of this year,
as seen by this PR: https://github.com/php/php-src/pull/5416 However, I
don't expect us to be able to use this as Bison 3.6, won't be present on
most distrib until a couple of years.

Best regards
George P. Banyard


Re: [PHP-DEV][RFC][VOTE] Rename T_PAAMAYIM_NEKUDOTAYIM to T_DOUBLE_COLON

2020-06-27 Thread Stanislav Malyshev
Hi!

> Better error messages are obviously better than just replacing the name of
> the token, however this argument is saying that because this isn't perfect
> let's do nothing.

I don't think this is the argument. I think the argument is rather than
half-fix the problem wrong way, let's fully fix it the right way.

> I find it highly frustrating, and borderline offensive, that we are being
> asked to go from a simple, non BC breaking, easy to enact change, to a
> semi-major overhaul of how error messages should look like. I personally

I am not sure why is this "offensive" to you. Is there something in
overhauling error messages that goes against your principles or
religious beliefs? Sometimes overhauling is exactly what is the right
way to go, and not showing internal parser tokens seems to be quite
reasonable idea. If this idea is somehow "offensive" to you, I feel
sorry for you but it's not a reason to abandon this idea. It is also
quite a common things - many proposals in PHP have been rejected because
the community thought it's not the right way to approach things, I
myself have had some proposals rejected because of this. There's no
reason to feel offended by that.

> My perception is that most of the community finds it baffling why anyone
> would be against this change.

My perception is that most of the community couldn't care less for how
the tokens are named, and really shouldn't. It's an internal thing and
should stay this way. If they made to care, that's our fault and we
should fix it.

-- 
Stas Malyshev
smalys...@gmail.com

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV][RFC][VOTE] Rename T_PAAMAYIM_NEKUDOTAYIM to T_DOUBLE_COLON

2020-06-28 Thread Rowan Tommins

On 27/06/2020 23:53, G. P. B. wrote:


[...] there are only 3 parse errors which have the added
information of what the token represents, namely T_SL, T_SR, and
T_PAAMAYIM_NEKUDOTAYIM.



I'm not sure what you mean by there being only three. Since PHP 5.4, all 
parser errors include the content of the token as well as its name, even 
when this is basically repeating the same word, such as "unexpected 
'echo' (T_ECHO)". There's even a bug in the implementation meaning that 
casts repeat the name three times, e.g. "unexpected '(int)' (int) 
(T_INT_CAST)".


This is why I was able to get a working prototype for new error messages 
fairly quickly, and have had a draft PR open for nearly two weeks, but 
also why I've spent a bit of time refining it to make sure edge cases 
are handled better.




I find it highly frustrating, and borderline offensive, that we are being
asked to go from a simple, non BC breaking, easy to enact change, to a
semi-major overhaul of how error messages should look like.


I understand the general sentiment, that there were a lot of people in 
the previous thread saying re-wording was a good idea, and not many 
offering to work on it. I also appreciate that I hadn't provided any 
public updates on my progress, and what hurdles needed to be over-come. 
But you did know I was working on a patch, so the simple solution would 
have been to ask me before opening the vote.



Regards,

--
Rowan Tommins (né Collins)
[IMSoP]

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV][RFC][VOTE] Rename T_PAAMAYIM_NEKUDOTAYIM to T_DOUBLE_COLON

2020-06-29 Thread Claude Pache


> Le 28 juin 2020 à 00:53, G. P. B.  a écrit :
> 
> My perception is that most of the community finds it baffling why anyone
> would be against this change.
> 

What baffles me, is the amount of discussion around changing the name of ONE 
token, whereas it is clear that a bunch of tokens (a few of them explicitly 
mentioned during the discussion phase of the RFC) have incomprehensible names, 
or worse, misleading names (T_STRING instead of T_IDENTIFIER), several of them 
appearing more often in error messages than T_DOUBLE_COLON (so, to be 
super-clear, I’m not talking about T_SL). And at this point I have still 
restricted my view to the issue of token name, whereas there is an obvious way 
to do better at relatively low cost, namely replacing the token name with a 
user-friendly description of the token.

I really appreciate the efforts made to make PHP better; thanks for that. But I 
friendly suggest that, each time you are going to propose an improvement, you 
take first a step back and consider how your change fits in its wider context.

—Claude

Re: [PHP-DEV][RFC][VOTE] Rename T_PAAMAYIM_NEKUDOTAYIM to T_DOUBLE_COLON

2020-07-10 Thread Kalle Sommer Nielsen
Hello Internals

The voting has closed with 44 for and 30 against. The RFC has not been
accepted as it did not meet the 2/3 requirement to be approved, thank
you to everyone for joining the discussions and turned out to vote.


-- 
regards,

Kalle Sommer Nielsen
ka...@php.net

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV][RFC][VOTE] Rename T_PAAMAYIM_NEKUDOTAYIM to T_DOUBLE_COLON

2020-06-27 Thread Andrea Faulds

Hi,

G. P. B. wrote:

https://wiki.php.net/rfc/rename-double-colon-token


I have voted No to this, and I hope I can convince some others to do the 
same.


T_PAAMAYIM_NEKUDOTAYIM is such a famous token that there is probably 
nobody in internals who doesn't know what it means, and for new 
contributors, it is easy to find the definition, and note that it is 
hardly the only token name that they will need to look up. It is also a 
fun nod to the history of PHP, and I think it would be a shame to lose 
that.


I mention the internals usage first and foremost, because it should be 
remembered that token names are merely an implementation detail of the 
PHP interpreter, unless you're using token_get_all (which by the way 
already has the alias T_DOUBLE_COLON). In other words, it's not 
something the vast majority of userland developers should ever encounter 
or have to think about.


Of course, if T_PAAMAYIM_NEKUDOTAYIM was never encountered by userland 
developers, this RFC wouldn't exist. The thing is, I don't think 
T_DOUBLE_COLON should be encountered by userland developers either — in 
my view, as an implementation detail, token names shouldn't be part of 
parser error messages at all. If we were to remove token names from the 
parser errors, we would avoid the problem this RFC seeks to solve. For 
most tokens we could simply display the characters it corresponds to 
(e.g. "::" for T_PAAMAYIM_NEKUDOTAYIM, which we already do!), and for 
those with variable content (e.g. T_STRING) we could display a 
human-readable description of what is expected (e.g. "an identifier").


I think the case for not renaming T_PAAMAYIM_NEKUDOTAYIM, and instead 
improving the error messages, is stronger when you consider that is not 
the only token with a name that might confuse people outside internals. 
For example, T_STRING is a very common token, but the name is probably 
going to surprise most userland developers who encounter it in an error 
message, because it doesn't mean a literal string. Even for tokens with 
more conventional names, it is unnecessary extra information. I think 
renaming just T_PAAMAYIM_NEKUDOTAYIM is not a full solution to the 
problem this RFC intends to solve.


Apropos of that:

> We did acknowledge the suggestion of dropping the token name from the
> error message directly, but in our opinion this is an orthogonal
> change to the one proposed, and has the risk of not landing in PHP
> 8.0.

Is PHP 8.0 an all-important? If we _don't_ rename the tokens, but simply 
improve the error message, that might be allowable in a patch release 
(e.g. 8.0.1).


(I also don't think we should rush things if we are unsure about them, 
given the consequences that has had in the past.)


Thanks,
Andrea

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV][RFC][VOTE] Rename T_PAAMAYIM_NEKUDOTAYIM to T_DOUBLE_COLON

2020-06-27 Thread Benjamin Eberlei
On Sat, Jun 27, 2020 at 11:57 AM Andrea Faulds  wrote:

> Hi,
>
> G. P. B. wrote:
> > https://wiki.php.net/rfc/rename-double-colon-token
>
> I have voted No to this, and I hope I can convince some others to do the
> same.
>
> T_PAAMAYIM_NEKUDOTAYIM is such a famous token that there is probably
> nobody in internals who doesn't know what it means, and for new
> contributors, it is easy to find the definition, and note that it is
> hardly the only token name that they will need to look up. It is also a
> fun nod to the history of PHP, and I think it would be a shame to lose
> that.
>
> I mention the internals usage first and foremost, because it should be
> remembered that token names are merely an implementation detail of the
> PHP interpreter, unless you're using token_get_all (which by the way
> already has the alias T_DOUBLE_COLON). In other words, it's not
> something the vast majority of userland developers should ever encounter
> or have to think about.
>
> Of course, if T_PAAMAYIM_NEKUDOTAYIM was never encountered by userland
> developers, this RFC wouldn't exist. The thing is, I don't think
> T_DOUBLE_COLON should be encountered by userland developers either — in
> my view, as an implementation detail, token names shouldn't be part of
> parser error messages at all. If we were to remove token names from the
> parser errors, we would avoid the problem this RFC seeks to solve. For
> most tokens we could simply display the characters it corresponds to
> (e.g. "::" for T_PAAMAYIM_NEKUDOTAYIM, which we already do!), and for
> those with variable content (e.g. T_STRING) we could display a
> human-readable description of what is expected (e.g. "an identifier").
>
> I think the case for not renaming T_PAAMAYIM_NEKUDOTAYIM, and instead
> improving the error messages, is stronger when you consider that is not
> the only token with a name that might confuse people outside internals.
> For example, T_STRING is a very common token, but the name is probably
> going to surprise most userland developers who encounter it in an error
> message, because it doesn't mean a literal string. Even for tokens with
> more conventional names, it is unnecessary extra information. I think
> renaming just T_PAAMAYIM_NEKUDOTAYIM is not a full solution to the
> problem this RFC intends to solve.
>
> Apropos of that:
>
>  > We did acknowledge the suggestion of dropping the token name from the
>  > error message directly, but in our opinion this is an orthogonal
>  > change to the one proposed, and has the risk of not landing in PHP
>  > 8.0.
>
> Is PHP 8.0 an all-important? If we _don't_ rename the tokens, but simply
> improve the error message, that might be allowable in a patch release
> (e.g. 8.0.1).
>
> (I also don't think we should rush things if we are unsure about them,
> given the consequences that has had in the past.)
>

This is the same reason that I voted no.

1. The token name has historic value and should be preserved as such
2. as several people suggested in the original RFC discussion, we should
just not display tokens in error messages anymore. That should have been
the RFCs solution, not renaming the token

>
> Thanks,
> Andrea
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: https://www.php.net/unsub.php
>
>


Re: [PHP-DEV][RFC][VOTE] Rename T_PAAMAYIM_NEKUDOTAYIM to T_DOUBLE_COLON

2020-06-27 Thread Nikita Popov
On Sat, Jun 27, 2020 at 11:57 AM Andrea Faulds  wrote:

> Hi,
>
> G. P. B. wrote:
> > https://wiki.php.net/rfc/rename-double-colon-token
>
> I have voted No to this, and I hope I can convince some others to do the
> same.
>
> T_PAAMAYIM_NEKUDOTAYIM is such a famous token that there is probably
> nobody in internals who doesn't know what it means, and for new
> contributors, it is easy to find the definition, and note that it is
> hardly the only token name that they will need to look up. It is also a
> fun nod to the history of PHP, and I think it would be a shame to lose
> that.
>
> I mention the internals usage first and foremost, because it should be
> remembered that token names are merely an implementation detail of the
> PHP interpreter, unless you're using token_get_all (which by the way
> already has the alias T_DOUBLE_COLON). In other words, it's not
> something the vast majority of userland developers should ever encounter
> or have to think about.
>
> Of course, if T_PAAMAYIM_NEKUDOTAYIM was never encountered by userland
> developers, this RFC wouldn't exist. The thing is, I don't think
> T_DOUBLE_COLON should be encountered by userland developers either — in
> my view, as an implementation detail, token names shouldn't be part of
> parser error messages at all. If we were to remove token names from the
> parser errors, we would avoid the problem this RFC seeks to solve. For
> most tokens we could simply display the characters it corresponds to
> (e.g. "::" for T_PAAMAYIM_NEKUDOTAYIM, which we already do!), and for
> those with variable content (e.g. T_STRING) we could display a
> human-readable description of what is expected (e.g. "an identifier").
>
> I think the case for not renaming T_PAAMAYIM_NEKUDOTAYIM, and instead
> improving the error messages, is stronger when you consider that is not
> the only token with a name that might confuse people outside internals.
> For example, T_STRING is a very common token, but the name is probably
> going to surprise most userland developers who encounter it in an error
> message, because it doesn't mean a literal string. Even for tokens with
> more conventional names, it is unnecessary extra information. I think
> renaming just T_PAAMAYIM_NEKUDOTAYIM is not a full solution to the
> problem this RFC intends to solve.
>
> Apropos of that:
>
>  > We did acknowledge the suggestion of dropping the token name from the
>  > error message directly, but in our opinion this is an orthogonal
>  > change to the one proposed, and has the risk of not landing in PHP
>  > 8.0.
>
> Is PHP 8.0 an all-important? If we _don't_ rename the tokens, but simply
> improve the error message, that might be allowable in a patch release
> (e.g. 8.0.1).
>
> (I also don't think we should rush things if we are unsure about them,
> given the consequences that has had in the past.)
>
> Thanks,
> Andrea
>

Hey Andrea,

You convinced me! To vote "yes" on this RFC :P

I initially thought that this proposal is simply redundant, because the
issue would be fully addressed by

a) not displaying token names in errors, which I expect to happen for PHP
8.0, and
b) renaming T_PAAMAYIM_NEKUDOTAYIM as far as internal usage is concerned,
which is an implementation detail on which internals@ input is not required,

but I now realize that there might be friction on point b) without this
RFC, so it is better to accept it.

Regards,
Nikita