Re: [PATCH] Do not allow identifiers that start with a negative number.

Joel E. Denny Sat, 15 Jan 2011 13:52:17 -0800

Alex and Akim, you might also want to review Paul Eggert's changes as they 
rewrite a significant portion of Alex's work, and Paul has already pushed 
his changes to master.  My review is below.

On Sun, 9 Jan 2011, Paul Eggert wrote:

> On 01/09/2011 05:39 AM, Joel E. Denny wrote:
> > capturing the full sequence of characters that the user might have 
> > accidentally intended as part of the symbol name enables Bison to give 
> > more helpful errors and warnings.
> 
> That is an advantage, but it is a relatively minor one compared to the
> confusion engendered by overly complex rules.  Even now I don't
> fully understand them, and I expect users are likely to be in the
> same boat.  For example, Bison currently complains if you jam a
> number next to an identifier, like this:
> 
>    123FOO
> 
> because that's confusing.

Even programmers who don't use Bison would likely agree that, without 
whitespace, that should be one token not two.  The only question is 
whether it's a valid token.

Unfortunately, at least as far back as 1.875 and as recent as 2.4.3, Bison 
has treated that as two tokens.  Thus, Bison accepted

  %token BAR 123FOO

as

  %token BAR 123 FOO

Akim fixed this in branch-2.5 and master so that 123FOO is an error.  
Your patch does not change this behavior, so I assume you're fine with 
Akim's fix.

> Presumably it should also complain about this:
> 
>   -123
> 
> because that's an identifier (-) jammed next to a number (123).

What programmer wouldn't read that as a negative integer?

Before your patch, branch-2.5 and master unfortunately accepted

  %token FOO -123

as

  %token FOO - 123

Clearly that's a bug.  Instead, Bison should read -123 as a negative 
integer and report an error as it has previously.  Fixing that would be 
easy.

However, after your patch, Bison actually accepts -123 as an identifier. 
That isn't intuitive at all.

> But then, why not also complain about this?
> 
>   -.234FOO
> 
> as that also looks like a number (-.234) jammed next to an identifier?

Bison understands integers.  It doesn't understand "." as a decimal point.  
It should be fairly easy for users to grasp that concept.  Even Yacc 
treats "." as a dot not a decimal point.  Our documentation simply needs 
to be clear about what is an identifier:

  An identifier can be any sequence of letters, underscores, periods, 
  dashes, and digits that does not start with an integer (unsigned or 
  negative).

Thus, it's easy to understand that -.234FOO is an identifier, -123 is an 
integer not an identifier, and 123FOO is an error.  I see nothing 
unintuitive about those cases.

Even so, I could also live with either of the following, but I think 
they're unnecessarily restrictive:

  An identifier can be any sequence of letters, underscores, periods, 
  dashes, and digits that does not start with an integer or float 
  (unsigned or negative).

  An identifier can be any sequence of letters, underscores, periods, 
  dashes, and digits that does not start with a digit or dash.

In summary, you apparently want to say that negative integers can be 
identifiers while unsigned integers cannot.  I think you're heading in the 
wrong direction.

> > Feel free to propose a patch.
> 
> How about the following patch?  It tries to simplify things, at some
> cost in niceties in diagnostics.  I pushed it into the trunk.  Further
> comments are welcome.

You started your email by arguing that you want to make improvements to 
help the user.  However, when I look through your test suite changes 
(where you should be testing all such improvements), all I see is a loss 
of diagnostic information provided to the user.  Specifically, many 
"possibly meant" messages are now missing.

Sorry, my opinion is that your patch does more harm than good and should 
be backed out.  However, I would very much like for Alex and Akim to give 
their opinions.

Re: [PATCH] Do not allow identifiers that start with a negative number.

Reply via email to