https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97604

            Bug ID: 97604
           Summary: Bad digit separators accepted in pp-numbers
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Keywords: rejects-valid
          Severity: normal
          Priority: P3
         Component: preprocessor
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jsm28 at gcc dot gnu.org
                CC: emsr at gcc dot gnu.org
  Target Milestone: ---

cpplib lexes pp-numbers in lex_number. Following bug 64626 that includes some logic to disallow a pp-number ending with C++ digit separators. However, that logic is insufficient to cover all cases where the lexing includes too many characters in the pp-number.

Compile the following with -std=c++17:

int a = 0x0'e-0xe;

This gives a bogus error:

t.cc:1:9: error: unable to find numeric literal operator 'operator""-0xe'
    1 | int a = 0x0'e-0xe;
      |         ^~~~~~~~~
t.cc:1:9: note: use '-fext-numeric-literals' to enable more built-in suffixes

The pp-number syntax starts a pp-number with "digit" or ". digit" and then allows various things to follow, one of which is "' nondigit" and another one of which is "e sign". The longest possible preprocessing token starting with the first 0 in the above example is 0x0'e because the text preceding "e-" ends with "'" and so is not a pp-number. So 0x0'e is a preprocessing token, followed by "-", and the above is in fact a subtraction of two separate integer literals, i.e. valid C++ input.

"'" must only be accepted in a pp-number when followed by a digit or nondigit, and if that nondigit is e, E, p or P, it terminates the pp-number if a sign follows.

Although I haven't given examples here, you can probably construct rejects-valid examples (ones involving macro expansion, at least) also for the case of wrongly accepting a digit separator followed by a UCN / UTF-8 character (an identifier-nondigit that is not a nondigit) or '.'.

The case of consecutive digit separators shouldn't introduce rejects-valid bugs because '' isn't valid at the start of a preprocessing token, but bug 83873 would be fixed by following the syntax in lex_number and rejecting them there rather than trying to catch them later.
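
For illustration only, here is a minimal sketch of lexing that follows the pp-number syntax directly. It is not cpplib's lex_number, and pp_number_length, is_digit and is_nondigit are hypothetical names. It assumes ASCII input, so UCNs and UTF-8 identifier characters are simply not treated as part of a pp-number. The point is that "'" is consumed only together with a following digit or nondigit, so an e, E, p or P consumed that way can never pair with a following sign.

```cpp
#include <cstddef>
#include <string_view>

// ASCII-only approximations of the grammar's "digit" and "nondigit"
// (an assumption of this sketch; UCNs and UTF-8 are not handled).
static bool is_digit(char c) { return c >= '0' && c <= '9'; }
static bool is_nondigit(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
}

// Hypothetical helper: length of the longest pp-number that is a prefix
// of s, or 0 if s does not start with a pp-number.
std::size_t pp_number_length(std::string_view s) {
    std::size_t i = 0;
    if (!s.empty() && is_digit(s[0]))
        i = 1;                                  // digit
    else if (s.size() >= 2 && s[0] == '.' && is_digit(s[1]))
        i = 2;                                  // . digit
    else
        return 0;

    while (i < s.size()) {
        const char c = s[i];
        const bool has_next = i + 1 < s.size();
        const char next = has_next ? s[i + 1] : '\0';

        if ((c == 'e' || c == 'E' || c == 'p' || c == 'P')
            && (next == '+' || next == '-')) {
            i += 2;                             // pp-number e/E/p/P sign
        } else if (is_digit(c) || is_nondigit(c) || c == '.') {
            i += 1;                             // digit, nondigit or '.'
        } else if (c == '\'' && (is_digit(next) || is_nondigit(next))) {
            // "' digit" or "' nondigit": consume both characters as a
            // unit, so the character after "'" is never re-examined as
            // the start of an "e sign" sequence.
            i += 2;
        } else {
            break;                              // includes "''" and "'."
        }
    }
    return i;
}
```

With this sketch, pp_number_length("0x0'e-0xe;") is 5 (the token 0x0'e), whereas pp_number_length("0xe-0xe;") is 7, since without the separator "e-" is a valid continuation. Input such as 1''2 or 1'.5 stops after the leading 1, which is how following the syntax rejects consecutive separators at lexing time.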