[PATCH] D117522: [clang-tidy] Add modernize-macro-to-enum check

Aaron Ballman via Phabricator via cfe-commits Fri, 25 Mar 2022 06:17:58 -0700

aaron.ballman accepted this revision.
aaron.ballman added a comment.
This revision is now accepted and ready to land.


Thanks for the discussion on this new check, it LGTM!



================
Comment at: clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp:47
+    CRLF,
+    CRLFCR,
+  };
----------------
LegalizeAdulthood wrote:
> aaron.ballman wrote:
> > LegalizeAdulthood wrote:
> > > aaron.ballman wrote:
> > > > LegalizeAdulthood wrote:
> > > > > aaron.ballman wrote:
> > > > > > LegalizeAdulthood wrote:
> > > > > > > aaron.ballman wrote:
> > > > > > > > I'm a bit confused by this one as this is not a valid line 
> > > > > > > > ending (it's either three valid line endings or two valid line 
> > > > > > > > endings, depending on how you look at it). Can you explain why 
> > > > > > > > this is needed?
> > > > > > > It's a state machine, where the states are named for what we've 
> > > > > > > seen so far and we're looking for //two// consecutive line 
> > > > > > > endings, not just one.  Does it make sense now?
> > > > > > Thanks, I understood it was a state machine, but it's a confused 
> > > > > > one to me. `\r` was the line ending on Mac Classic, I've not seen 
> > > > > > it used outside of that platform (and I've not seen anyone write 
> > > > > > code for that platform in a long time). So, to me, the only valid 
> > > > > > combinations of line endings to worry about are: `LF LF`; `CRLF 
> > > > > > CRLF`; `CRLF LF`; `LF CRLF`.
> > > > > > 
> > > > > > `LF LF` returns false (Nothing -> LF -> return false)
> > > > > > `CRLF CRLF` returns false (Nothing -> CR -> CRLF -> CRLFCR -> 
> > > > > > return false)
> > > > > > `CRLF LF` returns true (Nothing -> CR -> CRLF -> LF -> finish loop)
> > > > > > `LF CRLF` returns true (Nothing -> LF -> CR -> CRLF -> finish loop)
> > > > > > 
> > > > > > (If you also intend to support Mac Classic line endings for some 
> > > > > > reason, this gets even more complicated.)
> > > > > I was trying to follow "be liberal in what you accept as input and 
> > > > > conservative in what you generate as output" maxim.  I can remove the 
> > > > > `CR` as a line ending case if you think it's too obscure.
> > > > If Clang supports it as a line ending, we probably should too, but... 
> > > > how do we handle CRLF vs "I mixed a CR with an LF by accident" kind of 
> > > > inputs? (Maybe we just treat that as CRLF and if the behavior is bad, 
> > > > the user shouldn't mix their line endings that way; I think that's 
> > > > defensible.) That seems to be similar to the scenario that's confusing 
> > > > me above where the user mixed an LF and CRLF by accident.
> > > Well, as far as Clang is concerned it's all just "whitespace" that gets 
> > > eaten up by the preprocessor.  Actually, that gives me a thought.  A 
> > > preprocessing directive is considered to end at the physical line ending, 
> > > so I should look to see what sort of characters it considers to "end the 
> > > line".
> > > 
> > > For the accidental mix-up, I'm not going to worry about that here.  Your 
> > > input files are assumed to be "well formed".  The worst that happens in 
> > > this check is that two blocks of macros that //look// like they are 
> > > separated by a blank line are considered as a single clump by this check.
> > > 
> > > In other words, the worst that can happen is:
> > >   - Two clumps of macros are considered together.
> > >   - One clump of macros that is discarded because it doesn't follow the 
> > > constraints "taints" an adjacent clump of macros that do follow the 
> > > constraints.
> > > 
> > > Either way, nothing harmful happens to your code.  It will still compile 
> > > and be syntactically and semantically equivalent to what was there before.
> > > 
> > > Actually, that gives me a thought. A preprocessing directive is 
> > > considered to end at the physical line ending, so I should look to see 
> > > what sort of characters it considers to "end the line".
> > 
> > All of `\r`, `\n`, `\r\n` I believe (you can double-check in 
> > `Lexer::LexTokenInternal()`
> > 
> > > Either way, nothing harmful happens to your code. It will still compile 
> > > and be syntactically and semantically equivalent to what was there before.
> > 
> > Oh, that's a very good point, thank you. I think that's reasonable fallback 
> > behavior for these weird edge cases.
> Well..... maybe.
> 
> If you look at `Lexer::ReadToEndOfLine` which is used to skip to the end of a 
> preprocessor directive you'll see that it considers the first of `'\r'`, 
> `'\n'` or `'\0'` (end of file) as the end of the "line".  This is around line 
> 2835 of Lexer.cpp in my tree.
Yeah, I saw that as well (that's typically used for error recovery in the 
preprocessor). We also have `Lexer::SkipWhitespace()` which skips all vertical 
whitespace, but not `\0`.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117522/new/

https://reviews.llvm.org/D117522

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D117522: [clang-tidy] Add modernize-macro-to-enum check

Reply via email to