aaron.ballman accepted this revision. aaron.ballman added a comment. This revision is now accepted and ready to land.
Thanks for the discussion on this new check, it LGTM! ================ Comment at: clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp:47 + CRLF, + CRLFCR, + }; ---------------- LegalizeAdulthood wrote: > aaron.ballman wrote: > > LegalizeAdulthood wrote: > > > aaron.ballman wrote: > > > > LegalizeAdulthood wrote: > > > > > aaron.ballman wrote: > > > > > > LegalizeAdulthood wrote: > > > > > > > aaron.ballman wrote: > > > > > > > > I'm a bit confused by this one as this is not a valid line > > > > > > > > ending (it's either three valid line endings or two valid line > > > > > > > > endings, depending on how you look at it). Can you explain why > > > > > > > > this is needed? > > > > > > > It's a state machine, where the states are named for what we've > > > > > > > seen so far and we're looking for //two// consecutive line > > > > > > > endings, not just one. Does it make sense now? > > > > > > Thanks, I understood it was a state machine, but it's a confused > > > > > > one to me. `\r` was the line ending on Mac Classic, I've not seen > > > > > > it used outside of that platform (and I've not seen anyone write > > > > > > code for that platform in a long time). So, to me, the only valid > > > > > > combinations of line endings to worry about are: `LF LF`; `CRLF > > > > > > CRLF`; `CRLF LF`; `LF CRLF`. > > > > > > > > > > > > `LF LF` returns false (Nothing -> LF -> return false) > > > > > > `CRLF CRLF` returns false (Nothing -> CR -> CRLF -> CRLFCR -> > > > > > > return false) > > > > > > `CRLF LF` returns true (Nothing -> CR -> CRLF -> LF -> finish loop) > > > > > > `LF CRLF` returns true (Nothing -> LF -> CR -> CRLF -> finish loop) > > > > > > > > > > > > (If you also intend to support Mac Classic line endings for some > > > > > > reason, this gets even more complicated.) > > > > > I was trying to follow "be liberal in what you accept as input and > > > > > conservative in what you generate as output" maxim. I can remove the > > > > > `CR` as a line ending case if you think it's too obscure. > > > > If Clang supports it as a line ending, we probably should too, but... > > > > how do we handle CRLF vs "I mixed a CR with an LF by accident" kind of > > > > inputs? (Maybe we just treat that as CRLF and if the behavior is bad, > > > > the user shouldn't mix their line endings that way; I think that's > > > > defensible.) That seems to be similar to the scenario that's confusing > > > > me above where the user mixed an LF and CRLF by accident. > > > Well, as far as Clang is concerned it's all just "whitespace" that gets > > > eaten up by the preprocessor. Actually, that gives me a thought. A > > > preprocessing directive is considered to end at the physical line ending, > > > so I should look to see what sort of characters it considers to "end the > > > line". > > > > > > For the accidental mix-up, I'm not going to worry about that here. Your > > > input files are assumed to be "well formed". The worst that happens in > > > this check is that two blocks of macros that //look// like they are > > > separated by a blank line are considered as a single clump by this check. > > > > > > In other words, the worst that can happen is: > > > - Two clumps of macros are considered together. > > > - One clump of macros that is discarded because it doesn't follow the > > > constraints "taints" an adjacent clump of macros that do follow the > > > constraints. > > > > > > Either way, nothing harmful happens to your code. It will still compile > > > and be syntactically and semantically equivalent to what was there before. > > > > > > Actually, that gives me a thought. A preprocessing directive is > > > considered to end at the physical line ending, so I should look to see > > > what sort of characters it considers to "end the line". > > > > All of `\r`, `\n`, `\r\n` I believe (you can double-check in > > `Lexer::LexTokenInternal()` > > > > > Either way, nothing harmful happens to your code. It will still compile > > > and be syntactically and semantically equivalent to what was there before. > > > > Oh, that's a very good point, thank you. I think that's reasonable fallback > > behavior for these weird edge cases. > Well..... maybe. > > If you look at `Lexer::ReadToEndOfLine` which is used to skip to the end of a > preprocessor directive you'll see that it considers the first of `'\r'`, > `'\n'` or `'\0'` (end of file) as the end of the "line". This is around line > 2835 of Lexer.cpp in my tree. Yeah, I saw that as well (that's typically used for error recovery in the preprocessor). We also have `Lexer::SkipWhitespace()` which skips all vertical whitespace, but not `\0`. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D117522/new/ https://reviews.llvm.org/D117522 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits