[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-09-29 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

In D83296#2299062 , @nridge wrote:

> What does this change mean for users of clang-format -- better formatting of 
> complicated (e.g. multi-line) macro invocations?

In addition to what Sam said, this also attempts to be an improvement in 
maintainability. Given this is a fairly complex change, you might ask how this 
helps :)
The idea is that we bundle the complexity of macro handling in a clearly 
separated part of the code that can be tested and developed ~on its own.
Currently, we have multiple macro regex settings that then lead to random code 
throughout clang-format that tries to handle those identifiers special.
Once this is done, we can delete all those settings, as the more generalized 
macro configuration will supersede them.




Comment at: clang/lib/Format/MacroExpander.cpp:190
+  return true;
+for (const auto  : Args[I->getValue()]) {
+  // A token can be part of a macro argument at multiple levels.

sammccall wrote:
> nit: this is confusingly a const reference to a non-const pointer... `auto *` 
> or `FormatToken *`?
Yikes, thanks for catching!



Comment at: clang/unittests/Format/MacroExpanderTest.cpp:183
+  EXPECT_ATTRIBUTES(Result, Attributes);
+}
+

sammccall wrote:
> may want a test that uses of an arg after the first are not expanded, because 
> that "guards" a bunch of nasty potential bugs
Discussed offline: the above test tests exactly this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-09-28 Thread Sam McCall via Phabricator via cfe-commits
sammccall added a comment.

In D83296#2299062 , @nridge wrote:

> What does this change mean for users of clang-format -- better formatting of 
> complicated (e.g. multi-line) macro invocations?

Nothing from this change is exposed yet, it's part of a series.
The end goal is as you say: perfect formatting of code in and around arbitrary 
macros, by passing (simplified) macro definitions as configuration.

D88299  is next, Manuel assures me it gets 
easier from there :-)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-09-28 Thread Nathan Ridge via Phabricator via cfe-commits
nridge added a comment.

What does this change mean for users of clang-format -- better formatting of 
complicated (e.g. multi-line) macro invocations?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-09-25 Thread Manuel Klimek via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
klimek marked 5 inline comments as done.
Closed by commit rGe336b74c995d: [clang-format] Add a MacroExpander. (authored 
by klimek).

Changed prior to commit:
  https://reviews.llvm.org/D83296?vs=292970=294285#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296

Files:
  clang/lib/Format/CMakeLists.txt
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroExpander.cpp
  clang/lib/Format/Macros.h
  clang/unittests/Format/CMakeLists.txt
  clang/unittests/Format/MacroExpanderTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- /dev/null
+++ clang/unittests/Format/TestLexer.h
@@ -0,0 +1,88 @@
+//===--- TestLexer.h - Format C++ code --*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+///
+/// \file
+/// This file contains a TestLexer to create FormatTokens from strings.
+///
+//===--===//
+
+#ifndef CLANG_UNITTESTS_FORMAT_TESTLEXER_H
+#define CLANG_UNITTESTS_FORMAT_TESTLEXER_H
+
+#include "../../lib/Format/FormatTokenLexer.h"
+
+#include "clang/Basic/FileManager.h"
+#include "clang/Basic/SourceManager.h"
+
+#include 
+#include 
+
+namespace clang {
+namespace format {
+
+typedef llvm::SmallVector TokenList;
+
+inline std::ostream <<(std::ostream , const FormatToken ) {
+  Stream << "(" << Tok.Tok.getName() << ", \"" << Tok.TokenText.str() << "\")";
+  return Stream;
+}
+inline std::ostream <<(std::ostream , const TokenList ) {
+  Stream << "{";
+  for (size_t I = 0, E = Tokens.size(); I != E; ++I) {
+Stream << (I > 0 ? ", " : "") << *Tokens[I];
+  }
+  Stream << "}";
+  return Stream;
+}
+
+inline TokenList uneof(const TokenList ) {
+  assert(!Tokens.empty() && Tokens.back()->is(tok::eof));
+  return TokenList(Tokens.begin(), std::prev(Tokens.end()));
+}
+
+inline std::string text(llvm::ArrayRef Tokens) {
+  return std::accumulate(Tokens.begin(), Tokens.end(), std::string(),
+ [](const std::string , FormatToken *Tok) {
+   return (R + Tok->TokenText).str();
+ });
+}
+
+class TestLexer {
+public:
+  TestLexer() : SourceMgr("test.cpp", "") {}
+
+  TokenList lex(llvm::StringRef Code) {
+Buffers.push_back(
+llvm::MemoryBuffer::getMemBufferCopy(Code, ""));
+clang::FileID FID = SourceMgr.get().createFileID(SourceManager::Unowned,
+ Buffers.back().get());
+FormatTokenLexer Lex(SourceMgr.get(), FID, 0, Style, Encoding, Allocator,
+ IdentTable);
+auto Result = Lex.lex();
+return TokenList(Result.begin(), Result.end());
+  }
+
+  FormatToken *id(llvm::StringRef Code) {
+auto Result = uneof(lex(Code));
+assert(Result.size() == 1U && "Code must expand to 1 token.");
+return Result[0];
+  }
+
+  FormatStyle Style = getLLVMStyle();
+  encoding::Encoding Encoding = encoding::Encoding_UTF8;
+  std::vector> Buffers;
+  clang::SourceManagerForFile SourceMgr;
+  llvm::SpecificBumpPtrAllocator Allocator;
+  IdentifierTable IdentTable;
+};
+
+} // namespace format
+} // namespace clang
+
+#endif // LLVM_CLANG_UNITTESTS_FORMAT_TEST_LEXER_H
Index: clang/unittests/Format/MacroExpanderTest.cpp
===
--- /dev/null
+++ clang/unittests/Format/MacroExpanderTest.cpp
@@ -0,0 +1,187 @@
+#include "../../lib/Format/Macros.h"
+#include "TestLexer.h"
+#include "clang/Basic/FileManager.h"
+
+#include "gtest/gtest.h"
+
+namespace clang {
+namespace format {
+
+namespace {
+
+class MacroExpanderTest : public ::testing::Test {
+public:
+  std::unique_ptr
+  create(const std::vector ) {
+return std::make_unique(MacroDefinitions,
+   Lex.SourceMgr.get(), Lex.Style,
+   Lex.Allocator, Lex.IdentTable);
+  }
+
+  std::string expand(MacroExpander , llvm::StringRef Name,
+ const std::vector  = {}) {
+EXPECT_TRUE(Macros.defined(Name));
+return text(Macros.expand(Lex.id(Name), lexArgs(Args)));
+  }
+
+  llvm::SmallVector
+  lexArgs(const std::vector ) {
+llvm::SmallVector Result;
+for (const auto  : Args) {
+  Result.push_back(uneof(Lex.lex(Arg)));
+}
+return Result;
+  }
+
+  struct MacroAttributes {
+clang::tok::TokenKind Kind;
+MacroRole Role;
+unsigned Start;
+unsigned End;
+llvm::SmallVector ExpandedFrom;
+  };

[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-09-22 Thread Sam McCall via Phabricator via cfe-commits
sammccall accepted this revision.
sammccall added a comment.
This revision is now accepted and ready to land.

Ship it!




Comment at: clang/lib/Format/FormatToken.h:177
+/// EndOfExpansion:   0  0 0   21 0 1
+struct MacroContext {
+  MacroContext(MacroRole Role) : Role(Role) {}

klimek wrote:
> sammccall wrote:
> > "context" is often pretty vague - "MacroSource" isn't a brilliant name but 
> > at least seems to hint at the direction (that the FormatToken is the 
> > expanded token and the MacroSource gives information about what it was 
> > expanded from)
> > 
> > I don't feel strongly about this though, up to you.
> MacroSource sounds like it is about the macro source (i.e. the tokens written 
> for the macro).
> I'd be happy to rename to MacroExpansion or MacroExpansionInfo or somesuch if 
> you think that helps?
Oops, accidental "source" pun. MacroOrigin would be another name in the same 
spirit. But MacroExpansion sounds good to me, too.



Comment at: clang/lib/Format/MacroExpander.cpp:190
+  return true;
+for (const auto  : Args[I->getValue()]) {
+  // A token can be part of a macro argument at multiple levels.

nit: this is confusingly a const reference to a non-const pointer... `auto *` 
or `FormatToken *`?



Comment at: clang/lib/Format/MacroExpander.cpp:142
+  auto Definition = Parser.parse();
+  Definitions[Definition.Name] = Definition;
+}

klimek wrote:
> sammccall wrote:
> > nit: this is a copy for what seems like no reason - move `Parser.parse()` 
> > inline to this line?
> Reason is that we need the name.
oops, right.
std::move() the RHS? (mostly I just find the copies surprising, so draws 
attention)



Comment at: clang/unittests/Format/MacroExpanderTest.cpp:183
+  EXPECT_ATTRIBUTES(Result, Attributes);
+}
+

may want a test that uses of an arg after the first are not expanded, because 
that "guards" a bunch of nasty potential bugs



Comment at: clang/unittests/Format/TestLexer.h:15
+
+#ifndef CLANG_UNITTESTS_FORMAT_TEST_LEXER_H
+#define CLANG_UNITTESTS_FORMAT_TEST_LEXER_H

I guess clang-tidy wants ..._TESTLEXER_H here


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-09-19 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 292970.
klimek marked 7 inline comments as done.
klimek added a comment.

Worked in review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296

Files:
  clang/lib/Format/CMakeLists.txt
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroExpander.cpp
  clang/lib/Format/Macros.h
  clang/unittests/Format/CMakeLists.txt
  clang/unittests/Format/MacroExpanderTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- /dev/null
+++ clang/unittests/Format/TestLexer.h
@@ -0,0 +1,88 @@
+//===--- TestLexer.h - Format C++ code --*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+///
+/// \file
+/// This file contains a TestLexer to create FormatTokens from strings.
+///
+//===--===//
+
+#ifndef CLANG_UNITTESTS_FORMAT_TEST_LEXER_H
+#define CLANG_UNITTESTS_FORMAT_TEST_LEXER_H
+
+#include "../../lib/Format/FormatTokenLexer.h"
+
+#include "clang/Basic/FileManager.h"
+#include "clang/Basic/SourceManager.h"
+
+#include 
+#include 
+
+namespace clang {
+namespace format {
+
+typedef llvm::SmallVector TokenList;
+
+inline std::ostream <<(std::ostream , const FormatToken ) {
+  Stream << "(" << Tok.Tok.getName() << ", \"" << Tok.TokenText.str() << "\")";
+  return Stream;
+}
+inline std::ostream <<(std::ostream , const TokenList ) {
+  Stream << "{";
+  for (size_t I = 0, E = Tokens.size(); I != E; ++I) {
+Stream << (I > 0 ? ", " : "") << *Tokens[I];
+  }
+  Stream << "}";
+  return Stream;
+}
+
+inline TokenList uneof(const TokenList ) {
+  assert(!Tokens.empty() && Tokens.back()->is(tok::eof));
+  return TokenList(Tokens.begin(), std::prev(Tokens.end()));
+}
+
+inline std::string text(llvm::ArrayRef Tokens) {
+  return std::accumulate(Tokens.begin(), Tokens.end(), std::string(),
+ [](const std::string , FormatToken *Tok) {
+   return (R + Tok->TokenText).str();
+ });
+}
+
+class TestLexer {
+public:
+  TestLexer() : SourceMgr("test.cpp", "") {}
+
+  TokenList lex(llvm::StringRef Code) {
+Buffers.push_back(
+llvm::MemoryBuffer::getMemBufferCopy(Code, ""));
+clang::FileID FID = SourceMgr.get().createFileID(SourceManager::Unowned,
+ Buffers.back().get());
+FormatTokenLexer Lex(SourceMgr.get(), FID, 0, Style, Encoding, Allocator,
+ IdentTable);
+auto Result = Lex.lex();
+return TokenList(Result.begin(), Result.end());
+  }
+
+  FormatToken *id(llvm::StringRef Code) {
+auto Result = uneof(lex(Code));
+assert(Result.size() == 1U && "Code must expand to 1 token.");
+return Result[0];
+  }
+
+  FormatStyle Style = getLLVMStyle();
+  encoding::Encoding Encoding = encoding::Encoding_UTF8;
+  std::vector> Buffers;
+  clang::SourceManagerForFile SourceMgr;
+  llvm::SpecificBumpPtrAllocator Allocator;
+  IdentifierTable IdentTable;
+};
+
+} // namespace format
+} // namespace clang
+
+#endif // LLVM_CLANG_UNITTESTS_FORMAT_TEST_LEXER_H
Index: clang/unittests/Format/MacroExpanderTest.cpp
===
--- /dev/null
+++ clang/unittests/Format/MacroExpanderTest.cpp
@@ -0,0 +1,187 @@
+#include "../../lib/Format/Macros.h"
+#include "TestLexer.h"
+#include "clang/Basic/FileManager.h"
+
+#include "gtest/gtest.h"
+
+namespace clang {
+namespace format {
+
+namespace {
+
+class MacroExpanderTest : public ::testing::Test {
+public:
+  std::unique_ptr
+  create(const std::vector ) {
+return std::make_unique(MacroDefinitions,
+   Lex.SourceMgr.get(), Lex.Style,
+   Lex.Allocator, Lex.IdentTable);
+  }
+
+  std::string expand(MacroExpander , llvm::StringRef Name,
+ const std::vector  = {}) {
+EXPECT_TRUE(Macros.defined(Name));
+return text(Macros.expand(Lex.id(Name), lexArgs(Args)));
+  }
+
+  llvm::SmallVector
+  lexArgs(const std::vector ) {
+llvm::SmallVector Result;
+for (const auto  : Args) {
+  Result.push_back(uneof(Lex.lex(Arg)));
+}
+return Result;
+  }
+
+  struct MacroAttributes {
+clang::tok::TokenKind Kind;
+MacroRole Role;
+unsigned Start;
+unsigned End;
+llvm::SmallVector ExpandedFrom;
+  };
+
+  void expectAttributes(const TokenList ,
+const std::vector ,
+const std::string , unsigned Line) {
+EXPECT_EQ(Tokens.size(), Attributes.size()) 

[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-09-19 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/Format/FormatToken.h:177
+/// EndOfExpansion:   0  0 0   21 0 1
+struct MacroContext {
+  MacroContext(MacroRole Role) : Role(Role) {}

sammccall wrote:
> "context" is often pretty vague - "MacroSource" isn't a brilliant name but at 
> least seems to hint at the direction (that the FormatToken is the expanded 
> token and the MacroSource gives information about what it was expanded from)
> 
> I don't feel strongly about this though, up to you.
MacroSource sounds like it is about the macro source (i.e. the tokens written 
for the macro).
I'd be happy to rename to MacroExpansion or MacroExpansionInfo or somesuch if 
you think that helps?



Comment at: clang/lib/Format/MacroExpander.cpp:81
+  nextToken();
+}
+if (Current->isNot(tok::r_paren))

sammccall wrote:
> this accepts `FOO(A,B,)=...` as equivalent to `FOO(A,B)=...`. Not sure if 
> worth fixing.
We're generally accepting too much; I'd either want to restrict it fully, or 
basically be somewhat minimum/forgiving. Given that we can't get errors back to 
the user, I was aiming for the latter.



Comment at: clang/lib/Format/MacroExpander.cpp:88
+
+  bool parseExpansion() {
+if (!Current->isOneOf(tok::equal, tok::eof))

sammccall wrote:
> (nit: I'd probably find this easier to follow as `if (equal) else if (eof) 
> else` with parseTail inlined, but up to you)
I basically like having the implementation match the BNR.
That said, not feeling strongly about it. You're saying you'd duplicate the 
Def.Body.push_back in the if (eof)?

  if (Current->is(tok::equal) {
nextToken();
// inline parseTail
  } else if (Current->is(tok::eof) {
Def.Body.push_back(Current);
  } else {
return false;
  }

Generally, I personally find it easier to read the early exit.



Comment at: clang/lib/Format/MacroExpander.cpp:142
+  auto Definition = Parser.parse();
+  Definitions[Definition.Name] = Definition;
+}

sammccall wrote:
> nit: this is a copy for what seems like no reason - move `Parser.parse()` 
> inline to this line?
Reason is that we need the name.



Comment at: clang/lib/Format/MacroExpander.cpp:186
+Tok->MacroCtx = MacroContext(MR_ExpandedArg);
+  pushToken(Tok);
+}

klimek wrote:
> sammccall wrote:
> > klimek wrote:
> > > sammccall wrote:
> > > > you're pushing here without copying. This means the original tokens 
> > > > from the ArgsList are mutated. Maybe we own them, but this seems at 
> > > > least wrong for multiple expansion of the same arg.
> > > > 
> > > > e.g.
> > > > ```
> > > > #define M(X,Y) X Y X
> > > > M(1,2)
> > > > ```
> > > > 
> > > > Will expand to:
> > > > - 1, ExpandedArg, ExpandedFrom = [M, M] // should just be one M
> > > > - 2, ExpandedArg, ExpandedFrom = [M]
> > > > - 1, ExpandedArg, ExpandedFrom = [M, M] // this is the same token 
> > > > pointer as the first one
> > > > 
> > > > Maybe it would be better if pushToken performed the copy, and returned 
> > > > a mutable pointer to the copy. (If you can make the input const, that 
> > > > would prevent this type of bug)
> > > Ugh. I'll need to take a deeper look, but generally, the problem is we 
> > > don't want to copy - we're mutating the data of the token while 
> > > formatting the expanded token stream, and then re-use that info when 
> > > formatting the original stream.
> > > We could copy, add a reference to the original token, and then have a 
> > > step that adapts in the end, and perhaps that's cleaner overall anyway, 
> > > but will be quite a change.
> > > The alternative is that I'll look into how to specifically handle 
> > > double-expansion (or ... forbid it).
> > > (or ... forbid it).
> > 
> > I'm starting to think this is the best option.
> > 
> > The downsides seem pretty acceptable to me:
> >  - it's another wart to document: on the other hand it simplifies the 
> > conceptual model, I think it helps users understand the deeper behavior
> >  - some macros require simplification rather than supplying the actual 
> > definition: already crossed this bridge by not supporting macros in macro 
> > bodies, variadics, pasting...
> >  - loses information: one expansion is enough to establish which part of 
> > the grammar the arguments form in realistic cases. (Even in pathological 
> > cases, preserving the conflicting info only helps you if you have a plan to 
> > resolve the conflicts)
> >  - it's another wart to document: 
> > 
> > Are there any others?
> > 
> My main concern is that it's probably the most surprising feature to not 
> support.
Forbade multi-expansion.



Comment at: clang/lib/Format/Macros.h:82
+///
+/// Furthermore, expansion and definition are independent - a macro defined
+/// as "A()=a" and "A=a" can both be expanded  as "A()" and "A".


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-09-18 Thread Sam McCall via Phabricator via cfe-commits
sammccall added a comment.

Just checking this is waiting on you rather than me...

If the multiple-expansion thing is blocking progress, I think we're much better 
off getting a limited version of this feature landed than losing momentum 
trying to solve it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-08-20 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/Format/MacroExpander.cpp:186
+Tok->MacroCtx = MacroContext(MR_ExpandedArg);
+  pushToken(Tok);
+}

sammccall wrote:
> klimek wrote:
> > sammccall wrote:
> > > you're pushing here without copying. This means the original tokens from 
> > > the ArgsList are mutated. Maybe we own them, but this seems at least 
> > > wrong for multiple expansion of the same arg.
> > > 
> > > e.g.
> > > ```
> > > #define M(X,Y) X Y X
> > > M(1,2)
> > > ```
> > > 
> > > Will expand to:
> > > - 1, ExpandedArg, ExpandedFrom = [M, M] // should just be one M
> > > - 2, ExpandedArg, ExpandedFrom = [M]
> > > - 1, ExpandedArg, ExpandedFrom = [M, M] // this is the same token pointer 
> > > as the first one
> > > 
> > > Maybe it would be better if pushToken performed the copy, and returned a 
> > > mutable pointer to the copy. (If you can make the input const, that would 
> > > prevent this type of bug)
> > Ugh. I'll need to take a deeper look, but generally, the problem is we 
> > don't want to copy - we're mutating the data of the token while formatting 
> > the expanded token stream, and then re-use that info when formatting the 
> > original stream.
> > We could copy, add a reference to the original token, and then have a step 
> > that adapts in the end, and perhaps that's cleaner overall anyway, but will 
> > be quite a change.
> > The alternative is that I'll look into how to specifically handle 
> > double-expansion (or ... forbid it).
> > (or ... forbid it).
> 
> I'm starting to think this is the best option.
> 
> The downsides seem pretty acceptable to me:
>  - it's another wart to document: on the other hand it simplifies the 
> conceptual model, I think it helps users understand the deeper behavior
>  - some macros require simplification rather than supplying the actual 
> definition: already crossed this bridge by not supporting macros in macro 
> bodies, variadics, pasting...
>  - loses information: one expansion is enough to establish which part of the 
> grammar the arguments form in realistic cases. (Even in pathological cases, 
> preserving the conflicting info only helps you if you have a plan to resolve 
> the conflicts)
>  - it's another wart to document: 
> 
> Are there any others?
> 
My main concern is that it's probably the most surprising feature to not 
support.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-08-19 Thread Sam McCall via Phabricator via cfe-commits
sammccall added inline comments.



Comment at: clang/lib/Format/MacroExpander.cpp:186
+Tok->MacroCtx = MacroContext(MR_ExpandedArg);
+  pushToken(Tok);
+}

klimek wrote:
> sammccall wrote:
> > you're pushing here without copying. This means the original tokens from 
> > the ArgsList are mutated. Maybe we own them, but this seems at least wrong 
> > for multiple expansion of the same arg.
> > 
> > e.g.
> > ```
> > #define M(X,Y) X Y X
> > M(1,2)
> > ```
> > 
> > Will expand to:
> > - 1, ExpandedArg, ExpandedFrom = [M, M] // should just be one M
> > - 2, ExpandedArg, ExpandedFrom = [M]
> > - 1, ExpandedArg, ExpandedFrom = [M, M] // this is the same token pointer 
> > as the first one
> > 
> > Maybe it would be better if pushToken performed the copy, and returned a 
> > mutable pointer to the copy. (If you can make the input const, that would 
> > prevent this type of bug)
> Ugh. I'll need to take a deeper look, but generally, the problem is we don't 
> want to copy - we're mutating the data of the token while formatting the 
> expanded token stream, and then re-use that info when formatting the original 
> stream.
> We could copy, add a reference to the original token, and then have a step 
> that adapts in the end, and perhaps that's cleaner overall anyway, but will 
> be quite a change.
> The alternative is that I'll look into how to specifically handle 
> double-expansion (or ... forbid it).
> (or ... forbid it).

I'm starting to think this is the best option.

The downsides seem pretty acceptable to me:
 - it's another wart to document: on the other hand it simplifies the 
conceptual model, I think it helps users understand the deeper behavior
 - some macros require simplification rather than supplying the actual 
definition: already crossed this bridge by not supporting macros in macro 
bodies, variadics, pasting...
 - loses information: one expansion is enough to establish which part of the 
grammar the arguments form in realistic cases. (Even in pathological cases, 
preserving the conflicting info only helps you if you have a plan to resolve 
the conflicts)
 - it's another wart to document: 

Are there any others?



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-08-19 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/Format/MacroExpander.cpp:186
+Tok->MacroCtx = MacroContext(MR_ExpandedArg);
+  pushToken(Tok);
+}

sammccall wrote:
> you're pushing here without copying. This means the original tokens from the 
> ArgsList are mutated. Maybe we own them, but this seems at least wrong for 
> multiple expansion of the same arg.
> 
> e.g.
> ```
> #define M(X,Y) X Y X
> M(1,2)
> ```
> 
> Will expand to:
> - 1, ExpandedArg, ExpandedFrom = [M, M] // should just be one M
> - 2, ExpandedArg, ExpandedFrom = [M]
> - 1, ExpandedArg, ExpandedFrom = [M, M] // this is the same token pointer as 
> the first one
> 
> Maybe it would be better if pushToken performed the copy, and returned a 
> mutable pointer to the copy. (If you can make the input const, that would 
> prevent this type of bug)
Ugh. I'll need to take a deeper look, but generally, the problem is we don't 
want to copy - we're mutating the data of the token while formatting the 
expanded token stream, and then re-use that info when formatting the original 
stream.
We could copy, add a reference to the original token, and then have a step that 
adapts in the end, and perhaps that's cleaner overall anyway, but will be quite 
a change.
The alternative is that I'll look into how to specifically handle 
double-expansion (or ... forbid it).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-08-19 Thread Sam McCall via Phabricator via cfe-commits
sammccall added a comment.

Somehow I missed the email from your replies.

Mostly nits that you can take or leave, but I think potential bugs around 
functionlike-vs-objectlike and multiple-expansion of args.




Comment at: clang/lib/Format/FormatToken.h:177
+/// EndOfExpansion:   0  0 0   21 0 1
+struct MacroContext {
+  MacroContext(MacroRole Role) : Role(Role) {}

"context" is often pretty vague - "MacroSource" isn't a brilliant name but at 
least seems to hint at the direction (that the FormatToken is the expanded 
token and the MacroSource gives information about what it was expanded from)

I don't feel strongly about this though, up to you.



Comment at: clang/lib/Format/FormatToken.h:646
 private:
-  // Disallow copying.
+  // Only allow copying via the explicit copyInto method.
   FormatToken(const FormatToken &) = delete;

nit: comment -> copyFrom



Comment at: clang/lib/Format/MacroExpander.cpp:1
+//===--- MacroExpander.h - Format C++ code --*- C++ 
-*-===//
+//

nit: banner is for wrong filename



Comment at: clang/lib/Format/MacroExpander.cpp:81
+  nextToken();
+}
+if (Current->isNot(tok::r_paren))

this accepts `FOO(A,B,)=...` as equivalent to `FOO(A,B)=...`. Not sure if worth 
fixing.



Comment at: clang/lib/Format/MacroExpander.cpp:88
+
+  bool parseExpansion() {
+if (!Current->isOneOf(tok::equal, tok::eof))

(nit: I'd probably find this easier to follow as `if (equal) else if (eof) 
else` with parseTail inlined, but up to you)



Comment at: clang/lib/Format/MacroExpander.cpp:131
+void MacroExpander::parseDefinitions(const std::vector ) {
+  for (const std::string  : Macros) {
+Buffers.push_back(

uber-nit: seems like this loop belongs in the caller



Comment at: clang/lib/Format/MacroExpander.cpp:142
+  auto Definition = Parser.parse();
+  Definitions[Definition.Name] = Definition;
+}

nit: this is a copy for what seems like no reason - move `Parser.parse()` 
inline to this line?



Comment at: clang/lib/Format/MacroExpander.cpp:155
+  SmallVector Result;
+  const Definition  = Definitions.lookup(ID->TokenText);
+

lookup() returns a value, so this is a copy (with lifetime-extension)
I think you want `*find`



Comment at: clang/lib/Format/MacroExpander.cpp:178
+  return true;
+for (const auto  : Args[I->getValue()]) {
+  // A token can be part of multiple macro arguments.

please use a different name for this variable, or the parameter it shadows, or 
preferably both!



Comment at: clang/lib/Format/MacroExpander.cpp:179
+for (const auto  : Args[I->getValue()]) {
+  // A token can be part of multiple macro arguments.
+  // For example, with "ID(x) x":

nit: "part of a macro argument at multiple levels"?
(Current text suggests to me that it can be arg 0 and arg 1 of the same macro)



Comment at: clang/lib/Format/MacroExpander.cpp:186
+Tok->MacroCtx = MacroContext(MR_ExpandedArg);
+  pushToken(Tok);
+}

you're pushing here without copying. This means the original tokens from the 
ArgsList are mutated. Maybe we own them, but this seems at least wrong for 
multiple expansion of the same arg.

e.g.
```
#define M(X,Y) X Y X
M(1,2)
```

Will expand to:
- 1, ExpandedArg, ExpandedFrom = [M, M] // should just be one M
- 2, ExpandedArg, ExpandedFrom = [M]
- 1, ExpandedArg, ExpandedFrom = [M, M] // this is the same token pointer as 
the first one

Maybe it would be better if pushToken performed the copy, and returned a 
mutable pointer to the copy. (If you can make the input const, that would 
prevent this type of bug)



Comment at: clang/lib/Format/Macros.h:82
+///
+/// Furthermore, expansion and definition are independent - a macro defined
+/// as "A()=a" and "A=a" can both be expanded  as "A()" and "A".

Is this saying that the functionlike vs objectlike distiction is not preserved?

This doesn't seem safe (unless the *caller* is required to retain this 
information).
e.g. 
```
#define NUMBER int
using Calculator = NUMBER(); // does expansion consume () or not?
```



Comment at: clang/lib/Format/Macros.h:86
+public:
+  using ArgsList = llvm::ArrayRef>;
+

(Seems a little odd that these pointers to external FormatTokens aren't 
const... I can believe there's a reason though)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-07-29 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/Format/MacroExpander.cpp:35
+  SmallVector Params;
+  SmallVector Tokens;
+};

sammccall wrote:
> Tokens -> Expansion? (semantics rather than type)
Changed to "Body".



Comment at: clang/lib/Format/MacroExpander.cpp:81
+do {
+  Def.Tokens.push_back(Current);
+  nextToken();

sammccall wrote:
> this assumes the expansion is nonempty, which the grammar doesn't. while{} 
> instead?
I have no clue how this ever worked tbh O.O
Has been reworked as part of the move to use = to separate the macro signature 
from the body.



Comment at: clang/lib/Format/MacroExpander.cpp:113
+void MacroExpander::parseDefinitions(
+const std::vector ) {
+  for (const std::string  : MacroExpander) {

sammccall wrote:
> weird param name!
Copy-paste gone wrong I assume.



Comment at: clang/lib/Format/MacroExpander.cpp:116
+Buffers.push_back(
+llvm::MemoryBuffer::getMemBufferCopy(Macro, ""));
+clang::FileID FID =

sammccall wrote:
> This is a slightly spooky buffer name - it's the magic name the PP uses for 
> pasted tokens.
> A closer fit for config is maybe "" (like macro definitions 
> passed with `-D`).
> Is it necessary to use one of clang's magic buffer names at all? If so, 
> comment! Else maybe "" or something?
We need source locations, and apparently only:
,  and 
are allowed to have source locations.




Comment at: clang/lib/Format/MacroExpander.cpp:133
+  ArgsList Args) {
+  assert(defined(ID->TokenText));
+  SmallVector Result;

sammccall wrote:
> is the caller responsible for checking the #args matches #params?
> If so, document and assert here?
> 
> Looking at the implementation, it seems you don't expand if there are too few 
> args, and expand if there are too many args (ignoring the last ones). Maybe 
> it doesn't matter, but it'd be nice to be more consistent here.
> 
> (Probably worth calling out somewhere explicitly that variadic macros are not 
> supported)
Added docs in the class comment for MacroExpander.
(so far I always expand, too few -> empty, too many -> ignore)



Comment at: clang/lib/Format/MacroExpander.cpp:194
+assert(New->MacroCtx.Role == MR_None);
+// Tokens that are not part of the user code do not need to be formatted.
+New->MacroCtx.Role = MR_Hidden;

sammccall wrote:
> (I don't know exactly how this is used, but consider whether you mean "do not 
> need to", "should not" or "cannot" here)
Replaced with "are not".



Comment at: clang/lib/Format/MacroExpander.h:10
+///
+/// \file
+/// This file contains the declaration of MacroExpander, which handles macro

sammccall wrote:
> I think this comment is too short (and doesn't really say much that you can't 
> get from the filename). IME many people's mental model of macros is based on 
> how they're used rather than how they formally work, so I think it's worth 
> spelling out even the obvious implementation choices.
> 
> I'd like to see:
>  - rough description of the scope/goal of the feature (clang-format doesn't 
> see macro definitions, so macro uses tend to be pseudo-parsed as function 
> calls. When this isn't appropriate [example], a macro definition can be 
> provided as part of the style. When such a macro is used in the code, 
> clang-format will expand it as the preprocessor would before determining the 
> role of tokens. [example])
>  - explicitly call out here that only a single level of expansion is 
> supported, which is a divergence from the real preprocessor. (This influences 
> both the implementation and the mental model)
>  - what the MacroExpander does and how it relates to MacroContext. I think 
> this should give the input and output token streams names, as it's often 
> important to clearly distinguish the two. (TokenBuffer uses "Spelled tokens" 
> and "Expanded tokens" for this, which is not the same as how these terms are 
> used in SourceManager, but related and hasn't been confusing in practice).
>  - a rough sketch of how the mismatch between what was annotated and what 
> must be formatted is resolved e.g. -- just guessing here -- the spelled 
> stream is formatted but for macro args, the annotations from the expanded 
> stream are used.
> 
> (I'm assuming this is the best file to talk about the implementation of this 
> feature in general - i'm really hoping that there is such a file. If there 
> are a bunch of related utilities you might even consider renaming the header 
> as an umbrella to make them easier to document. This is a question of 
> taste...)
Moved to Macros.h and added file comment that tries to address all of these.



Comment at: clang/lib/Format/MacroExpander.h:66
+  /// each element in \p Args is a positional 

[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-07-29 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 281611.
klimek marked 27 inline comments as done.
klimek added a comment.

Addressed code review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296

Files:
  clang/lib/Format/CMakeLists.txt
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroExpander.cpp
  clang/lib/Format/Macros.h
  clang/unittests/Format/CMakeLists.txt
  clang/unittests/Format/MacroExpanderTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- /dev/null
+++ clang/unittests/Format/TestLexer.h
@@ -0,0 +1,88 @@
+//===--- TestLexer.h - Format C++ code --*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+///
+/// \file
+/// This file contains a TestLexer to create FormatTokens from strings.
+///
+//===--===//
+
+#ifndef CLANG_UNITTESTS_FORMAT_TEST_LEXER_H
+#define CLANG_UNITTESTS_FORMAT_TEST_LEXER_H
+
+#include "../../lib/Format/FormatTokenLexer.h"
+
+#include "clang/Basic/FileManager.h"
+#include "clang/Basic/SourceManager.h"
+
+#include 
+#include 
+
+namespace clang {
+namespace format {
+
+typedef llvm::SmallVector TokenList;
+
+inline std::ostream <<(std::ostream , const FormatToken ) {
+  Stream << "(" << Tok.Tok.getName() << ", \"" << Tok.TokenText.str() << "\")";
+  return Stream;
+}
+inline std::ostream <<(std::ostream , const TokenList ) {
+  Stream << "{";
+  for (size_t I = 0, E = Tokens.size(); I != E; ++I) {
+Stream << (I > 0 ? ", " : "") << *Tokens[I];
+  }
+  Stream << "}";
+  return Stream;
+}
+
+inline TokenList uneof(const TokenList ) {
+  assert(!Tokens.empty() && Tokens.back()->is(tok::eof));
+  return TokenList(Tokens.begin(), std::prev(Tokens.end()));
+}
+
+inline std::string text(llvm::ArrayRef Tokens) {
+  return std::accumulate(Tokens.begin(), Tokens.end(), std::string(),
+ [](const std::string , FormatToken *Tok) {
+   return (R + Tok->TokenText).str();
+ });
+}
+
+class TestLexer {
+public:
+  TestLexer() : SourceMgr("test.cpp", "") {}
+
+  TokenList lex(llvm::StringRef Code) {
+Buffers.push_back(
+llvm::MemoryBuffer::getMemBufferCopy(Code, ""));
+clang::FileID FID = SourceMgr.get().createFileID(SourceManager::Unowned,
+ Buffers.back().get());
+FormatTokenLexer Lex(SourceMgr.get(), FID, 0, Style, Encoding, Allocator,
+ IdentTable);
+auto Result = Lex.lex();
+return TokenList(Result.begin(), Result.end());
+  }
+
+  FormatToken *id(llvm::StringRef Code) {
+auto Result = uneof(lex(Code));
+assert(Result.size() == 1U && "Code must expand to 1 token.");
+return Result[0];
+  }
+
+  FormatStyle Style = getLLVMStyle();
+  encoding::Encoding Encoding = encoding::Encoding_UTF8;
+  std::vector> Buffers;
+  clang::SourceManagerForFile SourceMgr;
+  llvm::SpecificBumpPtrAllocator Allocator;
+  IdentifierTable IdentTable;
+};
+
+} // namespace format
+} // namespace clang
+
+#endif // LLVM_CLANG_UNITTESTS_FORMAT_TEST_LEXER_H
Index: clang/unittests/Format/MacroExpanderTest.cpp
===
--- /dev/null
+++ clang/unittests/Format/MacroExpanderTest.cpp
@@ -0,0 +1,170 @@
+#include "../../lib/Format/Macros.h"
+#include "TestLexer.h"
+#include "clang/Basic/FileManager.h"
+
+#include "gtest/gtest.h"
+
+namespace clang {
+namespace format {
+
+namespace {
+
+class MacroExpanderTest : public ::testing::Test {
+public:
+  std::unique_ptr
+  create(const std::vector ) {
+return std::make_unique(MacroDefinitions,
+   Lex.SourceMgr.get(), Lex.Style,
+   Lex.Allocator, Lex.IdentTable);
+  }
+
+  std::string expand(MacroExpander , llvm::StringRef Name,
+ const std::vector  = {}) {
+EXPECT_TRUE(Macros.defined(Name));
+return text(Macros.expand(Lex.id(Name), lexArgs(Args)));
+  }
+
+  llvm::SmallVector
+  lexArgs(const std::vector ) {
+llvm::SmallVector Result;
+for (const auto  : Args) {
+  Result.push_back(uneof(Lex.lex(Arg)));
+}
+return Result;
+  }
+
+  struct MacroAttributes {
+clang::tok::TokenKind Kind;
+MacroRole Role;
+unsigned Start;
+unsigned End;
+llvm::SmallVector ExpandedFrom;
+  };
+
+  void expectAttributes(const TokenList ,
+const std::vector ,
+const std::string , unsigned Line) {
+EXPECT_EQ(Tokens.size(), 

[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-07-22 Thread Marek Kurdej via Phabricator via cfe-commits
curdeius added inline comments.



Comment at: clang/lib/Format/MacroExpander.cpp:46
+
+  // Parse the token stream and return the corresonding Defintion object.
+  // Returns an empty definition object with a null-Name on error.

Nit: typo "corresponding Definition".



Comment at: clang/lib/Format/MacroExpander.cpp:110
+
+MacroExpander::~MacroExpander() {}
+

Why isn't it defaulted?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-07-22 Thread Sam McCall via Phabricator via cfe-commits
sammccall added a comment.

Sorry for the long delay, I've made up for it with extra comments :-\

This looks really well-thought-out and I'm rationalizing my pickiness as:

- this is conceptually complicated
- I expect this code to live a long time and be read and "modified around" by 
lots of people

Some of the comments/requests for doc might strictly be more in scope for in 
later patches (documenting functionality that doesn't exist yet). Those docs 
would help *me* now but happy if you'd rather briefly explain and add them 
later.




Comment at: clang/lib/Format/FormatToken.h:166
+  ///   \- P0  \- P1
+  /// ExpandedFrom stacks for each generated token will be:
+  /// ( -> P0

this is a great example, it might be a little more clear with more distinct 
chars and some vertical alignment:
```
Given X(A)=[A], Y(A)=,
  X({ Y(0)  } ) expands as
  [ { < 0 > } ]
StartOfExpansion  1   1
ExpandedFrom[0]   X X X X X X X
ExpandedFrom[1]   Y Y Y
```

You could extend this to cover all the fields and hoist it to be a comment on 
MacroContext if you like, I think the concreteness helps.



Comment at: clang/lib/Format/FormatToken.h:176
+
+  /// Whether this token is the first token in a macro expansion.
+  bool StartOfExpansion = false;

why the asymmetry between start/end?

given `ID(x)=X`, `ID(ID(0))` yields `0` which starts and ends two expansions, 
right?
Consider making them both integer, even if you don't need it at this point.

(also 64 bits, really?)



Comment at: clang/lib/Format/FormatToken.h:183
+
+  /// When macro expansion introduces parents, those are marked as
+  /// \c MacroParent, so formatting knows their children need to be formatted.

this isn't used in this patch - can we leave it out until used?



Comment at: clang/lib/Format/FormatToken.h:389
+  // in a configured macro expansion.
+  MacroContext MacroCtx;
+

if you're not extremely concerned about memory layout, I'd consider making this 
an `Optional` with nullopt replacing the current MR_None. 

This reduces the number of implicit invariants (AIUI MR_None can't be combined 
with any other fields being set) and means the name MacroContext more closely 
fits the thing it's modeling.



Comment at: clang/lib/Format/FormatToken.h:632
 
+  void copyInto(FormatToken ) { Tok = *this; }
+

const.

I guess it doesn't matter, but copyFrom would seem a little less weird to me in 
an OOP/encapsulation sense.
I do like this explicit form rather than clone() + move constructor though, as 
pointer identity is pretty important for tokens.



Comment at: clang/lib/Format/MacroExpander.cpp:35
+  SmallVector Params;
+  SmallVector Tokens;
+};

Tokens -> Expansion? (semantics rather than type)



Comment at: clang/lib/Format/MacroExpander.cpp:38
+
+// A simple macro parser.
+class MacroExpander::DefinitionParser {

Dmitri gave a tech talk on dropping comments like these :-)



Comment at: clang/lib/Format/MacroExpander.cpp:42
+  DefinitionParser(ArrayRef Tokens) : Tokens(Tokens) {
+assert(!Tokens.empty());
+Current = Tokens[0];

who's responsible for establishing this?

AIUI this will fail if e.g. `Macros` contains a string that contains only 
whitespace, which is a slightly weird precondition.



Comment at: clang/lib/Format/MacroExpander.cpp:63
+  bool parseParams() {
+if (Current->isNot(tok::l_paren))
+  return false;

assert instead? Caller checks this



Comment at: clang/lib/Format/MacroExpander.cpp:81
+do {
+  Def.Tokens.push_back(Current);
+  nextToken();

this assumes the expansion is nonempty, which the grammar doesn't. while{} 
instead?



Comment at: clang/lib/Format/MacroExpander.cpp:113
+void MacroExpander::parseDefinitions(
+const std::vector ) {
+  for (const std::string  : MacroExpander) {

weird param name!



Comment at: clang/lib/Format/MacroExpander.cpp:116
+Buffers.push_back(
+llvm::MemoryBuffer::getMemBufferCopy(Macro, ""));
+clang::FileID FID =

This is a slightly spooky buffer name - it's the magic name the PP uses for 
pasted tokens.
A closer fit for config is maybe "" (like macro definitions 
passed with `-D`).
Is it necessary to use one of clang's magic buffer names at all? If so, 
comment! Else maybe "" or something?



Comment at: clang/lib/Format/MacroExpander.cpp:133
+  ArgsList Args) {
+  assert(defined(ID->TokenText));
+  SmallVector Result;

is the caller responsible for checking the #args matches #params?
If so, document and assert here?

Looking at the 

[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-07-13 Thread MyDeveloperDay via Phabricator via cfe-commits
MyDeveloperDay added inline comments.



Comment at: clang/lib/Format/MacroExpander.cpp:150
+Tok->MacroCtx.ExpandedFrom.push_back(ID);
+if (First) {
+  Tok->MacroCtx.StartOfExpansion = true;

elide braces?



Comment at: clang/unittests/Format/MacroExpanderTest.cpp:5
+
+#include "gmock/gmock.h"
+#include "gtest/gtest.h"

are you using this?



Comment at: clang/unittests/Format/MacroExpanderTest.cpp:55
+  .str();
+  EXPECT_TRUE(Tokens[I]->is(Attributes[I].Kind))
+  << Context << " in " << text(Tokens);

when these assertions fail you have no idea which of the various calls is 
actually failing how about passing in __FILE__,__LINE__ then adding that to the 
output


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-07-13 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

In D83296#2146970 , @sammccall wrote:

> In D83296#2146870 , @klimek wrote:
>
> > Monday-morning ping.
>
>
> Thanks for the reminder here... however this is taking me a bit to get my 
> head around, and we've got a release branch cut scheduled for a couple of 
> days that we're trying to polish for.
>  AFAICT there's significant followup work still needed to make use of this - 
> are you wanting this to land in the 11 release? Else i'd probably come back 
> to this after the cut...


Sorry, this is not urgent - the 11 cut has clear priority. My default is to 
ping stuff every week unless somebody tells me they prefer me not doing that :)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-07-13 Thread Sam McCall via Phabricator via cfe-commits
sammccall added a comment.

In D83296#2146870 , @klimek wrote:

> Monday-morning ping.


Thanks for the reminder here... however this is taking me a bit to get my head 
around, and we've got a release branch cut scheduled for a couple of days that 
we're trying to polish for.
AFAICT there's significant followup work still needed to make use of this - are 
you wanting this to land in the 11 release? Else i'd probably come back to this 
after the cut...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-07-13 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

Monday-morning ping.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-07-07 Thread Manuel Klimek via Phabricator via cfe-commits
klimek created this revision.
klimek added a reviewer: sammccall.
Herald added subscribers: cfe-commits, mgorny.
Herald added a project: clang.

The MacroExpander allows to expand simple (non-resursive) macro
definitions from a macro identifier token and macro arguments. It
annotates the tokens with a newly introduced MacroContext that keeps
track of the role a token played in expanding the macro in order to
be able to reconstruct the macro expansion from an expanded (formatted)
token stream.

Made Token explicitly copy-able to enable copying tokens from the parsed
macro definition.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D83296

Files:
  clang/lib/Format/CMakeLists.txt
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroExpander.cpp
  clang/lib/Format/MacroExpander.h
  clang/unittests/Format/CMakeLists.txt
  clang/unittests/Format/MacroExpanderTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- /dev/null
+++ clang/unittests/Format/TestLexer.h
@@ -0,0 +1,88 @@
+//===--- TestLexer.h - Format C++ code --*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+///
+/// \file
+/// This file contains a TestLexer to create FormatTokens from strings.
+///
+//===--===//
+
+#ifndef LLVM_CLANG_UNITTESTS_FORMAT_TEST_LEXER_H
+#define LLVM_CLANG_UNITTESTS_FORMAT_TEST_LEXER_H
+
+#include "../../lib/Format/FormatTokenLexer.h"
+
+#include "clang/Basic/FileManager.h"
+#include "clang/Basic/SourceManager.h"
+
+#include 
+#include 
+
+namespace clang {
+namespace format {
+
+typedef llvm::SmallVector TokenList;
+
+inline std::ostream <<(std::ostream , const FormatToken ) {
+  Stream << "(" << Tok.Tok.getName() << ", \"" << Tok.TokenText.str() << "\")";
+  return Stream;
+}
+inline std::ostream <<(std::ostream , const TokenList ) {
+  Stream << "{";
+  for (size_t I = 0, E = Tokens.size(); I != E; ++I) {
+Stream << (I > 0 ? ", " : "") << *Tokens[I];
+  }
+  Stream << "}";
+  return Stream;
+}
+
+inline TokenList uneof(const TokenList ) {
+  assert(!Tokens.empty() && Tokens.back()->is(tok::eof));
+  return TokenList(Tokens.begin(), std::prev(Tokens.end()));
+}
+
+inline std::string text(llvm::ArrayRef Tokens) {
+  return std::accumulate(Tokens.begin(), Tokens.end(), std::string(),
+ [](const std::string , FormatToken *Tok) {
+   return (R + Tok->TokenText).str();
+ });
+}
+
+class TestLexer {
+public:
+  TestLexer() : SourceMgr("test.cpp", "") {}
+
+  TokenList lex(llvm::StringRef Code) {
+Buffers.push_back(
+llvm::MemoryBuffer::getMemBufferCopy(Code, ""));
+clang::FileID FID = SourceMgr.get().createFileID(SourceManager::Unowned,
+ Buffers.back().get());
+FormatTokenLexer Lex(SourceMgr.get(), FID, 0, Style, Encoding, Allocator,
+ IdentTable);
+auto Result = Lex.lex();
+return TokenList(Result.begin(), Result.end());
+  }
+
+  FormatToken *id(llvm::StringRef Code) {
+auto Result = uneof(lex(Code));
+assert(Result.size() == 1U && "Code must expand to 1 token.");
+return Result[0];
+  }
+
+  FormatStyle Style = getLLVMStyle();
+  encoding::Encoding Encoding = encoding::Encoding_UTF8;
+  std::vector> Buffers;
+  clang::SourceManagerForFile SourceMgr;
+  llvm::SpecificBumpPtrAllocator Allocator;
+  IdentifierTable IdentTable;
+};
+
+} // namespace format
+} // namespace clang
+
+#endif // LLVM_CLANG_UNITTESTS_FORMAT_TEST_LEXER_H
Index: clang/unittests/Format/MacroExpanderTest.cpp
===
--- /dev/null
+++ clang/unittests/Format/MacroExpanderTest.cpp
@@ -0,0 +1,167 @@
+#include "../../lib/Format/MacroExpander.h"
+#include "TestLexer.h"
+#include "clang/Basic/FileManager.h"
+
+#include "gmock/gmock.h"
+#include "gtest/gtest.h"
+
+namespace clang {
+namespace format {
+
+namespace {
+
+class MacroExpanderTest : public ::testing::Test {
+public:
+  std::unique_ptr
+  create(const std::vector ) {
+return std::make_unique(
+MacroDefinitions, Lex.SourceMgr.get(), Lex.Style, Lex.Encoding,
+Lex.Allocator, Lex.IdentTable);
+  }
+
+  std::string expand(MacroExpander , llvm::StringRef Name,
+ const std::vector  = {}) {
+EXPECT_TRUE(Macros.defined(Name));
+return text(Macros.expand(Lex.id(Name), lexArgs(Args)));
+  }
+
+  llvm::SmallVector
+  lexArgs(const std::vector ) {
+llvm::SmallVector Result;
+for (const auto  : Args) {
+  Result.push_back(uneof(Lex.lex(Arg)));
+}
+return