Re: [PATCH] D130414: [pseudo] Reorganize CXX.h enums

2022-07-27 Thread Sam McCall via cfe-commits
Sorry Doug!
7b70c2e75cd6b21cf9c2ea2f2ce5f7b9c4202549 should fix, not sure why my local
clang accepts it.

On Wed, Jul 27, 2022 at 9:57 AM Douglas Yung via Phabricator <
revi...@reviews.llvm.org> wrote:

> dyung added a comment.
>
> Hi @sammccall, your change seems to be causing a build failure on the PS4
> linux bot, can you take a look?
>
> https://lab.llvm.org/buildbot/#/builders/139/builds/25678
>
>   FAILED:
> tools/clang/tools/extra/pseudo/gen/CMakeFiles/clang-pseudo-gen.dir/Main.cpp.o
>
>   CCACHE_CPP2=yes CCACHE_HASHDIR=yes /usr/bin/ccache /usr/bin/g++
> -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS
> -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
> -I/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/tools/clang/tools/extra/pseudo/gen
> -I/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang-tools-extra/pseudo/gen
> -I/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/include
> -I/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/tools/clang/include
> -I/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/include
> -I/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/llvm/include
> -I/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang-tools-extra/pseudo/include
> -I/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/tools/clang/tools/extra/pseudo/include
> -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden
> -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings
> -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long
> -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-class-memaccess
> -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type
> -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment
> -Wmisleading-indentation -fdiagnostics-color -ffunction-sections
> -fdata-sections -fno-common -Woverloaded-virtual -fno-strict-aliasing -O3
> -DNDEBUG  -fno-exceptions -fno-rtti -UNDEBUG -std=c++14 -MD -MT
> tools/clang/tools/extra/pseudo/gen/CMakeFiles/clang-pseudo-gen.dir/Main.cpp.o
> -MF
> tools/clang/tools/extra/pseudo/gen/CMakeFiles/clang-pseudo-gen.dir/Main.cpp.o.d
> -o
> tools/clang/tools/extra/pseudo/gen/CMakeFiles/clang-pseudo-gen.dir/Main.cpp.o
> -c
> /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang-tools-extra/pseudo/gen/Main.cpp
>
> /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang-tools-extra/pseudo/gen/Main.cpp:
> In function ‘std::string
> clang::pseudo::{anonymous}::mangleSymbol(clang::pseudo::SymbolID, const
> clang::pseudo::Grammar&)’:
>
> /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang-tools-extra/pseudo/gen/Main.cpp:73:50:
> error: expected primary-expression before ‘]’ token
>  73 |   static std::string *TokNames = new std::string[]{
> |  ^
>
> /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang-tools-extra/pseudo/gen/Main.cpp:77:7:
> error: too many initializers for ‘std::string [1]’ {aka
> ‘std::__cxx11::basic_string [1]’}
>  77 |   };
> |   ^
>
>
> Repository:
>   rG LLVM Github Monorepo
>
> CHANGES SINCE LAST ACTION
>   https://reviews.llvm.org/D130414/new/
>
> https://reviews.llvm.org/D130414
>
>
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D130414: [pseudo] Reorganize CXX.h enums

2022-07-27 Thread Douglas Yung via Phabricator via cfe-commits
dyung added a comment.

Hi @sammccall, your change seems to be causing a build failure on the PS4 linux 
bot, can you take a look?

https://lab.llvm.org/buildbot/#/builders/139/builds/25678

  FAILED: 
tools/clang/tools/extra/pseudo/gen/CMakeFiles/clang-pseudo-gen.dir/Main.cpp.o 
  CCACHE_CPP2=yes CCACHE_HASHDIR=yes /usr/bin/ccache /usr/bin/g++ 
-DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS 
-D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS 
-I/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/tools/clang/tools/extra/pseudo/gen
 
-I/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang-tools-extra/pseudo/gen
 
-I/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang/include
 
-I/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/tools/clang/include
 -I/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/include 
-I/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/llvm/include
 
-I/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang-tools-extra/pseudo/include
 
-I/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/tools/clang/tools/extra/pseudo/include
 -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden 
-Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings 
-Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long 
-Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-class-memaccess 
-Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type 
-Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment 
-Wmisleading-indentation -fdiagnostics-color -ffunction-sections 
-fdata-sections -fno-common -Woverloaded-virtual -fno-strict-aliasing -O3 
-DNDEBUG  -fno-exceptions -fno-rtti -UNDEBUG -std=c++14 -MD -MT 
tools/clang/tools/extra/pseudo/gen/CMakeFiles/clang-pseudo-gen.dir/Main.cpp.o 
-MF 
tools/clang/tools/extra/pseudo/gen/CMakeFiles/clang-pseudo-gen.dir/Main.cpp.o.d 
-o 
tools/clang/tools/extra/pseudo/gen/CMakeFiles/clang-pseudo-gen.dir/Main.cpp.o 
-c 
/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang-tools-extra/pseudo/gen/Main.cpp
  
/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang-tools-extra/pseudo/gen/Main.cpp:
 In function ‘std::string 
clang::pseudo::{anonymous}::mangleSymbol(clang::pseudo::SymbolID, const 
clang::pseudo::Grammar&)’:
  
/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang-tools-extra/pseudo/gen/Main.cpp:73:50:
 error: expected primary-expression before ‘]’ token
 73 |   static std::string *TokNames = new std::string[]{
|  ^
  
/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/clang-tools-extra/pseudo/gen/Main.cpp:77:7:
 error: too many initializers for ‘std::string [1]’ {aka 
‘std::__cxx11::basic_string [1]’}
 77 |   };
|   ^


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130414/new/

https://reviews.llvm.org/D130414

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D130414: [pseudo] Reorganize CXX.h enums

2022-07-27 Thread Sam McCall via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
sammccall marked 2 inline comments as done.
Closed by commit rG6bdb15fe844c: [pseudo] Reorganize CXX.h enums (authored by 
sammccall).
Herald added a subscriber: mgorny.

Changed prior to commit:
  https://reviews.llvm.org/D130414?vs=447675&id=447939#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130414/new/

https://reviews.llvm.org/D130414

Files:
  clang-tools-extra/pseudo/gen/Main.cpp
  clang-tools-extra/pseudo/include/clang-pseudo/cxx/CXX.h
  clang-tools-extra/pseudo/include/clang-pseudo/grammar/Grammar.h
  clang-tools-extra/pseudo/lib/cxx/CXX.cpp
  clang-tools-extra/pseudo/lib/grammar/Grammar.cpp
  clang-tools-extra/pseudo/unittests/CMakeLists.txt
  clang-tools-extra/pseudo/unittests/CXXTest.cpp
  clang-tools-extra/pseudo/unittests/GrammarTest.cpp

Index: clang-tools-extra/pseudo/unittests/GrammarTest.cpp
===
--- clang-tools-extra/pseudo/unittests/GrammarTest.cpp
+++ clang-tools-extra/pseudo/unittests/GrammarTest.cpp
@@ -109,21 +109,6 @@
   EXPECT_TRUE(G.lookupRule(ruleFor("x")).Guarded);
 }
 
-TEST_F(GrammarTest, MangleName) {
-  build(R"bnf(
-_ := declaration
-
-declaration := ptr-declarator ;
-ptr-declarator := * IDENTIFIER
-
-  )bnf");
-  ASSERT_TRUE(Diags.empty());
-  EXPECT_EQ(G.mangleRule(ruleFor("declaration")),
-"declaration_0ptr_declarator_1semi");
-  EXPECT_EQ(G.mangleRule(ruleFor("ptr-declarator")),
-"ptr_declarator_0star_1identifier");
-}
-
 TEST_F(GrammarTest, Diagnostics) {
   build(R"cpp(
 _ := ,_opt
Index: clang-tools-extra/pseudo/unittests/CXXTest.cpp
===
--- /dev/null
+++ clang-tools-extra/pseudo/unittests/CXXTest.cpp
@@ -0,0 +1,30 @@
+//===--- CXXTest.cpp --===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "clang-pseudo/cxx/CXX.h"
+#include "gtest/gtest.h"
+
+namespace clang {
+namespace pseudo {
+namespace cxx {
+namespace {
+
+TEST(CXX, GeneratedEnums) {
+  const auto &Lang = clang::pseudo::cxx::getLanguage();
+  EXPECT_EQ("iteration-statement",
+Lang.G.symbolName(Symbol::iteration_statement));
+  EXPECT_EQ("iteration-statement := DO statement WHILE ( expression ) ;",
+Lang.G.dumpRule(
+rule::iteration_statement::
+DO__statement__WHILE__L_PAREN__expression__R_PAREN__SEMI));
+}
+
+} // namespace
+} // namespace cxx
+} // namespace pseudo
+} // namespace clang
Index: clang-tools-extra/pseudo/unittests/CMakeLists.txt
===
--- clang-tools-extra/pseudo/unittests/CMakeLists.txt
+++ clang-tools-extra/pseudo/unittests/CMakeLists.txt
@@ -5,6 +5,7 @@
 add_custom_target(ClangPseudoUnitTests)
 add_unittest(ClangPseudoUnitTests ClangPseudoTests
   BracketTest.cpp
+  CXXTest.cpp
   DirectiveTreeTest.cpp
   ForestTest.cpp
   GLRTest.cpp
@@ -22,6 +23,7 @@
 target_link_libraries(ClangPseudoTests
   PRIVATE
   clangPseudo
+  clangPseudoCXX
   clangPseudoGrammar
   LLVMTestingSupport
   )
Index: clang-tools-extra/pseudo/lib/grammar/Grammar.cpp
===
--- clang-tools-extra/pseudo/lib/grammar/Grammar.cpp
+++ clang-tools-extra/pseudo/lib/grammar/Grammar.cpp
@@ -45,28 +45,6 @@
   return T->Nonterminals[SID].Name;
 }
 
-std::string Grammar::mangleSymbol(SymbolID SID) const {
-  static const char *const TokNames[] = {
-#define TOK(X) #X,
-#define KEYWORD(X, Y) #X,
-#include "clang/Basic/TokenKinds.def"
-  nullptr};
-  if (clang::pseudo::isToken(SID))
-return TokNames[clang::pseudo::symbolToToken(SID)];
-  std::string Name = symbolName(SID).str();
-  // translation-unit -> translation_unit
-  std::replace(Name.begin(), Name.end(), '-', '_');
-  return Name;
-}
-
-std::string Grammar::mangleRule(RuleID RID) const {
-  const auto &R = lookupRule(RID);
-  std::string MangleName = mangleSymbol(R.Target);
-  for (size_t I = 0; I < R.seq().size(); ++I)
-MangleName += llvm::formatv("_{0}{1}", I, mangleSymbol(R.seq()[I]));
-  return MangleName;
-}
-
 llvm::Optional Grammar::findNonterminal(llvm::StringRef Name) const {
   auto It = llvm::partition_point(
   T->Nonterminals,
Index: clang-tools-extra/pseudo/lib/cxx/CXX.cpp
===
--- clang-tools-extra/pseudo/lib/cxx/CXX.cpp
+++ clang-tools-extra/pseudo/lib/cxx/CXX.cpp
@@ -111,42 +111,41 @@
 }
 
 bool isFunctionDeclarator(const ForestNode *Declarator) {
-  as

[PATCH] D130414: [pseudo] Reorganize CXX.h enums

2022-07-26 Thread Haojian Wu via Phabricator via cfe-commits
hokein accepted this revision.
hokein added a comment.
This revision is now accepted and ready to land.

I think this patch is in a good state.




Comment at: clang-tools-extra/pseudo/include/clang-pseudo/cxx/CXX.h:54
+namespace rules {
+namespace dummy {
+enum Rule {

sammccall wrote:
> hokein wrote:
> > sammccall wrote:
> > > usaxena95 wrote:
> > > > sammccall wrote:
> > > > > hokein wrote:
> > > > > > why there is a dummy namespace here?
> > > > > for each NT, we close the previous ns+enum and open new ones.
> > > > > For this to work for the first NT, we have to open an ns+enum.
> > > > > 
> > > > > I can add a comment, but would prefer to do this some other way?
> > > > I would include this block in the clang-format off block to show these 
> > > > are for the generated code.
> > > > ```
> > > > //clang-format off
> > > > namespace dummy { enum Rule {
> > > > ...
> > > > };}
> > > > //clang-format on
> > > > ```
> > > Oops, they're not for the generated code, just for the macro definition 
> > > (clang-format gets very confused)
> > > 
> > > Restricted the scope of the formatting-disabled block and tried to 
> > > improve the hand-formatting (though I don't think any formatting of this 
> > > particular preprocessor trick is very readable)
> > ah, I get it (until I read the preprocessed CXX.h code) -- I found it 
> > really hard to understand it by literally reading the code here. 
> > 
> > To make it clear,  I think we can probably add two additional RULE_BEGIN, 
> > RULE_END macros? 
> > 
> >  in `CXXSymbols.inc`, we emit something like
> > 
> > ```
> > RULE_BEGIN(_)
> > RULE(_, translation_unit, 135)
> > RULE(_, statement_seq, 136)
> > RULE(_, declaration_seq, 137))
> > RULE_END(_)
> > ```
> > 
> > In CXX.h, we write
> > 
> > ```
> > #define RULE_BEGIN(LHS) namespace LHS { enum Rule : RULE ID {
> > #define RULE_END()  }; }
> > ```
> > 
> That would make the callsite more clear, but at the cost of adding ad-hoc 
> bits to the CXXSymbols.inc.
> I'd prefer to keep it generic if we can (otherwise we might as well just have 
> Gen generate the enum definitions directly, right?)
> 
> Added a comment to explain the dummy.
> That would make the callsite more clear, but at the cost of adding ad-hoc 
> bits to the CXXSymbols.inc.

I think this is the point. We read this cxx.h more often than the 
CXXSymbols.inc, this seems to be a right tradeoff to me.

> I'd prefer to keep it generic if we can (otherwise we might as well just have 
> Gen generate the enum definitions directly, right?)
And yeah, generating the whole definitions from the Gen tool is another option.

OK, if you prefer to do it in this way, could you at least rename the `enum 
Rule {` to something like `enum PlaceHolder {`? It is very confusing to use the 
meaningful name `Rule` for this dummy enum.



Comment at: clang-tools-extra/pseudo/lib/cxx/CXX.cpp:119
 
-switch ((cxx::Rule)Declarator->rule()) {
-case Rule::noptr_declarator_0declarator_id: // reached the bottom
+switch (Declarator->rule()) {
+case rule::noptr_declarator::declarator_id: // reached the bottom

sammccall wrote:
> hokein wrote:
> > sammccall wrote:
> > > hokein wrote:
> > > > The code of applying the pattern doesn't look much worse to me and it 
> > > > is easier to verify the exhaustiveness by human as well. We might loose 
> > > > some performance (I hope not a lot), but I think it is a good tradeoff 
> > > > (not saying we need do it in this patch).
> > > > 
> > > > ```
> > > > switch (LHSNonterminal(Declarator->rule(), *G)) {
> > > >case cxx::Symbol::noptr_declarator: {
> > > >   switch ((rule::noptr_declarator)Declarator->rule())  {
> > > >  case rule::noptr_declarator::declarator_id:
> > > >  
> > > >  case 
> > > > rule::noptr_declarator::noptr_declarator__parameters_and_qualifiers:
> > > >  
> > > >   }
> > > >...
> > > >}
> > > > }
> > > > ```
> > > I guess this is a question of taste, but I find that style very hard to 
> > > read.
> > > 
> > > (Note that it's incorrectly indented, and the indentation rules around 
> > > switch/case are one of the main reasons I find nested switches confusing)
> > > (Note that it's incorrectly indented, and the indentation rules around 
> > > switch/case are one of the main reasons I find nested switches confusing)
> > 
> > Ah the indentation is weird (it is not what I originally written..). There 
> > are ways to make the indentation correct (turn off the clang-format etc).
> > 
> > If nested switch is hard to read, maybe we can wrap one with a macro to 
> > improve the code readability (happy to explore ideas, I think there is 
> > value on this pattern).
> Maybe - I think nonstandard formatting and macros cost a lot of readability 
> in their own right.
> 
> I think we should probably discuss this separately - I think this change is 
> worth having even if we don't change control flo

[PATCH] D130414: [pseudo] Reorganize CXX.h enums

2022-07-26 Thread Sam McCall via Phabricator via cfe-commits
sammccall updated this revision to Diff 447675.
sammccall marked an inline comment as done.
sammccall added a comment.

address comments


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130414/new/

https://reviews.llvm.org/D130414

Files:
  clang-tools-extra/pseudo/gen/Main.cpp
  clang-tools-extra/pseudo/include/clang-pseudo/cxx/CXX.h
  clang-tools-extra/pseudo/include/clang-pseudo/grammar/Grammar.h
  clang-tools-extra/pseudo/lib/cxx/CXX.cpp
  clang-tools-extra/pseudo/lib/grammar/Grammar.cpp
  clang-tools-extra/pseudo/unittests/GrammarTest.cpp

Index: clang-tools-extra/pseudo/unittests/GrammarTest.cpp
===
--- clang-tools-extra/pseudo/unittests/GrammarTest.cpp
+++ clang-tools-extra/pseudo/unittests/GrammarTest.cpp
@@ -113,15 +113,16 @@
   build(R"bnf(
 _ := declaration
 
-declaration := ptr-declarator ;
+declaration := cv-qualifier ptr-declarator ;
 ptr-declarator := * IDENTIFIER
+cv-qualifier := CONST
 
   )bnf");
   ASSERT_TRUE(Diags.empty());
   EXPECT_EQ(G.mangleRule(ruleFor("declaration")),
-"declaration_0ptr_declarator_1semi");
-  EXPECT_EQ(G.mangleRule(ruleFor("ptr-declarator")),
-"ptr_declarator_0star_1identifier");
+"cv_qualifier__ptr_declarator__semi");
+  EXPECT_EQ(G.mangleRule(ruleFor("ptr-declarator")), "star__identifier");
+  EXPECT_EQ(G.mangleRule(ruleFor("cv-qualifier")), "CONST");
 }
 
 TEST_F(GrammarTest, Diagnostics) {
Index: clang-tools-extra/pseudo/lib/grammar/Grammar.cpp
===
--- clang-tools-extra/pseudo/lib/grammar/Grammar.cpp
+++ clang-tools-extra/pseudo/lib/grammar/Grammar.cpp
@@ -45,28 +45,6 @@
   return T->Nonterminals[SID].Name;
 }
 
-std::string Grammar::mangleSymbol(SymbolID SID) const {
-  static const char *const TokNames[] = {
-#define TOK(X) #X,
-#define KEYWORD(X, Y) #X,
-#include "clang/Basic/TokenKinds.def"
-  nullptr};
-  if (clang::pseudo::isToken(SID))
-return TokNames[clang::pseudo::symbolToToken(SID)];
-  std::string Name = symbolName(SID).str();
-  // translation-unit -> translation_unit
-  std::replace(Name.begin(), Name.end(), '-', '_');
-  return Name;
-}
-
-std::string Grammar::mangleRule(RuleID RID) const {
-  const auto &R = lookupRule(RID);
-  std::string MangleName = mangleSymbol(R.Target);
-  for (size_t I = 0; I < R.seq().size(); ++I)
-MangleName += llvm::formatv("_{0}{1}", I, mangleSymbol(R.seq()[I]));
-  return MangleName;
-}
-
 llvm::Optional Grammar::findNonterminal(llvm::StringRef Name) const {
   auto It = llvm::partition_point(
   T->Nonterminals,
Index: clang-tools-extra/pseudo/lib/cxx/CXX.cpp
===
--- clang-tools-extra/pseudo/lib/cxx/CXX.cpp
+++ clang-tools-extra/pseudo/lib/cxx/CXX.cpp
@@ -111,42 +111,41 @@
 }
 
 bool isFunctionDeclarator(const ForestNode *Declarator) {
-  assert(Declarator->symbol() == (SymbolID)(cxx::Symbol::declarator));
+  assert(Declarator->symbol() == cxx::Symbol::declarator);
   bool IsFunction = false;
-  using cxx::Rule;
   while (true) {
 // not well-formed code, return the best guess.
 if (Declarator->kind() != ForestNode::Sequence)
   return IsFunction;
 
-switch ((cxx::Rule)Declarator->rule()) {
-case Rule::noptr_declarator_0declarator_id: // reached the bottom
+switch (Declarator->rule()) {
+case rule::noptr_declarator::declarator_id: // reached the bottom
   return IsFunction;
 // *X is a nonfunction (unless X is a function).
-case Rule::ptr_declarator_0ptr_operator_1ptr_declarator:
+case rule::ptr_declarator::ptr_operator__ptr_declarator:
   Declarator = Declarator->elements()[1];
   IsFunction = false;
   continue;
 // X() is a function (unless X is a pointer or similar).
-case Rule::
-declarator_0noptr_declarator_1parameters_and_qualifiers_2trailing_return_type:
-case Rule::noptr_declarator_0noptr_declarator_1parameters_and_qualifiers:
+case rule::declarator::
+noptr_declarator__parameters_and_qualifiers__trailing_return_type:
+case rule::noptr_declarator::noptr_declarator__parameters_and_qualifiers:
   Declarator = Declarator->elements()[0];
   IsFunction = true;
   continue;
 // X[] is an array (unless X is a pointer or function).
-case Rule::
-noptr_declarator_0noptr_declarator_1l_square_2constant_expression_3r_square:
-case Rule::noptr_declarator_0noptr_declarator_1l_square_2r_square:
+case rule::noptr_declarator::
+noptr_declarator__l_square__constant_expression__r_square:
+case rule::noptr_declarator::noptr_declarator__l_square__r_square:
   Declarator = Declarator->elements()[0];
   IsFunction = false;
   continue;
 // (X) is whatever X is.
-case Rule::noptr_declarator_0l_paren_1ptr_declarator_2r_paren:
+case rule::noptr_declarator::l_paren_

[PATCH] D130414: [pseudo] Reorganize CXX.h enums

2022-07-26 Thread Sam McCall via Phabricator via cfe-commits
sammccall marked 6 inline comments as done.
sammccall added inline comments.



Comment at: clang-tools-extra/pseudo/include/clang-pseudo/cxx/CXX.h:54
+namespace rules {
+namespace dummy {
+enum Rule {

hokein wrote:
> sammccall wrote:
> > usaxena95 wrote:
> > > sammccall wrote:
> > > > hokein wrote:
> > > > > why there is a dummy namespace here?
> > > > for each NT, we close the previous ns+enum and open new ones.
> > > > For this to work for the first NT, we have to open an ns+enum.
> > > > 
> > > > I can add a comment, but would prefer to do this some other way?
> > > I would include this block in the clang-format off block to show these 
> > > are for the generated code.
> > > ```
> > > //clang-format off
> > > namespace dummy { enum Rule {
> > > ...
> > > };}
> > > //clang-format on
> > > ```
> > Oops, they're not for the generated code, just for the macro definition 
> > (clang-format gets very confused)
> > 
> > Restricted the scope of the formatting-disabled block and tried to improve 
> > the hand-formatting (though I don't think any formatting of this particular 
> > preprocessor trick is very readable)
> ah, I get it (until I read the preprocessed CXX.h code) -- I found it really 
> hard to understand it by literally reading the code here. 
> 
> To make it clear,  I think we can probably add two additional RULE_BEGIN, 
> RULE_END macros? 
> 
>  in `CXXSymbols.inc`, we emit something like
> 
> ```
> RULE_BEGIN(_)
> RULE(_, translation_unit, 135)
> RULE(_, statement_seq, 136)
> RULE(_, declaration_seq, 137))
> RULE_END(_)
> ```
> 
> In CXX.h, we write
> 
> ```
> #define RULE_BEGIN(LHS) namespace LHS { enum Rule : RULE ID {
> #define RULE_END()  }; }
> ```
> 
That would make the callsite more clear, but at the cost of adding ad-hoc bits 
to the CXXSymbols.inc.
I'd prefer to keep it generic if we can (otherwise we might as well just have 
Gen generate the enum definitions directly, right?)

Added a comment to explain the dummy.



Comment at: clang-tools-extra/pseudo/include/clang-pseudo/cxx/CXX.h:54-62
+namespace dummy {
+enum Rule {
+//clang-format off
+#define NONTERMINAL(NAME, ID) };} namespace NAME { enum Rule : RuleID {
+#define RULE(LHS, RHS, ID) RHS = ID,
 #include "CXXSymbols.inc"
+  //clang-format on

usaxena95 wrote:
> sammccall wrote:
> > hokein wrote:
> > > why there is a dummy namespace here?
> > for each NT, we close the previous ns+enum and open new ones.
> > For this to work for the first NT, we have to open an ns+enum.
> > 
> > I can add a comment, but would prefer to do this some other way?
> I would include this block in the clang-format off block to show these are 
> for the generated code.
> ```
> //clang-format off
> namespace dummy { enum Rule {
> ...
> };}
> //clang-format on
> ```
Oops, they're not for the generated code, just for the macro definition 
(clang-format gets very confused)

Restricted the scope of the formatting-disabled block and tried to improve the 
hand-formatting (though I don't think any formatting of this particular 
preprocessor trick is very readable)



Comment at: clang-tools-extra/pseudo/lib/cxx/CXX.cpp:119
 
-switch ((cxx::Rule)Declarator->rule()) {
-case Rule::noptr_declarator_0declarator_id: // reached the bottom
+switch (Declarator->rule()) {
+case rule::noptr_declarator::declarator_id: // reached the bottom

hokein wrote:
> sammccall wrote:
> > hokein wrote:
> > > The code of applying the pattern doesn't look much worse to me and it is 
> > > easier to verify the exhaustiveness by human as well. We might loose some 
> > > performance (I hope not a lot), but I think it is a good tradeoff (not 
> > > saying we need do it in this patch).
> > > 
> > > ```
> > > switch (LHSNonterminal(Declarator->rule(), *G)) {
> > >case cxx::Symbol::noptr_declarator: {
> > >   switch ((rule::noptr_declarator)Declarator->rule())  {
> > >  case rule::noptr_declarator::declarator_id:
> > >  
> > >  case 
> > > rule::noptr_declarator::noptr_declarator__parameters_and_qualifiers:
> > >  
> > >   }
> > >...
> > >}
> > > }
> > > ```
> > I guess this is a question of taste, but I find that style very hard to 
> > read.
> > 
> > (Note that it's incorrectly indented, and the indentation rules around 
> > switch/case are one of the main reasons I find nested switches confusing)
> > (Note that it's incorrectly indented, and the indentation rules around 
> > switch/case are one of the main reasons I find nested switches confusing)
> 
> Ah the indentation is weird (it is not what I originally written..). There 
> are ways to make the indentation correct (turn off the clang-format etc).
> 
> If nested switch is hard to read, maybe we can wrap one with a macro to 
> improve the code readability (happy to explore ideas, I think there is value 
> on this pattern).
Maybe - I think nonstandard formatting and

[PATCH] D130414: [pseudo] Reorganize CXX.h enums

2022-07-26 Thread Haojian Wu via Phabricator via cfe-commits
hokein added inline comments.



Comment at: clang-tools-extra/pseudo/include/clang-pseudo/cxx/CXX.h:54
+namespace rules {
+namespace dummy {
+enum Rule {

usaxena95 wrote:
> sammccall wrote:
> > hokein wrote:
> > > why there is a dummy namespace here?
> > for each NT, we close the previous ns+enum and open new ones.
> > For this to work for the first NT, we have to open an ns+enum.
> > 
> > I can add a comment, but would prefer to do this some other way?
> I would include this block in the clang-format off block to show these are 
> for the generated code.
> ```
> //clang-format off
> namespace dummy { enum Rule {
> ...
> };}
> //clang-format on
> ```
ah, I get it (until I read the preprocessed CXX.h code) -- I found it really 
hard to understand it by literally reading the code here. 

To make it clear,  I think we can probably add two additional RULE_BEGIN, 
RULE_END macros? 

 in `CXXSymbols.inc`, we emit something like

```
RULE_BEGIN(_)
RULE(_, translation_unit, 135)
RULE(_, statement_seq, 136)
RULE(_, declaration_seq, 137))
RULE_END(_)
```

In CXX.h, we write

```
#define RULE_BEGIN(LHS) namespace LHS { enum Rule : RULE ID {
#define RULE_END()  }; }
```




Comment at: clang-tools-extra/pseudo/lib/cxx/CXX.cpp:119
 
-switch ((cxx::Rule)Declarator->rule()) {
-case Rule::noptr_declarator_0declarator_id: // reached the bottom
+switch (Declarator->rule()) {
+case rule::noptr_declarator::declarator_id: // reached the bottom

sammccall wrote:
> hokein wrote:
> > The code of applying the pattern doesn't look much worse to me and it is 
> > easier to verify the exhaustiveness by human as well. We might loose some 
> > performance (I hope not a lot), but I think it is a good tradeoff (not 
> > saying we need do it in this patch).
> > 
> > ```
> > switch (LHSNonterminal(Declarator->rule(), *G)) {
> >case cxx::Symbol::noptr_declarator: {
> >   switch ((rule::noptr_declarator)Declarator->rule())  {
> >  case rule::noptr_declarator::declarator_id:
> >  
> >  case 
> > rule::noptr_declarator::noptr_declarator__parameters_and_qualifiers:
> >  
> >   }
> >...
> >}
> > }
> > ```
> I guess this is a question of taste, but I find that style very hard to read.
> 
> (Note that it's incorrectly indented, and the indentation rules around 
> switch/case are one of the main reasons I find nested switches confusing)
> (Note that it's incorrectly indented, and the indentation rules around 
> switch/case are one of the main reasons I find nested switches confusing)

Ah the indentation is weird (it is not what I originally written..). There are 
ways to make the indentation correct (turn off the clang-format etc).

If nested switch is hard to read, maybe we can wrap one with a macro to improve 
the code readability (happy to explore ideas, I think there is value on this 
pattern).



Comment at: clang-tools-extra/pseudo/lib/grammar/Grammar.cpp:51
 #define TOK(X) #X,
-#define KEYWORD(X, Y) #X,
+#define KEYWORD(Keyword, Condition) llvm::StringRef(#Keyword).upper(),
 #include "clang/Basic/TokenKinds.def"

sammccall wrote:
> hokein wrote:
> > IMO, this improves the readability, and also aligns with the text in the 
> > cxx.grammar.
> Thanks. I like this change too. We still have `semi` vs `;` (should we use 
> `SEMI`?) but it feels clearer
yeah, that looks good (that means lowercase is always referring to nonterminals)



Comment at: clang-tools-extra/pseudo/lib/grammar/Grammar.cpp:62
 
 std::string Grammar::mangleRule(RuleID RID) const {
   const auto &R = lookupRule(RID);

sammccall wrote:
> hokein wrote:
> > nit: update the doc comment in .h file.
> Honestly I would rather just move the impl back into gen - this change seems 
> to demonstrate why
ok, feel free to move it.



Comment at: clang-tools-extra/pseudo/lib/grammar/Grammar.cpp:65
+  std::string MangleName;
+  for (size_t I = 0; I < R.seq().size(); ++I) {
+MangleName += mangleSymbol(R.seq()[I]);

sammccall wrote:
> hokein wrote:
> > ok, now we're dropping the index for all RHS symbols. Just want to know 
> > your thought about this. Personally, I'd like to keep this information 
> > (`noptr_declarator__l_square__constant_expression__r_square` vs 
> > `noptr_declarator0_l_square1_constant_expression2_r_square3`) though it 
> > makes the name a bit uglier.
> > 
> > > Change mangling of keywords to ALL_CAPS (needed to turn keywords that 
> > > appear alone on RHS into valid identifiers)
> > 
> > if we add index number to the name, then this change is not required.
> Short version: I can deal with the numbers at the front (there's a *little* 
> bit of value), but at the back of the symbol name there's no value, just 
> noise. I find the double-underscore version much more readable (than either 
> variant with numbers).
> 
> ---
> 
> 

[PATCH] D130414: [pseudo] Reorganize CXX.h enums

2022-07-25 Thread Sam McCall via Phabricator via cfe-commits
sammccall updated this revision to Diff 447286.
sammccall added a comment.

tweak formatting


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130414/new/

https://reviews.llvm.org/D130414

Files:
  clang-tools-extra/pseudo/gen/Main.cpp
  clang-tools-extra/pseudo/include/clang-pseudo/cxx/CXX.h
  clang-tools-extra/pseudo/lib/cxx/CXX.cpp
  clang-tools-extra/pseudo/lib/grammar/Grammar.cpp
  clang-tools-extra/pseudo/unittests/GrammarTest.cpp

Index: clang-tools-extra/pseudo/unittests/GrammarTest.cpp
===
--- clang-tools-extra/pseudo/unittests/GrammarTest.cpp
+++ clang-tools-extra/pseudo/unittests/GrammarTest.cpp
@@ -113,15 +113,16 @@
   build(R"bnf(
 _ := declaration
 
-declaration := ptr-declarator ;
+declaration := cv-qualifier ptr-declarator ;
 ptr-declarator := * IDENTIFIER
+cv-qualifier := CONST
 
   )bnf");
   ASSERT_TRUE(Diags.empty());
   EXPECT_EQ(G.mangleRule(ruleFor("declaration")),
-"declaration_0ptr_declarator_1semi");
-  EXPECT_EQ(G.mangleRule(ruleFor("ptr-declarator")),
-"ptr_declarator_0star_1identifier");
+"cv_qualifier__ptr_declarator__semi");
+  EXPECT_EQ(G.mangleRule(ruleFor("ptr-declarator")), "star__identifier");
+  EXPECT_EQ(G.mangleRule(ruleFor("cv-qualifier")), "CONST");
 }
 
 TEST_F(GrammarTest, Diagnostics) {
Index: clang-tools-extra/pseudo/lib/grammar/Grammar.cpp
===
--- clang-tools-extra/pseudo/lib/grammar/Grammar.cpp
+++ clang-tools-extra/pseudo/lib/grammar/Grammar.cpp
@@ -46,11 +46,11 @@
 }
 
 std::string Grammar::mangleSymbol(SymbolID SID) const {
-  static const char *const TokNames[] = {
+  static std::string *TokNames = new std::string[]{
 #define TOK(X) #X,
-#define KEYWORD(X, Y) #X,
+#define KEYWORD(Keyword, Condition) llvm::StringRef(#Keyword).upper(),
 #include "clang/Basic/TokenKinds.def"
-  nullptr};
+  };
   if (clang::pseudo::isToken(SID))
 return TokNames[clang::pseudo::symbolToToken(SID)];
   std::string Name = symbolName(SID).str();
@@ -61,9 +61,13 @@
 
 std::string Grammar::mangleRule(RuleID RID) const {
   const auto &R = lookupRule(RID);
-  std::string MangleName = mangleSymbol(R.Target);
-  for (size_t I = 0; I < R.seq().size(); ++I)
-MangleName += llvm::formatv("_{0}{1}", I, mangleSymbol(R.seq()[I]));
+  std::string MangleName;
+  for (size_t I = 0; I < R.seq().size(); ++I) {
+MangleName += mangleSymbol(R.seq()[I]);
+MangleName.append("__");
+  }
+  MangleName.pop_back();
+  MangleName.pop_back();
   return MangleName;
 }
 
Index: clang-tools-extra/pseudo/lib/cxx/CXX.cpp
===
--- clang-tools-extra/pseudo/lib/cxx/CXX.cpp
+++ clang-tools-extra/pseudo/lib/cxx/CXX.cpp
@@ -111,40 +111,39 @@
 bool isFunctionDeclarator(const ForestNode *Declarator) {
   assert(Declarator->symbol() == (SymbolID)(cxx::Symbol::declarator));
   bool IsFunction = false;
-  using cxx::Rule;
   while (true) {
 // not well-formed code, return the best guess.
 if (Declarator->kind() != ForestNode::Sequence)
   return IsFunction;
 
-switch ((cxx::Rule)Declarator->rule()) {
-case Rule::noptr_declarator_0declarator_id: // reached the bottom
+switch (Declarator->rule()) {
+case rule::noptr_declarator::declarator_id: // reached the bottom
   return IsFunction;
 // *X is a nonfunction (unless X is a function).
-case Rule::ptr_declarator_0ptr_operator_1ptr_declarator:
+case rule::ptr_declarator::ptr_operator__ptr_declarator:
   Declarator = Declarator->elements()[1];
   IsFunction = false;
   continue;
 // X() is a function (unless X is a pointer or similar).
-case Rule::
-declarator_0noptr_declarator_1parameters_and_qualifiers_2trailing_return_type:
-case Rule::noptr_declarator_0noptr_declarator_1parameters_and_qualifiers:
+case rule::declarator::
+noptr_declarator__parameters_and_qualifiers__trailing_return_type:
+case rule::noptr_declarator::noptr_declarator__parameters_and_qualifiers:
   Declarator = Declarator->elements()[0];
   IsFunction = true;
   continue;
 // X[] is an array (unless X is a pointer or function).
-case Rule::
-noptr_declarator_0noptr_declarator_1l_square_2constant_expression_3r_square:
-case Rule::noptr_declarator_0noptr_declarator_1l_square_2r_square:
+case rule::noptr_declarator::
+noptr_declarator__l_square__constant_expression__r_square:
+case rule::noptr_declarator::noptr_declarator__l_square__r_square:
   Declarator = Declarator->elements()[0];
   IsFunction = false;
   continue;
 // (X) is whatever X is.
-case Rule::noptr_declarator_0l_paren_1ptr_declarator_2r_paren:
+case rule::noptr_declarator::l_paren__ptr_declarator__r_paren:
   Declarator = Declarator->elements()[1];
   continue;
-

[PATCH] D130414: [pseudo] Reorganize CXX.h enums

2022-07-25 Thread Utkarsh Saxena via Phabricator via cfe-commits
usaxena95 added inline comments.



Comment at: clang-tools-extra/pseudo/include/clang-pseudo/cxx/CXX.h:54-62
+namespace dummy {
+enum Rule {
+//clang-format off
+#define NONTERMINAL(NAME, ID) };} namespace NAME { enum Rule : RuleID {
+#define RULE(LHS, RHS, ID) RHS = ID,
 #include "CXXSymbols.inc"
+  //clang-format on

sammccall wrote:
> hokein wrote:
> > why there is a dummy namespace here?
> for each NT, we close the previous ns+enum and open new ones.
> For this to work for the first NT, we have to open an ns+enum.
> 
> I can add a comment, but would prefer to do this some other way?
I would include this block in the clang-format off block to show these are for 
the generated code.
```
//clang-format off
namespace dummy { enum Rule {
...
};}
//clang-format on
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130414/new/

https://reviews.llvm.org/D130414

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D130414: [pseudo] Reorganize CXX.h enums

2022-07-25 Thread Sam McCall via Phabricator via cfe-commits
sammccall added inline comments.



Comment at: clang-tools-extra/pseudo/include/clang-pseudo/cxx/CXX.h:54
+namespace rules {
+namespace dummy {
+enum Rule {

hokein wrote:
> why there is a dummy namespace here?
for each NT, we close the previous ns+enum and open new ones.
For this to work for the first NT, we have to open an ns+enum.

I can add a comment, but would prefer to do this some other way?



Comment at: clang-tools-extra/pseudo/lib/cxx/CXX.cpp:119
 
-switch ((cxx::Rule)Declarator->rule()) {
-case Rule::noptr_declarator_0declarator_id: // reached the bottom
+switch (Declarator->rule()) {
+case rule::noptr_declarator::declarator_id: // reached the bottom

hokein wrote:
> The code of applying the pattern doesn't look much worse to me and it is 
> easier to verify the exhaustiveness by human as well. We might loose some 
> performance (I hope not a lot), but I think it is a good tradeoff (not saying 
> we need do it in this patch).
> 
> ```
> switch (LHSNonterminal(Declarator->rule(), *G)) {
>case cxx::Symbol::noptr_declarator: {
>   switch ((rule::noptr_declarator)Declarator->rule())  {
>  case rule::noptr_declarator::declarator_id:
>  
>  case 
> rule::noptr_declarator::noptr_declarator__parameters_and_qualifiers:
>  
>   }
>...
>}
> }
> ```
I guess this is a question of taste, but I find that style very hard to read.

(Note that it's incorrectly indented, and the indentation rules around 
switch/case are one of the main reasons I find nested switches confusing)



Comment at: clang-tools-extra/pseudo/lib/grammar/Grammar.cpp:51
 #define TOK(X) #X,
-#define KEYWORD(X, Y) #X,
+#define KEYWORD(Keyword, Condition) llvm::StringRef(#Keyword).upper(),
 #include "clang/Basic/TokenKinds.def"

hokein wrote:
> IMO, this improves the readability, and also aligns with the text in the 
> cxx.grammar.
Thanks. I like this change too. We still have `semi` vs `;` (should we use 
`SEMI`?) but it feels clearer



Comment at: clang-tools-extra/pseudo/lib/grammar/Grammar.cpp:62
 
 std::string Grammar::mangleRule(RuleID RID) const {
   const auto &R = lookupRule(RID);

hokein wrote:
> nit: update the doc comment in .h file.
Honestly I would rather just move the impl back into gen - this change seems to 
demonstrate why



Comment at: clang-tools-extra/pseudo/lib/grammar/Grammar.cpp:65
+  std::string MangleName;
+  for (size_t I = 0; I < R.seq().size(); ++I) {
+MangleName += mangleSymbol(R.seq()[I]);

hokein wrote:
> ok, now we're dropping the index for all RHS symbols. Just want to know your 
> thought about this. Personally, I'd like to keep this information 
> (`noptr_declarator__l_square__constant_expression__r_square` vs 
> `noptr_declarator0_l_square1_constant_expression2_r_square3`) though it makes 
> the name a bit uglier.
> 
> > Change mangling of keywords to ALL_CAPS (needed to turn keywords that 
> > appear alone on RHS into valid identifiers)
> 
> if we add index number to the name, then this change is not required.
Short version: I can deal with the numbers at the front (there's a *little* bit 
of value), but at the back of the symbol name there's no value, just noise. I 
find the double-underscore version much more readable (than either variant with 
numbers).

---

I always found the indexes aesthetically ugly but OK when you could read them as
```
lhs _   0 rhsa_ 1 rhsb
lhs := [0]rhsa [1]rhsb
```

But an identifier can't start with a number (`rule::empty_statement::0semi`) 
this isn't possible. I think you suggested shifting the numbers to the end, but 
I can't find a way to read that, it adds too much visual noise. (e.g. because 
the number "decorations" don't line up in a column in code completion or a 
switch statement).

I also think the double-underscore version is significantly less cryptic for 
new readers.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130414/new/

https://reviews.llvm.org/D130414

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D130414: [pseudo] Reorganize CXX.h enums

2022-07-25 Thread Haojian Wu via Phabricator via cfe-commits
hokein added a comment.

Thanks for exploring the idea, this looks like a good start to me.




Comment at: clang-tools-extra/pseudo/include/clang-pseudo/cxx/CXX.h:54
+namespace rules {
+namespace dummy {
+enum Rule {

why there is a dummy namespace here?



Comment at: clang-tools-extra/pseudo/lib/cxx/CXX.cpp:119
 
-switch ((cxx::Rule)Declarator->rule()) {
-case Rule::noptr_declarator_0declarator_id: // reached the bottom
+switch (Declarator->rule()) {
+case rule::noptr_declarator::declarator_id: // reached the bottom

The code of applying the pattern doesn't look much worse to me and it is easier 
to verify the exhaustiveness by human as well. We might loose some performance 
(I hope not a lot), but I think it is a good tradeoff (not saying we need do it 
in this patch).

```
switch (LHSNonterminal(Declarator->rule(), *G)) {
   case cxx::Symbol::noptr_declarator: {
  switch ((rule::noptr_declarator)Declarator->rule())  {
 case rule::noptr_declarator::declarator_id:
 
 case 
rule::noptr_declarator::noptr_declarator__parameters_and_qualifiers:
 
  }
   ...
   }
}
```



Comment at: clang-tools-extra/pseudo/lib/cxx/CXX.cpp:175
   return {
-  {(RuleID)Rule::function_declarator_0declarator,
SYMBOL_GUARD(declarator, isFunctionDeclarator(&N))},

A nice improvement



Comment at: clang-tools-extra/pseudo/lib/grammar/Grammar.cpp:51
 #define TOK(X) #X,
-#define KEYWORD(X, Y) #X,
+#define KEYWORD(Keyword, Condition) llvm::StringRef(#Keyword).upper(),
 #include "clang/Basic/TokenKinds.def"

IMO, this improves the readability, and also aligns with the text in the 
cxx.grammar.



Comment at: clang-tools-extra/pseudo/lib/grammar/Grammar.cpp:62
 
 std::string Grammar::mangleRule(RuleID RID) const {
   const auto &R = lookupRule(RID);

nit: update the doc comment in .h file.



Comment at: clang-tools-extra/pseudo/lib/grammar/Grammar.cpp:64
   const auto &R = lookupRule(RID);
-  std::string MangleName = mangleSymbol(R.Target);
-  for (size_t I = 0; I < R.seq().size(); ++I)
-MangleName += llvm::formatv("_{0}{1}", I, mangleSymbol(R.seq()[I]));
+  std::string MangleName;
+  for (size_t I = 0; I < R.seq().size(); ++I) {

nit: we don't allow nullable nonterminals, so the RHS should never be empty. we 
could just initialize `MangleName` with `mangleSymbol(R.seq()[0])`, and iterate 
from 1 (then the trailing two `pop_back` is not needed).




Comment at: clang-tools-extra/pseudo/lib/grammar/Grammar.cpp:65
+  std::string MangleName;
+  for (size_t I = 0; I < R.seq().size(); ++I) {
+MangleName += mangleSymbol(R.seq()[I]);

ok, now we're dropping the index for all RHS symbols. Just want to know your 
thought about this. Personally, I'd like to keep this information 
(`noptr_declarator__l_square__constant_expression__r_square` vs 
`noptr_declarator0_l_square1_constant_expression2_r_square3`) though it makes 
the name a bit uglier.

> Change mangling of keywords to ALL_CAPS (needed to turn keywords that appear 
> alone on RHS into valid identifiers)

if we add index number to the name, then this change is not required.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130414/new/

https://reviews.llvm.org/D130414

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D130414: [pseudo] Reorganize CXX.h enums

2022-07-23 Thread Sam McCall via Phabricator via cfe-commits
sammccall created this revision.
sammccall added reviewers: hokein, usaxena95.
Herald added a project: All.
sammccall requested review of this revision.
Herald added subscribers: cfe-commits, alextsao1999.
Herald added a project: clang-tools-extra.

- Place rules under rule::lhs::rhs__rhs__rhs
- Change mangling of keywords to ALL_CAPS (needed to turn keywords that appear 
alone on RHS into valid identifiers)
- Make enums implicitly convertible to underlying type (though still scoped, 
using alias tricks)

In principle this lets us exhaustively write a switch over all rules of a NT:

  switch ((rule::declarator)N->rule()) {
case rule::declarator::noptr_declarator:
  ...
  }

In practice we don't do this anywhere yet as we're often switching over multiple
nonterminal kinds at once.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D130414

Files:
  clang-tools-extra/pseudo/gen/Main.cpp
  clang-tools-extra/pseudo/include/clang-pseudo/cxx/CXX.h
  clang-tools-extra/pseudo/lib/cxx/CXX.cpp
  clang-tools-extra/pseudo/lib/grammar/Grammar.cpp
  clang-tools-extra/pseudo/unittests/GrammarTest.cpp

Index: clang-tools-extra/pseudo/unittests/GrammarTest.cpp
===
--- clang-tools-extra/pseudo/unittests/GrammarTest.cpp
+++ clang-tools-extra/pseudo/unittests/GrammarTest.cpp
@@ -113,15 +113,16 @@
   build(R"bnf(
 _ := declaration
 
-declaration := ptr-declarator ;
+declaration := cv-qualifier ptr-declarator ;
 ptr-declarator := * IDENTIFIER
+cv-qualifier := CONST
 
   )bnf");
   ASSERT_TRUE(Diags.empty());
   EXPECT_EQ(G.mangleRule(ruleFor("declaration")),
-"declaration_0ptr_declarator_1semi");
-  EXPECT_EQ(G.mangleRule(ruleFor("ptr-declarator")),
-"ptr_declarator_0star_1identifier");
+"cv_qualifier__ptr_declarator__semi");
+  EXPECT_EQ(G.mangleRule(ruleFor("ptr-declarator")), "star__identifier");
+  EXPECT_EQ(G.mangleRule(ruleFor("cv-qualifier")), "CONST");
 }
 
 TEST_F(GrammarTest, Diagnostics) {
Index: clang-tools-extra/pseudo/lib/grammar/Grammar.cpp
===
--- clang-tools-extra/pseudo/lib/grammar/Grammar.cpp
+++ clang-tools-extra/pseudo/lib/grammar/Grammar.cpp
@@ -46,11 +46,11 @@
 }
 
 std::string Grammar::mangleSymbol(SymbolID SID) const {
-  static const char *const TokNames[] = {
+  static std::string *TokNames = new std::string[]{
 #define TOK(X) #X,
-#define KEYWORD(X, Y) #X,
+#define KEYWORD(Keyword, Condition) llvm::StringRef(#Keyword).upper(),
 #include "clang/Basic/TokenKinds.def"
-  nullptr};
+  };
   if (clang::pseudo::isToken(SID))
 return TokNames[clang::pseudo::symbolToToken(SID)];
   std::string Name = symbolName(SID).str();
@@ -61,9 +61,13 @@
 
 std::string Grammar::mangleRule(RuleID RID) const {
   const auto &R = lookupRule(RID);
-  std::string MangleName = mangleSymbol(R.Target);
-  for (size_t I = 0; I < R.seq().size(); ++I)
-MangleName += llvm::formatv("_{0}{1}", I, mangleSymbol(R.seq()[I]));
+  std::string MangleName;
+  for (size_t I = 0; I < R.seq().size(); ++I) {
+MangleName += mangleSymbol(R.seq()[I]);
+MangleName.append("__");
+  }
+  MangleName.pop_back();
+  MangleName.pop_back();
   return MangleName;
 }
 
Index: clang-tools-extra/pseudo/lib/cxx/CXX.cpp
===
--- clang-tools-extra/pseudo/lib/cxx/CXX.cpp
+++ clang-tools-extra/pseudo/lib/cxx/CXX.cpp
@@ -111,40 +111,39 @@
 bool isFunctionDeclarator(const ForestNode *Declarator) {
   assert(Declarator->symbol() == (SymbolID)(cxx::Symbol::declarator));
   bool IsFunction = false;
-  using cxx::Rule;
   while (true) {
 // not well-formed code, return the best guess.
 if (Declarator->kind() != ForestNode::Sequence)
   return IsFunction;
 
-switch ((cxx::Rule)Declarator->rule()) {
-case Rule::noptr_declarator_0declarator_id: // reached the bottom
+switch (Declarator->rule()) {
+case rule::noptr_declarator::declarator_id: // reached the bottom
   return IsFunction;
 // *X is a nonfunction (unless X is a function).
-case Rule::ptr_declarator_0ptr_operator_1ptr_declarator:
+case rule::ptr_declarator::ptr_operator__ptr_declarator:
   Declarator = Declarator->elements()[1];
   IsFunction = false;
   continue;
 // X() is a function (unless X is a pointer or similar).
-case Rule::
-declarator_0noptr_declarator_1parameters_and_qualifiers_2trailing_return_type:
-case Rule::noptr_declarator_0noptr_declarator_1parameters_and_qualifiers:
+case rule::declarator::
+noptr_declarator__parameters_and_qualifiers__trailing_return_type:
+case rule::noptr_declarator::noptr_declarator__parameters_and_qualifiers:
   Declarator = Declarator->elements()[0];
   IsFunction = true;
   continue;
 // X[] is an array (unless X is a pointer or function).
-case Rule::
-noptr_decla