[PATCH] D41102: Setup clang-doc frontend framework

2018-03-20 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett added a comment.

After much digging, it looks like the lit config is never initialized in 
clang-tools-extra like it is in the other projects. REQUIRES et al. work 
properly once that's in there (see D44708). 
Once that lands I'll reland this and *hopefully* that'll be that!


Repository:
  rL LLVM

https://reviews.llvm.org/D41102



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D41102: Setup clang-doc frontend framework

2018-03-19 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

Huh, something weird is going on there.
What about the other way around, `REQUIRES: linux` ?


Repository:
  rL LLVM

https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-03-19 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett added a comment.

In https://reviews.llvm.org/D41102#1041791, @lebedev.ri wrote:

> Have you tried something more broad, like
>  `UNSUPPORTED: mingw32,win32`
>  ?


That wasn't working either, confusingly, at least on the local windows machine 
I have.


Repository:
  rL LLVM

https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-03-19 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

In https://reviews.llvm.org/D41102#1041773, @juliehockett wrote:

> I was just thinking of disabling the one test that has an issue 
> (class-in-function) on Windows -- the filename is only used in generating 
> *some* USRs, so all of the other ones are fine. We ran into some issues with 
> that though, since `UNSUPPORTED: system-windows` didn't seem to disable the 
> test on the machine I have access to. Thoughts?




> `UNSUPPORTED: system-windows`

Perhaps that is only for msvc?

Have you tried something more broad, like
`UNSUPPORTED: mingw32,win32`
?


Repository:
  rL LLVM

https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-03-19 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett added a comment.

I was just thinking of disabling the one test that has an issue 
(class-in-function) on Windows -- the filename is only used in generating 
*some* USRs, so all of the other ones are fine. We ran into some issues with 
that though, since `UNSUPPORTED: system-windows` didn't seem to disable the 
test on the machine I have access to. Thoughts?


Repository:
  rL LLVM

https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-03-18 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

Hm, or possibly you could just pass the triple to clang?


Repository:
  rL LLVM

https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-03-16 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

So what part is failing, specifically?
The SHA1 blobs of USRs differ in the llvm-bcanalyzer dumps?
The actual filenames `%t/docs/bc/` differ?
I guess both?

First one you should be able to handle by replacing the actual values with a 
regex
(i'd guess ` op19=226 op20=232/>` -> ``, but did not try)
I'm not sure we care about the actual values here, do we?

Second one is interesting.
If we assume that the order in which those are generated is the same, which i 
think is a safer assumption,
then you could just use result id, not key (sha1-to-text of USR), i.e. 
`%t/docs/bc/00.bc`, `%t/docs/bc/01.bc` and so on.
I.e. //something// like:

  if (DumpMapperResult) {
+   unsigned id = 0;
    Exec->get()->getToolResults()->forEachResult([&](StringRef Key,
                                                     StringRef Value) {
      SmallString<128> IRRootPath;
      llvm::sys::path::native(OutDirectory, IRRootPath);
      llvm::sys::path::append(IRRootPath, "bc");
      std::error_code DirectoryStatus =
          llvm::sys::fs::create_directories(IRRootPath);
      if (DirectoryStatus != OK) {
        llvm::errs() << "Unable to create documentation directories.\n";
        return;
      }
-     llvm::sys::path::append(IRRootPath, Key + ".bc");
+     llvm::sys::path::append(IRRootPath, std::to_string(id) + ".bc");
      std::error_code OutErrorInfo;
      llvm::raw_fd_ostream OS(IRRootPath, OutErrorInfo,
                              llvm::sys::fs::F_None);
      if (OutErrorInfo != OK) {
        llvm::errs() << "Error opening documentation file.\n";
        return;
      }
      OS << Value;
      OS.close();
+     id++;
    });
  }
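
A minimal standalone sketch of the index-based naming, assuming the zero-padded
form mentioned in the prose (`00.bc`, `01.bc`) is wanted; the diff above uses
plain `std::to_string(id)`, which would give `0.bc`. The helper names
`indexedBitcodeName` and `nameResults` are hypothetical and not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: build "<index>.bc", left-padding the index with zeros
// up to Width digits, so index 0 becomes "00.bc" and index 1 becomes "01.bc".
std::string indexedBitcodeName(std::size_t Index, std::size_t Width = 2) {
  std::string Digits = std::to_string(Index);
  if (Digits.size() < Width)
    Digits.insert(0, Width - Digits.size(), '0');
  return Digits + ".bc";
}

// Name every result by its position in the result list instead of by its key.
// This is only stable if the results are always produced in the same order.
std::vector<std::string> nameResults(const std::vector<std::string> &Keys) {
  std::vector<std::string> Names;
  Names.reserve(Keys.size());
  for (std::size_t I = 0; I < Keys.size(); ++I)
    Names.push_back(indexedBitcodeName(I));
  assert(Names.size() == Keys.size());
  return Names;
}
```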


Repository:
  rL LLVM

https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-03-14 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added inline comments.



Comment at: clang-doc/BitcodeWriter.cpp:230
+  prepRecordData(ID);
+  for (const char C : RecordIdNameMap[ID].Name) Record.push_back(C);
+  Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_SETRECORDNAME, Record);

lebedev.ri wrote:
> Sadly, i can **not** prove it via godbolt (can't add LLVM as library), but 
> i'd //expect// streamlining this should at least not hurt, i.e. something like
> ```
> Record.append(RecordIdNameMap[ID].Name.begin(), 
> RecordIdNameMap[ID].Name.end());
> ```
> ?
And https://github.com/mattgodbolt/compiler-explorer/issues/841 is done,
so now we can see that `SmallVector::append()` at least results in less code:
https://godbolt.org/g/xJQ59c
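
For readers without LLVM at hand, the same comparison can be sketched with
`std::vector` standing in for `SmallVector` (that substitution is an
assumption; `RecordData`, `pushName` and `appendName` are illustrative names
only). Both functions leave the record in the same state; the range form is a
single call.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Stand-in for the record buffer; the real code uses an LLVM SmallVector.
using RecordData = std::vector<std::uint64_t>;

// Element-by-element version, shaped like the loop in the patch under review.
void pushName(RecordData &Record, const std::string &Name) {
  for (const char C : Name)
    Record.push_back(C);
}

// Range version: one call that appends the whole [begin, end) range.
void appendName(RecordData &Record, const std::string &Name) {
  Record.insert(Record.end(), Name.begin(), Name.end());
}
```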


Repository:
  rL LLVM

https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-03-12 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett added a comment.

In https://reviews.llvm.org/D41102#1034919, @lebedev.ri wrote:

> Since the commit was reverted, did you mean to either recommit it, or reopen 
> this (with updated diff), so it does not get lost?


Relanded in r327295.




Comment at: clang-doc/BitcodeWriter.h:154
+  ClangDocBitcodeWriter(llvm::BitstreamWriter &Stream,
+bool OmitFilenames = false)
+  : Stream(Stream), OmitFilenames(OmitFilenames) {

sammccall wrote:
> Hmm, you spend a lot of effort plumbing this variable around! Why is it so 
> important?
> Filesize? (I'm not that familiar with LLVM bitcode, but surely we'll end up 
> with a string table anyway?)
> 
> If it really is an important option people will want, the command-line arg 
> should probably say why.
It was for testing purposes (so that the tests aren't flaky on filenames), but 
I replaced it with regex.



Comment at: clang-doc/BitcodeWriter.h:241
+/// \param I The info to emit to bitcode.
+template <typename T> void ClangDocBitcodeWriter::emitBlock(const T &I) {
+  StreamSubBlockGuard Block(Stream, MapFromInfoToBlockId<T>::ID);

lebedev.ri wrote:
> sammccall wrote:
> > OK, I don't get this at all.
> > 
> > We have to declare emitBlockContent(NamespaceInfo) *and* the specialization 
> > of MapFromInfoToBlockId, and deal with the public interface 
> > emitBlock being a template function where you can't tell what's legal to 
> > pass, instead of writing:
> > 
> > ```void emitBlock(const NamespaceInfo &I) {
> >   SubStreamBlockGuard Block(Stream, BI_NAMESPACE_BLOCK_ID); // <-- this one 
> > line
> >   ...
> > }```
> > 
> > This really seems like templates for the sake of templates :(
> If you want to add a new block, in one case you just need to add one
> ```
> template <> struct MapFromInfoToBlockId<???Info> {
>   static const BlockId ID = BI_???_BLOCK_ID;
> };
> ```
> In the other case you need to add whole
> ```
> void ClangDocBitcodeWriter::emitBlock(const ???Info &I) {
>   StreamSubBlockGuard Block(Stream, BI_???_BLOCK_ID);
>   emitBlockContent(I);
> }
> ```
> (and it was even longer initially)
> It seems just templating one static variable is shorter than duplicating 
> `emitBlock()` each time, no?
> 
> Do compare the current diff with the original diff state.
> I *think* these templates helped move much of the duplication to simplify the 
> code overall.
You'd still have to add the appropriate `emitBlock()` function for any new 
block, since it would have different attributes. 



Comment at: clang-doc/Mapper.cpp:33
+  ECtx->reportResult(llvm::toHex(llvm::toStringRef(serialize::hashUSR(USR))),
+ serialize::emitInfo(D, getComment(D, D->getASTContext()),
+ getLine(D, D->getASTContext()),

sammccall wrote:
> It seems a bit of a poor fit to use a complete bitcode file (header, version, 
> block info) as your value format when you know the format, and know there'll 
> be no version skew.
> Is it easy just to emit the block we care about?
Ideally, yes, but right now in the clang BitstreamWriter there's no way to tell 
the instance what all the abbreviations are without also emitting the blockinfo 
to the output stream, though I'm thinking about taking a stab at separating the 
two. 

Also, this relies on the llvm-bcanalyzer for testing, which requires both the 
header and the blockinfo in order to read the data :/


Repository:
  rL LLVM

https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-03-12 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

Since the commit was reverted, did you mean to either recommit it, or reopen 
this (with updated diff), so it does not get lost?


Repository:
  rL LLVM

https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-03-09 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

Might have been better to not start landing until all the differentials are 
understood/accepted, but i understand that it is not really up to me to decide.
Let's hope nothing in the next differentials will require changes to this 
initial code :)




Comment at: clang-doc/BitcodeWriter.h:241
+/// \param I The info to emit to bitcode.
+template <typename T> void ClangDocBitcodeWriter::emitBlock(const T &I) {
+  StreamSubBlockGuard Block(Stream, MapFromInfoToBlockId<T>::ID);

sammccall wrote:
> OK, I don't get this at all.
> 
> We have to declare emitBlockContent(NamespaceInfo) *and* the specialization 
> of MapFromInfoToBlockId, and deal with the public interface 
> emitBlock being a template function where you can't tell what's legal to 
> pass, instead of writing:
> 
> ```void emitBlock(const NamespaceInfo &I) {
>   SubStreamBlockGuard Block(Stream, BI_NAMESPACE_BLOCK_ID); // <-- this one 
> line
>   ...
> }```
> 
> This really seems like templates for the sake of templates :(
If you want to add a new block, in one case you just need to add one
```
template <> struct MapFromInfoToBlockId<???Info> {
  static const BlockId ID = BI_???_BLOCK_ID;
};
```
In the other case you need to add whole
```
void ClangDocBitcodeWriter::emitBlock(const ???Info &I) {
  StreamSubBlockGuard Block(Stream, BI_???_BLOCK_ID);
  emitBlockContent(I);
}
```
(and it was even longer initially)
It seems just templating one static variable is shorter than duplicating 
`emitBlock()` each time, no?

Do compare the current diff with the original diff state.
I *think* these templates helped move much of the duplication to simplify the 
code overall.
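
To make the trade-off concrete, here is a self-contained sketch of the trait
approach, under simplifying assumptions: `Writer` is a toy stand-in for
`ClangDocBitcodeWriter`, the info structs are reduced to a name, and the
vectors only record what a real writer would emit. With this shape, a new
block type needs one trait specialization plus its own `emitBlockContent()`
overload, while `emitBlock()` itself is written once.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum BlockId { BI_NAMESPACE_BLOCK_ID, BI_RECORD_BLOCK_ID };

// Simplified, hypothetical info types.
struct NamespaceInfo { std::string Name; };
struct RecordInfo { std::string Name; };

// The primary template is only declared; each info type supplies its own ID.
template <typename T> struct MapFromInfoToBlockId;
template <> struct MapFromInfoToBlockId<NamespaceInfo> {
  static constexpr BlockId ID = BI_NAMESPACE_BLOCK_ID;
};
template <> struct MapFromInfoToBlockId<RecordInfo> {
  static constexpr BlockId ID = BI_RECORD_BLOCK_ID;
};

class Writer {
public:
  // One generic entry point: look up the block ID through the trait, then
  // dispatch to the matching emitBlockContent() overload.
  template <typename T> void emitBlock(const T &I) {
    const BlockId Id = MapFromInfoToBlockId<T>::ID;
    Blocks.push_back(Id);
    emitBlockContent(I);
  }

  std::vector<BlockId> Blocks;       // block IDs entered, in order
  std::vector<std::string> Contents; // stand-in for the emitted records

private:
  void emitBlockContent(const NamespaceInfo &I) {
    Contents.push_back("namespace " + I.Name);
  }
  void emitBlockContent(const RecordInfo &I) {
    Contents.push_back("record " + I.Name);
  }
};
```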


Repository:
  rL LLVM

https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-03-08 Thread Julie Hockett via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
juliehockett marked 11 inline comments as done.
Closed by commit rL327102: [clang-doc] Setup clang-doc frontend framework 
(authored by juliehockett, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D41102?vs=137457&id=137689#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D41102

Files:
  clang-tools-extra/trunk/CMakeLists.txt
  clang-tools-extra/trunk/clang-doc/BitcodeWriter.cpp
  clang-tools-extra/trunk/clang-doc/BitcodeWriter.h
  clang-tools-extra/trunk/clang-doc/CMakeLists.txt
  clang-tools-extra/trunk/clang-doc/ClangDoc.cpp
  clang-tools-extra/trunk/clang-doc/ClangDoc.h
  clang-tools-extra/trunk/clang-doc/Mapper.cpp
  clang-tools-extra/trunk/clang-doc/Mapper.h
  clang-tools-extra/trunk/clang-doc/Representation.h
  clang-tools-extra/trunk/clang-doc/Serialize.cpp
  clang-tools-extra/trunk/clang-doc/Serialize.h
  clang-tools-extra/trunk/clang-doc/tool/CMakeLists.txt
  clang-tools-extra/trunk/clang-doc/tool/ClangDocMain.cpp
  clang-tools-extra/trunk/docs/clang-doc.rst
  clang-tools-extra/trunk/test/CMakeLists.txt
  clang-tools-extra/trunk/test/clang-doc/mapper-class-in-class.cpp
  clang-tools-extra/trunk/test/clang-doc/mapper-class-in-function.cpp
  clang-tools-extra/trunk/test/clang-doc/mapper-class.cpp
  clang-tools-extra/trunk/test/clang-doc/mapper-comments.cpp
  clang-tools-extra/trunk/test/clang-doc/mapper-enum.cpp
  clang-tools-extra/trunk/test/clang-doc/mapper-function.cpp
  clang-tools-extra/trunk/test/clang-doc/mapper-method.cpp
  clang-tools-extra/trunk/test/clang-doc/mapper-namespace.cpp
  clang-tools-extra/trunk/test/clang-doc/mapper-struct.cpp
  clang-tools-extra/trunk/test/clang-doc/mapper-union.cpp

Index: clang-tools-extra/trunk/test/clang-doc/mapper-namespace.cpp
===
--- clang-tools-extra/trunk/test/clang-doc/mapper-namespace.cpp
+++ clang-tools-extra/trunk/test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,17 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump-mapper -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/bc/8D042EFFC98B373450BC6B5B90A330C25A150E9C.bc --dump | FileCheck %s
+
+namespace A {}
+
+// CHECK: 
+// CHECK-NEXT: 
+  // CHECK-NEXT: 
+// CHECK-NEXT: 
+// CHECK-NEXT: 
+  // CHECK-NEXT: 
+  // CHECK-NEXT:  blob data = 'A'
+// CHECK-NEXT: 
Index: clang-tools-extra/trunk/test/clang-doc/mapper-class-in-function.cpp
===
--- clang-tools-extra/trunk/test/clang-doc/mapper-class-in-function.cpp
+++ clang-tools-extra/trunk/test/clang-doc/mapper-class-in-function.cpp
@@ -0,0 +1,38 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump-mapper -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/bc/B6AC4C5C9F2EA3F2B3ECE1A33D349F4EE502B24E.bc --dump | FileCheck %s --check-prefix CHECK-H
+// RUN: llvm-bcanalyzer %t/docs/bc/E03E804368784360D86C757B549D14BB84A94415.bc --dump | FileCheck %s --check-prefix CHECK-H-I
+
+void H() {
+  class I {};
+}
+
+// CHECK-H: 
+// CHECK-H-NEXT: 
+  // CHECK-H-NEXT: 
+// CHECK-H-NEXT: 
+// CHECK-H-NEXT: 
+  // CHECK-H-NEXT: 
+  // CHECK-H-NEXT:  blob data = 'H'
+  // CHECK-H-NEXT:  blob data = '{{.*}}'
+  // CHECK-H-NEXT: 
+// CHECK-H-NEXT:  blob data = 'void'
+  // CHECK-H-NEXT: 
+// CHECK-H-NEXT: 
+
+// CHECK-H-I: 
+// CHECK-H-I-NEXT: 
+  // CHECK-H-I-NEXT: 
+// CHECK-H-I-NEXT: 
+// CHECK-H-I-NEXT: 
+  // CHECK-H-I-NEXT: 
+  // CHECK-H-I-NEXT:  blob data = 'I'
+  // CHECK-H-I-NEXT:  blob data = 'B6AC4C5C9F2EA3F2B3ECE1A33D349F4EE502B24E'
+  // CHECK-H-I-NEXT:  blob data = '{{.*}}'
+  // CHECK-H-I-NEXT: 
+// CHECK-H-I-NEXT: 
+
+
Index: clang-tools-extra/trunk/test/clang-doc/mapper-class-in-class.cpp
===
--- clang-tools-extra/trunk/test/clang-doc/mapper-class-in-class.cpp
+++ clang-tools-extra/trunk/test/clang-doc/mapper-class-in-class.cpp
@@ -0,0 +1,35 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump-mapper -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/bc/641AB4A3D36399954ACDE29C7A8833032BF40472.bc --dump | FileCheck %s --check-prefix CHECK-X-Y
+// RUN: llvm-bcanalyzer %t/docs/bc/CA7C7935730B5EACD25F080E9C83FA087CCDC75E.bc --dump | FileCheck %s --check-prefix CHECK-X
+
+class X {
+  class Y {};
+};
+
+// CHECK-X: 
+// CHECK-X-NEXT: 
+  // CHECK-X-NEXT: 
+// CHECK-X-NEXT: 
+// CHECK-X-NEXT: 
+  // CHECK-X-NEXT: 
+  // CHECK-X-NEXT:  blob data = 'X'
+  // CHECK-X-NEXT:  blob data = '{{.*}}'
+  // CHECK-X-NEXT: 
+// CHECK-X-NEXT: 
+
+
+// CHECK-X-Y: 
+// CHECK-X-Y-NEXT: 
+  // CHECK-X-Y-NEXT: 
+// 

[PATCH] D41102: Setup clang-doc frontend framework

2018-03-08 Thread Sam McCall via Phabricator via cfe-commits
sammccall accepted this revision.
sammccall added a comment.
This revision is now accepted and ready to land.

There are a few places where we can trim some of the boilerplate, which I think 
is important - it's hard to find the "real code" among all the plumbing in 
places.
Other than that, this seems OK to me.




Comment at: clang-doc/BitcodeWriter.h:116
+template <typename T> struct MapFromInfoToBlockId {
+  static const BlockId ID;
+};

I think you don't want to declare ID in the unspecialized template, so you get 
a compile error if you try to use it.

(Using traits for this sort of thing seems a bit overboard to me, but YMMV)
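
A small sketch of what leaving `ID` out of the primary template buys, assuming
simplified stand-in types (`UnmappedInfo` and `HasBlockId` are hypothetical
and exist only for this illustration). Any use of
`MapFromInfoToBlockId<UnmappedInfo>::ID` would fail to compile; the detector
below just makes that property checkable without triggering the error.

```cpp
#include <cassert>
#include <type_traits>

enum BlockId { BI_NAMESPACE_BLOCK_ID };

struct NamespaceInfo {};
struct UnmappedInfo {}; // hypothetical type that has no specialization

// The primary template deliberately has no ID member.
template <typename T> struct MapFromInfoToBlockId {};
template <> struct MapFromInfoToBlockId<NamespaceInfo> {
  static constexpr BlockId ID = BI_NAMESPACE_BLOCK_ID;
};

// Detects at compile time whether MapFromInfoToBlockId<T>::ID exists.
template <typename T, typename = void>
struct HasBlockId : std::false_type {};
template <typename T>
struct HasBlockId<T, decltype(void(MapFromInfoToBlockId<T>::ID))>
    : std::true_type {};

static_assert(HasBlockId<NamespaceInfo>::value, "NamespaceInfo is mapped");
static_assert(!HasBlockId<UnmappedInfo>::value, "UnmappedInfo is not mapped");
```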



Comment at: clang-doc/BitcodeWriter.h:154
+  ClangDocBitcodeWriter(llvm::BitstreamWriter &Stream,
+bool OmitFilenames = false)
+  : Stream(Stream), OmitFilenames(OmitFilenames) {

Hmm, you spend a lot of effort plumbing this variable around! Why is it so 
important?
Filesize? (I'm not that familiar with LLVM bitcode, but surely we'll end up 
with a string table anyway?)

If it really is an important option people will want, the command-line arg 
should probably say why.



Comment at: clang-doc/BitcodeWriter.h:241
+/// \param I The info to emit to bitcode.
+template <typename T> void ClangDocBitcodeWriter::emitBlock(const T &I) {
+  StreamSubBlockGuard Block(Stream, MapFromInfoToBlockId<T>::ID);

OK, I don't get this at all.

We have to declare emitBlockContent(NamespaceInfo) *and* the specialization of 
MapFromInfoToBlockId, and deal with the public interface 
emitBlock being a template function where you can't tell what's legal to pass, 
instead of writing:

```void emitBlock(const NamespaceInfo &I) {
  SubStreamBlockGuard Block(Stream, BI_NAMESPACE_BLOCK_ID); // <-- this one line
  ...
}```

This really seems like templates for the sake of templates :(



Comment at: clang-doc/ClangDoc.h:10
+//
+// This file implements the main entry point for the clang-doc tool. It runs
+// the clang-doc mapper on a given set of source code files using a

This comment doesn't seem accurate - there's no main() in this file.
There's a FrontendActionFactory, but nothing in this file uses it.



Comment at: clang-doc/ClangDoc.h:37
+
+  clang::FrontendAction *create() override {
+class ClangDocConsumer : public clang::ASTConsumer {

nit: seems odd to put all this implementation in the header.
(personally I'd just expose a function returning 
unique_ptr from the header, but up to you...)



Comment at: clang-doc/ClangDoc.h:38
+  clang::FrontendAction *create() override {
+class ClangDocConsumer : public clang::ASTConsumer {
+public:

for ASTConsumers implemented by ASTVisitors, there seems a fairly strong 
convention to just make the same class extend both (MapASTVisitor, here).
That would eliminate one plumbing class...



Comment at: clang-doc/Mapper.cpp:33
+  ECtx->reportResult(llvm::toHex(llvm::toStringRef(serialize::hashUSR(USR))),
+ serialize::emitInfo(D, getComment(D, D->getASTContext()),
+ getLine(D, D->getASTContext()),

It seems a bit of a poor fit to use a complete bitcode file (header, version, 
block info) as your value format when you know the format, and know there'll be 
no version skew.
Is it easy just to emit the block we care about?



Comment at: clang-doc/Representation.h:29
+
+using USRString = std::array<uint8_t, 20>;
+

lebedev.ri wrote:
> Right, of course, internally this is kept in the binary format, which is just 
> 20 chars.
> This is not the string (the hex-ified version of sha1), but the raw sha1, the 
> binary.
> This should somehow convey that. This should be something closer to `USRSha1`.
I'm not sure that any of the implementation (either USR or SHA) belongs in the 
type name.
In clangd we called this type SymbolID, which seems like a reasonable name here 
too.



Comment at: clang-doc/Representation.h:44
+  CommentInfo(CommentInfo &&Other) : Children(std::move(Other.Children)) {}
+  SmallString<16> Kind;
+  SmallString<64> Text;

this is probably the right place to document these fields - what are the legal 
kinds? what's the name of a comment, direction, etc?


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-03-07 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

Hmm, i'm missing something about the way we store sha1...




Comment at: clang-doc/BitcodeWriter.cpp:53
+{// 0. Fixed-size integer (length of the sha1'd USR)
+ llvm::BitCodeAbbrevOp(llvm::BitCodeAbbrevOp::VBR,
+   BitCodeConstants::USRLengthSize),

This is VBR because USRLengthSize is of such strange size, to conserve the bits?



Comment at: clang-doc/BitcodeWriter.cpp:57
+ llvm::BitCodeAbbrevOp(llvm::BitCodeAbbrevOp::Array),
+ llvm::BitCodeAbbrevOp(llvm::BitCodeAbbrevOp::Char6)});
+}

Looking at the `NumWords` changes (decrease!) in the tests, and this is bugging 
me.
And now that i have realized what we do with USR:
* we first compute SHA1, and get 20x uint8_t
* store/use it internally
* then hex-ify it, getting 40x char (assuming 8-bit char)
* then convert to char6, winning back two bits. but we still lose 2 bits.

Question: *why* do we store sha1 of USR as a string? 
Why can't we just store that USRString (aka USRSha1 binary) directly?
That would be just 20 bytes, you just couldn't go any lower than that.
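
The arithmetic behind that question, as a rough standalone sketch: it counts
payload bits only and ignores length prefixes and alignment, and the function
names are made up for the illustration. A hex digit carries 4 bits of
information, so storing it in a 6-bit char6 slot still spends 2 extra bits per
character.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

constexpr std::size_t Sha1Bytes = 20;

// Hex-encode a raw digest: two characters per byte.
std::string toHex(const std::array<std::uint8_t, Sha1Bytes> &Digest) {
  static const char Digits[] = "0123456789ABCDEF";
  std::string Out;
  Out.reserve(Digest.size() * 2);
  for (std::uint8_t Byte : Digest) {
    Out.push_back(Digits[Byte >> 4]);
    Out.push_back(Digits[Byte & 0x0F]);
  }
  return Out;
}

// Payload bits for each representation of the same digest.
constexpr std::size_t rawBits() { return Sha1Bytes * 8; }            // 20 bytes
constexpr std::size_t hexAsBytesBits() { return Sha1Bytes * 2 * 8; } // 40 chars, 8 bits each
constexpr std::size_t hexAsChar6Bits() { return Sha1Bytes * 2 * 6; } // 40 chars, 6 bits each
```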



Comment at: clang-doc/Representation.h:29
+
+using USRString = std::array<uint8_t, 20>;
+

Right, of course, internally this is kept in the binary format, which is just 
20 chars.
This is not the string (the hex-ified version of sha1), but the raw sha1, the 
binary.
This should somehow convey that. This should be something closer to `USRSha1`.


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-03-07 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 137457.
juliehockett marked 13 inline comments as done.
juliehockett added a comment.

Updating bitcode writer for hashed USRs, and re-running clang-format. Also 
cleaning up a couple of unused fields.


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/BitcodeWriter.cpp
  clang-doc/BitcodeWriter.h
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.h
  clang-doc/Mapper.cpp
  clang-doc/Mapper.h
  clang-doc/Representation.h
  clang-doc/Serialize.cpp
  clang-doc/Serialize.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst
  test/CMakeLists.txt
  test/clang-doc/mapper-class-in-class.cpp
  test/clang-doc/mapper-class-in-function.cpp
  test/clang-doc/mapper-class.cpp
  test/clang-doc/mapper-comments.cpp
  test/clang-doc/mapper-enum.cpp
  test/clang-doc/mapper-function.cpp
  test/clang-doc/mapper-method.cpp
  test/clang-doc/mapper-namespace.cpp
  test/clang-doc/mapper-struct.cpp
  test/clang-doc/mapper-union.cpp

Index: test/clang-doc/mapper-union.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-union.cpp
@@ -0,0 +1,28 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump-mapper --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/bc/0B8A6B938B939B77C63258AA3E938BF9E2E8.bc --dump | FileCheck %s
+
+union D { int X; int Y; };
+
+// CHECK: 
+// CHECK-NEXT: 
+  // CHECK-NEXT: 
+// CHECK-NEXT: 
+// CHECK-NEXT: 
+  // CHECK-NEXT:  record string = '0B8A6B938B939B77C63258AA3E938BF9E2E8'
+  // CHECK-NEXT:  blob data = 'D'
+  // CHECK-NEXT: 
+  // CHECK-NEXT: 
+// CHECK-NEXT:  blob data = 'int'
+// CHECK-NEXT:  blob data = 'D::X'
+// CHECK-NEXT: 
+  // CHECK-NEXT: 
+  // CHECK-NEXT: 
+// CHECK-NEXT:  blob data = 'int'
+// CHECK-NEXT:  blob data = 'D::Y'
+// CHECK-NEXT: 
+  // CHECK-NEXT: 
+// CHECK-NEXT: 
Index: test/clang-doc/mapper-struct.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-struct.cpp
@@ -0,0 +1,22 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump-mapper --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/bc/06B5F6A19BA9F6A832E127C9968282B94619B210.bc --dump | FileCheck %s
+
+struct C { int i; };
+
+// CHECK: 
+// CHECK-NEXT: 
+  // CHECK-NEXT: 
+// CHECK-NEXT: 
+// CHECK-NEXT: 
+  // CHECK-NEXT:  record string = '06B5F6A19BA9F6A832E127C9968282B94619B210'
+  // CHECK-NEXT:  blob data = 'C'
+  // CHECK-NEXT: 
+// CHECK-NEXT:  blob data = 'int'
+// CHECK-NEXT:  blob data = 'C::i'
+// CHECK-NEXT: 
+  // CHECK-NEXT: 
+// CHECK-NEXT: 
Index: test/clang-doc/mapper-namespace.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,17 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump-mapper --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/bc/8D042EFFC98B373450BC6B5B90A330C25A150E9C.bc --dump | FileCheck %s
+
+namespace A {}
+
+// CHECK: 
+// CHECK-NEXT: 
+  // CHECK-NEXT: 
+// CHECK-NEXT: 
+// CHECK-NEXT: 
+  // CHECK-NEXT:  record string = '8D042EFFC98B373450BC6B5B90A330C25A150E9C'
+  // CHECK-NEXT:  blob data = 'A'
+// CHECK-NEXT: 
Index: test/clang-doc/mapper-method.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-method.cpp
@@ -0,0 +1,41 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump-mapper --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/bc/F0F9FC65FC90F54F690144A7AFB15DFC3D69B6E6.bc --dump | FileCheck %s --check-prefix CHECK-G-F
+// RUN: llvm-bcanalyzer %t/docs/bc/4202E8BF0ECB12AE354C8499C52725B0EE30AED5.bc --dump | FileCheck %s --check-prefix CHECK-G
+
+class G {
+public: 
+	int Method(int param) { return param; }
+};
+
+// CHECK-G: 
+// CHECK-G-NEXT: 
+  // CHECK-G-NEXT: 
+// CHECK-G-NEXT: 
+// CHECK-G-NEXT: 
+  // CHECK-G-NEXT:  record string = '4202E8BF0ECB12AE354C8499C52725B0EE30AED5'
+  // CHECK-G-NEXT:  blob data = 'G'
+  // CHECK-G-NEXT: 
+// CHECK-G-NEXT: 
+
+// CHECK-G-F: 
+// CHECK-G-F-NEXT: 
+  // CHECK-G-F-NEXT: 
+// CHECK-G-F-NEXT: 
+// CHECK-G-F-NEXT: 
+  // CHECK-G-F-NEXT:  record string = 'F0F9FC65FC90F54F690144A7AFB15DFC3D69B6E6'
+  // CHECK-G-F-NEXT:  blob data = 'Method'
+  // CHECK-G-F-NEXT:  blob data = '4202E8BF0ECB12AE354C8499C52725B0EE30AED5'
+  // CHECK-G-F-NEXT: 
+  // CHECK-G-F-NEXT:  blob data = '4202E8BF0ECB12AE354C8499C52725B0EE30AED5'
+  // CHECK-G-F-NEXT: 
+// 

[PATCH] D41102: Setup clang-doc frontend framework

2018-03-07 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

In https://reviews.llvm.org/D41102#1028995, @lebedev.ri wrote:

> Some further notes based on the SHA1 nature.


I'm sorry, brainfreeze, i meant `40` chars, not `20`.
Updated comments...




Comment at: clang-doc/BitcodeWriter.cpp:309
+  assert(Ref.USR.size() < (1U << BitCodeConstants::USRLengthSize));
+  Record.push_back(Ref.USR.size());
+  Stream.EmitRecordWithBlob(Abbrevs.get(ID), Record, Ref.USR);

lebedev.ri wrote:
> Now it would make sense to also assert that this sha1(usr).strlen() == 20
40 that is



Comment at: clang-doc/BitcodeWriter.h:46
+  static constexpr unsigned ReferenceTypeSize = 8U;
+  static constexpr unsigned USRLengthSize = 16U;
+};

lebedev.ri wrote:
> Can definitively lower this to `5U` (2^5 == 32, which is more than the 20 
> 8-bit chars of sha1)
Edit: to 6U (2^6 == 64, which is more than the 40 8-bit chars of sha1)
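
A quick way to double-check field widths like this one, sketched as a
standalone helper (`bitsNeeded` is a hypothetical name, not something in the
patch): it returns the smallest width `B` for which the value is below `2^B`,
which is the same condition the `assert` in `BitcodeWriter.cpp:309` tests.

```cpp
#include <cassert>
#include <cstddef>

// Smallest number of bits B such that Value < 2^B (0 needs 0 bits here).
constexpr unsigned bitsNeeded(std::size_t Value) {
  return Value == 0 ? 0u : 1u + bitsNeeded(Value >> 1);
}

// A 40-character hex SHA1 needs a 6-bit length field; 5 bits stop at 31.
static_assert(bitsNeeded(40) == 6, "length 40 fits in 6 bits");
static_assert(bitsNeeded(20) == 5, "length 20 fits in 5 bits");
static_assert(40 < (1u << 6), "same check as the assert in the writer");
```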



Comment at: clang-doc/Representation.h:59
+
+  SmallString<16> USR;
+  InfoType RefType = InfoType::IT_default;

lebedev.ri wrote:
> Now that USR is sha1'd, this is **always** 20 8-bit characters long.
40 that is



Comment at: clang-doc/Representation.h:107
+
+  SmallString<16> USR;
+  SmallString<16> Name;

lebedev.ri wrote:
> `20`
> Maybe place `using USRString = SmallString<20>; // SHA1 of USR` somewhere and 
> use it everywhere?
40


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-03-07 Thread Athos via Phabricator via cfe-commits
Athosvk added a comment.

In https://reviews.llvm.org/D41102#1028760, @juliehockett wrote:

> If you take a look at the follow-on patch to this (D43341), 
> you'll see that that is where the pointer 
> is added in (since it is irrelevant to the mapper portion, as it cannot be 
> filled out until the information has been reduced). The back references to 
> children and whatnot are also added there.


Oops! I'll have a look!

In https://reviews.llvm.org/D41102#1028760, @juliehockett wrote:

> The USRs are kept for serialization purposes -- given the modular nature of 
> the design, the goal is to be able to write out the bitstream and have it be 
> consumable with all necessary information. Since we can't write out pointers 
> (and it would be useless if we did, since they would change as soon as the 
> file was read in), we maintain the USRs to have a means of re-finding the 
> referenced declaration.


What I was referring to was the storing of a USR per reference. Of course, 
serializing pointers wouldn't work, but what I mean is that what we used as a 
USR was stored in what was pointed to, not in the reference that tells what we 
are pointing to. To be a little more concrete, a RecordInfo has pointers to the 
FunctionInfo for its member functions. Upon serialization, the RecordInfo 
queries the USR of those functions. A function that is referenced multiple 
times still has its USR stored only once. If I understand correctly, you 
currently save the USR each time an InfoType references another InfoType.
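
Roughly what I have in mind, as a simplified sketch rather than the layout of
either code base: the struct fields and `serializeMethodRefs` are hypothetical.
In memory, references are plain pointers and the USR lives only on the
declaration; it is read from the pointee at the moment of writing.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified info types: the USR is stored on the declaration.
struct FunctionInfo {
  std::string USR;
  std::string Name;
};

struct RecordInfo {
  std::string USR;
  std::vector<const FunctionInfo *> Methods; // in-memory references are pointers
};

// On serialization, each pointer is replaced by the USR read from its pointee.
std::vector<std::string> serializeMethodRefs(const RecordInfo &Record) {
  std::vector<std::string> Refs;
  Refs.reserve(Record.Methods.size());
  for (const FunctionInfo *Method : Record.Methods) {
    assert(Method && "null method reference");
    Refs.push_back(Method->USR);
  }
  return Refs;
}
```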

Anyhow, don't pay too much attention to that comment, it's all meant as a minor 
thing. It sure is looking good so far!


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-03-06 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

Nice!
Some further notes based on the SHA1 nature.




Comment at: clang-doc/BitcodeWriter.cpp:74
+  AbbrevGen(Abbrev,
+{// 0. Fixed-size integer (ref type)
+ llvm::BitCodeAbbrevOp(llvm::BitCodeAbbrevOp::Fixed,

Those are mixed up.
`USRLengthSize` is definitively supposed to be second.



Comment at: clang-doc/BitcodeWriter.cpp:81
+ // 2. The string blob
+ llvm::BitCodeAbbrevOp(llvm::BitCodeAbbrevOp::Blob)});
+}

The sha1 is all-printable, so how about using 
`BitCodeAbbrevOp::Encoding::Char6` ?
Char4 would work best, but it is not there.



Comment at: clang-doc/BitcodeWriter.cpp:149
+  {MEMBER_TYPE_ACCESS, {"Access", }},
+  {NAMESPACE_USR, {"USR", }},
+  {NAMESPACE_NAME, {"Name", }},

Ha, and all the `*_USR` are actually `StringAbbrev`'s, not confusing at all :)



Comment at: clang-doc/BitcodeWriter.cpp:309
+  assert(Ref.USR.size() < (1U << BitCodeConstants::USRLengthSize));
+  Record.push_back(Ref.USR.size());
+  Stream.EmitRecordWithBlob(Abbrevs.get(ID), Record, Ref.USR);

Now it would make sense to also assert that this sha1(usr).strlen() == 20



Comment at: clang-doc/BitcodeWriter.h:46
+  static constexpr unsigned ReferenceTypeSize = 8U;
+  static constexpr unsigned USRLengthSize = 16U;
+};

Can definitely lower this to `5U` (2^5 == 32, which is more than the 20 8-bit 
chars of sha1)
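
The width arithmetic can be checked mechanically. A small sketch, assuming C++14 or later for the `constexpr` loop; `bitsNeeded` and `fitsInBits` are hypothetical helper names, not part of the patch:

```cpp
#include <cstdint>

// Smallest number of bits whose unsigned range [0, 2^bits) contains Value.
constexpr unsigned bitsNeeded(std::uint32_t Value) {
  unsigned Bits = 0;
  while (Value != 0) {
    ++Bits;
    Value >>= 1;
  }
  return Bits == 0 ? 1 : Bits;  // even zero occupies one bit on the wire
}

// True if Value can be written into a fixed field of NumBits bits.
constexpr bool fitsInBits(std::uint32_t Value, unsigned NumBits) {
  return NumBits >= 32 || (Value >> NumBits) == 0;
}

// A length of 20 needs 5 bits; a 5-bit field tops out at 31.
static_assert(bitsNeeded(20) == 5, "20 fits in 5 bits");
static_assert(fitsInBits(20, 5) && !fitsInBits(32, 5), "5-bit range is 0..31");
```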



Comment at: clang-doc/Representation.h:59
+
+  SmallString<16> USR;
+  InfoType RefType = InfoType::IT_default;

Now that USR is sha1'd, this is **always** 20 8-bit characters long.



Comment at: clang-doc/Representation.h:107
+
+  SmallString<16> USR;
+  SmallString<16> Name;

`20`
Maybe place `using USRString = SmallString<20>; // SHA1 of USR` somewhere and 
use it everywhere?
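
As a rough illustration of the fixed-size idea, here is a sketch that uses `std::array` instead of `SmallString` so it stays self-contained. `USRDigest` and `toHex` are hypothetical names. It also shows why a 20-byte digest prints as 40 characters when rendered as hex, which would match the 40-character names used in most of the updated tests:

```cpp
#include <array>
#include <cstdint>
#include <string>

// Hypothetical alias: a SHA1 digest is always 20 raw bytes.
using USRDigest = std::array<std::uint8_t, 20>;

// Render a digest as uppercase hex: two characters per byte, 40 in total.
std::string toHex(const USRDigest &D) {
  static const char Digits[] = "0123456789ABCDEF";
  std::string Out;
  Out.reserve(D.size() * 2);
  for (std::uint8_t Byte : D) {
    Out.push_back(Digits[Byte >> 4]);    // high nibble
    Out.push_back(Digits[Byte & 0x0F]);  // low nibble
  }
  return Out;
}
```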


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-03-06 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 137244.
juliehockett added a comment.

Adding hashing to reduce the size of USRs and updating tests.


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/BitcodeWriter.cpp
  clang-doc/BitcodeWriter.h
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.h
  clang-doc/Mapper.cpp
  clang-doc/Mapper.h
  clang-doc/Representation.h
  clang-doc/Serialize.cpp
  clang-doc/Serialize.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst
  test/CMakeLists.txt
  test/clang-doc/mapper-class-in-class.cpp
  test/clang-doc/mapper-class-in-function.cpp
  test/clang-doc/mapper-class.cpp
  test/clang-doc/mapper-comments.cpp
  test/clang-doc/mapper-enum.cpp
  test/clang-doc/mapper-function.cpp
  test/clang-doc/mapper-method.cpp
  test/clang-doc/mapper-namespace.cpp
  test/clang-doc/mapper-struct.cpp
  test/clang-doc/mapper-union.cpp

Index: test/clang-doc/mapper-union.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-union.cpp
@@ -0,0 +1,28 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump-mapper --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/bc/0B8A6B938B939B77C63258AA3E938BF9E2E8.bc --dump | FileCheck %s
+
+union D { int X; int Y; };
+
+// CHECK: 
+// CHECK-NEXT: 
+  // CHECK-NEXT: 
+// CHECK-NEXT: 
+// CHECK-NEXT: 
+  // CHECK-NEXT:  blob data = '0B8A6B938B939B77C63258AA3E938BF9E2E8'
+  // CHECK-NEXT:  blob data = 'D'
+  // CHECK-NEXT: 
+  // CHECK-NEXT: 
+// CHECK-NEXT:  blob data = 'int'
+// CHECK-NEXT:  blob data = 'D::X'
+// CHECK-NEXT: 
+  // CHECK-NEXT: 
+  // CHECK-NEXT: 
+// CHECK-NEXT:  blob data = 'int'
+// CHECK-NEXT:  blob data = 'D::Y'
+// CHECK-NEXT: 
+  // CHECK-NEXT: 
+// CHECK-NEXT: 
Index: test/clang-doc/mapper-struct.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-struct.cpp
@@ -0,0 +1,22 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump-mapper --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/bc/06B5F6A19BA9F6A832E127C9968282B94619B210.bc --dump | FileCheck %s
+
+struct C { int i; };
+
+// CHECK: 
+// CHECK-NEXT: 
+  // CHECK-NEXT: 
+// CHECK-NEXT: 
+// CHECK-NEXT: 
+  // CHECK-NEXT:  blob data = '06B5F6A19BA9F6A832E127C9968282B94619B210'
+  // CHECK-NEXT:  blob data = 'C'
+  // CHECK-NEXT: 
+// CHECK-NEXT:  blob data = 'int'
+// CHECK-NEXT:  blob data = 'C::i'
+// CHECK-NEXT: 
+  // CHECK-NEXT: 
+// CHECK-NEXT: 
Index: test/clang-doc/mapper-namespace.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,17 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump-mapper --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/bc/8D042EFFC98B373450BC6B5B90A330C25A150E9C.bc --dump | FileCheck %s
+
+namespace A {}
+
+// CHECK: 
+// CHECK-NEXT: 
+  // CHECK-NEXT: 
+// CHECK-NEXT: 
+// CHECK-NEXT: 
+  // CHECK-NEXT:  blob data = '8D042EFFC98B373450BC6B5B90A330C25A150E9C'
+  // CHECK-NEXT:  blob data = 'A'
+// CHECK-NEXT: 
Index: test/clang-doc/mapper-method.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-method.cpp
@@ -0,0 +1,41 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump-mapper --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/bc/F0F9FC65FC90F54F690144A7AFB15DFC3D69B6E6.bc --dump | FileCheck %s --check-prefix CHECK-G-F
+// RUN: llvm-bcanalyzer %t/docs/bc/4202E8BF0ECB12AE354C8499C52725B0EE30AED5.bc --dump | FileCheck %s --check-prefix CHECK-G
+
+class G {
+public: 
+	int Method(int param) { return param; }
+};
+
+// CHECK-G: 
+// CHECK-G-NEXT: 
+  // CHECK-G-NEXT: 
+// CHECK-G-NEXT: 
+// CHECK-G-NEXT: 
+  // CHECK-G-NEXT:  blob data = '4202E8BF0ECB12AE354C8499C52725B0EE30AED5'
+  // CHECK-G-NEXT:  blob data = 'G'
+  // CHECK-G-NEXT: 
+// CHECK-G-NEXT: 
+
+// CHECK-G-F: 
+// CHECK-G-F-NEXT: 
+  // CHECK-G-F-NEXT: 
+// CHECK-G-F-NEXT: 
+// CHECK-G-F-NEXT: 
+  // CHECK-G-F-NEXT:  blob data = 'F0F9FC65FC90F54F690144A7AFB15DFC3D69B6E6'
+  // CHECK-G-F-NEXT:  blob data = 'Method'
+  // CHECK-G-F-NEXT:  blob data = '4202E8BF0ECB12AE354C8499C52725B0EE30AED5'
+  // CHECK-G-F-NEXT: 
+  // CHECK-G-F-NEXT:  blob data = '4202E8BF0ECB12AE354C8499C52725B0EE30AED5'
+  // CHECK-G-F-NEXT: 
+// CHECK-G-F-NEXT:  blob data = 'int'
+  // CHECK-G-F-NEXT: 
+  // CHECK-G-F-NEXT: 
+// CHECK-G-F-NEXT:  blob data = 'int'
+ 

[PATCH] D41102: Setup clang-doc frontend framework

2018-03-06 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett added a comment.

In https://reviews.llvm.org/D41102#1028228, @Athosvk wrote:

> This seems like quite a decent approach! That being said, I don't see the 
> pointer yet? I assume you mean that you will be adding this? Additionally, a 
> slight disadvantage of doing this generic approach is that you need to do 
> bookkeeping on what it is referencing, but I guess there's no helping that 
> due to the architecture which makes you rely upon the USR? Personally I'd 
> prefer having the explicit types if and where possible. So for now a 
> RecordInfo has a vector of References to its parents, but we know the 
> parents can only be of certain kinds (more than just a RecordType, but you 
> get the point); it won't be an enum, namespace or function.


If you take a look at the follow-on patch to this (D43341), you'll see that 
that is where the pointer is added in (since it is irrelevant to the mapper 
portion, as it cannot be filled out until the information has been reduced). 
The back references to children and whatnot are also added there.

> As I mentioned, we did this the other way around, which also has the slight 
> advantage that I only had to create and save the USR once per info instance 
> (as in, 10 references to a class only add the overhead of 10 pointers, rather 
> than each having the USR as well), but our disadvantage was of course that we 
> had delayed serialization (although we could arguably do both 
> simultaneously). It seems each method has its merits :).

The USRs are kept for serialization purposes -- given the modular nature of the 
design, the goal is to be able to write out the bitstream and have it be 
consumable with all necessary information. Since we can't write out pointers 
(and it would be useless if we did, since they would change as soon as the file 
was read in), we maintain the USRs to have a means of re-finding the referenced 
declaration.
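
A minimal sketch of what "re-finding the referenced declaration" could look like once a file has been read back in. The `Info` and `Reference` shapes and the `resolveReferences` helper are assumptions made for illustration; the actual pointer field only arrives in the follow-on patch:

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified shapes; the real structs carry more fields.
struct Info {
  std::string USR;
  std::string Name;
};

struct Reference {
  std::string USR;            // survives serialization
  const Info *Ref = nullptr;  // only meaningful in memory, rebuilt after load
};

// Re-link every reference to the in-memory Info that has the same USR.
// Returns the number of references that could not be resolved.
std::size_t resolveReferences(const std::vector<Info> &Infos,
                              std::vector<Reference> &Refs) {
  std::unordered_map<std::string, const Info *> ByUSR;
  for (const Info &I : Infos)
    ByUSR.emplace(I.USR, &I);

  std::size_t Unresolved = 0;
  for (Reference &R : Refs) {
    auto It = ByUSR.find(R.USR);
    R.Ref = (It == ByUSR.end()) ? nullptr : It->second;
    if (!R.Ref)
      ++Unresolved;
  }
  return Unresolved;
}
```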

That said, I was looking at the Clangd symbol indexing code yesterday, and 
noticed that they're hashing the USRs (since they get a little lengthy, 
particularly when you have nested and/or overloaded functions). I'm going to 
take a look at that today to try to make the USRs more space-efficient here.


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-03-06 Thread Athos via Phabricator via cfe-commits
Athosvk added a comment.

My apologies for getting back to this so late!

In https://reviews.llvm.org/D41102#1017683, @juliehockett wrote:

> So, as an idea (as this diff implements), I updated the string references to 
> be a struct, which holds the USR of the referenced type (for serialization, 
> both here in the mapper and for the dump option in the reducer), as well as a 
> pointer to an `Info` struct. This pointer is not used at this point, but 
> would be populated by the reducer. Thoughts?


This seems like quite a decent approach! That being said, I don't see the 
pointer yet? I assume you mean that you will be adding this? Additionally, a 
slight disadvantage of doing this generic approach is that you need to do 
bookkeeping on what it is referencing, but I guess there's no helping that due 
to the architecture which makes you rely upon the USR? Personally I'd prefer 
having the explicit types if and where possible. So for now a RecordInfo has a 
vector of References to its parents, but we know the parents can only be of 
certain kinds (more than just a RecordType, but you get the point); it won't be 
an enum, namespace or function.

As I mentioned, we did this the other way around, which also has the slight 
advantage that I only had to create and save the USR once per info instance (as 
in, 10 references to a class only add the overhead of 10 pointers, rather than 
each having the USR as well), but our disadvantage was of course that we had 
delayed serialization (although we could arguably do both simultaneously). It 
seems each method has its merits :).


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-03-05 Thread Eugene Zelenko via Phabricator via cfe-commits
Eugene.Zelenko added inline comments.



Comment at: clang-doc/BitcodeWriter.h:160
+class ClangDocBitcodeWriter {
+ public:
+  ClangDocBitcodeWriter(llvm::BitstreamWriter ,

Looks like clang-format was applied incorrectly, because this is Google style, 
not LLVM style. Please note that it doesn't modify the file, it just outputs 
the formatted code to the terminal.

Please reformat other files, including those in dependent patches.


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-03-02 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett added inline comments.



Comment at: clang-doc/Representation.h:117
+  bool IsDefinition = false;
+  llvm::Optional<Location> DefLoc;
+  llvm::SmallVector Loc;

lebedev.ri wrote:
> I meant that `IsDefinition` controls whether `DefLoc` will be set/used or not.
> So with `llvm::Optional<Location> DefLoc`, you don't need the `bool 
> IsDefinition`.
That...makes so much sense. Oops. Thank you!


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-03-02 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 136809.
juliehockett marked an inline comment as done.
juliehockett added a comment.

Removing IsDefinition field.


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/BitcodeWriter.cpp
  clang-doc/BitcodeWriter.h
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.h
  clang-doc/Mapper.cpp
  clang-doc/Mapper.h
  clang-doc/Representation.h
  clang-doc/Serialize.cpp
  clang-doc/Serialize.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst
  test/CMakeLists.txt
  test/clang-doc/mapper-class-in-class.cpp
  test/clang-doc/mapper-class-in-function.cpp
  test/clang-doc/mapper-class.cpp
  test/clang-doc/mapper-comments.cpp
  test/clang-doc/mapper-enum.cpp
  test/clang-doc/mapper-function.cpp
  test/clang-doc/mapper-method.cpp
  test/clang-doc/mapper-namespace.cpp
  test/clang-doc/mapper-struct.cpp
  test/clang-doc/mapper-union.cpp

Index: test/clang-doc/mapper-union.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-union.cpp
@@ -0,0 +1,27 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@u...@d.bc --dump | FileCheck %s
+
+union D { int X; int Y; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@U@D'
+  // CHECK:  blob data = 'D'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::X'
+// CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::Y'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-struct.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-struct.cpp
@@ -0,0 +1,21 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@s...@c.bc --dump | FileCheck %s
+
+struct C { int i; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@S@C'
+  // CHECK:  blob data = 'C'
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'C::i'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-namespace.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,16 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@n...@a.bc --dump | FileCheck %s
+
+namespace A {}
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@N@A'
+  // CHECK:  blob data = 'A'
+// CHECK: 
Index: test/clang-doc/mapper-method.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-method.cpp
@@ -0,0 +1,42 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@S@G@F@Method#I#.bc --dump | FileCheck %s --check-prefix CHECK-G-F
+// RUN: llvm-bcanalyzer %t/docs/c:@s...@g.bc --dump | FileCheck %s --check-prefix CHECK-G
+
+class G {
+public: 
+	int Method(int param) { return param; }
+};
+
+// CHECK-G: 
+// CHECK-G: 
+  // CHECK-G: 
+// CHECK-G: 
+// CHECK-G: 
+  // CHECK-G:  blob data = 'c:@S@G'
+  // CHECK-G:  blob data = 'G'
+  // CHECK-G: 
+// CHECK-G: 
+
+
+// CHECK-G-F: 
+// CHECK-G-F: 
+  // CHECK-G-F: 
+// CHECK-G-F: 
+// CHECK-G-F: 
+  // CHECK-G-F:  blob data = 'c:@S@G@F@Method#I#'
+  // CHECK-G-F:  blob data = 'Method'
+  // CHECK-G-F:  blob data = 'c:@S@G'
+  // CHECK-G-F: 
+  // CHECK-G-F:  blob data = 'c:@S@G'
+  // CHECK-G-F: 
+// CHECK-G-F:  blob data = 'int'
+  // CHECK-G-F: 
+  // CHECK-G-F: 
+// CHECK-G-F:  blob data = 'int'
+// CHECK-G-F:  blob data = 'param'
+  // CHECK-G-F: 
+// CHECK-G-F: 
Index: test/clang-doc/mapper-function.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-function.cpp
@@ -0,0 +1,23 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@F@F#I#.bc --dump | FileCheck %s
+
+int F(int param) { return param; }
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@F@F#I#'
+  // CHECK:  blob data = 'F'
+  // CHECK: 
+   

[PATCH] D41102: Setup clang-doc frontend framework

2018-03-02 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added inline comments.



Comment at: clang-doc/Representation.h:117
+  bool IsDefinition = false;
+  llvm::Optional<Location> DefLoc;
+  llvm::SmallVector Loc;

I meant that `IsDefinition` controls whether `DefLoc` will be set/used or not.
So with `llvm::Optional<Location> DefLoc`, you don't need the `bool 
IsDefinition`.
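
To illustrate the point, a short sketch using `std::optional` (C++17) as a stand-in for `llvm::Optional`. `SymbolInfo` and `isDefinition` are hypothetical names and the `Location` fields are assumed:

```cpp
#include <optional>
#include <string>
#include <vector>

// Assumed, simplified location type.
struct Location {
  int LineNumber = 0;
  std::string Filename;
};

struct SymbolInfo {
  std::optional<Location> DefLoc;  // engaged only when a definition was seen
  std::vector<Location> Loc;       // locations of plain declarations
};

// The separate bool becomes redundant: "is a definition" is simply
// "has a definition location".
bool isDefinition(const SymbolInfo &S) { return S.DefLoc.has_value(); }
```

This also removes the possibility of the flag and the location getting out of sync.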


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-03-02 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 136791.
juliehockett marked 11 inline comments as done.
juliehockett added a comment.

Addressing comments


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/BitcodeWriter.cpp
  clang-doc/BitcodeWriter.h
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.h
  clang-doc/Mapper.cpp
  clang-doc/Mapper.h
  clang-doc/Representation.h
  clang-doc/Serialize.cpp
  clang-doc/Serialize.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst
  test/CMakeLists.txt
  test/clang-doc/mapper-class-in-class.cpp
  test/clang-doc/mapper-class-in-function.cpp
  test/clang-doc/mapper-class.cpp
  test/clang-doc/mapper-comments.cpp
  test/clang-doc/mapper-enum.cpp
  test/clang-doc/mapper-function.cpp
  test/clang-doc/mapper-method.cpp
  test/clang-doc/mapper-namespace.cpp
  test/clang-doc/mapper-struct.cpp
  test/clang-doc/mapper-union.cpp

Index: test/clang-doc/mapper-union.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-union.cpp
@@ -0,0 +1,28 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@u...@d.bc --dump | FileCheck %s
+
+union D { int X; int Y; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@U@D'
+  // CHECK:  blob data = 'D'
+  // CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::X'
+// CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::Y'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-struct.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-struct.cpp
@@ -0,0 +1,22 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@s...@c.bc --dump | FileCheck %s
+
+struct C { int i; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@S@C'
+  // CHECK:  blob data = 'C'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'C::i'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-namespace.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,16 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@n...@a.bc --dump | FileCheck %s
+
+namespace A {}
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@N@A'
+  // CHECK:  blob data = 'A'
+// CHECK: 
Index: test/clang-doc/mapper-method.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-method.cpp
@@ -0,0 +1,44 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@S@G@F@Method#I#.bc --dump | FileCheck %s --check-prefix CHECK-G-F
+// RUN: llvm-bcanalyzer %t/docs/c:@s...@g.bc --dump | FileCheck %s --check-prefix CHECK-G
+
+class G {
+public: 
+	int Method(int param) { return param; }
+};
+
+// CHECK-G: 
+// CHECK-G: 
+  // CHECK-G: 
+// CHECK-G: 
+// CHECK-G: 
+  // CHECK-G:  blob data = 'c:@S@G'
+  // CHECK-G:  blob data = 'G'
+  // CHECK-G: 
+  // CHECK-G: 
+// CHECK-G: 
+
+
+// CHECK-G-F: 
+// CHECK-G-F: 
+  // CHECK-G-F: 
+// CHECK-G-F: 
+// CHECK-G-F: 
+  // CHECK-G-F:  blob data = 'c:@S@G@F@Method#I#'
+  // CHECK-G-F:  blob data = 'Method'
+  // CHECK-G-F:  blob data = 'c:@S@G'
+  // CHECK-G-F: 
+  // CHECK-G-F: 
+  // CHECK-G-F:  blob data = 'c:@S@G'
+  // CHECK-G-F: 
+// CHECK-G-F:  blob data = 'int'
+  // CHECK-G-F: 
+  // CHECK-G-F: 
+// CHECK-G-F:  blob data = 'int'
+// CHECK-G-F:  blob data = 'param'
+  // CHECK-G-F: 
+// CHECK-G-F: 
Index: test/clang-doc/mapper-function.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-function.cpp
@@ -0,0 +1,24 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@F@F#I#.bc --dump | FileCheck %s
+
+int F(int param) { return param; }
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 

[PATCH] D41102: Setup clang-doc frontend framework

2018-03-02 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

Could some other people please review this differential, too?
I'm sure i have missed things.

---

Some more nitpicking.

For this differential as standalone, i've mostly run out of things to nitpick.
Some things can probably be done better (the blockid/recordid stuff could 
probably be nicer if tablegen-ed, but that is for later).

I'll try to look at the next differential, and at them combined.




Comment at: clang-doc/BitcodeWriter.cpp:120
+BlockIdNameMap[Init.first] = Init.second;
+assert((BlockIdNameMap[Init.first].size() + 1) <=
+   BitCodeConstants::RecordSize);

We don't actually push these strings to the `Record` (but instead output them 
directly), so this assertion is not really meaningful, i think?



Comment at: clang-doc/BitcodeWriter.h:21
+#include "clang/AST/AST.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/Bitcode/BitstreamWriter.h"

+DenseMap



Comment at: clang-doc/BitcodeWriter.h:21
+#include "clang/AST/AST.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/Bitcode/BitstreamWriter.h"

lebedev.ri wrote:
> +DenseMap
+StringRef



Comment at: clang-doc/BitcodeWriter.h:197
+: Stream(Stream_) {
+  Stream.EnterSubblock(ID, BitCodeConstants::SubblockIDSize);
+}

Humm, you could avoid this constant, and conserve a few bits, if you move the 
init-list out of `emitBlockInfoBlock()` to somewhere e.g. after the `enum 
RecordId`, and then since the `BlockId ID` is already passed, you could compute 
it on-the-fly the same way the `BitCodeConstants::SubblockIDSize` is asserted 
in `emitBlockInfo*()`.
Not sure if it's worth doing though. Maybe just add it as a `NOTE` here.



Comment at: clang-doc/BitcodeWriter.h:249
+///
+/// \param WriteBlockInfo
+/// For serializing a single info (as in the mapper

Stale comment



Comment at: clang-doc/Representation.h:60
+  InfoType RefType = InfoType::IT_default;
+  Info *Ref;
+};

`Info *Ref;` isn't used anywhere



Comment at: clang-doc/Representation.h:117
+  bool IsDefinition = false;
+  Location DefLoc;
+  llvm::SmallVector Loc;

`llvm::Optional<Location> DefLoc;`  ?


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-03-01 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett added inline comments.



Comment at: clang-doc/BitcodeWriter.cpp:196
+/// \brief Emits a record name to the BLOCKINFO block.
+void ClangDocBitcodeWriter::emitRecordID(RecordId ID) {
+  assert(RecordIdNameMap[ID] && "Unknown Abbreviation");

lebedev.ri wrote:
> juliehockett wrote:
> > lebedev.ri wrote:
> > > Hmm, so i've been staring at this and 
> > > http://llvm.org/doxygen/classllvm_1_1BitstreamWriter.html and i must say 
> > > i'm not fond of this indirection.
> > > 
> > > What i don't understand is, in previous function, we don't store 
> > > `BlockId`, why do we want to store `RecordId`?
> > > Aren't they both unstable, and are implementation detail?
> > > Do we want to store it (`RecordId`)? If yes, please explain it as a new 
> > > comment in code.
> > > 
> > > If no, i guess this would work too?
> > > ```
> > > assert(RecordIdNameMap[ID] && "Unknown Abbreviation");
> > > Record.clear();
> > > Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_SETRECORDNAME, 
> > > RecordIdNameMap[ID].Name);
> > > ```
> > > And after that you can lower the default size of `SmallVector<> Record` 
> > > down to, hm, `4`?
> > I'm not entirely certain what you mean -- in `emitBlockId()`, we are 
> > storing both the block id and the block name in separate records 
> > (`BLOCKINFO_CODE_SETBID`, `BLOCKINFO_CODE_BLOCKNAME`, respectively). In 
> > `emitRecordId()`, we're doing something slightly different, in that we emit 
> > one record with both the record id and the record name (in record 
> > `BLOCKINFO_CODE_SETRECORDNAME`). 
> > 
> > Replacing the copy loop here has the same issue as above, namely that there 
> > isn't an easy way to convert between a `StringRef` and an array of 
> > `unsigned char`.
> Tried locally, and yes, we do need to output record id.
> 
> What we could **actually** do, is simply inline that `EmitRecord()`, first 
> emitting the RID, and then the name.
> ```
> template <typename Container>
> void EmitRecord(unsigned Code, int ID, const Container &Vals) {
>   // If we don't have an abbrev to use, emit this in its fully unabbreviated
>   // form.
>   auto Count = static_cast<uint32_t>(makeArrayRef(Vals).size());
>   EmitCode(bitc::UNABBREV_RECORD);
>   EmitVBR(Code, 6);
>   EmitVBR(Count + 1, 6); // Including ID
>   EmitVBR64(ID, 6); // 'Prefix' with ID
>   for (unsigned i = 0, e = Count; i != e; ++i)
>     EmitVBR64(Vals[i], 6);
> }
> ```
> 
> But that will result in rather ugly code.
> So given that the record names are quite short, and all the other strings we 
> output directly, maybe leave it as it is for now, until it shows in profiles?
> 
If that makes sense to you, sounds good to me!
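
For readers unfamiliar with the `EmitVBR(..., 6)` calls in the quoted snippet: a VBR field in the LLVM bitstream format is written as fixed-width chunks whose top bit signals that another chunk follows. The sketch below only computes the chunk values; `encodeVBR` is a hypothetical helper, not part of `BitstreamWriter`:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Split Val into VBR chunks of NumBits bits each. The low NumBits-1 bits of a
// chunk carry payload and the top bit means "another chunk follows". Chunks
// are produced least significant first.
std::vector<std::uint32_t> encodeVBR(std::uint64_t Val, unsigned NumBits) {
  assert(NumBits >= 2 && NumBits <= 32 && "chunk must hold payload and flag");
  const std::uint64_t HighBit = std::uint64_t{1} << (NumBits - 1);
  std::vector<std::uint32_t> Chunks;
  while (Val >= HighBit) {
    Chunks.push_back(
        static_cast<std::uint32_t>((Val & (HighBit - 1)) | HighBit));
    Val >>= (NumBits - 1);
  }
  Chunks.push_back(static_cast<std::uint32_t>(Val));
  return Chunks;
}
```

With 6-bit chunks, any value below 32 takes a single chunk, which is why short record names and small counts stay cheap in the unabbreviated form.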



Comment at: clang-doc/BitcodeWriter.cpp:139
+  {COMMENT_NAME, {"Name", }},
+  {COMMENT_POSITION, {"Position", }},
+  {COMMENT_DIRECTION, {"Direction", }},

lebedev.ri wrote:
> This change is not covered by tests.
> (I've actually found that out the hard way, by trying to find why it didn't 
> trigger any assertions, oh well)
So after some digging, this particular field can't be tested right now as the 
mapper doesn't look at any `TemplateDecl`s (something that definitely needs to 
be implemented, but in a follow-on patch). I've removed it for now, until it 
can be properly used/tested.



Comment at: clang-doc/BitcodeWriter.h:37
+  static constexpr unsigned SubblockIDSize = 4U;
+  static constexpr unsigned BoolSize = 1U;
+  static constexpr unsigned IntSize = 16U;

lebedev.ri wrote:
> juliehockett wrote:
> > lebedev.ri wrote:
> > > Hmm, you build with asserts enabled, right?
> > > I tried testing this, and three tests fail with
> > > ```
> > > clang-doc: /build/llvm/include/llvm/Bitcode/BitstreamWriter.h:122: void 
> > > llvm::BitstreamWriter::Emit(uint32_t, unsigned int): Assertion `(Val & 
> > > ~(~0U >> (32-NumBits))) == 0 && "High bits set!"' failed.
> > > ```
> > > ```
> > > Failing Tests (3):
> > > Clang Tools :: clang-doc/mapper-class-in-function.cpp
> > > Clang Tools :: clang-doc/mapper-function.cpp
> > > Clang Tools :: clang-doc/mapper-method.cpp
> > > 
> > >   Expected Passes: 6
> > >   Unexpected Failures: 3
> > > ```
> > > At least one failure is because of `BoolSize`, so i'd suspect the 
> > > assertion itself is wrong...
> > I do, and I've definitely seen that one triggered before but it's been 
> > because something was off in how the data was being outputted as I was 
> > shifting things around. That said, I'm not seeing it in my local build with 
> > this diff though -- I'll update it again just to make sure they're in sync.
> I did not retry with updated tree/patch, but i'm quite sure i did hit those 
> asserts.
> My current build line:
> ```
> -DCMAKE_BUILD_TYPE:STRING=RelWithDebInfo
> -DLLVM_BINUTILS_INCDIR:PATH=/usr/include
> -DLLVM_BUILD_TESTS:BOOL=ON
> -DLLVM_ENABLE_ASSERTIONS:BOOL=ON
> -DLLVM_ENABLE_LLD:BOOL=ON
> 

[PATCH] D41102: Setup clang-doc frontend framework

2018-03-01 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 136650.
juliehockett marked 16 inline comments as done.
juliehockett added a comment.

Adding tests, fixing comments, and removing an (as-of-yet) unused element of 
the CommentInfo struct.


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/BitcodeWriter.cpp
  clang-doc/BitcodeWriter.h
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.h
  clang-doc/Mapper.cpp
  clang-doc/Mapper.h
  clang-doc/Representation.h
  clang-doc/Serialize.cpp
  clang-doc/Serialize.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst
  test/CMakeLists.txt
  test/clang-doc/mapper-class-in-class.cpp
  test/clang-doc/mapper-class-in-function.cpp
  test/clang-doc/mapper-class.cpp
  test/clang-doc/mapper-comments.cpp
  test/clang-doc/mapper-enum.cpp
  test/clang-doc/mapper-function.cpp
  test/clang-doc/mapper-method.cpp
  test/clang-doc/mapper-namespace.cpp
  test/clang-doc/mapper-struct.cpp
  test/clang-doc/mapper-union.cpp

Index: test/clang-doc/mapper-union.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-union.cpp
@@ -0,0 +1,28 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@u...@d.bc --dump | FileCheck %s
+
+union D { int X; int Y; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@U@D'
+  // CHECK:  blob data = 'D'
+  // CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::X'
+// CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::Y'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-struct.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-struct.cpp
@@ -0,0 +1,22 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@s...@c.bc --dump | FileCheck %s
+
+struct C { int i; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@S@C'
+  // CHECK:  blob data = 'C'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'C::i'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-namespace.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,16 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@n...@a.bc --dump | FileCheck %s
+
+namespace A {}
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@N@A'
+  // CHECK:  blob data = 'A'
+// CHECK: 
Index: test/clang-doc/mapper-method.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-method.cpp
@@ -0,0 +1,44 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@S@G@F@Method#I#.bc --dump | FileCheck %s --check-prefix CHECK-G-F
+// RUN: llvm-bcanalyzer %t/docs/c:@s...@g.bc --dump | FileCheck %s --check-prefix CHECK-G
+
+class G {
+public: 
+	int Method(int param) { return param; }
+};
+
+// CHECK-G: 
+// CHECK-G: 
+  // CHECK-G: 
+// CHECK-G: 
+// CHECK-G: 
+  // CHECK-G:  blob data = 'c:@S@G'
+  // CHECK-G:  blob data = 'G'
+  // CHECK-G: 
+  // CHECK-G: 
+// CHECK-G: 
+
+
+// CHECK-G-F: 
+// CHECK-G-F: 
+  // CHECK-G-F: 
+// CHECK-G-F: 
+// CHECK-G-F: 
+  // CHECK-G-F:  blob data = 'c:@S@G@F@Method#I#'
+  // CHECK-G-F:  blob data = 'Method'
+  // CHECK-G-F:  blob data = 'c:@S@G'
+  // CHECK-G-F: 
+  // CHECK-G-F: 
+  // CHECK-G-F:  blob data = 'c:@S@G'
+  // CHECK-G-F: 
+// CHECK-G-F:  blob data = 'int'
+  // CHECK-G-F: 
+  // CHECK-G-F: 
+// CHECK-G-F:  blob data = 'int'
+// CHECK-G-F:  blob data = 'param'
+  // CHECK-G-F: 
+// CHECK-G-F: 
Index: test/clang-doc/mapper-function.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-function.cpp
@@ -0,0 +1,24 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@F@F#I#.bc --dump | FileCheck %s
+
+int F(int param) { return param; }
+// CHECK: 

[PATCH] D41102: Setup clang-doc frontend framework

2018-03-01 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

Thank you for working on this!
Some more nitpicking.

//Please// consider adding even more tests (ideally, all this code should have 
100% test coverage)




Comment at: clang-doc/BitcodeWriter.cpp:139
+  {COMMENT_NAME, {"Name", }},
+  {COMMENT_POSITION, {"Position", }},
+  {COMMENT_DIRECTION, {"Direction", }},

This change is not covered by tests.
(I actually found that out the hard way, by trying to work out why it didn't 
trigger any assertions, oh well)



Comment at: clang-doc/BitcodeWriter.cpp:325
+  emitHeader();
+  Stream.EnterBlockInfoBlock();
+

I think it would be cleaner to move it (at least the enterblock, it might make 
sense to leave the header at the very top) after the static variable



Comment at: clang-doc/BitcodeWriter.cpp:363
+
+  for (const auto  : TheBlocks) {
+assert(Block.second.size() < (1U << BitCodeConstants::SubblockIDSize));

I.e.
```
...
, FUNCTION_IS_METHOD}}};

  Stream.EnterBlockInfoBlock();
  for (const auto &Block : TheBlocks) {
    assert(Block.second.size() < (1U << BitCodeConstants::SubblockIDSize));
    emitBlockInfo(Block.first, Block.second);
  }
  Stream.ExitBlock();

  emitVersion();
}
```



Comment at: clang-doc/BitcodeWriter.h:19
+
+#include 
+#include 

Please sort includes, clang-tidy complains.



Comment at: clang-doc/BitcodeWriter.h:32
+// BitCodeConstants, though they can be added without breaking it.
+static const unsigned VERSION_NUMBER = 1;
+

```
/build/clang-tools-extra/clang-doc/BitcodeWriter.h:32:23: warning: invalid case 
style for variable 'VERSION_NUMBER' [readability-identifier-naming]
static const unsigned VERSION_NUMBER = 1;
  ^~
  VersionNumber

```



Comment at: clang-doc/BitcodeWriter.h:163
+ public:
+  ClangDocBitcodeWriter(llvm::BitstreamWriter ,
+bool OmitFilenames = false)

The simplest solution would be
```
#ifndef NDEBUG // Don't want explicit dtor unless needed
~ClangDocBitcodeWriter() {
  // Check that the static size is large-enough.
  assert(Record.capacity() == BitCodeConstants::RecordSize);
}
#endif
```



Comment at: clang-doc/BitcodeWriter.h:228
+  // Static size is the maximum length of the block/record names we're pushing
+  // to this + 1. Longest is currently `MemberTypeBlock` at 15 chars.
+  SmallVector Record;

So you want to be really definitive with this. I wanted to avoid that, 
actually..
Then i'm afraid one more assert is needed, to make sure this is *actually* true.

I'm not seeing any way to make `SmallVector` completely static,
so you could either add one more wrapper around it (rather ugly),
or check the final size in the `ClangDocBitcodeWriter` destructor (will not 
pinpoint when the size has 'overflowed')
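
To make the wrapper option concrete, here is a rough sketch using only the standard library. `BoundedRecord` is a hypothetical stand-in, not LLVM's `SmallVector` and not part of the patch. The idea is that the push itself asserts, so an overflow is reported at the offending call instead of at destruction time.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-capacity record buffer: never allocates, and asserts
// at the exact push that would exceed the static capacity N.
// In NDEBUG builds the check disappears, like any assert.
template <std::size_t N> class BoundedRecord {
  std::array<uint64_t, N> Data{};
  std::size_t Size = 0;

public:
  void push_back(uint64_t V) {
    assert(Size < N && "Record exceeded its static capacity");
    Data[Size++] = V;
  }
  void clear() { Size = 0; }
  std::size_t size() const { return Size; }
  static constexpr std::size_t capacity() { return N; }
  const uint64_t *begin() const { return Data.data(); }
  const uint64_t *end() const { return Data.data() + Size; }
};
```

With a capacity of 16, the 15-character `MemberTypeBlock` name plus one extra element still fits. The cost is the extra type, which is the "rather ugly" part.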



Comment at: clang-doc/BitcodeWriter.h:246
+void ClangDocBitcodeWriter::writeBitstream(const T , bool WriteBlockInfo) {
+  if (WriteBlockInfo) emitBlockInfoBlock();
+  StreamSubBlockGuard Block(Stream, MapFromInfoToBlockId::ID);

Does it *ever* make sense to output `BlockInfoBlock` anywhere else other than 
once at the very beginning?
I'd think you should drop the boolean param, and unconditionally call 
`emitBlockInfoBlock();` from the `ClangDocBitcodeWriter::ClangDocBitcodeWriter()` 
ctor.



Comment at: clang-doc/BitcodeWriter.h:248
+  StreamSubBlockGuard Block(Stream, MapFromInfoToBlockId::ID);
+  emitBlock(I);
+}

The naming choices confuse me.
There is `writeBitstream()` and `emitBlock()`, which is called from 
`writeBitstream()` to write the actual contents of the block.

Why is one `write` and the other `emit`?
To match the `BitstreamWriter` naming choices? (which uses `Emit` prefix)?
To avoid the confusion of which one outputs the actual content, and which one 
outputs the whole block?

I think it should be:
*
```
- void emitBlock(const NamespaceInfo );
+ void emitBlockContent(const NamespaceInfo );
```
*
```
- void ClangDocBitcodeWriter::writeBitstream(const T , bool WriteBlockInfo);
+ void ClangDocBitcodeWriter::emitBlock(const T , bool EmitBlockInfo);
```

This way, *i think* their names would state more clearly what they do, and won't 
be weirdly different.
What do you think?



Comment at: clang-doc/Representation.h:18
+
+#include 
+#include "clang/AST/Type.h"

Please sort includes, clang-tidy complains.



Comment at: clang-doc/Serialize.cpp:88
+  CurrentCI.Name = getCommandName(C->getCommandID());
+  for (unsigned i = 0, e = C->getNumArgs(); i < e; ++i)
+CurrentCI.Args.push_back(C->getArgText(i));

```
/build/clang-tools-extra/clang-doc/Serialize.cpp:88:17: warning: invalid 

[PATCH] D41102: Setup clang-doc frontend framework

2018-03-01 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 136520.
juliehockett marked 14 inline comments as done.
juliehockett added a comment.

Fixing comments and adding tests


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/BitcodeWriter.cpp
  clang-doc/BitcodeWriter.h
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.h
  clang-doc/Mapper.cpp
  clang-doc/Mapper.h
  clang-doc/Representation.h
  clang-doc/Serialize.cpp
  clang-doc/Serialize.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst
  test/CMakeLists.txt
  test/clang-doc/mapper-class-in-class.cpp
  test/clang-doc/mapper-class-in-function.cpp
  test/clang-doc/mapper-class.cpp
  test/clang-doc/mapper-enum.cpp
  test/clang-doc/mapper-function.cpp
  test/clang-doc/mapper-method.cpp
  test/clang-doc/mapper-namespace.cpp
  test/clang-doc/mapper-struct.cpp
  test/clang-doc/mapper-union.cpp

Index: test/clang-doc/mapper-union.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-union.cpp
@@ -0,0 +1,28 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@u...@d.bc --dump | FileCheck %s
+
+union D { int X; int Y; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@U@D'
+  // CHECK:  blob data = 'D'
+  // CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::X'
+// CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::Y'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-struct.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-struct.cpp
@@ -0,0 +1,22 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@s...@c.bc --dump | FileCheck %s
+
+struct C { int i; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@S@C'
+  // CHECK:  blob data = 'C'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'C::i'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-namespace.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,51 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@n...@a.bc --dump | FileCheck %s
+
+/// \brief Brief description.
+///
+/// Extended description that
+/// continues onto the next line.
+namespace A {}
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@N@A'
+  // CHECK:  blob data = 'A'
+  // CHECK: 
+// CHECK:  blob data = 'FullComment'
+// CHECK: 
+  // CHECK:  blob data = 'ParagraphComment'
+  // CHECK: 
+// CHECK:  blob data = 'TextComment'
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'BlockCommandComment'
+  // CHECK:  blob data = 'brief'
+  // CHECK: 
+// CHECK:  blob data = 'ParagraphComment'
+// CHECK: 
+  // CHECK:  blob data = 'TextComment'
+  // CHECK:  blob data = ' Brief description.'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'ParagraphComment'
+  // CHECK: 
+// CHECK:  blob data = 'TextComment'
+// CHECK:  blob data = ' Extended description that'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'TextComment'
+// CHECK:  blob data = ' continues onto the next line.'
+  // CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-method.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-method.cpp
@@ -0,0 +1,45 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@S@G@F@Method#I#.bc --dump | FileCheck %s --check-prefix CHECK-G-F
+// RUN: llvm-bcanalyzer %t/docs/c:@s...@g.bc --dump | FileCheck %s --check-prefix CHECK-G
+
+class G {
+public: 
+	int Method(int param) { return param; }
+};
+
+// CHECK-G: 
+// CHECK-G: 
+  // CHECK-G: 
+// CHECK-G: 
+// CHECK-G: 
+  // CHECK-G:  blob data = 'c:@S@G'
+  // CHECK-G:  blob data = 'G'
+  // CHECK-G: 
+  // CHECK-G: 
+// CHECK-G: 
+
+
+// CHECK-G-F: 
+// 

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-28 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

Thank you for working on this!
Some more review notes.
Please look into adding a few more tests.




Comment at: clang-doc/BitcodeWriter.cpp:179
+  assert(Inits.size() == RecordIdCount);
+  for (const auto  : Inits) RecordIdNameMap[Init.first] = Init.second;
+  assert(RecordIdNameMap.size() == RecordIdCount);

Since this is the only string we ever push to `Record`, can we add an assertion 
to make sure we always have enough room for it?
E.g.
```
for (const auto &Init : Inits) {
  RecordId RID = Init.first;
  RecordIdNameMap[RID] = Init.second;
  assert((1 + RecordIdNameMap[RID].size()) <= Record.size());
  // Since record was just created, it should not have any dynamic size.
  // Or move the small size into a variable and use it when declaring the
  // Record and here.
}
```



Comment at: clang-doc/BitcodeWriter.cpp:230
+  prepRecordData(ID);
+  for (const char C : RecordIdNameMap[ID].Name) Record.push_back(C);
+  Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_SETRECORDNAME, Record);

Sadly, i can **not** prove it via godbolt (can't add LLVM as library), but i'd 
//expect// streamlining this should at least not hurt, i.e. something like
```
Record.append(RecordIdNameMap[ID].Name.begin(), RecordIdNameMap[ID].Name.end());
```
?



Comment at: clang-doc/BitcodeWriter.cpp:196
+/// \brief Emits a record name to the BLOCKINFO block.
+void ClangDocBitcodeWriter::emitRecordID(RecordId ID) {
+  assert(RecordIdNameMap[ID] && "Unknown Abbreviation");

juliehockett wrote:
> lebedev.ri wrote:
> > Hmm, so i've been staring at this and 
> > http://llvm.org/doxygen/classllvm_1_1BitstreamWriter.html and i must say 
> > i'm not fond of this indirection.
> > 
> > What i don't understand is, in previous function, we don't store `BlockId`, 
> > why do we want to store `RecordId`?
> > Aren't they both unstable, and are implementation detail?
> > Do we want to store it (`RecordId`)? If yes, please explain it as a new 
> > comment in code.
> > 
> > If no, i guess this would work too?
> > ```
> > assert(RecordIdNameMap[ID] && "Unknown Abbreviation");
> > Record.clear();
> > Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_SETRECORDNAME, 
> > RecordIdNameMap[ID].Name);
> > ```
> > And after that you can lower the default size of `SmallVector<> Record` 
> > down to, hm, `4`?
> I'm not entirely certain what you mean -- in `emitBlockId()`, we are storing 
> both the block id and the block name in separate records 
> (`BLOCKINFO_CODE_SETBID`, `BLOCKINFO_CODE_BLOCKNAME`, respectively). In 
> `emitRecordId()`, we're doing something slightly different, in that we emit 
> one record with both the record id and the record name (in record 
> `BLOCKINFO_CODE_SETRECORDNAME`). 
> 
> Replacing the copy loop here has the same issue as above, namely that there 
> isn't an easy way to convert between a `StringRef` and an array of `unsigned 
> char`.
Tried locally, and yes, we do need to output record id.

What we could **actually** do, is simply inline that `EmitRecord()`, first 
emitting the RID, and then the name.
```
template <typename Container>
void EmitRecord(unsigned Code, int ID, const Container &Vals) {
  // If we don't have an abbrev to use, emit this in its fully unabbreviated
  // form.
  auto Count = static_cast<uint32_t>(makeArrayRef(Vals).size());
  EmitCode(bitc::UNABBREV_RECORD);
  EmitVBR(Code, 6);
  EmitVBR(Count + 1, 6); // Including ID
  EmitVBR64(ID, 6); // 'Prefix' with ID
  for (unsigned i = 0, e = Count; i != e; ++i)
    EmitVBR64(Vals[i], 6);
}
```

But that will result in rather ugly code.
So given that the record names are quite short, and all the other strings we 
output directly, maybe leave it as it is for now, until it shows in profiles?
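
For context on what those `EmitVBR(..., 6)` calls cost: as far as I understand the LLVM bitstream format, a VBR6 field is written in 6-bit chunks, each carrying 5 payload bits plus a continuation flag in the top bit. The sketch below is a standalone illustration of that chunking, not the `BitstreamWriter` implementation. `vbrChunks` is a hypothetical name, and it returns the chunks instead of writing them to a stream.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: split Val into VBR chunks of NumBits bits each.
// Every chunk but the last has its top bit set to signal "more follows".
// Assumes 2 <= NumBits <= 32.
inline std::vector<uint32_t> vbrChunks(uint32_t Val, unsigned NumBits) {
  assert(NumBits >= 2 && NumBits <= 32);
  const uint32_t Threshold = 1U << (NumBits - 1);
  std::vector<uint32_t> Chunks;
  while (Val >= Threshold) {
    // Low NumBits-1 payload bits, with the continuation bit set.
    Chunks.push_back((Val & (Threshold - 1)) | Threshold);
    Val >>= (NumBits - 1);
  }
  Chunks.push_back(Val); // Final chunk: continuation bit clear.
  return Chunks;
}
```

Under this scheme a letter such as `'N'` (78) takes two 6-bit chunks, so a short record name stays cheap either way, which fits the suggestion to leave the copy loop alone until a profile says otherwise.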




Comment at: clang-doc/BitcodeWriter.h:37
+  static constexpr unsigned SubblockIDSize = 4U;
+  static constexpr unsigned BoolSize = 1U;
+  static constexpr unsigned IntSize = 16U;

juliehockett wrote:
> lebedev.ri wrote:
> > Hmm, you build with asserts enabled, right?
> > I tried testing this, and three tests fail with
> > ```
> > clang-doc: /build/llvm/include/llvm/Bitcode/BitstreamWriter.h:122: void 
> > llvm::BitstreamWriter::Emit(uint32_t, unsigned int): Assertion `(Val & 
> > ~(~0U >> (32-NumBits))) == 0 && "High bits set!"' failed.
> > ```
> > ```
> > Failing Tests (3):
> > Clang Tools :: clang-doc/mapper-class-in-function.cpp
> > Clang Tools :: clang-doc/mapper-function.cpp
> > Clang Tools :: clang-doc/mapper-method.cpp
> > 
> >   Expected Passes: 6
> >   Unexpected Failures: 3
> > ```
> > At least one failure is because of `BoolSize`, so i'd suspect the assertion 
> > itself is wrong...
> I do, and I've definitely seen that one triggered before but it's been 
> because something was off in how the data was being outputted as I was 
> shifting things around. That said, I'm not seeing it in my local build with 
> this diff though -- I'll 

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-28 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 136303.
juliehockett marked 3 inline comments as done.
juliehockett added a comment.

Running clang-format and fixing newlines


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/BitcodeWriter.cpp
  clang-doc/BitcodeWriter.h
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.h
  clang-doc/Mapper.cpp
  clang-doc/Mapper.h
  clang-doc/Representation.h
  clang-doc/Serialize.cpp
  clang-doc/Serialize.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst
  test/CMakeLists.txt
  test/clang-doc/mapper-class-in-class.cpp
  test/clang-doc/mapper-class-in-function.cpp
  test/clang-doc/mapper-class.cpp
  test/clang-doc/mapper-enum.cpp
  test/clang-doc/mapper-function.cpp
  test/clang-doc/mapper-method.cpp
  test/clang-doc/mapper-namespace.cpp
  test/clang-doc/mapper-struct.cpp
  test/clang-doc/mapper-union.cpp

Index: test/clang-doc/mapper-union.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-union.cpp
@@ -0,0 +1,28 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@u...@d.bc --dump | FileCheck %s
+
+union D { int X; int Y; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@U@D'
+  // CHECK:  blob data = 'D'
+  // CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::X'
+// CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::Y'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-struct.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-struct.cpp
@@ -0,0 +1,22 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@s...@c.bc --dump | FileCheck %s
+
+struct C { int i; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@S@C'
+  // CHECK:  blob data = 'C'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'C::i'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-namespace.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,16 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@n...@a.bc --dump | FileCheck %s
+
+namespace A {}
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@N@A'
+  // CHECK:  blob data = 'A'
+// CHECK: 
Index: test/clang-doc/mapper-method.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-method.cpp
@@ -0,0 +1,30 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@S@G@F@Method#I#.bc --dump | FileCheck %s
+
+class G {
+public: 
+	int Method(int param) { return param; }
+};
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@S@G@F@Method#I#'
+  // CHECK:  blob data = 'Method'
+  // CHECK:  blob data = 'c:@S@G'
+  // CHECK: 
+  // CHECK: 
+  // CHECK:  blob data = 'c:@S@G'
+  // CHECK: 
+// CHECK:  blob data = 'int'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-function.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-function.cpp
@@ -0,0 +1,24 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@F@F#I#.bc --dump | FileCheck %s
+
+int F(int param) { return param; }
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@F@F#I#'
+  // CHECK:  blob data = 'F'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-enum.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-enum.cpp
@@ -0,0 +1,23 @@
+// 

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-28 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett added inline comments.



Comment at: clang-doc/BitcodeWriter.h:37
+  static constexpr unsigned SubblockIDSize = 4U;
+  static constexpr unsigned BoolSize = 1U;
+  static constexpr unsigned IntSize = 16U;

lebedev.ri wrote:
> Hmm, you build with asserts enabled, right?
> I tried testing this, and three tests fail with
> ```
> clang-doc: /build/llvm/include/llvm/Bitcode/BitstreamWriter.h:122: void 
> llvm::BitstreamWriter::Emit(uint32_t, unsigned int): Assertion `(Val & ~(~0U 
> >> (32-NumBits))) == 0 && "High bits set!"' failed.
> ```
> ```
> Failing Tests (3):
> Clang Tools :: clang-doc/mapper-class-in-function.cpp
> Clang Tools :: clang-doc/mapper-function.cpp
> Clang Tools :: clang-doc/mapper-method.cpp
> 
>   Expected Passes: 6
>   Unexpected Failures: 3
> ```
> At least one failure is because of `BoolSize`, so i'd suspect the assertion 
> itself is wrong...
I do, and I've definitely seen that one triggered before but it's been because 
something was off in how the data was being outputted as I was shifting things 
around. That said, I'm not seeing it in my local build with this diff though -- 
I'll update it again just to make sure they're in sync.


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-28 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added inline comments.



Comment at: clang-doc/BitcodeWriter.h:37
+  static constexpr unsigned SubblockIDSize = 4U;
+  static constexpr unsigned BoolSize = 1U;
+  static constexpr unsigned IntSize = 16U;

Hmm, you build with asserts enabled, right?
I tried testing this, and three tests fail with
```
clang-doc: /build/llvm/include/llvm/Bitcode/BitstreamWriter.h:122: void 
llvm::BitstreamWriter::Emit(uint32_t, unsigned int): Assertion `(Val & ~(~0U >> 
(32-NumBits))) == 0 && "High bits set!"' failed.
```
```
Failing Tests (3):
Clang Tools :: clang-doc/mapper-class-in-function.cpp
Clang Tools :: clang-doc/mapper-function.cpp
Clang Tools :: clang-doc/mapper-method.cpp

  Expected Passes: 6
  Unexpected Failures: 3
```
At least one failure is because of `BoolSize`, so i'd suspect the assertion 
itself is wrong...
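
For reference, the check behind that assertion can be reproduced in isolation. The sketch below is a standalone restatement of the condition quoted in the failure message, not the actual `BitstreamWriter` code; `fitsInBits` is a hypothetical helper name. It shows that a field declared 1 bit wide (as `BoolSize` is) accepts only 0 and 1, so any larger value pushed into a bool-sized field trips the assert.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when Val has no bits set above the low NumBits
// bits. Mirrors the condition in the quoted assertion.
// Valid for 1 <= NumBits <= 32.
inline bool fitsInBits(uint32_t Val, unsigned NumBits) {
  assert(NumBits >= 1 && NumBits <= 32 && "shift would be out of range");
  return (Val & ~(~0U >> (32 - NumBits))) == 0;
}
```

If the condition holds for every value the writer emits, the assertion is fine and the mismatch is in the caller: either the value or the declared width.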


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-27 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett added a comment.

In https://reviews.llvm.org/D41102#1020808, @lebedev.ri wrote:

> Ok, great.
>  And it will also complain if you try to output a block within block?


Um...no. Since you can have subblocks within blocks.
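
To illustrate why nesting is expected here: the patch enters blocks through a scoped guard (`StreamSubBlockGuard`), and such guards compose naturally when one block is opened inside another. The sketch below is a toy model with a hypothetical `FakeStream` that only tracks depth. It is not the patch's guard and not the `BitstreamWriter` API.

```cpp
#include <cassert>

// Hypothetical stand-in for a bitstream: only counts open blocks.
struct FakeStream {
  unsigned Depth = 0;
  unsigned MaxDepth = 0;
  void enterSubblock() {
    ++Depth;
    if (Depth > MaxDepth)
      MaxDepth = Depth;
  }
  void exitBlock() {
    assert(Depth > 0 && "exitBlock without a matching enterSubblock");
    --Depth;
  }
};

// Scoped guard: enters a block on construction, exits it on destruction.
class SubBlockGuard {
  FakeStream &Stream;

public:
  explicit SubBlockGuard(FakeStream &S) : Stream(S) { Stream.enterSubblock(); }
  ~SubBlockGuard() { Stream.exitBlock(); }
  SubBlockGuard(const SubBlockGuard &) = delete;
  SubBlockGuard &operator=(const SubBlockGuard &) = delete;
};
```

Declaring a second guard inside the scope of the first models a subblock; the inner block is closed before the outer one purely by scope exit.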




Comment at: clang-doc/BitcodeWriter.cpp:191
+  Record.clear();
+  for (const char C : BlockIdNameMap[ID]) Record.push_back(C);
+  Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_BLOCKNAME, Record);

lebedev.ri wrote:
> juliehockett wrote:
> > lebedev.ri wrote:
> > > Why do we have this indirection?
> > > Is there a need to first to (unefficiently?) copy to `Record`, and then 
> > > emit from there?
> > > Wouldn't this work just as well?
> > > ```
> > > Record.clear();
> > > Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_BLOCKNAME, 
> > > BlockIdNameMap[ID]);
> > > ```
> > No, since `BlockIdNameMap[ID]` returns a `StringRef`, which can be 
> > manipulated into an `std::string` or a `const char*`, but the `Stream` 
> > wants an `unsigned char`. So, the copying is to satisfy that. Unless 
> > there's a better way to convert a `StringRef` into an array of `unsigned 
> > char`?
> Aha, i see, did not think of that.
> But there is a `bytes()` function in `StringRef`, which returns 
> `iterator_range`. 
> Would it help? 
> http://llvm.org/doxygen/classllvm_1_1StringRef.html#a5e8f22c3553e341404b445430a3b075b
Replaced it with an ArrayRef to the `bytes_begin()` and `bytes_end()`, but that 
only works for the block id, not the record id, since `emitRecordId()` also has 
to emit the ID number in addition to the name in the same record.
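
The underlying point is that a record is a sequence of integers, so the name has to be presented as unsigned byte values and, for `emitRecordId()`, prefixed with the ID. Here is a standalone sketch of both shapes, using `std::string` as a stand-in for `StringRef` (an assumption). The helper names are hypothetical and nothing here is LLVM API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical: the name alone, as unsigned byte values (block-name case).
inline std::vector<uint64_t> nameRecord(const std::string &Name) {
  std::vector<uint64_t> Record;
  Record.reserve(Name.size());
  for (char C : Name)
    Record.push_back(static_cast<unsigned char>(C));
  return Record;
}

// Hypothetical: ID first, then the name bytes (record-name case).
inline std::vector<uint64_t> idAndNameRecord(unsigned ID,
                                             const std::string &Name) {
  std::vector<uint64_t> Record;
  Record.reserve(Name.size() + 1);
  Record.push_back(ID);
  for (char C : Name)
    Record.push_back(static_cast<unsigned char>(C));
  return Record;
}
```

The first shape can be replaced by a view over the existing bytes; the second cannot, because the ID is not stored next to the name, so some copy into a scratch record remains.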



Comment at: clang-doc/BitcodeWriter.cpp:265
+  if (!prepRecordData(ID, !OmitFilenames)) return;
+  // FIXME: Assert that the line number is of the appropriate size.
+  Record.push_back(Loc.LineNumber);

lebedev.ri wrote:
> I think it is as simple as
> ```
> assert(Loc.LineNumber < (1U << BitCodeConstants::LineNumberSize));
> ```
> ?
`LineNumber` is a signed int, so the compiler complains that we're comparing 
signed and unsigned ints.
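
One way around the sign-compare warning is to reject negative values first and only then compare in unsigned arithmetic. A minimal standalone sketch, assuming the line number is a plain `int` and the field width is below 32 bits. `lineNumberFits` and its parameter names are made up for illustration and are not part of the patch.

```cpp
#include <cassert>

// Hypothetical helper: true when a signed line number is non-negative and
// representable in SizeInBits bits. Assumes SizeInBits < 32.
inline bool lineNumberFits(int LineNumber, unsigned SizeInBits) {
  if (LineNumber < 0)
    return false;
  // Safe cast: the value is known to be non-negative here.
  return static_cast<unsigned>(LineNumber) < (1U << SizeInBits);
}
```

The FIXME could then become something like `assert(lineNumberFits(Loc.LineNumber, BitCodeConstants::LineNumberSize));`, again assuming those names keep their current meaning.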


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-27 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 136161.
juliehockett marked 15 inline comments as done.
juliehockett added a comment.

Fixing comments


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/BitcodeWriter.cpp
  clang-doc/BitcodeWriter.h
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.h
  clang-doc/Mapper.cpp
  clang-doc/Mapper.h
  clang-doc/Representation.h
  clang-doc/Serialize.cpp
  clang-doc/Serialize.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst
  test/CMakeLists.txt
  test/clang-doc/mapper-class-in-class.cpp
  test/clang-doc/mapper-class-in-function.cpp
  test/clang-doc/mapper-class.cpp
  test/clang-doc/mapper-enum.cpp
  test/clang-doc/mapper-function.cpp
  test/clang-doc/mapper-method.cpp
  test/clang-doc/mapper-namespace.cpp
  test/clang-doc/mapper-struct.cpp
  test/clang-doc/mapper-union.cpp

Index: test/clang-doc/mapper-union.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-union.cpp
@@ -0,0 +1,28 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@u...@d.bc --dump | FileCheck %s
+
+union D { int X; int Y; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@U@D'
+  // CHECK:  blob data = 'D'
+  // CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::X'
+// CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::Y'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-struct.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-struct.cpp
@@ -0,0 +1,22 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@s...@c.bc --dump | FileCheck %s
+
+struct C { int i; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@S@C'
+  // CHECK:  blob data = 'C'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'C::i'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-namespace.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,17 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@n...@a.bc --dump | FileCheck %s
+
+namespace A {}
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@N@A'
+  // CHECK:  blob data = 'A'
+// CHECK: 
+
Index: test/clang-doc/mapper-method.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-method.cpp
@@ -0,0 +1,31 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@S@G@F@Method#I#.bc --dump | FileCheck %s
+
+class G {
+public: 
+	int Method(int param) { return param; }
+};
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@S@G@F@Method#I#'
+  // CHECK:  blob data = 'Method'
+  // CHECK:  blob data = 'c:@S@G'
+  // CHECK: 
+  // CHECK: 
+  // CHECK:  blob data = 'c:@S@G'
+  // CHECK: 
+// CHECK:  blob data = 'int'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+  // CHECK: 
+// CHECK: 
+
Index: test/clang-doc/mapper-function.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-function.cpp
@@ -0,0 +1,25 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@F@F#I#.bc --dump | FileCheck %s
+
+int F(int param) { return param; }
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@F@F#I#'
+  // CHECK:  blob data = 'F'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+  // CHECK: 
+// CHECK: 
+
Index: test/clang-doc/mapper-enum.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-enum.cpp
@@ -0,0 +1,24 @@
+// RUN: rm -rf %t
+// 

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-27 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

Tried fixing `tooling::FrontendActionFactory::create()` in 
https://reviews.llvm.org/D43779 / https://reviews.llvm.org/D43780, but had to 
revert due to GCC 4.8 issues :/

Thank you for working on this, some more review notes.

In https://reviews.llvm.org/D41102#1020107, @juliehockett wrote:

> In https://reviews.llvm.org/D41102#1017918, @lebedev.ri wrote:
>
> > Is there some (internal to `BitstreamWriter`) logic that would 'assert()' 
> > if trying to output some recordid
> >  which is, according to the `BLOCKINFO_BLOCK`, should not be there?
> >  E.g. outputting `VERSION` in `BI_COMMENT_BLOCK_ID`?
>
>
> Yes -- it will fail an assertion:
>  `Assertion 'V == Op.getLiteralValue() && "Invalid abbrev for record!"' 
> failed.`


Ok, great.
And it will also complain if you try to output a block within block?




Comment at: clang-doc/BitcodeWriter.cpp:184
+  return RecordIdNameMap;
+}  // namespace doc
+();

That comment seems wrong.
**If** the namespace is indeed supposed to be closed, it should happen after 
the lambda is called, i.e.
```
  assert(RecordIdNameMap.size() == RecordIdCount);
  return RecordIdNameMap;
}();
}  // namespace doc

// AbbreviationMap
```



Comment at: clang-doc/BitcodeWriter.cpp:265
+  if (!prepRecordData(ID, !OmitFilenames)) return;
+  // FIXME: Assert that the line number is of the appropriate size.
+  Record.push_back(Loc.LineNumber);

I think it is as simple as
```
assert(Loc.LineNumber < (1U << BitCodeConstants::LineNumberSize));
```
?



Comment at: clang-doc/BitcodeWriter.cpp:367
+BlockId BID, const std::initializer_list<RecordId> &RIDs) {
+  emitBlockID(BID);
+  for (RecordId RID : RIDs) {

So i guess this should be:
```
void ClangDocBitcodeWriter::emitBlockInfo(
    BlockId BID, const std::initializer_list<RecordId> &RIDs) {
  assert(RIDs.size() < (1U << BitCodeConstants::SubblockIDSize) &&
         "Too many records in a block!");
  emitBlockID(BID);
...
```
?



Comment at: clang-doc/BitcodeWriter.cpp:191
+  Record.clear();
+  for (const char C : BlockIdNameMap[ID]) Record.push_back(C);
+  Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_BLOCKNAME, Record);

juliehockett wrote:
> lebedev.ri wrote:
> > Why do we have this indirection?
> > Is there a need to first (inefficiently?) copy to `Record`, and then 
> > emit from there?
> > Wouldn't this work just as well?
> > ```
> > Record.clear();
> > Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_BLOCKNAME, BlockIdNameMap[ID]);
> > ```
> No, since `BlockIdNameMap[ID]` returns a `StringRef`, which can be 
> manipulated into an `std::string` or a `const char*`, but the `Stream` wants 
> an `unsigned char`. So, the copying is to satisfy that. Unless there's a 
> better way to convert a `StringRef` into an array of `unsigned char`?
Aha, i see, did not think of that.
But there is a `bytes()` function in `StringRef`, which returns 
`iterator_range<const unsigned char *>`. 
Would it help? 
http://llvm.org/doxygen/classllvm_1_1StringRef.html#a5e8f22c3553e341404b445430a3b075b
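
The conversion under discussion can be sketched without LLVM. This is a minimal illustration under stated assumptions, not the patch's code: `std::string_view` stands in for `llvm::StringRef`, `std::vector<uint64_t>` stands in for the `Record` buffer, and `bytesToRecord` is a hypothetical helper name. It shows the idea behind a byte range such as the one `bytes()` exposes: view the characters as `unsigned char` and build the record from that range in one step.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical helper: builds a record of 64-bit operands from the bytes of
// Name. Reading through `const unsigned char *` keeps bytes >= 0x80 from
// being sign-extended when they are widened to uint64_t.
std::vector<uint64_t> bytesToRecord(std::string_view Name) {
  const auto *Begin = reinterpret_cast<const unsigned char *>(Name.data());
  return std::vector<uint64_t>(Begin, Begin + Name.size());
}
```

With a plain `char` loop the same result needs an explicit `static_cast<unsigned char>` on platforms where `char` is signed, which is the main reason to prefer an unsigned byte range here.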



Comment at: clang-doc/BitcodeWriter.cpp:240
+  if (!prepRecordData(ID, Val)) return;
+  assert(Val < (1U << BitCodeConstants::IntSize));
+  Record.push_back(Val);

juliehockett wrote:
> lebedev.ri wrote:
> > Ok, now that i think about it, it can't be that easy.
> > Maybe
> > ```
> > FIXME: assumes 8 bits per byte
> > assert(llvm::APInt(8U*sizeof(Val), Val, /*isSigned=*/true).getBitWidth() <= 
> > BitCodeConstants::IntSize));
> > ```
> > Not sure whether `getBitWidth()` is really the right function to ask though.
> > (Not sure how this all works for negative numbers)
> That assertion fails :/ I could do something like `static_cast(Val) 
> == Val` but that would a) require IntSize being a power of 2, b) require 
> updating the assert anytime IntSize is updated, and c) still throw a warning 
> about comparing a signed to an unsigned int...
I see. Let's not have this assertion for now, just a `FIXME`.



Comment at: clang-doc/BitcodeWriter.h:53
+  BI_LAST = BI_COMMENT_BLOCK_ID
+};
+

juliehockett wrote:
> lebedev.ri wrote:
> > juliehockett wrote:
> > > lebedev.ri wrote:
> > > > So what *exactly* does `BitCodeConstants::SubblockIDSize` mean?
> > > > ```
> > > > static_assert(BI_LAST < (1U << BitCodeConstants::SubblockIDSize), "Too 
> > > > many block id's!");
> > > > ```
> > > > ?
> > > It's the current abbrev id width for the block (described [[ 
> > > https://llvm.org/docs/BitCodeFormat.html#enter-subblock-encoding | here 
> > > ]]), so it's the max id width for the block's abbrevs.
> > So in other words that `static_assert()` is doing the right thing?
> > Add it after the `enum BlockId{}` then please, will both document things, 
> > and ensure that things remain in a sane state.
> No...it's the (max) number of the abbrevs relevant to the block itself, which 
> is to say some subset 

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-26 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 136010.
juliehockett marked 10 inline comments as done.
juliehockett added a comment.

1. Moved the serialization logic out of the Mapper class and into its own 
namespace
2. Updated tests
3. Addressing comments


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/BitcodeWriter.cpp
  clang-doc/BitcodeWriter.h
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.h
  clang-doc/Mapper.cpp
  clang-doc/Mapper.h
  clang-doc/Representation.h
  clang-doc/Serialize.cpp
  clang-doc/Serialize.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst
  test/CMakeLists.txt
  test/clang-doc/mapper-class-in-class.cpp
  test/clang-doc/mapper-class-in-function.cpp
  test/clang-doc/mapper-class.cpp
  test/clang-doc/mapper-enum.cpp
  test/clang-doc/mapper-function.cpp
  test/clang-doc/mapper-method.cpp
  test/clang-doc/mapper-namespace.cpp
  test/clang-doc/mapper-struct.cpp
  test/clang-doc/mapper-union.cpp

Index: test/clang-doc/mapper-union.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-union.cpp
@@ -0,0 +1,28 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@U@D.bc --dump | FileCheck %s
+
+union D { int X; int Y; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@U@D'
+  // CHECK:  blob data = 'D'
+  // CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::X'
+// CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::Y'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-struct.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-struct.cpp
@@ -0,0 +1,22 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@S@C.bc --dump | FileCheck %s
+
+struct C { int i; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@S@C'
+  // CHECK:  blob data = 'C'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'C::i'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-namespace.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,17 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@N@A.bc --dump | FileCheck %s
+
+namespace A {}
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@N@A'
+  // CHECK:  blob data = 'A'
+// CHECK: 
+
Index: test/clang-doc/mapper-method.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-method.cpp
@@ -0,0 +1,31 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@S@G@F@Method#I#.bc --dump | FileCheck %s
+
+class G {
+public: 
+	int Method(int param) { return param; }
+};
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@S@G@F@Method#I#'
+  // CHECK:  blob data = 'Method'
+  // CHECK:  blob data = 'c:@S@G'
+  // CHECK: 
+  // CHECK: 
+  // CHECK:  blob data = 'c:@S@G'
+  // CHECK: 
+// CHECK:  blob data = 'int'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+  // CHECK: 
+// CHECK: 
+
Index: test/clang-doc/mapper-function.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-function.cpp
@@ -0,0 +1,25 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@F@F#I#.bc --dump | FileCheck %s
+
+int F(int param) { return param; }
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@F@F#I#'
+  // CHECK:  blob data = 'F'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+  // CHECK: 
+// CHECK: 
+
Index: test/clang-doc/mapper-enum.cpp

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-26 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett added a comment.

In https://reviews.llvm.org/D41102#1017918, @lebedev.ri wrote:

> Is there some (internal to `BitstreamWriter`) logic that would 'assert()' if 
> trying to output some recordid
>  which, according to the `BLOCKINFO_BLOCK`, should not be there?
>  E.g. outputting `VERSION` in `BI_COMMENT_BLOCK_ID`?


Yes -- it will fail an assertion:
`Assertion 'V == Op.getLiteralValue() && "Invalid abbrev for record!"' failed.`




Comment at: clang-doc/BitcodeWriter.cpp:191
+  Record.clear();
+  for (const char C : BlockIdNameMap[ID]) Record.push_back(C);
+  Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_BLOCKNAME, Record);

lebedev.ri wrote:
> Why do we have this indirection?
> Is there a need to first (inefficiently?) copy to `Record`, and then emit 
> from there?
> Wouldn't this work just as well?
> ```
> Record.clear();
> Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_BLOCKNAME, BlockIdNameMap[ID]);
> ```
No, since `BlockIdNameMap[ID]` returns a `StringRef`, which can be manipulated 
into an `std::string` or a `const char*`, but the `Stream` wants an `unsigned 
char`. So, the copying is to satisfy that. Unless there's a better way to 
convert a `StringRef` into an array of `unsigned char`?



Comment at: clang-doc/BitcodeWriter.cpp:196
+/// \brief Emits a record name to the BLOCKINFO block.
+void ClangDocBitcodeWriter::emitRecordID(RecordId ID) {
+  assert(RecordIdNameMap[ID] && "Unknown Abbreviation");

lebedev.ri wrote:
> Hmm, so i've been staring at this and 
> http://llvm.org/doxygen/classllvm_1_1BitstreamWriter.html and i must say i'm 
> not fond of this indirection.
> 
> What i don't understand is, in previous function, we don't store `BlockId`, 
> why do we want to store `RecordId`?
> Aren't they both unstable, and are implementation detail?
> Do we want to store it (`RecordId`)? If yes, please explain it as a new 
> comment in code.
> 
> If no, i guess this would work too?
> ```
> assert(RecordIdNameMap[ID] && "Unknown Abbreviation");
> Record.clear();
> Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_SETRECORDNAME, 
> RecordIdNameMap[ID].Name);
> ```
> And after that you can lower the default size of `SmallVector<> Record` down 
> to, hm, `4`?
I'm not entirely certain what you mean -- in `emitBlockId()`, we are storing 
both the block id and the block name in separate records 
(`BLOCKINFO_CODE_SETBID`, `BLOCKINFO_CODE_BLOCKNAME`, respectively). In 
`emitRecordId()`, we're doing something slightly different, in that we emit one 
record with both the record id and the record name (in record 
`BLOCKINFO_CODE_SETRECORDNAME`). 

Replacing the copy loop here has the same issue as above, namely that there 
isn't an easy way to convert between a `StringRef` and an array of `unsigned 
char`.
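
The difference between the two functions comes down to record layout, which can be sketched with plain vectors. This is an illustration under stated assumptions rather than the patch's code: `Record` is a stand-in for the writer's record buffer, and the three `make...` helpers are hypothetical names. The layouts follow the BLOCKINFO record descriptions in the LLVM bitstream format: `SETBID` holds only a block id, `BLOCKNAME` holds only name bytes, and `SETRECORDNAME` holds a record id followed by name bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>
#include <vector>

using Record = std::vector<uint64_t>;

// Appends the bytes of Name to R, one operand per character.
void appendName(Record &R, std::string_view Name) {
  for (const char C : Name)
    R.push_back(static_cast<unsigned char>(C));
}

// SETBID: the block id on its own.
Record makeSetBid(unsigned BlockId) { return Record{BlockId}; }

// BLOCKNAME: the name only; it applies to the block selected by SETBID.
Record makeBlockName(std::string_view Name) {
  Record R;
  appendName(R, Name);
  return R;
}

// SETRECORDNAME: the record id and its name travel in the same record.
Record makeSetRecordName(unsigned RecordId, std::string_view Name) {
  Record R{RecordId};
  appendName(R, Name);
  return R;
}
```

Because the record id has to sit in front of the name inside a single operand list, some buffer that holds both is needed for `SETRECORDNAME` even if the block name could be emitted directly.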



Comment at: clang-doc/BitcodeWriter.cpp:240
+  if (!prepRecordData(ID, Val)) return;
+  assert(Val < (1U << BitCodeConstants::IntSize));
+  Record.push_back(Val);

lebedev.ri wrote:
> Ok, now that i think about it, it can't be that easy.
> Maybe
> ```
> FIXME: assumes 8 bits per byte
> assert(llvm::APInt(8U*sizeof(Val), Val, /*isSigned=*/true).getBitWidth() <= 
> BitCodeConstants::IntSize));
> ```
> Not sure whether `getBitWidth()` is really the right function to ask though.
> (Not sure how this all works for negative numbers)
That assertion fails :/ I could do something like `static_cast(Val) == 
Val` but that would a) require IntSize being a power of 2, b) require updating 
the assert anytime IntSize is updated, and c) still throw a warning about 
comparing a signed to an unsigned int...
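
One way to express the range check without tying it to a power-of-two width or to a particular cast is to compare against explicit limits. The sketch below is an assumption-laden illustration, not something proposed in the review: `fitsUnsigned` and `fitsSigned` are hypothetical helpers, and the width is passed in as a plain parameter instead of coming from `BitCodeConstants`.

```cpp
#include <cassert>
#include <cstdint>

// True if Val can be stored in an unsigned field of Bits bits (0 < Bits < 64).
// Negative values never fit, and the comparison is done entirely in unsigned
// arithmetic, so no signed/unsigned comparison warning is triggered.
constexpr bool fitsUnsigned(int64_t Val, unsigned Bits) {
  return Val >= 0 && static_cast<uint64_t>(Val) < (uint64_t{1} << Bits);
}

// True if Val can be stored in a two's-complement field of Bits bits
// (0 < Bits < 64).
constexpr bool fitsSigned(int64_t Val, unsigned Bits) {
  const int64_t Limit = int64_t{1} << (Bits - 1);
  return Val >= -Limit && Val < Limit;
}
```

An assertion built on either helper keeps working when the field width changes, since the width is an argument and not baked into a cast.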



Comment at: clang-doc/BitcodeWriter.h:53
+  BI_LAST = BI_COMMENT_BLOCK_ID
+};
+

lebedev.ri wrote:
> juliehockett wrote:
> > lebedev.ri wrote:
> > > So what *exactly* does `BitCodeConstants::SubblockIDSize` mean?
> > > ```
> > > static_assert(BI_LAST < (1U << BitCodeConstants::SubblockIDSize), "Too 
> > > many block id's!");
> > > ```
> > > ?
> > It's the current abbrev id width for the block (described [[ 
> > https://llvm.org/docs/BitCodeFormat.html#enter-subblock-encoding | here 
> > ]]), so it's the max id width for the block's abbrevs.
> So in other words that `static_assert()` is doing the right thing?
> Add it after the `enum BlockId{}` then please, will both document things, and 
> ensure that things remain in a sane state.
No...it's the (max) number of the abbrevs relevant to the block itself, which 
is to say some subset of the RecordIds for any given block (e.g. for a 
`BI_COMMENT_BLOCK`,  the number of abbrevs would be 12 and so on the abbrev 
width would be 4). 

To assert for it we could put block start/end markers on the RecordIds and then 
use that to calculate the bitwidth, if you think the assertion should be there.
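
The relationship between the number of abbreviations and the abbreviation width can be sketched as follows. This is one reading that is consistent with the figures above (12 abbreviations, width 4), not code from the patch; `abbrevWidthFor` is a hypothetical helper. It relies on the bitstream format reserving abbreviation IDs 0 to 3, so application-defined abbreviations start at ID 4.

```cpp
#include <cassert>

// Abbreviation IDs 0-3 are reserved by the bitstream format (END_BLOCK,
// ENTER_SUBBLOCK, DEFINE_ABBREV, UNABBREV_RECORD).
constexpr unsigned FirstApplicationAbbrev = 4;

// Smallest bit width that can encode the IDs of NumAbbrevs
// application-defined abbreviations on top of the reserved ones.
constexpr unsigned abbrevWidthFor(unsigned NumAbbrevs) {
  const unsigned MaxId = FirstApplicationAbbrev + NumAbbrevs - 1;
  unsigned Width = 0;
  for (unsigned V = MaxId; V != 0; V >>= 1)
    ++Width;
  return Width;
}
```

With 12 abbreviations the largest ID is 15, which needs 4 bits; a 13th abbreviation would push the largest ID to 16 and the width to 5.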


https://reviews.llvm.org/D41102




[PATCH] D41102: Setup clang-doc frontend framework

2018-02-24 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

Thank you for working on this!
Some more thoughts.




Comment at: clang-doc/BitcodeWriter.cpp:191
+  Record.clear();
+  for (const char C : BlockIdNameMap[ID]) Record.push_back(C);
+  Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_BLOCKNAME, Record);

Why do we have this indirection?
Is there a need to first (inefficiently?) copy to `Record`, and then emit 
from there?
Wouldn't this work just as well?
```
Record.clear();
Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_BLOCKNAME, BlockIdNameMap[ID]);
```



Comment at: clang-doc/BitcodeWriter.cpp:196
+/// \brief Emits a record name to the BLOCKINFO block.
+void ClangDocBitcodeWriter::emitRecordID(RecordId ID) {
+  assert(RecordIdNameMap[ID] && "Unknown Abbreviation");

Hmm, so i've been staring at this and 
http://llvm.org/doxygen/classllvm_1_1BitstreamWriter.html and i must say i'm 
not fond of this indirection.

What i don't understand is, in previous function, we don't store `BlockId`, why 
do we want to store `RecordId`?
Aren't they both unstable, and are implementation detail?
Do we want to store it (`RecordId`)? If yes, please explain it as a new comment 
in code.

If no, i guess this would work too?
```
assert(RecordIdNameMap[ID] && "Unknown Abbreviation");
Record.clear();
Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_SETRECORDNAME, 
RecordIdNameMap[ID].Name);
```
And after that you can lower the default size of `SmallVector<> Record` down 
to, hm, `4`?



Comment at: clang-doc/BitcodeWriter.h:161
+
+  using RecordData = SmallVector;
+

This alias is used exactly once, for `Record` member variable in this class.
Is there any point in having this alias?



Comment at: clang-doc/BitcodeWriter.h:161
+
+  using RecordData = SmallVector;
+

lebedev.ri wrote:
> This alias is used exactly once, for `Record` member variable in this class.
> Is there any point in having this alias?
Also, why is `uint64_t` used?
We either push `char`, or `enum`, or `int`. Do we ever need 64-bit?



Comment at: clang-doc/ClangDoc.h:47
+   bool OmitFilenames)
+  : Mapper(Ctx, ECtx, OmitFilenames){};
+  void HandleTranslationUnit(clang::ASTContext ) override {

Please add space before `{}`, and drop unneeded `;`



Comment at: clang-doc/Mapper.h:56
+ private:
+  class ClangDocCommentVisitor
+  : public ConstCommentVisitor {

`ClangDocMapper` class is starting to look like a god-class.
I would recommend:
1. Rename `ClangDocMapper` to `ClangDocASTVisitor`.
 It's kind-of conventional to name `RecursiveASTVisitor`-based classes like 
that.
2. Move `ClangDocCommentVisitor` out of the `ClangDocMapper`, into `namespace 
{}` in `clang-doc/Mapper.cpp`
3. 
  * Split `ClangDocSerializer` into new .h/.cpp
  * Replace `ClangDocSerializer Serializer;` with  `ClangDocSerializer& 
Serializer;`
  * Instantiate `ClangDocSerializer` (in `MapperActionFactory`, i think?) 
before `ClangDocMapper`
  * Pass `ClangDocSerializer&` into `ClangDocMapper` ctor.
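
The third recommendation amounts to constructing the serializer first and handing it to the visitor by reference. A minimal sketch of that ownership shape, assuming simplified stand-ins: `Serializer` and `ASTVisitor` below are hypothetical classes that only mimic the roles of `ClangDocSerializer` and `ClangDocMapper`, and `visitDecl` is an invented method.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical serializer: records what it was asked to emit.
class Serializer {
public:
  void emit(const std::string &USR) { Emitted.push_back(USR); }
  const std::vector<std::string> &emitted() const { return Emitted; }

private:
  std::vector<std::string> Emitted;
};

// Hypothetical visitor: refers to a serializer it does not own, so the
// serializer must be created first and must outlive the visitor.
class ASTVisitor {
public:
  explicit ASTVisitor(Serializer &Out) : Out(Out) {}
  void visitDecl(const std::string &USR) { Out.emit(USR); }

private:
  Serializer &Out;
};
```

A factory would then create one `Serializer`, pass it to each visitor it builds, and read the results back from the serializer once traversal is done.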


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-23 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

Could you please add a bit more tests? In particular, i'd like to see how 
blocks-in-blocks work.
I.e. class-in-class, class-in-function, ...

Is there some (internal to `BitstreamWriter`) logic that would 'assert()' if 
trying to output some recordid
which, according to the `BLOCKINFO_BLOCK`, should not be there?
E.g. outputting `VERSION` in `BI_COMMENT_BLOCK_ID`?




Comment at: clang-doc/BitcodeWriter.cpp:30
+
+static void IntAbbrev(std::shared_ptr<llvm::BitCodeAbbrev> &Abbrev) {
+  Abbrev->Add(llvm::BitCodeAbbrevOp(llvm::BitCodeAbbrevOp::Fixed,

Ok, these three functions still look off, how about this?
```
// Yes, not by reference, https://godbolt.org/g/T52Vcj
static void AbbrevGen(std::shared_ptr<llvm::BitCodeAbbrev> &Abbrev,
                      const std::initializer_list<llvm::BitCodeAbbrevOp> Ops) {
  for (const auto &Op : Ops)
    Abbrev->Add(Op);
}

static void IntAbbrev(std::shared_ptr<llvm::BitCodeAbbrev> &Abbrev) {
  AbbrevGen(Abbrev, {
      // 0. Fixed-size integer
      {llvm::BitCodeAbbrevOp::Fixed, BitCodeConstants::IntSize}});
}

static void StringAbbrev(std::shared_ptr<llvm::BitCodeAbbrev> &Abbrev) {
  AbbrevGen(Abbrev, {
      // 0. Fixed-size integer (length of the following string)
      {llvm::BitCodeAbbrevOp::Fixed, BitCodeConstants::StringLengthSize},
      // 1. The string blob
      {llvm::BitCodeAbbrevOp::Blob}});
}

// Assumes that the file will not have more than 65535 lines.
static void LocationAbbrev(std::shared_ptr<llvm::BitCodeAbbrev> &Abbrev) {
  AbbrevGen(Abbrev, {
      // 0. Fixed-size integer (line number)
      {llvm::BitCodeAbbrevOp::Fixed, BitCodeConstants::LineNumberSize},
      // 1. Fixed-size integer (length of the following string (filename))
      {llvm::BitCodeAbbrevOp::Fixed, BitCodeConstants::StringLengthSize},
      // 2. The string blob
      {llvm::BitCodeAbbrevOp::Blob}});
}
```
Though i bet clang-format will mess-up the formatting again :/



Comment at: clang-doc/BitcodeWriter.cpp:108
+  {COMMENT_CLOSENAME, {"CloseName", }},
+  {COMMENT_SELFCLOSING, {"SelfClosing", }},
+  {COMMENT_EXPLICIT, {"Explicit", }},

Some of these `IntAbbrev`'s are actually `bool`s.
Would it make sense to already think about being bitcode-size-conservative and 
introduce `BoolAbbrev` from the get go?
```
static void BoolAbbrev(std::shared_ptr<llvm::BitCodeAbbrev> &Abbrev) {
  AbbrevGen(Abbrev, {
      // 0. Fixed-size boolean
      {llvm::BitCodeAbbrevOp::Fixed, BitCodeConstants::BoolSize}});
}
```
where `BitCodeConstants::BoolSize` = `1U`
 ?
Or is there some internal padding that would make that pointless?



Comment at: clang-doc/BitcodeWriter.cpp:156
+ unsigned AbbrevID) {
+  assert(RecordIdNameMap[RID] && "Unknown Abbreviation");
+  assert(Abbrevs.find(RID) == Abbrevs.end() && "Abbreviation already added.");

Uh, oh, i'm sorry, all(?) these `"Unknown Abbreviation"` are likely copypaste 
gone wrong.
I'm not sure why i wrote that comment. `"Unknown RecordId"` might make more 
sense?



Comment at: clang-doc/BitcodeWriter.cpp:240
+  if (!prepRecordData(ID, Val)) return;
+  assert(Val < (1U << BitCodeConstants::IntSize));
+  Record.push_back(Val);

Ok, now that i think about it, it can't be that easy.
Maybe
```
// FIXME: assumes 8 bits per byte
assert(llvm::APInt(8U*sizeof(Val), Val, /*isSigned=*/true).getBitWidth() <= 
BitCodeConstants::IntSize);
```
Not sure whether `getBitWidth()` is really the right function to ask though.
(Not sure how this all works for negative numbers)



Comment at: clang-doc/BitcodeWriter.h:53
+  BI_LAST = BI_COMMENT_BLOCK_ID
+};
+

juliehockett wrote:
> lebedev.ri wrote:
> > So what *exactly* does `BitCodeConstants::SubblockIDSize` mean?
> > ```
> > static_assert(BI_LAST < (1U << BitCodeConstants::SubblockIDSize), "Too many 
> > block id's!");
> > ```
> > ?
> It's the current abbrev id width for the block (described [[ 
> https://llvm.org/docs/BitCodeFormat.html#enter-subblock-encoding | here ]]), 
> so it's the max id width for the block's abbrevs.
So in other words that `static_assert()` is doing the right thing?
Add it after the `enum BlockId{}` then please, will both document things, and 
ensure that things remain in a sane state.



Comment at: clang-doc/BitcodeWriter.h:172
+AbbreviationMap() : Abbrevs(RecordIdCount) {}
+void add(RecordId RID, unsigned AbbrevID);
+unsigned get(RecordId RID) const;

Newline after constructor



Comment at: clang-doc/BitcodeWriter.h:216
+
+  // Emission of different abbreviation types
+  void emitAbbrev(RecordId ID, BlockId Block);

`// Emission of appropriate abbreviation type`


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-23 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 135682.
juliehockett added a comment.

Fixing CMakeLists formatting


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/BitcodeWriter.cpp
  clang-doc/BitcodeWriter.h
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.h
  clang-doc/Mapper.cpp
  clang-doc/Mapper.h
  clang-doc/Representation.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst
  test/CMakeLists.txt
  test/clang-doc/mapper-class.cpp
  test/clang-doc/mapper-enum.cpp
  test/clang-doc/mapper-function.cpp
  test/clang-doc/mapper-method.cpp
  test/clang-doc/mapper-namespace.cpp
  test/clang-doc/mapper-struct.cpp
  test/clang-doc/mapper-union.cpp

Index: test/clang-doc/mapper-union.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-union.cpp
@@ -0,0 +1,32 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@U@D.bc --dump | FileCheck %s
+
+union D { int X; int Y; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@U@D'
+  // CHECK:  blob data = 'D'
+  // CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::X'
+// CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::Y'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+
+
+
+
Index: test/clang-doc/mapper-struct.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-struct.cpp
@@ -0,0 +1,26 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@S@C.bc --dump | FileCheck %s
+
+struct C { int i; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@S@C'
+  // CHECK:  blob data = 'C'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'C::i'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+
+
+
+
Index: test/clang-doc/mapper-namespace.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,19 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@N@A.bc --dump | FileCheck %s
+
+namespace A {}
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@N@A'
+  // CHECK:  blob data = 'A'
+// CHECK: 
+
+
+
Index: test/clang-doc/mapper-method.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-method.cpp
@@ -0,0 +1,33 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@S@G@F@Method#I#.bc --dump | FileCheck %s
+
+class G {
+public: 
+	int Method(int param) { return param; }
+};
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@S@G@F@Method#I#'
+  // CHECK:  blob data = 'Method'
+  // CHECK: 
+  // CHECK:  blob data = 'c:@S@G'
+  // CHECK: 
+// CHECK:  blob data = 'int'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+  // CHECK: 
+// CHECK: 
+
+
+
+
+
Index: test/clang-doc/mapper-function.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-function.cpp
@@ -0,0 +1,29 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@F@F#I#.bc --dump | FileCheck %s
+
+int F(int param) { return param; }
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@F@F#I#'
+  // CHECK:  blob data = 'F'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+  // CHECK: 
+// CHECK: 
+
+
+
+
+
Index: test/clang-doc/mapper-enum.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-enum.cpp
@@ -0,0 +1,27 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: 

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-23 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett added a comment.

In https://reviews.llvm.org/D41102#1017499, @Athosvk wrote:

> Disadvantage is of course that you add complexity to certain parts of the 
> deserialization (/serialization) for nested types and inheritance, by either 
> having to do so in the correct order or having to defer the process of 
> initializing these pointers. But see this as just some thought sharing. I 
> do think this would improve the interaction in the backend (assuming you use 
> the same representation as currently in the frontend).


I agree that the pointer approach would be much more efficient on the backend, 
but the issue here is that the mapper has no idea where the representation of 
anything other than the decl it's currently looking at will be, since it sees 
each decl and serializes it immediately. The reducer, on the other hand, will 
be able to see everything, and so such pointers could be added as a pass over 
the final reduced data structure.
So, as an idea (as this diff implements), I updated the string references to be 
a struct, which holds the USR of the referenced type (for serialization, both 
here in the mapper and for the dump option in the reducer), as well as a pointer 
to an `Info` struct. This pointer is not used at this point, but would be 
populated by the reducer. Thoughts?
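
The two-phase idea described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the layout in `Representation.h`: the fields of `Info` and `Reference` are guesses at the minimum needed, and `resolveReferences` is a hypothetical reducer-side pass.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, pared-down info record.
struct Info {
  std::string USR;
  std::string Name;
};

// Hypothetical reference: the mapper only knows the USR, so the pointer
// stays null until a later pass can see every Info.
struct Reference {
  std::string USR;
  const Info *Ref = nullptr;
};

// Reducer-style pass: looks each USR up in the complete set of infos and
// fills in the pointer when a match exists. Returns how many were resolved.
std::size_t resolveReferences(const std::map<std::string, Info> &Infos,
                              std::vector<Reference> &Refs) {
  std::size_t Resolved = 0;
  for (Reference &R : Refs) {
    const auto It = Infos.find(R.USR);
    if (It != Infos.end()) {
      R.Ref = &It->second;
      ++Resolved;
    }
  }
  return Resolved;
}
```

Keeping the USR in the struct means serialization never depends on the pointer, so a reference to a declaration the reducer has not seen simply stays unresolved.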

> Have you actually started work already on some backend? Developing backend 
> and frontend in tandem can provide some additional insights as to how things 
> should be structured, especially representation-wise!

I added you as a subscriber on the follow-up patches (the reducer, YAML/MD 
formats) -- would love to hear your thoughts! As of now, the MD output is very 
rough, but I'm hoping to keep moving forward on that in the next few days.




Comment at: clang-doc/BitcodeWriter.h:53
+  BI_LAST = BI_COMMENT_BLOCK_ID
+};
+

lebedev.ri wrote:
> So what *exactly* does `BitCodeConstants::SubblockIDSize` mean?
> ```
> static_assert(BI_LAST < (1U << BitCodeConstants::SubblockIDSize), "Too many 
> block id's!");
> ```
> ?
It's the current abbrev id width for the block (described [[ 
https://llvm.org/docs/BitCodeFormat.html#enter-subblock-encoding | here ]]), so 
it's the max id width for the block's abbrevs.



Comment at: clang-doc/BitcodeWriter.h:94
+  FUNCTION_LOCATION,
+  FUNCTION_MANGLED_NAME,
+  FUNCTION_PARENT,

lebedev.ri wrote:
> So i have a question: if something (`FUNCTION_MANGLED_NAME` in this case) is 
> phased out, does it have to stay in this enum?
> That will introduce holes in `RecordIdNameMap`.
> Are the actual numerical id's of enumerators stored in the bitcode, or the 
> string (abbrev, `RecordIdNameMap[].Name`)?
> 
> Looking at tests, i guess these enums are internal detail, and they can be 
> changed freely, including removing enumerators.
> Am i wrong?
> 
> I think that should be explained in a comment before this `enum`.
Yes, the enum is an implementation detail (`FUNCTION_MANGLED_NAME` should have 
been removed earlier). I'll put the comment describing how it works!


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-23 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 135678.
juliehockett marked 29 inline comments as done.
juliehockett added a comment.

1. Continued refactoring the bitcode writer
2. Added a USR attribute to infos
3. Created a Reference struct to replace the string references to other infos


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/BitcodeWriter.cpp
  clang-doc/BitcodeWriter.h
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.h
  clang-doc/Mapper.cpp
  clang-doc/Mapper.h
  clang-doc/Representation.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst
  test/CMakeLists.txt
  test/clang-doc/mapper-class.cpp
  test/clang-doc/mapper-enum.cpp
  test/clang-doc/mapper-function.cpp
  test/clang-doc/mapper-method.cpp
  test/clang-doc/mapper-namespace.cpp
  test/clang-doc/mapper-struct.cpp
  test/clang-doc/mapper-union.cpp

Index: test/clang-doc/mapper-union.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-union.cpp
@@ -0,0 +1,32 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@U@D.bc --dump | FileCheck %s
+
+union D { int X; int Y; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@U@D'
+  // CHECK:  blob data = 'D'
+  // CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::X'
+// CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::Y'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+
+
+
+
Index: test/clang-doc/mapper-struct.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-struct.cpp
@@ -0,0 +1,26 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@S@C.bc --dump | FileCheck %s
+
+struct C { int i; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@S@C'
+  // CHECK:  blob data = 'C'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'C::i'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+
+
+
+
Index: test/clang-doc/mapper-namespace.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,19 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@N@A.bc --dump | FileCheck %s
+
+namespace A {}
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@N@A'
+  // CHECK:  blob data = 'A'
+// CHECK: 
+
+
+
Index: test/clang-doc/mapper-method.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-method.cpp
@@ -0,0 +1,33 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@S@G@F@Method#I#.bc --dump | FileCheck %s
+
+class G {
+public: 
+	int Method(int param) { return param; }
+};
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@S@G@F@Method#I#'
+  // CHECK:  blob data = 'Method'
+  // CHECK: 
+  // CHECK:  blob data = 'c:@S@G'
+  // CHECK: 
+// CHECK:  blob data = 'int'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+  // CHECK: 
+// CHECK: 
+
+
+
+
+
Index: test/clang-doc/mapper-function.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-function.cpp
@@ -0,0 +1,29 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@F@F#I#.bc --dump | FileCheck %s
+
+int F(int param) { return param; }
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'c:@F@F#I#'
+  // CHECK:  blob data = 'F'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+  // CHECK: 
+// CHECK: 
+
+
+
+
+
Index: test/clang-doc/mapper-enum.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-enum.cpp
@@ -0,0 +1,27 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-23 Thread Eugene Zelenko via Phabricator via cfe-commits
Eugene.Zelenko added a comment.

Please run Clang-format and Clang-tidy modernize.




Comment at: clang-doc/Representation.h:80
+  : LineNumber(LineNumber), Filename(std::move(Filename)) {}
+  int LineNumber;
+  std::string Filename;

Please separate constructors from data members with empty line.


https://reviews.llvm.org/D41102



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D41102: Setup clang-doc frontend framework

2018-02-23 Thread Athos via Phabricator via cfe-commits
Athosvk added a comment.

The change to USR seems like quite an improvement already! That being said, I 
do think that it might be preferable to opt out of the use of strings for 
linking things together. What we did with our clang-doc is that we directly 
used pointers to refer to other types. So for example, our class for storing 
Record/CXX related information has something like:

  std::vector mMethods;
  std::vector mVariables;
  std::vector mEnums;
  std::vector mTypedefs;

Only upon serialization do we fetch some kind of USR that uniquely identifies 
the type. This is especially useful to us for the conversion to HTML, and I 
think the same would go for this backend: as it stands, you'll have to do 
string lookups to get to the actual types, which would be inefficient in 
multiple respects. Using pointers can make the backend more of a one-to-one 
conversion, e.g. with one of our HTML template definitions (note: this is a 
Jinja2 template in Python):

  {%- for enum in inEntry.GetMemberEnums() -%}




{{- 
Modifiers.RenderAccessModifier(enum.GetAccessModifier()) -}}
enum {{- enum.GetName().GetName()|e -}}
{{- 
Descriptions.RenderDescription(enum.GetBriefDescription()) -}}

  {%- endfor -%}

The disadvantage is, of course, that you add complexity to certain parts of the 
deserialization (and serialization) for nested types and inheritance, by either 
having to do so in the correct order or having to defer the process of 
initializing these pointers. But take this as just some thought sharing. I do 
think this would improve the interaction in the backend (assuming you use the 
same representation as currently in the frontend). Also, we didn't apply this 
to our Type representation (which we use to store the type of a member, 
parameter etc.): it stores the name of the type rather than a pointer to it 
(since it can also be a built-in), though it embeds pretty much every possible 
modifier on said type, like this:

  EntryName mName;
  bool mIsConst = false;
  EReferenceType mReferenceType = EReferenceType::None;
  std::vector mPointerConstnessMask;
  std::vector mArraySizes;
  bool mIsAtomic = false;
  std::vector mAttributes;
  bool mIsExpansion = false;
  std::vector mTemplateArguments;
  std::unique_ptr mFunctionTypeProperties = nullptr;
  EntryName mParentCXXEntry;

The last member refers to the case where a pointer is a pointer to member, 
though some other fields may require some explaining too. Anyway, this is just 
to give some insight into how we structured our representation, where we 
largely omitted string representations where possible.
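
A minimal sketch of the trade-off described above, assuming made-up types (`MethodEntry`, `RecordEntry`) and a made-up helper (`linkMethods`); none of these names come from the patch or from the other tool. The serialized form keeps USR strings, and a second pass after loading turns them into direct pointers, which is the "defer the process of initializing these pointers" option:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins; neither type is taken from the patch.
struct MethodEntry {
  std::string USR;
  std::string Name;
};

struct RecordEntry {
  std::string USR;
  std::vector<std::string> MethodUSRs;       // what the serialized form stores
  std::vector<const MethodEntry *> Methods;  // what a backend would walk
};

using MethodIndex =
    std::unordered_map<std::string, std::unique_ptr<MethodEntry>>;

// Second pass after deserialization: resolve stored USRs to pointers.
// Returns how many USRs had no matching entry in the index.
inline std::size_t linkMethods(RecordEntry &Record, const MethodIndex &Index) {
  std::size_t Unresolved = 0;
  Record.Methods.clear();
  for (const std::string &USR : Record.MethodUSRs) {
    auto It = Index.find(USR);
    if (It == Index.end()) {
      ++Unresolved;  // referenced entity was never loaded
      continue;
    }
    Record.Methods.push_back(It->second.get());
  }
  return Unresolved;
}
```

After linking, a backend can iterate `Methods` directly; the string lookup happens once per reference instead of once per use.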

Have you actually started work already on some backend? Developing backend and 
frontend in tandem can provide some additional insights as to how things should 
be structured, especially representation-wise!




Comment at: clang-doc/Representation.h:113
+  TagTypeKind TagType;
+  llvm::SmallVector Members;
+  llvm::SmallVector ParentUSRs;

How come these are actually unique ptrs? They can be stored directly in the 
vector, right? (same for CommentInfo children, FunctionInfo params etc.)


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-23 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

Next, i suggest looking into code self-debugging, see comments.
Also, i have added a few questions; it would be great to know whether my 
understanding is correct.

I'm sorry that it seems like we are going over the same code over and over 
again, but this is the very base of the tool, and i think it is important to 
get it as close to great as possible.
I *think* these review comments move it in that direction, not in the opposite 
one.




Comment at: clang-doc/BitcodeWriter.cpp:47
+  BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed,
+  BitCodeConstants::LineNumFixedSize));  // Line number
+  Abbrev->Add(

So in other words this is making an assumption that no file with more than 
65535 lines will be analyzed, correct?
Can you add that as comment please?
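
To make the assumption concrete, here is a small sketch of the range check implied by a fixed-width field. The width of 16 bits is inferred from the 65535 figure above, and `fitsInFixed` is a hypothetical helper, not something in the patch:

```cpp
#include <cassert>
#include <cstdint>

// Assumed width: the 65535 limit above implies a 16-bit Fixed operand.
constexpr unsigned LineNumFixedSize = 16;

// True if Value can be stored in a Fixed(Bits) operand without losing
// high bits. Widths of 64 or more accept any 64-bit value.
constexpr bool fitsInFixed(std::uint64_t Value, unsigned Bits) {
  return Bits >= 64 || Value < (std::uint64_t{1} << Bits);
}
```

A writer could assert `fitsInFixed(LineNumber, LineNumFixedSize)` before emitting a location, so an oversized file fails loudly rather than silently.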



Comment at: clang-doc/BitcodeWriter.cpp:56
+  StringRef Name;
+  AbbrevDsc Abbrev;
+

```
  AbbrevDsc Abbrev = nullptr;
```



Comment at: clang-doc/BitcodeWriter.cpp:57
+  AbbrevDsc Abbrev;
+
+  RecordIdDsc() = default;

```
// Is this 'description' valid?
operator bool() const {
  return Abbrev != nullptr && Name.data() != nullptr && !Name.empty();
}
```
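
A self-contained sketch of the same idea, with `std::string` standing in for `StringRef` and a parameterless callback standing in for the real `AbbrevDsc`. Marking the conversion `explicit` is a choice made here, not something the comment above asks for; it still works inside `assert(...)` and `if (...)`:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the abbreviation callback type.
using AbbrevDsc = void (*)();

struct RecordIdDsc {
  std::string Name;            // stand-in for llvm::StringRef
  AbbrevDsc Abbrev = nullptr;  // stays null for table slots never filled in

  RecordIdDsc() = default;
  RecordIdDsc(std::string Name, AbbrevDsc Abbrev)
      : Name(std::move(Name)), Abbrev(Abbrev) {}

  // Is this 'description' valid?
  explicit operator bool() const { return Abbrev != nullptr && !Name.empty(); }
};

// Placeholder callback so a valid description can be built.
inline void noopAbbrev() {}
```

A default-constructed entry converts to `false`, so an `IndexedMap` slot that was resized but never assigned is distinguishable from a registered one.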



Comment at: clang-doc/BitcodeWriter.cpp:137
+  {FUNCTION_LOCATION, {"Location", }},
+  {FUNCTION_PARENT, {"Parent", }},
+  {FUNCTION_ACCESS, {"Access", }}};

So `FUNCTION_MANGLED_NAME` is phased out, and is thus missing, as far as i 
understand?



Comment at: clang-doc/BitcodeWriter.cpp:148
+void ClangDocBitcodeWriter::AbbreviationMap::add(RecordId RID,
+ unsigned AbbrevID) {
+  assert(Abbrevs.find(RID) == Abbrevs.end() && "Abbreviation already added.");

+`assert(RecordIdNameMap[RID] && "Unknown Abbreviation");`



Comment at: clang-doc/BitcodeWriter.cpp:153
+
+unsigned ClangDocBitcodeWriter::AbbreviationMap::get(RecordId RID) const {
+  assert(Abbrevs.find(RID) != Abbrevs.end() && "Unknown abbreviation.");

+`assert(RecordIdNameMap[RID] && "Unknown Abbreviation");`



Comment at: clang-doc/BitcodeWriter.cpp:158
+
+void ClangDocBitcodeWriter::AbbreviationMap::clear() { Abbrevs.clear(); }
+

Called only once, and that call does nothing.
I'd drop it.



Comment at: clang-doc/BitcodeWriter.cpp:175
+/// \brief Emits a block ID and the block name to the BLOCKINFO block.
+void ClangDocBitcodeWriter::emitBlockID(BlockId ID) {
+  Record.clear();

```
/// \brief Emits a block ID and the block name to the BLOCKINFO block.
void ClangDocBitcodeWriter::emitBlockID(BlockId ID) {
  const auto& BlockIdName = BlockIdNameMap[ID];
  assert(BlockIdName.data() && BlockIdName.size() && "Unknown BlockId!");

  Record.clear();
  Record.push_back(ID);
  Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_SETBID, Record);

  Record.clear();
  for (const char C : BlockIdName) Record.push_back(C);
  Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_BLOCKNAME, Record);
}
```



Comment at: clang-doc/BitcodeWriter.cpp:187
+void ClangDocBitcodeWriter::emitRecordID(RecordId ID) {
+  prepRecordData(ID);
+  for (const char C : RecordIdNameMap[ID].Name) Record.push_back(C);

```
/// \brief Emits a record name to the BLOCKINFO block.
void ClangDocBitcodeWriter::emitRecordID(RecordId ID) {
  assert(RecordIdNameMap[ID] && "Unknown Abbreviation");
  prepRecordData(ID);
```
(Yes, `prepRecordData()` will have the same code. It should get optimized away.)



Comment at: clang-doc/BitcodeWriter.cpp:194
+
+void ClangDocBitcodeWriter::emitAbbrev(RecordId ID, BlockId Block) {
+  auto Abbrev = std::make_shared();

```
void ClangDocBitcodeWriter::emitAbbrev(RecordId ID, BlockId Block) {
  assert(RecordIdNameMap[ID] && "Unknown Abbreviation");
  auto Abbrev = std::make_shared<BitCodeAbbrev>();
```



Comment at: clang-doc/BitcodeWriter.cpp:204
+void ClangDocBitcodeWriter::emitRecord(StringRef Str, RecordId ID) {
+  if (!prepRecordData(ID, !Str.empty())) return;
+  Record.push_back(Str.size());

So remember that in a previous iteration, the seemingly useless `AbbrevDsc` 
stuff was added to the `RecordIdNameMap`?
It is going to pay off now:
```
void ClangDocBitcodeWriter::emitRecord(StringRef Str, RecordId ID) {
  assert(RecordIdNameMap[ID] && "Unknown Abbreviation");
  assert(RecordIdNameMap[ID].Abbrev ==  && "Abbrev type mismatch");
  if (!prepRecordData(ID, !Str.empty())) return;
...
```
And if we did not add a `RecordIdNameMap` entry for this `RecordId`, then i 
believe that will also be detected because `Abbrev` will be a `nullptr`.
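
The check can be sketched in isolation. Everything below is a hypothetical miniature: the three-entry enum, the record names, and `usesAbbrev` are made up, and the callbacks return a label only so that they are visibly distinct functions:

```cpp
#include <cassert>

enum RecordId { VERSION, COMMENT_TEXT, RI_UNREGISTERED };

// Stand-in callbacks; the real ones would fill in a BitCodeAbbrev.
using AbbrevDsc = const char *(*)();
inline const char *IntAbbrev() { return "int"; }
inline const char *StringAbbrev() { return "string"; }

struct RecordIdDsc {
  const char *Name;
  AbbrevDsc Abbrev;
};

inline const RecordIdDsc &lookup(RecordId ID) {
  static const RecordIdDsc Table[] = {
      {"Version", &IntAbbrev},  // VERSION
      {"Text", &StringAbbrev},  // COMMENT_TEXT
      {nullptr, nullptr},       // RI_UNREGISTERED: deliberately left empty
  };
  return Table[ID];
}

// True only if ID was registered, and registered with the expected layout.
inline bool usesAbbrev(RecordId ID, AbbrevDsc Expected) {
  const RecordIdDsc &Dsc = lookup(ID);
  return Dsc.Name != nullptr && Dsc.Abbrev == Expected;
}
```

A string-emitting function would assert `usesAbbrev(ID, &StringAbbrev)`, which rejects both an integer record and an ID that was never added to the table.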



Comment at: clang-doc/BitcodeWriter.cpp:205
+  if (!prepRecordData(ID, !Str.empty())) return;
+  Record.push_back(Str.size());
+  

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-22 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 135559.
juliehockett marked 10 inline comments as done.
juliehockett added a comment.

Refactoring bitcode writer


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/BitcodeWriter.cpp
  clang-doc/BitcodeWriter.h
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.h
  clang-doc/Mapper.cpp
  clang-doc/Mapper.h
  clang-doc/Representation.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst
  test/CMakeLists.txt
  test/clang-doc/mapper-class.cpp
  test/clang-doc/mapper-enum.cpp
  test/clang-doc/mapper-function.cpp
  test/clang-doc/mapper-method.cpp
  test/clang-doc/mapper-namespace.cpp
  test/clang-doc/mapper-struct.cpp
  test/clang-doc/mapper-union.cpp

Index: test/clang-doc/mapper-union.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-union.cpp
@@ -0,0 +1,29 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@u...@d.bc --dump | FileCheck %s
+
+union D { int X; int Y; };
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'D'
+  // CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::X'
+// CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::Y'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+
+
+
Index: test/clang-doc/mapper-struct.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-struct.cpp
@@ -0,0 +1,24 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@s...@c.bc --dump | FileCheck %s
+
+struct C { int i; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'C'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'C::i'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+
+
+
Index: test/clang-doc/mapper-namespace.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,17 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@n...@a.bc --dump | FileCheck %s
+
+namespace A {}
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'A'
+// CHECK: 
+
+
Index: test/clang-doc/mapper-method.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-method.cpp
@@ -0,0 +1,31 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@S@G@F@Method#I#.bc --dump | FileCheck %s
+
+class G {
+public: 
+	int Method(int param) { return param; }
+};
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'Method'
+  // CHECK: 
+  // CHECK:  blob data = 'c:@S@G'
+  // CHECK: 
+// CHECK:  blob data = 'int'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+  // CHECK: 
+// CHECK: 
+
+
+
+
Index: test/clang-doc/mapper-function.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-function.cpp
@@ -0,0 +1,25 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@F@F#I#.bc --dump | FileCheck %s
+
+int F(int param) { return param; }
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'F'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+  // CHECK: 
+// CHECK: 
+
+
Index: test/clang-doc/mapper-enum.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-enum.cpp
@@ -0,0 +1,26 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@e...@b.bc --dump | FileCheck %s
+
+enum B { X, Y };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'B'
+  // 

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-22 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

An idea on how to further generalize/cleanup `emitBlockInfoBlock()`.

While *i think* this will help, i'm not sure how to further consolidate the 
`BlockIdNameMap`/`RecordIdNameMap` and the actual `emitBlock(*)`...




Comment at: clang-doc/BitcodeWriter.cpp:55
+}();
+
+static const IndexedMap RecordIdNameMap =

After thinking, i think the solution is to turn the lambdas in 
`emit{String,...}Abbrev()` into static functions, and store not a stringref in 
RecordIdNameMap, but a struct with stringref + pointer to one of the functions.
Since `RecordIdNameMap` is only used in `emitRecordID()`, which is only used in 
`emitBlockInfoBlock()`, i think it should not be too intrusive..

So
```
// Or, decltype(), or std::function?
using AbbrevDsc = void (*)(std::shared_ptr<BitCodeAbbrev> &Abbrev);

static void StringAbbrev(std::shared_ptr<BitCodeAbbrev> &Abbrev) {
  Abbrev->Add(
      BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed,
                      BitCodeConstants::LineNumFixedSize));  // String size
  Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob));       // String
}

static void LocationAbbrev(std::shared_ptr<BitCodeAbbrev> &Abbrev) {
  Abbrev->Add(
      BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed,
                      BitCodeConstants::LineNumFixedSize));  // Line number
  Abbrev->Add(
      BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed,
                      BitCodeConstants::LineNumFixedSize));  // Filename size
  Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob));       // Filename
}

static void IntAbbrev(std::shared_ptr<BitCodeAbbrev> &Abbrev) {
  Abbrev->Add(
      BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed,
                      BitCodeConstants::LineNumFixedSize));  // Integer
}

struct RecordIdDsc {
  StringRef Name;
  AbbrevDsc Abbrev;

  RecordIdDsc() = default;

  RecordIdDsc(StringRef Name_, AbbrevDsc Abbrev_)
      : Name(Name_), Abbrev(Abbrev_) {}
};

static const IndexedMap RecordIdNameMap =
[]() {
  IndexedMap RecordIdNameMap;
  static constexpr unsigned ExpectedSize = RI_LAST - RI_FIRST + 1;
  RecordIdNameMap.resize(ExpectedSize);

  // There is no init-list constructor for the IndexedMap, so have to
  // improvise
  static constexpr std::initializer_list<
  std::pair>
  inits = {{VERSION, {"Version", }},

```



Comment at: clang-doc/BitcodeWriter.cpp:156
+  prepRecordData(ID);
+  for (const char C : RecordIdNameMap[ID]) Record.push_back(C);
+  Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_SETRECORDNAME, Record);

```
for (const char C : RecordIdNameMap[ID].Name) Record.push_back(C);
```



Comment at: clang-doc/BitcodeWriter.cpp:162
+
+template 
+void ClangDocBitcodeWriter::emitAbbrev(RecordId ID, BlockId Block, Lambda &) 
{

```
void ClangDocBitcodeWriter::emitAbbrev(RecordId ID, BlockId Block) {
  auto Abbrev = std::make_shared<BitCodeAbbrev>();
  Abbrev->Add(BitCodeAbbrevOp(ID));
  RecordIdNameMap[ID].Abbrev(Abbrev);
  Abbrevs.add(ID, Stream.EmitBlockInfoAbbrev(Block, std::move(Abbrev)));
}
```
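
The dispatch through the table can be tried without LLVM. In this sketch `FakeAbbrev` is a plain list of strings standing in for `llvm::BitCodeAbbrev`, and the two-entry table and its record names are invented for illustration:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for llvm::BitCodeAbbrev: just a list of operand descriptions.
using FakeAbbrev = std::vector<std::string>;
using AbbrevDsc = void (*)(FakeAbbrev &Abbrev);

inline void StringAbbrev(FakeAbbrev &Abbrev) {
  Abbrev.push_back("Fixed(size)");
  Abbrev.push_back("Blob");
}
inline void IntAbbrev(FakeAbbrev &Abbrev) { Abbrev.push_back("Fixed(int)"); }

enum RecordId { VERSION, COMMENT_TEXT };

struct RecordIdDsc {
  const char *Name;
  AbbrevDsc Abbrev;
};

inline const RecordIdDsc &describe(RecordId ID) {
  static const RecordIdDsc RecordIdNameMap[] = {
      {"Version", &IntAbbrev},  // VERSION
      {"Text", &StringAbbrev},  // COMMENT_TEXT
  };
  return RecordIdNameMap[ID];
}

// One emitAbbrev for every record kind: the layout comes from the table.
inline FakeAbbrev emitAbbrev(RecordId ID) {
  const RecordIdDsc &Dsc = describe(ID);
  FakeAbbrev Abbrev;
  Abbrev.push_back(std::string("Literal(") + Dsc.Name + ")");
  Dsc.Abbrev(Abbrev);  // appends the operands for this record's layout
  return Abbrev;
}
```

With the layout stored next to the name, adding a record kind means adding one table row rather than a new `emit*Abbrev` function.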



Comment at: clang-doc/BitcodeWriter.cpp:170
+
+void ClangDocBitcodeWriter::emitStringAbbrev(RecordId ID, BlockId Block) {
+  auto EmitString = [](std::shared_ptr ) {

Those three will be gone



Comment at: clang-doc/BitcodeWriter.cpp:253
+COMMENT_ARG, COMMENT_POSITION})
+emitStringAbbrev(RID, BI_COMMENT_BLOCK_ID);
+  emitIntAbbrev(COMMENT_SELFCLOSING, BI_COMMENT_BLOCK_ID);

So now this is
```
  for (RecordId RID :
   {COMMENT_KIND, COMMENT_TEXT, COMMENT_NAME, COMMENT_DIRECTION,
COMMENT_PARAMNAME, COMMENT_CLOSENAME, COMMENT_ATTRKEY, COMMENT_ATTRVAL,
COMMENT_ARG, COMMENT_POSITION, COMMENT_SELFCLOSING, COMMENT_EXPLICIT})
emitAbbrev(RID, BI_COMMENT_BLOCK_ID);
```



Comment at: clang-doc/BitcodeWriter.cpp:273
+emitRecordID(RID);
+  emitStringAbbrev(MEMBER_TYPE_TYPE, BI_MEMBER_TYPE_BLOCK_ID);
+  emitStringAbbrev(MEMBER_TYPE_NAME, BI_MEMBER_TYPE_BLOCK_ID);

```
  for (RecordId RID : {MEMBER_TYPE_TYPE, MEMBER_TYPE_NAME, MEMBER_TYPE_ACCESS})
emitAbbrev(RID, BI_MEMBER_TYPE_BLOCK_ID);
```
and so on



Comment at: clang-doc/BitcodeWriter.cpp:277
+
+  // Namespace Block
+  emitBlockID(BI_NAMESPACE_BLOCK_ID);

And The Next Pattern:
```
emitBlockID(BI_{STUFF}_BLOCK_ID); // <- same BI_{STUFF}_BLOCK_ID
for (RecordId RID : {STUFF_FOO, STUFF_BAR, STUFF_BAZ, ...}) // <- same init-list
  emitRecordID(RID);
for (RecordId RID : {STUFF_FOO, STUFF_BAR, STUFF_BAZ, ...}) // <- same init-list
  emitAbbrev(RID, BI_{STUFF}_BLOCK_ID); // <- same BI_{STUFF}_BLOCK_ID
```

I think it can be generalized as:
```
std::initializer_list> 
TheBlocks {
...
  // Namespace Block
  {BI_NAMESPACE_BLOCK_ID, 

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-22 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett added inline comments.



Comment at: clang-doc/BitcodeWriter.cpp:219
+
+void ClangDocBitcodeWriter::emitIntRecord(int Value, RecordId ID) {
+  if (!Value) return;

lebedev.ri wrote:
> Now, all these three `emit*Record` functions now have the 'same signature':
> ```
> template <typename T>
> void ClangDocBitcodeWriter::emitRecord(const T& Record, RecordId ID);
> 
> template <>
> void ClangDocBitcodeWriter::emitRecord(StringRef Str, RecordId ID) {
> ...
> ```
> 
> **Assuming there are no implicit conversions going on**, i'd make that change.
> It, again, may open the road for further generalizations.
I overloaded the functions -- cleaner, and deals with any implicit conversions 
nicely.
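
A small sketch of what an overload set looks like in practice, using a made-up `Writer` that only logs which overload ran. `Location` loosely follows the struct in Representation.h; the `int` record ID parameter is a simplification:

```cpp
#include <cassert>
#include <string>

struct Location {  // hypothetical, loosely following Representation.h
  int LineNumber;
  std::string Filename;
};

// Made-up writer: each overload appends a tag so the choice is observable.
struct Writer {
  std::string Log;

  void emitRecord(const std::string &Str, int /*ID*/) {
    Log += "str:" + Str + ";";
  }
  void emitRecord(int Value, int /*ID*/) {
    Log += "int:" + std::to_string(Value) + ";";
  }
  void emitRecord(const Location &Loc, int /*ID*/) {
    Log += "loc:" + Loc.Filename + ":" + std::to_string(Loc.LineNumber) + ";";
  }
};
```

Ordinary overload resolution picks the function, so a `bool` argument is promoted and lands in the `int` overload. With explicit specializations of a primary template, the deduced type would have to match a specialization exactly.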



Comment at: clang-doc/BitcodeWriter.h:178
+  void emitTypeBlock(const std::unique_ptr );
+  void emitMemberTypeBlock(const std::unique_ptr );
+  void emitFieldTypeBlock(const std::unique_ptr );

lebedev.ri wrote:
> Let's continue cracking down on duplication.
> I think these four functions need the same template treatment as 
> `writeBitstreamForInfo()`
> 
> (please feel free to use better names)
> ```
> template
> void emitBlock(const std::unique_ptr );
> 
> template
> void emitTypedBlock(const std::unique_ptr ) {
>   StreamSubBlockGuard Block(Stream, MapFromInfoToBlockId::ID);
>   emitBlock(B);
> }
> 
> template<>
> void ClangDocBitcodeWriter::emitBlock(const std::unique_ptr ) {
>   emitStringRecord(T->TypeUSR, FIELD_TYPE_TYPE);
>   for (const auto  : T->Description) emitCommentBlock(CI);
> }
> ```
> 
> I agree that it seems strange, and seem to actually increase the code size so 
> far,
> but i believe by exposing similar functionality under one function,
> later, it will open the road for more opportunities of further consolidation.
Since it actually ended up duplicating the `writeBitstreamForInfo()` code, I 
rolled all of this into one `emitBlock()` entry point.


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-22 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 135453.
juliehockett marked 13 inline comments as done.
juliehockett added a comment.

Cleaning up bitcode writer


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/BitcodeWriter.cpp
  clang-doc/BitcodeWriter.h
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.h
  clang-doc/Mapper.cpp
  clang-doc/Mapper.h
  clang-doc/Representation.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst
  test/CMakeLists.txt
  test/clang-doc/mapper-class.cpp
  test/clang-doc/mapper-enum.cpp
  test/clang-doc/mapper-function.cpp
  test/clang-doc/mapper-method.cpp
  test/clang-doc/mapper-namespace.cpp
  test/clang-doc/mapper-struct.cpp
  test/clang-doc/mapper-union.cpp

Index: test/clang-doc/mapper-union.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-union.cpp
@@ -0,0 +1,29 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@u...@d.bc --dump | FileCheck %s
+
+union D { int X; int Y; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'D'
+  // CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::X'
+// CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::Y'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+
+
Index: test/clang-doc/mapper-struct.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-struct.cpp
@@ -0,0 +1,23 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@s...@c.bc --dump | FileCheck %s
+
+struct C { int i; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'C'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'C::i'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+
+
Index: test/clang-doc/mapper-namespace.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,17 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@n...@a.bc --dump | FileCheck %s
+
+namespace A {}
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'A'
+// CHECK: 
+
+
Index: test/clang-doc/mapper-method.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-method.cpp
@@ -0,0 +1,31 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@S@G@F@Method#I#.bc --dump | FileCheck %s
+
+class G {
+public: 
+	int Method(int param) { return param; }
+};
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'Method'
+  // CHECK: 
+  // CHECK:  blob data = 'c:@S@G'
+  // CHECK: 
+// CHECK:  blob data = 'int'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+  // CHECK: 
+// CHECK: 
+
+
+
+
Index: test/clang-doc/mapper-function.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-function.cpp
@@ -0,0 +1,24 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@F@F#I#.bc --dump | FileCheck %s
+
+int F(int param) { return param; }
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'F'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+  // CHECK: 
+// CHECK: 
+
Index: test/clang-doc/mapper-enum.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-enum.cpp
@@ -0,0 +1,25 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@e...@b.bc --dump | FileCheck %s
+
+enum B { X, Y };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'B'
+  

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-22 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added inline comments.



Comment at: clang-doc/BitcodeWriter.cpp:149
+
+/// \brief Emits a record ID in the BLOCKINFO block.
+void ClangDocBitcodeWriter::emitRecordID(RecordId ID) {

For me, with no prior knowledge of llvm's bitstreams, it was not obvious that 
the differences with `emitBlockID()` are intentional.
Maybe add a comment noting that for blocks, we do output their ID, but for 
records, we only output their name.



Comment at: clang-doc/BitcodeWriter.cpp:178
+void ClangDocBitcodeWriter::emitLocationAbbrev(RecordId ID, BlockId Block) {
+  auto EmitString = [](std::shared_ptr ) {
+Abbrev->Add(

That should be `EmitLocation`



Comment at: clang-doc/BitcodeWriter.cpp:191
+void ClangDocBitcodeWriter::emitIntAbbrev(RecordId ID, BlockId Block) {
+  auto EmitString = [](std::shared_ptr ) {
+Abbrev->Add(

`EmitInt`



Comment at: clang-doc/BitcodeWriter.cpp:219
+
+void ClangDocBitcodeWriter::emitIntRecord(int Value, RecordId ID) {
+  if (!Value) return;

Now, all these three `emit*Record` functions now have the 'same signature':
```
template <typename T>
void ClangDocBitcodeWriter::emitRecord(const T& Record, RecordId ID);

template <>
void ClangDocBitcodeWriter::emitRecord(StringRef Str, RecordId ID) {
...
```

**Assuming there are no implicit conversions going on**, i'd make that change.
It, again, may open the road for further generalizations.



Comment at: clang-doc/BitcodeWriter.h:24
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/Bitcode/BitstreamReader.h"
+#include "llvm/Bitcode/BitstreamWriter.h"

Is `BitstreamReader.h` include needed here?



Comment at: clang-doc/BitcodeWriter.h:142
+AbbreviationMap() {}
+void add(RecordId RID, unsigned abbrevID);
+unsigned get(RecordId RID) const;

`void add(RecordId RID, unsigned AbbrevID);`




Comment at: clang-doc/BitcodeWriter.h:175
+  void emitStringRecord(StringRef Str, RecordId ID);
+  void emitLocationRecord(int LineNumber, StringRef File, RecordId ID);
+  void emitIntRecord(int Value, RecordId ID);

You have already included `"Representation.h"` here.
Why don't you just pass `const Location& Loc` into this function?



Comment at: clang-doc/BitcodeWriter.h:177
+  void emitIntRecord(int Value, RecordId ID);
+  void emitTypeBlock(const std::unique_ptr );
+  void emitMemberTypeBlock(const std::unique_ptr );

New line
```
void emitIntRecord(int Value, RecordId ID);

void emitTypeBlock(const std::unique_ptr );
```



Comment at: clang-doc/BitcodeWriter.h:178
+  void emitTypeBlock(const std::unique_ptr );
+  void emitMemberTypeBlock(const std::unique_ptr );
+  void emitFieldTypeBlock(const std::unique_ptr );

Let's continue cracking down on duplication.
I think these four functions need the same template treatment as 
`writeBitstreamForInfo()`

(please feel free to use better names)
```
template
void emitBlock(const std::unique_ptr );

template
void emitTypedBlock(const std::unique_ptr ) {
  StreamSubBlockGuard Block(Stream, MapFromInfoToBlockId::ID);
  emitBlock(B);
}

template<>
void ClangDocBitcodeWriter::emitBlock(const std::unique_ptr ) {
  emitStringRecord(T->TypeUSR, FIELD_TYPE_TYPE);
  for (const auto  : T->Description) emitCommentBlock(CI);
}
```

I agree that it seems strange, and seems to actually increase the code size so 
far, but i believe that by exposing similar functionality under one function, 
it will later open the road for more opportunities of further consolidation.
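
The guard-plus-body shape suggested above can be sketched without LLVM. `FakeStream` is a stand-in that only records events (the real `BitstreamWriter` calls take more arguments), and `emitGuardedBlock` is a hypothetical name:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for llvm::BitstreamWriter that only records what happened.
struct FakeStream {
  std::vector<std::string> Events;
  void EnterSubblock(unsigned ID) {
    Events.push_back("enter " + std::to_string(ID));
  }
  void ExitBlock() { Events.push_back("exit"); }
};

// RAII guard: enters the sub-block on construction, leaves on destruction.
class StreamSubBlockGuard {
  FakeStream &Stream;

public:
  StreamSubBlockGuard(FakeStream &S, unsigned ID) : Stream(S) {
    Stream.EnterSubblock(ID);
  }
  StreamSubBlockGuard(const StreamSubBlockGuard &) = delete;
  StreamSubBlockGuard &operator=(const StreamSubBlockGuard &) = delete;
  ~StreamSubBlockGuard() { Stream.ExitBlock(); }
};

// Shared entry point: the guard brackets whatever the body emits.
template <typename Body>
void emitGuardedBlock(FakeStream &Stream, unsigned ID, Body EmitBody) {
  StreamSubBlockGuard Block(Stream, ID);
  EmitBody(Stream);
}
```

Because the exit happens in the destructor, every per-type body can return early without leaving the block open.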


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-21 Thread Jake Ehrlich via Phabricator via cfe-commits
jakehehrlich added inline comments.



Comment at: clang-doc/BitcodeWriter.cpp:407
+
+void ClangDocBinaryWriter::writeBitstream(const EnumInfo ,
+  BitstreamWriter ,

juliehockett wrote:
> jakehehrlich wrote:
> > lebedev.ri wrote:
> > > juliehockett wrote:
> > > > lebedev.ri wrote:
> > > > > Hmm, common pattern again
> > > > > ```
> > > > > void ClangDocBinaryWriter::writeBitstream(const  ,
> > > > >   BitstreamWriter ,
> > > > >   bool writeBlockInfo) {
> > > > >   if (writeBlockInfo) emitBlockInfoBlock(Stream);
> > > > >   StreamSubBlock Block(Stream, BI__BLOCK_ID);
> > > > >   ...
> > > > > }
> > > > > ```
> > > > > Could be solved if a mapping from `TYPENAME` -> 
> > > > > `BI__BLOCK_ID` can be added.
> > > > > If LLVM would be using C++14, that'd be easy, but with C++11, it 
> > > > > would require whole new class (although with just a single static 
> > > > > variable).
> > > > Do you want me to try to write that class, or leave it as it is?
> > > It would be something like: (total guesswork, literally just wrote it 
> > > here, like rest of the snippets)
> > > ```
> > > template 
> > > struct MapFromTypeToEnumerator {
> > >   static const BlockId id;
> > > };
> > > 
> > > template <>
> > > struct MapFromTypeToEnumerator {
> > >   static const BlockId id = BI_NAMESPACE_BLOCK_ID;
> > > };
> > > void ClangDocBitcodeWriter::writeBitstream(const NamespaceInfo ) {
> > >   EMITINFO(NAMESPACE)
> > > }
> > > ...
> > > 
> > > template 
> > > void ClangDocBitcodeWriter::writeBitstream(const TypeInfo , bool 
> > > writeBlockInfo) {
> > >   if (writeBlockInfo) emitBlockInfoBlock();
> > >   StreamSubBlockGuard Block(Stream, 
> > > MapFromTypeToEnumerator::id);
> > >   writeBitstream(I);
> > > }
> > > ```
> > > Uhm, now that i have written it, it does not look as ugly as i thought it 
> > > would look...
> > > So maybe try integrating that, i *think* it is a bit cleaner?
> > If we know the set of types then it should just be a static member of every 
> > *Info type. Then the mapping is just TYPENAME::id
> @jakehehrlich The issue with that is that it would mix writer implementation 
> details into the representation, which at this point has no knowledge of the 
> writer. We can break that, but is that the best option?
Probably not, I didn't think about that.
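
For reference, a C++11-compatible sketch of the trait approach discussed in this thread, which keeps the block IDs on the writer side. The enumerator values, the empty `*Info` structs, and `blockIdFor` are placeholders, not the patch's definitions:

```cpp
#include <cassert>

// Writer-side enumerators; the numeric values here are hypothetical.
enum BlockId { BI_NAMESPACE_BLOCK_ID = 8, BI_ENUM_BLOCK_ID, BI_RECORD_BLOCK_ID };

// Representation side: plain data with no knowledge of the writer.
struct NamespaceInfo {};
struct EnumInfo {};
struct RecordInfo {};

// Primary template is declared but not defined, so asking for the block ID
// of an unmapped type fails at compile time.
template <typename InfoType> struct MapFromInfoToBlockId;

template <> struct MapFromInfoToBlockId<NamespaceInfo> {
  static constexpr BlockId ID = BI_NAMESPACE_BLOCK_ID;
};
template <> struct MapFromInfoToBlockId<EnumInfo> {
  static constexpr BlockId ID = BI_ENUM_BLOCK_ID;
};
template <> struct MapFromInfoToBlockId<RecordInfo> {
  static constexpr BlockId ID = BI_RECORD_BLOCK_ID;
};

// Generic code recovers the enumerator from the static type of its argument.
template <typename InfoType> BlockId blockIdFor(const InfoType &) {
  return MapFromInfoToBlockId<InfoType>::ID;
}
```

The specializations live next to the writer, so Representation.h does not need to include or mention anything bitcode-related.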


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-21 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 135342.
juliehockett marked 6 inline comments as done.
juliehockett added a comment.

Updating location creation and adding mapping from type to BlockId


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/BitcodeWriter.cpp
  clang-doc/BitcodeWriter.h
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.h
  clang-doc/Mapper.cpp
  clang-doc/Mapper.h
  clang-doc/Representation.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst
  test/CMakeLists.txt
  test/clang-doc/mapper-class.cpp
  test/clang-doc/mapper-enum.cpp
  test/clang-doc/mapper-function.cpp
  test/clang-doc/mapper-method.cpp
  test/clang-doc/mapper-namespace.cpp
  test/clang-doc/mapper-struct.cpp
  test/clang-doc/mapper-union.cpp

Index: test/clang-doc/mapper-union.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-union.cpp
@@ -0,0 +1,29 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@u...@d.bc --dump | FileCheck %s
+
+union D { int X; int Y; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'D'
+  // CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::X'
+// CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::Y'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+
+
Index: test/clang-doc/mapper-struct.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-struct.cpp
@@ -0,0 +1,23 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@s...@c.bc --dump | FileCheck %s
+
+struct C { int i; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'C'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'C::i'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+
+
Index: test/clang-doc/mapper-namespace.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,17 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@n...@a.bc --dump | FileCheck %s
+
+namespace A {}
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'A'
+// CHECK: 
+
+
Index: test/clang-doc/mapper-method.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-method.cpp
@@ -0,0 +1,31 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@S@G@F@Method#I#.bc --dump | FileCheck %s
+
+class G {
+public: 
+	int Method(int param) { return param; }
+};
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'Method'
+  // CHECK: 
+  // CHECK:  blob data = 'c:@S@G'
+  // CHECK: 
+// CHECK:  blob data = 'int'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+  // CHECK: 
+// CHECK: 
+
+
+
+
Index: test/clang-doc/mapper-function.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-function.cpp
@@ -0,0 +1,24 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@F@F#I#.bc --dump | FileCheck %s
+
+int F(int param) { return param; }
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'F'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+  // CHECK: 
+// CHECK: 
+
Index: test/clang-doc/mapper-enum.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-enum.cpp
@@ -0,0 +1,25 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@e...@b.bc --dump | FileCheck %s
+
+enum B { X, Y };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// 

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-21 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett added inline comments.



Comment at: clang-doc/BitcodeWriter.cpp:407
+
+void ClangDocBinaryWriter::writeBitstream(const EnumInfo &I,
+                                          BitstreamWriter &Stream,

jakehehrlich wrote:
> lebedev.ri wrote:
> > juliehockett wrote:
> > > lebedev.ri wrote:
> > > > Hmm, common pattern again
> > > > ```
> > > > void ClangDocBinaryWriter::writeBitstream(const <TYPENAME> &I,
> > > >                                           BitstreamWriter &Stream,
> > > >                                           bool writeBlockInfo) {
> > > >   if (writeBlockInfo) emitBlockInfoBlock(Stream);
> > > >   StreamSubBlock Block(Stream, BI_<TYPENAME>_BLOCK_ID);
> > > >   ...
> > > > }
> > > > ```
> > > > Could be solved if a mapping from `TYPENAME` -> 
> > > > `BI_<TYPENAME>_BLOCK_ID` can be added.
> > > > If LLVM were using C++14, that'd be easy, but with C++11, it would 
> > > > require a whole new class (although with just a single static variable).
> > > Do you want me to try to write that class, or leave it as it is?
> > It would be something like: (total guesswork, literally just wrote it here, 
> > like rest of the snippets)
> > ```
> > template <typename TypeInfo>
> > struct MapFromTypeToEnumerator {
> >   static const BlockId id;
> > };
> > 
> > template <>
> > struct MapFromTypeToEnumerator<NamespaceInfo> {
> >   static const BlockId id = BI_NAMESPACE_BLOCK_ID;
> > };
> > void ClangDocBitcodeWriter::writeBitstream(const NamespaceInfo &I) {
> >   EMITINFO(NAMESPACE)
> > }
> > ...
> > 
> > template <typename TypeInfo>
> > void ClangDocBitcodeWriter::writeBitstream(const TypeInfo &I, bool 
> > writeBlockInfo) {
> >   if (writeBlockInfo) emitBlockInfoBlock();
> >   StreamSubBlockGuard Block(Stream, MapFromTypeToEnumerator<TypeInfo>::id);
> >   writeBitstream(I);
> > }
> > ```
> > Uhm, now that i have written it, it does not look as ugly as i thought it 
> > would look...
> > So maybe try integrating that, i *think* it is a bit cleaner?
> If we know the set of types then it should just be a static member of every 
> *Info type. Then the mapping is just TYPENAME::id
@jakehehrlich The issue with that is that it would mix writer implementation 
details into the representation, which at this point has no knowledge of the 
writer. We can break that, but is that the best option?
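
The trade-off discussed here can be sketched roughly as follows. This is only an illustration under stated assumptions: the block id values and the empty `NamespaceInfo`/`EnumInfo` structs are placeholders, and `MapFromInfoToBlockId` and `blockIdFor` are hypothetical names rather than what the patch uses. The point it tries to show is that the mapping can live entirely on the writer side, so the representation types stay unaware of the writer.

```cpp
#include <cassert>

// --- Representation side (placeholder types; no knowledge of the writer) ---
struct NamespaceInfo {};
struct EnumInfo {};

// --- Writer side ---
// Placeholder block ids; the real values would live in the bitcode writer.
enum BlockId { BI_NAMESPACE_BLOCK_ID = 8, BI_ENUM_BLOCK_ID = 9 };

// The primary template is declared but never defined, so asking for the
// block id of a type without a specialization fails at compile time.
template <typename InfoT> struct MapFromInfoToBlockId;

template <> struct MapFromInfoToBlockId<NamespaceInfo> {
  static const BlockId ID = BI_NAMESPACE_BLOCK_ID;
};

template <> struct MapFromInfoToBlockId<EnumInfo> {
  static const BlockId ID = BI_ENUM_BLOCK_ID;
};

// Hypothetical helper: deduces the Info type and returns its block id.
template <typename InfoT> BlockId blockIdFor(const InfoT &) {
  return MapFromInfoToBlockId<InfoT>::ID;
}
```

With a static member on each `*Info` type the lookup would be shorter (`TYPENAME::id`), at the cost of the representation header having to know about `BlockId`.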



Comment at: clang-doc/Representation.h:79
+  std::string Filename;
+};
+

lebedev.ri wrote:
> Hmm, have you tried adding a constructor here?
> ```
> struct Location {
>   int LineNumber;
>   std::string Filename;
> 
>   Location() = default;
> 
>   Location(int LineNumber_, std::string Filename_) : LineNumber(LineNumber_), 
> Filename(std::move(Filename_)) {}
> };
> ```
> ?
Okay after some research the way to do this is to add the constructor and then 
`emplace_back(Line, File)` -- no braces. That was a fun adventure into the 
implementations of emplace and push :)
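
For readers following along, here is a minimal sketch of that conclusion. It assumes a `std::vector` container (the real `Loc` member may well be a different container type) and a hypothetical `addLocation` helper; only the `Location` struct mirrors the snippet from the review.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Simplified copy of the Location struct suggested in the review.
struct Location {
  int LineNumber = 0;
  std::string Filename;

  Location() = default;
  Location(int LineNumber_, std::string Filename_)
      : LineNumber(LineNumber_), Filename(std::move(Filename_)) {}
};

// Hypothetical helper: forwards the constructor arguments to emplace_back,
// which constructs the Location in place.
inline void addLocation(std::vector<Location> &Locs, int Line,
                        const std::string &File) {
  // Locs.emplace_back({Line, File});  // does not compile: a braced list has
  //                                   // no type for emplace_back to deduce
  Locs.emplace_back(Line, File);
}
```

`emplace_back` is a variadic template that perfectly forwards its arguments, and a bare braced list cannot be deduced as a template argument, which is why the brace form is rejected while `push_back({Line, File})` would be accepted.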


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-21 Thread Jake Ehrlich via Phabricator via cfe-commits
jakehehrlich added inline comments.



Comment at: clang-doc/BitcodeWriter.cpp:407
+
+void ClangDocBinaryWriter::writeBitstream(const EnumInfo &I,
+                                          BitstreamWriter &Stream,

lebedev.ri wrote:
> juliehockett wrote:
> > lebedev.ri wrote:
> > > Hmm, common pattern again
> > > ```
> > > void ClangDocBinaryWriter::writeBitstream(const <TYPENAME> &I,
> > >                                           BitstreamWriter &Stream,
> > >                                           bool writeBlockInfo) {
> > >   if (writeBlockInfo) emitBlockInfoBlock(Stream);
> > >   StreamSubBlock Block(Stream, BI_<TYPENAME>_BLOCK_ID);
> > >   ...
> > > }
> > > ```
> > > Could be solved if a mapping from `TYPENAME` -> 
> > > `BI_<TYPENAME>_BLOCK_ID` can be added.
> > > If LLVM were using C++14, that'd be easy, but with C++11, it would 
> > > require a whole new class (although with just a single static variable).
> > Do you want me to try to write that class, or leave it as it is?
> It would be something like: (total guesswork, literally just wrote it here, 
> like rest of the snippets)
> ```
> template <typename TypeInfo>
> struct MapFromTypeToEnumerator {
>   static const BlockId id;
> };
> 
> template <>
> struct MapFromTypeToEnumerator<NamespaceInfo> {
>   static const BlockId id = BI_NAMESPACE_BLOCK_ID;
> };
> void ClangDocBitcodeWriter::writeBitstream(const NamespaceInfo &I) {
>   EMITINFO(NAMESPACE)
> }
> ...
> 
> template <typename TypeInfo>
> void ClangDocBitcodeWriter::writeBitstream(const TypeInfo &I, bool 
> writeBlockInfo) {
>   if (writeBlockInfo) emitBlockInfoBlock();
>   StreamSubBlockGuard Block(Stream, MapFromTypeToEnumerator<TypeInfo>::id);
>   writeBitstream(I);
> }
> ```
> Uhm, now that i have written it, it does not look as ugly as i thought it would 
> look...
> So maybe try integrating that, i *think* it is a bit cleaner?
If we know the set of types then it should just be a static member of every 
*Info type. Then the mapping is just TYPENAME::id


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-21 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 135305.
juliehockett marked 20 inline comments as done.
juliehockett added a comment.

Cleaning up bitcode writer and fixing pointers for CommentInfos


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/BitcodeWriter.cpp
  clang-doc/BitcodeWriter.h
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.h
  clang-doc/Mapper.cpp
  clang-doc/Mapper.h
  clang-doc/Representation.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst
  test/CMakeLists.txt
  test/clang-doc/mapper-class.cpp
  test/clang-doc/mapper-enum.cpp
  test/clang-doc/mapper-function.cpp
  test/clang-doc/mapper-method.cpp
  test/clang-doc/mapper-namespace.cpp
  test/clang-doc/mapper-struct.cpp
  test/clang-doc/mapper-union.cpp

Index: test/clang-doc/mapper-union.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-union.cpp
@@ -0,0 +1,29 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@u...@d.bc --dump | FileCheck %s
+
+union D { int X; int Y; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'D'
+  // CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::X'
+// CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::Y'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+
+
Index: test/clang-doc/mapper-struct.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-struct.cpp
@@ -0,0 +1,23 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@s...@c.bc --dump | FileCheck %s
+
+struct C { int i; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'C'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'C::i'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+
+
Index: test/clang-doc/mapper-namespace.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,17 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@n...@a.bc --dump | FileCheck %s
+
+namespace A {}
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'A'
+// CHECK: 
+
+
Index: test/clang-doc/mapper-method.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-method.cpp
@@ -0,0 +1,31 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@S@G@F@Method#I#.bc --dump | FileCheck %s
+
+class G {
+public: 
+	int Method(int param) { return param; }
+};
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'Method'
+  // CHECK: 
+  // CHECK:  blob data = 'c:@S@G'
+  // CHECK: 
+// CHECK:  blob data = 'int'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+  // CHECK: 
+// CHECK: 
+
+
+
+
Index: test/clang-doc/mapper-function.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-function.cpp
@@ -0,0 +1,24 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@F@F#I#.bc --dump | FileCheck %s
+
+int F(int param) { return param; }
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'F'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+  // CHECK: 
+// CHECK: 
+
Index: test/clang-doc/mapper-enum.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-enum.cpp
@@ -0,0 +1,25 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@e...@b.bc --dump | FileCheck %s
+
+enum B { X, Y };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// 

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-21 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added inline comments.



Comment at: clang-doc/Representation.h:79
+  std::string Filename;
+};
+

Hmm, have you tried adding a constructor here?
```
struct Location {
  int LineNumber;
  std::string Filename;

  Location() = default;

  Location(int LineNumber_, std::string Filename_) : LineNumber(LineNumber_), 
Filename(std::move(Filename_)) {}
};
```
?


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-21 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added inline comments.



Comment at: clang-doc/BitcodeWriter.cpp:407
+
+void ClangDocBinaryWriter::writeBitstream(const EnumInfo &I,
+                                          BitstreamWriter &Stream,

juliehockett wrote:
> lebedev.ri wrote:
> > Hmm, common pattern again
> > ```
> > void ClangDocBinaryWriter::writeBitstream(const <TYPENAME> &I,
> >                                           BitstreamWriter &Stream,
> >                                           bool writeBlockInfo) {
> >   if (writeBlockInfo) emitBlockInfoBlock(Stream);
> >   StreamSubBlock Block(Stream, BI_<TYPENAME>_BLOCK_ID);
> >   ...
> > }
> > ```
> > Could be solved if a mapping from `TYPENAME` -> 
> > `BI_<TYPENAME>_BLOCK_ID` can be added.
> > If LLVM were using C++14, that'd be easy, but with C++11, it would 
> > require a whole new class (although with just a single static variable).
> Do you want me to try to write that class, or leave it as it is?
It would be something like: (total guesswork, literally just wrote it here, 
like rest of the snippets)
```
template <typename TypeInfo>
struct MapFromTypeToEnumerator {
  static const BlockId id;
};

template <>
struct MapFromTypeToEnumerator<NamespaceInfo> {
  static const BlockId id = BI_NAMESPACE_BLOCK_ID;
};
void ClangDocBitcodeWriter::writeBitstream(const NamespaceInfo &I) {
  EMITINFO(NAMESPACE)
}
...

template <typename TypeInfo>
void ClangDocBitcodeWriter::writeBitstream(const TypeInfo &I, bool writeBlockInfo) {
  if (writeBlockInfo) emitBlockInfoBlock();
  StreamSubBlockGuard Block(Stream, MapFromTypeToEnumerator<TypeInfo>::id);
  writeBitstream(I);
}
```
Uhm, now that i have written it, it does not look as ugly as i thought it would 
look...
So maybe try integrating that, i *think* it is a bit cleaner?


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-21 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett added inline comments.



Comment at: clang-doc/BitcodeWriter.cpp:407
+
+void ClangDocBinaryWriter::writeBitstream(const EnumInfo &I,
+                                          BitstreamWriter &Stream,

lebedev.ri wrote:
> Hmm, common pattern again
> ```
> void ClangDocBinaryWriter::writeBitstream(const <TYPENAME> &I,
>                                           BitstreamWriter &Stream,
>                                           bool writeBlockInfo) {
>   if (writeBlockInfo) emitBlockInfoBlock(Stream);
>   StreamSubBlock Block(Stream, BI_<TYPENAME>_BLOCK_ID);
>   ...
> }
> ```
> Could be solved if a mapping from `TYPENAME` -> `BI_<TYPENAME>_BLOCK_ID` 
> can be added.
> If LLVM were using C++14, that'd be easy, but with C++11, it would 
> require a whole new class (although with just a single static variable).
Do you want me to try to write that class, or leave it as it is?



Comment at: clang-doc/Mapper.cpp:113
+  populateInfo(I, D, C);
+  I.Loc.emplace_back(Location{LineNumber, File});
+}

lebedev.ri wrote:
> ```
> I.Loc.emplace_back({LineNumber, File});
> ```
That...doesn't work? Throws a no-matching-function error (suggesting to try 
emplacing a Location instead of a 


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-21 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

Nice!




Comment at: clang-doc/BitcodeWriter.cpp:33
+  llvm::IndexedMap BlockIdNameMap;
+  BlockIdNameMap.resize(BI_LAST - BI_FIRST + 1);
+

So here's the thing.
We know how many enumerators we have (`BI_LAST - BI_FIRST + 1`).
And we know how many enumerators we will init (`inits.size()`).
Those numbers should match.



Comment at: clang-doc/BitcodeWriter.cpp:47
+   {BI_COMMENT_BLOCK_ID, "CommentBlock"}};
+  for (const auto &init : inits) {
+BlockIdNameMap[init.first] = init.second;

Can elide `{}`



Comment at: clang-doc/BitcodeWriter.cpp:50
+  }
+  return BlockIdNameMap;
+}();

So i would recommend:
```
static constexpr unsigned ExpectedSize = BI_LAST - BI_FIRST + 1;
BlockIdNameMap.resize(ExpectedSize);
...
static_assert(inits.size() == ExpectedSize, "unexpected count of initializers");
for (const auto &init : inits)
  BlockIdNameMap[init.first] = init.second;
assert(BlockIdNameMap.size() == ExpectedSize);
```
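
One caveat worth keeping in mind with that suggestion: as far as I can tell, `std::initializer_list::size()` only became `constexpr` in C++14, and a non-`constexpr` local cannot be used in a `static_assert` anyway. A possible C++11-friendly variant is sketched below. Everything in it is an assumption for illustration: the enumerator values, the `BlockIdName` struct, the two made-up names `"NamespaceBlock"` and `"EnumBlock"`, and the use of `std::vector` in place of `llvm::IndexedMap`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Placeholder enumerators; only BI_COMMENT_BLOCK_ID and "CommentBlock"
// appear in the reviewed diff, the rest is made up for this sketch.
enum BlockId {
  BI_NAMESPACE_BLOCK_ID = 8,
  BI_ENUM_BLOCK_ID,
  BI_COMMENT_BLOCK_ID,
  BI_FIRST = BI_NAMESPACE_BLOCK_ID,
  BI_LAST = BI_COMMENT_BLOCK_ID
};

struct BlockIdName {
  BlockId ID;
  const char *Name;
};

// A plain array has a size that is a constant expression even in C++11.
static const BlockIdName Inits[] = {{BI_NAMESPACE_BLOCK_ID, "NamespaceBlock"},
                                    {BI_ENUM_BLOCK_ID, "EnumBlock"},
                                    {BI_COMMENT_BLOCK_ID, "CommentBlock"}};

static const std::size_t ExpectedSize = BI_LAST - BI_FIRST + 1;
static_assert(sizeof(Inits) / sizeof(Inits[0]) == ExpectedSize,
              "unexpected count of initializers");

// Builds a dense name table indexed by (ID - BI_FIRST).
inline std::vector<std::string> buildBlockIdNames() {
  std::vector<std::string> Names(ExpectedSize);
  for (const BlockIdName &Init : Inits)
    Names[Init.ID - BI_FIRST] = Init.Name;
  return Names;
}
```

Adding an enumerator without a matching initializer then breaks the build instead of leaving an empty name in the table.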



Comment at: clang-doc/BitcodeWriter.cpp:56
+  llvm::IndexedMap RecordIdNameMap;
+  RecordIdNameMap.resize(RI_LAST - RI_FIRST + 1);
+

Same

So the differences between the two are:
* functors
* `*_LAST` and `*_FIRST` params
* the actual initializer-list
I guess it might be //eventually// refactored as a new `llvm::IndexedMap` ctor.



Comment at: clang-doc/BitcodeWriter.cpp:120
+void ClangDocBinaryWriter::emitHeader(BitstreamWriter &Stream) {
+  // Emit the file header.
+  Stream.Emit((unsigned)'D', BitCodeConstants::SignatureBitSize);

I wonder if this would work?
```
for(char c : StringRef("DOCS"))
  Stream.Emit((unsigned)c, BitCodeConstants::SignatureBitSize);
```
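
A small self-contained sketch of that loop, to show the idea in isolation. `RecordingStream` is a hypothetical stand-in that only mimics the `Emit(value, bits)` shape of the real writer, and `emitSignature` and its parameters are made-up names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for the bitstream writer that only records what was emitted.
struct RecordingStream {
  std::vector<unsigned> Values;
  std::vector<unsigned> Widths;
  void Emit(unsigned Val, unsigned NumBits) {
    Values.push_back(Val);
    Widths.push_back(NumBits);
  }
};

// Emits each character of the signature as one fixed-width field.
inline void emitSignature(RecordingStream &Stream, const std::string &Signature,
                          unsigned BitsPerChar) {
  for (char C : Signature)
    Stream.Emit(static_cast<unsigned char>(C), BitsPerChar);
}
```

Going through `unsigned char` avoids sign extension if a signature ever contained a byte above 0x7F, which the plain `(unsigned)c` cast would not.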



Comment at: clang-doc/BitcodeWriter.cpp:258
+  StreamSubBlock Block(Stream, BI_COMMENT_BLOCK_ID);
+  emitStringRecord(I->Text, COMMENT_TEXT, Stream);
+  emitStringRecord(I->Name, COMMENT_NAME, Stream);

Hmm, you could try something like
```
for(const auto& L : std::initializer_list<std::pair<StringRef, RecordId>>{{I->Text, COMMENT_TEXT}, {I->Name, COMMENT_NAME}, ...})
  emitStringRecord(L.first, L.second, Stream);
```



Comment at: clang-doc/BitcodeWriter.cpp:286
+  emitBlockID(BI_COMMENT_BLOCK_ID, Stream);
+  emitRecordID(COMMENT_KIND, Stream);
+  emitRecordID(COMMENT_TEXT, Stream);

```
for(RecordId RID : {COMMENT_KIND, COMMENT_TEXT, COMMENT_NAME, COMMENT_DIRECTION,
COMMENT_PARAMNAME, COMMENT_CLOSENAME, COMMENT_SELFCLOSING,
COMMENT_EXPLICIT, COMMENT_ATTRKEY, COMMENT_ATTRVAL, 
COMMENT_ARG,
COMMENT_POSITION})
  emitRecordID(RID, Stream);
```
should work



Comment at: clang-doc/BitcodeWriter.cpp:298
+  emitRecordID(COMMENT_POSITION, Stream);
+  emitStringAbbrev(COMMENT_KIND, BI_COMMENT_BLOCK_ID, Stream);
+  emitStringAbbrev(COMMENT_TEXT, BI_COMMENT_BLOCK_ID, Stream);

```
for(RecordId RID : {COMMENT_KIND, COMMENT_TEXT, COMMENT_NAME, COMMENT_DIRECTION,
COMMENT_PARAMNAME, COMMENT_CLOSENAME})
  emitStringAbbrev(RID, BI_COMMENT_BLOCK_ID, Stream);
```



Comment at: clang-doc/BitcodeWriter.cpp:306
+  emitIntAbbrev(COMMENT_EXPLICIT, BI_COMMENT_BLOCK_ID, Stream);
+  emitStringAbbrev(COMMENT_ATTRKEY, BI_COMMENT_BLOCK_ID, Stream);
+  emitStringAbbrev(COMMENT_ATTRVAL, BI_COMMENT_BLOCK_ID, Stream);

```
for(RecordId RID : {COMMENT_ATTRKEY, COMMENT_ATTRVAL, COMMENT_ARG, 
COMMENT_POSITION})
  emitStringAbbrev(RID, BI_COMMENT_BLOCK_ID, Stream);
```
and maybe in a few other places
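
The pattern behind these three suggestions, reduced to something that compiles on its own. The enumerator values, the `AbbrevLog` stand-in for the stream, and the two-argument helper signatures are assumptions made for this sketch; the real helpers also take a block id.

```cpp
#include <cassert>
#include <initializer_list>
#include <vector>

// Placeholder record ids; the values are made up.
enum RecordId { COMMENT_KIND = 1, COMMENT_TEXT, COMMENT_NAME, COMMENT_EXPLICIT };

// Stand-in for the stream: remembers which abbreviations were requested.
struct AbbrevLog {
  std::vector<RecordId> StringAbbrevs;
  std::vector<RecordId> IntAbbrevs;
};

inline void emitStringAbbrev(RecordId ID, AbbrevLog &Log) {
  Log.StringAbbrevs.push_back(ID);
}

inline void emitIntAbbrev(RecordId ID, AbbrevLog &Log) {
  Log.IntAbbrevs.push_back(ID);
}

inline void emitCommentAbbrevs(AbbrevLog &Log) {
  // The braced list is deduced as std::initializer_list<RecordId>.
  for (RecordId RID : {COMMENT_KIND, COMMENT_TEXT, COMMENT_NAME})
    emitStringAbbrev(RID, Log);
  emitIntAbbrev(COMMENT_EXPLICIT, Log);
}
```

The loop keeps the list of ids in one place, so adding a record means touching one line rather than adding another call.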



Comment at: clang-doc/BitcodeWriter.cpp:407
+
+void ClangDocBinaryWriter::writeBitstream(const EnumInfo &I,
+                                          BitstreamWriter &Stream,

Hmm, common pattern again
```
void ClangDocBinaryWriter::writeBitstream(const <TYPENAME> &I,
                                          BitstreamWriter &Stream,
                                          bool writeBlockInfo) {
  if (writeBlockInfo) emitBlockInfoBlock(Stream);
  StreamSubBlock Block(Stream, BI_<TYPENAME>_BLOCK_ID);
  ...
}
```
Could be solved if a mapping from `TYPENAME` -> `BI_<TYPENAME>_BLOCK_ID` 
can be added.
If LLVM were using C++14, that'd be easy, but with C++11, it would require 
a whole new class (although with just a single static variable).



Comment at: clang-doc/BitcodeWriter.h:11
+// This file implements a writer for serializing the clang-doc internal
+// represeentation to LLVM bitcode. The writer takes in a stream and emits the
+// generated bitcode to that stream.

representation



Comment at: clang-doc/BitcodeWriter.h:32
+
+#define VERSION_NUMBER 0
+

`static const unsigned` please :)
I'm not sure where to best 

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-20 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett added a comment.

In https://reviews.llvm.org/D41102#1011299, @lebedev.ri wrote:

> I don't know the protocol, but i think it might be a good idea
>  to add a new entry to `CODE_OWNERS.TXT` for `clang-doc`?
>
> `clang-doc` is going to be quite distinctive, and bigger/more complicated
>  than what is already in `clang-tools-extra`.


Does anyone know what the protocol on this would be?




Comment at: clang-doc/ClangDocBinary.cpp:88
+  Stream.Emit((unsigned)'C', 8);
+  Stream.Emit((unsigned)'S', 8);
+}

lebedev.ri wrote:
> juliehockett wrote:
> > lebedev.ri wrote:
> > > General comment: shouldn't the bitcode be versioned?
> > Possibly? My understanding of the versioning (which could be incorrect) was 
> > that it was for the LLVM IR and how it is written in the given file -- I'm 
> > not writing to LLVM IR here, just using it as a data storage format, and so 
> > didn't think it was necessary. Happy to add it in though, but which version 
> > number should I use?
> The question i'm asking is: what will happen if two different (documenting 
> different attributes, with non-identical `enum {something}Id`, etc) 
> clang-doc's were used to generate two different parts of the docs (two 
> different TU's)?
> When merging two parts, if the older clang-doc is used, will it only accept 
> the part of the bc it understands? Or fail altogether?
> And, does it make sense to allow to generate such mixed-up documentation?
> 
After some thought, I think it will depend on how the bitcode changes in the 
future. The reader can be implemented to simply ignore anything it doesn't 
recognize (with a default switch case), so that route is possible, but if the 
representation shifts in a major way it should probably just bail if the 
version is too early. 

I think this is a good question to consider when implementing the reader and reducer 
portions of the tool -- for now, I've added the version number to the writer, 
so it can be checked in that part.
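
To make the two options concrete, here is a rough sketch of what a tolerant reader could look like. No reader exists in this patch, so every name here (`Record`, `ParsedComment`, `readRecords`, `MinSupportedVersion`, the record id values) is hypothetical; it only illustrates "ignore unknown records via a default case, bail if the version is too early".

```cpp
#include <cassert>
#include <string>
#include <vector>

// All names and values below are hypothetical.
static const unsigned MinSupportedVersion = 1;

enum RecordId { VERSION = 1, COMMENT_TEXT = 2, COMMENT_NAME = 3 };

// A decoded record: an id plus either an integer value or a blob.
struct Record {
  unsigned ID;
  unsigned Value;
  std::string Blob;
};

struct ParsedComment {
  std::string Text;
  std::string Name;
  unsigned SkippedRecords = 0;
};

// Returns false when the stream declares a version older than this reader
// supports; unknown record ids are counted and otherwise ignored.
inline bool readRecords(const std::vector<Record> &Records,
                        ParsedComment &Out) {
  for (const Record &R : Records) {
    switch (R.ID) {
    case VERSION:
      if (R.Value < MinSupportedVersion)
        return false;
      break;
    case COMMENT_TEXT:
      Out.Text = R.Blob;
      break;
    case COMMENT_NAME:
      Out.Name = R.Blob;
      break;
    default:
      ++Out.SkippedRecords;
      break;
    }
  }
  return true;
}
```

Whether silently skipping unknown records is acceptable depends on how the format evolves; counting them at least makes the loss visible to the caller.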


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-20 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 135168.
juliehockett marked 13 inline comments as done.
juliehockett added a comment.

1. Updating mapper keys to use USRs instead of names
2. Also updating internal representation to use USRs instead of names
3. Renaming files (getting rid of the ClangDoc prefix in most cases)
4. Added bitcode version number to output
5. Put brief documentation in header files
6. Updating internal representation to generate full infos for all decls 
(regardless of if they're defined -- the reducer step will consolidate these) 
and to store the namespace as a vector instead of a string.


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/BitcodeWriter.cpp
  clang-doc/BitcodeWriter.h
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.h
  clang-doc/Mapper.cpp
  clang-doc/Mapper.h
  clang-doc/Representation.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst
  test/CMakeLists.txt
  test/clang-doc/mapper-class.cpp
  test/clang-doc/mapper-enum.cpp
  test/clang-doc/mapper-function.cpp
  test/clang-doc/mapper-method.cpp
  test/clang-doc/mapper-namespace.cpp
  test/clang-doc/mapper-struct.cpp
  test/clang-doc/mapper-union.cpp

Index: test/clang-doc/mapper-union.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-union.cpp
@@ -0,0 +1,29 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@u...@d.bc --dump | FileCheck %s
+
+union D { int X; int Y; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'D'
+  // CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::X'
+// CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'D::Y'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+
+
Index: test/clang-doc/mapper-struct.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-struct.cpp
@@ -0,0 +1,23 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@s...@c.bc --dump | FileCheck %s
+
+struct C { int i; };
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'C'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'C::i'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+
+
Index: test/clang-doc/mapper-namespace.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,17 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@n...@a.bc --dump | FileCheck %s
+
+namespace A {}
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'A'
+// CHECK: 
+
+
Index: test/clang-doc/mapper-method.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-method.cpp
@@ -0,0 +1,31 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@S@G@F@Method#I#.bc --dump | FileCheck %s
+
+class G {
+public: 
+	int Method(int param) { return param; }
+};
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'Method'
+  // CHECK: 
+  // CHECK:  blob data = 'c:@S@G'
+  // CHECK: 
+// CHECK:  blob data = 'int'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+  // CHECK: 
+// CHECK: 
+
+
+
+
Index: test/clang-doc/mapper-function.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-function.cpp
@@ -0,0 +1,24 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/c:@F@F#I#.bc --dump | FileCheck %s
+
+int F(int param) { return param; }
+// CHECK: 
+// CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'F'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+  // CHECK: 
+  // CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+  // CHECK: 
+// CHECK: 
+
Index: test/clang-doc/mapper-enum.cpp

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-20 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett added a comment.

> 2. I've mentioned it before as a comment, but to what extent will you be 
> parsing information in this frontend? Currently the links between types are 
> primarily stored as strings. Are you planning to have the backend that 
> generates the MarkDown parse those strings and link them to types? E.g. the 
> parenttype is a std::vector and assuming you want the markdown 
> to have a link to this parent type, will the MarkDown have to parse this type 
> and lookup the link to the original type? Or will you embed references to 
> other types within the intermediate format?

This is a good question -- expecting the backend to parse the strings is a bit 
of an unrealistic assumption, as you point out. I'm currently switching the key 
type from the names to the USRs, and I think that might help resolve this too. 
With that, each type can reference the USR of the linked def, which can be used 
to directly lookup the linked def. Does that make sense? I have to see where 
the majority of the USR resolution should be done (i.e. here in the mapper or 
later in the reducer), so will update once I take a look at that.
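
As a sketch of what USR-based linking could look like once the keys are USRs: the `RecordInfo` fields, the `InfoIndex` typedef, and both helper functions below are assumptions for illustration, not the representation in this patch, and the USR strings are just examples in clang's `c:@S@Name` style.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, trimmed-down info record keyed by USR.
struct RecordInfo {
  std::string USR;
  std::string Name;
  std::vector<std::string> ParentUSRs; // references stored as USRs, not names
};

typedef std::map<std::string, RecordInfo> InfoIndex;

inline void addInfo(InfoIndex &Index, const RecordInfo &Info) {
  Index[Info.USR] = Info;
}

// Resolves each parent USR to the name of the linked definition, skipping
// USRs that are not (yet) present in the index.
inline std::vector<std::string> resolveParentNames(const InfoIndex &Index,
                                                   const RecordInfo &Info) {
  std::vector<std::string> Names;
  for (const std::string &USR : Info.ParentUSRs) {
    InfoIndex::const_iterator It = Index.find(USR);
    if (It != Index.end())
      Names.push_back(It->second.Name);
  }
  return Names;
}
```

A backend would then do a direct lookup per reference instead of parsing type strings.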


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-20 Thread Athos via Phabricator via cfe-commits
Athosvk added a comment.

The changes seem good (both mapper and additional changes)! I've added some 
comments, but those are primarily details.

What I'm however primarily interested in is the following:

1. You seemingly output only a little information for declarations that are not 
definitions. Is that temporary? Will you //need// to have a definition in order 
for a function to be documented?
2. I've mentioned it before as a comment, but to what extent will you be 
parsing information in this frontend? Currently the links between types are 
primarily stored as strings. Are you planning to have the backend that 
generates the MarkDown parse those strings and link them to types? E.g. the 
parenttype is a std::vector and assuming you want the markdown to 
have a link to this parent type, will the MarkDown have to parse this type and 
lookup the link to the original type? Or will you embed references to other 
types within the intermediate format?

I'm curious to hear your thoughts on this!




Comment at: clang-doc/ClangDocRepresentation.h:39
+  llvm::SmallVector Position;
+  std::vector Children;
+};

I might be missing something, but can't this be a unique ptr? Shouldn't 
children of comments only have one parent?
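
A rough sketch of what single ownership could look like, assuming a simplified `CommentInfo` (the struct below and the `addChild` helper are hypothetical, not the code in the diff):

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for the comment node under review.
struct CommentInfo {
  std::string Text;
  // Each child has exactly one parent, so the parent can own it outright.
  std::vector<std::unique_ptr<CommentInfo>> Children;
};

// Appends a new child to Parent and returns a reference to it.
CommentInfo &addChild(CommentInfo &Parent, std::string Text) {
  Parent.Children.push_back(std::make_unique<CommentInfo>());
  Parent.Children.back()->Text = std::move(Text);
  return *Parent.Children.back();
}
```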



Comment at: clang-doc/ClangDocRepresentation.h:46
+struct NamedType {
+  enum FieldName { PARAM = 1, MEMBER, RETTYPE };
+  FieldName Field;

Perhaps use an enum class instead? Same goes for the other enums
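
For illustration only, a scoped version might look roughly like this; the enumerator spellings and the `toInt` helper are assumptions, not names from the patch:

```cpp
// Hypothetical scoped version of the enum from the diff.
enum class FieldName { Param = 1, Member, RetType };

struct NamedType {
  FieldName Field = FieldName::Param;
};

// Scoped enumerators do not convert to int implicitly, so any place that
// needs the raw value (e.g. when writing a record) has to cast explicitly.
constexpr int toInt(FieldName F) { return static_cast<int>(F); }
```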



Comment at: clang-doc/ClangDocRepresentation.h:63
+  std::string SimpleName;
+  std::string Namespace;
+  llvm::SmallVector Description;

It's not too important for now, but you probably want to at least store the 
namespace identifier for each nested namespace at some point. So instead you 
store a vector of namespaces, which in the final markdown generation stage 
allows you to link to each namespace individually (assuming you'll have some 
kind of namespace overview pages)
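
A small sketch of that layout, assuming the namespaces are stored outermost-first (the `Info` struct and `qualifiedName` helper here are hypothetical):

```cpp
#include <string>
#include <vector>

// Hypothetical record: one entry per enclosing namespace, outermost first.
struct Info {
  std::string SimpleName;
  std::vector<std::string> Namespaces;
};

// Joins the components back into a qualified name such as "a::b::Name".
std::string qualifiedName(const Info &I) {
  std::string Result;
  for (const std::string &NS : I.Namespaces)
    Result += NS + "::";
  return Result + I.SimpleName;
}
```

Keeping the components separate means a generator can link `a` and `a::b` individually, and the flat string can still be rebuilt when needed.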



Comment at: tools/clang-doc/ClangDocReporter.h:42
   std::string Name;
   AccessSpecifier Access;
 };

You might want to separate this out to a FieldType/MemberType or something 
similar, as only class members will have this set, while you also use this for 
parameters/return types etc. I know there's AS_NONE but it seems a little 
wasteful considering the number of instances that will not have this set


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-19 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added inline comments.



Comment at: clang-doc/ClangDocBinary.cpp:72
+  assert(Abbrevs.find(recordID) == Abbrevs.end() &&
+ "Abbreviation already set.");
+  Abbrevs[recordID] = abbrevID;

juliehockett wrote:
> lebedev.ri wrote:
> > lebedev.ri wrote:
> > > So it does not *set* the abbreviation, since it is not supposed to be 
> > > called if the abbreviation is already set, but it *adds* a unique 
> > > abbreviation.
> > > I think it should be called `void AbbreviationMap::add(unsigned recordID, 
> > > unsigned abbrevID)` then
> > This is marked as done, but the name is still the same, and no 
> > counter-comment was added, as far as i can see
> It was changed to AbbreviationMap::add from AbbreviationMap::set, as you 
> suggested -- unless I missed something in your comment?
Oh right, sorry, i was thinking about some other code it seems.


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-19 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 135009.
juliehockett marked 27 inline comments as done.
juliehockett added a comment.

1. Decoupled the mapper implementation from the main program, exposing only the 
function to generate the action factory
2. Implemented the matchers into a RecursiveASTVisitor and moved the serializer 
code into the mapper class
3. Cleaned up string references, templates and overloaded functions
4. Made the record/block id to name mapping an IndexedMap

Still need to address using USRs as the key in the mapper, bitcode versioning, 
and documentation in each file -- will do that in the morning!


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.h
  clang-doc/ClangDocBinary.cpp
  clang-doc/ClangDocBinary.h
  clang-doc/ClangDocMapper.cpp
  clang-doc/ClangDocMapper.h
  clang-doc/ClangDocRepresentation.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst
  test/CMakeLists.txt
  test/clang-doc/mapper-class.cpp
  test/clang-doc/mapper-enum.cpp
  test/clang-doc/mapper-function.cpp
  test/clang-doc/mapper-method.cpp
  test/clang-doc/mapper-namespace.cpp
  test/clang-doc/mapper-struct.cpp
  test/clang-doc/mapper-undefined.cpp
  test/clang-doc/mapper-union.cpp

Index: test/clang-doc/mapper-union.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-union.cpp
@@ -0,0 +1,26 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/A.bc --dump | FileCheck %s
+
+union A { int X; int Y; };
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'A'
+  // CHECK:  blob data = 'A'
+  // CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'A::X'
+// CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'A::Y'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-undefined.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-undefined.cpp
@@ -0,0 +1,15 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/E.bc --dump | FileCheck %s
+
+class E;
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'E'
+  // CHECK:  blob data = 'E'
+  // CHECK: 
+// CHECK: 
+
Index: test/clang-doc/mapper-struct.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-struct.cpp
@@ -0,0 +1,19 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/C.bc --dump | FileCheck %s
+
+struct C { int i; };
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'C'
+  // CHECK:  blob data = 'C'
+  // CHECK: 
+// CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'C::i'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-namespace.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,13 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/A.bc --dump | FileCheck %s
+
+namespace A {}
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'A'
+  // CHECK:  blob data = 'A'
+// CHECK: 
Index: test/clang-doc/mapper-method.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-method.cpp
@@ -0,0 +1,30 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/_ZN1E6MethodEi.bc --dump | FileCheck %s
+
+class E {
+public: 
+	int Method(int param) { return param; }
+};
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'E::Method'
+  // CHECK:  blob data = 'Method'
+  // CHECK:  blob data = 'E'
+  // CHECK:  blob data = '_ZN1E6MethodEi'
+  // CHECK:  blob data = 'E'
+  // CHECK: 
+// CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-function.cpp
===
--- /dev/null
+++ 

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-19 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett added inline comments.



Comment at: clang-doc/ClangDoc.cpp:32
+  ECtx.reportResult(
+  Name, Mapper.emitInfo(D, getComment(D), Name, getLine(D), getFile(D)));
+}

lebedev.ri wrote:
> I wonder if `Name` should be `std::move()`'d ? Or not, `reportResult()` seems 
> to take `StringRef`...
> 
> (in general, it might be a good idea to run clang-tidy on the code)
So the `ExecutionContext` can implement this in different ways -- in this 
case, the default container created is the `InMemoryToolResults`, which 
technically takes in `StringRef`s, but copies their data to its in-memory 
representation: 

```
void InMemoryToolResults::addResult(StringRef Key, StringRef Value) {
  KVResults.push_back({Key.str(), Value.str()});
}
```

A different implementation of it (i.e. a results container not in memory) would 
likely have to be backed by a file, so the data would be written out there 
anyways.



Comment at: clang-doc/ClangDocBinary.cpp:72
+  assert(Abbrevs.find(recordID) == Abbrevs.end() &&
+ "Abbreviation already set.");
+  Abbrevs[recordID] = abbrevID;

lebedev.ri wrote:
> lebedev.ri wrote:
> > So it does not *set* the abbreviation, since it is not supposed to be 
> > called if the abbreviation is already set, but it *adds* a unique 
> > abbreviation.
> > I think it should be called `void AbbreviationMap::add(unsigned recordID, 
> > unsigned abbrevID)` then
> This is marked as done, but the name is still the same, and no 
> counter-comment was added, as far as i can see
It was changed to AbbreviationMap::add from AbbreviationMap::set, as you 
suggested -- unless I missed something in your comment?



Comment at: clang-doc/ClangDocBinary.h:82
+
+static std::map BlockIdNameMap = {
+  {NAMESPACE_BLOCK_ID, "NamespaceBlock"},

lebedev.ri wrote:
> lebedev.ri wrote:
> > Nice!
> > Some thoughts:
> > 1. I agree it makes sense to keep it close to the enum definition, in 
> > header...
> > 2. This will result in global constructor. Generally they are frowned upon 
> > in LLVM. But since this is a standalone binary, it may be ok?
> > 3. Have you tried using `StringRef` here, instead of `std::string`?
> > 4. `std::map` is in general a bad idea.
> >   Since the `enum`'s enumerators are all small and consecutive, maybe 
> > try `llvm::IndexedMap`?
> Also, this should be `static const`, since the underlying enum won't change 
> on the fly.
> 
> `#llvm` suggests to use TableGen here, i'm not sure how that would work.
> 
> As i have now noticed, there isn't a init-list constructor, so I think 
> **something** like this might work:
> ```
> static const llvm::IndexedMap BlockIdNameMap = []() {
>   llvm::IndexedMap map;
>   map.reserve(BI_LAST);
> 
>   // There is no init-list constructor for the IndexedMap, so have to improvise
>   static const std::initializer_list> inits = {
>     {NAMESPACE_BLOCK_ID, "NamespaceBlock"},
>     ...
>   };
>   for(const auto& init : inits)
>     map[init.first] = init.second;
>   return map;
> }();
> ```
> 
> Also, even though `llvm::IndexedMap<>` is using `llvm::SmallVector<>` 
> internally, it does not expose the initial size as template parameter, 
> unfortunately, but hardcodes it to `0`. I think it would be great to add one 
> more template parameter to `llvm::IndexedMap<>`, which would default to `0`, 
> but would allow us here to avoid all memory allocation altogether.
> 
> What do you think? If you do agree that using `IndexedMap` seems like the 
> right choice, but **don't** want to write the patch for template parameter, i 
> might look into it..
Had to play with it a bit, but it's working now.

For the template parameter, I'm happy to take a look! Avoiding allocation here 
would be great.



Comment at: clang-doc/ClangDocMapper.cpp:202
+  for (comments::Comment *Child :
+   make_range(C->child_begin(), C->child_end())) {
+CurrentCI.Children.emplace_back(std::make_shared());

lebedev.ri wrote:
> It would be nice if you could (as a new Differential) add a `children()` 
> function to that class that will do that automatically.
Will do :)  (and same for the below)



Comment at: clang-doc/ClangDocMapper.h:36
+
+class ClangDocCommentVisitor
+: public ConstCommentVisitor {

sammccall wrote:
> why is this exposed?
> (and what does it do?)
Moved it into the mapper class, but it traverses a comment and extracts its 
information into the `CommentInfo` struct



Comment at: clang-doc/ClangDocMapper.h:66
+  template 
+  StringRef emitInfo(const C *D, const FullComment *FC, StringRef Key,
+ int LineNumber, StringRef File);

sammccall wrote:
> when returning a stringref, it might pay to be explicit about who owns the 
> data, so the caller knows the safe lifetime.
> (This isn't always spelled out 

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-19 Thread Sam McCall via Phabricator via cfe-commits
sammccall added a comment.

Naming conventions tend to stick around for a while - `clang-doc/ClangDocXYZ.h` 
seems a bit unwieldy compared to `clang-doc/XYZ.h` - might be worth considering.




Comment at: clang-doc/ClangDoc.cpp:37
+  Context = Result.Context;
+  if (const auto *M = Result.Nodes.getNodeAs(BoundName))
+processMatchedDecl(M);

if you're just going to call processMatchedDecl anyway, why pass in `BoundName` 
and allow it to vary rather than using a fixed string?



Comment at: clang-doc/ClangDoc.cpp:69
+
+std::string ClangDocCallback::getName(const NamedDecl *D) const {
+  if (const auto *F = dyn_cast(D)) {

this needs a comment describing what/why it's doing.

In particular, you're usually using qnames, but sometimes mangled names?
Downstream consumers are going to have a very hard time doing something 
meaningful with that.
Where you need a stable, machine-readable identifier that distinguishes between 
overloads, I'd suggest USR (see USRGeneration).
Where you need something human readable, you should think carefully about the 
representation you want, and try to avoid depending on it being unique.



Comment at: clang-doc/ClangDoc.h:36
+/// Parses each match and sends it along to the reporter for serialization.
+class ClangDocCallback : public MatchFinder::MatchCallback {
+ public:

Having `ClangDocMain` responsible for building the MatchFinder, but `ClangDoc` 
responsible for implementing the callbacks seems like an odd choice for 
layering:
 - there's a deep implicit contract between the matchers and the callbacks, 
they are going to end up being tightly coupled so the split doesn't gain much
 - using MatchCallback as the interface exposes a detail you're quite likely to 
want to change. Some heavy users of ASTMatchers end up moving to explicit AST 
traversal for efficiency reasons.

It would seem cleaner to have the MatchFinder and collection of callbacks all 
owned by one class in `ClangDoc.cpp`, and just have `ClangDoc.h` expose a 
function that creates the `FrontendActionFactory` from it. This gives you a 
narrower interface with fewer implicit contracts, where ASTMatchers is an 
implementation detail of this TU.



Comment at: clang-doc/ClangDoc.h:38
+ public:
+  ClangDocCallback(StringRef BoundName, ExecutionContext ,
+   ClangDocBinaryWriter )

Something seems slightly off here: we register a separate ClangDocCallback for 
each type of decl, but then each one detects what node it actually got...

There are a few ways to reduce this duplication:
 - (most reduction) use RecursiveASTVisitor, which naturally couples type and 
handling code (the matchers seem trivial, which makes this feasible)
 - use separate callbacks for each type (a ClangDocCallback?)
 - (least reduction) create one callback and add it a bunch of times, or once 
with an anyof() matcher



Comment at: clang-doc/ClangDocBinary.h:1
+//===--  ClangDocBinary.h - ClangDoc Binary -*- C++ -*-===//
+//

(As well as a file comment, this two-word description is pretty confusing - 
binary used as a noun seems like it would refer to the compiled clang-doc tool 
itself)



Comment at: clang-doc/ClangDocBinary.h:26
+enum BlockId {
+  NAMESPACE_BLOCK_ID = bitc::FIRST_APPLICATION_BLOCKID,
+  NONDEF_BLOCK_ID,

nit: llvm style is e.g. `BI_NamespaceBlockID` with prefix or 
`BlockID::NamespaceBlockID` (using enum class)



Comment at: clang-doc/ClangDocBinary.h:158
+
+  template 
+  void writeBitstream(const T , BitstreamWriter ,

again, this template really seems like it's a set of overloads.



Comment at: clang-doc/ClangDocMapper.cpp:148
+  }
+  return serialize(I);
+}

If I'm reading correctly, serialize() returns a SmallString by value, and now 
you're returning a (dangling) stringref to that temporary.
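
A plain C++ sketch of the same lifetime problem, using `std::string` in place of `SmallString` and a toy `serialize` (all names here are stand-ins, not the code in the diff):

```cpp
#include <string>

// Hypothetical stand-in for serialize(): returns its buffer by value.
std::string serialize(int Value) { return "info:" + std::to_string(Value); }

// Problematic shape, left commented out on purpose: the temporary string is
// destroyed when the function returns, so the returned view would dangle.
//   std::string_view emitInfoDangling(int V) { return serialize(V); }

// One safe alternative: return the owning string and let the caller decide
// how long to keep it alive.
std::string emitInfo(int Value) { return serialize(Value); }
```

Another option, if a non-owning return type is wanted, would be to store the buffer in a member that outlives the call and document that lifetime.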



Comment at: clang-doc/ClangDocMapper.h:36
+
+class ClangDocCommentVisitor
+: public ConstCommentVisitor {

why is this exposed?
(and what does it do?)



Comment at: clang-doc/ClangDocMapper.h:61
+
+class ClangDocMapper {
+ public:

Naming: as things stand, `ClangDoc` looks like the mapper, and this is some 
sort of serializer helper: ClangDoc consumes the input, decides what to do with 
it, and writes the output.



Comment at: clang-doc/ClangDocMapper.h:65
+
+  template 
+  StringRef emitInfo(const C *D, const FullComment *FC, StringRef Key,

why is this a template method rather than a set of overloads?
I think if you pass in the wrong type, you'll get (at best) a linker error 
instead of a useful compile error.
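
To illustrate the difference, here is a sketch with empty stand-in types (these are hypothetical structs, not clang's `NamespaceDecl` and friends):

```cpp
#include <string>

struct NamespaceDecl {};
struct RecordDecl {};
struct EnumDecl {};  // deliberately has no emitInfo overload

// One overload per supported declaration type.
std::string emitInfo(const NamespaceDecl &) { return "namespace"; }
std::string emitInfo(const RecordDecl &) { return "record"; }

// Calling emitInfo(EnumDecl{}) is rejected at compile time with a
// "no matching function" error. A declared-but-undefined template
// specialization would only fail later, at link time.
```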



Comment at: clang-doc/ClangDocMapper.h:66
+  template 
+  StringRef emitInfo(const C *D, const FullComment 

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-19 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added inline comments.



Comment at: clang-doc/ClangDocBinary.h:82
+
+static std::map BlockIdNameMap = {
+  {NAMESPACE_BLOCK_ID, "NamespaceBlock"},

lebedev.ri wrote:
> Nice!
> Some thoughts:
> 1. I agree it makes sense to keep it close to the enum definition, in 
> header...
> 2. This will result in global constructor. Generally they are frowned upon in 
> LLVM. But since this is a standalone binary, it may be ok?
> 3. Have you tried using `StringRef` here, instead of `std::string`?
> 4. `std::map` is in general a bad idea.
>   Since the `enum`'s enumerators are all small and consecutive, maybe try 
> `llvm::IndexedMap`?
Also, this should be `static const`, since the underlying enum won't change on 
the fly.

`#llvm` suggests to use TableGen here, i'm not sure how that would work.

As i have now noticed, there isn't a init-list constructor, so I think 
**something** like this might work:
```
static const llvm::IndexedMap BlockIdNameMap = []() {
  llvm::IndexedMap map;
  map.reserve(BI_LAST);

  // There is no init-list constructor for the IndexedMap, so have to improvise
  static const std::initializer_list> inits = {
    {NAMESPACE_BLOCK_ID, "NamespaceBlock"},
    ...
  };
  for(const auto& init : inits)
    map[init.first] = init.second;
  return map;
}();
```

Also, even though `llvm::IndexedMap<>` is using `llvm::SmallVector<>` 
internally, it does not expose the initial size as template parameter, 
unfortunately, but hardcodes it to `0`. I think it would be great to add one 
more template parameter to `llvm::IndexedMap<>`, which would default to `0`, 
but would allow us here to avoid all memory allocation altogether.

What do you think? If you do agree that using `IndexedMap` seems like the right 
choice, but **don't** want to write the patch for template parameter, i might 
look into it..
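
For comparison, the same "build a const table with an immediately invoked lambda" pattern can be sketched with only the standard library. This uses `std::array` as a stand-in for `llvm::IndexedMap`, and the enumerators and the `blockName` helper are hypothetical:

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical block ids; small and consecutive, as discussed above.
enum BlockId { BI_NAMESPACE, BI_RECORD, BI_FUNCTION, BI_LAST };

// Built once by an immediately invoked lambda. The array has a fixed size,
// so filling it needs no heap allocation.
static const std::array<const char *, BI_LAST> BlockIdNameMap = [] {
  std::array<const char *, BI_LAST> Map{};
  static const std::pair<BlockId, const char *> Inits[] = {
      {BI_NAMESPACE, "NamespaceBlock"},
      {BI_RECORD, "RecordBlock"},
      {BI_FUNCTION, "FunctionBlock"},
  };
  for (const auto &Init : Inits)
    Map[static_cast<std::size_t>(Init.first)] = Init.second;
  return Map;
}();

// Convenience accessor returning the name as an owning string.
std::string blockName(BlockId ID) {
  return BlockIdNameMap[static_cast<std::size_t>(ID)];
}
```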


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-19 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

Great work!
Some more review...




Comment at: clang-doc/ClangDoc.cpp:32
+  ECtx.reportResult(
+  Name, Mapper.emitInfo(D, getComment(D), Name, getLine(D), getFile(D)));
+}

I wonder if `Name` should be `std::move()`'d ? Or not, `reportResult()` seems 
to take `StringRef`...

(in general, it might be a good idea to run clang-tidy on the code)



Comment at: clang-doc/ClangDoc.h:46
+  void processMatchedDecl(const T *D);
+  int getLine(const NamedDecl *D) const;
+  StringRef getFile(const NamedDecl *D) const;

//I// would add an empty line:
```
 private:
  template 
  void processMatchedDecl(const T *D);

  int getLine(const NamedDecl *D) const;
  StringRef getFile(const NamedDecl *D) const;
  comments::FullComment *getComment(const NamedDecl *D) const;
  std::string getName(const NamedDecl *D) const;
```



Comment at: clang-doc/ClangDocBinary.cpp:72
+  assert(Abbrevs.find(recordID) == Abbrevs.end() &&
+ "Abbreviation already set.");
+  Abbrevs[recordID] = abbrevID;

lebedev.ri wrote:
> So it does not *set* the abbreviation, since it is not supposed to be called 
> if the abbreviation is already set, but it *adds* a unique abbreviation.
> I think it should be called `void AbbreviationMap::add(unsigned recordID, 
> unsigned abbrevID)` then
This is marked as done, but the name is still the same, and no counter-comment 
was added, as far as i can see



Comment at: clang-doc/ClangDocBinary.cpp:88
+  Stream.Emit((unsigned)'C', 8);
+  Stream.Emit((unsigned)'S', 8);
+}

juliehockett wrote:
> lebedev.ri wrote:
> > General comment: shouldn't the bitcode be versioned?
> Possibly? My understanding of the versioning (which could be incorrect) was 
> that it was for the LLVM IR and how it is written in the given file -- I'm 
> not writing to LLVM IR here, just using it as a data storage format, and so 
> didn't think it was necessary. Happy to add it in though, but which version 
> number should I use?
The question i'm asking is: what will happen if two different versions of 
clang-doc (documenting different attributes, with non-identical 
`enum {something}Id`, etc) were used to generate two different parts of the docs 
(two different TU's)?
When merging the two parts, if the older clang-doc is used, will it only accept 
the part of the bitcode it understands? Or fail altogether?
And, does it make sense to allow generating such mixed-up documentation?
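
To make the concern concrete, a toy sketch of one possible policy (reject anything whose version does not match exactly). The magic value, `CurrentVersion`, `writeHeader` and `canRead` are all made up for illustration and say nothing about how clang-doc actually handles this:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical magic number and format version for a toy header.
constexpr std::uint32_t Magic = 0x444F4353;
constexpr std::uint32_t CurrentVersion = 1;

// Writes a toy header: the magic number followed by the format version.
std::vector<std::uint32_t> writeHeader(std::uint32_t Version = CurrentVersion) {
  return {Magic, Version};
}

// Accepts only input whose magic and version both match this build.
bool canRead(const std::vector<std::uint32_t> &Header) {
  return Header.size() == 2 && Header[0] == Magic &&
         Header[1] == CurrentVersion;
}
```

A more lenient reader could instead skip unknown blocks, which is the other half of the question above.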




Comment at: clang-doc/ClangDocBinary.cpp:30
+unsigned AbbreviationMap::get(RecordId RID) {
+  assert(Abbrevs.find(RID) != Abbrevs.end() && "Abbreviation not added.");
+  return Abbrevs[RID];

Maybe `"Unknown Abbreviation."` ?



Comment at: clang-doc/ClangDocBinary.cpp:63
+
+// Common Abbreviations
+

So, these `emitStringAbbrev()`, `emitLocationAbbrev()` and `emitIntAbbrev()` 
are quite similar.
How about **something** like:
```
template <typename Lambda>
void ClangDocBinaryWriter::emitAbbrev(RecordId ID, BlockId Block, Lambda &&L,
                                      BitstreamWriter &Stream) {
  auto Abbrev = std::make_shared<BitCodeAbbrev>();
  Abbrev->Add(BitCodeAbbrevOp(ID));
  L(Abbrev);
  Abbrevs.add(ID, Stream.EmitBlockInfoAbbrev(Block, std::move(Abbrev)));
}

void ClangDocBinaryWriter::emitStringAbbrev(RecordId ID, BlockId Block,
                                            BitstreamWriter &Stream) {
  auto EmitString = [](std::shared_ptr<BitCodeAbbrev> &Abbrev) {
    Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed,
                                BitCodeConstants::LineNumFixedSize)); // String size
    Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob)); // String
  };
  emitAbbrev(ID, Block, EmitString, Stream);
}

...
```
?



Comment at: clang-doc/ClangDocBinary.cpp:69
+  Abbrev->Add(BitCodeAbbrevOp(ID));
+  Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, BitCodeConstants::LineNumFixedSize));  // String size
+  Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob));   // String

And that de-magicking made these strings longer than `80` columns, boo :(



Comment at: clang-doc/ClangDocBinary.cpp:71
+  Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob));   // String
+  Abbrevs.add(ID, Stream.EmitBlockInfoAbbrev(Block, Abbrev));
+}

I think you should `std::move()` the `Abbrev`, it's not used afterwards in this 
function anyways,
and it //seems// to generate nicer code https://godbolt.org/g/ow58JV
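
The effect is easy to see in isolation. In this sketch `Abbrev` is an empty stand-in type and `ownersSeenBySink` is a hypothetical sink that takes the pointer by value:

```cpp
#include <memory>
#include <utility>

struct Abbrev {};  // hypothetical stand-in for the abbreviation object

// Takes the pointer by value and reports how many owners exist while the
// callee holds it.
long ownersSeenBySink(std::shared_ptr<Abbrev> A) { return A.use_count(); }

long passByCopy() {
  auto A = std::make_shared<Abbrev>();
  return ownersSeenBySink(A);  // copy: the reference count is bumped to 2
}

long passByMove() {
  auto A = std::make_shared<Abbrev>();
  return ownersSeenBySink(std::move(A));  // move: ownership is handed over
}
```

The copy has to increment and later decrement the atomic reference count; the move avoids both.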



Comment at: clang-doc/ClangDocBinary.cpp:132
+  emitIntRecord(N.Access, NAMED_TYPE_ACCESS, Stream);
+  Stream.ExitBlock();
+}

There is a common pattern here;
I'd try to do something like:
```
class StreamSubBlock {
  BitstreamWriter &Stream;

  public:
    StreamSubBlock(BitstreamWriter &Stream_, BlockId ID) : Stream(Stream_) {
      Stream.EnterSubblock(ID, BitCodeConstants::SubblockIDSize);
    }

// Optionally, also delete all the other constructors / copy/move operators.

~StreamSubBlock() {
  

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-18 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett added inline comments.



Comment at: clang-doc/ClangDocBinary.cpp:88
+  Stream.Emit((unsigned)'C', 8);
+  Stream.Emit((unsigned)'S', 8);
+}

lebedev.ri wrote:
> General comment: shouldn't the bitcode be versioned?
Possibly? My understanding of the versioning (which could be incorrect) was 
that it was for the LLVM IR and how it is written in the given file -- I'm not 
writing to LLVM IR here, just using it as a data storage format, and so didn't 
think it was necessary. Happy to add it in though, but which version number 
should I use?


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-18 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 134855.
juliehockett marked 14 inline comments as done.
juliehockett added a comment.

1. Fixing docs
2. Adding static map from bitcode block/record id to block/record name
3. Pulling magic numbers into one struct
4. Cleaning up and clarifying command line options
5. Adding tests for functions and methods


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.cpp
  clang-doc/ClangDoc.h
  clang-doc/ClangDocBinary.cpp
  clang-doc/ClangDocBinary.h
  clang-doc/ClangDocMapper.cpp
  clang-doc/ClangDocMapper.h
  clang-doc/ClangDocRepresentation.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst
  test/CMakeLists.txt
  test/clang-doc/mapper-class.cpp
  test/clang-doc/mapper-enum.cpp
  test/clang-doc/mapper-function.cpp
  test/clang-doc/mapper-method.cpp
  test/clang-doc/mapper-namespace.cpp
  test/clang-doc/mapper-struct.cpp
  test/clang-doc/mapper-undefined.cpp
  test/clang-doc/mapper-union.cpp

Index: test/clang-doc/mapper-union.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-union.cpp
@@ -0,0 +1,26 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/A.bc --dump | FileCheck %s
+
+union A { int X; int Y; };
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'A'
+  // CHECK:  blob data = 'A'
+  // CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'A::X'
+// CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'A::Y'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-undefined.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-undefined.cpp
@@ -0,0 +1,15 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/E.bc --dump | FileCheck %s
+
+class E;
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'E'
+  // CHECK:  blob data = 'E'
+  // CHECK: 
+// CHECK: 
+
Index: test/clang-doc/mapper-struct.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-struct.cpp
@@ -0,0 +1,19 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/C.bc --dump | FileCheck %s
+
+struct C { int i; };
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'C'
+  // CHECK:  blob data = 'C'
+  // CHECK: 
+// CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'C::i'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-namespace.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,13 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/A.bc --dump | FileCheck %s
+
+namespace A {}
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'A'
+  // CHECK:  blob data = 'A'
+// CHECK: 
Index: test/clang-doc/mapper-method.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-method.cpp
@@ -0,0 +1,30 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/_ZN1E6MethodEi.bc --dump | FileCheck %s
+
+class E {
+public: 
+	int Method(int param) { return param; }
+};
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'E::Method'
+  // CHECK:  blob data = 'Method'
+  // CHECK:  blob data = 'E'
+  // CHECK:  blob data = '_ZN1E6MethodEi'
+  // CHECK:  blob data = 'E'
+  // CHECK: 
+// CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'param'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-function.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-function.cpp
@@ -0,0 +1,25 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -output=%t/docs
+// RUN: llvm-bcanalyzer 

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-17 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

Nice work!
It will be great to have a replacement for doxygen, which is actually modern 
and usable.

Some review notes for some of the code below:




Comment at: clang-doc/ClangDocBinary.cpp:17
+
+enum BlockIds {
+  NAMESPACE_BLOCK_ID = bitc::FIRST_APPLICATION_BLOCKID,

I wonder if you could add a map from `BlockIds` enumerator to the textual 
representation.
e.g. `NAMESPACE_BLOCK_ID` -> "NamespaceBlock"
...
`RECORD_BLOCK_ID` -> "RecordBlock"
...

This would allow passing only the `BlockId`, and avoid passing a hardcoded 
string each time.



Comment at: clang-doc/ClangDocBinary.cpp:72
+  assert(Abbrevs.find(recordID) == Abbrevs.end() &&
+ "Abbreviation already set.");
+  Abbrevs[recordID] = abbrevID;

So it does not *set* the abbreviation, since it is not supposed to be called if 
the abbreviation is already set, but it *adds* a unique abbreviation.
I think it should be called `void AbbreviationMap::add(unsigned recordID, 
unsigned abbrevID)` then
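
Something along these lines, as a simplified sketch (the class below is a hypothetical reduction of the one in the diff, keyed on plain `unsigned` values):

```cpp
#include <cassert>
#include <map>

// Hypothetical, simplified version of the class discussed above.
class AbbreviationMap {
  std::map<unsigned, unsigned> Abbrevs;

public:
  // Adds a new mapping; adding the same record id twice is a programming
  // error and trips the assertion.
  void add(unsigned RecordID, unsigned AbbrevID) {
    assert(Abbrevs.find(RecordID) == Abbrevs.end() &&
           "Abbreviation already added.");
    Abbrevs[RecordID] = AbbrevID;
  }

  // Looks up a previously added mapping.
  unsigned get(unsigned RecordID) const {
    auto It = Abbrevs.find(RecordID);
    assert(It != Abbrevs.end() && "Unknown abbreviation.");
    return It->second;
  }
};
```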



Comment at: clang-doc/ClangDocBinary.cpp:88
+  Stream.Emit((unsigned)'C', 8);
+  Stream.Emit((unsigned)'S', 8);
+}

General comment: shouldn't the bitcode be versioned?



Comment at: clang-doc/ClangDocBinary.cpp:99
+  // Emit the block name if present.
+  if (!Name || Name[0] == 0) return;
+  Record.clear();

So you are actually checking that there is either no string, or the string is 
of zero length here.
Is this function ever going to be called with a null `Name`?
All calls in this Differential always pass a static C string here.

Also see my comments about passing enumerator, and having a map that would 
avoid passing string altogether.



Comment at: clang-doc/ClangDocBinary.cpp:120
+  Abbrev->Add(BitCodeAbbrevOp(D));
+  Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 16));  // String size
+  Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob));   // String

These constants are somewhat vague.
Maybe consolidate them somewhere somehow, e.g.:
```
clang-doc/ClangDocBinary.cpp:

namespace {
struct BitCodeConstants {
static constexpr int LineNumFixedSize = 16;
...
}
}
```



Comment at: clang-doc/ClangDocBinary.cpp:178
+  BitstreamWriter ) {
+  Stream.EnterSubblock(NAMED_TYPE_BLOCK_ID, 5);
+  emitIntRecord(ID, NAMED_TYPE_ID, Stream);

From the docs i can see that `5` is `CodeLen`, but how is this decided, etc?
Seems like a magical constant, maybe consolidate them somewhere, like in the 
previous note?



Comment at: clang-doc/ClangDocBinary.h:57
+
+  void emitRecordID(unsigned ID, const char *Name, BitstreamWriter );
+  void emitBlockID(unsigned ID, const char *Name, BitstreamWriter );

Should these take `StringRef` instead of `const char *` ?




Comment at: clang-doc/ClangDocBinary.h:57
+
+  void emitRecordID(unsigned ID, const char *Name, BitstreamWriter );
+  void emitBlockID(unsigned ID, const char *Name, BitstreamWriter );

lebedev.ri wrote:
> Should these take `StringRef` instead of `const char *` ?
> 
Also, isn't the first param always a `BlockIds`? Why not pass enumerators, and 
make it more obvious?



Comment at: clang-doc/ClangDocBinary.h:62
+  void emitIntAbbrev(unsigned D, unsigned Block, BitstreamWriter );
+
+  RecordData Record;

^ I think all these `unsigned Block` is actually a `BlockIds Block` ?
And `unsigned D` is actually `DataTypes D` ?
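
A rough sketch of the enumerator-plus-map idea from the comments above: the ID is a scoped enum, and the display name is looked up from it, so call sites never pass a string. The enumerators, their values and the names returned are all hypothetical; the real list would come from the patch.

```cpp
#include <string_view>

// Hypothetical block identifiers; 8 is just an arbitrary starting value here.
enum class BlockId : unsigned { Namespace = 8, Record, Function };

// Maps each identifier to its display name, so callers never pass strings.
constexpr std::string_view blockName(BlockId ID) {
  switch (ID) {
  case BlockId::Namespace: return "NamespaceBlock";
  case BlockId::Record:    return "RecordBlock";
  case BlockId::Function:  return "FunctionBlock";
  }
  return {};
}
```

A signature such as `emitBlockID(BlockId ID, ...)` would then reject an arbitrary `unsigned` at compile time.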



Comment at: clang-doc/tool/ClangDocMain.cpp:41
+static cl::opt OutDirectory(
+"docs", cl::desc("Directory for outputting docs."), cl::init("docs"),
+cl::cat(ClangDocCategory));

Hmm, are you sure about `docs` being the param to specify where to output the 
docs?
I'd //expect// to see `-o / --output` or a positional argument.
Or is that impossible due to some parent LLVM/clang implicit requirements?



Comment at: clang-doc/tool/ClangDocMain.cpp:45
+static cl::opt Format(
+"format", cl::desc("Format for outputting docs. (options are md)"),
+cl::init("md"), cl::cat(ClangDocCategory));

`options are: md`
Though this appears to be dead code right now.



Comment at: clang-doc/tool/ClangDocMain.cpp:97
+  // Mapping phase
+  errs() << "Mapping decls...\n";
+  auto Err =

This does not seem to be an erroneous situation to be in



Comment at: clang-doc/tool/ClangDocMain.cpp:107
+  sys::path::native(OutDirectory, IRRootPath);
+  std::error_code DirectoryStatus = sys::fs::create_directories(IRRootPath);
+  if (DirectoryStatus != OK) {

I'm having trouble following.
`DumpResult` description says `Dump results to stdout.`
Why does it need `OutDirectory`?



Comment at: 

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-17 Thread Roman Lebedev via Phabricator via cfe-commits
lebedev.ri added a comment.

I don't know the protocol, but i think it might be a good idea
to add a new entry to `CODE_OWNERS.TXT` for `clang-doc`?

`clang-doc` is going to be quite distinctive, and bigger/more complicated
than what is already in `clang-tools-extra`.


https://reviews.llvm.org/D41102



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D41102: Setup clang-doc frontend framework

2018-02-16 Thread Jake Ehrlich via Phabricator via cfe-commits
jakehehrlich added a comment.

Can we add tests for a function at the top level and for a method?


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-15 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 134544.
juliehockett edited the summary of this revision.
juliehockett added a comment.

Updating tests and moving the bitcode reader out (to the next patch)


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.cpp
  clang-doc/ClangDoc.h
  clang-doc/ClangDocBinary.cpp
  clang-doc/ClangDocBinary.h
  clang-doc/ClangDocMapper.cpp
  clang-doc/ClangDocMapper.h
  clang-doc/ClangDocRepresentation.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst
  test/CMakeLists.txt
  test/clang-doc/mapper-class.cpp
  test/clang-doc/mapper-enum.cpp
  test/clang-doc/mapper-namespace.cpp
  test/clang-doc/mapper-struct.cpp
  test/clang-doc/mapper-union.cpp

Index: test/clang-doc/mapper-union.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-union.cpp
@@ -0,0 +1,26 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -docs=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/A.bc --dump | FileCheck %s
+
+union A { int X; int Y; };
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'A'
+  // CHECK:  blob data = 'A'
+  // CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'A::X'
+// CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'A::Y'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-struct.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-struct.cpp
@@ -0,0 +1,19 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -docs=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/C.bc --dump | FileCheck %s
+
+struct C { int i; };
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'C'
+  // CHECK:  blob data = 'C'
+  // CHECK: 
+// CHECK: 
+// CHECK:  blob data = 'int'
+// CHECK:  blob data = 'C::i'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-namespace.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,13 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -docs=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/A.bc --dump | FileCheck %s
+
+namespace A {}
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'A'
+  // CHECK:  blob data = 'A'
+// CHECK: 
Index: test/clang-doc/mapper-enum.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-enum.cpp
@@ -0,0 +1,23 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -docs=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/B.bc --dump | FileCheck %s
+
+enum B { X, Y };
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'B'
+  // CHECK:  blob data = 'B'
+  // CHECK: 
+// CHECK: 
+// CHECK:  blob data = 'X'
+// CHECK: 
+  // CHECK: 
+  // CHECK: 
+// CHECK: 
+// CHECK:  blob data = 'Y'
+// CHECK: 
+  // CHECK: 
+// CHECK: 
Index: test/clang-doc/mapper-class.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-class.cpp
@@ -0,0 +1,14 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp -docs=%t/docs
+// RUN: llvm-bcanalyzer %t/docs/E.bc --dump | FileCheck %s
+
+class E {};
+// CHECK: 
+// CHECK: 
+  // CHECK:  blob data = 'E'
+  // CHECK:  blob data = 'E'
+  // CHECK: 
+// CHECK: 
Index: test/CMakeLists.txt
===
--- test/CMakeLists.txt
+++ test/CMakeLists.txt
@@ -41,6 +41,7 @@
   clang-apply-replacements
   clang-change-namespace
   clangd
+  clang-doc
   clang-include-fixer
   clang-move
   clang-query
Index: docs/clang-doc.rst
===
--- /dev/null
+++ docs/clang-doc.rst
@@ -0,0 +1,62 @@
+===
+Clang-Doc
+===
+
+.. contents::
+
+:program:`clang-doc` is a tool for generating C and C++ documentation from 
+source code and comments. 
+
+The tool is in a very early development stage, so you might encounter bugs and
+crashes. Submitting reports with information about how to reproduce the issue
+to `the LLVM bugtracker `_ will definitely help the
+project. If you have any ideas or suggestions, 

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-09 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 133726.
juliehockett added a comment.

Updating documentation


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.cpp
  clang-doc/ClangDoc.h
  clang-doc/ClangDocBinary.cpp
  clang-doc/ClangDocBinary.h
  clang-doc/ClangDocMapper.cpp
  clang-doc/ClangDocMapper.h
  clang-doc/ClangDocRepresentation.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst
  test/CMakeLists.txt
  test/clang-doc/mapper-namespace.cpp
  test/clang-doc/mapper-type.cpp

Index: test/clang-doc/mapper-type.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-type.cpp
@@ -0,0 +1,137 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc  --dump --omit-filenames -doxygen -p %t %t/test.cpp | FileCheck %s
+
+union A { int X; int Y; };
+// CHECK: ---
+// CHECK: KEY: A
+// CHECK: FullyQualifiedName: A
+// CHECK: Name: A
+// CHECK: TagType: 2
+// CHECK: ID: Member
+// CHECK: Type: int
+// CHECK: Name: A::X
+// CHECK: Access: 3
+// CHECK: ID: Member
+// CHECK: Type: int
+// CHECK: Name: A::Y
+// CHECK: Access: 3
+// CHECK: ---
+// CHECK: KEY: A::A
+// CHECK: FullyQualifiedName: A::A
+// CHECK: Name: A
+// CHECK: Namespace: A
+
+enum B { X, Y };
+// CHECK: ---
+// CHECK: KEY: B
+// CHECK: FullyQualifiedName: B
+// CHECK: Name: B
+// CHECK: ID: Member
+// CHECK: Type: X
+// CHECK: Access: 3
+// CHECK: ID: Member
+// CHECK: Type: Y
+// CHECK: Access: 3
+
+struct C { int i; };
+// CHECK: ---
+// CHECK: KEY: C
+// CHECK: FullyQualifiedName: C
+// CHECK: Name: C
+// CHECK: ID: Member
+// CHECK: Type: int
+// CHECK: Name: C::i
+// CHECK: Access: 3
+// CHECK: ---
+// CHECK: KEY: C::C
+// CHECK: FullyQualifiedName: C::C
+// CHECK: Name: C
+// CHECK: Namespace: C
+
+class D {};
+// CHECK: ---
+// CHECK: KEY: D
+// CHECK: FullyQualifiedName: D
+// CHECK: Name: D
+// CHECK: TagType: 3
+// CHECK: ---
+// CHECK: KEY: D::D
+// CHECK: FullyQualifiedName: D::D
+// CHECK: Name: D
+// CHECK: Namespace: D
+
+class E {
+// CHECK: ---
+// CHECK: KEY: E
+// CHECK: FullyQualifiedName: E
+// CHECK: Name: E
+// CHECK: TagType: 3
+// CHECK: ---
+// CHECK: KEY: E::E
+// CHECK: FullyQualifiedName: E::E
+// CHECK: Name: E
+// CHECK: Namespace: E
+
+public:
+	E() {}
+// CHECK: ---
+// CHECK: KEY: _ZN1EC1Ev
+// CHECK: FullyQualifiedName: E::E
+// CHECK: Name: E
+// CHECK: Namespace: E
+// CHECK: MangledName: _ZN1EC1Ev
+// CHECK: Parent: E
+// CHECK: ID: Return
+// CHECK: Type: void
+// CHECK: Access: 3
+
+	 ~E() {}
+// CHECK: ---
+// CHECK: KEY: _ZN1ED1Ev
+// CHECK: FullyQualifiedName: E::~E
+// CHECK: Name: ~E
+// CHECK: Namespace: E
+// CHECK: MangledName: _ZN1ED1Ev
+// CHECK: Parent: E
+// CHECK: ID: Return
+// CHECK: Type: void
+// CHECK: Access: 3
+
+protected:
+	void ProtectedMethod();
+// CHECK:  ---
+// CHECK: KEY: _ZN1E15ProtectedMethodEv
+// CHECK: FullyQualifiedName: _ZN1E15ProtectedMethodEv
+// CHECK: Name: ProtectedMethod
+// CHECK: Namespace: E
+};
+
+void E::ProtectedMethod() {}
+// CHECK: ---
+// CHECK: KEY: _ZN1E15ProtectedMethodEv
+// CHECK: FullyQualifiedName: E::ProtectedMethod
+// CHECK: Name: ProtectedMethod
+// CHECK: Namespace: E
+// CHECK: MangledName: _ZN1E15ProtectedMethodEv
+// CHECK: Parent: E
+// CHECK: ID: Return
+// CHECK: Type: void
+// CHECK: Access: 3
+// CHECK: Access: 1
+
+class F : virtual private D, public E {};
+// CHECK: ---
+// CHECK: KEY: F
+// CHECK: FullyQualifiedName: F
+// CHECK: Name: F
+// CHECK: TagType: 3
+// CHECK: Parent: class E
+// CHECK: VParent: class D
+// CHECK: ---
+// CHECK: KEY: F::F
+// CHECK: FullyQualifiedName: F::F
+// CHECK: Name: F
+// CHECK: Namespace: F
Index: test/clang-doc/mapper-namespace.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,70 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp | FileCheck %s
+
+namespace A {
+// CHECK: ---
+// CHECK: KEY: A
+// CHECK: FullyQualifiedName: A
+// CHECK: Name: A
+
+void f() {};
+// CHECK: ---
+// CHECK: KEY: _ZN1A1fEv
+// CHECK: FullyQualifiedName: A::f
+// CHECK: Name: f
+// CHECK: Namespace: A
+// CHECK: MangledName: _ZN1A1fEv
+// CHECK: ID: Return
+// CHECK: Type: void
+// CHECK: Access: 3
+// CHECK: Access: 3
+
+} // A
+
+namespace A {
+// CHECK: ---
+// CHECK: KEY: A
+// CHECK: FullyQualifiedName: A
+// CHECK: Name: A
+
+namespace B {
+// CHECK: ---
+// CHECK: KEY: A::B
+// CHECK: FullyQualifiedName: A::B
+// CHECK: Name: B
+// CHECK: Namespace: A
+
+enum E { X };
+// CHECK: ---
+// CHECK: KEY: A::B::E
+// CHECK: FullyQualifiedName: A::B::E
+// CHECK: Name: E
+// CHECK: Namespace: A::B
+// CHECK: ID: Member
+// CHECK: Type: A::B::X
+// CHECK: Access: 3
+
+E func(int i) { 
+	return X;

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-09 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 133714.
juliehockett marked 6 inline comments as done.
juliehockett added a comment.

1. Implementing the bitstream decoder (and fixing the encoder)
2. Setting up new tests for the mapper output
3. Fixing comments


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.cpp
  clang-doc/ClangDoc.h
  clang-doc/ClangDocBinary.cpp
  clang-doc/ClangDocBinary.h
  clang-doc/ClangDocMapper.cpp
  clang-doc/ClangDocMapper.h
  clang-doc/ClangDocRepresentation.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst
  test/CMakeLists.txt
  test/clang-doc/Inputs/enum_test.cpp
  test/clang-doc/Inputs/namespace_test.cpp
  test/clang-doc/Inputs/record_test.cpp
  test/clang-doc/mapper-namespace.cpp
  test/clang-doc/mapper-type.cpp

Index: test/clang-doc/mapper-type.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-type.cpp
@@ -0,0 +1,137 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc  --dump --omit-filenames -doxygen -p %t %t/test.cpp | FileCheck %s
+
+union A { int X; int Y; };
+// CHECK: ---
+// CHECK: KEY: A
+// CHECK: FullyQualifiedName: A
+// CHECK: Name: A
+// CHECK: TagType: 2
+// CHECK: ID: Member
+// CHECK: Type: int
+// CHECK: Name: A::X
+// CHECK: Access: 3
+// CHECK: ID: Member
+// CHECK: Type: int
+// CHECK: Name: A::Y
+// CHECK: Access: 3
+// CHECK: ---
+// CHECK: KEY: A::A
+// CHECK: FullyQualifiedName: A::A
+// CHECK: Name: A
+// CHECK: Namespace: A
+
+enum B { X, Y };
+// CHECK: ---
+// CHECK: KEY: B
+// CHECK: FullyQualifiedName: B
+// CHECK: Name: B
+// CHECK: ID: Member
+// CHECK: Type: X
+// CHECK: Access: 3
+// CHECK: ID: Member
+// CHECK: Type: Y
+// CHECK: Access: 3
+
+struct C { int i; };
+// CHECK: ---
+// CHECK: KEY: C
+// CHECK: FullyQualifiedName: C
+// CHECK: Name: C
+// CHECK: ID: Member
+// CHECK: Type: int
+// CHECK: Name: C::i
+// CHECK: Access: 3
+// CHECK: ---
+// CHECK: KEY: C::C
+// CHECK: FullyQualifiedName: C::C
+// CHECK: Name: C
+// CHECK: Namespace: C
+
+class D {};
+// CHECK: ---
+// CHECK: KEY: D
+// CHECK: FullyQualifiedName: D
+// CHECK: Name: D
+// CHECK: TagType: 3
+// CHECK: ---
+// CHECK: KEY: D::D
+// CHECK: FullyQualifiedName: D::D
+// CHECK: Name: D
+// CHECK: Namespace: D
+
+class E {
+// CHECK: ---
+// CHECK: KEY: E
+// CHECK: FullyQualifiedName: E
+// CHECK: Name: E
+// CHECK: TagType: 3
+// CHECK: ---
+// CHECK: KEY: E::E
+// CHECK: FullyQualifiedName: E::E
+// CHECK: Name: E
+// CHECK: Namespace: E
+
+public:
+	E() {}
+// CHECK: ---
+// CHECK: KEY: _ZN1EC1Ev
+// CHECK: FullyQualifiedName: E::E
+// CHECK: Name: E
+// CHECK: Namespace: E
+// CHECK: MangledName: _ZN1EC1Ev
+// CHECK: Parent: E
+// CHECK: ID: Return
+// CHECK: Type: void
+// CHECK: Access: 3
+
+	 ~E() {}
+// CHECK: ---
+// CHECK: KEY: _ZN1ED1Ev
+// CHECK: FullyQualifiedName: E::~E
+// CHECK: Name: ~E
+// CHECK: Namespace: E
+// CHECK: MangledName: _ZN1ED1Ev
+// CHECK: Parent: E
+// CHECK: ID: Return
+// CHECK: Type: void
+// CHECK: Access: 3
+
+protected:
+	void ProtectedMethod();
+// CHECK:  ---
+// CHECK: KEY: _ZN1E15ProtectedMethodEv
+// CHECK: FullyQualifiedName: _ZN1E15ProtectedMethodEv
+// CHECK: Name: ProtectedMethod
+// CHECK: Namespace: E
+};
+
+void E::ProtectedMethod() {}
+// CHECK: ---
+// CHECK: KEY: _ZN1E15ProtectedMethodEv
+// CHECK: FullyQualifiedName: E::ProtectedMethod
+// CHECK: Name: ProtectedMethod
+// CHECK: Namespace: E
+// CHECK: MangledName: _ZN1E15ProtectedMethodEv
+// CHECK: Parent: E
+// CHECK: ID: Return
+// CHECK: Type: void
+// CHECK: Access: 3
+// CHECK: Access: 1
+
+class F : virtual private D, public E {};
+// CHECK: ---
+// CHECK: KEY: F
+// CHECK: FullyQualifiedName: F
+// CHECK: Name: F
+// CHECK: TagType: 3
+// CHECK: Parent: class E
+// CHECK: VParent: class D
+// CHECK: ---
+// CHECK: KEY: F::F
+// CHECK: FullyQualifiedName: F::F
+// CHECK: Name: F
+// CHECK: Namespace: F
Index: test/clang-doc/mapper-namespace.cpp
===
--- /dev/null
+++ test/clang-doc/mapper-namespace.cpp
@@ -0,0 +1,70 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: echo "" > %t/compile_flags.txt
+// RUN: cp "%s" "%t/test.cpp"
+// RUN: clang-doc --dump --omit-filenames -doxygen -p %t %t/test.cpp | FileCheck %s
+
+namespace A {
+// CHECK: ---
+// CHECK: KEY: A
+// CHECK: FullyQualifiedName: A
+// CHECK: Name: A
+
+void f() {};
+// CHECK: ---
+// CHECK: KEY: _ZN1A1fEv
+// CHECK: FullyQualifiedName: A::f
+// CHECK: Name: f
+// CHECK: Namespace: A
+// CHECK: MangledName: _ZN1A1fEv
+// CHECK: ID: Return
+// CHECK: Type: void
+// CHECK: Access: 3
+// CHECK: Access: 3
+
+} // A
+
+namespace A {
+// CHECK: ---
+// CHECK: KEY: A
+// CHECK: FullyQualifiedName: A
+// CHECK: Name: A
+
+namespace B {
+// CHECK: ---
+// CHECK: KEY: A::B
+// CHECK: FullyQualifiedName: A::B
+// CHECK: 

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-07 Thread Jake Ehrlich via Phabricator via cfe-commits
jakehehrlich added a comment.

After the comments I just made are resolved, I'm fine with the low level 
details of the parts of this change that are here




Comment at: clang-doc/ClangDoc.cpp:47
+
+comments::FullComment *ClangDocCallback::getComment(const NamedDecl *D) {
+  RawComment *Comment = Context->getRawCommentForDeclNoCache(D);

I don't see any reason this can't be a const method. If I recall correctly, in
a previous version you said that it can't be const because it modifies the
Comment, but that shouldn't prevent this from being a const method. No
modifications are being made to any members of this object, and no non-const
references/pointers to any of the members are accessed or needed.
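
The general C++ rule behind this can be shown with a small sketch: a const method may call non-const functions on an object it reaches through a pointer member, because const applies to the pointer, not to what it points at. `Context` and `Callback` below are hypothetical stand-ins, not the classes from the patch.

```cpp
// Hypothetical stand-ins for the AST context and the callback class.
struct Context {
  int Lookups = 0;
  int lookup() { return ++Lookups; } // non-const: mutates the context
};

class Callback {
public:
  explicit Callback(Context *C) : Ctx(C) {}

  // const is fine: the method changes *Ctx, not the Ctx pointer itself.
  int getComment() const { return Ctx->lookup(); }

private:
  Context *Ctx;
};
```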



Comment at: clang-doc/ClangDocMapper.cpp:91
+  ClangDocCommentVisitor Visitor(CI);
+  return Visitor.parseComment(C);
+}

If/When you make CurrentCI a reference you should return CI here instead.



Comment at: clang-doc/ClangDocMapper.cpp:181
+  for (comments::Comment *Child : make_range(C->child_begin(), C->child_end())) {
+CommentInfo ChildCI;
+ClangDocCommentVisitor Visitor(ChildCI);

Assuming you make the reference changes above, this should be rewritten to 
something like the following:

CurrentCI.Children.emplace_back();
ClangDocCommentVisitor Visitor(CurrentCI.Children.back());
Visitor.parseComment(Child);



Comment at: clang-doc/ClangDocMapper.cpp:266
+bool ClangDocCommentVisitor::isWhitespaceOnly(StringRef S) const {
+  return S.find_first_not_of(" \t\n\v\f\r") == std::string::npos;
+}

Can you use isspace here instead of keeping a list of characters that are 
considered spaces?
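
One possible shape for this, sketched with the standard library only (`std::string_view` stands in for `StringRef`, which is an assumption made so the example is self-contained). In the default "C" locale `std::isspace` accepts exactly the six characters listed in the original string.

```cpp
#include <algorithm>
#include <cctype>
#include <string_view>

// Returns true if every character is whitespace (or the string is empty).
// Taking the character as unsigned char avoids undefined behavior when
// std::isspace is given a negative value.
inline bool isWhitespaceOnly(std::string_view S) {
  return std::all_of(S.begin(), S.end(),
                     [](unsigned char C) { return std::isspace(C) != 0; });
}
```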



Comment at: clang-doc/ClangDocMapper.h:112
+  
+  CommentInfo parseComment(const comments::Comment *C);
+

This shouldn't return CommentInfo if you make CurrentCI a reference



Comment at: clang-doc/ClangDocMapper.h:129
+  
+  CommentInfo CurrentCI;
+};

This should be a CommentInfo& to avoid copy constructing. It also then lets you 
view ClangDocCommentVisitor as a kind of CommentInfo builder
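
A simplified sketch of the builder idea, assuming the reference change described in these comments: the visitor holds a `CommentInfo&`, fills it in place, and creates each child inside the parent's vector before descending. `Node` is a hypothetical stand-in for the comment AST, and the `CommentInfo` here has only the fields needed for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical simplified input and output trees.
struct Node {
  std::string Text;
  std::vector<Node> Children;
};

struct CommentInfo {
  std::string Text;
  std::vector<CommentInfo> Children;
};

// Fills in a CommentInfo owned by the caller instead of returning a copy.
class CommentVisitor {
public:
  explicit CommentVisitor(CommentInfo &CI) : CurrentCI(CI) {}

  void parseComment(const Node &N) {
    CurrentCI.Text = N.Text;
    for (const Node &Child : N.Children) {
      // Create the child in place, then let a nested visitor fill it.
      CurrentCI.Children.emplace_back();
      CommentVisitor Visitor(CurrentCI.Children.back());
      Visitor.parseComment(Child);
    }
  }

private:
  CommentInfo &CurrentCI;
};
```

The reference to `Children.back()` stays valid while the nested visitor runs, because the nested visitor only appends to the child's own vector, never to the parent's.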


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-06 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett added inline comments.



Comment at: tools/clang-doc/ClangDoc.h:29
+struct ClangDocContext {
+  // Which format in which to emit representation.
+  OutFormat EmitFormat;

sammccall wrote:
> juliehockett wrote:
> > sammccall wrote:
> > > Is this the intermediate representation referred to in the design doc, or 
> > > the final output format?
> > > 
> > > If the former, why two formats rather than picking one?
> > > YAML is nice for being usable by out-of-tree tools (though not as nice as 
> > > JSON). But it seems like providing YAML as a trivial backend format would 
> > > fit well?
> > > Bitcode is presumably more space-efficient - if this is significant in 
> > > practice it seems like a better choice.
> > That's the idea -- for developing purposes, I wrote up the YAML output 
> > first for this patch, and there will be a follow-on patch expanding the 
> > bitcode/binary output. I've updated the flags to default to the binary, 
> > with an option to dump the yaml (rather than the other way around).
> What's still not clear to me is: is YAML a) a "real" intermediate format, or 
> b) just a debug representation?
> 
> I would suggest for orthogonality that there only be one intermediate format, 
> and that any debug version be generated from it. In practice I guess this 
> means:
>  - the reporter builds the in-memory representation
>  - you can serialize/deserialize memory representation to the IR (bitcode)
>  - you can serialize memory representation to debug representation (YAML) but 
> not parse
>  - maybe the clang-doc core should *only* know about IR, and YAML should be 
> produced in the same way e.g. HTML would be?
> 
> This does pose a short-term problem: the canonical IR is bitcode, we need 
> YAML for the lit tests, and we don't have the decoder/transformer part yet. 
> This could be solved either by using YAML as the IR *for now* and switching 
> later, or by adding a simple decoder now.
> Either way it points to the *reporter* not having an output format option, 
> and having to support two formats.
> 
> WDYT? I might be missing something here.
The mapper now only has the ability to write to bitcode -- I'm working on 
writing up a simple decoder to use for testing and will update the patch again 
once that's working. Once that's in place, that will also serve the purpose of 
being the foundation for how we're going to read the bitcode into the backend 
to produce actual docs. Does that make sense?
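
The arrangement discussed above (one serialized intermediate form that round-trips, plus a write-only debug view derived from the in-memory form) could be sketched roughly as follows. This is a toy length-prefixed encoding in plain C++, not the LLVM bitstream format, and every name in it is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical in-memory representation.
struct Info {
  std::string Name;
  std::string Namespace;
};

using Bytes = std::vector<std::uint8_t>;

// Appends a one-byte length followed by the characters. A sketch only:
// strings longer than 255 bytes are not handled.
inline void writeString(Bytes &Out, const std::string &S) {
  Out.push_back(static_cast<std::uint8_t>(S.size()));
  Out.insert(Out.end(), S.begin(), S.end());
}

// Reads one length-prefixed string; returns false on truncated input.
inline bool readString(const Bytes &In, std::size_t &Pos, std::string &S) {
  if (Pos >= In.size() || In.size() - Pos - 1 < In[Pos])
    return false;
  std::size_t Len = In[Pos++];
  S.assign(In.begin() + Pos, In.begin() + Pos + Len);
  Pos += Len;
  return true;
}

// The only serialized form: encode/decode round-trips an Info.
inline Bytes encode(const Info &I) {
  Bytes Out;
  writeString(Out, I.Name);
  writeString(Out, I.Namespace);
  return Out;
}

inline std::optional<Info> decode(const Bytes &In) {
  Info I;
  std::size_t Pos = 0;
  if (!readString(In, Pos, I.Name) || !readString(In, Pos, I.Namespace))
    return std::nullopt;
  return I;
}

// Write-only debug view produced from the in-memory form; nothing parses it.
inline std::string toDebugText(const Info &I) {
  return "Name: " + I.Name + "\nNamespace: " + I.Namespace + "\n";
}
```

A test could then run the mapper, decode its binary output, and check the debug text, without the mapper itself knowing about a second format.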



Comment at: tools/clang-doc/ClangDoc.h:33
+
+class ClangDocVisitor : public RecursiveASTVisitor {
+public:

sammccall wrote:
> juliehockett wrote:
> > jakehehrlich wrote:
> > > sammccall wrote:
> > > > This API makes essentially everything public. Is that the intent?
> > > > 
> > > > It seems like `ClangDocVisitor` is a detail, and the operation you want 
> > > > to expose is "extract doc from this AST into this reporter" or maybe 
> > > > "create an AST consumer that feeds this reporter".
> > > > 
> > > > It would be useful to have an API to extract documentation from 
> > > > individual AST nodes (e.g. a Decl). But I'd be nervous about trying to 
> > > > use the classes exposed here to do that. If it's efficiently possible, 
> > > > it'd be nice to expose a function.
> > > > (one use case for this is clangd)
> > > Correct me if I'm wrong but I believe that everything needs to be public 
> > > in this case because the base class needs to be able to call them. So the 
> > > visit methods all need to be public.
> > Yes to the `Visit*Decl` methods being public because of the base class.
> > 
> > That said, I shifted a few things around here and implemented it as a 
> > `MatchFinder` instead of a `RecursiveASTVisitor`. The change will allow 
> > us to make most of the methods private, and have the ability to fairly 
> > easily implement an API for pulling a specific node (e.g. by name or by 
> > decl type). As far as I understand (and please correct me if I'm wrong), 
> > the matcher traverses the tree in a similar way. This will also make 
> > mapping through individual nodes easier.
> Sorry for being vague - yes overridden or "CRTP-overridden" methods may need 
> to be public.
> I meant that the classes themselves don't need to be exposed, I think. (The 
> header could just expose a function to create the needed ones, that returns 
> `unique_ptr`
> 
> There are now fewer classes exposed here, but I think most/all of them can 
> still reasonably be hidden.
So I've restructured this again and collapsed all of the tooling things into the 
ExecutionContext. The only thing exposed here now is the callback, which is 
registered on the matcher. Is there anything else I'm missing?
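
The "expose only a factory" suggestion quoted above can be sketched like this: the header declares an interface and a function returning `unique_ptr`, and the concrete class lives in an anonymous namespace in the implementation file. `DeclCallback`, `MapperCallback` and `createMapperCallback` are hypothetical names, and the two halves are shown in one block only for brevity.

```cpp
#include <memory>
#include <string>

// --- What a header could expose: an interface and a factory, nothing else.
class DeclCallback {
public:
  virtual ~DeclCallback() = default;
  virtual std::string run(const std::string &DeclName) = 0;
};

std::unique_ptr<DeclCallback> createMapperCallback();

// --- What stays in the .cpp file, invisible to users of the header.
namespace {
class MapperCallback : public DeclCallback {
public:
  std::string run(const std::string &DeclName) override {
    return "mapped:" + DeclName;
  }
};
} // namespace

std::unique_ptr<DeclCallback> createMapperCallback() {
  return std::make_unique<MapperCallback>();
}
```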



Comment at: tools/clang-doc/ClangDocReporter.h:87
   std::string MangledName;
   std::string DefinitionFile;
   std::string ReturnType;

Athosvk wrote:
> Seems common to almost all Info structs, so you can probably 

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-06 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 133108.
juliehockett marked 27 inline comments as done.
juliehockett edited the summary of this revision.
juliehockett edited projects, added clang-tools-extra; removed clang.
juliehockett added a comment.

1. Moved the tool to clang-tools-extra
2. Refactored tool to have two stages to the frontend parsing: the mapping
   stage (this patch), which uses ASTMatchers to extract declarations and
   serialize each into an individual key-value pair in the ExecutionContext,
   and the reducing stage (next patch, not yet implemented), which will take
   the records in the ExecutionContext and reduce them by key to produce and
   write the final output of the frontend.
3. Replaced the YAML serialization with bitcode serialization. Will update
   again with tests once I've implemented a simple decoder for the serial
   bitcode format.
4. Streamlined the emit*Info function call path.
5. Introduced a new layer into the Info inheritance to better represent each
   level.
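
As a rough picture of the two stages described in this update, the sketch below treats the mapping stage's output as a list of key-value pairs and shows a reduce step that groups them by key. It uses standard containers only; `KeyValue` and `reduceByKey` are hypothetical names, and the real tool stores its records in the ExecutionContext rather than in a vector.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record emitted by the mapping stage: one per declaration seen.
using KeyValue = std::pair<std::string, std::string>;

// Reducing stage: group all values that share a key.
inline std::map<std::string, std::vector<std::string>>
reduceByKey(const std::vector<KeyValue> &Mapped) {
  std::map<std::string, std::vector<std::string>> Reduced;
  for (const KeyValue &KV : Mapped)
    Reduced[KV.first].push_back(KV.second);
  return Reduced;
}
```

Each group would then be merged into a single record for that key before the final output is written.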


https://reviews.llvm.org/D41102

Files:
  CMakeLists.txt
  clang-doc/CMakeLists.txt
  clang-doc/ClangDoc.cpp
  clang-doc/ClangDoc.h
  clang-doc/ClangDocBinary.cpp
  clang-doc/ClangDocBinary.h
  clang-doc/ClangDocMapper.cpp
  clang-doc/ClangDocMapper.h
  clang-doc/tool/CMakeLists.txt
  clang-doc/tool/ClangDocMain.cpp
  docs/clang-doc.rst

Index: docs/clang-doc.rst
===
--- /dev/null
+++ docs/clang-doc.rst
@@ -0,0 +1,13 @@
+===
+Clang-Doc
+===
+
+.. contents::
+
+Intro
+
+Setup
+=
+
+Use
+
Index: clang-doc/tool/ClangDocMain.cpp
===
--- /dev/null
+++ clang-doc/tool/ClangDocMain.cpp
@@ -0,0 +1,88 @@
+//===-- ClangDocMain.cpp - Clangdoc -*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "ClangDoc.h"
+#include "clang/AST/AST.h"
+#include "clang/AST/Decl.h"
+#include "clang/ASTMatchers/ASTMatchFinder.h"
+#include "clang/ASTMatchers/ASTMatchersInternal.h"
+#include "clang/Driver/Options.h"
+#include "clang/Frontend/FrontendActions.h"
+#include "clang/Tooling/CommonOptionsParser.h"
+#include "clang/Tooling/Execution.h"
+#include "clang/Tooling/StandaloneExecution.h"
+#include "clang/Tooling/Tooling.h"
+#include "llvm/ADT/APFloat.h"
+#include "llvm/Support/Path.h"
+#include "llvm/Support/Process.h"
+#include "llvm/Support/Signals.h"
+#include "llvm/Support/raw_ostream.h"
+#include 
+
+using namespace clang::ast_matchers;
+using namespace clang::tooling;
+using namespace clang;
+using namespace llvm;
+
+namespace {
+
+static cl::extrahelp CommonHelp(CommonOptionsParser::HelpMessage);
+static cl::OptionCategory ClangDocCategory("clang-doc options");
+
+static cl::opt<bool> DumpResult("dump", cl::desc("Dump results to stdout."),
+cl::init(false), cl::cat(ClangDocCategory));
+
+static cl::opt<bool> OmitFilenames("omit-filenames", cl::desc("Omit filenames in output."),
+   cl::init(false), cl::cat(ClangDocCategory));
+
+static cl::opt<bool> DoxygenOnly("doxygen", cl::desc("Use only doxygen-style comments to generate docs."),
+cl::init(false), cl::cat(ClangDocCategory));
+
+} // namespace
+
+int main(int argc, const char **argv) {
+  sys::PrintStackTraceOnErrorSignal(argv[0]);
+
+  auto Exec = clang::tooling::createExecutorFromCommandLineArgs(
+argc, argv, ClangDocCategory);
+
+  if (!Exec) {
+errs() << toString(Exec.takeError()) << "\n";
+return 1;
+  }
+
+  MatchFinder Finder;
+  ExecutionContext *ECtx = Exec->get()->getExecutionContext();
+  
+  doc::ClangDocCallback NCallback("namespace", *ECtx, OmitFilenames);
+  Finder.addMatcher(namespaceDecl().bind("namespace"), &NCallback);
+  doc::ClangDocCallback RCallback("record", *ECtx, OmitFilenames);
+  Finder.addMatcher(recordDecl().bind("record"), &RCallback);
+  doc::ClangDocCallback ECallback("enum", *ECtx, OmitFilenames);
+  Finder.addMatcher(enumDecl().bind("enum"), &ECallback);
+  doc::ClangDocCallback MCallback("method", *ECtx, OmitFilenames);
+  Finder.addMatcher(cxxMethodDecl(isUserProvided()).bind("method"), &MCallback);
+  doc::ClangDocCallback FCallback("function", *ECtx, OmitFilenames);
+  Finder.addMatcher(functionDecl(unless(cxxMethodDecl())).bind("function"), &FCallback);
+  
+  ArgumentsAdjuster ArgAdjuster;
+  if (!DoxygenOnly)
+ArgAdjuster = combineAdjusters(getInsertArgumentAdjuster("-fparse-all-comments", tooling::ArgumentInsertPosition::BEGIN), ArgAdjuster);
+  auto Err = Exec->get()->execute(newFrontendActionFactory(), ArgAdjuster);
+  if (Err)
+errs() << toString(std::move(Err)) << "\n";
+
+  if (DumpResult)
+Exec->get()->getToolResults()->forEachResult(
+[](StringRef 

[PATCH] D41102: Setup clang-doc frontend framework

2018-02-06 Thread Athos via Phabricator via cfe-commits
Athosvk added inline comments.



Comment at: tools/clang-doc/ClangDocReporter.h:39
 // Info for named types (parameters, members).
 struct NamedType {
   std::string Type;

Storing the type information seems more suitable than storing just the name and 
type as a string. In my view, the frontend creates a format suitable for 
(almost) any backend to use without further parsing. This would for example 
require me to parse part of the name to get the namespace.



Comment at: tools/clang-doc/ClangDocReporter.h:75
 struct Info {
   bool isDefined = false;
   std::string FullyQualifiedName;

Nitpick, but this should probably be 'IsDefined'



Comment at: tools/clang-doc/ClangDocReporter.h:87
   std::string MangledName;
   std::string DefinitionFile;
   std::string ReturnType;

Seems common to almost all Info structs, so you can probably move it to 
the/some base.

Namespaces do seem unrelated, so maybe you can make another struct in between? 
E.g. something like struct SymbolInfo : Info which contains a field for 
DefinitionFile and Locations (since that may not be used for namespaces either).

Additionally, what will you do when you merge this output information from 
multiple compilation units back together? Only one should have the 
DefinitionFile set as the other compilation units won't see the definition. 
What happens if a function stays undefined? Can you generate documentation for 
it?
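
To make the layering and the merge question concrete, here is a minimal sketch under stated assumptions: an intermediate `SymbolInfo` carries `DefinitionFile` and `Locations`, and merging two translation units' views keeps whichever definition file is present. The struct layout and the `merge` function are hypothetical and are not the patch's design.

```cpp
#include <string>
#include <vector>

// Hypothetical layering, following the suggestion in the comment.
struct Info {
  std::string FullyQualifiedName;
};

struct SymbolInfo : Info {
  std::string DefinitionFile;         // empty if no definition was seen
  std::vector<std::string> Locations; // declarations seen so far
};

// Merges what another translation unit saw into Into. The definition
// file is taken from whichever side has one.
inline void merge(SymbolInfo &Into, const SymbolInfo &From) {
  if (Into.DefinitionFile.empty())
    Into.DefinitionFile = From.DefinitionFile;
  Into.Locations.insert(Into.Locations.end(), From.Locations.begin(),
                        From.Locations.end());
}
```

Under this scheme a function that is never defined simply ends up with an empty `DefinitionFile`; it can still be documented from its declarations.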



Comment at: tools/clang-doc/ClangDocReporter.h:88
   std::string DefinitionFile;
   std::string ReturnType;
   llvm::SmallVector Params;

This should probably be a NamedType like the parameters



Comment at: tools/clang-doc/ClangDocReporter.h:89
   std::string ReturnType;
   llvm::SmallVector Params;
   AccessSpecifier Access;

Perhaps you could already attach the parameter comments to this? So something 
like a 

```
struct ParamInfo
{
NamedType Type;
std::string/CommentInfo Description/CommentInfo;
}
```
Or are you planning to leave this for a later stage? At this point it seems like 
the backend will have to parse a CommentInfo struct to attach comments to 
parameters etc. manually



Comment at: tools/clang-doc/ClangDocReporter.h:93
 
 struct NamespaceInfo : public Info {
+  llvm::StringMap Functions;

Currently info stores the locations of occurrences, but this seems like a hard 
thing to do when it comes to namespaces. Is it useful to have this particular 
information?


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-06 Thread Athos via Phabricator via cfe-commits
Athosvk added a comment.

As we had discussed before, we're interested in the development as well! As an 
overall comment, I speak from experience that maintaining a large degree of 
documentation throughout the source code of the tool can provide an excellent 
test-case.

We sure hope this will move forward well, but so far it seems to be heading in 
the right overall direction. I'll try to place some of my thoughts on the 
overall structure as well!


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-05 Thread Sam McCall via Phabricator via cfe-commits
sammccall added a subscriber: bkramer.
sammccall added a comment.

In https://reviews.llvm.org/D41102#998180, @thakis wrote:

> This should be in clang-tools-extra next to clang-tidy, clang-include-fixer, 
> clangd etc, not in the main compiler repo, right?


I agree. I see there was earlier discussion on this, which concluded that 
clang-tools-extra was going to merge into clang.
From talking to people working in c-t-e (particularly @bkramer), it sounds like:

- a merge isn't likely to happen soon
- clang-tools is appropriate as tools are mature, support the platforms clang 
does, and are part of more developers' workflows (e.g. clang-format)
- new things should probably go in c-t-e if possible

Sorry for the back and forth :-\


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-05 Thread Nico Weber via Phabricator via cfe-commits
thakis added a comment.

This should be in clang-tools-extra next to clang-tidy, clang-include-fixer, 
clangd etc, not in the main compiler repo, right?


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-05 Thread Jake Ehrlich via Phabricator via cfe-commits
jakehehrlich added a comment.

Additional note: This diff is a diff from your last commit not the full diff 
relative to origin/master which is what should be up here.




Comment at: tools/clang-doc/ClangDoc.cpp:34
 
-bool ClangDocVisitor::VisitNamespaceDecl(const NamespaceDecl *D) {
+template <>
+void ClangDocCallback::processMatchedDecl(

I can't think of a good way to dedup these two methods at the moment. Can you 
put a TODO here to deduplicate these two specializations?



Comment at: tools/clang-doc/ClangDoc.h:69
 
-private:
-  ClangDocVisitor Visitor;
-  ClangDocReporter 
-};
-
-class ClangDocAction : public clang::ASTFrontendAction {
-public:
-  ClangDocAction(ClangDocReporter ) : Reporter(Reporter) {}
-
-  virtual std::unique_ptr
-  CreateASTConsumer(clang::CompilerInstance , llvm::StringRef InFile);
+  virtual void HandleTranslationUnit(clang::ASTContext ) {
+Finder->matchAST(Context);

This should be moved to the .cpp file. Because there is no key function 
(https://itanium-cxx-abi.github.io/cxx-abi/abi.html#vague-vtable) this method 
will be redefined in every translation unit that includes this header.



Comment at: tools/clang-doc/ClangDocReporter.cpp:422
+void ClangDocReporter::serializeLLVM(StringRef RootDir) {
+  // TODO: Implement.
+}

jakehehrlich wrote:
> Can you report an error to the user that says something along the lines of 
> "not implemented yet" (leave the TODO as well)
I think it would be better if instead of returning a string, you just fail and 
print a message to the user (well, first print the message and then fail).



Comment at: tools/clang-doc/ClangDocReporter.cpp:41
+  Docs.Files[Filename] = llvm::make_unique();
+  Docs.Files[Filename]->Filename = Filename;
 }

nit: Can this just do one lookup?

```
auto F = llvm::make_unique();
F->Filename = Filename;
Docs.Files[Filename] = std::move(F);
```
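
A minimal sketch of the single-lookup idea, not the patch's code: it uses std::map and std::make_unique as stand-ins for llvm::StringMap and llvm::make_unique, and `FileInfo`, `FileMap` and `addFile` are hypothetical names chosen for illustration. Unlike the `[]` assignment above, emplace keeps an existing entry rather than overwriting it.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the per-file record in the patch.
struct FileInfo {
  std::string Filename;
};

using FileMap = std::map<std::string, std::unique_ptr<FileInfo>>;

// Builds the record first, then touches the map exactly once.
// emplace leaves an already-present entry untouched, so calling this
// twice for the same file returns the record created the first time.
FileInfo &addFile(FileMap &Files, const std::string &Filename) {
  auto F = std::make_unique<FileInfo>();
  F->Filename = Filename;
  auto Result = Files.emplace(Filename, std::move(F));
  return *Result.first->second;
}
```

Whether keeping or overwriting an existing entry is the right behaviour depends on how the reporter is meant to handle a file that is seen twice.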



Comment at: tools/clang-doc/ClangDocReporter.cpp:136
+  if (NS == Docs.Namespaces.end()) {
+Docs.Namespaces[Namespace] = llvm::make_unique();
+Docs.Namespaces[Namespace]->FullyQualifiedName = Namespace;

nit: could you rewrite with a single lookup.



Comment at: tools/clang-doc/ClangDocReporter.cpp:190
 return;
-  CommentInfo CI;
-  parseFullComment(C, CI);
-  I.Descriptions.push_back(CI);
+  I.Descriptions.push_back(std::move(parseFullComment(C)));
 }

If you use emplace_back here you don't need the explicit std::move
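
A small sketch of why the explicit std::move is redundant, under the assumption that parseFullComment returns by value as in the updated diff. `CommentInfo` here is a hypothetical stand-in that counts copies, and `parseComment` and `collect` are illustrative names. A call that returns by value is already an rvalue, so the result is moved into the vector either way.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for CommentInfo that counts copy constructions.
struct CommentInfo {
  static int Copies;
  std::string Text;
  CommentInfo() = default;
  explicit CommentInfo(std::string T) : Text(std::move(T)) {}
  CommentInfo(const CommentInfo &O) : Text(O.Text) { ++Copies; }
  CommentInfo(CommentInfo &&) noexcept = default;
};
int CommentInfo::Copies = 0;

// Returns by value, mirroring the shape of parseFullComment(C).
CommentInfo parseComment(const std::string &Raw) { return CommentInfo(Raw); }

// The call expression is already an rvalue, so no std::move is needed
// to avoid a copy when appending the result.
void collect(std::vector<CommentInfo> &Out, const std::string &Raw) {
  Out.emplace_back(parseComment(Raw));
}
```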



Comment at: tools/clang-doc/ClangDocReporter.cpp:281-288
+  sys::path::native(RootDir, FilePath);
+  sys::path::append(FilePath, "files.yaml");
+  raw_fd_ostream FileOS(FilePath, OutErrorInfo, sys::fs::F_None);
+  if (OutErrorInfo != OK) {
+errs() << OutErrorInfo.message();
+errs() << " Error opening documentation file.\n";
+return;

You use the same basic code 3 times for different file names. Can you factor 
that out into a function? Also in this block you output the OutErrorInfo 
message but in blocks below you don't. You should always output that message.
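
One way the shared block could be factored out, sketched with the standard library rather than raw_fd_ostream so it stands alone. `openDocFile` is a hypothetical helper name, and recovering the reason from errno is an assumption that holds on common platforms but is not guaranteed by the C++ standard.

```cpp
#include <cerrno>
#include <fstream>
#include <iostream>
#include <string>
#include <system_error>

// Hypothetical helper: opens Dir/Name for writing and reports any
// failure in one place, so every caller prints the same diagnostic
// including the underlying error message.
std::error_code openDocFile(const std::string &Dir, const std::string &Name,
                            std::ofstream &OS) {
  const std::string Path = Dir + "/" + Name;
  errno = 0;
  OS.open(Path, std::ios::out | std::ios::trunc);
  if (OS.is_open())
    return std::error_code();
  // errno is commonly, though not portably, set by a failed open;
  // fall back to a generic I/O error if it was left at zero.
  std::error_code EC(errno != 0 ? errno : EIO, std::generic_category());
  std::cerr << EC.message() << ": error opening documentation file '" << Path
            << "'.\n";
  return EC;
}
```

The three call sites would then differ only in the file name they pass.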



Comment at: tools/clang-doc/ClangDocReporter.cpp:335
+  RootCI->Kind = C->getCommentKindName();
+  CurrentCI = RootCI.get();
+  ConstCommentVisitor::visit(C);

Instead of assigning a CI like this, could you construct a new 
ClangDocCommentVisitor on the stack? The idea would be that you would 
still have a "CI" member variable that would be set in the 
ClangDocCommentVisitor's constructor. That way it never has to change and each 
visitor is just responsible for constructing one CommentInfo
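
A rough sketch of that shape, using a toy comment tree instead of clang's comments::Comment hierarchy. `Comment`, `CommentInfo` and `CommentVisitor` below are simplified, hypothetical types; the point is only that each visitor is bound to one CommentInfo at construction and children get their own visitor on the stack.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified input tree standing in for clang's comment AST.
struct Comment {
  std::string Kind;
  std::vector<Comment> Children;
};

// Simplified output record.
struct CommentInfo {
  std::string Kind;
  std::vector<std::unique_ptr<CommentInfo>> Children;
};

// Each visitor fills exactly one CommentInfo, fixed at construction.
class CommentVisitor {
public:
  explicit CommentVisitor(CommentInfo &CI) : CI(CI) {}

  void parse(const Comment &C) {
    CI.Kind = C.Kind;
    for (const Comment &Child : C.Children) {
      CI.Children.push_back(std::make_unique<CommentInfo>());
      // A fresh visitor on the stack handles the child, so this
      // visitor's target never has to be reassigned.
      CommentVisitor ChildVisitor(*CI.Children.back());
      ChildVisitor.parse(Child);
    }
  }

private:
  CommentInfo &CI;
};
```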





Comment at: tools/clang-doc/ClangDocReporter.h:168
 private:
+  template 
+  void createInfo(llvm::StringMap , const C *D,

If you add a public "using DeclType = FooDecl;" to each "FooInfo" you can 
eliminate the second template argument and make the intent of this code more 
clear. This also formalizes the connection these types have to each other.
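
To illustrate the suggestion, here is a sketch with toy declaration types and std::map in place of llvm::StringMap. All names below (`NamespaceDecl`, `EnumDecl`, `InfoMap`, and this `createInfo` signature) are hypothetical stand-ins, not the patch's definitions.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-ins for clang's declaration classes.
struct NamespaceDecl { std::string Name; };
struct EnumDecl { std::string Name; };

// Each Info names the declaration type it is built from.
struct NamespaceInfo {
  using DeclType = NamespaceDecl;
  std::string Name;
};
struct EnumInfo {
  using DeclType = EnumDecl;
  std::string Name;
};

template <typename InfoT>
using InfoMap = std::map<std::string, std::unique_ptr<InfoT>>;

// Only the Info type is a template parameter; the declaration type is
// derived from it, so passing a mismatched declaration fails to compile.
template <typename InfoT>
InfoT &createInfo(InfoMap<InfoT> &Infos, const typename InfoT::DeclType &D) {
  std::unique_ptr<InfoT> &Slot = Infos[D.Name];
  if (!Slot) {
    Slot = std::make_unique<InfoT>();
    Slot->Name = D.Name;
  }
  return *Slot;
}
```

The Info type is deduced from the map argument, so call sites do not need to spell out either template argument.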



Comment at: tools/clang-doc/tool/ClangDocMain.cpp:76
+  if (DirectoryStatus != OK) {
+errs() << "Unable to create documentation directories.\n";
+return 1;

Can you convert this error_code to a message and display that to the user?


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-02-01 Thread Sam McCall via Phabricator via cfe-commits
sammccall added a comment.

In https://reviews.llvm.org/D41102#994311, @juliehockett wrote:

> I'm going to take a stab at refactoring the serialization part of this next 
> -- rather than keeping it all in memory and dumping it at the end, it should 
> serialize as it goes and do some sort of reduce at the end. I'm currently 
> thinking of writing it out to a file as it goes, and then reading and 
> reducing from there -- thoughts?


Sounds awesome :-) Some thoughts on structure (sorry for the rant - clangd's 
indexer has a similar problem so I've been thinking about this).

This neatly separates the problem out into

- "business logic" for map and reduce - these are basically pure functions
- imperative frameworky stuff: serializing/deserializing, grouping map keys 
together, running threadpools

Sadly we don't have a generic implementation of the frameworky part in 
libtooling I think, so you'll probably end up writing the bits you need in 
clang-doc.
There is `ExecutionContext` in Tooling/Execution.h, which allows you to report 
KV pairs, and loop over them afterwards, which models map output (must be 
serialized as strings).

Still I think there's value in separating out the two layers:

- conceptual clarity
- testing (the pure functions are nice to test)
- one can hook up the mapper/reducer to hadoop, or google's internal 
infrastructure, or ...
- it can inspire and benefit from someone writing a reusable upstream MR API

So maybe the API could look something roughly like:

`newFrontendActionFactory(DocSink)` returns clang-doc's extractor part (the 
mapper, i.e. most of this patch)
`DocSink` is a class with a bunch of methods like `emitNamespaceInfo(StringRef 
Name, NSInfo)`. This adapts to the MR machinery: it could write to 
ExecutionContext, or files on disk as you suggest.
Merging provided via classes like `class MergeNS { MergeNS(StringRef Name); 
void add(NSInfo); NSInfo finish(); }` These would be invoked by the MR 
machinery (for now, just code that iterates over the ExecutionContext or files 
on disk).

In this case, you could test the mapper in isolation by writing a small tool 
that just runs the mapper over some files, gathers the result, and dumps it as 
YAML. The reducer and the MR driver wouldn't need to be part of this patch.
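
A very rough sketch of how those pieces could fit together, using only the standard library. `NSInfo` is pared down to a name and a set of function names, and `InMemorySink` and `reduceAll` are hypothetical stand-ins for whatever would write to ExecutionContext or to disk and drive the merge.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, pared-down record; a real NSInfo would carry more.
struct NSInfo {
  std::string Name;
  std::set<std::string> Functions;
};

// Mapper output interface: the extractor only ever talks to this.
class DocSink {
public:
  virtual ~DocSink() = default;
  virtual void emitNamespaceInfo(const std::string &Name, NSInfo Info) = 0;
};

// One possible sink: groups emitted values by key in memory.
class InMemorySink : public DocSink {
public:
  void emitNamespaceInfo(const std::string &Name, NSInfo Info) override {
    Namespaces[Name].push_back(std::move(Info));
  }
  std::map<std::string, std::vector<NSInfo>> Namespaces;
};

// Reducer for a single key: accumulate, then finish.
class MergeNS {
public:
  explicit MergeNS(std::string Name) { Result.Name = std::move(Name); }
  void add(const NSInfo &Info) {
    Result.Functions.insert(Info.Functions.begin(), Info.Functions.end());
  }
  NSInfo finish() { return std::move(Result); }

private:
  NSInfo Result;
};

// Minimal driver standing in for the MR machinery: one merge per key.
std::vector<NSInfo> reduceAll(const InMemorySink &Sink) {
  std::vector<NSInfo> Out;
  for (const auto &KV : Sink.Namespaces) {
    MergeNS Merge(KV.first);
    for (const NSInfo &Info : KV.second)
      Merge.add(Info);
    Out.push_back(Merge.finish());
  }
  return Out;
}
```

The mapper and MergeNS know nothing about storage or threading here, which is what makes them easy to test in isolation.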

Anyway, of course you may want to take this in a different direction - just 
some ideas.




Comment at: tools/clang-doc/ClangDoc.h:29
+struct ClangDocContext {
+  // Which format in which to emit representation.
+  OutFormat EmitFormat;

juliehockett wrote:
> sammccall wrote:
> > Is this the intermediate representation referred to in the design doc, or 
> > the final output format?
> > 
> > If the former, why two formats rather than picking one?
> > YAML is nice for being usable by out-of-tree tools (though not as nice as 
> > JSON). But it seems like providing YAML as a trivial backend format would 
> > fit well?
> > Bitcode is presumably more space-efficient - if this is significant in 
> > practice it seems like a better choice.
> That's the idea -- for developing purposes, I wrote up the YAML output first 
> for this patch, and there will be a follow-on patch expanding the 
> bitcode/binary output. I've updated the flags to default to the binary, with 
> an option to dump the yaml (rather than the other way around).
What's still not clear to me is: is YAML a) a "real" intermediate format, or b) 
just a debug representation?

I would suggest for orthogonality that there only be one intermediate format, 
and that any debug version be generated from it. In practice I guess this means:
 - the reporter builds the in-memory representation
 - you can serialize/deserialize memory representation to the IR (bitcode)
 - you can serialize memory representation to debug representation (YAML) but 
not parse
 - maybe the clang-doc core should *only* know about IR, and YAML should be 
produced in the same way e.g. HTML would be?

This does pose a short-term problem: the canonical IR is bitcode, we need YAML 
for the lit tests, and we don't have the decoder/transformer part yet. This 
could be solved either by using YAML as the IR *for now* and switching later, 
or by adding a simple decoder now.
Either way it points to the *reporter* not having an output format option, and 
having to support two formats.

WDYT? I might be missing something here.



Comment at: tools/clang-doc/ClangDoc.h:33
+
+class ClangDocVisitor : public RecursiveASTVisitor {
+public:

juliehockett wrote:
> jakehehrlich wrote:
> > sammccall wrote:
> > > This API makes essentially everything public. Is that the intent?
> > > 
> > > It seems like `ClangDocVisitor` is a detail, and the operation you want 
> > > to expose is "extract doc from this AST into this reporter" or maybe 
> > > "create an AST consumer that feeds this reporter".
> > > 
> > > It would be useful to have an API to extract documentation from 
> > > individual AST nodes 

[PATCH] D41102: Setup clang-doc frontend framework

2018-01-31 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett planned changes to this revision.
juliehockett added inline comments.



Comment at: tools/clang-doc/ClangDoc.cpp:60
+
+comments::FullComment *ClangDocVisitor::getComment(const Decl *D) {
+  RawComment *Comment = Context->getRawCommentForDeclNoCache(D);

jakehehrlich wrote:
> Can this be a const method?
Not right now -- it's actually updating the `Attached` attribute of the 
comment, since it's not actually set in the initial parsing. It should be moved 
out into the initial comment parsing (see FIXME), but that's a separate patch. 
I should probably write that. :)



Comment at: tools/clang-doc/ClangDoc.cpp:123
+ClangDocAction::CreateASTConsumer(CompilerInstance , StringRef InFile) {
+  return make_unique((), Reporter);
+}

jakehehrlich wrote:
> Pro Tip: Always explicitly refer to this as "llvm::make_unique" because 
> you'll have to revert this change if you don't.
> 
> Some of the build bots have C++14 headers instead of C++11 headers. This 
> means that llvm::make_unique and std::make_unique will both be defined. This 
> means that using "make_unique" will cause an error even though only 
> llvm::make_unique can be referred to unqualified. So even if you're inside of 
> the llvm namespace you should explicitly refer to "llvm::make_unique" and 
> never use "make_unique".
Oh interesting -- thanks for the tip!



Comment at: tools/clang-doc/ClangDoc.h:29
+struct ClangDocContext {
+  // Which format in which to emit representation.
+  OutFormat EmitFormat;

sammccall wrote:
> Is this the intermediate representation referred to in the design doc, or the 
> final output format?
> 
> If the former, why two formats rather than picking one?
> YAML is nice for being usable by out-of-tree tools (though not as nice as 
> JSON). But it seems like providing YAML as a trivial backend format would fit 
> well?
> Bitcode is presumably more space-efficient - if this is significant in 
> practice it seems like a better choice.
That's the idea -- for developing purposes, I wrote up the YAML output first 
for this patch, and there will be a follow-on patch expanding the 
bitcode/binary output. I've updated the flags to default to the binary, with an 
option to dump the yaml (rather than the other way around).



Comment at: tools/clang-doc/ClangDoc.h:33
+
+class ClangDocVisitor : public RecursiveASTVisitor {
+public:

jakehehrlich wrote:
> sammccall wrote:
> > This API makes essentially everything public. Is that the intent?
> > 
> > It seems like `ClangDocVisitor` is a detail, and the operation you want to 
> > expose is "extract doc from this AST into this reporter" or maybe "create 
> > an AST consumer that feeds this reporter".
> > 
> > It would be useful to have an API to extract documentation from individual 
> > AST nodes (e.g. a Decl). But I'd be nervous about trying to use the classes 
> > exposed here to do that. If it's efficiently possible, it'd be nice to 
> > expose a function.
> > (one use case for this is clangd)
> Correct me if I'm wrong but I believe that everything needs to be public in 
> this case because the base class needs to be able to call them. So the visit 
> methods all need to be public.
Yes to the `Visit*Decl` methods being public because of the base class.

That said, I shifted a few things around here and implemented it as a 
`MatchFinder` instead of a `RecursiveASTVisitor`. The change will allow us to 
make most of the methods private, and have the ability to fairly easily 
implement an API for pulling a specific node (e.g. by name or by decl type). As 
far as I understand (and please correct me if I'm wrong), the matcher traverses 
the tree in a similar way. This will also make mapping through individual nodes 
easier.



Comment at: tools/clang-doc/ClangDoc.h:39
+
+  bool VisitTagDecl(const TagDecl *D);
+  bool VisitNamespaceDecl(const NamespaceDecl *D);

sammccall wrote:
> `override` where applicable
I might be wrong, but I don't believe the Visit*Decl methods are overrides for 
RecursiveASTVisitor?



Comment at: tools/clang-doc/ClangDocReporter.h:48
+/// A representation of a parsed comment.
+struct CommentInfo {
+  std::string Kind;

jakehehrlich wrote:
> There are a lot of std::string members here, do we know for sure that we need 
> CommentInfo to own all of these? My general strategy is to avoid owning data 
> (e.g. use StringRef and ArrayRef) unless there's a good reason to own the 
> data. This is a general question I have about all FooInfo structs.
So the issue here is that most of this data is owned by the various Decl 
things, which will go out of scope before the data is serialized. That said, I 
think this won't be a problem at all once I refactor the intermediate output 
for mapping and reducing instead of doing it in memory.



Comment at: 

[PATCH] D41102: Setup clang-doc frontend framework

2018-01-31 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 132306.
juliehockett marked 47 inline comments as done.
juliehockett added a comment.

1. Changing the traversal pattern from using `RecursiveASTVisitor` to using 
matchers instead. This will allow for a more flexible API (e.g. allowing access 
to individual nodes, rather than forcing all data on the user).
2. Templatizing lots of things. There was a lot of duplicated code.
3. Fixing comments

I'm going to take a stab at refactoring the serialization part of this next -- 
rather than keeping it all in memory and dumping it at the end, it should 
serialize as it goes and do some sort of reduce at the end. I'm currently 
thinking of writing it out to a file as it goes, and then reading and reducing 
from there -- thoughts?


https://reviews.llvm.org/D41102

Files:
  test/Tooling/clang-doc-basic.cpp
  test/Tooling/clang-doc-namespace.cpp
  test/Tooling/clang-doc-type.cpp
  tools/clang-doc/ClangDoc.cpp
  tools/clang-doc/ClangDoc.h
  tools/clang-doc/ClangDocReporter.cpp
  tools/clang-doc/ClangDocReporter.h
  tools/clang-doc/ClangDocYAML.h
  tools/clang-doc/tool/ClangDocMain.cpp

Index: tools/clang-doc/tool/ClangDocMain.cpp
===
--- tools/clang-doc/tool/ClangDocMain.cpp
+++ tools/clang-doc/tool/ClangDocMain.cpp
@@ -8,35 +8,44 @@
 //===--===//
 
 #include "ClangDoc.h"
+#include "clang/AST/AST.h"
+#include "clang/AST/Decl.h"
+#include "clang/ASTMatchers/ASTMatchFinder.h"
+#include "clang/ASTMatchers/ASTMatchersInternal.h"
 #include "clang/Driver/Options.h"
 #include "clang/Frontend/FrontendActions.h"
 #include "clang/Tooling/CommonOptionsParser.h"
 #include "clang/Tooling/Tooling.h"
+#include "llvm/ADT/APFloat.h"
+#include "llvm/Support/Path.h"
 #include "llvm/Support/Process.h"
 #include "llvm/Support/Signals.h"
+#include "llvm/Support/raw_ostream.h"
 #include 
 
+using namespace clang::ast_matchers;
 using namespace clang;
 using namespace llvm;
 
 namespace {
 
-cl::OptionCategory ClangDocCategory("clang-doc options");
+static cl::OptionCategory ClangDocCategory("clang-doc options");
+
+static cl::opt
+OutDirectory("root", cl::desc("Directory for generated files."),
+ cl::init("docs"), cl::cat(ClangDocCategory));
 
 static cl::opt
-EmitLLVM("emit-llvm",
- cl::desc("Output in LLVM bitstream format (default is YAML)."),
+EmitYAML("emit-yaml",
+ cl::desc("Output in YAML format (default is clang-doc binary)."),
  cl::init(false), cl::cat(ClangDocCategory));
 
-static cl::opt
-   DumpResult("dump",
-cl::desc("Dump results to stdout."),
-cl::init(false), cl::cat(ClangDocCategory));
+static cl::opt DumpResult("dump", cl::desc("Dump results to stdout."),
+cl::init(false), cl::cat(ClangDocCategory));
 
-static cl::opt
-   OmitFilenames("omit-filenames",
-cl::desc("Omit filenames in output."),
-cl::init(false), cl::cat(ClangDocCategory));
+static cl::opt OmitFilenames("omit-filenames",
+   cl::desc("Omit filenames in output."),
+   cl::init(false), cl::cat(ClangDocCategory));
 
 static cl::opt
 DoxygenOnly("doxygen",
@@ -48,31 +57,73 @@
 int main(int argc, const char **argv) {
   sys::PrintStackTraceOnErrorSignal(argv[0]);
   tooling::CommonOptionsParser OptionsParser(argc, argv, ClangDocCategory);
-
-  clang::doc::OutFormat EmitFormat =
-  EmitLLVM ? doc::OutFormat::LLVM : doc::OutFormat::YAML;
-
-  doc::ClangDocReporter Reporter(OptionsParser.getSourcePathList(), OmitFilenames);
-  doc::ClangDocContext Context{EmitFormat};
+  std::error_code OK;
+
+  doc::OutFormat EmitFormat;
+  SmallString<128> IRFilePath;
+  sys::path::native(OutDirectory, IRFilePath);
+  std::string IRFilename;
+  if (EmitYAML) {
+EmitFormat = doc::OutFormat::YAML;
+sys::path::append(IRFilePath, "yaml");
+  } else {
+EmitFormat = doc::OutFormat::BIN;
+sys::path::append(IRFilePath, "bin");
+  }
+
+  std::error_code DirectoryStatus = sys::fs::create_directories(IRFilePath);
+  if (DirectoryStatus != OK) {
+errs() << "Unable to create documentation directories.\n";
+return 1;
+  }
+
+  std::unique_ptr Reporter =
+  llvm::make_unique(
+  OptionsParser.getSourcePathList(), OmitFilenames);
+  std::unique_ptr Finder = llvm::make_unique();
+
+  doc::ClangDocCallback NCallback(Reporter, "namespace");
+  DeclarationMatcher ND = namespaceDecl().bind("namespace");
+  Finder->addMatcher(ND, );
+
+  doc::ClangDocCallback RCallback(Reporter, "record");
+  DeclarationMatcher RD = recordDecl().bind("record");
+  Finder->addMatcher(RD, );
+
+  doc::ClangDocCallback ECallback(Reporter, "enum");
+  DeclarationMatcher ED = enumDecl().bind("enum");
+  Finder->addMatcher(ED, );
+
+  doc::ClangDocCallback FCallback(Reporter, "function");
+  

[PATCH] D41102: Setup clang-doc frontend framework

2018-01-31 Thread Jake Ehrlich via Phabricator via cfe-commits
jakehehrlich added inline comments.



Comment at: tools/clang-doc/ClangDoc.h:33
+
+class ClangDocVisitor : public RecursiveASTVisitor {
+public:

sammccall wrote:
> This API makes essentially everything public. Is that the intent?
> 
> It seems like `ClangDocVisitor` is a detail, and the operation you want to 
> expose is "extract doc from this AST into this reporter" or maybe "create an 
> AST consumer that feeds this reporter".
> 
> It would be useful to have an API to extract documentation from individual 
> AST nodes (e.g. a Decl). But I'd be nervous about trying to use the classes 
> exposed here to do that. If it's efficiently possible, it'd be nice to expose 
> a function.
> (one use case for this is clangd)
Correct me if I'm wrong but I believe that everything needs to be public in 
this case because the base class needs to be able to call them. So the visit 
methods all need to be public.



Comment at: tools/clang-doc/ClangDoc.h:39
+
+  bool VisitTagDecl(const TagDecl *D);
+  bool VisitNamespaceDecl(const NamespaceDecl *D);

sammccall wrote:
> `override` where applicable
These methods are not virtual: RecursiveASTVisitor is a CRTP base that calls 
them on the derived class directly. The override keyword only applies to a 
function that overrides a virtual base-class method, so it isn't what we want 
here.
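
For reference, a toy sketch of the CRTP dispatch that RecursiveASTVisitor relies on. `ToyVisitor`, `Collector` and `visitName` are hypothetical names, and the real class is considerably more involved. The base calls the derived method through a static_cast rather than a vtable, which is why the Visit* methods are public and non-virtual.

```cpp
#include <string>
#include <vector>

// Toy CRTP base in the style of RecursiveASTVisitor: it reaches the
// derived class through a static_cast, not through virtual dispatch.
template <typename Derived> class ToyVisitor {
public:
  void traverse(const std::vector<std::string> &Names) {
    for (const std::string &N : Names)
      static_cast<Derived *>(this)->visitName(N);
  }
  // Default, non-virtual implementation that a subclass may shadow.
  bool visitName(const std::string &) { return true; }
};

class Collector : public ToyVisitor<Collector> {
public:
  // Shadows the base method. It has to be accessible to the base class,
  // hence public. Marking it `override` would be rejected by the
  // compiler because the base method is not virtual.
  bool visitName(const std::string &N) {
    Seen.push_back(N);
    return true;
  }
  std::vector<std::string> Seen;
};
```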



Comment at: tools/clang-doc/ClangDocReporter.cpp:43
+  F->Filename = Filename;
+  Docs.Files.insert(std::make_pair(Filename, std::move(F)));
+}

instead of inserting a pair can we just use '[]' syntax?



Comment at: tools/clang-doc/ClangDocReporter.cpp:114
+  if (Pair == Docs.Namespaces[Namespace]->Functions.end()) {
+std::unique_ptr I = make_unique();
+Docs.Namespaces[Namespace]->Functions[MangledName] = std::move(I);

no need for I, and use llvm::make_unique



Comment at: tools/clang-doc/ClangDocReporter.cpp:127
+
+void ClangDocReporter::createMethodInfo(const CXXMethodDecl *D,
+const FullComment *C, int LineNumber,

As I'm looking through these methods more, I'm thinking you might want to break 
each of these createInfo methods up into smaller parts. For instance, the 
addLocation/addComment part is the same in every one of these, they all extract 
some name from a decl, and they all use that string to get an iterator to the 
needed item. They all check to see if that iterator is the end and then add the 
item to the container, etc. There's a lot more opportunity for deduplication 
if you break these things up some.



Comment at: tools/clang-doc/ClangDocReporter.cpp:142
+  if (Pair == Docs.Types[ParentName]->Functions.end()) {
+std::unique_ptr I = make_unique();
+Docs.Types[ParentName]->Functions[MangledName] = std::move(I);

ditto again



Comment at: tools/clang-doc/ClangDocReporter.cpp:155
+
+void ClangDocReporter::parseFullComment(const FullComment *C,
+std::shared_ptr ) {

Do we need this method?



Comment at: tools/clang-doc/ClangDocReporter.cpp:178
+NamedType T{Attr.Name, Attr.Value, clang::AccessSpecifier::AS_none};
+CurrentCI->Attrs.push_back(T);
+  }

Can we use emplace_back here instead of copying a NamedType?



Comment at: tools/clang-doc/ClangDocReporter.cpp:303
+  CI->Kind = C->getCommentKindName();
+  ConstCommentVisitor::visit(C);
+  for (comments::Comment *Child :

So it looks like the reason you need CurrentCI is because the visit methods 
need it and you need different CI's to be used at different visit calls but the 
visit methods can't take any more parameters. I think you should put the visit 
methods in another class that takes a pointer to a CommentInfo as an argument 
to the constructor. I *think* that should clean up this code smell and help 
mitigate the use of shared_ptr everywhere.



Comment at: tools/clang-doc/ClangDocReporter.cpp:355
+  yaml::Output YOut(OS);
+  for (auto  : Map)
+YOut << *(I.second);

can these be const auto&?



Comment at: tools/clang-doc/ClangDocReporter.cpp:363
+  yaml::Output YOut(OS);
+  for (auto  : Map) {
+YOut << *(I.second);

ditto on these



Comment at: tools/clang-doc/ClangDocReporter.cpp:372
+  if (RootDir.empty()) {
+printMap(outs(), Docs.Files);
+printMapPlusFunctions(outs(),

Do we need to explicitly pass these types? I think template argument deduction 
should fill this in for us.



Comment at: tools/clang-doc/ClangDocReporter.cpp:383-390
+  sys::path::native(RootDir, FilePath);
+  sys::path::append(FilePath, "files.yaml");
+  raw_fd_ostream FileOS(FilePath, OutErrorInfo, sys::fs::F_None);
+  if (OutErrorInfo != OK) {
+errs() << 

[PATCH] D41102: Setup clang-doc frontend framework

2018-01-31 Thread Sam McCall via Phabricator via cfe-commits
sammccall added a comment.

Really sorry about the delay in getting to this.
At a high level, I'm most concerned about:

- the monolithic in-memory intermediate format, which seems to put hard limits 
on performance and scalability
- having high-level documentation and clear APIs
- having multiple intermediate formats




Comment at: test/Tooling/clang-doc-basic.cpp:3
+// RUN: mkdir %t
+// RUN: echo '[{"directory":"%t","command":"clang++ -c %t/test.cpp","file":"%t/test.cpp"}]' | sed -e 's/\\/\//g' > %t/compile_commands.json
+// RUN: cp "%s" "%t/test.cpp"

nit: you can also just write compile_flags.txt, which in this case would be 
empty



Comment at: test/Tooling/clang-doc-basic.cpp:22
+ // CHECK: ---
+ // CHECK: Qualified Name:  ''
+ // CHECK: Name:''

what is this? The TU? The global namespace?
What's the value in emitting it?



Comment at: tools/clang-doc/ClangDoc.h:1
+//===-- ClangDoc.cpp - ClangDoc -*- C++ -*-===//
+//

This needs some high-level documentation: what does the clang-doc library do, 
what's the main user (clang-doc command-line tool), what are the major moving 
parts.

I don't personally have a strong opinion on how this is split between this 
header / the implementation / a documentation page for the tool itself, but 
we'll probably need *something* for each of those.

(I think it's OK to defer the user-facing documentation to another patch, but 
we should do it before the tool becomes widely publicized or included in an 
llvm release)



Comment at: tools/clang-doc/ClangDoc.h:27
+
+// A Context which contains extra options which are used in ClangMoveTool.
+struct ClangDocContext {

what's clangmovetool?



Comment at: tools/clang-doc/ClangDoc.h:27
+
+// A Context which contains extra options which are used in ClangMoveTool.
+struct ClangDocContext {

sammccall wrote:
> what's clangmovetool?
nit: this sounds more like "options" than a context to me, though there's only 
one member to go on :-)



Comment at: tools/clang-doc/ClangDoc.h:29
+struct ClangDocContext {
+  // Which format in which to emit representation.
+  OutFormat EmitFormat;

Is this the intermediate representation referred to in the design doc, or the 
final output format?

If the former, why two formats rather than picking one?
YAML is nice for being usable by out-of-tree tools (though not as nice as 
JSON). But it seems like providing YAML as a trivial backend format would fit 
well?
Bitcode is presumably more space-efficient - if this is significant in practice 
it seems like a better choice.



Comment at: tools/clang-doc/ClangDoc.h:33
+
+class ClangDocVisitor : public RecursiveASTVisitor {
+public:

This API makes essentially everything public. Is that the intent?

It seems like `ClangDocVisitor` is a detail, and the operation you want to 
expose is "extract doc from this AST into this reporter" or maybe "create an 
AST consumer that feeds this reporter".

It would be useful to have an API to extract documentation from individual AST 
nodes (e.g. a Decl). But I'd be nervous about trying to use the classes exposed 
here to do that. If it's efficiently possible, it'd be nice to expose a 
function.
(one use case for this is clangd)



Comment at: tools/clang-doc/ClangDoc.h:39
+
+  bool VisitTagDecl(const TagDecl *D);
+  bool VisitNamespaceDecl(const NamespaceDecl *D);

`override` where applicable



Comment at: tools/clang-doc/ClangDoc.h:80
+
+class ClangDocActionFactory : public tooling::FrontendActionFactory {
+public:

this class can definitely be hidden in the c++ file, behind a 
newClangDocActionFactory() func
(actually I think newFrontendActionFactory in Tooling.h could be extended to 
cover this, but not 100% sure)



Comment at: tools/clang-doc/ClangDocReporter.cpp:1
+//===-- ClangDocReporter.cpp - ClangDoc Reporter *- C++ -*-===//
+//

It looks like the plan for merging data across sources is to hold all 
information in one in-memory structure and incrementally add to it as you get 
information from TUs.
(This should be documented somewhere!)

This seems somewhat hostile to parallel processing: you're going to need to 
synchronize access to the structs owned by the ClangDocReporter if you want to 
gather from multiple TUs at once. Moreover, documenting large codebases using 
multiple machines in parallel seems very difficult.
And obviously it assumes you can fit the generated documentation for the 
codebase in memory, which would be nice to avoid.

Have you considered a mapreduce-like architecture, where the mapper gets AST 
callbacks and spits out data, and the reducer is responsible for assembling all 
the data together?

We 

[PATCH] D41102: Setup clang-doc frontend framework

2018-01-30 Thread Jake Ehrlich via Phabricator via cfe-commits
jakehehrlich added inline comments.



Comment at: tools/clang-doc/ClangDocReporter.cpp:55
+Docs.Namespaces[Name] = std::move(I);
+populateBasicInfo(*Docs.Namespaces[Name], Name, D->getNameAsString(),
+  getParentNamespace(D));

If you make a "populateNamespaceInfo" method that just calls populateBasicInfo 
but checks to see that the namespace hasn't already been added you can move 
this outside of this if statement which will make it more uniform with the 
other invocations. Also if you then specialize a general form of 
"populateInfo",  include a specialization for NamespaceInfo, and do a few more 
things in other methods I think most of these methods become identical (the 
Function stuff is still different).



Comment at: tools/clang-doc/ClangDocReporter.cpp:70
+  if (Pair == Docs.Types.end()) {
+std::unique_ptr I = make_unique();
+Docs.Types[Name] = std::move(I);

ditto



Comment at: tools/clang-doc/ClangDocReporter.cpp:74
+
+  if (D->isThisDeclarationADefinition())
+populateTypeInfo(*Docs.Types[Name], D, Name, File);

Is this a check that populateTypeInfo could do instead? Or do we sometimes want 
to call populateTypeInfo on non-definitions?



Comment at: tools/clang-doc/ClangDocReporter.cpp:87
+  if (Pair == Docs.Enums.end()) {
+std::unique_ptr I = make_unique();
+Docs.Enums[Name] = std::move(I);

ditto



Comment at: tools/clang-doc/ClangDocReporter.cpp:91
+
+  if (D->isThisDeclarationADefinition())
+populateEnumInfo(*Docs.Enums[Name], D, Name, File);

Same comment here as I had on createTypeInfo/populateTypeInfo



Comment at: tools/clang-doc/ClangDocReporter.cpp:118
+
+  if (D->isThisDeclarationADefinition())
+populateFunctionInfo(*Docs.Namespaces[Namespace]->Functions[MangledName], D,

Is this something that can go inside populateFunctionInfo?



Comment at: tools/clang-doc/ClangDocReporter.h:166
+ StringRef Namespace);
+  void populateTypeInfo(TypeInfo , const RecordDecl *D, StringRef Name,
+StringRef File);

Sans the BasicInfo one, I think you should use the same specialization trick 
here. After you do that the main difference between createInfo methods will be 
what collection they add to. That suggests to me that the collection the info 
is added to should be made a parameter to a method that does all the actual 
work.
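
A minimal sketch of the "collection as a parameter" idea, under stated 
assumptions: `Info` and `createInfoIn` are hypothetical names that do not 
appear in the patch, and `std::map` stands in for whatever container the 
reporter actually uses.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Stand-in for the patch's info structs; the fields are illustrative only.
struct Info {
  std::string Name;
  bool Populated = false;
};

// Hypothetical helper: the collection to add to is passed in as a parameter,
// so the find-or-create logic is written exactly once.
template <typename InfoT>
InfoT &createInfoIn(std::map<std::string, std::unique_ptr<InfoT>> &Collection,
                    const std::string &Name) {
  std::unique_ptr<InfoT> &Slot = Collection[Name]; // null if newly inserted
  if (!Slot) {
    Slot.reset(new InfoT());
    Slot->Name = Name;
  }
  assert(Slot && "slot is always populated at this point");
  return *Slot;
}
```

With something like this, each createFooInfo method would reduce to picking 
the right collection and calling the shared helper; calling it twice with the 
same name hands back the same object rather than creating a second one.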


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-01-30 Thread Jake Ehrlich via Phabricator via cfe-commits
jakehehrlich added inline comments.



Comment at: tools/clang-doc/ClangDocReporter.cpp:53-54
+  if (Pair == Docs.Namespaces.end()) {
+std::unique_ptr I = make_unique();
+Docs.Namespaces[Name] = std::move(I);
+populateBasicInfo(*Docs.Namespaces[Name], Name, D->getNameAsString(),

There's no need for I here, also use llvm::make_unique



Comment at: tools/clang-doc/ClangDocReporter.h:133
+
+  void createNamespaceInfo(const NamespaceDecl *D, const FullComment *C,
+   int LineNumber, StringRef File);

I think you should use explicit template specialization to make these 
"createFooInfo" methods uniform. This will enable other code that calls these 
methods to be written in a more uniform fashion.

so define something like

```
template <typename T>
void createInfo(const T *D, const FullComment *C, ...);
```
 
and then define various specializations of that member function instead of 
creating a new method for each createFooInfo method.
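
A self-contained sketch of what those specializations might look like. The 
`NamespaceDecl` and `RecordDecl` structs here are simplified stand-ins for the 
clang AST classes, and `Reporter` and its `Log` member are hypothetical, used 
only so the example can be compiled on its own.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-ins for the clang AST node types; only the names mirror the review.
struct NamespaceDecl { std::string Name; };
struct RecordDecl { std::string Name; };

class Reporter {
public:
  // One uniform entry point; only the specializations below are defined.
  template <typename T> void createInfo(const T *D);

  std::vector<std::string> Log; // records what was created, for illustration
};

// Explicit specializations of a member template go at namespace scope.
template <> void Reporter::createInfo<NamespaceDecl>(const NamespaceDecl *D) {
  assert(D && "null declaration");
  Log.push_back("namespace:" + D->Name);
}

template <> void Reporter::createInfo<RecordDecl>(const RecordDecl *D) {
  assert(D && "null declaration");
  Log.push_back("record:" + D->Name);
}
```

Callers then write `R.createInfo(D)` for any supported declaration type and 
the compiler selects the specialization. A type without a specialization 
fails at link time, which is one trade-off of leaving the primary template 
undefined.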


https://reviews.llvm.org/D41102





[PATCH] D41102: Setup clang-doc frontend framework

2018-01-30 Thread Jake Ehrlich via Phabricator via cfe-commits
jakehehrlich added a comment.

If it's possible to split VisitEnumDecl and VisitRecordDecl into separate 
methods and the same is possible for VisitFunctionDecl and VisitCXXMethodDecl 
then I think all of your methods will look like the following 
VisitNamespaceDecl. That being the case you might want to factor this out 
somehow (which I think also would resolve my comment about isUnparsed being 
used the same way a lot).

for instance you might have a function like

  template <typename T>
  bool ClangDocVisitor::visitDecl(const T *D) {
    if (!isUnparsed(D->getLocation()))
      return true;
    // Use explicit template specialization to make "createInfo" uniform.
    Reporter.createInfo(D, getComment(D), getLine(D), getFile(D));
    return true;
  }

and then define a macro like the following

  #define DEFINE_VISIT_METHOD(TYPE) \
  bool ClangDocVisitor::Visit##TYPE(const TYPE *D) { return visitDecl(D); }




Comment at: tools/clang-doc/ClangDoc.cpp:22
+
+bool ClangDocVisitor::VisitTagDecl(const TagDecl *D) {
+  if (!isUnparsed(D->getLocation()))

Is it possible to use VisitEnumDecl and VisitRecordDecl separately here?



Comment at: tools/clang-doc/ClangDoc.cpp:35
+
+  // Error?
+  return true;

I think you should use llvm_unreachable here



Comment at: tools/clang-doc/ClangDoc.cpp:40
+bool ClangDocVisitor::VisitNamespaceDecl(const NamespaceDecl *D) {
+  if (!isUnparsed(D->getLocation()))
+return true;

It looks like you're using this pattern a lot. It might be worth factoring this 
out somehow.



Comment at: tools/clang-doc/ClangDoc.cpp:46
+
+bool ClangDocVisitor::VisitFunctionDecl(const FunctionDecl *D) {
+  if (!isUnparsed(D->getLocation()))

Can you separate this into VisitFunctionDecl and VisitCXXMethodDecl?



Comment at: tools/clang-doc/ClangDoc.cpp:60
+
+comments::FullComment *ClangDocVisitor::getComment(const Decl *D) {
+  RawComment *Comment = Context->getRawCommentForDeclNoCache(D);

Can this be a const method?



Comment at: tools/clang-doc/ClangDoc.cpp:76
+
+std::string ClangDocVisitor::getFile(const Decl *D) const {
+  PresumedLoc PLoc = Manager.getPresumedLoc(D->getLocStart());

I think this method should return a StringRef instead of an std::string because 
the const char* returned by getFilename should live at least as long as the 
source manager.



Comment at: tools/clang-doc/ClangDoc.cpp:85
+  continue;
+std::shared_ptr CI = std::make_shared();
+Reporter.parseFullComment(Comment->parse(*Context, nullptr, nullptr), CI);

So I haven't looked enough at the reporter code yet, but it seems to me this 
should be a unique pointer. You seem to already be aware of that based on a 
TODO I saw in the reporter code though. Is it possible that "parseFullComment" 
should just take a plain old pointer instead of a unique_ptr or shared_ptr? 



Comment at: tools/clang-doc/ClangDoc.cpp:92
+
+bool ClangDocVisitor::isUnparsed(SourceLocation Loc) const {
+  if (!Loc.isValid())

Can you add a comment documenting what this function does?



Comment at: tools/clang-doc/ClangDoc.cpp:108
+  if (const auto *Ctor = dyn_cast(D))
+MC->mangleCXXCtor(Ctor, CXXCtorType::Ctor_Complete, MangledName);
+  else if (const auto *Dtor = dyn_cast(D))

I think it's kind of annoying that this can't be a const method because of 
these mangle calls. I don't really understand why MangleContext works the way 
that it does but it could be that this is a situation where the "mutable" 
keyword should be used on MC to allow what should be a const method to actually 
be const. That might be something to look into.
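
For illustration, a small sketch of the "mutable" approach. It assumes the 
mangling context is held by value; with a pointer member the situation is 
different, because constness does not propagate through a pointer. `Mangler` 
and `Visitor` are hypothetical stand-ins, not the patch's classes, and the 
mangling scheme is made up.

```cpp
#include <cassert>
#include <string>

// Stand-in for a mangling context whose methods are non-const because they
// update internal state. Why clang's MangleContext is non-const may differ.
class Mangler {
public:
  std::string mangle(const std::string &Name) {
    assert(!Name.empty() && "cannot mangle an empty name");
    ++Calls;
    return "_Z" + std::to_string(Name.size()) + Name;
  }
  unsigned Calls = 0;
};

class Visitor {
public:
  // Logically const: callers observe no change in the visitor itself.
  std::string getMangledName(const std::string &Name) const {
    return MC.mangle(Name); // compiles only because MC is declared mutable
  }
  unsigned mangleCalls() const { return MC.Calls; }

private:
  mutable Mangler MC;
};
```

The cost is that `const` no longer guarantees the member is untouched, so 
this is probably only appropriate when the mutation is an implementation 
detail that callers cannot observe.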



Comment at: tools/clang-doc/ClangDoc.cpp:113
+MC->mangleName(D, MangledName);
+  return MangledName.str();
+}

I think you want to return S here so that the move constructor is used instead. 
str() returns a reference to S which will cause the copy constructor to be 
called. I *think* most std::string implementations have a copy on write 
optimization but it's strictly more ideal to use the move constructor.
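
A sketch of the difference, using a minimal stand-in for 
llvm::raw_string_ostream (`StringStream`, `mangleByCopy`, and `mangleByMove` 
are hypothetical names). One caveat: the stand-in is unbuffered, whereas the 
real stream may need a flush() before S is read directly.

```cpp
#include <cassert>
#include <string>

// Minimal stand-in for a string-backed stream: it writes into a caller-owned
// std::string, and str() hands back a reference to that same string.
class StringStream {
public:
  explicit StringStream(std::string &Buffer) : Buffer(Buffer) {}
  StringStream &operator<<(const std::string &Text) {
    Buffer += Text;
    return *this;
  }
  std::string &str() { return Buffer; }

private:
  std::string &Buffer;
};

std::string mangleByCopy(const std::string &Name) {
  std::string S;
  StringStream OS(S);
  OS << "_Z" << Name;
  return OS.str(); // returns an lvalue reference, so the result is a copy of S
}

std::string mangleByMove(const std::string &Name) {
  std::string S;
  StringStream OS(S);
  OS << "_Z" << Name;
  assert(&OS.str() == &S && "str() aliases the local buffer");
  return S; // names a local variable: eligible for NRVO or an implicit move
}
```

Both functions return the same text; the second simply gives the compiler the 
chance to elide or move instead of copying.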



Comment at: tools/clang-doc/ClangDoc.cpp:123
+ClangDocAction::CreateASTConsumer(CompilerInstance , StringRef InFile) {
+  return make_unique((), Reporter);
+}

Pro Tip: Always explicitly refer to this as "llvm::make_unique" because you'll 
have to revert this change if you don't.

Some of the build bots have C++14 headers instead of C++11 headers. This means 
that llvm::make_unique and std::make_unique will both be defined. This means 
that using "make_unique" will cause an error even though only llvm::make_unique 
can be referred to unqualified. So even if you're inside of the llvm namespace 
you should explicitly refer to "llvm::make_unique" and never use 

[PATCH] D41102: Setup clang-doc frontend framework

2018-01-30 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 132075.
juliehockett edited the summary of this revision.
juliehockett added a reviewer: jakehehrlich.
juliehockett added a comment.

1. Updating and expanding tests
2. Updating output options (can now write to files)
3. Cleaning up pointers and whatnot




https://reviews.llvm.org/D41102

Files:
  test/CMakeLists.txt
  test/Tooling/clang-doc-basic.cpp
  test/Tooling/clang-doc-namespace.cpp
  test/Tooling/clang-doc-type.cpp
  test/lit.cfg.py
  tools/CMakeLists.txt
  tools/clang-doc/CMakeLists.txt
  tools/clang-doc/ClangDoc.cpp
  tools/clang-doc/ClangDoc.h
  tools/clang-doc/ClangDocReporter.cpp
  tools/clang-doc/ClangDocReporter.h
  tools/clang-doc/ClangDocYAML.h
  tools/clang-doc/tool/CMakeLists.txt
  tools/clang-doc/tool/ClangDocMain.cpp

Index: tools/clang-doc/tool/ClangDocMain.cpp
===
--- /dev/null
+++ tools/clang-doc/tool/ClangDocMain.cpp
@@ -0,0 +1,103 @@
+//===-- ClangDocMain.cpp - Clangdoc -*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "ClangDoc.h"
+#include "clang/Driver/Options.h"
+#include "clang/Frontend/FrontendActions.h"
+#include "clang/Tooling/CommonOptionsParser.h"
+#include "clang/Tooling/Tooling.h"
+#include "llvm/ADT/APFloat.h"
+#include "llvm/Support/Path.h"
+#include "llvm/Support/Process.h"
+#include "llvm/Support/Signals.h"
+#include "llvm/Support/raw_ostream.h"
+#include 
+
+using namespace clang;
+using namespace llvm;
+
+namespace {
+
+static cl::OptionCategory ClangDocCategory("clang-doc options");
+
+static cl::opt<std::string>
+    OutDirectory("root", cl::desc("Directory for generated files."),
+                 cl::init("docs"), cl::cat(ClangDocCategory));
+
+static cl::opt<bool>
+    EmitLLVM("emit-llvm",
+             cl::desc("Output in LLVM bitstream format (default is YAML)."),
+             cl::init(false), cl::cat(ClangDocCategory));
+
+static cl::opt<bool> DumpResult("dump", cl::desc("Dump results to stdout."),
+                                cl::init(false), cl::cat(ClangDocCategory));
+
+static cl::opt<bool> OmitFilenames("omit-filenames",
+                                   cl::desc("Omit filenames in output."),
+                                   cl::init(false), cl::cat(ClangDocCategory));
+
+static cl::opt<bool>
+    DoxygenOnly("doxygen",
+                cl::desc("Use only doxygen-style comments to generate docs."),
+                cl::init(false), cl::cat(ClangDocCategory));
+
+} // namespace
+
+int main(int argc, const char **argv) {
+  sys::PrintStackTraceOnErrorSignal(argv[0]);
+  tooling::CommonOptionsParser OptionsParser(argc, argv, ClangDocCategory);
+  std::error_code OK;
+
+  doc::OutFormat EmitFormat;
+  SmallString<128> IRFilePath;
+  sys::path::native(OutDirectory, IRFilePath);
+  std::string IRFilename;
+  if (EmitLLVM) {
+    EmitFormat = doc::OutFormat::LLVM;
+    sys::path::append(IRFilePath, "llvm");
+    IRFilename = "docs.bc";
+  } else {
+    EmitFormat = doc::OutFormat::YAML;
+    sys::path::append(IRFilePath, "yaml");
+    IRFilename = "docs.yaml";
+  }
+
+  std::error_code DirectoryStatus = sys::fs::create_directories(IRFilePath);
+  if (DirectoryStatus != OK) {
+    errs() << "Unable to create documentation directories.\n";
+    return 1;
+  }
+
+  sys::path::append(IRFilePath, IRFilename);
+
+  doc::ClangDocReporter Reporter(OptionsParser.getSourcePathList(),
+                                 OmitFilenames);
+  doc::ClangDocContext Context{EmitFormat};
+
+  tooling::ClangTool Tool(OptionsParser.getCompilations(),
+                          OptionsParser.getSourcePathList());
+
+  if (!DoxygenOnly)
+    Tool.appendArgumentsAdjuster(tooling::getInsertArgumentAdjuster(
+        "-fparse-all-comments", tooling::ArgumentInsertPosition::BEGIN));
+
+  doc::ClangDocActionFactory Factory(Context, Reporter);
+
+  outs() << "Parsing codebase...\n";
+  int Status = Tool.run();
+  if (Status)
+    return Status;
+
+  outs() << "Writing intermediate docs...\n";
+  if (DumpResult)
+    Reporter.serialize(EmitFormat, "");
+  else
+    Reporter.serialize(EmitFormat, IRFilePath);
+  return 0;
+}
Index: tools/clang-doc/tool/CMakeLists.txt
===
--- /dev/null
+++ tools/clang-doc/tool/CMakeLists.txt
@@ -0,0 +1,18 @@
+include_directories(${CMAKE_CURRENT_SOURCE_DIR}/..)
+
+add_clang_executable(clang-doc
+  ClangDocMain.cpp
+  )
+
+target_link_libraries(clang-doc
+  PRIVATE
+  clangAST
+  clangASTMatchers
+  clangBasic
+  clangFormat
+  clangFrontend
+  clangDoc
+  clangRewrite
+  clangTooling
+  clangToolingCore
+  )
Index: tools/clang-doc/ClangDocYAML.h
===
--- /dev/null
+++ 

[PATCH] D41102: Setup clang-doc frontend framework

2018-01-25 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 131481.
juliehockett added a comment.
Herald added a subscriber: hintonda.

Cleaning up a few unnecessary copyings


https://reviews.llvm.org/D41102

Files:
  test/CMakeLists.txt
  test/Tooling/clang-doc-basic.cpp
  test/lit.cfg.py
  tools/CMakeLists.txt
  tools/clang-doc/CMakeLists.txt
  tools/clang-doc/ClangDoc.cpp
  tools/clang-doc/ClangDoc.h
  tools/clang-doc/ClangDocReporter.cpp
  tools/clang-doc/ClangDocReporter.h
  tools/clang-doc/ClangDocYAML.h
  tools/clang-doc/tool/CMakeLists.txt
  tools/clang-doc/tool/ClangDocMain.cpp

Index: tools/clang-doc/tool/ClangDocMain.cpp
===
--- /dev/null
+++ tools/clang-doc/tool/ClangDocMain.cpp
@@ -0,0 +1,68 @@
+//===-- ClangDocMain.cpp - Clangdoc -*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "ClangDoc.h"
+#include "clang/Driver/Options.h"
+#include "clang/Frontend/FrontendActions.h"
+#include "clang/Tooling/CommonOptionsParser.h"
+#include "clang/Tooling/Tooling.h"
+#include "llvm/Support/Process.h"
+#include "llvm/Support/Signals.h"
+#include 
+
+using namespace clang;
+using namespace llvm;
+
+namespace {
+
+cl::OptionCategory ClangDocCategory("clang-doc options");
+
+cl::opt<bool>
+    EmitLLVM("emit-llvm",
+             cl::desc("Output in LLVM bitstream format (default is YAML)."),
+             cl::init(false), cl::cat(ClangDocCategory));
+
+cl::opt<bool>
+    DoxygenOnly("doxygen",
+                cl::desc("Use only doxygen-style comments to generate docs."),
+                cl::init(false), cl::cat(ClangDocCategory));
+
+} // namespace
+
+int main(int argc, const char **argv) {
+  sys::PrintStackTraceOnErrorSignal(argv[0]);
+  tooling::CommonOptionsParser OptionsParser(argc, argv, ClangDocCategory);
+
+  clang::doc::OutFormat EmitFormat =
+      EmitLLVM ? doc::OutFormat::LLVM : doc::OutFormat::YAML;
+
+  // TODO: Update the source path list to only consider changed files for
+  // incremental doc updates.
+  doc::ClangDocReporter Reporter(OptionsParser.getSourcePathList());
+  doc::ClangDocContext Context{EmitFormat};
+
+  tooling::ClangTool Tool(OptionsParser.getCompilations(),
+                          OptionsParser.getSourcePathList());
+
+  if (!DoxygenOnly)
+    Tool.appendArgumentsAdjuster(tooling::getInsertArgumentAdjuster(
+        "-fparse-all-comments", tooling::ArgumentInsertPosition::BEGIN));
+
+  doc::ClangDocActionFactory Factory(Context, Reporter);
+
+  outs() << "Parsing codebase...\n";
+  int Status = Tool.run();
+  if (Status)
+    return Status;
+
+  outs() << "Writing docs...\n";
+  Reporter.serialize(EmitFormat, outs());
+
+  return 0;
+}
Index: tools/clang-doc/tool/CMakeLists.txt
===
--- /dev/null
+++ tools/clang-doc/tool/CMakeLists.txt
@@ -0,0 +1,18 @@
+include_directories(${CMAKE_CURRENT_SOURCE_DIR}/..)
+
+add_clang_executable(clang-doc
+  ClangDocMain.cpp
+  )
+
+target_link_libraries(clang-doc
+  PRIVATE
+  clangAST
+  clangASTMatchers
+  clangBasic
+  clangFormat
+  clangFrontend
+  clangDoc
+  clangRewrite
+  clangTooling
+  clangToolingCore
+  )
Index: tools/clang-doc/ClangDocYAML.h
===
--- /dev/null
+++ tools/clang-doc/ClangDocYAML.h
@@ -0,0 +1,217 @@
+//===--  ClangDocYAML.h - ClangDoc YAML -*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANG_DOC_CLANG_DOC_YAML_H
+#define LLVM_CLANG_TOOLS_EXTRA_CLANG_DOC_CLANG_DOC_YAML_H
+
+#include "llvm/Support/YAMLTraits.h"
+
+LLVM_YAML_IS_SEQUENCE_VECTOR(clang::doc::CommentInfo)
+LLVM_YAML_IS_SEQUENCE_VECTOR(clang::doc::NamedType)
+LLVM_YAML_IS_SEQUENCE_VECTOR(clang::doc::Location)
+LLVM_YAML_IS_SEQUENCE_VECTOR(clang::doc::Pair)
+LLVM_YAML_IS_SEQUENCE_VECTOR(clang::doc::Pair)
+LLVM_YAML_IS_SEQUENCE_VECTOR(clang::doc::Pair)
+LLVM_YAML_IS_SEQUENCE_VECTOR(clang::doc::Pair)
+LLVM_YAML_IS_SEQUENCE_VECTOR(clang::doc::Pair)
+
+namespace llvm {
+namespace yaml {
+
+template  struct NormalizedMap {
+  NormalizedMap(IO &) {}
+  NormalizedMap(IO &, const StringMap ) {
+for (const auto  : Map) {
+  clang::doc::Pair Pair{Entry.getKeyData(), Entry.getValue()};
+  VectorMap.push_back(Pair);
+}
+  }
+
+  StringMap denormalize(IO &) {
+StringMap Map;
+for (const auto  : VectorMap)
+  Map[Pair.Key] = Pair.Value;
+return Map;
+  }
+
+  std::vector VectorMap;
+};
+
+template 

[PATCH] D41102: Setup clang-doc frontend framework

2018-01-04 Thread Julie Hockett via Phabricator via cfe-commits
juliehockett updated this revision to Diff 128641.
juliehockett added a comment.

1. Adding in a basic test setup for the framework
2. Pulling the YAML specs out into their own file
3. Expanding the representation to consider different types of declarations 
(namespace, tag, and function) and store the appropriate information for output.


https://reviews.llvm.org/D41102

Files:
  test/CMakeLists.txt
  test/Tooling/clang-doc-basic.cpp
  test/lit.cfg.py
  tools/CMakeLists.txt
  tools/clang-doc/CMakeLists.txt
  tools/clang-doc/ClangDoc.cpp
  tools/clang-doc/ClangDoc.h
  tools/clang-doc/ClangDocReporter.cpp
  tools/clang-doc/ClangDocReporter.h
  tools/clang-doc/ClangDocYAML.h
  tools/clang-doc/tool/CMakeLists.txt
  tools/clang-doc/tool/ClangDocMain.cpp

Index: tools/clang-doc/tool/ClangDocMain.cpp
===
--- /dev/null
+++ tools/clang-doc/tool/ClangDocMain.cpp
@@ -0,0 +1,68 @@
+//===-- ClangDocMain.cpp - Clangdoc -*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "ClangDoc.h"
+#include "clang/Driver/Options.h"
+#include "clang/Frontend/FrontendActions.h"
+#include "clang/Tooling/CommonOptionsParser.h"
+#include "clang/Tooling/Tooling.h"
+#include "llvm/Support/Process.h"
+#include "llvm/Support/Signals.h"
+#include 
+
+using namespace clang;
+using namespace llvm;
+
+namespace {
+
+cl::OptionCategory ClangDocCategory("clang-doc options");
+
+cl::opt<bool>
+    EmitLLVM("emit-llvm",
+             cl::desc("Output in LLVM bitstream format (default is YAML)."),
+             cl::init(false), cl::cat(ClangDocCategory));
+
+cl::opt<bool>
+    DoxygenOnly("doxygen",
+                cl::desc("Use only doxygen-style comments to generate docs."),
+                cl::init(false), cl::cat(ClangDocCategory));
+
+} // namespace
+
+int main(int argc, const char **argv) {
+  sys::PrintStackTraceOnErrorSignal(argv[0]);
+  tooling::CommonOptionsParser OptionsParser(argc, argv, ClangDocCategory);
+
+  clang::doc::OutFormat EmitFormat =
+      EmitLLVM ? doc::OutFormat::LLVM : doc::OutFormat::YAML;
+
+  // TODO: Update the source path list to only consider changed files for
+  // incremental doc updates.
+  doc::ClangDocReporter Reporter(OptionsParser.getSourcePathList());
+  doc::ClangDocContext Context{EmitFormat};
+
+  tooling::ClangTool Tool(OptionsParser.getCompilations(),
+                          OptionsParser.getSourcePathList());
+
+  if (!DoxygenOnly)
+    Tool.appendArgumentsAdjuster(tooling::getInsertArgumentAdjuster(
+        "-fparse-all-comments", tooling::ArgumentInsertPosition::BEGIN));
+
+  doc::ClangDocActionFactory Factory(Context, Reporter);
+
+  outs() << "Parsing codebase...\n";
+  int Status = Tool.run();
+  if (Status)
+    return Status;
+
+  outs() << "Writing docs...\n";
+  Reporter.serialize(EmitFormat, outs());
+
+  return 0;
+}
Index: tools/clang-doc/tool/CMakeLists.txt
===
--- /dev/null
+++ tools/clang-doc/tool/CMakeLists.txt
@@ -0,0 +1,18 @@
+include_directories(${CMAKE_CURRENT_SOURCE_DIR}/..)
+
+add_clang_executable(clang-doc
+  ClangDocMain.cpp
+  )
+
+target_link_libraries(clang-doc
+  PRIVATE
+  clangAST
+  clangASTMatchers
+  clangBasic
+  clangFormat
+  clangFrontend
+  clangDoc
+  clangRewrite
+  clangTooling
+  clangToolingCore
+  )
Index: tools/clang-doc/ClangDocYAML.h
===
--- /dev/null
+++ tools/clang-doc/ClangDocYAML.h
@@ -0,0 +1,193 @@
+//===--  ClangDocYAML.h - ClangDoc YAML -*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "llvm/Support/YAMLTraits.h"
+
+LLVM_YAML_IS_SEQUENCE_VECTOR(clang::doc::CommentInfo)
+LLVM_YAML_IS_SEQUENCE_VECTOR(clang::doc::Link)
+LLVM_YAML_IS_SEQUENCE_VECTOR(clang::doc::NamedType)
+LLVM_YAML_IS_SEQUENCE_VECTOR(clang::doc::Pair)
+LLVM_YAML_IS_SEQUENCE_VECTOR(clang::doc::Pair)
+LLVM_YAML_IS_SEQUENCE_VECTOR(clang::doc::Pair)
+LLVM_YAML_IS_SEQUENCE_VECTOR(clang::doc::Pair)
+LLVM_YAML_IS_SEQUENCE_VECTOR(clang::doc::Pair)
+
+namespace llvm {
+namespace yaml {
+
+template  struct NormalizedMap {
+  NormalizedMap(IO &) {}
+  NormalizedMap(IO &, const StringMap ) {
+for (const auto  : Map) {
+  clang::doc::Pair Pair{Entry.getKeyData(), Entry.getValue()};
+  VectorMap.push_back(Pair);
+}
+  }
+
+  StringMap denormalize(IO &) {
+StringMap Map;
+for (const auto  : VectorMap)
+  Map[Pair.Key] = Pair.Value;
+return Map;
+  }
+
+  

[PATCH] D41102: Setup clang-doc frontend framework

2018-01-03 Thread Petr Hosek via Phabricator via cfe-commits
phosek added a comment.

In https://reviews.llvm.org/D41102#955200, @JDevlieghere wrote:

> I don't know what basis is used to differentiate between the two, but should 
> this be part of clang tools or clang-tools-extra?


AFAIK there's a general agreement that clang-tools-extra should be eventually 
merged into clang (see 
https://reviews.llvm.org/diffusion/L/browse/llvm/trunk/CMakeLists.txt;321745$127)
 so for new projects I think we should create them in clang directly rather 
than in clang-tools-extra only to be later moved to clang.


https://reviews.llvm.org/D41102




