paulkirth added inline comments.
================ Comment at: clang-tools-extra/clang-doc/HTMLGenerator.cpp:890-892 + // TODO If there are multiple Infos for this file name (for example, + // template specializations), this will generate multiple complete web pages + // (with <DOCTYPE> and <title>, etc.) concatenated together. This generator ---------------- brettw wrote: > paulkirth wrote: > > So does this patch make HTML unusable? Currently, if there is a conflict, > > the file gets replaced, right? > > > > I read this as we're going from a situation where we lost data, but still > > had valid HTML pages to one where we have complete, but incorrect HTML, is > > that correct? > > > > Also, why does that happen under template specialization? USRs are SHA1 > > hashes of the symbol. I would expect that hashing the symbol would be > > unambiguous and unique, given C++'s ODR. > > > > That last point made me wonder if an alternate way to handle this without > > using the hash would be to use the mangled names? > I think this makes HTML strictly better (though not ideal). > > Previously if there was a collision, they got written over the top of each > other without truncation (not sure why). So if the longer page was written > second, you got lucky and you get a complete valid page and just lost the > shorter struct's definition. If you got unlucky the shorter page was written > second and you get a corrupted file with the shorter content, followed by the > end of the longer content peeking out from the end. > > With this change, you get both content concatenated without corruption, you > just have duplicate doctype and title tags in between them. Browsers should > generally handle this so the page should be usable. > > Fixing this requires a refactor of the HTML generator such that the concept > of writing a struct is separate from writing a page. Even if I was going to > work on the HTML generator, I would probably want to do this in a separate > patch anyway. > > The markdown generator does the same, but markdown doesn't have all of the > headers stuff that shouldn't be in the middle of the file, you just get a new > `# ...` for the second section and it should be reasonable. > > The HTML and MD generators name files according to the `Name` of the > structure. USR is not involved. So any time to things in the same namespace > have the same name, you'll get a collision. This happens most commonly with > template specialization because the names are identical (the name doesn't > account for the template parameters). > > All of this complexity is why I changed the YAML output to use USR which > avoids the problem. I didn't want to make this change for the HTML and MD > generators because it will make ugly file names that the user will see, and > you probably want the template specializations to be in the same file anyway. > I think this makes HTML strictly better (though not ideal). > > Previously if there was a collision, they got written over the top of each > other without truncation (not sure why). So if the longer page was written > second, you got lucky and you get a complete valid page and just lost the > shorter struct's definition. If you got unlucky the shorter page was written > second and you get a corrupted file with the shorter content, followed by the > end of the longer content peeking out from the end. > > With this change, you get both content concatenated without corruption, you > just have duplicate doctype and title tags in between them. Browsers should > generally handle this so the page should be usable. > Thanks for the explanation. My concern was that this change made HTML unusable, if it is still valid (if not ideal) then I don't think there is an issue. > Fixing this requires a refactor of the HTML generator such that the concept > of writing a struct is separate from writing a page. Even if I was going to > work on the HTML generator, I would probably want to do this in a separate > patch anyway. > No arguments re splitting up work. Can you file an issue to track this on github and reference it in the TODO? That should help make sure this gets addressed. > The markdown generator does the same, but markdown doesn't have all of the > headers stuff that shouldn't be in the middle of the file, you just get a new > `# ...` for the second section and it should be reasonable. > > The HTML and MD generators name files according to the `Name` of the > structure. USR is not involved. So any time to things in the same namespace > have the same name, you'll get a collision. This happens most commonly with > template specialization because the names are identical (the name doesn't > account for the template parameters). > I think there should be a way to obtain the mangled name from whatever you're looking at. Those should be available, even from the AST, so if you've found a template specialization, it //should// be possible to get the mangled name (even for partial specializations). This shouldn't necessarily block you, but it seems like that would solve the issue nicely for all three generators. ================ Comment at: clang-tools-extra/test/clang-doc/single-file-public.cpp:6 // RUN: clang-doc --doxygen --public --executor=standalone -p %t %t/test.cpp -output=%t/docs -// RUN: cat %t/docs/GlobalNamespace/Record.yaml | FileCheck %s --check-prefix=CHECK +// RUN: cat %t/docs/`ls %t/docs | grep -v index.yaml` | FileCheck %s --check-prefix=CHECK // RUN: rm -rf %t ---------------- brettw wrote: > paulkirth wrote: > > Its OK to use `ls` and `grep` in the runline, but you cannot use the > > backtick syntax. It will cause problems on other platforms and I can't find > > any other usage of that in our codebase. TBH I'm a bit surprised this > > works, since `lit` has some interesting behavior around RUN. > > > > I think you can just use `cat < ls %t/docs | grep -v index.yaml` instead. I > > haven't tried it, but that is certainly a more typical method of doing this > > in a lit test. Using `find` may also work, but I only found one use in > > `llvm/test/ExecutionEngine/MCJIT/load-object-a.ll`. > Your example doesn't work (it's trying to load a file named "ls") but I found > another solution using xargs (which is hopefully supported on all of the > platforms). I don't think we require `xargs`, so I'm not sure you can rely on that. We do require `find`, so that may be a better choice, and you can even have it invoke `cat` directly. See: https://llvm.org/docs/GettingStarted.html#software CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138073/new/ https://reviews.llvm.org/D138073 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits