[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
llvmbot wrote: >/cherry-pick 8c7f0eaa6ee3f84e3d8260535cced234bed4fa28 Error: Command failed due to missing milestone. https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
llvmbot wrote: /pull-request llvm/llvm-project#131209 https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
higher-performance wrote: /cherry-pick 8c7f0eaa6ee3f84e3d8260535cced234bed4fa28 https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
https://github.com/higher-performance milestoned https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
higher-performance wrote: /cherry-pick 8c7f0eaa6ee3f84e3d8260535cced234bed4fa28 https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
higher-performance wrote: Just merged this, thanks everyone for the reviews! If I hadn't gotten the pushback earlier, I may have never spotted the simplification and optimization. https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
https://github.com/higher-performance closed https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
@@ -69,17 +70,22 @@ class ParentMapContext::ParentMap { for (; N > 0; --N) push_back(Value); } -bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); +bool contains(const DynTypedNode &Value) const { + const void *Identity = Value.getMemoizationData(); + assert(Identity); + return Dedup.contains(Identity); } void push_back(const DynTypedNode &Value) { - if (!Value.getMemoizationData() || Seen.insert(Value).second) + const void *Identity = Value.getMemoizationData(); + if (!Identity || Dedup.insert(Identity).second) { Items.push_back(Value); + } } llvm::ArrayRef view() const { return Items; } + private: -llvm::SmallVector Items; -llvm::SmallDenseSet Seen; +std::vector Items; higher-performance wrote: Done. https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
higher-performance wrote: > Interesting... this ends up just being a set-vector with a different identity > being stored for the set. So this ends up being ~4 ptr-sizes (IIRC you said > that DynTypedNode is 3x the size of a pointer) per-record (though the extra 1 > is only for those with memoization data). Originally 5x (40 bytes vs. 8 bytes), not 3x. But it might be even higher now, since that was using `SmallDenseSet` and this one is using `SmallPtrSet` -- I'm not sure how each one represents a slot in the table, but I imagine the pointer one is better optimized for pointers. > Did you run the space benchmark here from the original bug? Yup. With the `SmallVector<1>` I just changed it back to, it's somewhere between 0.6× to 1.3×, depending on the number of elements. It was something like 0.9× to 1.1× with `std::vector`. > AS FAR AS the `SmallVector`/`SmallDenseSet` vs `std::vector`/`std::set`, the > decision here is interesting. `SmallVector` has a pretty massive stack size > in exchange for saving on an allocation (it isn't SSO-like, it is more, > "size-of-vector + N, so that our allocation is on-stack unless size > N). So > it kinda depends on whether we think MOST uses of this are going to be '1' > element, vs '0' or '1+' (that is, for the small-vector suggested by @cor3ntin > of size 1). If it is almost always going to be 1, I agree with him. If it is > almost always going to be 0 or more than 1, std::vector is probably correct > (as in the 0 case, std::vector is smaller, and in the 1+ case, the allocation > is happening anyway, and we just have dead stack space again). Yeah I don't know what to expect from user code here, but it seems the difference is small either way -- I'd say let's just go with `SmallVector<1>` now. https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
https://github.com/higher-performance updated https://github.com/llvm/llvm-project/pull/129934 >From d3923bdd318e6d466650a107d80272765a7e24cd Mon Sep 17 00:00:00 2001 From: higher-performance Date: Wed, 5 Mar 2025 15:49:06 -0500 Subject: [PATCH] Reduce memory usage in AST parent map generation by only keeping pointers in the cache --- clang/lib/AST/ParentMapContext.cpp | 17 +++-- 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/clang/lib/AST/ParentMapContext.cpp b/clang/lib/AST/ParentMapContext.cpp index e9387ec79c373..6337605a07738 100644 --- a/clang/lib/AST/ParentMapContext.cpp +++ b/clang/lib/AST/ParentMapContext.cpp @@ -12,10 +12,11 @@ //===--===// #include "clang/AST/ParentMapContext.h" -#include "clang/AST/RecursiveASTVisitor.h" #include "clang/AST/Decl.h" #include "clang/AST/Expr.h" +#include "clang/AST/RecursiveASTVisitor.h" #include "clang/AST/TemplateBase.h" +#include "llvm/ADT/SmallPtrSet.h" using namespace clang; @@ -69,17 +70,21 @@ class ParentMapContext::ParentMap { for (; N > 0; --N) push_back(Value); } -bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); +bool contains(const DynTypedNode &Value) const { + const void *Identity = Value.getMemoizationData(); + assert(Identity); + return Dedup.contains(Identity); } void push_back(const DynTypedNode &Value) { - if (!Value.getMemoizationData() || Seen.insert(Value).second) + const void *Identity = Value.getMemoizationData(); + if (!Identity || Dedup.insert(Identity).second) { Items.push_back(Value); + } } llvm::ArrayRef view() const { return Items; } private: -llvm::SmallVector Items; -llvm::SmallDenseSet Seen; +llvm::SmallVector Items; +llvm::SmallPtrSet Dedup; }; /// Maps from a node to its parents. This is used for nodes that have ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
https://github.com/erichkeane approved this pull request. Interesting... this ends up just being a set-vector with a different identity being stored for the set. So this ends up being ~4 ptr-sizes (IIRC you said that DynTypedNode is 3x the size of a pointer) per-record (though the extra 1 is only for those with memoization data). Did you run the space benchmark here from the original bug? I definitely believe that this is an improvement (changes from 6x-ptr/element to < 4x-ptr/element), and perhaps the deduplication makes up for the extra data (since it only has to remove ONE duplicate to make up for 3 elements). AS FAR AS the `SmallVector`/`SmallDenseSet` vs `std::vector`/`std::set`, the decision here is interesting. `SmallVector` has a pretty massive stack size in exchange for saving on an allocation (it isn't SSO-like, it is more, "size-of-vector + N, so that our allocation is on-stack unless size > N). So it kinda depends on whether we think MOST uses of this are going to be '1' element, vs '0' or '1+' (that is, for the small-vector suggested by @cor3ntin of size 1). If it is almost always going to be 1, I agree with him. If it is almost always going to be 0 or more than 1, std::vector is probably correct (as in the 0 case, std::vector is smaller, and in the 1+ case, the allocation is happening anyway, and we just have dead stack space again). https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
https://github.com/AaronBallman approved this pull request. LGTM with the vector changed back to a SmallVector. This is a much cleaner change, thank you for spotting it! https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
@@ -69,17 +70,22 @@ class ParentMapContext::ParentMap { for (; N > 0; --N) push_back(Value); } -bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); +bool contains(const DynTypedNode &Value) const { + const void *Identity = Value.getMemoizationData(); + assert(Identity); + return Dedup.contains(Identity); } void push_back(const DynTypedNode &Value) { - if (!Value.getMemoizationData() || Seen.insert(Value).second) + const void *Identity = Value.getMemoizationData(); + if (!Identity || Dedup.insert(Identity).second) { Items.push_back(Value); + } } llvm::ArrayRef view() const { return Items; } + private: -llvm::SmallVector Items; -llvm::SmallDenseSet Seen; +std::vector Items; cor3ntin wrote: Can you explain this change? I would expect the small vector to be faster overall. (maybe `llvm::SmallVector`) https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
https://github.com/cor3ntin commented: This looks reasonable, I just have one question https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
@@ -69,17 +70,22 @@ class ParentMapContext::ParentMap { for (; N > 0; --N) push_back(Value); } -bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); +bool contains(const DynTypedNode &Value) const { + const void *Identity = Value.getMemoizationData(); + assert(Identity); + return Dedup.contains(Identity); } void push_back(const DynTypedNode &Value) { - if (!Value.getMemoizationData() || Seen.insert(Value).second) + const void *Identity = Value.getMemoizationData(); + if (!Identity || Dedup.insert(Identity).second) { Items.push_back(Value); + } } llvm::ArrayRef view() const { return Items; } + private: -llvm::SmallVector Items; -llvm::SmallDenseSet Seen; +std::vector Items; cor3ntin wrote: I think it would be better (std::vector _always_ allocate) https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
https://github.com/cor3ntin edited https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
higher-performance wrote: So, this is going to sound very silly, but it seems I may have missed the obvious, and the solution may have been extremely simple: `llvm::SmallPtrSet`, on the return value of `clang::DynTypedNode::getMemoizationData()`. The reason is that the original code only ever inserts into the vector whenever `getMemoizationData()` returned non-NULL: https://github.com/llvm/llvm-project/blob/a3c248db87ebe88084386950846678c9a52dd7c0/clang/lib/AST/ParentMapContext.cpp#L388-L391 Specifically, `llvm::is_contained(*Vector, node)` (and thus `Vector->contains(node)`) is only ever called when `node.getMemoizationData()` returns non-NULL. There was no containment check in the case where memoization data was absent, so we can just use pointer identity on `getMemoizationData()`. This drastically simplifies everything - please take another look and let me know if you see any issues! https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
https://github.com/higher-performance updated https://github.com/llvm/llvm-project/pull/129934 >From 6966e56ddfd9a8c8e6a96b2ec6d977a0b7372ad6 Mon Sep 17 00:00:00 2001 From: higher-performance Date: Wed, 5 Mar 2025 15:49:06 -0500 Subject: [PATCH] Reduce memory usage in AST parent map generation by only keeping pointers in the cache --- clang/lib/AST/ParentMapContext.cpp | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/clang/lib/AST/ParentMapContext.cpp b/clang/lib/AST/ParentMapContext.cpp index e9387ec79c373..0f19cc20aaa28 100644 --- a/clang/lib/AST/ParentMapContext.cpp +++ b/clang/lib/AST/ParentMapContext.cpp @@ -12,10 +12,11 @@ //===--===// #include "clang/AST/ParentMapContext.h" -#include "clang/AST/RecursiveASTVisitor.h" #include "clang/AST/Decl.h" #include "clang/AST/Expr.h" +#include "clang/AST/RecursiveASTVisitor.h" #include "clang/AST/TemplateBase.h" +#include "llvm/ADT/SmallPtrSet.h" using namespace clang; @@ -69,17 +70,22 @@ class ParentMapContext::ParentMap { for (; N > 0; --N) push_back(Value); } -bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); +bool contains(const DynTypedNode &Value) const { + const void *Identity = Value.getMemoizationData(); + assert(Identity); + return Dedup.contains(Identity); } void push_back(const DynTypedNode &Value) { - if (!Value.getMemoizationData() || Seen.insert(Value).second) + const void *Identity = Value.getMemoizationData(); + if (!Identity || Dedup.insert(Identity).second) { Items.push_back(Value); + } } llvm::ArrayRef view() const { return Items; } + private: -llvm::SmallVector Items; -llvm::SmallDenseSet Seen; +std::vector Items; +llvm::SmallPtrSet Dedup; }; /// Maps from a node to its parents. This is used for nodes that have ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
erichkeane wrote: > I just realized there is another optimization we could do in `push_back` > (similar motivation as what @erichkeane mentioned > [here](#discussion_r1982274548), but different): > > We could avoid the duplication check in `push_back`, and defer it to the > `contains()`/`view()` accessors, thus making `push_back` simply do `if > (!Value.getMemoizationData()) { ... }`. If we do such a thing, then we'd > probably want to amortize it so that the number of unprocessed entries > doesn't grow unboundedly. We could keep the number of unprocessed elements > within (say) 25% of the size of the processed elements. > > Thoughts on this? I'm inclined to give it a try to see if it's worth the code > complexity. It sounds worth a try, and perhaps could help readability. I think a 'remove dupes' step (aka, std::unique/erase) is much more readable than what is being attempted here. As far as the `unprocessed` elements, it would be interesting to do some sort of benchmarking to see if we choose a percent. What MIGHT be useful instead is in push_back if `capacity==size`, do the 'remove dupes' step. That would keep us necessarily bounded to the handful of powers-of-2, but would also catch us before a particularly expensive step. https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
higher-performance wrote: Okay thanks, I'll give it a shot. It might take me a bit to get to it. If in the meantime anyone needs this PR urgently let me know and we can probably merge this and do that in a follow-up. https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
higher-performance wrote: I just realized there is another optimization we could do in `push_back` (similar motivation as what @erichkeane mentioned [here](#discussion_r1982274548), but different): We could avoid the duplication check in `push_back`, and defer it to the `contains()`/`view()` accessors, thus making `push_back` simply do `if (!Value.getMemoizationData()) { ... }`. If we do such a thing, then we'd probably want to amortize it so that the number of unprocessed entries doesn't grow unboundedly. We could keep the number of unprocessed elements within (say) 25% of the size of the processed elements. Thoughts on this? I'm inclined to give it a try to see if it's worth the code complexity. https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
https://github.com/AaronBallman approved this pull request. LGTM, thanks for the fix! https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
https://github.com/AaronBallman edited https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
@@ -70,16 +94,38 @@ class ParentMapContext::ParentMap { push_back(Value); } bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); + assert(Value.getMemoizationData()); + bool found = FragileLazySeenCache.contains(&Value); + while (!found && ItemsProcessed < Items.size()) { +const auto it = Items.begin() + ItemsProcessed; higher-performance wrote: Fixed! https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
higher-performance wrote: > > Checking in - has everybody had a chance to review this? Can we merge? > > Was @cor3ntin correct here? > > > I think we are a bit confused, so can someone tell me if I'm wrong > > N is always going to be fairly small (and always 1 in the non-template case) > > We care about the order of parents > > We are just trying to avoid duplicate > > Intrusive linked list Depends which part you mean, but I don't believe so - I tried to respond to it [here](#issuecomment-2704643649). Ultimately I was saying we can't have an intrusive linked list because of that constraint. The part about "just trying" to avoid duplicates is somewhat ambiguous; the code only seems to want to avoid duplicates nodes that carry memoization data, but not other nodes. It may just be an optimization but I'm not confident enough to say if duplicating those will cause a problem. The part about caring about parent order is true as I've mentioned earlier, and the part about N being small I just don't know -- it probably depends on the codebase and how many parents/ancestors a class has. https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
AaronBallman wrote: > Checking in - has everybody had a chance to review this? Can we merge? Was @cor3ntin correct here? > I think we are a bit confused, so can someone tell me if I'm wrong > >N is always going to be fairly small (and always 1 in the non-template > case) We care about the order of parents We are just trying to avoid duplicate Intrusive linked list https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
higher-performance wrote: Checking in - has everybody had a chance to review this? Can we merge? https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
@@ -70,16 +94,38 @@ class ParentMapContext::ParentMap { push_back(Value); } bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); + assert(Value.getMemoizationData()); + bool found = FragileLazySeenCache.contains(&Value); + while (!found && ItemsProcessed < Items.size()) { +const auto it = Items.begin() + ItemsProcessed; AaronBallman wrote: ```suggestion const auto It = Items.begin() + ItemsProcessed; ``` https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
@@ -70,16 +94,38 @@ class ParentMapContext::ParentMap { push_back(Value); } bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); + assert(Value.getMemoizationData()); + bool found = FragileLazySeenCache.contains(&Value); + while (!found && ItemsProcessed < Items.size()) { +const auto it = Items.begin() + ItemsProcessed; +found |= MapInfo::isEqual(&*it, &Value); +FragileLazySeenCache.insert(&*it); +++ItemsProcessed; + } + return found; } void push_back(const DynTypedNode &Value) { - if (!Value.getMemoizationData() || Seen.insert(Value).second) + if (!Value.getMemoizationData() || !contains(Value)) { +const size_t OldCapacity = Items.capacity(); Items.push_back(Value); +if (OldCapacity != Items.capacity()) { + // Pointers are invalidated; remove them. + ItemsProcessed = 0; + // Free memory to avoid doubling peak memory usage during rehashing higher-performance wrote: Fixed! https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
erichkeane wrote: > Ah sorry I forgot to reply to your review. > > > We might take a look at `SetVector` and see if there are any learnings we > > can take from that. It has a set and a vector together, but might have some > > interesting things. > > I did take a look at this actually, but it looked equivalent to what I was > doing before, particularly with respect to storing the expensive keys > (`DynTypedNode`) twice. It just uses `SmallVector` and `DenseSet` > underneath, whereas I just happened to use `SmallDenseSet` instead of > `DenseSet`. > > > Finally... I wonder if we were to just store this as a sorted-vector of > > `DynTypedNode` + `insert order`, then do a copy & sort on access of `view`. > > I don't think that works. You don't want to do a ton of work on every > `view()` -- it's assumed to be cheap, and called often. Also, maintaining > sort order gets expensive the more nodes you add; you don't want to do it > every insertion. > I see. Yes, so the problem is that I don't have a good idea here of the uses, I did a first run here. If `view` is the 'frequent' operation, then my suggestion isn't a good one at least :) > Furthermore, not only is it that elements without memoization data aren't > comparable, but they still need to appear in the `view` _in the same order as > they were inserted_. This means either they need to be in the same vector as > the comparable elements (which makes it impossible to sort the comparable > portion), or we need to create a façade over two completely disjoint > containers. > > The implementation in this PR addresses all of those constraints without any > drawbacks, I think. There's less code compared to writing a facade, and > performance-wise it feels just about optimal. > > > Can we do some benchmarking to figure out which things are done the 'most > > often' with what values of `N`? it might better guide this. > > Re: benchmarking -- note that I did a general benchmark both with the test > case that was provided in the associated issue, as well as with an expensive > TU that we had internally. It had great performance in both cases. Regarding > N -- I assume N is referring to the total container sizes (per the other > comment)? The thing is, I don't get the impression there's much room for > improvement here. The memory usage was just high by a constant (10x) factor. > Some ~5x of that almost certainly came from the data type itself (storing > `DynTypedNode` in the cache which is 40 bytes, instead of `DynTypedNode*`, > which is 8 bytes). Adding laziness on top of that has shrunk the difference > so much that the performance plots are already very close to coinciding. Ok, I think I'm ok with all that then. BUT as I said, I'm not the best person to be reviewing this, I've pinged Aaron and hope he'll come along. https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
@@ -70,16 +93,37 @@ class ParentMapContext::ParentMap { push_back(Value); } bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); + assert(Value.getMemoizationData()); + bool found = FragileLazySeenCache.contains(&Value); + while (!found && ItemsProcessed < Items.size()) { +found |= FragileLazySeenCache.insert(&Items[ItemsProcessed]).second; higher-performance wrote: Ahh I see, okay thanks, fixed! https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
https://github.com/higher-performance updated https://github.com/llvm/llvm-project/pull/129934 >From 1bfd3712eeab8884ff44a5eab38293d33b917d66 Mon Sep 17 00:00:00 2001 From: higher-performance Date: Wed, 5 Mar 2025 15:49:06 -0500 Subject: [PATCH] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen --- clang/lib/AST/ParentMapContext.cpp | 52 -- 1 file changed, 49 insertions(+), 3 deletions(-) diff --git a/clang/lib/AST/ParentMapContext.cpp b/clang/lib/AST/ParentMapContext.cpp index e9387ec79c373..9042171c0844b 100644 --- a/clang/lib/AST/ParentMapContext.cpp +++ b/clang/lib/AST/ParentMapContext.cpp @@ -60,6 +60,30 @@ class ParentMapContext::ParentMap { template friend struct ::MatchParents; + template struct IndirectDenseMapInfo { +using Ptr = T *; +using Base = llvm::DenseMapInfo>; +static inline Ptr getEmptyKey() { + return static_cast(llvm::DenseMapInfo::getEmptyKey()); +} +static inline Ptr getTombstoneKey() { + return static_cast(llvm::DenseMapInfo::getTombstoneKey()); +} +static unsigned getHashValue(Ptr Val) { + return Val == getEmptyKey() || Val == getTombstoneKey() + ? 0 + : Base::getHashValue(*Val); +} +static bool isEqual(Ptr LHS, Ptr RHS) { + if (LHS == getEmptyKey() || LHS == getTombstoneKey() || + RHS == getEmptyKey() || RHS == getTombstoneKey()) { +return LHS == RHS; + } + return Base::isEqual(*LHS, *RHS); +} + }; + using MapInfo = IndirectDenseMapInfo; + /// Contains parents of a node. class ParentVector { public: @@ -70,16 +94,38 @@ class ParentMapContext::ParentMap { push_back(Value); } bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); + assert(Value.getMemoizationData()); + bool Found = FragileLazySeenCache.contains(&Value); + while (!Found && ItemsProcessed < Items.size()) { +const auto It = Items.begin() + ItemsProcessed; +Found = MapInfo::isEqual(&*It, &Value); +FragileLazySeenCache.insert(&*It); +++ItemsProcessed; + } + return Found; } void push_back(const DynTypedNode &Value) { - if (!Value.getMemoizationData() || Seen.insert(Value).second) + if (!Value.getMemoizationData() || !contains(Value)) { +const size_t OldCapacity = Items.capacity(); Items.push_back(Value); +if (OldCapacity != Items.capacity()) { + // Pointers are invalidated; remove them. + ItemsProcessed = 0; + // Free memory to avoid doubling peak memory usage during rehashing. + FragileLazySeenCache.clear(); +} + } } llvm::ArrayRef view() const { return Items; } private: +// BE CAREFUL. Pointers into this container are stored in the +// `FragileLazySeenCache` set below. llvm::SmallVector Items; -llvm::SmallDenseSet Seen; +// This cache is fragile because it contains pointers that are invalidated +// when the vector capacity changes. +llvm::SmallDenseSet FragileLazySeenCache; +// Lazily tracks which items have been processed for the cache. +size_t ItemsProcessed = 0; }; /// Maps from a node to its parents. This is used for nodes that have ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
@@ -70,16 +94,38 @@ class ParentMapContext::ParentMap { push_back(Value); } bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); + assert(Value.getMemoizationData()); + bool found = FragileLazySeenCache.contains(&Value); higher-performance wrote: Fixed! https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
erichkeane wrote: > > One is that I noticed `ASTContext::getTraversalScope()` is returning a > > vector by copy and that doesn't seem necessary. I noticed this too! I'm actually working on validating a RAC fix for this right now, just to have it return an ArrayRef instead. Both uses of that function just do observation, so it seems reasonable enough https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
erichkeane wrote: > Just some minor nits from me so far. > > I think there may be some additional performance benefits we could eek out, > but I don't think they need to be done in this patch (or done at all, it's > speculative). One is that I noticed `ASTContext::getTraversalScope()` is > returning a vector by copy and that doesn't seem necessary. Another is that > perhaps we want to use a bloom filter here instead on the assumption that the > parent map will always be fairly large. But again, this is all speculation. Hmm... would SOME duplicates be acceptable? I THINK the original patch removed the duplicates, so they were presumably OK before then? IF that is acceptable (and uses are duplicate-tolerant), a Bloom filter that can reduce the number of duplicates sub-1% would still be a massive-win, right? We could perhaps do some tuning on the size of the filter to get that reasonable. According to the Wiki article on Bloom filter's intro, 10 bits-per-element is all that is necessary for a 1% false-positive rate, but the details of the article and reference get really "LaTEX-math-stuff" to me, so I don't have a good idea what it would take to get the false positive rate down low. BUT since the idea is to just reduce/eliminate duplicates without high memory overhead, it seems like it would be a great solution here. https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
https://github.com/AaronBallman commented: Just some minor nits from me so far. I think there may be some additional performance benefits we could eek out, but I don't think they need to be done in this patch (or done at all, it's speculative). One is that I noticed `ASTContext::getTraversalScope()` is returning a vector by copy and that doesn't seem necessary. Another is that perhaps we want to use a bloom filter here instead on the assumption that the parent map will always be fairly large. But again, this is all speculation. https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
@@ -70,16 +94,38 @@ class ParentMapContext::ParentMap { push_back(Value); } bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); + assert(Value.getMemoizationData()); + bool found = FragileLazySeenCache.contains(&Value); + while (!found && ItemsProcessed < Items.size()) { +const auto it = Items.begin() + ItemsProcessed; +found |= MapInfo::isEqual(&*it, &Value); +FragileLazySeenCache.insert(&*it); +++ItemsProcessed; + } + return found; } void push_back(const DynTypedNode &Value) { - if (!Value.getMemoizationData() || Seen.insert(Value).second) + if (!Value.getMemoizationData() || !contains(Value)) { +const size_t OldCapacity = Items.capacity(); Items.push_back(Value); +if (OldCapacity != Items.capacity()) { + // Pointers are invalidated; remove them. + ItemsProcessed = 0; + // Free memory to avoid doubling peak memory usage during rehashing AaronBallman wrote: ```suggestion // Free memory to avoid doubling peak memory usage during rehashing. ``` https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
@@ -70,16 +94,38 @@ class ParentMapContext::ParentMap { push_back(Value); } bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); + assert(Value.getMemoizationData()); + bool found = FragileLazySeenCache.contains(&Value); AaronBallman wrote: ```suggestion bool Found = FragileLazySeenCache.contains(&Value); ``` https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
@@ -70,16 +93,37 @@ class ParentMapContext::ParentMap { push_back(Value); } bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); + assert(Value.getMemoizationData()); + bool found = FragileLazySeenCache.contains(&Value); + while (!found && ItemsProcessed < Items.size()) { +found |= FragileLazySeenCache.insert(&Items[ItemsProcessed]).second; erichkeane wrote: I found it jarring/confusing at first, particularly with a bool. https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
higher-performance wrote: Ah sorry I forgot to reply to your review. > We might take a look at `SetVector` and see if there are any learnings we can > take from that. It has a set and a vector together, but might have some > interesting things. I did take a look at this actually, but it looked equivalent to what I was doing before, particularly with respect to storing the expensive keys (`DynTypedNode`) twice. It just uses `SmallVector` and `DenseSet` underneath, whereas I just happened to use `SmallDenseSet` instead of `DenseSet`. > Finally... I wonder if we were to just store this as a sorted-vector of > `DynTypedNode` + `insert order`, then do a copy & sort on access of `view`. I don't think that works. You don't want to do a ton of work on every `view()` -- it's assumed to be cheap, and called often. Also, maintaining sort order gets expensive the more nodes you add; you don't want to do it every insertion. Furthermore not elements without memoization data aren't comparable, but they still need to appear in the `view` *in the same order as they were inserted*. This means either they need to be in the same vector as the comparable elements (which makes it impossible to sort the comparable portion), or we need to create a façade over two completely disjoint containers. The implementation in this PR addresses all of those constraints without any drawbacks, I think. There's less code compared to writing a facade, and performance-wise it feels just about optimal. > Can we do some benchmarking to figure out which things are done the 'most > often' with what values of `N`? it might better guide this. Re: benchmarking -- note that I did a general benchmark both with the test case that was provided in the associated issue, as well as with an expensive TU that we had internally. It had great performance in both cases. Regarding N -- I assume N is referring to the total container sizes (per the other comment)? The thing is, I don't get the impression there's much room for improvement here. The memory usage was just high by a constant (10x) factor. Some ~5x of that almost certainly came from the data type itself (storing `DynTypedNode` in the cache which is 40 bytes, instead of `DynTypedNode*`, which is 8 bytes). Adding laziness on top of that has shrunk the difference so much that the performance plots are already very close to coinciding. https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
higher-performance wrote: Fixed the bug. Performance is unaffected. https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
https://github.com/higher-performance updated https://github.com/llvm/llvm-project/pull/129934 >From bcdf8831a7727bb9f6be712e37fafdccf181f1c5 Mon Sep 17 00:00:00 2001 From: higher-performance Date: Wed, 5 Mar 2025 15:49:06 -0500 Subject: [PATCH] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen --- clang/lib/AST/ParentMapContext.cpp | 52 -- 1 file changed, 49 insertions(+), 3 deletions(-) diff --git a/clang/lib/AST/ParentMapContext.cpp b/clang/lib/AST/ParentMapContext.cpp index e9387ec79c373..afbf961d94629 100644 --- a/clang/lib/AST/ParentMapContext.cpp +++ b/clang/lib/AST/ParentMapContext.cpp @@ -60,6 +60,30 @@ class ParentMapContext::ParentMap { template friend struct ::MatchParents; + template struct IndirectDenseMapInfo { +using Ptr = T *; +using Base = llvm::DenseMapInfo>; +static inline Ptr getEmptyKey() { + return static_cast(llvm::DenseMapInfo::getEmptyKey()); +} +static inline Ptr getTombstoneKey() { + return static_cast(llvm::DenseMapInfo::getTombstoneKey()); +} +static unsigned getHashValue(Ptr Val) { + return Val == getEmptyKey() || Val == getTombstoneKey() + ? 0 + : Base::getHashValue(*Val); +} +static bool isEqual(Ptr LHS, Ptr RHS) { + if (LHS == getEmptyKey() || LHS == getTombstoneKey() || + RHS == getEmptyKey() || RHS == getTombstoneKey()) { +return LHS == RHS; + } + return Base::isEqual(*LHS, *RHS); +} + }; + using MapInfo = IndirectDenseMapInfo; + /// Contains parents of a node. class ParentVector { public: @@ -70,16 +94,38 @@ class ParentMapContext::ParentMap { push_back(Value); } bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); + assert(Value.getMemoizationData()); + bool found = FragileLazySeenCache.contains(&Value); + while (!found && ItemsProcessed < Items.size()) { +const auto it = Items.begin() + ItemsProcessed; +found |= MapInfo::isEqual(&*it, &Value); +FragileLazySeenCache.insert(&*it); +++ItemsProcessed; + } + return found; } void push_back(const DynTypedNode &Value) { - if (!Value.getMemoizationData() || Seen.insert(Value).second) + if (!Value.getMemoizationData() || !contains(Value)) { +const size_t OldCapacity = Items.capacity(); Items.push_back(Value); +if (OldCapacity != Items.capacity()) { + // Pointers are invalidated; remove them. + ItemsProcessed = 0; + // Free memory to avoid doubling peak memory usage during rehashing + FragileLazySeenCache.clear(); +} + } } llvm::ArrayRef view() const { return Items; } private: +// BE CAREFUL. Pointers into this container are stored in the +// `FragileLazySeenCache` set below. llvm::SmallVector Items; -llvm::SmallDenseSet Seen; +// This cache is fragile because it contains pointers that are invalidated +// when the vector capacity changes. +llvm::SmallDenseSet FragileLazySeenCache; +// Lazily tracks which items have been processed for the cache. +size_t ItemsProcessed = 0; }; /// Maps from a node to its parents. This is used for nodes that have ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
https://github.com/higher-performance updated https://github.com/llvm/llvm-project/pull/129934 >From eaefea589f78e4a50e10ab579e5b27da3152e08e Mon Sep 17 00:00:00 2001 From: higher-performance Date: Wed, 5 Mar 2025 15:49:06 -0500 Subject: [PATCH] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen --- clang/lib/AST/ParentMapContext.cpp | 51 -- 1 file changed, 48 insertions(+), 3 deletions(-) diff --git a/clang/lib/AST/ParentMapContext.cpp b/clang/lib/AST/ParentMapContext.cpp index e9387ec79c373..1062cc6a3ef0f 100644 --- a/clang/lib/AST/ParentMapContext.cpp +++ b/clang/lib/AST/ParentMapContext.cpp @@ -60,6 +60,29 @@ class ParentMapContext::ParentMap { template friend struct ::MatchParents; + template struct IndirectDenseMapInfo { +using Ptr = T *; +using Base = llvm::DenseMapInfo>; +static inline Ptr getEmptyKey() { + return static_cast(llvm::DenseMapInfo::getEmptyKey()); +} +static inline Ptr getTombstoneKey() { + return static_cast(llvm::DenseMapInfo::getTombstoneKey()); +} +static unsigned getHashValue(Ptr Val) { + return Val == getEmptyKey() || Val == getTombstoneKey() + ? 0 + : Base::getHashValue(*Val); +} +static bool isEqual(Ptr LHS, Ptr RHS) { + if (LHS == getEmptyKey() || LHS == getTombstoneKey() || + RHS == getEmptyKey() || RHS == getTombstoneKey()) { +return LHS == RHS; + } + return Base::isEqual(*LHS, *RHS); +} + }; + /// Contains parents of a node. class ParentVector { public: @@ -70,16 +93,38 @@ class ParentMapContext::ParentMap { push_back(Value); } bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); + assert(Value.getMemoizationData()); + bool found = FragileLazySeenCache.contains(&Value); + while (!found && ItemsProcessed < Items.size()) { +found |= FragileLazySeenCache.insert(&Items[ItemsProcessed]).second; +++ItemsProcessed; + } + return found; } void push_back(const DynTypedNode &Value) { - if (!Value.getMemoizationData() || Seen.insert(Value).second) + if (!Value.getMemoizationData() || !contains(Value)) { +const size_t OldCapacity = Items.capacity(); Items.push_back(Value); +if (OldCapacity != Items.capacity()) { + // Pointers are invalidated; remove them. + ItemsProcessed = 0; + // Free memory to avoid doubling peak memory usage during rehashing + FragileLazySeenCache.clear(); +} + } } llvm::ArrayRef view() const { return Items; } private: +// BE CAREFUL. Pointers into this container are stored in the +// `FragileLazySeenCache` set below. llvm::SmallVector Items; -llvm::SmallDenseSet Seen; +// This cache is fragile because it contains pointers that are invalidated +// when the vector capacity changes. +llvm::SmallDenseSet> +FragileLazySeenCache; +// Lazily tracks which items have been processed for the cache. +size_t ItemsProcessed = 0; }; /// Maps from a node to its parents. This is used for nodes that have ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
https://github.com/higher-performance updated https://github.com/llvm/llvm-project/pull/129934 >From 55e0d3aca6c4051ebd68a9082005984e2d40c552 Mon Sep 17 00:00:00 2001 From: higher-performance Date: Wed, 5 Mar 2025 15:49:06 -0500 Subject: [PATCH] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen --- clang/lib/AST/ParentMapContext.cpp | 61 +++--- 1 file changed, 56 insertions(+), 5 deletions(-) diff --git a/clang/lib/AST/ParentMapContext.cpp b/clang/lib/AST/ParentMapContext.cpp index e9387ec79c373..5c601d8716175 100644 --- a/clang/lib/AST/ParentMapContext.cpp +++ b/clang/lib/AST/ParentMapContext.cpp @@ -60,6 +60,29 @@ class ParentMapContext::ParentMap { template friend struct ::MatchParents; + template struct IndirectDenseMapInfo { +using Ptr = T *; +using Base = llvm::DenseMapInfo>; +static inline Ptr getEmptyKey() { + return static_cast(llvm::DenseMapInfo::getEmptyKey()); +} +static inline Ptr getTombstoneKey() { + return static_cast(llvm::DenseMapInfo::getTombstoneKey()); +} +static unsigned getHashValue(Ptr Val) { + return Val == getEmptyKey() || Val == getTombstoneKey() + ? 0 + : Base::getHashValue(*Val); +} +static bool isEqual(Ptr LHS, Ptr RHS) { + if (LHS == getEmptyKey() || LHS == getTombstoneKey() || + RHS == getEmptyKey() || RHS == getTombstoneKey()) { +return LHS == RHS; + } + return Base::isEqual(*LHS, *RHS); +} + }; + /// Contains parents of a node. class ParentVector { public: @@ -70,16 +93,44 @@ class ParentMapContext::ParentMap { push_back(Value); } bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); + assert(Value.getMemoizationData()); + bool found = FragileLazySeenCache.contains(&Value); + while (!found && ItemsProcessed < Items.size()) { +found |= FragileLazySeenCache.insert(&Items[ItemsProcessed]).second; +++ItemsProcessed; + } + return found; } void push_back(const DynTypedNode &Value) { - if (!Value.getMemoizationData() || Seen.insert(Value).second) + if (!Value.getMemoizationData() || !contains(Value)) { +const size_t OldCapacity = Items.capacity(); Items.push_back(Value); +if (OldCapacity != Items.capacity()) { + // Pointers are invalidated; remove them. + ItemsProcessed = 0; + // Free memory to avoid doubling peak memory usage during rehashing + FragileLazySeenCache.clear(); +} + } +} +llvm::ArrayRef view() { + while (ItemsProcessed < Items.size()) { +FragileLazySeenCache.insert(&Items[ItemsProcessed]); +++ItemsProcessed; + } + return Items; } -llvm::ArrayRef view() const { return Items; } private: +// BE CAREFUL. Pointers into this container are stored in the +// `FragileLazySeenCache` set below. llvm::SmallVector Items; -llvm::SmallDenseSet Seen; +// This cache is fragile because it contains pointers that are invalidated +// when the vector capacity changes. +llvm::SmallDenseSet> +FragileLazySeenCache; +// Lazily tracks which items have been processed for the cache. +size_t ItemsProcessed = 0; }; /// Maps from a node to its parents. This is used for nodes that have @@ -117,7 +168,7 @@ class ParentMapContext::ParentMap { if (I == Map.end()) { return llvm::ArrayRef(); } -if (const auto *V = dyn_cast(I->second)) { +if (auto *V = dyn_cast(I->second)) { return V->view(); } return getSingleDynTypedNodeFromParentMap(I->second); ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
@@ -70,16 +93,37 @@ class ParentMapContext::ParentMap { push_back(Value); } bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); + assert(Value.getMemoizationData()); + bool found = FragileLazySeenCache.contains(&Value); + while (!found && ItemsProcessed < Items.size()) { +found |= FragileLazySeenCache.insert(&Items[ItemsProcessed]).second; higher-performance wrote: This couples the correctness of `found` on this line to its presence in the `while (!found && ...)` condition, which is why I felt it's better not to do that. I can change it you'd really like me to though - do you find this hurts readability? https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
https://github.com/higher-performance updated https://github.com/llvm/llvm-project/pull/129934 >From 6a99b1d1341340b780cf31dd1632b8d18ba65fe4 Mon Sep 17 00:00:00 2001 From: higher-performance Date: Wed, 5 Mar 2025 15:49:06 -0500 Subject: [PATCH] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen --- clang/lib/AST/ParentMapContext.cpp | 51 -- 1 file changed, 48 insertions(+), 3 deletions(-) diff --git a/clang/lib/AST/ParentMapContext.cpp b/clang/lib/AST/ParentMapContext.cpp index e9387ec79c373..1062cc6a3ef0f 100644 --- a/clang/lib/AST/ParentMapContext.cpp +++ b/clang/lib/AST/ParentMapContext.cpp @@ -60,6 +60,29 @@ class ParentMapContext::ParentMap { template friend struct ::MatchParents; + template struct IndirectDenseMapInfo { +using Ptr = T *; +using Base = llvm::DenseMapInfo>; +static inline Ptr getEmptyKey() { + return static_cast(llvm::DenseMapInfo::getEmptyKey()); +} +static inline Ptr getTombstoneKey() { + return static_cast(llvm::DenseMapInfo::getTombstoneKey()); +} +static unsigned getHashValue(Ptr Val) { + return Val == getEmptyKey() || Val == getTombstoneKey() + ? 0 + : Base::getHashValue(*Val); +} +static bool isEqual(Ptr LHS, Ptr RHS) { + if (LHS == getEmptyKey() || LHS == getTombstoneKey() || + RHS == getEmptyKey() || RHS == getTombstoneKey()) { +return LHS == RHS; + } + return Base::isEqual(*LHS, *RHS); +} + }; + /// Contains parents of a node. class ParentVector { public: @@ -70,16 +93,38 @@ class ParentMapContext::ParentMap { push_back(Value); } bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); + assert(Value.getMemoizationData()); + bool found = FragileLazySeenCache.contains(&Value); + while (!found && ItemsProcessed < Items.size()) { +found |= FragileLazySeenCache.insert(&Items[ItemsProcessed]).second; +++ItemsProcessed; + } + return found; } void push_back(const DynTypedNode &Value) { - if (!Value.getMemoizationData() || Seen.insert(Value).second) + if (!Value.getMemoizationData() || !contains(Value)) { +const size_t OldCapacity = Items.capacity(); Items.push_back(Value); +if (OldCapacity != Items.capacity()) { + // Pointers are invalidated; remove them. + ItemsProcessed = 0; + // Free memory to avoid doubling peak memory usage during rehashing + FragileLazySeenCache.clear(); +} + } } llvm::ArrayRef view() const { return Items; } private: +// BE CAREFUL. Pointers into this container are stored in the +// `FragileLazySeenCache` set below. llvm::SmallVector Items; -llvm::SmallDenseSet Seen; +// This cache is fragile because it contains pointers that are invalidated +// when the vector capacity changes. +llvm::SmallDenseSet> +FragileLazySeenCache; +// Lazily tracks which items have been processed for the cache. +size_t ItemsProcessed = 0; }; /// Maps from a node to its parents. This is used for nodes that have ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
@@ -70,16 +93,37 @@ class ParentMapContext::ParentMap { push_back(Value); } bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); + assert(Value.getMemoizationData()); + bool found = FragileLazySeenCache.contains(&Value); + while (!found && ItemsProcessed < Items.size()) { +found |= FragileLazySeenCache.insert(&Items[ItemsProcessed]).second; +++ItemsProcessed; + } + return found; } void push_back(const DynTypedNode &Value) { - if (!Value.getMemoizationData() || Seen.insert(Value).second) + if (!Value.getMemoizationData() || !contains(Value)) { +const size_t OldCapacity = Items.capacity(); Items.push_back(Value); +if (OldCapacity != Items.capacity()) { + // Pointers are invalidated; remove them. + ItemsProcessed = 0; + // Free memory to avoid doubling peak memory usage during rehashing + FragileLazySeenCache.clear(); +} + } } llvm::ArrayRef view() const { return Items; } private: +// BE CAREFUL. Pointers into this container are stored in the cache. higher-performance wrote: Done, thanks. https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
higher-performance wrote: Oh shoot, it looks like my change may be buggy. One of the tests is breaking... I will take a look. https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
@@ -70,16 +93,37 @@ class ParentMapContext::ParentMap { push_back(Value); } bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); + assert(Value.getMemoizationData()); + bool found = FragileLazySeenCache.contains(&Value); + while (!found && ItemsProcessed < Items.size()) { +found |= FragileLazySeenCache.insert(&Items[ItemsProcessed]).second; +++ItemsProcessed; + } + return found; } void push_back(const DynTypedNode &Value) { - if (!Value.getMemoizationData() || Seen.insert(Value).second) + if (!Value.getMemoizationData() || !contains(Value)) { higher-performance wrote: I could definitely optimize these further like in the ways you mention, but those are just (very small) constant-factor savings in time, at a significant readability and fragility cost. I didn't get the feeling it's worth it to be honest -- do you? Note that the performance graphs almost coincide with Clang 17's right now. https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
@@ -70,16 +93,37 @@ class ParentMapContext::ParentMap { push_back(Value); } bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); + assert(Value.getMemoizationData()); + bool found = FragileLazySeenCache.contains(&Value); + while (!found && ItemsProcessed < Items.size()) { +found |= FragileLazySeenCache.insert(&Items[ItemsProcessed]).second; +++ItemsProcessed; + } + return found; } void push_back(const DynTypedNode &Value) { - if (!Value.getMemoizationData() || Seen.insert(Value).second) + if (!Value.getMemoizationData() || !contains(Value)) { erichkeane wrote: It seems unfortunate to potentially re-calculate in `contains` here to just potentially immediately re-calculate it. I wonder if there is value to special-case `ItemsProcessed == 0 && Items.capacity() == Items.size()` to just do a linear search? Also-also--- I wonder if it makes sense to only do the cache when `size()` is significant enough? It would be interesting to see if there is a value of N under which the set doesn't really make sense. https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
@@ -70,16 +93,37 @@ class ParentMapContext::ParentMap { push_back(Value); } bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); + assert(Value.getMemoizationData()); + bool found = FragileLazySeenCache.contains(&Value); + while (!found && ItemsProcessed < Items.size()) { +found |= FragileLazySeenCache.insert(&Items[ItemsProcessed]).second; +++ItemsProcessed; + } + return found; } void push_back(const DynTypedNode &Value) { - if (!Value.getMemoizationData() || Seen.insert(Value).second) + if (!Value.getMemoizationData() || !contains(Value)) { +const size_t OldCapacity = Items.capacity(); Items.push_back(Value); +if (OldCapacity != Items.capacity()) { + // Pointers are invalidated; remove them. + ItemsProcessed = 0; + // Free memory to avoid doubling peak memory usage during rehashing + FragileLazySeenCache.clear(); +} + } } llvm::ArrayRef view() const { return Items; } private: +// BE CAREFUL. Pointers into this container are stored in the cache. erichkeane wrote: ```suggestion // BE CAREFUL. Pointers into this container are stored in the `FragileLazySeenCache` set below. ``` https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
@@ -70,16 +93,37 @@ class ParentMapContext::ParentMap { push_back(Value); } bool contains(const DynTypedNode &Value) { - return Seen.contains(Value); + assert(Value.getMemoizationData()); + bool found = FragileLazySeenCache.contains(&Value); + while (!found && ItemsProcessed < Items.size()) { +found |= FragileLazySeenCache.insert(&Items[ItemsProcessed]).second; erichkeane wrote: ```suggestion found = FragileLazySeenCache.insert(&Items[ItemsProcessed]).second; ``` https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
https://github.com/erichkeane commented: @AaronBallman should take a look at this as well. We might take a look at `SetVector` and see if there are any learnings we can take from that. It has a set and a vector together, but might have some interesting things. Finally... I wonder if we were to just store this as a sorted-vector of `DynTypedNode` + `insert order`, then do a copy & sort on access of `view`. Can we do some benchmarking to figure out which things are done the 'most often' with what values of `N`? it might better guide this. https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
https://github.com/erichkeane edited https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
https://github.com/higher-performance edited https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
https://github.com/higher-performance edited https://github.com/llvm/llvm-project/pull/129934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits