dgoldman marked an inline comment as done. dgoldman added inline comments.
================ Comment at: clang-tools-extra/clangd/FindSymbols.cpp:535 +/// by range. +std::vector<DocumentSymbol> mergePragmas(std::vector<DocumentSymbol> &Syms, + std::vector<PragmaMarkSymbol> &Pragmas, ---------------- dgoldman wrote: > kadircet wrote: > > dgoldman wrote: > > > kadircet wrote: > > > > dgoldman wrote: > > > > > kadircet wrote: > > > > > > dgoldman wrote: > > > > > > > kadircet wrote: > > > > > > > > dgoldman wrote: > > > > > > > > > sammccall wrote: > > > > > > > > > > FWIW the flow control/how we make progress seem hard to > > > > > > > > > > follow here to me. > > > > > > > > > > > > > > > > > > > > In particular I think I'm struggling with the statefulness > > > > > > > > > > of "is there an open mark group". > > > > > > > > > > > > > > > > > > > > Possible simplifications: > > > > > > > > > > - define a dummy root symbol, which seems clearer than the > > > > > > > > > > vector<symbols> + range > > > > > > > > > > - avoid reverse-sorting the list of pragma symbols, and > > > > > > > > > > just consume from the front of an ArrayRef instead > > > > > > > > > > - make the outer loop over pragmas, rather than symbols. > > > > > > > > > > It would first check if the pragma belongs directly here or > > > > > > > > > > not, and if so, loop over symbols to work out which should > > > > > > > > > > become children. This seems very likely to be efficient > > > > > > > > > > enough in practice (few pragmas, or most children are > > > > > > > > > > grouped into pragmas) > > > > > > > > > > define a dummy root symbol, which seems clearer than the > > > > > > > > > > vector<symbols> + range > > > > > > > > > > > > > > > > > > I guess? Then we'd take in a `DocumentSymbol & and a > > > > > > > > > ArrayRef<PragmaMarkSymbol> & (or just by value and then > > > > > > > > > return it as well). The rest would be the same though > > > > > > > > > > > > > > > > > > > In particular I think I'm struggling with the statefulness > > > > > > > > > > of "is there an open mark group". > > > > > > > > > > > > > > > > > > We need to track the current open group if there is one in > > > > > > > > > order to move children to it. > > > > > > > > > > > > > > > > > > > make the outer loop over pragmas, rather than symbols. It > > > > > > > > > > would first check if the pragma belongs directly here or > > > > > > > > > > not, and if so, loop over symbols to work out which should > > > > > > > > > > become children. This seems very likely to be efficient > > > > > > > > > > enough in practice (few pragmas, or most children are > > > > > > > > > > grouped into pragmas) > > > > > > > > > > > > > > > > > > The important thing here is knowing where the pragma mark > > > > > > > > > ends - if it doesn't, it actually gets all of the children. > > > > > > > > > So we'd have to peak at the next pragma mark, add all symbols > > > > > > > > > before it to us as children, and then potentially recurse to > > > > > > > > > nest it inside of a symbol. I'll try it out and see if it's > > > > > > > > > simpler. > > > > > > > > > > > > > > > > > > > > > > > > > > ``` > > > > > > > > while(Pragmas) { > > > > > > > > // We'll figure out where the Pragmas.front() should go. > > > > > > > > Pragma P = Pragmas.front(); > > > > > > > > DocumentSymbol *Cur = Root; > > > > > > > > while(Cur->contains(P)) { > > > > > > > > auto *OldCur = Cur; > > > > > > > > for(auto *C : Cur->children) { > > > > > > > > // We assume at most 1 child can contain the pragma (as > > > > > > > > pragmas are on a single line, and children have disjoint ranges) > > > > > > > > if (C->contains(P)) { > > > > > > > > Cur = C; > > > > > > > > break; > > > > > > > > } > > > > > > > > } > > > > > > > > // Cur is immediate parent of P > > > > > > > > if (OldCur == Cur) { > > > > > > > > // Just insert P into children if it is not a group and we > > > > > > > > are done. > > > > > > > > // Otherwise we need to figure out when current pragma is > > > > > > > > terminated: > > > > > > > > // if next pragma is not contained in Cur, or is contained in > > > > > > > > one of the children, It is at the end of Cur, nest all the > > > > > > > > children that appear after P under the symbol node for P. > > > > > > > > // Otherwise nest all the children that appear after P but > > > > > > > > before next pragma under the symbol node for P. > > > > > > > > // Pop Pragmas and break > > > > > > > > } > > > > > > > > } > > > > > > > > } > > > > > > > > ``` > > > > > > > > > > > > > > > > Does that make sense, i hope i am not missing something > > > > > > > > obvious? Complexity-wise in the worst case we'll go all the way > > > > > > > > down to a leaf once per pragma, since there will only be a > > > > > > > > handful of pragmas most of the time it shouldn't be too bad. > > > > > > > I've implemented your suggestion. I don't think it's simpler, but > > > > > > > LMK, maybe it can be improved. > > > > > > oops, i was looking into an older revision and missed > > > > > > mergepragmas2, i think it looks quite similar to this one but we > > > > > > can probably get rid of the recursion as well and simplify a couple > > > > > > more cases > > > > > This makes sense, I think that works for the most part besides > > > > > dropping the recursion, specifically for > > > > > > > > > > ``` > > > > > // Next pragma is contained in the Sym, it belongs there and > > > > > doesn't > > > > > // affect us at all. > > > > > if (Sym.range.contains(NextPragma.DocSym.range)) { > > > > > Sym.children = mergePragmas2(Sym.children, Pragmas, > > > > > Sym.range); > > > > > continue; > > > > > } > > > > > ``` > > > > > > > > > > I guess we could explicitly forbid 3+ layers of nesting and handle it > > > > > inline there? But I'm not sure it's worth the effort to rewrite all > > > > > of this - the recursion shouldn't be deep and we avoid needing to > > > > > shift vector elements over by recreating a new one. > > > > Sorry I don't follow why we can't get rid of the recursion in this case. > > > > > > > > Two loop solution I described above literally tries to find the > > > > document symbol node, such that the current pragma is contained in that > > > > node && current pragma isn't contained in any of that node's children. > > > > Afterwards it inserts the pragma into that node and starts traversing > > > > the tree from root again for the next pragma. > > > > > > > > Again I don't follow where the `3+ layers of nesting` constraint came > > > > from. But I do feel like the iterative version is somewhat easier to > > > > reason about (especially keeping track of what's happening with > > > > `pragmas.front()` and the way it bails out via `parentrange` check). > > > > Shifting of the vector is definitely unfortunate but I think it > > > > shouldn't imply big performance hits in practice as we are only > > > > shifting the children of a single node. > > > Yeah, I understand the first part, I think specifically handling the > > > group case after you discover where it needs to be inserted is a bit more > > > complicated, something like the following: > > > > > > > > > ``` > > > // Pragma is a group, so we need to figure out where it > > > terminates: > > > // - If the next Pragma is not contained in Cur, P owns all of its > > > // parent's children which occur after P. > > > // - If the next pragma is contained in Cur but actually belongs > > > to one > > > // of the parent's children, we temporarily skip over it and > > > look at > > > // the next pragma to decide where we end. > > > // - Otherwise nest all of its parent's children which occur > > > after P but > > > // before the next pragma. > > > ``` > > > > > > And yeah, shifting in the worst case is definitely worst (due to repeat > > > shifting) although it shouldn't be too common in practice (things like a > > > large @implementation block would probably have the most children). > > > we temporarily skip over it and look at the next pragma to decide where > > > we end. > > > > Ah you are right I was missing this part. As you suggested we'll need to > > loop over all the remaining pragmas, which will make the complexity > > `((quadratic in terms of # of pragmas) * (depth of document symbol tree))` > > I think it is still acceptable in practice. I don't think there'll be more > > than a dozen pragmas in a file. (moreover i think this problem exists with > > the current implementation you've provided, see my comment below) > > > > Are you fine with reshaping the implementation in that way then? Because I > > still feel like the code will end up looking a lot more straightforward, > > even if not the most optimal and we can try to optimize the rest if it > > shows up in the latency graphs. > Implementing this now - there's one other edge case here, in that we don't > want to treat `#pragma mark Blah` as a group. I guess I can maintain state > similar to the first solution to solve that. Wait nevermind, I don't think that's a problem since the pragmas should never overlap, just have some other issue in my code I need to figure out Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105904/new/ https://reviews.llvm.org/D105904 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits