> On Aug 31, 2016, at 13:53, Durham Goode <dur...@fb.com> wrote: > > One of the performance costs that affects every command is the computehidden > function (which computes what commits are hidden based on a combination of > obsmarkers, bookmarks, workingcopy, phases, and tags). At Facebook this > function alone can add 300ms to every command a user runs (depending on a > variety of factors). > > I've begun work on a storage structure for this information, so we don't need > to recompute it during every read command. Before I go through the work to > polish it enough to send it upstream, I want to run it by people for comments. > > The idea is basically: > > - Have a file that acts as a bitmap, where a changelog rev number maps to a > particular bit, and if that bit is set then the node is considered hidden. > computehidden will return a class that uses this to answer > filteredrevs.__contains__ calls. > > - When something is changed that would affect visibility (bookmark edit, > phase change, obsmarker creation, dirstate movement, tag edit), that code is > responsible for calling hidden.updatevisibilty(repo, affectednodes) which > reports which nodes are touched by the edit (like the old bookmark node and > the new bookmark node), and updatevisibility() does the minimal calculation > of visibility changes and propagates those changes to the node ancestors as > required. > > - The file would be stored in .hg/cache/hidden, and would only be 122kb per > million commits, so we'd just use the normal copy + atomic rename strategy > for transactionality (i.e. transaction.addfilegenerator) > > - In theory the file would always be up-to-date, but we can put some metadata > at the front of the file to allow verifying the repo hasn't changed > significantly out from under it (recording tip, working copy parents, > obsmarker file timestamp, etc). If the repo has changed, or if the bitmap > doesn't yet exist, it can be trivially calculated using > hidden.updatevisibility(repo.revs('not public()')) in a manner similar to how > it works today. > > > Notable issues: > > - A number of places treat filteredrevs as a set, and do things like 'myset - > repo.filteredrevs'. With a bitmap this doesn't work, so we need to translate > it to '(r for r in myset if r not in repo.filteredrevs)'. Which is probably > a better O() anyway since it won't be affected by the size of filteredrevs. > > - filteredrevs is currently a frozen set. Editing the bitmap will have to be > done in a way where the edits only become visible when the 'frozen' set is > invalidated. So we maintain the existing behavior. > > > Follow up ideas: > > - Once this ships as a cache, it might be possible to progress it to > not-a-cache, such that we could define visibility in other terms. For > instance, we could allow hiding/reviving commits without obsolescence markers > in non-evolve repos. This is a bit more of a long term thought, but might be > worth brainstorming about.
Why not ship it as part of the store from the beginning (without the extra features)? Cache validation is hard. Tying the updates to store locks/transactions seems easier and guarantees no updates are needed at read time (which has historically caused issues on multi process HTTP servers). > > > My code so far can be seen at: > > https://bitbucket.org/DurhamG/hg/commits/branch/bitmap > > Though it's currently missing some key features (cache validity checking, > tests don't all pass yet, file is in .hg/store instead of .hg/cache). > > _______________________________________________ > Mercurial-devel mailing list > Mercurial-devel@mercurial-scm.org > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel _______________________________________________ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel