> On Aug 31, 2016, at 13:53, Durham Goode <dur...@fb.com> wrote:
> 
> One of the performance costs that affects every command is the computehidden 
> function (which computes what commits are hidden based on a combination of 
> obsmarkers, bookmarks, workingcopy, phases, and tags).  At Facebook this 
> function alone can add 300ms to every command a user runs (depending on a 
> variety of factors).
> 
> I've begun work on a storage structure for this information, so we don't need 
> to recompute it during every read command.  Before I go through the work to 
> polish it enough to send it upstream, I want to run it by people for comments.
> 
> The idea is basically:
> 
> - Have a file that acts as a bitmap, where a changelog rev number maps to a 
> particular bit, and if that bit is set then the node is considered hidden.  
> computehidden will return a class that uses this to answer 
> filteredrevs.__contains__ calls.
> 
> - When something is changed that would affect visibility (bookmark edit, 
> phase change, obsmarker creation, dirstate movement, tag edit), that code is 
> responsible for calling hidden.updatevisibilty(repo, affectednodes) which 
> reports which nodes are touched by the edit (like the old bookmark node and 
> the new bookmark node), and updatevisibility() does the minimal calculation 
> of visibility changes and propagates those changes to the node ancestors as 
> required.
> 
> - The file would be stored in .hg/cache/hidden, and would only be 122kb per 
> million commits, so we'd just use the normal copy + atomic rename strategy 
> for transactionality (i.e. transaction.addfilegenerator)
> 
> - In theory the file would always be up-to-date, but we can put some metadata 
> at the front of the file to allow verifying the repo hasn't changed 
> significantly out from under it (recording tip, working copy parents, 
> obsmarker file timestamp, etc). If the repo has changed, or if the bitmap 
> doesn't yet exist, it can be trivially calculated using 
> hidden.updatevisibility(repo.revs('not public()')) in a manner similar to how 
> it works today.
> 
> 
> Notable issues:
> 
> - A number of places treat filteredrevs as a set, and do things like 'myset - 
> repo.filteredrevs'.  With a bitmap this doesn't work, so we need to translate 
> it to '(r for r in myset if r not in repo.filteredrevs)'.  Which is probably 
> a better O() anyway since it won't be affected by the size of filteredrevs.
> 
> - filteredrevs is currently a frozen set.  Editing the bitmap will have to be 
> done in a way where the edits only become visible when the 'frozen' set is 
> invalidated.  So we maintain the existing behavior.
> 
> 
> Follow up ideas:
> 
> - Once this ships as a cache, it might be possible to progress it to 
> not-a-cache, such that we could define visibility in other terms.  For 
> instance, we could allow hiding/reviving commits without obsolescence markers 
> in non-evolve repos.  This is a bit more of a long term thought, but might be 
> worth brainstorming about.

Why not ship it as part of the store from the beginning (without the extra 
features)? Cache validation is hard. Tying the updates to store 
locks/transactions seems easier and guarantees no updates are needed at read 
time (which has historically caused issues on multi process HTTP servers).

> 
> 
> My code so far can be seen at:
> 
> https://bitbucket.org/DurhamG/hg/commits/branch/bitmap
> 
> Though it's currently missing some key features (cache validity checking, 
> tests don't all pass yet, file is in .hg/store instead of .hg/cache).
> 
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
_______________________________________________
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

Reply via email to