Peter Suter wrote:
> > For background I started writing a new git backend for Trac based on
> > libgit2/pygit2 years ago, but didn't complete it because of the mismatch
> > between Trac repository datamodel requirements/expectations and the Git
> > data model. :\
> >
> > My WIP is still at http://git.stuge.se/?p=trac.git;a=commitdiff;h=ea5437b
>
> Are you aware of / do you have any thoughts on TracPygit2Plugin?
> https://trac-hacks.org/wiki/TracPygit2Plugin
> https://trac.edgewall.org/ticket/10606

I wasn't aware. I think that development may have started after my
work. I haven't focused on this topic since then, so I haven't seen it.

Thoughts - well, I think that the aggressive caching of repo information
into the Trac database is fundamentally broken, and that all repo plugin
implementations will suffer from that.

I would expect TracPygit2Plugin to be better than my WIP, because Jun is
much more familiar with Trac internals, but I am unable to do a proper
comparison.

> I don't know Git or the Trac VCS model all that well.

The primary mismatch is that Trac expects to traverse commit history
equally easily in both directions, while Git only stores history in
reverse chronological order (each commit records only its parents).

Another, smaller point is that Git allows commit histories outside of
branches. A tag or even a plain commit id can also be the end point for
a series of commits.

> Just from reading
> the source it seems that enabling the persistent_cache leads to storing
> a reference to this (in-memory) revision cache in a Python class
> variable (StorageFactory.__dict_nonweak).
> Presumably the goal is to reuse that cache across requests to improve
> performance.

For anything but minimal repositories on private Trac instances, the
fact that git_fs spawns a git process is absolutely outrageous - let
alone the large number of processes that were spawned when I looked at
this.

I think the proper solution is to move some of what Trac tries to do
with metadata caching out of Trac and closer to the repo, and have the
Trac repo plugin take advantage both of pygit2 and of that external
parent/child index.

Trac should already have access to all needed information; it should
not need to build caches. Trac is just the wrong place for that, IMHO.

> So the trade-off would be better performance vs. higher memory usage
> plus, as you pointed out, a higher risk for something to go wrong due to
> the increased complexity.
> Also it only helps if the same Python interpreter is used for another
> request (not the case in e.g. CGI), right?

Anything but the most trivial instances will use long-running processes,
typically either tracd or using WSGI.

> I'm not sure if it makes sense to use both this in-memory cache and
> Trac's DB CachedRepository at the same time or if that's redundant and
> pointless.
> Any thoughts on that?

I get the impression that the persistent_cache is stored, and not just
in-memory - but it is possible that I am wrong about this.


Thanks

//Peter
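
To make the parent/child index idea a bit more concrete, here is a
minimal sketch in plain Python. It is only an illustration under stated
assumptions, not code from my WIP or from any existing plugin: the
function names and the tiny PARENTS history are made up, and in a real
setup the commit -> parents mapping would have to be read from the
repository (for instance via pygit2), which is not shown here.

```python
from collections import deque


def build_child_index(parents):
    """Invert a commit -> parents mapping into commit -> children.

    Git records only the parent links, so forward traversal needs
    this inverted view to be built somewhere.
    """
    children = {commit: [] for commit in parents}
    for commit, commit_parents in parents.items():
        for parent in commit_parents:
            children.setdefault(parent, []).append(commit)
    return children


def reachable_from(tips, parents):
    """Collect every commit reachable by following parent links.

    The tips may be branch heads, tags, or plain commit ids; the walk
    does not care which kind of end point it starts from.
    """
    seen = set()
    queue = deque(tips)
    while queue:
        commit = queue.popleft()
        if commit in seen:
            continue
        seen.add(commit)
        queue.extend(parents.get(commit, ()))
    return seen


# Hypothetical history: a <- b <- c is a branch, while d (child of b)
# is assumed to be reachable only from a tag.
PARENTS = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"]}
CHILDREN = build_child_index(PARENTS)
```

With such an index kept next to the repo, looking up CHILDREN["b"]
answers the "next changeset" question directly, without Trac having to
cache the whole history in its own database.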
