On Fri, 3 Feb 2017 20:03:18 +0000, Jun Wu wrote:
> Excerpts from Yuya Nishihara's message of 2017-02-04 00:11:22 +0900:
> > On Thu, 2 Feb 2017 16:56:11 +0000, Jun Wu wrote:
> > > Excerpts from Yuya Nishihara's message of 2017-02-03 00:45:22 +0900:
> > > > On Thu, 2 Feb 2017 09:34:47 +0000, Jun Wu wrote:
> > > > > So what state do we store?
> > > > >
> > > > > {repopath: {name: (hash, content)}}. For example:
> > > > >
> > > > >   cache = {'/home/foo/repo1': {'index': ('hash', changelogindex),
> > > > >                                'bookmarks': ('hash', bookmarks),
> > > > >                                .... },
> > > > >            '/home/foo/repo2': { .... }, .... }
> > > > >
> > > > > The main ideas here are:
> > > > >  1) Store the lowest level objects, like the C changelog index.
> > > > >     Because higher level objects could be changed by extensions in
> > > > >     unpredictable ways. (this is not true in my hacky prototype
> > > > >     though)
> > > > >  2) Hash everything. For changelog, it's like the file stat of
> > > > >     changelog.i. There must be a strong guarantee that the hash
> > > > >     matches the content, which could be challenging, but not
> > > > >     impossible. I'll cover more details below.
> > > > >
> > > > > The cache is scoped by repo to make the API simpler/easy to use.
> > > > > It may be interesting to have some global state (like passing back
> > > > > the extension path to import them at runtime).
> > > >
> > > > Regarding this and "2) Side-effect-free repo", can or should we
> > > > design the API as something like a low-level storage interface?
> > > > That will allow us to not make repo/revlog know too much about chg.
> > > >
> > > > I don't have any concrete idea, but that would work as follows:
> > > >
> > > >  1. chg injects an object to select storage backends
> > > >     e.g. repo.storage = chgpreloadable(repo.storage, cache)
> > > >  2. repo passes it to revlog, etc.
> > > >  3. revlog uses it to read indexfile, where in-memory cache may be
> > > >     returned
> > > >     e.g. storage.parserevlog(indexfile)
> > > >
> > > > Perhaps, this 'storage' object is similar to the one you call
> > > > 'baserepository'.
> > >
> > > I'm not sure if I get the idea (probably not). How does the
> > > implementation in the master server look like?
> >
> > I was just thinking about how to hack the real repo object without
> > introducing much mess. Perhaps the master server wouldn't be that
> > different from your idea.
> >
> > > It feels more like "repo.chgcache" to me and the difference is that
> > > the vanilla hg will be changed to access objects via it (so the
> > > interface looks more consistent).
> >
> > Yeah, it might be like repo.chgcache.
> >
> > Since we shouldn't pass repo to revlog (it's layering violation), I
> > think we'll need a thin wrapper for chgcache anyway.
>
> I mentioned this in the second mail, "4) Where to get preloaded results
> (in worker)", we could just expose some kind of global state, like a
> "globalcache" module.
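
To make the storage-wrapper idea quoted above a bit more concrete, here is a
minimal sketch. Only the names chgpreloadable and parserevlog come from the
thread. PlainStorage, statkey, and the placeholder "parsing" are assumptions
for illustration and are not existing Mercurial APIs. The cache follows the
per-repo {name: (hash, content)} layout, using a file stat as the hash.

```python
import os


class PlainStorage:
    """Hypothetical stand-in for a storage backend (not a Mercurial class)."""

    def statkey(self, path):
        # Cheap validity key, in the spirit of "the file stat of changelog.i".
        st = os.stat(path)
        return (st.st_size, st.st_mtime_ns)

    def parserevlog(self, path):
        # Placeholder for the expensive index parsing: just read the bytes.
        with open(path, 'rb') as fp:
            return fp.read()


class chgpreloadable:
    """Wrap a storage object and answer from a preloaded cache when valid."""

    def __init__(self, storage, cache):
        self._storage = storage
        self._cache = cache  # {name: (hash, content)} for a single repo

    def parserevlog(self, path):
        key = self._storage.statkey(path)
        entry = self._cache.get(path)
        if entry is not None and entry[0] == key:
            # The stored hash still matches the file: reuse the cached object.
            return entry[1]
        # Missing or stale entry: fall back to the real storage and refresh.
        content = self._storage.parserevlog(path)
        self._cache[path] = (key, content)
        return content
```

With something like this, revlog would only see the storage object and would
not need to know whether a result came from the cache or from disk.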

Does it mean any low-level objects will directly access the global cache?
That seems ugly (but maybe I'm biased as I'm really allergic to global
data.)

> > > Things to consider:
> > >
> > >  a) Objects being preloaded have dependency - ex. the obsstore
> > >     depends on changelog and other things. The preload functions run
> > >     in a defined order.
> >
> > Maybe dependencies could be passed as arguments?
>
> Ideally, these expensive calculating (ex. obsstore) could be moved to the
> index object. In the reality, that requires too much work, and obsstore
> preloading requires a subset of "repo", including "repo.revs".
>
> It's possible to decouple obsstore preloading from the repo object, but
> that could be a lot of work too.

My opinion on obsstore is that the most costly part would be populating
100k+ objects from file, and the other costly parts could be mitigated by
some higher-level cache in repoview.py. But I think this topic was
discussed thoroughly between you and pyd before. I don't intend to bring
it up again.

> > >  b) The index file is not always a single file, depending on "vfs".
> >
> > Yes. vfs could be owned by storage/chgcache class.
> >
> > >  c) The user may want to control what to preload. For example, if
> > >     they have an incompatible manifest, they could make changelog
> > >     preloaded, but not manifest.
> >
> > No idea about (c).
> >
> > >  d) Users can add other preloading items easily, not only just the
> > >     predefined ones.
> >
> > So probably we'll need an extensible table of preloadable items.
>
> If you check my prototype code, it's using a registrar to collect all
> @preload functions.

Yes. I wanted to say we would need this kind of abstraction anyway.
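
For reference, a rough sketch of an extensible table of preloadable items
that combines points (a), (c) and (d) from this thread: items register
themselves, dependencies are passed as arguments, and the caller chooses
what to preload. This is not the prototype's registrar. The preload
decorator signature, runpreload, and the placeholder data are all
assumptions made up for illustration.

```python
_preloaders = {}  # name -> (names of dependencies, function)


def preload(name, requires=()):
    """Hypothetical decorator: register a preload function under a name."""
    def decorator(func):
        _preloaders[name] = (tuple(requires), func)
        return func
    return decorator


def runpreload(enabled=None):
    """Run the enabled items (default: all), resolving dependencies first."""
    results = {}
    visiting = set()

    def resolve(name):
        if name in results:
            return results[name]
        if name in visiting:
            raise ValueError('circular preload dependency: %s' % name)
        visiting.add(name)
        requires, func = _preloaders[name]
        # Dependencies are computed first and passed as positional arguments.
        args = [resolve(dep) for dep in requires]
        results[name] = func(*args)
        visiting.discard(name)
        return results[name]

    for name in (sorted(_preloaders) if enabled is None else enabled):
        resolve(name)
    return results


@preload('changelog')
def preloadchangelog():
    return ['rev0', 'rev1']  # placeholder for a parsed changelog index


@preload('obsstore', requires=['changelog'])
def preloadobsstore(changelog):
    return {'markers': len(changelog)}  # placeholder for obsstore data
```

Calling runpreload(['obsstore']) would also compute 'changelog', because it
is a declared dependency. An extension could add its own item by decorating
another function in the same way.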