On Fri, 3 Feb 2017 19:50:48 +0000, Jun Wu wrote: > Excerpts from Yuya Nishihara's message of 2017-02-04 00:31:45 +0900: > > On Fri, 3 Feb 2017 09:33:32 +0000, Jun Wu wrote: > > > 1) What does the worker talk to the master via IPC? > > > > > > Current: worker sends "possible_dirty_repo: $REPO_PATH" after runcommand > > > > > > It's coupled with the (pseudo) repo object. But the master could > > > preload > > > things in correct order - first changelog, then phase, obsstore, etc. > > > > > > What if: worker sends "possible_dirty_index: $INDEX_PATH", > > > "possible_dirty_obsstore $OBSSTORE_PATH" etc. > > > > > > It looks more flexible at first sight. However the dependency issue > > > could be tricky to solve - for example, the obsstore requires > > > changelog > > > (and revset). It's much harder to preload obsstore without a pseudo > > > repo object with repo.changelog, repo.revs available. > > > > > > Therefore, probably just send the repo path. > > > > Given this, I think pipes would be simpler than using shared memory. The > > master > > IO can be multiplexed by select(). > > If it's one pipe per worker, the code change (I tried) is too much. I think > a single non-blocking pipe could be an option.
Yep. As we'll rely on atomic write(), we won't have to create pipe per worker. > > > 2) Preload function signature (used by master's background thread). > > > > > > This function is to hash and load things. It runs in the master, and is > > > basically a generator (so the "loading" part could be skipped on cache > > > hit, and it's easier to make content and hash_with_content consistent, > > > if > > > they share states): > > > > > > @preloadregistar(something) > > > def preloadfunc(...): > > > yield quick_hash > > > yield content, hash_with_content > > > > Isn't it simpler if preloadfunc() returns quick_hash and a delayed function > > to build content? I feel this generator is kinda abuse and makes it less > > clear > > where the costly part starts. > > I have thought about this. "Generator" is used as a lightweight process that > could be terminated cheaply. It shares states (local variables) for hashing > and loading, making certain things easier. > > Compare: > > @preload('index') > def preloadindex(repo): > f = repo.svfs('changelog.i') # "f" can be shared naturally > hash = fstat(f).st_size > yield hash > > indexdata = f.read() > hash = len(indexdata) > yield parseindex(indexdata), hash > > vs. > > def loadindex(f): > indexdata = f.read() > hash = len(indexdata) > return parseindex(indexdata), hash > > @preload('index') > def preloadindex(repo): > f = repo.svfs('changelog.i') > hash = repo.svfs('changelog.i').st_size > loadfunc = lambda: loadindex(f) # pass "f" explicitly > return hash, loadfunc > > Generator seems to be the shortest way to express the logic. > I think it's easy to say that things after the first "yield" is costly. > > For readibility, I think improved form mentioned at the end of the previous > mail "yield preload.foo(...)" makes things obvious. Perhaps that is a matter of taste, I like the explicit one. But I heard Python 3 has a generator-based coroutine, so using "yield" for that purpose is more common in Python 3? @preload('index') def preloadindex(repo): f = repo.svfs('changelog.i') # "f" can be shared naturally hash = fstat(f).st_size def loadindex(): indexdata = f.read() hash = len(indexdata) return parseindex(indexdata), hash return hash, loadindex BTW, when will 'f' be closed if we take the generator-based abstraction? def preloadindex(repo): f = repo.svfs('changelog.i') try: yield ... yield ... finally: f.close() _______________________________________________ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel