Re: [PATCH 1 of 4 RFC] pycompat: delay loading modules registered to stub
"Replacement _pycompatstub designed to be compatible with our demandimporter" On Sun, Aug 14, 2016 at 6:02 PM, Yuya Nishihara wrote: > On Sun, 14 Aug 2016 09:21:07 -0700, Gregory Szorc wrote: >> On Sun, Aug 14, 2016 at 2:33 AM, Yuya Nishihara wrote: >> > # HG changeset patch >> > # User Yuya Nishihara >> > # Date 1471153584 -32400 >> > # Sun Aug 14 14:46:24 2016 +0900 >> > # Node ID 131cc484d7fe95d07f677c16dbb0f506847e1d38 >> > # Parent 0cb2d4db308b97e8fe7faa8d45a47b228037f230 >> > pycompat: delay loading modules registered to stub >> > >> > New _pycompatstub is compatible with our demandimporter. try-except is >> > replaced by version comparison because ImportError will no longer be raised >> > immediately. >> >> Nit: I think you mean "incompatible." This patch looks good otherwise. > > I wanted to say something like "new _pycompatstub works well with our > demandimporter." Can you update the commit message in flight? > ___ > Mercurial-devel mailing list > Mercurial-devel@mercurial-scm.org > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH] hgweb: tweak zlib chunking behavior
# HG changeset patch # User Gregory Szorc # Date 1471235386 25200 # Sun Aug 14 21:29:46 2016 -0700 # Node ID 9428cb5b1bb55c2e72f0713cc6e2536cc3de0292 # Parent 279cd80059d41bbdb91ea9073278cbbe7f1b43d5 hgweb: tweak zlib chunking behavior When doing streaming compression with zlib, zlib appears to emit chunks with data after ~20-30kb on average is available. In other words, most calls to compress() return an empty string. On the mozilla-unified repo, only 48,433 of 921,167 (5.26%) of calls to compress() returned data. In other words, we were sending hundreds of thousands of empty chunks via a generator where they touched who knows how many frames (my guess is millions). Filtering out the empty chunks from the generator cuts down on overhead. In addition, we were previously feeding 8kb chunks into zlib compression. Since this function tends to emit *compressed* data after 20-30kb is available, it would take several calls before data was produced. We increase the amount of data fed in at a time to 32kb. This reduces the number of calls to compress() from 921,167 to 115,146. It also reduces the number of output chunks from 48,433 to 31,377. This does increase the average output chunk size by a little. But I don't think this will matter in most scenarios. The combination of these 2 changes appears to shave ~6s CPU time or ~3% from a server serving the mozilla-unified repo. diff --git a/mercurial/hgweb/protocol.py b/mercurial/hgweb/protocol.py --- a/mercurial/hgweb/protocol.py +++ b/mercurial/hgweb/protocol.py @@ -71,20 +71,24 @@ class webproto(wireproto.abstractserverp self.ui.ferr = self.ui.fout = stringio() def restore(self): val = self.ui.fout.getvalue() self.ui.ferr, self.ui.fout = self.oldio return val def groupchunks(self, cg): z = zlib.compressobj(self.ui.configint('server', 'zliblevel', -1)) while True: -chunk = cg.read(4096) +chunk = cg.read(32768) if not chunk: break -yield z.compress(chunk) +data = z.compress(chunk) +# Not all calls to compress() emit data. It is cheaper to inspect +# that here than to send it via the generator. +if data: +yield data yield z.flush() def _client(self): return 'remote:%s:%s:%s' % ( self.req.env.get('wsgi.url_scheme') or 'http', urlreq.quote(self.req.env.get('REMOTE_HOST', '')), urlreq.quote(self.req.env.get('REMOTE_USER', ''))) def iscmd(cmd): ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 5 of 7 V2] profiling: don't error with statprof when profiling has already started
# HG changeset patch # User Gregory Szorc # Date 1471224523 25200 # Sun Aug 14 18:28:43 2016 -0700 # Node ID ef37b1df0058f54c7b23252b7356bb97b07652d7 # Parent fd7b55561853e9d5532f3a9265433a7b614f2687 profiling: don't error with statprof when profiling has already started statprof.reset() asserts if profiling has already started. So don't call if it profiling is already running. diff --git a/mercurial/profiling.py b/mercurial/profiling.py --- a/mercurial/profiling.py +++ b/mercurial/profiling.py @@ -83,17 +83,19 @@ def statprofile(ui, fp): try: import statprof except ImportError: raise error.Abort(_( 'statprof not available - install using "easy_install statprof"')) freq = ui.configint('profiling', 'freq', default=1000) if freq > 0: -statprof.reset(freq) +# Cannot reset when profiler is already active. So silently no-op. +if statprof.state.profile_level == 0: +statprof.reset(freq) else: ui.warn(_("invalid sampling frequency '%s' - ignoring\n") % freq) statprof.start() try: yield finally: statprof.stop() ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 7 of 7 V2] hgweb: profile HTTP requests
# HG changeset patch # User Gregory Szorc # Date 1471225044 25200 # Sun Aug 14 18:37:24 2016 -0700 # Node ID b4b5e216155e6943d32aef14fd461450af98b57b # Parent 31fae7f40d317b325ba05930a11a9d0bf91f0c8e hgweb: profile HTTP requests Currently, running `hg serve --profile` doesn't yield anything useful: when the process is terminated the profiling output displays results from the main thread, which typically spends most of its time in select.select(). Furthermore, it has no meaningful results from mercurial.* modules because the threads serving HTTP requests don't actually get profiled. This patch teaches the hgweb wsgi applications to profile individual requests. If profiling is enabled, the profiler kicks in after HTTP/WSGI environment processing but before Mercurial's main request processing. The profile results are printed to the configured profiling output. If running `hg serve` from a shell, they will be printed to stderr, just before the HTTP request line is logged. If profiling to a file, we only write a single profile to the file because the file is not opened in append mode. We could add support for appending to files in a future patch if someone wants it. Per request profiling doesn't work with the statprof profiler because internally that profiler collects samples from the thread that *initially* requested profiling be enabled. I have plans to address this by vendoring Facebook's customized statprof and then improving it. diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py --- a/mercurial/hgweb/hgweb_mod.py +++ b/mercurial/hgweb/hgweb_mod.py @@ -23,16 +23,17 @@ from .common import ( ) from .request import wsgirequest from .. import ( encoding, error, hg, hook, +profiling, repoview, templatefilters, templater, ui as uimod, util, ) from . import ( @@ -300,18 +301,19 @@ class hgweb(object): def run_wsgi(self, req): """Internal method to run the WSGI application. This is typically only called by Mercurial. External consumers should be using instances of this class as the WSGI application. """ with self._obtainrepo() as repo: -for r in self._runwsgi(req, repo): -yield r +with profiling.maybeprofile(repo.ui): +for r in self._runwsgi(req, repo): +yield r def _runwsgi(self, req, repo): rctx = requestcontext(self, repo) # This state is global across all threads. encoding.encoding = rctx.config('web', 'encoding', encoding.encoding) rctx.repo.ui.environ = req.env diff --git a/mercurial/hgweb/hgwebdir_mod.py b/mercurial/hgweb/hgwebdir_mod.py --- a/mercurial/hgweb/hgwebdir_mod.py +++ b/mercurial/hgweb/hgwebdir_mod.py @@ -26,16 +26,17 @@ from .common import ( staticfile, ) from .request import wsgirequest from .. import ( encoding, error, hg, +profiling, scmutil, templater, ui as uimod, util, ) from . import ( hgweb_mod, @@ -212,17 +213,19 @@ class hgwebdir(object): allow_read = ui.configlist('web', 'allow_read', untrusted=True) # by default, allow reading if no allow_read option has been set if (not allow_read) or ismember(ui, user, allow_read): return True return False def run_wsgi(self, req): -return self._runwsgi(req) +with profiling.maybeprofile(self.ui): +for r in self._runwsgi(req): +yield r def _runwsgi(self, req): try: self.refresh() virtual = req.env.get("PATH_INFO", "").strip('/') tmpl = self.templater(req) ctype = tmpl('mimetype', encoding=encoding.encoding) diff --git a/tests/test-profile.t b/tests/test-profile.t --- a/tests/test-profile.t +++ b/tests/test-profile.t @@ -26,9 +26,23 @@ test --profile $ hg --profile st 2>../out $ grep 'events: Ticks' ../out > /dev/null || cat ../out $ hg --profile --config profiling.output=../out st $ grep 'events: Ticks' ../out > /dev/null || cat ../out #endif +#if lsprof serve + +Profiling of HTTP requests works + + $ hg --profile --config profiling.format=text --config profiling.output=../profile.log serve -d -p $HGPORT --pid-file ../hg.pid -A ../access.log + $ cat ../hg.pid >> $DAEMON_PIDS + $ hg -q clone -U http://localhost:$HGPORT ../clone + +A single profile is logged because file logging doesn't append + $ grep CallCount ../profile.log | wc -l + \s*1 (re) + +#endif + $ cd .. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 6 of 7 V2] hgweb: abstract call to hgwebdir wsgi function
# HG changeset patch # User Gregory Szorc # Date 1471215810 25200 # Sun Aug 14 16:03:30 2016 -0700 # Node ID 31fae7f40d317b325ba05930a11a9d0bf91f0c8e # Parent ef37b1df0058f54c7b23252b7356bb97b07652d7 hgweb: abstract call to hgwebdir wsgi function The function names and behavior now matches hgweb. The reason for this will be obvious in the next patch. diff --git a/mercurial/hgweb/hgwebdir_mod.py b/mercurial/hgweb/hgwebdir_mod.py --- a/mercurial/hgweb/hgwebdir_mod.py +++ b/mercurial/hgweb/hgwebdir_mod.py @@ -212,16 +212,19 @@ class hgwebdir(object): allow_read = ui.configlist('web', 'allow_read', untrusted=True) # by default, allow reading if no allow_read option has been set if (not allow_read) or ismember(ui, user, allow_read): return True return False def run_wsgi(self, req): +return self._runwsgi(req) + +def _runwsgi(self, req): try: self.refresh() virtual = req.env.get("PATH_INFO", "").strip('/') tmpl = self.templater(req) ctype = tmpl('mimetype', encoding=encoding.encoding) ctype = templater.stringify(ctype) ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 4 of 7 V2] profiling: add a context manager that no-ops if profiling isn't enabled
# HG changeset patch # User Gregory Szorc # Date 147172 25200 # Sun Aug 14 17:51:12 2016 -0700 # Node ID fd7b55561853e9d5532f3a9265433a7b614f2687 # Parent 3338d4536f490fca055881b4e3884b28e4e25d22 profiling: add a context manager that no-ops if profiling isn't enabled And refactor dispatch.py to use it. As you can see, the resulting code is much simpler. I was tempted to inline _runcommand as part of writing this series. However, a number of extensions wrap _runcommand. So keeping it around is necessary (extensions can't easily wrap runcommand because it calls hooks before and after command execution). diff --git a/mercurial/dispatch.py b/mercurial/dispatch.py --- a/mercurial/dispatch.py +++ b/mercurial/dispatch.py @@ -893,31 +893,22 @@ def _dispatch(req): try: return runcommand(lui, repo, cmd, fullargs, ui, options, d, cmdpats, cmdoptions) finally: if repo and repo != req.repo: repo.close() def _runcommand(ui, options, cmd, cmdfunc): -"""Enables the profiler if applicable. - -``profiling.enabled`` - boolean config that enables or disables profiling -""" -def checkargs(): +"""Run a command function, possibly with profiling enabled.""" +with profiling.maybeprofile(ui): try: return cmdfunc() except error.SignatureError: -raise error.CommandError(cmd, _("invalid arguments")) - -if ui.configbool('profiling', 'enabled'): -with profiling.profile(ui): -return checkargs() -else: -return checkargs() +raise error.CommandError(cmd, _('invalid arguments')) def _exceptionwarning(ui): """Produce a warning message for the current active exception""" # For compatibility checking, we discard the portion of the hg # version after the + on the assumption that if a "normal # user" is running a build with a + in it the packager # probably built from fairly close to a tag and anyone with a diff --git a/mercurial/help/config.txt b/mercurial/help/config.txt --- a/mercurial/help/config.txt +++ b/mercurial/help/config.txt @@ -1388,16 +1388,22 @@ Specifies profiling type, format, and fi supported: an instrumenting profiler (named ``ls``), and a sampling profiler (named ``stat``). In this section description, 'profiling data' stands for the raw data collected during profiling, while 'profiling report' stands for a statistical text report generated from the profiling data. The profiling is done using lsprof. +``enabled`` +Enable the profiler. +(default: false) + +This is equivalent to passing ``--profile`` on the command line. + ``type`` The type of profiler to use. (default: ls) ``ls`` Use Python's built-in instrumenting profiler. This profiler works on all platforms, but each line number it reports is the first line of a function. This restriction makes it difficult to diff --git a/mercurial/profiling.py b/mercurial/profiling.py --- a/mercurial/profiling.py +++ b/mercurial/profiling.py @@ -138,8 +138,26 @@ def profile(ui): if output: if output == 'blackbox': val = 'Profile:\n%s' % fp.getvalue() # ui.log treats the input as a format string, # so we need to escape any % signs. val = val.replace('%', '%%') ui.log('profile', val) fp.close() + +@contextlib.contextmanager +def maybeprofile(ui): +"""Profile if enabled, else do nothing. + +This context manager can be used to optionally profile if profiling +is enabled. Otherwise, it does nothing. + +The purpose of this context manager is to make calling code simpler: +just use a single code path for calling into code you may want to profile +and this function determines whether to start profiling. +""" +# internal config profiling.enabled +if ui.configbool('profiling', 'enabled'): +with profile(ui): +yield +else: +yield diff --git a/tests/test-devel-warnings.t b/tests/test-devel-warnings.t --- a/tests/test-devel-warnings.t +++ b/tests/test-devel-warnings.t @@ -91,17 +91,16 @@ */mercurial/dispatch.py:* in run (glob) */mercurial/dispatch.py:* in dispatch (glob) */mercurial/dispatch.py:* in _runcatch (glob) */mercurial/dispatch.py:* in callcatch (glob) */mercurial/dispatch.py:* in _runcatchfunc (glob) */mercurial/dispatch.py:* in _dispatch (glob) */mercurial/dispatch.py:* in runcommand (glob) */mercurial/dispatch.py:* in _runcommand (glob) - */mercurial/dispatch.py:* in checkargs (glob) */mercurial/dispatch.py:* in (glob) */mercurial/util.py:* in check (glob) $TESTTMP/buggylocking.py:* in buggylocking (glob) $ hg properlocking $ hg nowaitlocking $ echo a > a $ hg add a @@ -127,17 +126,16 @@ */mercurial/dispatch.py:* in run (glob) */mercurial/dispatch.py:* i
[PATCH 3 of 7 V2] profiling: make profiling functions context managers (API)
# HG changeset patch # User Gregory Szorc # Date 1471224322 25200 # Sun Aug 14 18:25:22 2016 -0700 # Node ID 3338d4536f490fca055881b4e3884b28e4e25d22 # Parent fd888ffaab6720688e4ad3f0358be09509effb6f profiling: make profiling functions context managers (API) This makes profiling more flexible since we can now call multiple functions when a profiler is active. But the real reason for this is to enable a future consumer to profile a function that returns a generator. We can't do this from the profiling function itself because functions can either be generators or have return values: they can't be both. So therefore it isn't possible to have a generic profiling function that can both consume and re-emit a generator and return a value. diff --git a/mercurial/dispatch.py b/mercurial/dispatch.py --- a/mercurial/dispatch.py +++ b/mercurial/dispatch.py @@ -904,17 +904,18 @@ def _runcommand(ui, options, cmd, cmdfun """ def checkargs(): try: return cmdfunc() except error.SignatureError: raise error.CommandError(cmd, _("invalid arguments")) if ui.configbool('profiling', 'enabled'): -return profiling.profile(ui, checkargs) +with profiling.profile(ui): +return checkargs() else: return checkargs() def _exceptionwarning(ui): """Produce a warning message for the current active exception""" # For compatibility checking, we discard the portion of the hg # version after the + on the assumption that if a "normal diff --git a/mercurial/profiling.py b/mercurial/profiling.py --- a/mercurial/profiling.py +++ b/mercurial/profiling.py @@ -2,27 +2,29 @@ # # Copyright 2016 Gregory Szorc # # This software may be used and distributed according to the terms of the # GNU General Public License version 2 or any later version. from __future__ import absolute_import, print_function +import contextlib import os import sys import time from .i18n import _ from . import ( error, util, ) -def lsprofile(ui, func, fp): +@contextlib.contextmanager +def lsprofile(ui, fp): format = ui.config('profiling', 'format', default='text') field = ui.config('profiling', 'sort', default='inlinetime') limit = ui.configint('profiling', 'limit', default=30) climit = ui.configint('profiling', 'nested', default=0) if format not in ['text', 'kcachegrind']: ui.warn(_("unrecognized profiling format '%s'" " - Ignored\n") % format) @@ -32,76 +34,83 @@ def lsprofile(ui, func, fp): from . import lsprof except ImportError: raise error.Abort(_( 'lsprof not available - install from ' 'http://codespeak.net/svn/user/arigo/hack/misc/lsprof/')) p = lsprof.Profiler() p.enable(subcalls=True) try: -return func() +yield finally: p.disable() if format == 'kcachegrind': from . import lsprofcalltree calltree = lsprofcalltree.KCacheGrind(p) calltree.output(fp) else: # format == 'text' stats = lsprof.Stats(p.getstats()) stats.sort(field) stats.pprint(limit=limit, file=fp, climit=climit) -def flameprofile(ui, func, fp): +@contextlib.contextmanager +def flameprofile(ui, fp): try: from flamegraph import flamegraph except ImportError: raise error.Abort(_( 'flamegraph not available - install from ' 'https://github.com/evanhempel/python-flamegraph')) # developer config: profiling.freq freq = ui.configint('profiling', 'freq', default=1000) filter_ = None collapse_recursion = True thread = flamegraph.ProfileThread(fp, 1.0 / freq, filter_, collapse_recursion) start_time = time.clock() try: thread.start() -func() +yield finally: thread.stop() thread.join() print('Collected %d stack frames (%d unique) in %2.2f seconds.' % ( time.clock() - start_time, thread.num_frames(), thread.num_frames(unique=True))) -def statprofile(ui, func, fp): +@contextlib.contextmanager +def statprofile(ui, fp): try: import statprof except ImportError: raise error.Abort(_( 'statprof not available - install using "easy_install statprof"')) freq = ui.configint('profiling', 'freq', default=1000) if freq > 0: statprof.reset(freq) else: ui.warn(_("invalid sampling frequency '%s' - ignoring\n") % freq) statprof.start() try: -return func() +yield finally: statprof.stop() statprof.display(fp) -def profile(ui, fn): -"""Profile a function call.""" +@contextlib.contextmanager +def profile(ui): +"""Start profiling. + +Profiling is active when the context manager is acti
[PATCH 2 of 7 V2] dispatch: set profiling.enabled when profiling is enabled
# HG changeset patch # User Gregory Szorc # Date 1471217758 25200 # Sun Aug 14 16:35:58 2016 -0700 # Node ID fd888ffaab6720688e4ad3f0358be09509effb6f # Parent 40cbc513713837acd6a9f593bfd48759eabcce33 dispatch: set profiling.enabled when profiling is enabled We do this for other global command arguments. We don't for --profile for reasons that are unknown to me. Probably because nobody has needed it. An upcoming patch will introduce a new consumer of the profiling code. It doesn't have access to command line arguments. So let's set the config option during argument processing. We also remove a check for "options['profile']" because it is now redundant. diff --git a/mercurial/dispatch.py b/mercurial/dispatch.py --- a/mercurial/dispatch.py +++ b/mercurial/dispatch.py @@ -814,16 +814,20 @@ def _dispatch(req): uis.add(req.repo.ui) if options['verbose'] or options['debug'] or options['quiet']: for opt in ('verbose', 'debug', 'quiet'): val = str(bool(options[opt])) for ui_ in uis: ui_.setconfig('ui', opt, val, '--' + opt) +if options['profile']: +for ui_ in uis: +ui_.setconfig('profiling', 'enabled', 'true', '--profile') + if options['traceback']: for ui_ in uis: ui_.setconfig('ui', 'traceback', 'on', '--traceback') if options['noninteractive']: for ui_ in uis: ui_.setconfig('ui', 'interactive', 'off', '-y') @@ -899,17 +903,17 @@ def _runcommand(ui, options, cmd, cmdfun ``profiling.enabled`` - boolean config that enables or disables profiling """ def checkargs(): try: return cmdfunc() except error.SignatureError: raise error.CommandError(cmd, _("invalid arguments")) -if options['profile'] or ui.configbool('profiling', 'enabled'): +if ui.configbool('profiling', 'enabled'): return profiling.profile(ui, checkargs) else: return checkargs() def _exceptionwarning(ui): """Produce a warning message for the current active exception""" # For compatibility checking, we discard the portion of the hg ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 1 of 7 V2] profiling: move profiling code from dispatch.py (API)
# HG changeset patch # User Gregory Szorc # Date 1471217444 25200 # Sun Aug 14 16:30:44 2016 -0700 # Node ID 40cbc513713837acd6a9f593bfd48759eabcce33 # Parent 279cd80059d41bbdb91ea9073278cbbe7f1b43d5 profiling: move profiling code from dispatch.py (API) Currently, profiling code lives in dispatch.py, which is a low-level module centered around command dispatch. Furthermore, dispatch.py imports a lot of other modules, meaning that importing dispatch.py to get at profiling functionality would often result in a module import cycle. Profiling is a generic activity. It shouldn't be limited to command dispatch. This patch moves profiling code from dispatch.py to the new profiling.py. The low-level "run a profiler against a function" functions have been moved verbatim. The code for determining how to invoke the profiler has been extracted to its own function. I decided to create a new module rather than stick this code elsewhere (such as util.py) because util.py is already quite large. And, I foresee this file growing larger once Facebook's profiling enhancements get added to it. diff --git a/mercurial/dispatch.py b/mercurial/dispatch.py --- a/mercurial/dispatch.py +++ b/mercurial/dispatch.py @@ -29,16 +29,17 @@ from . import ( demandimport, encoding, error, extensions, fancyopts, fileset, hg, hook, +profiling, revset, templatefilters, templatekw, templater, ui as uimod, util, ) @@ -887,140 +888,29 @@ def _dispatch(req): d = lambda: util.checksignature(func)(ui, *args, **cmdoptions) try: return runcommand(lui, repo, cmd, fullargs, ui, options, d, cmdpats, cmdoptions) finally: if repo and repo != req.repo: repo.close() -def lsprofile(ui, func, fp): -format = ui.config('profiling', 'format', default='text') -field = ui.config('profiling', 'sort', default='inlinetime') -limit = ui.configint('profiling', 'limit', default=30) -climit = ui.configint('profiling', 'nested', default=0) - -if format not in ['text', 'kcachegrind']: -ui.warn(_("unrecognized profiling format '%s'" -" - Ignored\n") % format) -format = 'text' - -try: -from . import lsprof -except ImportError: -raise error.Abort(_( -'lsprof not available - install from ' -'http://codespeak.net/svn/user/arigo/hack/misc/lsprof/')) -p = lsprof.Profiler() -p.enable(subcalls=True) -try: -return func() -finally: -p.disable() - -if format == 'kcachegrind': -from . import lsprofcalltree -calltree = lsprofcalltree.KCacheGrind(p) -calltree.output(fp) -else: -# format == 'text' -stats = lsprof.Stats(p.getstats()) -stats.sort(field) -stats.pprint(limit=limit, file=fp, climit=climit) - -def flameprofile(ui, func, fp): -try: -from flamegraph import flamegraph -except ImportError: -raise error.Abort(_( -'flamegraph not available - install from ' -'https://github.com/evanhempel/python-flamegraph')) -# developer config: profiling.freq -freq = ui.configint('profiling', 'freq', default=1000) -filter_ = None -collapse_recursion = True -thread = flamegraph.ProfileThread(fp, 1.0 / freq, - filter_, collapse_recursion) -start_time = time.clock() -try: -thread.start() -func() -finally: -thread.stop() -thread.join() -print('Collected %d stack frames (%d unique) in %2.2f seconds.' % ( -time.clock() - start_time, thread.num_frames(), -thread.num_frames(unique=True))) - - -def statprofile(ui, func, fp): -try: -import statprof -except ImportError: -raise error.Abort(_( -'statprof not available - install using "easy_install statprof"')) - -freq = ui.configint('profiling', 'freq', default=1000) -if freq > 0: -statprof.reset(freq) -else: -ui.warn(_("invalid sampling frequency '%s' - ignoring\n") % freq) - -statprof.start() -try: -return func() -finally: -statprof.stop() -statprof.display(fp) - def _runcommand(ui, options, cmd, cmdfunc): """Enables the profiler if applicable. ``profiling.enabled`` - boolean config that enables or disables profiling """ def checkargs(): try: return cmdfunc() except error.SignatureError: raise error.CommandError(cmd, _("invalid arguments")) if options['profile'] or ui.configbool('profiling', 'enabled'): -profiler = os.getenv('HGPROF') -if profiler is None: -profiler = ui.config('profiling', 'type', default='ls') -if profiler not in ('ls', 'stat', 'flame'): -ui.warn(_("unrecogni
Re: [PATCH 5 of 5 RFC] osutil: switch to placeholder module that imports cpy/pure selectively
On Sun, Aug 14, 2016 at 6:12 PM, Yuya Nishihara wrote: > On Sun, 14 Aug 2016 09:52:19 -0700, Gregory Szorc wrote: > > On Sat, Aug 13, 2016 at 3:15 AM, Yuya Nishihara wrote: > > > # HG changeset patch > > > # User Yuya Nishihara > > > # Date 1470969317 -32400 > > > # Fri Aug 12 11:35:17 2016 +0900 > > > # Node ID e1318b322922baf91a146d92b1ee21cc0e509c04 > > > # Parent 3d8b3e09190aaacc3c57ae37b265838df4accb1a > > > osutil: switch to placeholder module that imports cpy/pure selectively > > > > > > You have to do "make clean" to test this change. Otherwise, the > existing > > > osutil.so would be loaded directly. Also, you need "make clean" to go > back > > > to previous revisions. > > > > > > Nit: should we move osutil.c to mercurial/cpy? > > > > I like the spirit of this series to establish some more order around the > > various module implementations. > > > > I'm pretty sure mpm won't like the `make clean` requirement, as he likes > > the ability to bisect without having to worry about build system foo. > > Yeah, I know that would be the most controversial part of this series. > Good news is you can bisect without "make clean" as long as you have > mercurial/osutil.so compiled by old revisions. .so appears to precede .py. > > > I haven't been paying close attention to the cpy/pure/cffi discussions. I > > know there is a common pattern in other Python projects of having modules > > define the pure Python code inline then have code at the bottom of the > > module/file that imports C extensions/libraries and overwrites the Python > > bits. So e.g. we could move mercurial/pure/osutil.py to > mercurial/osutil.py > > then at the bottom of the file do something like: > > > > if modulepolicy != 'py': > > globals().update(policy.uimportvars(u'osutil')) > > > > Or we could potentially inline the cffi bits without having to a) > maintain > > a separate file/module or b) abstract the import mechanism for modules > with > > C implementations. In other words, we could add C implementations to any > > module without having to tell the module import mechanism about which > > modules contain C implementations. We would likely still need this > > policy.importvars() trick for C extensions, since C extensions need to > live > > in their own module. But at least we wouldn't have an extra module for > cffi. > > Another idea I came up with is to add a helper to load cffi functions into > pure. This works only if cffi functions don't depend on pure functions and > classes. > > # mercurial/pure/osutil.py > ... > globals().update(policy.importcffioverrides(u'osutil')) > > What I want to address right now are: > > a) move cffi codes and compiled objects out of the root directory > b) centralize the import rule of cffi, instead of try-except in pure > modules > I think that this series is a stop in the right direction towards making the cffi integration nicer. We can always work on "unifying" the modules later. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 3 of 5 RFC] policy: add helper to import cpy/pure module selectively
On Sun, 14 Aug 2016 09:38:26 -0700, Gregory Szorc wrote: > On Sat, Aug 13, 2016 at 3:15 AM, Yuya Nishihara wrote: > > # HG changeset patch > > # User Yuya Nishihara > > # Date 1470969017 -32400 > > # Fri Aug 12 11:30:17 2016 +0900 > > # Node ID 8bb0a9223bce791bae9e6a07238d6b9722e5577a > > # Parent bda4422fc315b54356ba1949e5ba861dd2660e86 > > policy: add helper to import cpy/pure module selectively > > > > Unfortunately, __import__() nor getattr() does not take bytes, and > > globals() > > is a dict keyed by unicodes on Python 3. uimportvars() is a unicode API to > > comply with their requirement. I really hated mixing bytes and unicodes, > > but > > bytes version of importvars() would be useless. > > > > Nit: KeyError would be raised for invalid HGMODULEPOLICY. What should we > > do in that case? > > > > We have precedence for this in the `hg` script: we write an untranslated > abort message > to sys.stderr and call sys.exit. Ideally we would translate that message. > But without > a module load policy, that gets difficult. We would probably ideally change > the policy to > "pure" (since that will always work), load i18n.py, then do a normal abort. > But to do that > may require moving some policy loading code into __init__.py. I don't care much about i18n stuff for early exceptions. I'll try to handle invalid policy in "hg" script. > > +def uimportvars(uname): > > +"""Import module according to policy and return exported vars""" > > +upkgnames = _upackagetable[policy] > > +for upkg in upkgnames: > > +try: > > +# pass globals() to resolve name relative to this module > > +mod = __import__(u'%s.%s' % (upkg, uname), globals(), level=1) > > +mod = getattr(mod, uname) > > +return _uexportedvars(mod) > > +except ImportError: > > +if upkg == upkgnames[-1]: > > +raise > > The "u" prefixing feels a bit weird. Your commit message says the bytes > version of "importvars" would be useless. So why the prefix? Callers have to pass u'' instead of '', which is very different from the other Mercurial APIs. So the function should have some ugliness to signal it. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 5 of 5 RFC] osutil: switch to placeholder module that imports cpy/pure selectively
On Sun, 14 Aug 2016 09:52:19 -0700, Gregory Szorc wrote: > On Sat, Aug 13, 2016 at 3:15 AM, Yuya Nishihara wrote: > > # HG changeset patch > > # User Yuya Nishihara > > # Date 1470969317 -32400 > > # Fri Aug 12 11:35:17 2016 +0900 > > # Node ID e1318b322922baf91a146d92b1ee21cc0e509c04 > > # Parent 3d8b3e09190aaacc3c57ae37b265838df4accb1a > > osutil: switch to placeholder module that imports cpy/pure selectively > > > > You have to do "make clean" to test this change. Otherwise, the existing > > osutil.so would be loaded directly. Also, you need "make clean" to go back > > to previous revisions. > > > > Nit: should we move osutil.c to mercurial/cpy? > > I like the spirit of this series to establish some more order around the > various module implementations. > > I'm pretty sure mpm won't like the `make clean` requirement, as he likes > the ability to bisect without having to worry about build system foo. Yeah, I know that would be the most controversial part of this series. Good news is you can bisect without "make clean" as long as you have mercurial/osutil.so compiled by old revisions. .so appears to precede .py. > I haven't been paying close attention to the cpy/pure/cffi discussions. I > know there is a common pattern in other Python projects of having modules > define the pure Python code inline then have code at the bottom of the > module/file that imports C extensions/libraries and overwrites the Python > bits. So e.g. we could move mercurial/pure/osutil.py to mercurial/osutil.py > then at the bottom of the file do something like: > > if modulepolicy != 'py': > globals().update(policy.uimportvars(u'osutil')) > > Or we could potentially inline the cffi bits without having to a) maintain > a separate file/module or b) abstract the import mechanism for modules with > C implementations. In other words, we could add C implementations to any > module without having to tell the module import mechanism about which > modules contain C implementations. We would likely still need this > policy.importvars() trick for C extensions, since C extensions need to live > in their own module. But at least we wouldn't have an extra module for cffi. Another idea I came up with is to add a helper to load cffi functions into pure. This works only if cffi functions don't depend on pure functions and classes. # mercurial/pure/osutil.py ... globals().update(policy.importcffioverrides(u'osutil')) What I want to address right now are: a) move cffi codes and compiled objects out of the root directory b) centralize the import rule of cffi, instead of try-except in pure modules ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 1 of 4 RFC] pycompat: delay loading modules registered to stub
On Sun, 14 Aug 2016 09:21:07 -0700, Gregory Szorc wrote: > On Sun, Aug 14, 2016 at 2:33 AM, Yuya Nishihara wrote: > > # HG changeset patch > > # User Yuya Nishihara > > # Date 1471153584 -32400 > > # Sun Aug 14 14:46:24 2016 +0900 > > # Node ID 131cc484d7fe95d07f677c16dbb0f506847e1d38 > > # Parent 0cb2d4db308b97e8fe7faa8d45a47b228037f230 > > pycompat: delay loading modules registered to stub > > > > New _pycompatstub is compatible with our demandimporter. try-except is > > replaced by version comparison because ImportError will no longer be raised > > immediately. > > Nit: I think you mean "incompatible." This patch looks good otherwise. I wanted to say something like "new _pycompatstub works well with our demandimporter." Can you update the commit message in flight? ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 8 of 9 RFC] wireproto: introduce listkeys2 command
On Sun, Aug 14, 2016 at 04:59:50PM -0700, Gregory Szorc wrote: > On Sun, Aug 14, 2016 at 3:12 PM, Mike Hommey wrote: > > > On Sun, Aug 14, 2016 at 02:10:07PM -0700, Gregory Szorc wrote: > > > # HG changeset patch > > > # User Gregory Szorc > > > # Date 1471208237 25200 > > > # Sun Aug 14 13:57:17 2016 -0700 > > > # Node ID d2870bcbc43041909e9f637b294cb889f7ed4933 > > > # Parent eb2bc1ac7869ad255965d16004524a95cea83c9d > > > wireproto: introduce listkeys2 command > > > > > > The command behaves exactly like "listkeys" except it uses a more > > > efficient and more robust binary encoding mechanism. > > > > Nowhere in the patch queue I see mentioned why you want this. Not saying > > that this shouldn't be done, but it's really not clear what the expected > > benefit is of all this refactoring and this new command. > > > I said it concisely in the commit message you just quoted: the wire > encoding is smaller and is able to represent all binary values. I offer > more detail at > https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-August/087243.html. ... and a large part of that message should be in this commit message. Mike ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 1 of 6] profiling: move profiling code from dispatch.py (API)
On Sun, Aug 14, 2016 at 5:03 PM, Gregory Szorc wrote: > # HG changeset patch > # User Gregory Szorc > # Date 1471217444 25200 > # Sun Aug 14 16:30:44 2016 -0700 > # Node ID 40cbc513713837acd6a9f593bfd48759eabcce33 > # Parent 279cd80059d41bbdb91ea9073278cbbe7f1b43d5 > profiling: move profiling code from dispatch.py (API) > I found a few minor issues with this series. Please don't ninja queue. I'll send a v2, hopefully tonight. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 4 of 6] profiling: add a context manager that no-ops if profiling isn't enabled
# HG changeset patch # User Gregory Szorc # Date 1471217701 25200 # Sun Aug 14 16:35:01 2016 -0700 # Node ID 7d945851998c89444a8a53ac7daf1e0798cb7838 # Parent cf7b933cbc7fd0dfc4c5d5d67deae2d52866088a profiling: add a context manager that no-ops if profiling isn't enabled And refactor dispatch.py to use it. As you can see, the resulting code is much simpler. I was tempted to inline _runcommand as part of writing this series. However, a number of extensions wrap _runcommand. So keeping it around is necessary (extensions can't easily wrap runcommand because it calls hooks before and after command execution). diff --git a/mercurial/dispatch.py b/mercurial/dispatch.py --- a/mercurial/dispatch.py +++ b/mercurial/dispatch.py @@ -893,31 +893,22 @@ def _dispatch(req): try: return runcommand(lui, repo, cmd, fullargs, ui, options, d, cmdpats, cmdoptions) finally: if repo and repo != req.repo: repo.close() def _runcommand(ui, options, cmd, cmdfunc): -"""Enables the profiler if applicable. - -``profiling.enabled`` - boolean config that enables or disables profiling -""" -def checkargs(): +"""Run a command function, possibly with profiling enabled.""" +with profiling.maybeprofile(ui): try: return cmdfunc() except error.SignatureError: -raise error.CommandError(cmd, _("invalid arguments")) - -if ui.configbool('profiling', 'enabled'): -with profiling.profile(ui): -return checkargs() -else: -return checkargs() +raise error.CommandError(cmd, _('invalid arguments')) def _exceptionwarning(ui): """Produce a warning message for the current active exception""" # For compatibility checking, we discard the portion of the hg # version after the + on the assumption that if a "normal # user" is running a build with a + in it the packager # probably built from fairly close to a tag and anyone with a diff --git a/mercurial/profiling.py b/mercurial/profiling.py --- a/mercurial/profiling.py +++ b/mercurial/profiling.py @@ -137,8 +137,25 @@ def profile(ui): if output: if output == 'blackbox': val = 'Profile:\n%s' % fp.getvalue() # ui.log treats the input as a format string, # so we need to escape any % signs. val = val.replace('%', '%%') ui.log('profile', val) fp.close() + +@contextlib.contextmanager +def maybeprofile(ui): +"""Profile if enabled, else do nothing. + +This context manager can be used to optionally profile if profiling +is enabled. Otherwise, it does nothing. + +The purpose of this context manager is to make calling code simpler: +just use a single code path for calling into code you may want to profile +and this function determines whether to start profiling. +""" +if ui.configbool('profiling', 'enabled'): +with profile(ui): +yield +else: +yield ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 5 of 6] hgweb: abstract call to hgwebdir wsgi function
# HG changeset patch # User Gregory Szorc # Date 1471215810 25200 # Sun Aug 14 16:03:30 2016 -0700 # Node ID a04a42c139225b2f5b3166541fd7466190459822 # Parent 7d945851998c89444a8a53ac7daf1e0798cb7838 hgweb: abstract call to hgwebdir wsgi function The function names and behavior now matches hgweb. The reason for this will be obvious in the next patch. diff --git a/mercurial/hgweb/hgwebdir_mod.py b/mercurial/hgweb/hgwebdir_mod.py --- a/mercurial/hgweb/hgwebdir_mod.py +++ b/mercurial/hgweb/hgwebdir_mod.py @@ -212,16 +212,19 @@ class hgwebdir(object): allow_read = ui.configlist('web', 'allow_read', untrusted=True) # by default, allow reading if no allow_read option has been set if (not allow_read) or ismember(ui, user, allow_read): return True return False def run_wsgi(self, req): +return self._runwsgi(req) + +def _runwsgi(self, req): try: self.refresh() virtual = req.env.get("PATH_INFO", "").strip('/') tmpl = self.templater(req) ctype = tmpl('mimetype', encoding=encoding.encoding) ctype = templater.stringify(ctype) ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 6 of 6] hgweb: profile HTTP requests
# HG changeset patch # User Gregory Szorc # Date 1471219358 25200 # Sun Aug 14 17:02:38 2016 -0700 # Node ID 817b24db6bea900c690dae7ab0f2d15244d78724 # Parent a04a42c139225b2f5b3166541fd7466190459822 hgweb: profile HTTP requests Currently, running `hg serve --profile` doesn't yield anything useful: when the process is terminated the profiling output displays results from the main thread, which typically spends most of its time in select.select(). Furthermore, it has no meaningful results from mercurial.* modules because the threads serving HTTP requests don't actually get profiled. This patch teaches the hgweb wsgi applications to profile individual requests. If profiling is enabled, the profiler kicks in after HTTP/WSGI environment processing but before Mercurial's main request processing. The profile results are printed to the configured profiling output. If running `hg serve` from a shell, they will be printed to stderr, just before the HTTP request line is logged. If profiling to a file, we only write a single profile to the file because the file is not opened in append mode. We could add support for appending to files in a future patch if someone wants it. Having this profiling data has already given me new avenues of performance optimization to pursue in the HTTP server. diff --git a/mercurial/hgweb/hgweb_mod.py b/mercurial/hgweb/hgweb_mod.py --- a/mercurial/hgweb/hgweb_mod.py +++ b/mercurial/hgweb/hgweb_mod.py @@ -23,16 +23,17 @@ from .common import ( ) from .request import wsgirequest from .. import ( encoding, error, hg, hook, +profiling, repoview, templatefilters, templater, ui as uimod, util, ) from . import ( @@ -300,18 +301,19 @@ class hgweb(object): def run_wsgi(self, req): """Internal method to run the WSGI application. This is typically only called by Mercurial. External consumers should be using instances of this class as the WSGI application. """ with self._obtainrepo() as repo: -for r in self._runwsgi(req, repo): -yield r +with profiling.maybeprofile(repo.ui): +for r in self._runwsgi(req, repo): +yield r def _runwsgi(self, req, repo): rctx = requestcontext(self, repo) # This state is global across all threads. encoding.encoding = rctx.config('web', 'encoding', encoding.encoding) rctx.repo.ui.environ = req.env diff --git a/mercurial/hgweb/hgwebdir_mod.py b/mercurial/hgweb/hgwebdir_mod.py --- a/mercurial/hgweb/hgwebdir_mod.py +++ b/mercurial/hgweb/hgwebdir_mod.py @@ -26,16 +26,17 @@ from .common import ( staticfile, ) from .request import wsgirequest from .. import ( encoding, error, hg, +profiling, scmutil, templater, ui as uimod, util, ) from . import ( hgweb_mod, @@ -212,17 +213,19 @@ class hgwebdir(object): allow_read = ui.configlist('web', 'allow_read', untrusted=True) # by default, allow reading if no allow_read option has been set if (not allow_read) or ismember(ui, user, allow_read): return True return False def run_wsgi(self, req): -return self._runwsgi(req) +with profiling.maybeprofile(self.ui): +for r in self._runwsgi(req): +yield r def _runwsgi(self, req): try: self.refresh() virtual = req.env.get("PATH_INFO", "").strip('/') tmpl = self.templater(req) ctype = tmpl('mimetype', encoding=encoding.encoding) diff --git a/tests/test-profile.t b/tests/test-profile.t --- a/tests/test-profile.t +++ b/tests/test-profile.t @@ -26,9 +26,23 @@ test --profile $ hg --profile st 2>../out $ grep 'events: Ticks' ../out > /dev/null || cat ../out $ hg --profile --config profiling.output=../out st $ grep 'events: Ticks' ../out > /dev/null || cat ../out #endif +#if lsprof serve + +Profiling of HTTP requests works + + $ hg --profile --config profiling.format=text --config profiling.output=../profile.log serve -d -p $HGPORT --pid-file ../hg.pid -A ../access.log + $ cat ../hg.pid >> $DAEMON_PIDS + $ hg -q clone -U http://localhost:$HGPORT ../clone + +A single profile is logged because file logging doesn't append + $ grep CallCount ../profile.log | wc -l + 1 + +#endif + $ cd .. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 3 of 6] profiling: make profiling functions context managers (API)
# HG changeset patch # User Gregory Szorc # Date 1471217557 25200 # Sun Aug 14 16:32:37 2016 -0700 # Node ID cf7b933cbc7fd0dfc4c5d5d67deae2d52866088a # Parent fd888ffaab6720688e4ad3f0358be09509effb6f profiling: make profiling functions context managers (API) This makes profiling more flexible since we can now call multiple functions when a profiler is active. But the real reason for this is to enable a future consumer to profile a function that returns a generator. We can't do this from the profiling function itself because functions can either be generators or have return values: they can't be both. So therefore it isn't possible to have a generic profiling function that can both consume and re-emit a generator and return a value. diff --git a/mercurial/dispatch.py b/mercurial/dispatch.py --- a/mercurial/dispatch.py +++ b/mercurial/dispatch.py @@ -904,17 +904,18 @@ def _runcommand(ui, options, cmd, cmdfun """ def checkargs(): try: return cmdfunc() except error.SignatureError: raise error.CommandError(cmd, _("invalid arguments")) if ui.configbool('profiling', 'enabled'): -return profiling.profile(ui, checkargs) +with profiling.profile(ui): +return checkargs() else: return checkargs() def _exceptionwarning(ui): """Produce a warning message for the current active exception""" # For compatibility checking, we discard the portion of the hg # version after the + on the assumption that if a "normal diff --git a/mercurial/profiling.py b/mercurial/profiling.py --- a/mercurial/profiling.py +++ b/mercurial/profiling.py @@ -2,27 +2,29 @@ # # Copyright 2016 Gregory Szorc # # This software may be used and distributed according to the terms of the # GNU General Public License version 2 or any later version. from __future__ import absolute_import, print_function +import contextlib import os import sys import time from .i18n import _ from . import ( error, util, ) -def lsprofile(ui, func, fp): +@contextlib.contextmanager +def lsprofile(ui, fp): format = ui.config('profiling', 'format', default='text') field = ui.config('profiling', 'sort', default='inlinetime') limit = ui.configint('profiling', 'limit', default=30) climit = ui.configint('profiling', 'nested', default=0) if format not in ['text', 'kcachegrind']: ui.warn(_("unrecognized profiling format '%s'" " - Ignored\n") % format) @@ -32,76 +34,82 @@ def lsprofile(ui, func, fp): from . import lsprof except ImportError: raise error.Abort(_( 'lsprof not available - install from ' 'http://codespeak.net/svn/user/arigo/hack/misc/lsprof/')) p = lsprof.Profiler() p.enable(subcalls=True) try: -return func() +yield finally: p.disable() if format == 'kcachegrind': from . import lsprofcalltree calltree = lsprofcalltree.KCacheGrind(p) calltree.output(fp) else: # format == 'text' stats = lsprof.Stats(p.getstats()) stats.sort(field) stats.pprint(limit=limit, file=fp, climit=climit) -def flameprofile(ui, func, fp): +@contextlib.contextmanager +def flameprofile(ui, fp): try: from flamegraph import flamegraph except ImportError: raise error.Abort(_( 'flamegraph not available - install from ' 'https://github.com/evanhempel/python-flamegraph')) # developer config: profiling.freq freq = ui.configint('profiling', 'freq', default=1000) filter_ = None collapse_recursion = True thread = flamegraph.ProfileThread(fp, 1.0 / freq, filter_, collapse_recursion) start_time = time.clock() try: thread.start() -func() +yield finally: thread.stop() thread.join() print('Collected %d stack frames (%d unique) in %2.2f seconds.' % ( time.clock() - start_time, thread.num_frames(), thread.num_frames(unique=True))) -def statprofile(ui, func, fp): +def statprofile(ui, fp): try: import statprof except ImportError: raise error.Abort(_( 'statprof not available - install using "easy_install statprof"')) freq = ui.configint('profiling', 'freq', default=1000) if freq > 0: statprof.reset(freq) else: ui.warn(_("invalid sampling frequency '%s' - ignoring\n") % freq) statprof.start() try: -return func() +yield finally: statprof.stop() statprof.display(fp) -def profile(ui, fn): -"""Profile a function call.""" +@contextlib.contextmanager +def profile(ui): +"""Start profiling. + +Profiling is active when the context manager is active. When the context +ma
[PATCH 2 of 6] dispatch: set profiling.enabled when profiling is enabled
# HG changeset patch # User Gregory Szorc # Date 1471217758 25200 # Sun Aug 14 16:35:58 2016 -0700 # Node ID fd888ffaab6720688e4ad3f0358be09509effb6f # Parent 40cbc513713837acd6a9f593bfd48759eabcce33 dispatch: set profiling.enabled when profiling is enabled We do this for other global command arguments. We don't for --profile for reasons that are unknown to me. Probably because nobody has needed it. An upcoming patch will introduce a new consumer of the profiling code. It doesn't have access to command line arguments. So let's set the config option during argument processing. We also remove a check for "options['profile']" because it is now redundant. diff --git a/mercurial/dispatch.py b/mercurial/dispatch.py --- a/mercurial/dispatch.py +++ b/mercurial/dispatch.py @@ -814,16 +814,20 @@ def _dispatch(req): uis.add(req.repo.ui) if options['verbose'] or options['debug'] or options['quiet']: for opt in ('verbose', 'debug', 'quiet'): val = str(bool(options[opt])) for ui_ in uis: ui_.setconfig('ui', opt, val, '--' + opt) +if options['profile']: +for ui_ in uis: +ui_.setconfig('profiling', 'enabled', 'true', '--profile') + if options['traceback']: for ui_ in uis: ui_.setconfig('ui', 'traceback', 'on', '--traceback') if options['noninteractive']: for ui_ in uis: ui_.setconfig('ui', 'interactive', 'off', '-y') @@ -899,17 +903,17 @@ def _runcommand(ui, options, cmd, cmdfun ``profiling.enabled`` - boolean config that enables or disables profiling """ def checkargs(): try: return cmdfunc() except error.SignatureError: raise error.CommandError(cmd, _("invalid arguments")) -if options['profile'] or ui.configbool('profiling', 'enabled'): +if ui.configbool('profiling', 'enabled'): return profiling.profile(ui, checkargs) else: return checkargs() def _exceptionwarning(ui): """Produce a warning message for the current active exception""" # For compatibility checking, we discard the portion of the hg ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 1 of 6] profiling: move profiling code from dispatch.py (API)
# HG changeset patch # User Gregory Szorc # Date 1471217444 25200 # Sun Aug 14 16:30:44 2016 -0700 # Node ID 40cbc513713837acd6a9f593bfd48759eabcce33 # Parent 279cd80059d41bbdb91ea9073278cbbe7f1b43d5 profiling: move profiling code from dispatch.py (API) Currently, profiling code lives in dispatch.py, which is a low-level module centered around command dispatch. Furthermore, dispatch.py imports a lot of other modules, meaning that importing dispatch.py to get at profiling functionality would often result in a module import cycle. Profiling is a generic activity. It shouldn't be limited to command dispatch. This patch moves profiling code from dispatch.py to the new profiling.py. The low-level "run a profiler against a function" functions have been moved verbatim. The code for determining how to invoke the profiler has been extracted to its own function. I decided to create a new module rather than stick this code elsewhere (such as util.py) because util.py is already quite large. And, I foresee this file growing larger once Facebook's profiling enhancements get added to it. diff --git a/mercurial/dispatch.py b/mercurial/dispatch.py --- a/mercurial/dispatch.py +++ b/mercurial/dispatch.py @@ -29,16 +29,17 @@ from . import ( demandimport, encoding, error, extensions, fancyopts, fileset, hg, hook, +profiling, revset, templatefilters, templatekw, templater, ui as uimod, util, ) @@ -887,140 +888,29 @@ def _dispatch(req): d = lambda: util.checksignature(func)(ui, *args, **cmdoptions) try: return runcommand(lui, repo, cmd, fullargs, ui, options, d, cmdpats, cmdoptions) finally: if repo and repo != req.repo: repo.close() -def lsprofile(ui, func, fp): -format = ui.config('profiling', 'format', default='text') -field = ui.config('profiling', 'sort', default='inlinetime') -limit = ui.configint('profiling', 'limit', default=30) -climit = ui.configint('profiling', 'nested', default=0) - -if format not in ['text', 'kcachegrind']: -ui.warn(_("unrecognized profiling format '%s'" -" - Ignored\n") % format) -format = 'text' - -try: -from . import lsprof -except ImportError: -raise error.Abort(_( -'lsprof not available - install from ' -'http://codespeak.net/svn/user/arigo/hack/misc/lsprof/')) -p = lsprof.Profiler() -p.enable(subcalls=True) -try: -return func() -finally: -p.disable() - -if format == 'kcachegrind': -from . import lsprofcalltree -calltree = lsprofcalltree.KCacheGrind(p) -calltree.output(fp) -else: -# format == 'text' -stats = lsprof.Stats(p.getstats()) -stats.sort(field) -stats.pprint(limit=limit, file=fp, climit=climit) - -def flameprofile(ui, func, fp): -try: -from flamegraph import flamegraph -except ImportError: -raise error.Abort(_( -'flamegraph not available - install from ' -'https://github.com/evanhempel/python-flamegraph')) -# developer config: profiling.freq -freq = ui.configint('profiling', 'freq', default=1000) -filter_ = None -collapse_recursion = True -thread = flamegraph.ProfileThread(fp, 1.0 / freq, - filter_, collapse_recursion) -start_time = time.clock() -try: -thread.start() -func() -finally: -thread.stop() -thread.join() -print('Collected %d stack frames (%d unique) in %2.2f seconds.' % ( -time.clock() - start_time, thread.num_frames(), -thread.num_frames(unique=True))) - - -def statprofile(ui, func, fp): -try: -import statprof -except ImportError: -raise error.Abort(_( -'statprof not available - install using "easy_install statprof"')) - -freq = ui.configint('profiling', 'freq', default=1000) -if freq > 0: -statprof.reset(freq) -else: -ui.warn(_("invalid sampling frequency '%s' - ignoring\n") % freq) - -statprof.start() -try: -return func() -finally: -statprof.stop() -statprof.display(fp) - def _runcommand(ui, options, cmd, cmdfunc): """Enables the profiler if applicable. ``profiling.enabled`` - boolean config that enables or disables profiling """ def checkargs(): try: return cmdfunc() except error.SignatureError: raise error.CommandError(cmd, _("invalid arguments")) if options['profile'] or ui.configbool('profiling', 'enabled'): -profiler = os.getenv('HGPROF') -if profiler is None: -profiler = ui.config('profiling', 'type', default='ls') -if profiler not in ('ls', 'stat', 'flame'): -ui.warn(_("unrecogni
Re: [PATCH 8 of 9 RFC] wireproto: introduce listkeys2 command
On Sun, Aug 14, 2016 at 3:12 PM, Mike Hommey wrote: > On Sun, Aug 14, 2016 at 02:10:07PM -0700, Gregory Szorc wrote: > > # HG changeset patch > > # User Gregory Szorc > > # Date 1471208237 25200 > > # Sun Aug 14 13:57:17 2016 -0700 > > # Node ID d2870bcbc43041909e9f637b294cb889f7ed4933 > > # Parent eb2bc1ac7869ad255965d16004524a95cea83c9d > > wireproto: introduce listkeys2 command > > > > The command behaves exactly like "listkeys" except it uses a more > > efficient and more robust binary encoding mechanism. > > Nowhere in the patch queue I see mentioned why you want this. Not saying > that this shouldn't be done, but it's really not clear what the expected > benefit is of all this refactoring and this new command. I said it concisely in the commit message you just quoted: the wire encoding is smaller and is able to represent all binary values. I offer more detail at https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-August/087243.html. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 8 of 9 RFC] wireproto: introduce listkeys2 command
On Sun, Aug 14, 2016 at 02:10:07PM -0700, Gregory Szorc wrote: > # HG changeset patch > # User Gregory Szorc > # Date 1471208237 25200 > # Sun Aug 14 13:57:17 2016 -0700 > # Node ID d2870bcbc43041909e9f637b294cb889f7ed4933 > # Parent eb2bc1ac7869ad255965d16004524a95cea83c9d > wireproto: introduce listkeys2 command > > The command behaves exactly like "listkeys" except it uses a more > efficient and more robust binary encoding mechanism. Nowhere in the patch queue I see mentioned why you want this. Not saying that this shouldn't be done, but it's really not clear what the expected benefit is of all this refactoring and this new command. Mike ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH] hgweb: config option to control zlib compression level
On Mon, Aug 8, 2016 at 6:45 AM, Yuya Nishihara wrote: > On Sun, 07 Aug 2016 18:12:37 -0700, Gregory Szorc wrote: > > # HG changeset patch > > # User Gregory Szorc > > # Date 1470618598 25200 > > # Sun Aug 07 18:09:58 2016 -0700 > > # Node ID d5e9dedbc7f912ff2dc2e92586347250aee11ac2 > > # Parent 3ef9aa7ad1fc4c43b92d48e4bb1f4e3de68b6910 > > hgweb: config option to control zlib compression level > > > > Before this patch, the HTTP transport protocol would always zlib > > compress certain responses (notably "getbundle" wire protocol commands) > > at zlib compression level 6. > > > > zlib can be a massive CPU resource sink for servers. Some server > > operators may wish to reduce server-side CPU requirements while > > requiring more bandwidth. This is common on corporate intranets, for > > example. Others may wish to use more CPU but reduce bandwidth. > > > > This patch introduces a config option to allow server operators > > to control the zlib compression level. > > Sounds good, queued, thanks. > > > --- a/mercurial/hgweb/protocol.py > > +++ b/mercurial/hgweb/protocol.py > > @@ -69,17 +69,17 @@ class webproto(wireproto.abstractserverp > > def redirect(self): > > self.oldio = self.ui.fout, self.ui.ferr > > self.ui.ferr = self.ui.fout = stringio() > > def restore(self): > > val = self.ui.fout.getvalue() > > self.ui.ferr, self.ui.fout = self.oldio > > return val > > def groupchunks(self, cg): > > -z = zlib.compressobj() > > +z = zlib.compressobj(self.ui.configint('server', 'zliblevel', > -1)) > > Any idea if untrusted hgrc should be allowed? > I can go both ways on this. However, I'm inclined to say "no" because setting the compression level very high or disabling it could make a CPU or network usage increase significantly. I think server operators would want to retain control over this setting for that reason. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 5 of 9 RFC] pushkey: define raw listing functions on each namespace (API)
# HG changeset patch # User Gregory Szorc # Date 1471208975 25200 # Sun Aug 14 14:09:35 2016 -0700 # Node ID a94e1b0bb3ba0f6b0a3e62d5daece6a5f9ff295c # Parent 67501b93c9875cc6cdef0e77d2d148a14a5d60a8 pushkey: define raw listing functions on each namespace (API) In preparation for adding a wire protocol command that sends pushkey data with minimal encoding, we teach the pushkey namespace registration APIs about the recently-implemented raw variants to obtain keys. This will likely break extensions defining custom pushkey namespaces. Extensions can look at the length of the tuple returned by pushkey.get() to determine if they need to define a raw listkeys variant for the Mercurial version they are running on. diff --git a/mercurial/pushkey.py b/mercurial/pushkey.py --- a/mercurial/pushkey.py +++ b/mercurial/pushkey.py @@ -18,40 +18,46 @@ def _nslist(repo): n = {} for k in _namespaces: n[k] = "" if not obsolete.isenabled(repo, obsolete.exchangeopt): n.pop('obsolete') return n # Maps pushkey namespace to a tuple defining functions to call for -# setting and listing entries in the namespace. +# setting, listing, and listing raw entries in the namespace. _namespaces = { -'namespaces': (lambda *x: False, _nslist), -'bookmarks': (bookmarks.pushbookmark, bookmarks.listbookmarks), -'phases': (phases.pushphase, phases.listphases), -'obsolete': (obsolete.pushmarker, obsolete.listmarkers), +'namespaces': (lambda *x: False, _nslist, _nslist), +'bookmarks': (bookmarks.pushbookmark, bookmarks.listbookmarks, + bookmarks.listbookmarksraw), +'phases': (phases.pushphase, phases.listphases, phases.listphasesraw), +'obsolete': (obsolete.pushmarker, obsolete.listmarkers, + obsolete.listmarkersraw), } -def register(namespace, pushkey, listkeys): -_namespaces[namespace] = (pushkey, listkeys) +def register(namespace, pushkey, listkeys, listkeysraw): +_namespaces[namespace] = (pushkey, listkeys, listkeysraw) def _get(namespace): -return _namespaces.get(namespace, (lambda *x: False, lambda *x: {})) +return _namespaces.get(namespace, (lambda *x: False, lambda *x: {}, + lambda *x: {})) def push(repo, namespace, key, old, new): '''should succeed iff value was old''' pk = _get(namespace)[0] return pk(repo, key, old, new) def list(repo, namespace): '''return a dict''' lk = _get(namespace)[1] return lk(repo) +def listraw(repo, namespace): +return _get(namespace)[2](repo) + encode = encoding.fromlocal decode = encoding.tolocal def encodekeys(keys): """encode the content of a pushkey namespace for exchange over the wire""" return '\n'.join(['%s\t%s' % (encode(k), encode(v)) for k, v in keys]) ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 7 of 9 RFC] pushkey: support for encoding and decoding raw listkeys dicts
# HG changeset patch # User Gregory Szorc # Date 1471207500 25200 # Sun Aug 14 13:45:00 2016 -0700 # Node ID eb2bc1ac7869ad255965d16004524a95cea83c9d # Parent 1fe812eb8b9e79d1182c4a6593e7ce8fa2938264 pushkey: support for encoding and decoding raw listkeys dicts Now that we have support for retrieving raw/binary versions of pushkey data, the last step before we expose it on the wire protocol is a method to encode and decode it. This patch implements those functions. The new listkeys data representation is framed binary data. We simply have pairs of frames corresponding to keys and values. A 0 length key signals end of payload. All binary sequences can be encoded in keys and values. Of course, not all binary values can be used in existing namespaces because it may not be encoded properly in the existing wire protocol command. But going forward we can do whatever we want in new namespaces. diff --git a/mercurial/pushkey.py b/mercurial/pushkey.py --- a/mercurial/pushkey.py +++ b/mercurial/pushkey.py @@ -2,16 +2,18 @@ # # Copyright 2010 Matt Mackall # # This software may be used and distributed according to the terms of the # GNU General Public License version 2 or any later version. from __future__ import absolute_import +import struct + from . import ( bookmarks, encoding, obsolete, phases, ) def _nslist(repo): @@ -63,8 +65,62 @@ def encodekeys(keys): def decodekeys(data): """decode the content of a pushkey namespace from exchange over the wire""" result = {} for l in data.splitlines(): k, v = l.split('\t') result[decode(k)] = decode(v) return result + +def encodekeysraw(keys): +"""Encode pushkey namespace keys using a binary encoding. + +The response consists of framed data packets of the form: + + + +Where the ``size`` is a little endian 32-bit integer. + +Data is emitted in pairs of frames where the first frame is the key +name and the second frame is the value. + +A frame with size 0 indicates end of stream. +""" +s = struct.struct('https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 9 of 9 RFC] exchange: call listkeys2 to fetch obsolescence data if available
# HG changeset patch # User Gregory Szorc # Date 1471208941 25200 # Sun Aug 14 14:09:01 2016 -0700 # Node ID f878f764517369b28d0dfd9bd9fb14514dc44862 # Parent d2870bcbc43041909e9f637b294cb889f7ed4933 exchange: call listkeys2 to fetch obsolescence data if available diff --git a/mercurial/exchange.py b/mercurial/exchange.py --- a/mercurial/exchange.py +++ b/mercurial/exchange.py @@ -1454,23 +1454,30 @@ def _pullobsolete(pullop): Exists mostly to allow overriding for experimentation purpose""" if 'obsmarkers' in pullop.stepsdone: return pullop.stepsdone.add('obsmarkers') tr = None if obsolete.isenabled(pullop.repo, obsolete.exchangeopt): pullop.repo.ui.debug('fetching remote obsolete markers\n') -remoteobs = pullop.remote.listkeys('obsolete') +raw = pullop.remote.capable('listkeys2') +if raw: +fn = pullop.remote.listkeys2 +else: +fn = pullop.remote.listkeys +remoteobs = fn('obsolete') if 'dump0' in remoteobs: tr = pullop.gettransaction() markers = [] for key in sorted(remoteobs, reverse=True): if key.startswith('dump'): -data = base85.b85decode(remoteobs[key]) +data = remoteobs[key] +if not raw: +data = base85.b85decode(data) version, newmarks = obsolete._readmarkers(data) markers += newmarks if markers: pullop.repo.obsstore.add(tr, markers) pullop.repo.invalidatevolatilesets() return tr def caps20to10(repo): ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 8 of 9 RFC] wireproto: introduce listkeys2 command
# HG changeset patch # User Gregory Szorc # Date 1471208237 25200 # Sun Aug 14 13:57:17 2016 -0700 # Node ID d2870bcbc43041909e9f637b294cb889f7ed4933 # Parent eb2bc1ac7869ad255965d16004524a95cea83c9d wireproto: introduce listkeys2 command The command behaves exactly like "listkeys" except it uses a more efficient and more robust binary encoding mechanism. diff --git a/mercurial/wireproto.py b/mercurial/wireproto.py --- a/mercurial/wireproto.py +++ b/mercurial/wireproto.py @@ -358,16 +358,27 @@ class wirepeer(peer.peerrepository): f = future() self.ui.debug('preparing listkeys for "%s"\n' % namespace) yield {'namespace': encoding.fromlocal(namespace)}, f d = f.value self.ui.debug('received listkey for "%s": %i bytes\n' % (namespace, len(d))) yield pushkeymod.decodekeys(d) +@batchable +def listkeys2(self, namespace): +self.requirecap('listkeys2', 'binary pushkey retrieval') +f = future() +self.ui.debug('preparing listkeys2 for "%s"\n' % namespace) +yield {'namespace': encoding.fromlocal(namespace)}, f +d = f.value +self.ui.debug('received listkeys2 for "%s": %i bytes\n' + % (namespace, len(d))) +yield pushkeymod.decodekeysraw(d) + def stream_out(self): return self._callstream('stream_out') def changegroup(self, nodes, kind): n = encodelist(nodes) f = self._callcompressable("changegroup", roots=n) return changegroupmod.cg1unpacker(f, 'UN') @@ -681,17 +692,17 @@ def clonebundles(repo, proto): Extensions may wrap this command to filter or dynamically emit data depending on the request. e.g. you could advertise URLs for the closest data center given the client's IP address. """ return repo.opener.tryread('clonebundles.manifest') wireprotocaps = ['lookup', 'changegroupsubset', 'branchmap', 'pushkey', - 'known', 'getbundle', 'unbundlehash', 'batch'] + 'known', 'getbundle', 'unbundlehash', 'batch', 'listkeys2'] def _capabilities(repo, proto): """return a list of capabilities for a repo This function exists to allow extensions to easily wrap capabilities computation - returns a lists: easy to alter @@ -790,16 +801,21 @@ def hello(repo, proto): ''' return "capabilities: %s\n" % (capabilities(repo, proto)) @wireprotocommand('listkeys', 'namespace') def listkeys(repo, proto, namespace): d = repo.listkeys(encoding.tolocal(namespace)).items() return pushkeymod.encodekeys(d) +@wireprotocommand('listkeys2', 'namespace') +def listkeys2(repo, proto, namespace): +d = repo.listkeys(encoding.tolocal(namespace), raw=True).items() +return pushkeymod.encodekeysraw(d) + @wireprotocommand('lookup', 'key') def lookup(repo, proto, key): try: k = encoding.tolocal(key) c = repo[k] r = c.hex() success = 1 except Exception as inst: diff --git a/tests/test-hgweb-commands.t b/tests/test-hgweb-commands.t --- a/tests/test-hgweb-commands.t +++ b/tests/test-hgweb-commands.t @@ -1898,17 +1898,17 @@ raw graph capabilities $ get-with-headers.py 127.0.0.1:$HGPORT '?cmd=capabilities'; echo 200 Script output follows - lookup changegroupsubset branchmap pushkey known getbundle unbundlehash batch bundle2=HG20%0Achangegroup%3D01%2C02%0Adigests%3Dmd5%2Csha1%2Csha512%0Aerror%3Dabort%2Cunsupportedcontent%2Cpushraced%2Cpushkey%0Ahgtagsfnodes%0Alistkeys%0Apushkey%0Aremote-changegroup%3Dhttp%2Chttps unbundle=HG10GZ,HG10BZ,HG10UN httpheader=1024 + lookup changegroupsubset branchmap pushkey known getbundle unbundlehash batch listkeys2 bundle2=HG20%0Achangegroup%3D01%2C02%0Adigests%3Dmd5%2Csha1%2Csha512%0Aerror%3Dabort%2Cunsupportedcontent%2Cpushraced%2Cpushkey%0Ahgtagsfnodes%0Alistkeys%0Apushkey%0Aremote-changegroup%3Dhttp%2Chttps unbundle=HG10GZ,HG10BZ,HG10UN httpheader=1024 heads $ get-with-headers.py 127.0.0.1:$HGPORT '?cmd=heads' 200 Script output follows cad8025a2e87f88c06259790adfa15acb4080123 @@ -2141,16 +2141,17 @@ capabilities lookup changegroupsubset branchmap pushkey known getbundle unbundlehash batch + listkeys2 stream-preferred streamreqs=generaldelta,revlogv1 bundle2=HG20%0Achangegroup%3D01%2C02%0Adigests%3Dmd5%2Csha1%2Csha512%0Aerror%3Dabort%2Cunsupportedcontent%2Cpushraced%2Cpushkey%0Ahgtagsfnodes%0Alistkeys%0Apushkey%0Aremote-changegroup%3Dhttp%2Chttps unbundle=HG10GZ,HG10BZ,HG10UN httpheader=1024 heads diff --git a/tests/test-ssh-bundle1.t b/tests/test-ssh-bundle1.t --- a/tests/test-ssh-bundle1.t +++ b/tests/test-ssh-bundle1.t @@ -457,18 +457,18 @@ stderr from remote commands should be pr debug output $ hg pull --debug ssh://user@dummy/remote pulling from ssh://user@dummy/remote running python ".*/dummyssh" user@dummy ('|")hg -R remote serve --stdi
[PATCH 6 of 9 RFC] localrepo: support retrieving raw pushkeys
# HG changeset patch # User Gregory Szorc # Date 1471207051 25200 # Sun Aug 14 13:37:31 2016 -0700 # Node ID 1fe812eb8b9e79d1182c4a6593e7ce8fa2938264 # Parent a94e1b0bb3ba0f6b0a3e62d5daece6a5f9ff295c localrepo: support retrieving raw pushkeys Now that we have pushkey support for retrieving raw values, wire support for calling it into repo instances. diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py --- a/mercurial/localrepo.py +++ b/mercurial/localrepo.py @@ -200,18 +200,18 @@ class localpeer(peer.peerrepository): return self._repo.lock() def addchangegroup(self, cg, source, url): return cg.apply(self._repo, source, url) def pushkey(self, namespace, key, old, new): return self._repo.pushkey(namespace, key, old, new) -def listkeys(self, namespace): -return self._repo.listkeys(namespace) +def listkeys(self, namespace, raw=False): +return self._repo.listkeys(namespace, raw=raw) def debugwireargs(self, one, two, three=None, four=None, five=None): '''used to test argument passing over the wire''' return "%s %s %s %s %s" % (one, two, three, four, five) class locallegacypeer(localpeer): '''peer extension which implements legacy methods too; used for tests with restricted capabilities''' @@ -1902,21 +1902,28 @@ class localrepository(object): self.ui.debug('pushing key for "%s:%s"\n' % (namespace, key)) ret = pushkey.push(self, namespace, key, old, new) def runhook(): self.hook('pushkey', namespace=namespace, key=key, old=old, new=new, ret=ret) self._afterlock(runhook) return ret -def listkeys(self, namespace): +def listkeys(self, namespace, raw=False): self.hook('prelistkeys', throw=True, namespace=namespace) self.ui.debug('listing keys for "%s"\n' % namespace) values = pushkey.list(self, namespace) self.hook('listkeys', namespace=namespace, values=values) + +# Hooks above always receive the non-raw values for BC reasons. +# Unfortunately for performance, this means we have to obtain +# both the raw and non-raw forms if raw data is requested. +if raw: +values = pushkey.listraw(self, namespace) + return values def debugwireargs(self, one, two, three=None, four=None, five=None): '''used to test argument passing over the wire''' return "%s %s %s %s %s" % (one, two, three, four, five) def savecommitmessage(self, text): fp = self.vfs('last-message.txt', 'wb') diff --git a/tests/test-obsolete.t b/tests/test-obsolete.t --- a/tests/test-obsolete.t +++ b/tests/test-obsolete.t @@ -12,19 +12,19 @@ > } $ getid() { >hg log -T "{node}\n" --hidden -r "desc('$1')" > } $ cat > debugkeys.py < def reposetup(ui, repo): > class debugkeysrepo(repo.__class__): - > def listkeys(self, namespace): + > def listkeys(self, namespace, raw=False): > ui.write('listkeys %s\n' % (namespace,)) - > return super(debugkeysrepo, self).listkeys(namespace) + > return super(debugkeysrepo, self).listkeys(namespace, raw=raw) > > if repo.local(): > repo.__class__ = debugkeysrepo > EOF $ hg init tmpa $ cd tmpa $ mkcommit kill_me ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 3 of 9 RFC] phases: add a binary version of listphases
# HG changeset patch # User Gregory Szorc # Date 1471204404 25200 # Sun Aug 14 12:53:24 2016 -0700 # Node ID cab15fd54ee60cc067920454544586cc4aa03d8e # Parent d06d34dd880f58ecef32d96baec16508497d5639 phases: add a binary version of listphases More preparation for adding a binary listkeys wire protocol command. diff --git a/mercurial/phases.py b/mercurial/phases.py --- a/mercurial/phases.py +++ b/mercurial/phases.py @@ -369,22 +369,28 @@ def retractboundary(repo, tr, targetphas This function move boundary *backward* this means that all nodes are set in the target phase or kept in a *higher* phase. Simplify boundary to contains phase roots only.""" phcache = repo._phasecache.copy() phcache.retractboundary(repo, tr, targetphase, nodes) repo._phasecache.replace(phcache) -def listphases(repo): -"""List phases root for serialization over pushkey""" +def listphasesraw(repo): +"""Obtain phases pushkey keys. + +Keys are raw binary nodes. Values are '1' to indicate a draft root. + +The special key ``publishing`` with value of ``True`` indicates that the +repo is publishing. +""" keys = {} value = '%i' % draft for root in repo._phasecache.phaseroots[draft]: -keys[hex(root)] = value +keys[root] = value if repo.publishing(): # Add an extra data to let remote know we are a publishing # repo. Publishing repo can't just pretend they are old repo. # When pushing to a publishing repo, the client still need to # push phase boundary # # Push do not only push changeset. It also push phase data. @@ -396,16 +402,26 @@ def listphases(repo): # 3) repo B push to repo A. X is not pushed but the data that #X as now public should # # The server can't handle it on it's own as it has no idea of # client phase data. keys['publishing'] = 'True' return keys +def listphases(repo): +"""List phases root for serialization over pushkey""" +d = {} +for k, v in listphasesraw(repo).iteritems(): +if k != 'publishing': +k = hex(k) +d[k] = v + +return d + def pushphase(repo, nhex, oldphasestr, newphasestr): """List phases root for serialization over pushkey""" repo = repo.unfiltered() with repo.lock(): currentphase = repo[nhex].phase() newphase = abs(int(newphasestr)) # let's avoid negative index surprise oldphase = abs(int(oldphasestr)) # let's avoid negative index surprise if currentphase == oldphase and newphase < oldphase: ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 4 of 9 RFC] obsolete: add binary version of listmarkers (API)
# HG changeset patch # User Gregory Szorc # Date 1471206127 25200 # Sun Aug 14 13:22:07 2016 -0700 # Node ID 67501b93c9875cc6cdef0e77d2d148a14a5d60a8 # Parent cab15fd54ee60cc067920454544586cc4aa03d8e obsolete: add binary version of listmarkers (API) Continuing our work to implement binary/raw versions of listkeys functions. We had to update a caller in exchange.py because we stopped returning base85 encoded data from obsolete._pushkeyescape. diff --git a/mercurial/exchange.py b/mercurial/exchange.py --- a/mercurial/exchange.py +++ b/mercurial/exchange.py @@ -1014,16 +1014,17 @@ def _pushobsolete(pushop): pushop.stepsdone.add('obsmarkers') if pushop.outobsmarkers: pushop.ui.debug('try to push obsolete markers to remote\n') rslts = [] remotedata = obsolete._pushkeyescape(sorted(pushop.outobsmarkers)) for key in sorted(remotedata, reverse=True): # reverse sort to ensure we end with dump0 data = remotedata[key] +data = base85.b85encode(data) rslts.append(remote.pushkey('obsolete', key, '', data)) if [r for r in rslts if not r]: msg = _('failed to push some obsolete markers!\n') repo.ui.warn(msg) def _pushbookmark(pushop): """Update bookmark position on remote""" if pushop.cgresult == 0 or 'bookmarks' in pushop.stepsdone: diff --git a/mercurial/obsolete.py b/mercurial/obsolete.py --- a/mercurial/obsolete.py +++ b/mercurial/obsolete.py @@ -722,39 +722,44 @@ def commonversion(versions): # you have to take in account: # - the version header # - the base85 encoding _maxpayload = 5300 def _pushkeyescape(markers): """encode markers into a dict suitable for pushkey exchange -- binary data is base85 encoded - split in chunks smaller than 5300 bytes""" keys = {} parts = [] currentlen = _maxpayload * 2 # ensure we create a new part for marker in markers: nextdata = _fm0encodeonemarker(marker) if (len(nextdata) + currentlen > _maxpayload): currentpart = [] currentlen = 0 parts.append(currentpart) currentpart.append(nextdata) currentlen += len(nextdata) for idx, part in enumerate(reversed(parts)): data = ''.join([_pack('>B', _fm0version)] + part) -keys['dump%i' % idx] = base85.b85encode(data) +keys['dump%i' % idx] = data return keys +def listmarkersraw(repo): +if not repo.obsstore: +return {} +return _pushkeyescape(sorted(repo.obsstore)) + def listmarkers(repo): """List markers over pushkey""" -if not repo.obsstore: -return {} -return _pushkeyescape(sorted(repo.obsstore)) +d = {} +for k, v in listmarkersraw(repo).iteritems(): +d[k] = base85.b85encode(v) +return d def pushmarker(repo, key, old, new): """Push markers over pushkey""" if not key.startswith('dump'): repo.ui.warn(_('unknown key: %r') % key) return 0 if old: repo.ui.warn(_('unexpected old value for %r') % key) ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 1 of 9 RFC] pushkey: reindent and document _namespaces
# HG changeset patch # User Gregory Szorc # Date 1471203456 25200 # Sun Aug 14 12:37:36 2016 -0700 # Node ID 5aff57cb77481737c73a0acd4885917985bfcf63 # Parent 279cd80059d41bbdb91ea9073278cbbe7f1b43d5 pushkey: reindent and document _namespaces In preparation for tweaking it in subsequent commits. diff --git a/mercurial/pushkey.py b/mercurial/pushkey.py --- a/mercurial/pushkey.py +++ b/mercurial/pushkey.py @@ -17,21 +17,24 @@ from . import ( def _nslist(repo): n = {} for k in _namespaces: n[k] = "" if not obsolete.isenabled(repo, obsolete.exchangeopt): n.pop('obsolete') return n -_namespaces = {"namespaces": (lambda *x: False, _nslist), - "bookmarks": (bookmarks.pushbookmark, bookmarks.listbookmarks), - "phases": (phases.pushphase, phases.listphases), - "obsolete": (obsolete.pushmarker, obsolete.listmarkers), - } +# Maps pushkey namespace to a tuple defining functions to call for +# setting and listing entries in the namespace. +_namespaces = { +'namespaces': (lambda *x: False, _nslist), +'bookmarks': (bookmarks.pushbookmark, bookmarks.listbookmarks), +'phases': (phases.pushphase, phases.listphases), +'obsolete': (obsolete.pushmarker, obsolete.listmarkers), +} def register(namespace, pushkey, listkeys): _namespaces[namespace] = (pushkey, listkeys) def _get(namespace): return _namespaces.get(namespace, (lambda *x: False, lambda *x: {})) def push(repo, namespace, key, old, new): ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 2 of 9 RFC] bookmarks: add function for retrieving raw bookmarks values
# HG changeset patch # User Gregory Szorc # Date 1471204082 25200 # Sun Aug 14 12:48:02 2016 -0700 # Node ID d06d34dd880f58ecef32d96baec16508497d5639 # Parent 5aff57cb77481737c73a0acd4885917985bfcf63 bookmarks: add function for retrieving raw bookmarks values In preparation for adding a binary listkeys wire protocol command, we need to introduce functions for each pushkey namespace that return raw, binary values minus any unncessary encoding. We start with the bookmarks namespace. listbookmarks() has effectively been copied to listbookmarksraw() and changed so keys are str/bytes (not localstr) and values are raw node values instead of hex encoded. listbookmarks() has been re-implemented in terms of listbookmarksraw(). diff --git a/mercurial/bookmarks.py b/mercurial/bookmarks.py --- a/mercurial/bookmarks.py +++ b/mercurial/bookmarks.py @@ -279,27 +279,42 @@ def update(repo, parents, node): lock = repo.lock() tr = repo.transaction('bookmark') marks.recordchange(tr) tr.close() finally: lockmod.release(tr, lock) return update -def listbookmarks(repo): +def listbookmarksraw(repo): +"""Obtain a dict of bookmarks to be used by the pushkey protocol. + +Keys and values are bytes/binary, not encoded. +""" # We may try to list bookmarks on a repo type that does not # support it (e.g., statichttprepository). marks = getattr(repo, '_bookmarks', {}) d = {} hasnode = repo.changelog.hasnode for k, v in marks.iteritems(): # don't expose local divergent bookmarks if hasnode(v) and ('@' not in k or k.endswith('@')): -d[k] = hex(v) +d[encoding.fromlocal(k)] = v + +return d + +def listbookmarks(repo): +"""Obtain a dict of bookmarks to be used by the pushkey protocol. + +Keys are localstr. Values are hex encoded. +""" +d = {} +for k, v in listbookmarksraw(repo).iteritems(): +d[encoding.tolocal(k)] = hex(v) return d def pushbookmark(repo, key, old, new): w = l = tr = None try: w = repo.wlock() l = repo.lock() tr = repo.transaction('bookmarks') ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 3 of 3] listkeypattern: add listkeypattern wireproto method
On Fri, Aug 12, 2016 at 5:09 AM, Stanislau Hlebik wrote: > # HG changeset patch > # User Stanislau Hlebik > # Date 1470999441 25200 > # Fri Aug 12 03:57:21 2016 -0700 > # Node ID c2ee493e216c60ff439ab93cc1efe6ac5922d8eb > # Parent fd2185d7c2f7aa529b2ad0a6584832fb2b1b4ecb > listkeypattern: add listkeypattern wireproto method > > wireproto method to list remote keys by pattern > > diff --git a/mercurial/wireproto.py b/mercurial/wireproto.py > --- a/mercurial/wireproto.py > +++ b/mercurial/wireproto.py > @@ -353,6 +353,22 @@ >% (namespace, len(d))) > yield pushkeymod.decodekeys(d) > > +@batchable > +def listkeypattern(self, namespace, patterns): > +if not self.capable('pushkey'): > +yield {}, None > +f = future() > +self.ui.debug('preparing listkeys for "%s" with pattern "%s"\n' % > + (namespace, patterns)) > +yield { > +'namespace': encoding.fromlocal(namespace), > +'patterns': encodelist(patterns) > +}, f > +d = f.value > +self.ui.debug('received listkey for "%s": %i bytes\n' > + % (namespace, len(d))) > +yield pushkeymod.decodekeys(d) > + > def stream_out(self): > return self._callstream('stream_out') > > @@ -676,7 +692,8 @@ > return repo.opener.tryread('clonebundles.manifest') > > wireprotocaps = ['lookup', 'changegroupsubset', 'branchmap', 'pushkey', > - 'known', 'getbundle', 'unbundlehash', 'batch'] > + 'known', 'getbundle', 'unbundlehash', 'batch', > + 'listkeypattern'] > > def _capabilities(repo, proto): > """return a list of capabilities for a repo > @@ -791,6 +808,12 @@ > d = repo.listkeys(encoding.tolocal(namespace)).items() > return pushkeymod.encodekeys(d) > > +@wireprotocommand('listkeypattern', 'namespace patterns *') > Why the "*" here? "others" is not used in the function implementation. > +def listkeypattern(repo, proto, namespace, patterns, others): > +patterns = decodelist(patterns) > +d = repo.listkeys(encoding.tolocal(namespace), > patterns=patterns).items() > +return pushkeymod.encodekeys(d) > + > @wireprotocommand('lookup', 'key') > def lookup(repo, proto, key): > try: > I think introducing a new wire protocol command is the correct way to solve this problem (as opposed to introducing a new argument on the existing command). However, if we're introducing a new wire protocol command for obtaining pushkey values, I think we should improve deficiencies in the response encoding rather than propagate its problems. The "listkeys" response encoding can't transmit the full range of binary values. This can lead to larger (and slower) responses sizes. For example, a number of pushkey namespaces exchange lists of nodes. These have to be represented as hex instead of binary. For pushkey namespaces like phases or obsolescence that can exchange hundreds or thousands of nodes, the overhead can add up. I think the response from a new listkeys command should be using framing to encode the key names and values so the full range of binary values can be efficiently transferred. We may also want a special mechanism to represent a list of nodes, as avoiding the overhead of framing on fixed width values would be desirable. Of course, at the point you introduce a new response encoding, we may want to call the command "listkeys2." If others agree, I can code up an implementation and you can add the patterns functionality on top of it. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 1 of 2] vfs: add the possibility to have a "ward" to check vfs usage
On 08/14/2016 04:48 PM, Pierre-Yves David wrote: […] For atomictmp, we probably want to channel their usage through as vfs to be able to cover them. I mean atomictemp=True is handled before calling ward(). And I think atomic operation should be allowed without taking a lock, but ward() should be called no matter if operation is atomic or not. Ha, good catch. yeah we should convey the value of atomictemp to the ward. So fixing the atomictemp case raise a whole new herd of warning. I'll have to have a closer look at what they are but we could come with many exceptions or rethink the approach a bit. Cheers. -- Pierre-Yves David ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 5 of 5 RFC] osutil: switch to placeholder module that imports cpy/pure selectively
On Sat, Aug 13, 2016 at 3:15 AM, Yuya Nishihara wrote: > # HG changeset patch > # User Yuya Nishihara > # Date 1470969317 -32400 > # Fri Aug 12 11:35:17 2016 +0900 > # Node ID e1318b322922baf91a146d92b1ee21cc0e509c04 > # Parent 3d8b3e09190aaacc3c57ae37b265838df4accb1a > osutil: switch to placeholder module that imports cpy/pure selectively > > You have to do "make clean" to test this change. Otherwise, the existing > osutil.so would be loaded directly. Also, you need "make clean" to go back > to previous revisions. > > Nit: should we move osutil.c to mercurial/cpy? > I like the spirit of this series to establish some more order around the various module implementations. I'm pretty sure mpm won't like the `make clean` requirement, as he likes the ability to bisect without having to worry about build system foo. I haven't been paying close attention to the cpy/pure/cffi discussions. I know there is a common pattern in other Python projects of having modules define the pure Python code inline then have code at the bottom of the module/file that imports C extensions/libraries and overwrites the Python bits. So e.g. we could move mercurial/pure/osutil.py to mercurial/osutil.py then at the bottom of the file do something like: if modulepolicy != 'py': globals().update(policy.uimportvars(u'osutil')) Or we could potentially inline the cffi bits without having to a) maintain a separate file/module or b) abstract the import mechanism for modules with C implementations. In other words, we could add C implementations to any module without having to tell the module import mechanism about which modules contain C implementations. We would likely still need this policy.importvars() trick for C extensions, since C extensions need to live in their own module. But at least we wouldn't have an extra module for cffi. > > diff --git a/contrib/check-py3-compat.py b/contrib/check-py3-compat.py > --- a/contrib/check-py3-compat.py > +++ b/contrib/check-py3-compat.py > @@ -55,7 +55,9 @@ def check_compat_py3(f): > # out module paths for things not in a package can be confusing. > if f.startswith(('hgext/', 'mercurial/')) and not > f.endswith('__init__.py'): > assert f.endswith('.py') > -name = f.replace('/', '.')[:-3].replace('.pure.', '.') > +name = f.replace('/', '.')[:-3] > +if not f.endswith('osutil.py'): > +name = name.replace('.pure.', '.') > with open(f, 'r') as fh: > try: > imp.load_module(name, fh, '', ('py', 'r', imp.PY_SOURCE)) > diff --git a/contrib/import-checker.py b/contrib/import-checker.py > --- a/contrib/import-checker.py > +++ b/contrib/import-checker.py > @@ -688,7 +688,8 @@ def main(argv): > used_imports = {} > any_errors = False > for source_path in argv[1:]: > -modname = dotted_name_of_path(source_path, trimpure=True) > +trimpure = not source_path.endswith('osutil.py') > +modname = dotted_name_of_path(source_path, trimpure=trimpure) > localmods[modname] = source_path > for localmodname, source_path in sorted(localmods.items()): > for src, modname, name, line in sources(source_path, > localmodname): > diff --git a/mercurial/__init__.py b/mercurial/__init__.py > --- a/mercurial/__init__.py > +++ b/mercurial/__init__.py > @@ -27,7 +27,6 @@ modulepolicy = policy.policy > 'mercurial.bdiff', > 'mercurial.diffhelpers', > 'mercurial.mpatch', > -'mercurial.osutil', > 'mercurial.parsers', > ]) > > diff --git a/mercurial/osutil.py b/mercurial/osutil.py > new file mode 100644 > --- /dev/null > +++ b/mercurial/osutil.py > @@ -0,0 +1,11 @@ > +# osutil.py - native operating system services > +# > +# Copyright 2007 Matt Mackall and others > +# > +# This software may be used and distributed according to the terms of the > +# GNU General Public License version 2 or any later version. > + > +from __future__ import absolute_import > + > +from . import policy > +globals().update(policy.uimportvars(u'osutil')) > diff --git a/mercurial/pure/osutil.py b/mercurial/pure/osutil.py > --- a/mercurial/pure/osutil.py > +++ b/mercurial/pure/osutil.py > @@ -14,7 +14,7 @@ import socket > import stat as statmod > import sys > > -from . import policy > +from .. import policy > modulepolicy = policy.policy > policynocffi = policy.policynocffi > > diff --git a/setup.py b/setup.py > --- a/setup.py > +++ b/setup.py > @@ -577,7 +577,7 @@ extmodules = [ > 'mercurial/pathencode.c'], >include_dirs=common_include_dirs, >depends=common_depends), > -Extension('mercurial.osutil', ['mercurial/osutil.c'], > +Extension('mercurial.cpy.osutil', ['mercurial/osutil.c'], >include_dirs=common_include_dirs, >extra_link_args=osutil_ldflags, >depends=common_depends), > ___ > Mercurial-devel mailing
Re: [PATCH 3 of 5 RFC] policy: add helper to import cpy/pure module selectively
On Sat, Aug 13, 2016 at 3:15 AM, Yuya Nishihara wrote: > # HG changeset patch > # User Yuya Nishihara > # Date 1470969017 -32400 > # Fri Aug 12 11:30:17 2016 +0900 > # Node ID 8bb0a9223bce791bae9e6a07238d6b9722e5577a > # Parent bda4422fc315b54356ba1949e5ba861dd2660e86 > policy: add helper to import cpy/pure module selectively > > Unfortunately, __import__() nor getattr() does not take bytes, and > globals() > is a dict keyed by unicodes on Python 3. uimportvars() is a unicode API to > comply with their requirement. I really hated mixing bytes and unicodes, > but > bytes version of importvars() would be useless. > > Nit: KeyError would be raised for invalid HGMODULEPOLICY. What should we > do in that case? > We have precedence for this in the `hg` script: we write an untranslated abort message to sys.stderr and call sys.exit. Ideally we would translate that message. But without a module load policy, that gets difficult. We would probably ideally change the policy to "pure" (since that will always work), load i18n.py, then do a normal abort. But to do that may require moving some policy loading code into __init__.py. > > diff --git a/mercurial/policy.py b/mercurial/policy.py > --- a/mercurial/policy.py > +++ b/mercurial/policy.py > @@ -23,6 +23,14 @@ policy = 'c' > policynoc = ('cffi', 'cffi-allow', 'py') > policynocffi = ('c', 'py') > > +_upackagetable = { > +'c': [u'cpy'], > +'allow': [u'cpy', u'pure'], > +'cffi': [u'pure'], # TODO: [u'cffi'] > +'cffi-allow': [u'pure'], # TODO: [u'cffi', u'pure'] > +'py': [u'pure'], > +} > + > try: > from . import __modulepolicy__ > policy = __modulepolicy__.modulepolicy > @@ -43,3 +51,24 @@ if sys.version_info[0] >= 3: > > # Environment variable can always force settings. > policy = os.environ.get('HGMODULEPOLICY', policy) > + > +def _uexportedvars(mod): > +"""Build dict of public names to simulate 'from mod import *'""" > +modvars = vars(mod) > +if u'__all__' in modvars: > +allnames = modvars[u'__all__'] > +return dict((k, v) for k, v in modvars.items() if k in allnames) > +return dict((k, v) for k, v in modvars.items() if not > k.startswith(u'_')) > + > +def uimportvars(uname): > +"""Import module according to policy and return exported vars""" > +upkgnames = _upackagetable[policy] > +for upkg in upkgnames: > +try: > +# pass globals() to resolve name relative to this module > +mod = __import__(u'%s.%s' % (upkg, uname), globals(), level=1) > +mod = getattr(mod, uname) > +return _uexportedvars(mod) > +except ImportError: > +if upkg == upkgnames[-1]: > +raise > The "u" prefixing feels a bit weird. Your commit message says the bytes version of "importvars" would be useless. So why the prefix? ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 4 of 4 RFC] pycompat: patch (del|get|has|set)attr to accept bytes on Python 3
On Sun, Aug 14, 2016 at 2:33 AM, Yuya Nishihara wrote: > # HG changeset patch > # User Yuya Nishihara > # Date 1471146681 -32400 > # Sun Aug 14 12:51:21 2016 +0900 > # Node ID 2a300f882c077fa014046db96b545f6aff771d32 > # Parent d53bd633898ec6edafccea9f798d314699faf90a > pycompat: patch (del|get|has|set)attr to accept bytes on Python 3 > > getattr() and setattr() are widely used in our code. We wouldn't probably > want to rewrite every single call of getattr/setattr. delattr() and > hasattr() > aren't that important, but they are functions of the same kind. > > Another option would be util.(get|set)attr. > I'm generally opposed to modifying the behavior of stdlib functionality because, well, it's modifying the stdlib. Extensions may load packages/modules that rely on the default behavior, perhaps in subtle ways. I would prefer proxy functions (possibly in util.py) over modifying the standard library. > > diff --git a/mercurial/pycompat.py b/mercurial/pycompat.py > --- a/mercurial/pycompat.py > +++ b/mercurial/pycompat.py > @@ -31,8 +31,27 @@ else: > > if sys.version_info[0] >= 3: > import builtins > +import functools > builtins.xrange = range > > +def _wrapattrfunc(f): > +try: > +# avoid useless wrapping on reload > +f.__wrapped__ > +return f > +except AttributeError: > +pass > +@functools.wraps(f) > +def w(object, name, *args): > +if isinstance(name, bytes): > +name = name.decode('utf-8') > +return f(object, name, *args) > +return w > +builtins.delattr = _wrapattrfunc(builtins.delattr) > +builtins.getattr = _wrapattrfunc(builtins.getattr) > +builtins.hasattr = _wrapattrfunc(builtins.hasattr) > +builtins.setattr = _wrapattrfunc(builtins.setattr) > + > stringio = io.StringIO > empty = _queue.Empty > queue = _queue.Queue > diff --git a/tests/test-check-py3-compat.t b/tests/test-check-py3-compat.t > --- a/tests/test-check-py3-compat.t > +++ b/tests/test-check-py3-compat.t > @@ -122,48 +122,48 @@ >mercurial/hook.py: error importing: str expected, not bytes > (error at i18n.py:*) (glob) >mercurial/httpconnection.py: error importing: str expected, > not bytes (error at i18n.py:*) (glob) >mercurial/httppeer.py: error importing: str expected, not > bytes (error at i18n.py:*) (glob) > - mercurial/keepalive.py: error importing: getattr(): > attribute name must be string (error at util.py:*) (glob) > - mercurial/localrepo.py: error importing: getattr(): > attribute name must be string (error at util.py:*) (glob) > - mercurial/lock.py: error importing: getattr(): attribute > name must be string (error at util.py:*) (glob) > - mercurial/mail.py: error importing: getattr(): attribute > name must be string (error at util.py:*) (glob) > - mercurial/manifest.py: error importing: getattr(): > attribute name must be string (error at util.py:*) (glob) > - mercurial/match.py: error importing: getattr(): attribute > name must be string (error at util.py:*) (glob) > - mercurial/mdiff.py: error importing: getattr(): attribute > name must be string (error at util.py:*) (glob) > - mercurial/merge.py: error importing: getattr(): attribute > name must be string (error at util.py:*) (glob) > - mercurial/minirst.py: error importing: getattr(): attribute > name must be string (error at util.py:*) (glob) > - mercurial/namespaces.py: error importing: getattr(): > attribute name must be string (error at util.py:*) (glob) > - mercurial/obsolete.py: error importing: getattr(): > attribute name must be string (error at util.py:*) (glob) > - mercurial/patch.py: error importing: getattr(): attribute > name must be string (error at util.py:*) (glob) > - mercurial/pathutil.py: error importing: getattr(): > attribute name must be string (error at util.py:*) (glob) > - mercurial/peer.py: error importing: getattr(): attribute > name must be string (error at util.py:*) (glob) > - mercurial/pushkey.py: error importing: getattr(): attribute > name must be string (error at util.py:*) (glob) > - mercurial/pvec.py: error importing: getattr(): attribute > name must be string (error at util.py:*) (glob) > - mercurial/registrar.py: error importing: getattr(): > attribute name must be string (error at util.py:*) (glob) > - mercurial/repair.py: error importing: getattr(): attribute > name must be string (error at util.py:*) (glob) > - mercurial/repoview.py: error importing: getattr(): > attribute name must be string (error at util.py:*) (glob) > - mercurial/revlog.py: error importing: getattr(): attribute > name must be string (error at util.py:*) (glob) > - mercurial/revset.py: error importing: getattr(): attribute > name must be string (error at util.py:*) (glob) > - mercurial/scmutil.py: error importing: getattr(): attribute > name must be string (error at util.py:*) (glob) > - mercurial/scmwindows.py: error importing: geta
Re: [PATCH 1 of 4 RFC] pycompat: delay loading modules registered to stub
On Sun, Aug 14, 2016 at 2:33 AM, Yuya Nishihara wrote: > # HG changeset patch > # User Yuya Nishihara > # Date 1471153584 -32400 > # Sun Aug 14 14:46:24 2016 +0900 > # Node ID 131cc484d7fe95d07f677c16dbb0f506847e1d38 > # Parent 0cb2d4db308b97e8fe7faa8d45a47b228037f230 > pycompat: delay loading modules registered to stub > > New _pycompatstub is compatible with our demandimporter. try-except is > replaced by version comparison because ImportError will no longer be raised > immediately. > Nit: I think you mean "incompatible." This patch looks good otherwise. > > diff --git a/mercurial/pycompat.py b/mercurial/pycompat.py > --- a/mercurial/pycompat.py > +++ b/mercurial/pycompat.py > @@ -34,30 +34,32 @@ empty = _queue.Empty > queue = _queue.Queue > > class _pycompatstub(object): > -pass > - > -def _alias(alias, origin, items): > -""" populate a _pycompatstub > +def __init__(self): > +self._aliases = {} > > -copies items from origin to alias > -""" > -for item in items: > +def _registeraliases(self, origin, items): > +"""Add items that will be populated at the first access""" > +self._aliases.update((item.replace('_', '').lower(), (origin, > item)) > + for item in items) > + > +def __getattr__(self, name): > try: > -lcase = item.replace('_', '').lower() > -setattr(alias, lcase, getattr(origin, item)) > -except AttributeError: > -pass > +origin, item = self._aliases[name] > +except KeyError: > +raise AttributeError(name) > +self.__dict__[name] = obj = getattr(origin, item) > +return obj > > httpserver = _pycompatstub() > urlreq = _pycompatstub() > urlerr = _pycompatstub() > -try: > +if sys.version_info[0] < 3: > import BaseHTTPServer > import CGIHTTPServer > import SimpleHTTPServer > import urllib2 > import urllib > -_alias(urlreq, urllib, ( > +urlreq._registeraliases(urllib, ( > "addclosehook", > "addinfourl", > "ftpwrapper", > @@ -71,7 +73,7 @@ try: > "url2pathname", > "urlencode", > )) > -_alias(urlreq, urllib2, ( > +urlreq._registeraliases(urllib2, ( > "AbstractHTTPHandler", > "BaseHandler", > "build_opener", > @@ -87,24 +89,24 @@ try: > "Request", > "urlopen", > )) > -_alias(urlerr, urllib2, ( > +urlerr._registeraliases(urllib2, ( > "HTTPError", > "URLError", > )) > -_alias(httpserver, BaseHTTPServer, ( > +httpserver._registeraliases(BaseHTTPServer, ( > "HTTPServer", > "BaseHTTPRequestHandler", > )) > -_alias(httpserver, SimpleHTTPServer, ( > +httpserver._registeraliases(SimpleHTTPServer, ( > "SimpleHTTPRequestHandler", > )) > -_alias(httpserver, CGIHTTPServer, ( > +httpserver._registeraliases(CGIHTTPServer, ( > "CGIHTTPRequestHandler", > )) > > -except ImportError: > +else: > import urllib.request > -_alias(urlreq, urllib.request, ( > +urlreq._registeraliases(urllib.request, ( > "AbstractHTTPHandler", > "addclosehook", > "addinfourl", > @@ -132,12 +134,12 @@ except ImportError: > "urlopen", > )) > import urllib.error > -_alias(urlerr, urllib.error, ( > +urlerr._registeraliases(urllib.error, ( > "HTTPError", > "URLError", > )) > import http.server > -_alias(httpserver, http.server, ( > +httpserver._registeraliases(http.server, ( > "HTTPServer", > "BaseHTTPRequestHandler", > "SimpleHTTPRequestHandler", > diff --git a/tests/test-check-py3-compat.t b/tests/test-check-py3-compat.t > --- a/tests/test-check-py3-compat.t > +++ b/tests/test-check-py3-compat.t > @@ -122,24 +122,22 @@ >mercurial/hook.py: error importing: str expected, not bytes > (error at i18n.py:*) (glob) >mercurial/httpconnection.py: error importing: str expected, > not bytes (error at i18n.py:*) (glob) >mercurial/httppeer.py: error importing: str expected, not > bytes (error at i18n.py:*) (glob) > - mercurial/keepalive.py: error importing: getattr(): > attribute name must be string (error at pycompat.py:*) (glob) > - mercurial/localrepo.py: error importing: getattr(): > attribute name must be string (error at pycompat.py:*) (glob) > - mercurial/lock.py: error importing: getattr(): attribute > name must be string (error at pycompat.py:*) (glob) > - mercurial/mail.py: error importing: getattr(): attribute > name must be string (error at pycompat.py:*) (glob) > - mercurial/manifest.py: error importing: getattr(): > attribute name must be string (error at pycompat.py:*) (glob) > - mercurial/match.py: error importing: getattr(): attribute > name must be string (error at pycompat.py:*) (glob) > - mercurial/mdiff.py: error importing: getattr(): attribute > n
Re: [PATCH 3 of 5] outgoing: add a 'missingroots' argument
On Thu, Aug 11, 2016 at 1:06 PM, Pierre-Yves David < pierre-yves.da...@ens-lyon.org> wrote: > # HG changeset patch > # User Pierre-Yves David > # Date 1470774698 -7200 > # Tue Aug 09 22:31:38 2016 +0200 > # Node ID 8c4fcb1244bdf79caadadc73fc1e489a160a07ec > # Parent 9ff7059253fd00094799f592462590cd837fb457 > # EXP-Topic outgoing > outgoing: add a 'missingroots' argument > > This argument can be used instead of 'commonheads' to determine the > 'outgoing' > set. We remove the outgoingbetween function as its role can now be handled > by > 'outgoing' itself. > > I've thought of using an external function instead of making the > constructor > more complicated. However, there is low hanging fruit to improve the > current > code flow by storing some side products of the processing of > 'missingroots'. So > in my opinion it make senses to add all this to the class. > > diff -r 9ff7059253fd -r 8c4fcb1244bd mercurial/changegroup.py > --- a/mercurial/changegroup.py Tue Aug 09 15:55:44 2016 +0200 > +++ b/mercurial/changegroup.py Tue Aug 09 22:31:38 2016 +0200 > @@ -943,7 +943,7 @@ def changegroupsubset(repo, roots, heads > Another wrinkle is doing the reverse, figuring out which changeset in > the changegroup a particular filenode or manifestnode belongs to. > """ > -outgoing = discovery.outgoingbetween(repo, roots, heads) > +outgoing = discovery.outgoing(repo, missingroots=roots, > missingheads=heads) > bundler = getbundler(version, repo) > return getsubset(repo, outgoing, bundler, source) > > diff -r 9ff7059253fd -r 8c4fcb1244bd mercurial/discovery.py > --- a/mercurial/discovery.pyTue Aug 09 15:55:44 2016 +0200 > +++ b/mercurial/discovery.pyTue Aug 09 22:31:38 2016 +0200 > @@ -76,11 +76,25 @@ class outgoing(object): > The sets are computed on demand from the heads, unless provided > upfront > by discovery.''' > > -def __init__(self, repo, commonheads=None, missingheads=None): > +def __init__(self, repo, commonheads=None, missingheads=None, > + missingroots=None): > +# at least one of them must not be set > +assert None in (commonheads, missingroots) > cl = repo.changelog > if not missingheads: > missingheads = cl.heads() > -if not commonheads: > +if missingroots: > +discbases = [] > +for n in missingroots: > +discbases.extend([p for p in cl.parents(n) if p != > nullid]) > +# TODO remove call to nodesbetween. > +# TODO populate attributes on outgoing instance instead of > setting > +# discbases. > +csets, roots, heads = cl.nodesbetween(missingroots, > missingheads) > +included = set(csets) > +missingheads = heads > +commonheads = [n for n in discbases if n not in included] > +elif not commonheads: > commonheads = [nullid] > self.commonheads = commonheads > self.missingheads = missingheads > @@ -106,27 +120,6 @@ class outgoing(object): > self._computecommonmissing() > return self._missing > > -def outgoingbetween(repo, roots, heads): > -"""create an ``outgoing`` consisting of nodes between roots and heads > - > -The ``missing`` nodes will be descendants of any of the ``roots`` and > -ancestors of any of the ``heads``, both are which are defined as a > list > -of binary nodes. > -""" > -cl = repo.changelog > -if not roots: > -roots = [nullid] > -discbases = [] > -for n in roots: > -discbases.extend([p for p in cl.parents(n) if p != nullid]) > -# TODO remove call to nodesbetween. > -# TODO populate attributes on outgoing instance instead of setting > -# discbases. > -csets, roots, heads = cl.nodesbetween(roots, heads) > -included = set(csets) > -discbases = [n for n in discbases if n not in included] > -return outgoing(repo, discbases, heads) > - > def findcommonoutgoing(repo, other, onlyheads=None, force=False, > commoninc=None, portable=False): > '''Return an outgoing instance to identify the nodes present in repo > but > I'm not thrilled about the complexity of outgoing.__init__. Personally, I'd rather have a "dumb" container object that is populated by N specialized functions (like "outgoingbetween"). But that's just my style. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 5 of 5] computeoutgoing: move the function from 'changegroup' to 'exchange'
On Thu, Aug 11, 2016 at 1:06 PM, Pierre-Yves David < pierre-yves.da...@ens-lyon.org> wrote: > # HG changeset patch > # User Pierre-Yves David > # Date 1470755195 -7200 > # Tue Aug 09 17:06:35 2016 +0200 > # Node ID c60098bda68be8901e564445fe157888f317bf5a > # Parent 0a35b949b92ac83e66ddaa706deb8c0f228a045b > # EXP-Topic outgoing > computeoutgoing: move the function from 'changegroup' to 'exchange' > > Now that all users are in exchange, we can safely move the code in the > 'exchange' module. This function is really about processing the argument > of a > 'getbundle' call, so it even makes senses to do so. > > diff -r 0a35b949b92a -r c60098bda68b mercurial/changegroup.py > --- a/mercurial/changegroup.py Tue Aug 09 17:00:38 2016 +0200 > +++ b/mercurial/changegroup.py Tue Aug 09 17:06:35 2016 +0200 > @@ -15,7 +15,6 @@ import weakref > from .i18n import _ > from .node import ( > hex, > -nullid, > nullrev, > short, > ) > @@ -969,25 +968,6 @@ def getlocalchangegroup(repo, source, ou > bundler = getbundler(version, repo, bundlecaps) > return getsubset(repo, outgoing, bundler, source) > > -def computeoutgoing(repo, heads, common): > -"""Computes which revs are outgoing given a set of common > -and a set of heads. > - > -This is a separate function so extensions can have access to > -the logic. > - > -Returns a discovery.outgoing object. > -""" > -cl = repo.changelog > -if common: > -hasnode = cl.hasnode > -common = [n for n in common if hasnode(n)] > -else: > -common = [nullid] > -if not heads: > -heads = cl.heads() > -return discovery.outgoing(repo, common, heads) > - > def getchangegroup(repo, source, outgoing, bundlecaps=None, > version='01'): > """Like changegroupsubset, but returns the set difference between the > diff -r 0a35b949b92a -r c60098bda68b mercurial/exchange.py > --- a/mercurial/exchange.py Tue Aug 09 17:00:38 2016 +0200 > +++ b/mercurial/exchange.py Tue Aug 09 17:06:35 2016 +0200 > @@ -257,6 +257,25 @@ def buildobsmarkerspart(bundler, markers > return bundler.newpart('obsmarkers', data=stream) > return None > > +def _computeoutgoing(repo, heads, common): > +"""Computes which revs are outgoing given a set of common > +and a set of heads. > + > +This is a separate function so extensions can have access to > +the logic. > + > +Returns a discovery.outgoing object. > +""" > +cl = repo.changelog > +if common: > +hasnode = cl.hasnode > +common = [n for n in common if hasnode(n)] > +else: > +common = [nullid] > +if not heads: > +heads = cl.heads() > +return discovery.outgoing(repo, common, heads) > + > def _forcebundle1(op): > """return true if a pull/push must use bundle1 > > @@ -1535,7 +1554,7 @@ def getbundle(repo, source, heads=None, > if kwargs: > raise ValueError(_('unsupported getbundle arguments: %s') > % ', '.join(sorted(kwargs.keys( > -outgoing = changegroup.computeoutgoing(repo, heads, common) > +outgoing = _computeoutgoing(repo, heads, common) > return changegroup.getchangegroup(repo, source, outgoing, >bundlecaps=bundlecaps) > > @@ -1572,7 +1591,7 @@ def _getbundlechangegrouppart(bundler, r > if not cgversions: > raise ValueError(_('no common changegroup version')) > version = max(cgversions) > -outgoing = changegroup.computeoutgoing(repo, heads, common) > +outgoing = _computeoutgoing(repo, heads, common) > cg = changegroup.getlocalchangegroupraw(repo, source, outgoing, > bundlecaps=bundlecaps, > version=version) > @@ -1625,7 +1644,7 @@ def _getbundletagsfnodes(bundler, repo, > if not (kwargs.get('cg', True) and 'hgtagsfnodes' in b2caps): > return > > -outgoing = changegroup.computeoutgoing(repo, heads, common) > +outgoing = _computeoutgoing(repo, heads, common) > > if not outgoing.missingheads: > return > I have a slight preference for this code living in discovery.py since discovery.py seems to be the place where we figure out the difference in changesets between peers. But there is overlap between discovery.py and exchange.py, so it isn't a strong preference. I think the overall series is a good refactor. A lot of this code didn't belong in changegroup.py and contributed to a more complicated than necessary API. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 1 of 2] vfs: add the possibility to have a "ward" to check vfs usage
On 08/14/2016 02:30 PM, Yuya Nishihara wrote: On Sun, 14 Aug 2016 13:47:38 +0200, Pierre-Yves David wrote: On 08/14/2016 05:03 AM, Yuya Nishihara wrote: On Thu, 11 Aug 2016 12:52:24 +0200, Pierre-Yves David wrote: @@ -583,6 +589,8 @@ class vfs(abstractvfs): self._trustnlink = nlink > 1 or util.checknlink(f) if nlink > 1 or not self._trustnlink: util.rename(util.mktempcopy(f), f) +if self._ward is not None: +self._ward(f, mode) fp = util.posixfile(f, mode) Do you have an idea to extend this to the other operations and open of atomictemp files? I doubt it wouldn't be as simple as a callback function. Simple callback should do for the other operation, as they are also performed through vfs. Something like ward('open', f, mode), ward('unlink', f), etc., or wardopen(), wardunlink(), etc. ? yep For atomictmp, we probably want to channel their usage through as vfs to be able to cover them. I mean atomictemp=True is handled before calling ward(). And I think atomic operation should be allowed without taking a lock, but ward() should be called no matter if operation is atomic or not. Ha, good catch. yeah we should convey the value of atomictemp to the ward. -- Pierre-Yves David ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
mercurial@29776: 3 new changesets
3 new changesets in mercurial: http://selenic.com/repo/hg//rev/a7f8939641aa changeset: 29774:a7f8939641aa user:Simon Farnsworth date:Fri Aug 12 06:01:42 2016 -0700 summary: merge: use labels in prompts to the user http://selenic.com/repo/hg//rev/978b907d9b36 changeset: 29775:978b907d9b36 user:Simon Farnsworth date:Fri Aug 12 05:56:40 2016 -0700 summary: merge: always use other, not remote, in user prompts http://selenic.com/repo/hg//rev/279cd80059d4 changeset: 29776:279cd80059d4 bookmark:@ tag: tip user:Maciej Fijalkowski date:Thu Jul 28 14:18:01 2016 +0200 summary: performance: disable workaround for an old bug of Python gc -- Repository URL: http://selenic.com/repo/hg/ ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 1 of 2] vfs: add the possibility to have a "ward" to check vfs usage
On Sun, 14 Aug 2016 13:47:38 +0200, Pierre-Yves David wrote: > On 08/14/2016 05:03 AM, Yuya Nishihara wrote: > > On Thu, 11 Aug 2016 12:52:24 +0200, Pierre-Yves David wrote: > >> @@ -583,6 +589,8 @@ class vfs(abstractvfs): > >> self._trustnlink = nlink > 1 or util.checknlink(f) > >> if nlink > 1 or not self._trustnlink: > >> util.rename(util.mktempcopy(f), f) > >> +if self._ward is not None: > >> +self._ward(f, mode) > >> fp = util.posixfile(f, mode) > > > > Do you have an idea to extend this to the other operations and open of > > atomictemp files? I doubt it wouldn't be as simple as a callback function. > > Simple callback should do for the other operation, as they are also > performed through vfs. Something like ward('open', f, mode), ward('unlink', f), etc., or wardopen(), wardunlink(), etc. ? > For atomictmp, we probably want to channel their usage through as vfs to > be able to cover them. I mean atomictemp=True is handled before calling ward(). And I think atomic operation should be allowed without taking a lock, but ward() should be called no matter if operation is atomic or not. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 2 of 2] repovfs: add a ward to check if locks are properly taken
On 08/14/2016 05:20 AM, Yuya Nishihara wrote: On Thu, 11 Aug 2016 12:52:25 +0200, Pierre-Yves David wrote: # HG changeset patch # User Pierre-Yves David # Date 1470606246 -7200 # Sun Aug 07 23:44:06 2016 +0200 # Node ID dd18fd65bc8322a8bbacf2b165228b843ba849ae # Parent b84b218971cfb131023b7e9550cb112e6195841b # EXP-Topic vfs.ward repovfs: add a ward to check if locks are properly taken We use the 'ward' feature introduced in the previous email to check for locks when accessing file in '.hg' for writting. Another changeset will add a 'ward' for the store vfs (svfs). This 'ward' system have caught an handful of locking issue that has been fixed in previous series. I expect another batch to be caught in third party extensions. We introduce two exceptions from extensions 'blackbox.log' (because a lot of read-only operations add entry to it), and 'last-email.txt' (because 'hg email' is currently a read only operation and there is value to keep it this way). Maybe patchbomb should use atomictemp to write last-email.txt without locking. That's a good idea. Will still need some exception if we get tempfile covered. --- a/mercurial/localrepo.pyThu Aug 04 17:07:46 2016 +0200 +++ b/mercurial/localrepo.pySun Aug 07 23:44:06 2016 +0200 @@ -244,6 +244,16 @@ class localrepository(object): # only functions defined in module of enabled extensions are invoked featuresetupfuncs = set() +# list of prefix for file which can be written without 'wlock' +# Extensions should extend this list when needed +_wlockfreeprefix = set([# We migh consider requiring 'wlock' for the next +# two, but pretty much all the existing code assume +# wlock is not needed so we keep them excluded for +# now. +'hgrc', +'requires', +]) + def __init__(self, baseui, path=None, create=False): self.requirements = set() self.wvfs = scmutil.vfs(path, expandpath=True, realpath=True) @@ -254,11 +264,15 @@ class localrepository(object): self.auditor = pathutil.pathauditor(self.root, self._checknested) self.nofsauditor = pathutil.pathauditor(self.root, self._checknested, realfs=False) -self.vfs = scmutil.vfs(self.path) -self.opener = self.vfs self.baseui = baseui self.ui = baseui.copy() self.ui.copy = baseui.copy # prevent copying repo configuration +_vfsward = None +if (self.ui.configbool('devel', 'all-warnings') or +self.ui.configbool('devel', 'check-locks')): +_vfsward = self._getvfsward() +self.vfs = scmutil.vfs(self.path, ward=_vfsward) +self.opener = self.vfs # A list of callback to shape the phase if no data were found. # Callback are in the form: func(repo, roots) --> processed root. # This list it to be filled by extension during repo setup @@ -357,6 +371,36 @@ class localrepository(object): # generic mapping between names and nodes self.names = namespaces.namespaces() +def _getvfsward(self): +"""build a ward for self.vfs""" +rref = weakref.ref(self) +def checkvfs(f, mode): +repo = rref() +if (repo is None +or not util.safehasattr(repo, '_wlockref') +or not util.safehasattr(repo, '_lockref')): +return +if mode in ('r', 'rb'): +return +assert f.startswith(repo.path) +# truncate name relative to the repository (.hg) +relname = f[len(repo.path) + 1:] +if relname.startswith('journal.'): +# journal is covered by 'lock' +if repo._currentlock(repo._lockref) is None: +repo.ui.develwarn('write with no lock: "%s"' % relname, + stacklevel=2) +elif repo._currentlock(repo._wlockref) is None: +# rest of vfs file are covered by 'wlock' +# +# exclude special files +for prefix in self._wlockfreeprefix: +if relname.startswith(prefix): +return +repo.ui.develwarn('write with no wlock: "%s"' % relname, + stacklevel=2) +return checkvfs This weakref stuff smells like a layering violation. If vfs should be fully covered by a lock, I think the lock should be managed by vfs. I do not think vfs should handled the lock. locks cover multiple vfs and multiple locks can be relevant for a vfs. Also vfs semantic is about file system access, not locking and transaction logic. That is something higher level. The ward is some developer tool that step over the boundary a bit to get its job done,
Re: [PATCH 1 of 2] vfs: add the possibility to have a "ward" to check vfs usage
On 08/14/2016 05:03 AM, Yuya Nishihara wrote: On Thu, 11 Aug 2016 12:52:24 +0200, Pierre-Yves David wrote: # HG changeset patch # User Pierre-Yves David # Date 1470323266 -7200 # Thu Aug 04 17:07:46 2016 +0200 # Node ID b84b218971cfb131023b7e9550cb112e6195841b # Parent 46d9a5bb2fb0a3629328995948f9ddb0ef0ca4a5 # EXP-Topic vfs.ward vfs: add the possibility to have a "ward" to check vfs usage The function will be called anytime we open a file. The first usage of this 'ward' will be to check that lock are properly taken before accessing file. Later we might use it to ensure we use the right vfs to access files, allowing more vfs to be introduced. We currently only apply the ward on 'open' operation. We will extend this to other operations like copy, creation and removal later. The current readonlyvfs seems to have the same shortcoming. diff -r 46d9a5bb2fb0 -r b84b218971cf mercurial/scmutil.py --- a/mercurial/scmutil.py Mon Aug 08 18:05:10 2016 +0200 +++ b/mercurial/scmutil.py Thu Aug 04 17:07:46 2016 +0200 @@ -482,7 +482,8 @@ class vfs(abstractvfs): This class is used to hide the details of COW semantics and remote file access from higher level code. ''' -def __init__(self, base, audit=True, expandpath=False, realpath=False): +def __init__(self, base, audit=True, expandpath=False, realpath=False, + ward=None): if expandpath: base = util.expandpath(base) if realpath: @@ -491,6 +492,11 @@ class vfs(abstractvfs): self.mustaudit = audit self.createmode = None self._trustnlink = None +# optional function to validate operation on file +# intended to be user for developer checks. +# +# XXX should be call for other things than 'open' +self._ward = ward @property def mustaudit(self): @@ -583,6 +589,8 @@ class vfs(abstractvfs): self._trustnlink = nlink > 1 or util.checknlink(f) if nlink > 1 or not self._trustnlink: util.rename(util.mktempcopy(f), f) +if self._ward is not None: +self._ward(f, mode) fp = util.posixfile(f, mode) Do you have an idea to extend this to the other operations and open of atomictemp files? I doubt it wouldn't be as simple as a callback function. Simple callback should do for the other operation, as they are also performed through vfs. For atomictmp, we probably want to channel their usage through as vfs to be able to cover them. Cheers, -- Pierre-Yves David ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 4 of 4 RFC] pycompat: patch (del|get|has|set)attr to accept bytes on Python 3
# HG changeset patch # User Yuya Nishihara # Date 1471146681 -32400 # Sun Aug 14 12:51:21 2016 +0900 # Node ID 2a300f882c077fa014046db96b545f6aff771d32 # Parent d53bd633898ec6edafccea9f798d314699faf90a pycompat: patch (del|get|has|set)attr to accept bytes on Python 3 getattr() and setattr() are widely used in our code. We wouldn't probably want to rewrite every single call of getattr/setattr. delattr() and hasattr() aren't that important, but they are functions of the same kind. Another option would be util.(get|set)attr. diff --git a/mercurial/pycompat.py b/mercurial/pycompat.py --- a/mercurial/pycompat.py +++ b/mercurial/pycompat.py @@ -31,8 +31,27 @@ else: if sys.version_info[0] >= 3: import builtins +import functools builtins.xrange = range +def _wrapattrfunc(f): +try: +# avoid useless wrapping on reload +f.__wrapped__ +return f +except AttributeError: +pass +@functools.wraps(f) +def w(object, name, *args): +if isinstance(name, bytes): +name = name.decode('utf-8') +return f(object, name, *args) +return w +builtins.delattr = _wrapattrfunc(builtins.delattr) +builtins.getattr = _wrapattrfunc(builtins.getattr) +builtins.hasattr = _wrapattrfunc(builtins.hasattr) +builtins.setattr = _wrapattrfunc(builtins.setattr) + stringio = io.StringIO empty = _queue.Empty queue = _queue.Queue diff --git a/tests/test-check-py3-compat.t b/tests/test-check-py3-compat.t --- a/tests/test-check-py3-compat.t +++ b/tests/test-check-py3-compat.t @@ -122,48 +122,48 @@ mercurial/hook.py: error importing: str expected, not bytes (error at i18n.py:*) (glob) mercurial/httpconnection.py: error importing: str expected, not bytes (error at i18n.py:*) (glob) mercurial/httppeer.py: error importing: str expected, not bytes (error at i18n.py:*) (glob) - mercurial/keepalive.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/localrepo.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/lock.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/mail.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/manifest.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/match.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/mdiff.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/merge.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/minirst.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/namespaces.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/obsolete.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/patch.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/pathutil.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/peer.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/pushkey.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/pvec.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/registrar.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/repair.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/repoview.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/revlog.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/revset.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/scmutil.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/scmwindows.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/similar.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/simplemerge.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/sshpeer.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercurial/sshserver.py: error importing: getattr(): attribute name must be string (error at util.py:*) (glob) - mercu
[PATCH 2 of 4 RFC] pycompat: move xrange alias next to import lines
# HG changeset patch # User Yuya Nishihara # Date 1471146114 -32400 # Sun Aug 14 12:41:54 2016 +0900 # Node ID 76c2c6169dccde84f84a46b7f366020bc3e72959 # Parent 131cc484d7fe95d07f677c16dbb0f506847e1d38 pycompat: move xrange alias next to import lines Builtin functions should be available in compatibility code. diff --git a/mercurial/pycompat.py b/mercurial/pycompat.py --- a/mercurial/pycompat.py +++ b/mercurial/pycompat.py @@ -29,6 +29,12 @@ else: import urllib.parse as urlparse import xmlrpc.client as xmlrpclib +try: +xrange +except NameError: +import builtins +builtins.xrange = range + stringio = io.StringIO empty = _queue.Empty queue = _queue.Queue @@ -145,9 +151,3 @@ else: "SimpleHTTPRequestHandler", "CGIHTTPRequestHandler", )) - -try: -xrange -except NameError: -import builtins -builtins.xrange = range ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 3 of 4 RFC] pycompat: check python version to enable builtins hack
# HG changeset patch # User Yuya Nishihara # Date 1471146253 -32400 # Sun Aug 14 12:44:13 2016 +0900 # Node ID d53bd633898ec6edafccea9f798d314699faf90a # Parent 76c2c6169dccde84f84a46b7f366020bc3e72959 pycompat: check python version to enable builtins hack Future patches will add getattr/setattr wrappers. diff --git a/mercurial/pycompat.py b/mercurial/pycompat.py --- a/mercurial/pycompat.py +++ b/mercurial/pycompat.py @@ -29,9 +29,7 @@ else: import urllib.parse as urlparse import xmlrpc.client as xmlrpclib -try: -xrange -except NameError: +if sys.version_info[0] >= 3: import builtins builtins.xrange = range ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 1 of 4 RFC] pycompat: delay loading modules registered to stub
# HG changeset patch # User Yuya Nishihara # Date 1471153584 -32400 # Sun Aug 14 14:46:24 2016 +0900 # Node ID 131cc484d7fe95d07f677c16dbb0f506847e1d38 # Parent 0cb2d4db308b97e8fe7faa8d45a47b228037f230 pycompat: delay loading modules registered to stub New _pycompatstub is compatible with our demandimporter. try-except is replaced by version comparison because ImportError will no longer be raised immediately. diff --git a/mercurial/pycompat.py b/mercurial/pycompat.py --- a/mercurial/pycompat.py +++ b/mercurial/pycompat.py @@ -34,30 +34,32 @@ empty = _queue.Empty queue = _queue.Queue class _pycompatstub(object): -pass - -def _alias(alias, origin, items): -""" populate a _pycompatstub +def __init__(self): +self._aliases = {} -copies items from origin to alias -""" -for item in items: +def _registeraliases(self, origin, items): +"""Add items that will be populated at the first access""" +self._aliases.update((item.replace('_', '').lower(), (origin, item)) + for item in items) + +def __getattr__(self, name): try: -lcase = item.replace('_', '').lower() -setattr(alias, lcase, getattr(origin, item)) -except AttributeError: -pass +origin, item = self._aliases[name] +except KeyError: +raise AttributeError(name) +self.__dict__[name] = obj = getattr(origin, item) +return obj httpserver = _pycompatstub() urlreq = _pycompatstub() urlerr = _pycompatstub() -try: +if sys.version_info[0] < 3: import BaseHTTPServer import CGIHTTPServer import SimpleHTTPServer import urllib2 import urllib -_alias(urlreq, urllib, ( +urlreq._registeraliases(urllib, ( "addclosehook", "addinfourl", "ftpwrapper", @@ -71,7 +73,7 @@ try: "url2pathname", "urlencode", )) -_alias(urlreq, urllib2, ( +urlreq._registeraliases(urllib2, ( "AbstractHTTPHandler", "BaseHandler", "build_opener", @@ -87,24 +89,24 @@ try: "Request", "urlopen", )) -_alias(urlerr, urllib2, ( +urlerr._registeraliases(urllib2, ( "HTTPError", "URLError", )) -_alias(httpserver, BaseHTTPServer, ( +httpserver._registeraliases(BaseHTTPServer, ( "HTTPServer", "BaseHTTPRequestHandler", )) -_alias(httpserver, SimpleHTTPServer, ( +httpserver._registeraliases(SimpleHTTPServer, ( "SimpleHTTPRequestHandler", )) -_alias(httpserver, CGIHTTPServer, ( +httpserver._registeraliases(CGIHTTPServer, ( "CGIHTTPRequestHandler", )) -except ImportError: +else: import urllib.request -_alias(urlreq, urllib.request, ( +urlreq._registeraliases(urllib.request, ( "AbstractHTTPHandler", "addclosehook", "addinfourl", @@ -132,12 +134,12 @@ except ImportError: "urlopen", )) import urllib.error -_alias(urlerr, urllib.error, ( +urlerr._registeraliases(urllib.error, ( "HTTPError", "URLError", )) import http.server -_alias(httpserver, http.server, ( +httpserver._registeraliases(http.server, ( "HTTPServer", "BaseHTTPRequestHandler", "SimpleHTTPRequestHandler", diff --git a/tests/test-check-py3-compat.t b/tests/test-check-py3-compat.t --- a/tests/test-check-py3-compat.t +++ b/tests/test-check-py3-compat.t @@ -122,24 +122,22 @@ mercurial/hook.py: error importing: str expected, not bytes (error at i18n.py:*) (glob) mercurial/httpconnection.py: error importing: str expected, not bytes (error at i18n.py:*) (glob) mercurial/httppeer.py: error importing: str expected, not bytes (error at i18n.py:*) (glob) - mercurial/keepalive.py: error importing: getattr(): attribute name must be string (error at pycompat.py:*) (glob) - mercurial/localrepo.py: error importing: getattr(): attribute name must be string (error at pycompat.py:*) (glob) - mercurial/lock.py: error importing: getattr(): attribute name must be string (error at pycompat.py:*) (glob) - mercurial/mail.py: error importing: getattr(): attribute name must be string (error at pycompat.py:*) (glob) - mercurial/manifest.py: error importing: getattr(): attribute name must be string (error at pycompat.py:*) (glob) - mercurial/match.py: error importing: getattr(): attribute name must be string (error at pycompat.py:*) (glob) - mercurial/mdiff.py: error importing: getattr(): attribute name must be string (error at pycompat.py:*) (glob) - mercurial/merge.py: error importing: getattr(): attribute name must be string (error at pycompat.py:*) (glob) - mercurial/minirst.py: error importing: getattr(): attribute name must be string (error at pycompat.py:*) (glob) - mercurial/namespaces.py: error importing: getattr(): attribute name must be string (error at pycompat.py:*)
[PATCH 2 of 3] test-gpg: start gpg-agent under control of the test runner
# HG changeset patch # User Yuya Nishihara # Date 1471161693 -32400 # Sun Aug 14 17:01:33 2016 +0900 # Node ID 20ffcea3fc24f0ef534cd117bbbee134a8dc929c # Parent 4faac945fa5a66924ffb8e35f8c0a812ec2ec535 test-gpg: start gpg-agent under control of the test runner GnuPG v2 automatically starts gpg-agent. We should kill the daemon process. diff --git a/tests/hghave.py b/tests/hghave.py --- a/tests/hghave.py +++ b/tests/hghave.py @@ -349,6 +349,10 @@ def has_tla(): def has_gpg(): return matchoutput('gpg --version 2>&1', br'GnuPG') +@check("gpg2", "gpg client v2") +def has_gpg2(): +return matchoutput('gpg --version 2>&1', br'GnuPG[^0-9]+2\.') + @check("unix-permissions", "unix-style permissions") def has_unix_permissions(): d = tempfile.mkdtemp(dir='.', prefix=tempprefix) diff --git a/tests/test-gpg.t b/tests/test-gpg.t --- a/tests/test-gpg.t +++ b/tests/test-gpg.t @@ -12,6 +12,13 @@ Test the GPG extension $ GNUPGHOME="$TESTTMP/gpg"; export GNUPGHOME $ cp -R "$TESTDIR/gpg" "$GNUPGHOME" +Start gpg-agent, which is required by GnuPG v2 + +#if gpg2 + $ gpg-connect-agent -q --subst /serverpid '/echo ${get serverpid}' /bye \ + > >> $DAEMON_PIDS +#endif + $ hg init r $ cd r $ echo foo > foo ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 3 of 3] test-gpg: run migration of v1 secret keys beforehand
# HG changeset patch # User Yuya Nishihara # Date 1471162025 -32400 # Sun Aug 14 17:07:05 2016 +0900 # Node ID 0cb2d4db308b97e8fe7faa8d45a47b228037f230 # Parent 20ffcea3fc24f0ef534cd117bbbee134a8dc929c test-gpg: run migration of v1 secret keys beforehand This suppresses unwanted output at "hg sign". diff --git a/tests/test-gpg.t b/tests/test-gpg.t --- a/tests/test-gpg.t +++ b/tests/test-gpg.t @@ -12,11 +12,13 @@ Test the GPG extension $ GNUPGHOME="$TESTTMP/gpg"; export GNUPGHOME $ cp -R "$TESTDIR/gpg" "$GNUPGHOME" -Start gpg-agent, which is required by GnuPG v2 +Start gpg-agent, which is required by GnuPG v2, and migrate secret keys #if gpg2 $ gpg-connect-agent -q --subst /serverpid '/echo ${get serverpid}' /bye \ > >> $DAEMON_PIDS + $ gpg --no-permission-warning --no-secmem-warning --list-secret-keys \ + > > /dev/null 2>&1 #endif $ hg init r ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 1 of 3] test-gpg: make temporary copy of GNUPGHOME
# HG changeset patch # User Yuya Nishihara # Date 1471160987 -32400 # Sun Aug 14 16:49:47 2016 +0900 # Node ID 4faac945fa5a66924ffb8e35f8c0a812ec2ec535 # Parent 0bb8f8c1910a03c917c27ef06707e8c39ba50bfe test-gpg: make temporary copy of GNUPGHOME GnuPG v2 will convert v1 secret keys and create a socket under $GNUPGHOME. This patch makes sure no state would persist. We no longer need to verify trustdb.gpg, which was added by aae219a99a6e. diff --git a/tests/test-gpg.t b/tests/test-gpg.t --- a/tests/test-gpg.t +++ b/tests/test-gpg.t @@ -7,8 +7,11 @@ Test the GPG extension > gpg= > > [gpg] - > cmd=gpg --no-permission-warning --no-secmem-warning --no-auto-check-trustdb --homedir "$TESTDIR/gpg" + > cmd=gpg --no-permission-warning --no-secmem-warning --no-auto-check-trustdb > EOF + $ GNUPGHOME="$TESTTMP/gpg"; export GNUPGHOME + $ cp -R "$TESTDIR/gpg" "$GNUPGHOME" + $ hg init r $ cd r $ echo foo > foo @@ -36,12 +39,4 @@ Test the GPG extension e63c23eaa88a is signed by: hgtest -verify that this test has not modified the trustdb.gpg file back in -the main hg working dir - $ md5sum.py "$TESTDIR/gpg/trustdb.gpg" - f6b9c78c65fa9536e7512bb2ceb338ae */gpg/trustdb.gpg (glob) - -don't leak any state to next test run - $ rm -f "$TESTDIR/gpg/random_seed" - $ cd .. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 3 of 5] outgoing: add a 'missingroots' argument
On Thu, 11 Aug 2016 22:06:53 +0200, Pierre-Yves David wrote: > # HG changeset patch > # User Pierre-Yves David > # Date 1470774698 -7200 > # Tue Aug 09 22:31:38 2016 +0200 > # Node ID 8c4fcb1244bdf79caadadc73fc1e489a160a07ec > # Parent 9ff7059253fd00094799f592462590cd837fb457 > # EXP-Topic outgoing > outgoing: add a 'missingroots' argument > > This argument can be used instead of 'commonheads' to determine the 'outgoing' > set. We remove the outgoingbetween function as its role can now be handled by > 'outgoing' itself. > > I've thought of using an external function instead of making the constructor > more complicated. However, there is low hanging fruit to improve the current > code flow by storing some side products of the processing of 'missingroots'. > So > in my opinion it make senses to add all this to the class. > --- a/mercurial/discovery.py Tue Aug 09 15:55:44 2016 +0200 > +++ b/mercurial/discovery.py Tue Aug 09 22:31:38 2016 +0200 > @@ -76,11 +76,25 @@ class outgoing(object): > The sets are computed on demand from the heads, unless provided upfront > by discovery.''' > > -def __init__(self, repo, commonheads=None, missingheads=None): > +def __init__(self, repo, commonheads=None, missingheads=None, > + missingroots=None): > +# at least one of them must not be set > +assert None in (commonheads, missingroots) > cl = repo.changelog > if not missingheads: > missingheads = cl.heads() > -if not commonheads: > +if missingroots: > +discbases = [] > +for n in missingroots: > +discbases.extend([p for p in cl.parents(n) if p != nullid]) > +# TODO remove call to nodesbetween. > +# TODO populate attributes on outgoing instance instead of > setting > +# discbases. > +csets, roots, heads = cl.nodesbetween(missingroots, missingheads) > +included = set(csets) > +missingheads = heads > +commonheads = [n for n in discbases if n not in included] > +elif not commonheads: > commonheads = [nullid] > self.commonheads = commonheads > self.missingheads = missingheads > @@ -106,27 +120,6 @@ class outgoing(object): > self._computecommonmissing() > return self._missing > > -def outgoingbetween(repo, roots, heads): > -"""create an ``outgoing`` consisting of nodes between roots and heads > - > -The ``missing`` nodes will be descendants of any of the ``roots`` and > -ancestors of any of the ``heads``, both are which are defined as a list > -of binary nodes. > -""" I can't tell if this is an improvement or not, but the overall changes look good to me. I'll let Greg or Augie take 2nd pass. ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel