Re: [PATCH 1 of 2 V2] exchange: refactor APIs to obtain bundle data (API)

2016-10-16 Thread Pierre-Yves David



On 10/16/2016 08:14 PM, Gregory Szorc wrote:

# HG changeset patch
# User Gregory Szorc 
# Date 1476639532 25200
#  Sun Oct 16 10:38:52 2016 -0700
# Node ID 3f18f7464e651128a5f8d9c9312805adbc22f547
# Parent  757d25d2bbe63fc560e92b6bb25fbfbf09b09342
exchange: refactor APIs to obtain bundle data (API)


That first one is pushed, thanks,

--
Pierre-Yves David
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 1 of 2 V2] exchange: refactor APIs to obtain bundle data (API)

2016-10-16 Thread Gregory Szorc
# HG changeset patch
# User Gregory Szorc 
# Date 1476639532 25200
#  Sun Oct 16 10:38:52 2016 -0700
# Node ID 3f18f7464e651128a5f8d9c9312805adbc22f547
# Parent  757d25d2bbe63fc560e92b6bb25fbfbf09b09342
exchange: refactor APIs to obtain bundle data (API)

Currently, exchange.getbundle() returns either a cg1unpacker or a
util.chunkbuffer (in the case of bundle2). This is kinda OK, as
both expose a .read() to consumers. However, localpeer.getbundle()
has code inferring what the response type is based on arguments and
converts the util.chunkbuffer returned in the bundle2 case to a
bundle2.unbundle20 instance. This is a sign that the API for
exchange.getbundle() is not ideal because it doesn't consistently
return an "unbundler" instance.

In addition, unbundlers mask the fact that there is an underlying
generator of changegroup data. In both cg1 and bundle2, this generator
is being fed into a util.chunkbuffer so it can be re-exposed as a
file object.

util.chunkbuffer is a nice abstraction. However, it should only be
used "at the edges." This is because keeping data as a generator is
more efficient than converting it to a chunkbuffer, especially if we
convert that chunkbuffer back to a generator (as is the case in some
code paths currently).
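
As a rough illustration of the round trip described above, here is a minimal
sketch. ChunkBuffer, produce_chunks() and rechunk() are hypothetical stand-ins
written for this example; they are not Mercurial's util.chunkbuffer or any
real changegroup generator. The sketch only shows that wrapping a generator in
a file-like buffer and then re-reading it as chunks yields the same bytes at
the cost of extra joins and slices.

```python
import collections


class ChunkBuffer:
    """Hypothetical stand-in for a chunk buffer: file-like view of chunks."""

    def __init__(self, chunks):
        self._iter = iter(chunks)
        self._queue = collections.deque()
        self._buffered = 0

    def read(self, size):
        # Pull chunks until enough bytes are buffered or the source ends.
        while self._buffered < size:
            try:
                chunk = next(self._iter)
            except StopIteration:
                break
            self._queue.append(chunk)
            self._buffered += len(chunk)
        # Extra copy: join everything buffered, then slice it again.
        data = b"".join(self._queue)
        self._queue.clear()
        result, rest = data[:size], data[size:]
        if rest:
            self._queue.append(rest)
        self._buffered = len(rest)
        return result


def produce_chunks():
    """Pretend data generator yielding chunks of varying sizes."""
    yield b"HG10"
    yield b"abcdef"
    yield b"gh"


def rechunk(fileobj, size):
    """Turn a file-like object back into a generator of chunks."""
    while True:
        data = fileobj.read(size)
        if not data:
            return
        yield data


# Consuming the generator directly needs no intermediate buffer.
direct = b"".join(produce_chunks())
# The round trip produces the same bytes, but copies them along the way.
roundtrip = b"".join(rechunk(ChunkBuffer(produce_chunks()), 5))
```

Both paths produce identical bytes; only the second pays for buffering that
the final consumer never needed.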

This patch refactors exchange.getbundle() into
exchange.getbundlechunks(). The new API returns an iterator of chunks
instead of a file-like object.

Callers of exchange.getbundle() have been updated to use the new API.
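
A sketch of what this looks like from a caller's point of view, under stated
assumptions: get_bundle_chunks(), stream_to() and as_fileobj() are
hypothetical names invented for this example, and the payload bytes are made
up. The point is only that a producer which always returns an iterator lets
each caller decide whether it needs a file-like object at all.

```python
import io


def get_bundle_chunks(use_bundle2):
    """Hypothetical producer: always returns an iterator of byte chunks."""
    magic = b"HG20" if use_bundle2 else b"HG10"
    return iter([magic, b"payload-1", b"payload-2"])


def stream_to(sink, chunks):
    """Wire-style caller: forward chunks as they arrive, without buffering."""
    written = 0
    for chunk in chunks:
        written += sink.write(chunk)
    return written


def as_fileobj(chunks):
    """Local-style caller: build a file-like object only at the edge.

    Joining into BytesIO is a simplification for this sketch; it reads
    the whole stream eagerly, which a real buffer would avoid.
    """
    return io.BytesIO(b"".join(chunks))


sink = io.BytesIO()
count = stream_to(sink, get_bundle_chunks(use_bundle2=False))
header = as_fileobj(get_bundle_chunks(use_bundle2=True)).read(4)
```

Neither caller has to infer the return type from its arguments, since the
producer's return type no longer depends on the bundle version.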

There is a minor change of behavior in test-getbundle.t. This is
because `hg debuggetbundle` isn't defining bundlecaps. As a result,
a cg1 data stream and unpacker are being produced. These are fed
into a new bundle20 instance via bundle2.writebundle(), which relies
on a backchannel mechanism tied to changegroup generation to add the
"nbchanges" part parameter. I never liked this backchannel mechanism
and I plan to remove it someday. `hg bundle` still produces the
"nbchanges" part parameter, so there should be no user-visible
change of behavior. I consider this "regression" a bug in
`hg debuggetbundle`, and that bug is captured by an existing
"TODO" in the code to use bundle2 capabilities.

diff --git a/mercurial/exchange.py b/mercurial/exchange.py
--- a/mercurial/exchange.py
+++ b/mercurial/exchange.py
@@ -1527,43 +1527,37 @@ def getbundle2partsgenerator(stepname, i
         return func
     return dec
 
 def bundle2requested(bundlecaps):
     if bundlecaps is not None:
         return any(cap.startswith('HG2') for cap in bundlecaps)
     return False
 
-def getbundle(repo, source, heads=None, common=None, bundlecaps=None,
-              **kwargs):
-    """return a full bundle (with potentially multiple kind of parts)
+def getbundlechunks(repo, source, heads=None, common=None, bundlecaps=None,
+                    **kwargs):
+    """Return chunks constituting a bundle's raw data.
 
     Could be a bundle HG10 or a bundle HG20 depending on bundlecaps
-    passed. For now, the bundle can contain only changegroup, but this will
-    changes when more part type will be available for bundle2.
+    passed.
 
-    This is different from changegroup.getchangegroup that only returns an HG10
-    changegroup bundle. They may eventually get reunited in the future when we
-    have a clearer idea of the API we what to query different data.
-
-    The implementation is at a very early stage and will get massive rework
-    when the API of bundle is refined.
+    Returns an iterator over raw chunks (of varying sizes).
     """
     usebundle2 = bundle2requested(bundlecaps)
     # bundle10 case
     if not usebundle2:
         if bundlecaps and not kwargs.get('cg', True):
             raise ValueError(_('request for bundle10 must include changegroup'))
 
         if kwargs:
             raise ValueError(_('unsupported getbundle arguments: %s')
                              % ', '.join(sorted(kwargs.keys())))
         outgoing = _computeoutgoing(repo, heads, common)
-        return changegroup.getchangegroup(repo, source, outgoing,
-                                          bundlecaps=bundlecaps)
+        bundler = changegroup.getbundler('01', repo, bundlecaps)
+        return changegroup.getsubsetraw(repo, outgoing, bundler, source)
 
     # bundle20 case
     b2caps = {}
     for bcaps in bundlecaps:
         if bcaps.startswith('bundle2='):
             blob = urlreq.unquote(bcaps[len('bundle2='):])
             b2caps.update(bundle2.decodecaps(blob))
     bundler = bundle2.bundle20(repo.ui, b2caps)
@@ -1571,17 +1565,17 @@ def getbundle(repo, source, heads=None, 
     kwargs['heads'] = heads
     kwargs['common'] = common
 
     for name in getbundle2partsorder:
         func = getbundle2partsmapping[name]
         func(bundler, repo, source, bundlecaps=bundlecaps, b2caps=b2caps,
              **kwargs)
 
-    return util.chunkbuffer(bundler.getchunks())
+    return bundler.getchunks()
 
 @getbundle2partsgenerator('changegroup')
 def