On Fri, 2021-04-02 at 13:15 -0400, Paul Gortmaker wrote:
> If a clone in the download directory is not static, and was created with
> single-branch to avoid additional unwanted content, then the fetcher will
> come along and spoil that effort by unconditionally getting refs/* from
> the server and downloading everything available at the server.
> 
> To that end, allow a companion variable that can optionally be used to
> limit the fetch to a subset of references that are in line with the
> desired content for the recipe in question, and so that large unwanted
> downloads aren't inadvertently triggered during an update.
> 
> If not specified, the existing "fetch everything" behaviour remains.
> 
> Signed-off-by: Paul Gortmaker <paul.gortma...@windriver.com>
> ---
>  bitbake/lib/bb/fetch2/git.py           | 3 ++-
>  documentation/ref-manual/variables.rst | 8 ++++++++
>  2 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
> index 22281e2cfb98..b54ec76d7174 100644
> --- a/bitbake/lib/bb/fetch2/git.py
> +++ b/bitbake/lib/bb/fetch2/git.py
> @@ -362,7 +362,8 @@ class Git(FetchMethod):
>                runfetchcmd("%s remote rm origin" % ud.basecmd, d, workdir=ud.clonedir)
>  
>              runfetchcmd("%s remote add --mirror=fetch origin %s" % (ud.basecmd, shlex.quote(repourl)), d, workdir=ud.clonedir)
> -            fetch_cmd = "LANG=C %s fetch -f --progress %s refs/*:refs/*" % (ud.basecmd, shlex.quote(repourl))
> +            fetchrefs = d.getVar('GITFETCHREFS_' + ud.names[0]) or d.getVar('GITFETCHREFS') or "refs/*:refs/*"
> +            fetch_cmd = "LANG=C %s fetch -f --progress %s %s" % (ud.basecmd, shlex.quote(repourl), fetchrefs)
>              if ud.proto.lower() != 'file':
>                  bb.fetch2.check_network_access(d, fetch_cmd, ud.url)
>              progresshandler = GitProgressHandler(d)

I think one of the reasons I'm uneasy about this series is that some of the
changes break existing fetcher functionality. Taking the above, I
understand why you want to filter the references, but this will break
mirroring.

How? Imagine a repo has branches A and B and the recipe filters to A. This
will result in a mirror tarball with only A in it. A recipe referencing B will
then attempt to reuse the same mirror tarball name, and the mirroring
mechanism will fail.

Mirroring is the reason we pull the whole repo here. It isn't
good from an efficiency standpoint but it does work well for mirroring. This
piece of code from the fetcher kind of summarises it:

"""
gitsrcname = '%s%s' % (ud.host.replace(':', '.'), ud.path.replace('/', '.').replace('*', '.').replace(' ','_'))
if gitsrcname.startswith('.'):
    gitsrcname = gitsrcname[1:]

# for rebaseable git repo, it is necessary to keep mirror tar ball
# per revision, so that even the revision disappears from the
# upstream repo in the future, the mirror will remain intact and still
# contains the revision
if ud.rebaseable:
    for name in ud.names:
        gitsrcname = gitsrcname + '_' + ud.revisions[name]

mirrortarball = 'git2_%s.tar.gz' % gitsrcname
"""

To make the above "work", you'd need to rename the mirror tarball to be
branch specific and include A or B in the mirror tarball filename.
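
As a rough standalone sketch of what I mean (the function name, the "branch"
argument and the example host/path are all made up for illustration, this is
not code from the fetcher, and the rebaseable handling is left out), the name
is the same whatever was filtered unless the branch becomes part of it:

```python
def mirror_tarball_name(host, path, branch=None):
    # Same naming logic as the fetcher snippet quoted above.
    gitsrcname = '%s%s' % (host.replace(':', '.'),
                           path.replace('/', '.').replace('*', '.').replace(' ', '_'))
    if gitsrcname.startswith('.'):
        gitsrcname = gitsrcname[1:]
    # Hypothetical extension: append the branch so filtered clones
    # of different branches no longer share one tarball name.
    if branch:
        gitsrcname = gitsrcname + '_' + branch.replace('/', '.')
    return 'git2_%s.tar.gz' % gitsrcname


host, path = 'git.example.com', '/linux-yocto.git'  # made-up repo location
shared_name = mirror_tarball_name(host, path)       # what A and B both get today
name_a = mirror_tarball_name(host, path, 'A')
name_b = mirror_tarball_name(host, path, 'B')
```

With the branch left out, a recipe filtered to A and one filtered to B both
resolve to shared_name, which is exactly the collision described above.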

I also then wonder whether the form of GITFETCHREFS is good API. It is certainly
powerful, however the code already knows which branch we're asking it to fetch,
so we could just add a parameter to the url, "clonebranchonly", which asks it
to only clone that branch and not all the references?
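
Something along these lines, purely as a sketch: "clonebranchonly" does not
exist today, and the helper name and the way the flag is passed in are
assumptions on my part.

```python
def fetch_refspec(branch, clonebranchonly=False):
    # Without the (hypothetical) flag, keep the existing
    # "fetch everything" behaviour.
    if not clonebranchonly:
        return 'refs/*:refs/*'
    # With it, restrict the fetch to the one branch the url already names.
    ref = 'refs/heads/%s' % branch
    return '%s:%s' % (ref, ref)


default_spec = fetch_refspec('v5.4/standard/base')
branch_spec = fetch_refspec('v5.4/standard/base', clonebranchonly=True)
```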

I appreciate that doesn't give you the way to say "heads/v5.4/*" but instead
you'd get something more specific like "v5.4/standard/base", but that would
make the fetch even more efficient for the particular use case a
user would generally be working with (a specific machine). I appreciate that
you're not going to like a mirror tarball per kernel branch and even trying
to get the system to generate all the right tarballs would be a nightmare.

Another alternative could be a parameter like ";filterclone=heads/v5.4/" which
the fetcher could then translate to "refs/heads/v5.4/*:refs/heads/v5.4/*" 
internally.
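
The translation itself would be trivial. A minimal sketch, assuming the
parameter value is a ref prefix relative to "refs/" (the helper name is made
up, and "filterclone" is only a proposal at this point):

```python
def filterclone_to_refspec(filterclone):
    # Normalise 'heads/v5.4/' and 'heads/v5.4' to the same prefix.
    prefix = 'refs/' + filterclone.strip('/')
    # Map the remote prefix onto the same local prefix, as the
    # existing refs/*:refs/* refspec does for everything.
    return '%s/*:%s/*' % (prefix, prefix)


refspec = filterclone_to_refspec('heads/v5.4/')
```

The user only ever writes the short prefix in the url and the fetcher stays
in control of what actually reaches the git commandline.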

I guess in summary this means I have two worries about the API. Ideally we 
want something which can be put simply in the URL itself avoiding external 
variables. We also want something which is end user readable, rather than 
being a direct copy of the git commandline. I'd prefer to avoid direct
commandlines as they only encourage people to start using them to do different
things and then we end up breaking mirrors, have incomplete test coverage
and/or find it really hard to add/change functionality later.

Cheers,

Richard



