Hi,
On Thu, Dec 15, 2022 at 12:38:11PM +0700, Arnaud Rebillout wrote:
> Package: git-buildpackage
> Version: 0.9.30
> Severity: normal
> User: de...@kali.org
> Usertags: origin-kali
>
> Dear Maintainer,
>
> In Kali Linux, we package an upstream that uses Git LFS to store a big
> file (a GeoIP database). The upstream is at:
> https://github.com/rsmusllp/king-phisher
>
> In a previous version, upstream used to version the database "as is", it
> was a regular file in the Git repo. Then in a subsequent version, they
> switched to use Git LFS to store this file.
>
> Gbp doesn't handle this transition well, apparently this is due to the
> combination of:
> * "gbp clone" disabling Git attributes (hence git lfs)
> * however "gbp import-orig" does no such thing
>
> I'm the person who updated this package, so my local copy of
> king-phisher doesn't have the git attributes disabled, and everything
> works fine with me. However other folks who clone the repo complain, as
> it leads to an unclean git checkout, and I don't know what's the way
> forward.
>
> For a longer (and hopefully crystal-clear) explanation of the issue, I
> prepared a Git repo and a walkthrough to reproduce the issue. There we
> go :)
>
> Let's first clone the king-phisher package *before* upstream switched to
> Git LFS:
>
>   $ gbp clone https://gitlab.com/arnaudr/king-phisher.git
>   $ cd king-phisher
>   $ cat .gitattributes
>   cat: .gitattributes: No such file or directory
>   $ ls -l data/server/king_phisher/GeoLite2-City.mmdb
>   -rw-r--r-- 1 arno arno 61615395 Dec 15 11:53 data/server/king_phisher/GeoLite2-City.mmdb
>
> So at this point, the file GeoLite2-City.mmdb is versioned "as is", it
> is a regular file.
>
> Now let's update the package to latest Git snapshot:
>
>   $ gbp import-orig --uscan
>   gbp:info: Launching uscan...
>   Downloading data/server/king_phisher/GeoLite2-City.mmdb (62 MB)
>   gbp:info: Using uscan downloaded tarball ../king-phisher_1.15.0+git20221107.orig.tar.xz
>   What is the upstream version? [1.15.0+git20221107]
>   gbp:info: Importing '../king-phisher_1.15.0+git20221107.orig.tar.xz' to branch 'upstream'...
>   gbp:info: Source package is king-phisher
>   gbp:info: Upstream version is 1.15.0+git20221107
>   gbp:info: Replacing upstream source on 'kali/master'
>   gbp:info: Successfully imported version 1.15.0+git20221107 of ../king-phisher_1.15.0+git20221107.orig.tar.xz
>
> The line "Downloading data/server/king_phisher/GeoLite2-City.mmdb (62
> MB)" comes from git lfs, which is downloading the file. And here's the
> situation now:
>
>   $ cat .gitattributes
>   *.mmdb filter=lfs diff=lfs merge=lfs -text
>   $ cat .git/info/attributes
>   cat: .git/info/attributes: No such file or directory
>   $ ls -l data/server/king_phisher/GeoLite2-City.mmdb
>   -rw-r--r-- 1 arno arno 61615395 Dec 15 11:56 data/server/king_phisher/GeoLite2-City.mmdb
>
> So we can see the git lfs thingy, and we can see that
> '.git/info/attributes' doesn't exist (more on that below).
>
> Let's push that work (I prepared a fork to push changes):
>
>   $ git remote add arnaudr2 g...@gitlab.com:arnaudr/king-phisher2.git
>   $ git push arnaudr2 : --follow-tags
>   Locking support detected on remote "arnaudr2". Consider enabling it with:
>     $ git config lfs.https://gitlab.com/arnaudr/king-phisher2.git/info/lfs.locksverify true
>   Locking support detected on remote "arnaudr2". Consider enabling it with:
>     $ git config lfs.https://gitlab.com/arnaudr/king-phisher2.git/info/lfs.locksverify true
>   Locking support detected on remote "arnaudr2". Consider enabling it with:
>     $ git config lfs.https://gitlab.com/arnaudr/king-phisher2.git/info/lfs.locksverify true
>   Locking support detected on remote "arnaudr2". Consider enabling it with:
>     $ git config lfs.https://gitlab.com/arnaudr/king-phisher2.git/info/lfs.locksverify true
>   Uploading LFS objects: 100% (1/1), 62 MB | 3.4 MB/s, done.
>   Enumerating objects: 112, done.
>   Counting objects: 100% (82/82), done.
>   Delta compression using up to 8 threads
>   Compressing objects: 100% (46/46), done.
>   Writing objects: 100% (49/49), 19.06 KiB | 19.06 MiB/s, done.
>   Total 49 (delta 29), reused 5 (delta 0), pack-reused 0
>   remote:
>   remote: To create a merge request for pristine-tar, visit:
>   remote:   https://gitlab.com/arnaudr/king-phisher2/-/merge_requests/new?merge_request%5Bsource_branch%5D=pristine-tar
>   remote:
>   remote:
>   remote: To create a merge request for upstream, visit:
>   remote:   https://gitlab.com/arnaudr/king-phisher2/-/merge_requests/new?merge_request%5Bsource_branch%5D=upstream
>   remote:
>   To gitlab.com:arnaudr/king-phisher2.git
>      c5db68b..dbf4ce7  kali/master -> kali/master
>      d9ec6a5..e4e9390  pristine-tar -> pristine-tar
>      be63910..f4f0fae  upstream -> upstream
>    * [new tag]         upstream/1.15.0+git20221107 -> upstream/1.15.0+git20221107
>
> And now, the issue: when we clone this repo with gbp, the resulting repo
> is not clean.
> Let's try:
>
>   $ gbp clone -v g...@gitlab.com:arnaudr/king-phisher2.git
>   gbp:debug: ['git', 'rev-parse', '--show-cdup']
>   gbp:info: Cloning from 'g...@gitlab.com:arnaudr/king-phisher2.git'
>   gbp:debug: ['git', 'clone', '--quiet', 'g...@gitlab.com:arnaudr/king-phisher2.git']
>   gbp:debug: ['git', 'rev-parse', '--show-cdup']
>   gbp:debug: ['git', 'rev-parse', '--is-bare-repository']
>   gbp:debug: ['git', 'rev-parse', '--git-dir']
>   gbp:debug: ['git', 'rev-parse', '--show-cdup']
>   gbp:debug: ['git', 'rev-parse', '--is-bare-repository']
>   gbp:debug: ['git', 'rev-parse', '--git-dir']
>   gbp:debug: Will track branches: ['kali/master', 'upstream', 'pristine-tar']
>   gbp:debug: ['git', 'show-ref', '--verify', 'refs/remotes/origin/kali/master']
>   gbp:debug: ['git', 'show-ref', '--verify', 'refs/heads/kali/master']
>   gbp:debug: ['git', 'show-ref', '--verify', 'refs/remotes/origin/upstream']
>   gbp:debug: ['git', 'show-ref', '--verify', 'refs/heads/upstream']
>   gbp:debug: ['git', 'branch', 'upstream', 'origin/upstream']
>   gbp:debug: ['git', 'show-ref', '--verify', 'refs/remotes/origin/pristine-tar']
>   gbp:debug: ['git', 'show-ref', '--verify', 'refs/heads/pristine-tar']
>   gbp:debug: ['git', 'branch', 'pristine-tar', 'origin/pristine-tar']
>   gbp:debug: ['git', 'show-ref', '--verify', 'refs/remotes/kali/master']
>   gbp:debug: ['git', 'config', 'user.name', 'Arnaud Rebillout']
>   gbp:debug: ['git', 'config', 'user.email', 'arna...@kali.org']
>   gbp:debug: ['git', 'ls-tree', '-z', '-r', '-l', 'HEAD', '--']
>   gbp:debug: Found non-empty .gitattributes: b'.gitattributes'
>   gbp:debug: Configuring Git attributes
>
>   $ cd king-phisher2
>
>   $ git status
>   On branch kali/master
>   Your branch is up to date with 'origin/kali/master'.
>
>   Changes not staged for commit:
>     (use "git add <file>..." to update what will be committed)
>     (use "git restore <file>..." to discard changes in working directory)
>           modified:   data/server/king_phisher/GeoLite2-City.mmdb
>
>   no changes added to commit (use "git add" and/or "git commit -a")
>
>   $ cat .gitattributes
>   *.mmdb filter=lfs diff=lfs merge=lfs -text
>   $ cat .git/info/attributes
>   # Added by git-buildpackage to disable .gitattributes found in the upstream tree
>   [attr]dgit-defuse-attrs -text -eol -crlf -ident -filter -working-tree-encoding
>   * -export-ignore
>   * dgit-defuse-attrs
>   $ ls -l data/server/king_phisher/GeoLite2-City.mmdb
>   -rw-r--r-- 1 arno arno 61615395 Dec 15 12:12 data/server/king_phisher/GeoLite2-City.mmdb
>
> As we can see above (my interpretation):
> * during the 'gbp clone' step, the 'git clone' command will actually
>   trigger git lfs, and download the GeoLite2 database (assuming you have
>   the package git-lfs installed on your machine).
> * then at the end of the gbp clone operation, we can see "Configuring
>   Git attributes", and this is when gbp creates the file
>   .git/info/attributes
> * as a result, the git repo is in an unclean state
>
> To bring back the Git repo in shape, we can either:
>
> 1) Undo what gbp just did:
>
>   rm -fr .git/info/attributes
>
> 2) Undo what git lfs did:
>
>   $ git checkout data/server/king_phisher/GeoLite2-City.mmdb
>   Updated 1 path from the index
>   $ cat data/server/king_phisher/GeoLite2-City.mmdb
>   version https://git-lfs.github.com/spec/v1
>   oid sha256:a253d9cd68fe17b00087da24375f31f07cd4bb3852dc5fe3afe37b8f59e5abd0
>   size 61615395
Or use `gbp clone --git-defuse-attributes=off ...` ?

Cheers,
 -- Guido

>
> As we can see with option 2), the LFS file becomes a short metadata
> file, because that's what's really in the Git repo, before "git lfs"
> replaces it with the "real file" that it fetches from somewhere else.
>
> == Questions
>
> How should Git LFS files be handled? When "gbp clone" disables
> the gitattributes, it disables Git LFS in turn: is it intended, or not?
> Does gbp have an opinion on that? In any case, it seems that disabling
> the gitattributes after 'git clone' has run is too late, because the Git
> LFS objects were already fetched.
>
> Thanks for reading, and please help me understand how we should handle
> those LFS files.
>
> Arnaud
>
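
For anyone trying to tell which of the two states a checkout is in, here is a minimal sketch in shell. It is not part of gbp or git-lfs: the function names `is_lfs_pointer` and `attrs_defused` are made up for illustration. It assumes `head -c` is available and relies only on what the report shows, namely that a pointer file starts with the `version https://git-lfs.github.com/spec/v1` line and that the attributes file written by gbp mentions `dgit-defuse-attrs`.

```shell
#!/bin/sh
# Hypothetical helpers, not provided by gbp or git-lfs.

# First line of the Git LFS pointer file shown in the report.
LFS_POINTER_PREFIX='version https://git-lfs.github.com/spec/v1'

# Succeeds if FILE starts with the LFS pointer version line, i.e. the
# working tree holds the short metadata file rather than the real content.
is_lfs_pointer() {
    [ -f "$1" ] || return 1
    # Read only as many bytes as the prefix is long, and drop NUL bytes so
    # that a large binary file is handled quietly.
    start=$(head -c "${#LFS_POINTER_PREFIX}" "$1" | LC_ALL=C tr -d '\000')
    [ "$start" = "$LFS_POINTER_PREFIX" ]
}

# Succeeds if ATTRS_FILE (for example .git/info/attributes) mentions the
# dgit-defuse-attrs macro that appears in the file gbp wrote.
attrs_defused() {
    [ -f "$1" ] && grep -q 'dgit-defuse-attrs' "$1"
}
```

Run from the top of the clone, `attrs_defused .git/info/attributes` together with `is_lfs_pointer data/server/king_phisher/GeoLite2-City.mmdb` would tell whether the attributes are overridden while the working tree still holds the full database, which is the combination that shows up as "modified" in `git status`.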