Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 19 September 2014 07:33, Diamond wrote:

> Lets assume, that I don't want to scrap old ebuild yet. There's no git
> cp command. git mv is just git rm + git add. That's what does it look
> like (usual revbump with git add in reality):
>
> https://github.com/cerebrum/dr/commit/311df9b04d876f5847416fe5ba699edfab50adb6
>
> I think that git (at least with default config is a pain in the ass for
> packages at all and we should probably think about better platform for
> portage).

Not necessarily. It tracks copies too, with -C.

Also, don't rely on GitHub's presentation of things as being gospel. It's better than nothing, but it falls short of git.

    git log -p --find-copies-harder games-strategy/openra-bin/*.ebuild

You can quite easily convince git to pretend vaguely similar files in the log are sources of each other with the right options. Throwing "-M1 -C1" in that command will let git find more.

Just because "I think git can't" doesn't mean "git can't", especially if you've not asked "how can I do " :)

For instance:

    git log --stat -C1 -M1 --find-copies-harder

In your repo finds you these interesting "copies" if you look far enough:

    .../files/opentracker.init.d => net-misc/twonky/files/twonky.initd | 40 +--
    {www-apps/rutorrent => net-misc/minidlna}/metadata.xml | 16 ++-
    {games-strategy/openra-bin => dev-util/mono-debugger}/metadata.xml | 8 +--
    media-video/{webcamstudio => webcamstudio-module}/ChangeLog | 42 +

--
Kent
*KENTNL* - https://metacpan.org/author/KENTNL
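The copy tracking described above can be tried in a throwaway repository. This is only a sketch: the package name `foo`, the ebuild contents and the demo identity are made up, and it assumes a `git` binary is on the PATH. It performs a revbump with plain `cp` + `git add` and then compares the default diff with one that uses `-C --find-copies-harder`.

```shell
#!/bin/sh
# Sketch: revbump via "cp" + "git add", then ask git to report the copy.
# Package "foo" and the identity below are hypothetical.
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

repo=$(mktemp -d)
cd "$repo"
git init -q .

# A stand-in ebuild with enough content for similarity detection.
cat > foo-1.ebuild <<'EOF'
EAPI=5
DESCRIPTION="Hypothetical demo package"
HOMEPAGE="https://example.org/"
SRC_URI="https://example.org/${P}.tar.gz"
LICENSE="MIT"
SLOT="0"
KEYWORDS="~amd64"
EOF
git add foo-1.ebuild
git commit -q -m "foo: initial import"

# The revbump: the old ebuild stays, the new one is a slightly edited copy.
cp foo-1.ebuild foo-2.ebuild
echo 'IUSE="doc"' >> foo-2.ebuild
git add foo-2.ebuild
git commit -q -m "foo: bump to 2"

# Without copy detection the new ebuild is just an added file (status "A").
plain=$(git diff --name-status HEAD^ HEAD)

# With copy detection against unmodified files it is a copy (status "C").
copies=$(git diff --name-status -C --find-copies-harder HEAD^ HEAD)

printf 'default:  %s\n' "$plain"
printf 'with -C:  %s\n' "$copies"
```

Dropping `--name-status` from the second command prints the actual line-by-line difference between the two ebuilds, which is the comparison Diamond was asking for.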
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
Diamond wrote: > I stumbled over this problem when started to use git for packages. Use git show -M to unstumble yourself. //Peter
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On Thu, 18 Sep 2014 16:00:59 -0400 Rich Freeman wrote:

> What would you propose? The problem you raise is just as much an
> issue with cvs. I don't get a continuous history across revbumps in
> cvs today, so I don't really see a problem with moving to git.

I don't know what to propose. I stumbled over this problem when I started to use git for packages. At least there are other SCM systems too; I haven't investigated them yet for that issue. Facebook even uses its own one.
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On Thu, Sep 18, 2014 at 3:33 PM, Diamond wrote: > Lets assume, that I don't want to scrap old ebuild yet. There's no git > cp command. git mv is just git rm + git add. That's what does it look > like (usual revbump with git add in reality): > https://github.com/cerebrum/dr/commit/311df9b04d876f5847416fe5ba699edfab50adb6 > I think that git (at least with default config is a pain in the ass for > packages at all and we should probably think about better platform for > portage). > What would you propose? The problem you raise is just as much an issue with cvs. I don't get a continuous history across revbumps in cvs today, so I don't really see a problem with moving to git. -- Rich
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On Thu, 18 Sep 2014 17:04:55 +1200 Kent Fredric wrote: > What's more, you can in fact do: > > git mv foo-1.ebuild foo-2.ebuild > git commit > > and you can still easily tell git to show that as a difference in a > log. > > Example script to emulate this and example output: > https://gist.github.com/kentfredric/10e93e9aac875e9edb93 > > ( In fact, you don't even have to use 'git mv', as long as you change > the tree state completely, git is smart enough to track most changes ) > Lets assume, that I don't want to scrap old ebuild yet. There's no git cp command. git mv is just git rm + git add. That's what does it look like (usual revbump with git add in reality): https://github.com/cerebrum/dr/commit/311df9b04d876f5847416fe5ba699edfab50adb6 I think that git (at least with default config is a pain in the ass for packages at all and we should probably think about better platform for portage).
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 18 September 2014 13:01, Rich Freeman wrote:

> With git a revbump is:
> cp foo-1.ebuild foo-2.ebuild
> git add foo-2.ebuild
> git commit
>
> (I left out changelogs, repoman, etc, since there is no change with
> any of these, and I left out syncing the git repo.)
>
> There really is nothing new here.
>
> > Especially
> > if you need to see the diff between packagename-0.1-r1 and
> > packagename-0.1-r2 ebuilds? Git doesn't do this by default and it
> > will might be a nightmare to compare such revbumps by hand.
>
> cvs doesn't do anything to compare the contents of different files.
> So, there really is no loss here.

What's more, you can in fact do:

    git mv foo-1.ebuild foo-2.ebuild
    git commit

and you can still easily tell git to show that as a difference in a log.

Example script to emulate this and example output:
https://gist.github.com/kentfredric/10e93e9aac875e9edb93

(In fact, you don't even have to use 'git mv'; as long as you change the tree state completely, git is smart enough to track most changes.)

--
Kent
*KENTNL* - https://metacpan.org/author/KENTNL
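For readers who cannot reach the gist, here is a rough stand-in for the `git mv` case. It is not Kent's script: the file contents and the identity are invented for the demo. It shows the same commit reported once as a rename with `-M` and once as a separate delete and add with `--no-renames`.

```shell
#!/bin/sh
# Sketch: a revbump done with "git mv", viewed with and without
# rename detection. Package "foo" is hypothetical.
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

repo=$(mktemp -d)
cd "$repo"
git init -q .

cat > foo-1.ebuild <<'EOF'
EAPI=5
DESCRIPTION="Hypothetical demo package"
HOMEPAGE="https://example.org/"
LICENSE="MIT"
SLOT="0"
KEYWORDS="~amd64"
EOF
git add foo-1.ebuild
git commit -q -m "foo: initial import"

# Revbump that drops the old ebuild: rename, edit, commit.
git mv foo-1.ebuild foo-2.ebuild
echo 'IUSE="doc"' >> foo-2.ebuild
git commit -q -a -m "foo: bump to 2, drop old"

# Rename detection on: one "R" entry linking old and new file.
renames=$(git diff --name-status -M HEAD^ HEAD)

# Rename detection off: an unrelated delete ("D") and add ("A").
no_renames=$(git diff --name-status --no-renames HEAD^ HEAD)

printf 'with -M:\n%s\n\n' "$renames"
printf 'with --no-renames:\n%s\n' "$no_renames"
```

Git stores only the two tree states, so the rename is inferred at display time. That is why the result is the same whether the file was moved with `git mv` or with `mv` followed by `git add`.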
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On Wed, Sep 17, 2014 at 4:02 PM, Diamond wrote:
> On Mon, 15 Sep 2014 14:51:56 -0400
> Rich Freeman wrote:
>>
>> In general you want each commit to represent a single "change." That
>> might be a revbump in a single package, or it might be a package move
>> that involves touching 300 packages in a single commit.
>
> Is it right that you are going to move portage packages to
> git/github/..?

The intent is to move to git. Git is an SCM. GitHub is a service that hosts git repositories with some other value-adds. There is interest in mirroring gentoo-x86 on GitHub to help entice more people to contribute (assuming they like working with GitHub), but the nature of git makes it very easy to mirror a repository on as many sites as you care to. That is one of the reasons that it is so popular with FOSS.

> How are you going to make "revbump" with git?

With cvs a revbump is:

    cp foo-1.ebuild foo-2.ebuild
    cvs add foo-2.ebuild
    cvs commit

With git a revbump is:

    cp foo-1.ebuild foo-2.ebuild
    git add foo-2.ebuild
    git commit

(I left out changelogs, repoman, etc, since there is no change with any of these, and I left out syncing the git repo.)

There really is nothing new here.

> Especially
> if you need to see the diff between packagename-0.1-r1 and
> packagename-0.1-r2 ebuilds? Git doesn't do this by default and it
> will might be a nightmare to compare such revbumps by hand.

cvs doesn't do anything to compare the contents of different files. So, there really is no loss here.

--
Rich
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 18 September 2014 08:02, Diamond wrote: > Git doesn't do this by default and it > will might be a nightmare to compare such revbumps by hand. > git diff -M1 -C1 ^ is usually sufficient to show new files as differences between similar files that were already there, including revbumps. -- Kent *KENTNL* - https://metacpan.org/author/KENTNL
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On Mon, 15 Sep 2014 14:51:56 -0400 Rich Freeman wrote: > > In general you want each commit to represent a single "change." That > might be a revbump in a single package, or it might be a package move > that involves touching 300 packages in a single commit. Is it right that you are going to move portage packages to git/github/..? How are you going to make "revbump" with git? Especially if you need to see the diff between packagename-0.1-r1 and packagename-0.1-r2 ebuilds? Git doesn't do this by default and it will might be a nightmare to compare such revbumps by hand.
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 09/16/2014 01:56 PM, hasufell wrote:
> Luca Barbato:
>> On 15/09/14 01:21, Patrick Lauer wrote:
>>> On Sunday 14 September 2014 15:42:15 hasufell wrote:
>>>> Patrick Lauer:
>>>>>> Are we going to disallow merge commits and ask devs to rebase local
>>>>>> changes in order to keep the history "clean"?
>>>>>
>>>>> Is that going to be sane with our commit frequency?
>>>>
>>>> You have to merge or rebase anyway in case of a push conflict, so the
>>>> only difference is the method and the effect on the history.
>>>>
>>>> Currently... CVS allows you to run repoman on an outdated tree and push
>>>> broken ebuilds with repoman being happy. Git will not allow this.
>>>
>>> iow, git doesn't allow people to work on more than one item at a time?
>>
>> It does.
>>
>
> I think we really have to write up a step-by-step guide (not just
> workflow policies) for people who have never seriously worked with git.
>
> On the other hand... there are thousands of tutorials on the net already.
>
> For the workflow model, I already have created a draft which is in no
> way finished or even correct and there are still some controversially
> discussed issues.
> https://wiki.gentoo.org/wiki/Gentoo_git_workflow

As a prospective Gentoo developer, having a guide around meant specifically for Gentoo's practices would be incredibly helpful. I use git in my own hobby development and learned from Pro Git, et al, but it'd still be really nice to have, and give developers a place to point to if a new developer is having troubles.

Just my 2¢.
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 2014-09-16 at 19:05:18, Luca Barbato wrote:

> On 14/09/14 16:46, Michał Górny wrote:
> > Of course, if we can't spare the resources to do intermediate updates,
> > we may as well switch to cron-based update method.
>
> The mirror have a sync time, so basically regenerating the cache and
> pushing the tree with further toward the user can happen the same way
> w/out impacting anybody.
>
> We could just snapshot the tree when the regen starts and push that
> commit to the user git and rsync.
>
> People is still supposed to play nice and sync not every minute.

People don't have to sync. They will pull, and pulling often doesn't really hurt servers like rsync does.

Of course, I'm considering the users switching to git there. However, I don't think limitations of rsync should impact them.

--
Best regards,
Michał Górny
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
Luca Barbato: > On 15/09/14 01:21, Patrick Lauer wrote: >> On Sunday 14 September 2014 15:42:15 hasufell wrote: >>> Patrick Lauer: > Are we going to disallow merge commits and ask devs to rebase local > changes in order to keep the history "clean"? Is that going to be sane with our commit frequency? >>> >>> You have to merge or rebase anyway in case of a push conflict, so the >>> only difference is the method and the effect on the history. >>> >>> Currently... CVS allows you to run repoman on an outdated tree and push >>> broken ebuilds with repoman being happy. Git will not allow this. >> >> iow, git doesn't allow people to work on more than one item at a time? > > It does. > I think we really have to write up a step-by-step guide (not just workflow policies) for people who have never seriously worked with git. On the other hand... there are thousands of tutorials on the net already. For the workflow model, I already have created a draft which is in no way finished or even correct and there are still some controversially discussed issues. https://wiki.gentoo.org/wiki/Gentoo_git_workflow
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On Tue, Sep 16, 2014 at 1:07 PM, Luca Barbato wrote: > On 14/09/14 17:30, Patrick Lauer wrote: >>> Are we going to disallow merge commits and ask devs to rebase local >>> changes in order to keep the history "clean"? >> >> Is that going to be sane with our commit frequency? >> > > Which is our commit frequency? Worst case we can aggregate changes and > push them in bulk. > I don't think commit frequency will be a problem other than maybe causing repoman issues if you're doing tree-wide changes (since repoman takes a while to run). See: https://github.com/rich0/gentoo-gitmig-2014-09-15/graphs/commit-activity Our tree gets 50-150 commits/day on average it seems. I have no idea how far back the punchcard view goes, but that should give a relative sense of distribution. -- Rich
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 15/09/14 01:21, Patrick Lauer wrote:
> On Sunday 14 September 2014 15:42:15 hasufell wrote:
>> Patrick Lauer:
>>>> Are we going to disallow merge commits and ask devs to rebase local
>>>> changes in order to keep the history "clean"?
>>>
>>> Is that going to be sane with our commit frequency?
>>
>> You have to merge or rebase anyway in case of a push conflict, so the
>> only difference is the method and the effect on the history.
>>
>> Currently... CVS allows you to run repoman on an outdated tree and push
>> broken ebuilds with repoman being happy. Git will not allow this.
>
> iow, git doesn't allow people to work on more than one item at a time?

It does.

> That'd mean I need half a dozen checkouts just to emulate cvs, which somehow
> doesn't make much sense to me ...

Your statement sounds strange to me.

Commands you need to know:

    git rebase -i
    git add (-p)
    git commit (-p)
    git branch/checkout

Examples:

    edit cat/pkg/foo.ebuild
    edit cat2/pkg/bar.ebuild
    edit profile
    git add -p      # to select by line what you want in the commit
    git commit      # and you write down the commit message
    git commit -p   # to do both at the same time
    git commit -p   # again to lump other changes line by line

OR

    edit cat/pkg/foo.ebuild
    git commit -a   # everything (that's tracked) you edited gets in a commit
    edit cat/pkg/bar.ebuild
    git commit -a   # everything (that's tracked) you edited gets in again
    ...
    git rebase -i   # sort out what you want: commit merge, edit, drop etc
    git push

Git lets you do whatever you do in cvs, but in a _much_ saner and faster way.
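A non-interactive variant of the first example above might look like the sketch below. `git add -p` prompts for each hunk, so the script stages whole files by path instead; the paths follow the message, while the file contents and the identity are placeholders. Two unrelated edits made in the same checkout end up as two separate commits.

```shell
#!/bin/sh
# Sketch: two unrelated edits in one working tree, committed separately.
# Uses path-based "git add" because "git add -p" is interactive.
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

repo=$(mktemp -d)
cd "$repo"
git init -q .

mkdir -p cat/pkg cat2/pkg
echo 'EAPI=5' > cat/pkg/foo.ebuild
echo 'EAPI=5' > cat2/pkg/bar.ebuild
git add .
git commit -q -m "initial import"

# Work on two items at once; nothing is committed yet.
echo 'SLOT="0"' >> cat/pkg/foo.ebuild
echo 'SLOT="0"' >> cat2/pkg/bar.ebuild

# First commit: only the cat/pkg change is staged.
git add cat/pkg/foo.ebuild
git commit -q -m "cat/pkg: set SLOT"

# Second commit: the remaining change.
git add cat2/pkg/bar.ebuild
git commit -q -m "cat2/pkg: set SLOT"

count=$(git rev-list --count HEAD)
last_files=$(git diff --name-only HEAD^ HEAD)

git log --oneline --stat
```

Nothing here has been pushed, so the resulting commits could still be reordered, merged or dropped with `git rebase -i`, as the message says.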
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 14/09/14 17:30, Patrick Lauer wrote: >> Are we going to disallow merge commits and ask devs to rebase local >> changes in order to keep the history "clean"? > > Is that going to be sane with our commit frequency? > Which is our commit frequency? Worst case we can aggregate changes and push them in bulk. lu
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 14/09/14 16:46, Michał Górny wrote: > Of course, if we can't spare the resources to do intermediate updates, > we may as well switch to cron-based update method. The mirror have a sync time, so basically regenerating the cache and pushing the tree with further toward the user can happen the same way w/out impacting anybody. We could just snapshot the tree when the regen starts and push that commit to the user git and rsync. People is still supposed to play nice and sync not every minute. lu
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 14/09/14 17:15, Kent Fredric wrote: > On 15 September 2014 02:40, Michał Górny wrote: > >> However, I'm wondering if it would be possible to restrict people from >> accidentally committing straight into github (e.g. merging pull >> requests there instead of to our main server). >> > > > Easy. > > Put the Gentoo repo in its own group. > Don't give anyone any kinds of permissions on it. > Have only one approved account for the purpose of pushing commits. > Have a post-push hook that replicates to github as that approved account > > => Github is just a read only mirror, any pull reqs submitted there will be > fielded and pushed to gentoo directly. > > Only downside there is the way github pull reqs work is if the final SHA1's > that hit tree don't match, the pull req doesn't close. > > Solutions: > > - A) Have somebody tasked with reaping old pull reqs with permissions > granted. ( Uck ) > - B) Always use a merge of some kind to mark the pull req as dead ( for > instance, an "ours" merge to mark the branch as deprecated ) C) Ask nicely Github to have an application key and have a pull-request bridge to avoid the problem completely. I'd complete the migration first and discuss this kind of details later. lu
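The "post-push hook that replicates to github" idea can be sketched with two local bare repositories. In this demo `gentoo.git` plays the main server and `mirror.git` plays the read-only GitHub mirror. Both paths, the `master` branch name and the identity are stand-ins, and a real setup would push to a remote URL using the single approved account. The server-side hook git provides for this purpose is `post-receive`.

```shell
#!/bin/sh
# Sketch: a post-receive hook on the main repository that mirrors every
# accepted push to a second repository. All paths are hypothetical.
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

base=$(mktemp -d)
git init -q --bare "$base/gentoo.git"   # stand-in for the main server
git init -q --bare "$base/mirror.git"   # stand-in for the GitHub mirror

# The hook runs inside gentoo.git after refs have been updated.
mkdir -p "$base/gentoo.git/hooks"
cat > "$base/gentoo.git/hooks/post-receive" <<EOF
#!/bin/sh
# Replicate all refs to the read-only mirror.
git push --quiet --mirror "$base/mirror.git"
EOF
chmod +x "$base/gentoo.git/hooks/post-receive"

# A developer checkout that pushes only to the main server.
git init -q "$base/dev"
cd "$base/dev"
echo 'EAPI=5' > foo-1.ebuild
git add foo-1.ebuild
git commit -q -m "foo: initial import"
git push -q "$base/gentoo.git" HEAD:refs/heads/master

pushed=$(git rev-parse HEAD)
mirrored=$(git --git-dir="$base/mirror.git" rev-parse refs/heads/master)

printf 'pushed:   %s\nmirrored: %s\n' "$pushed" "$mirrored"
```

Because `--mirror` overwrites the destination's refs with the source's, anything committed directly on the mirror side would be discarded on the next push, which fits the "read only mirror" intent.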
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On Mon, Sep 15, 2014 at 1:42 PM, Ian Stakenvicius wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA256 > > On 14/09/14 09:06 PM, Peter Stuge wrote: >> Rich Freeman wrote: >>> If you just want to do 15 standalone commits before you push you >>> can do those sequentially easily enough. A branch would be more >>> appropriate for some kind of mini-project. >> .. >>> That is the beauty of git - branches are really cheap. So are >>> repositories >> >> And commits. >> >> Not only are branches cheap, they are also very easy to create, >> and maybe most importantly they can be created at any time, even >> after the commits. >> >> It's quick and painless to create a bunch of commits which aren't >> really closely related in sequence, and only later clean the whole >> series of commits up while creating different branches for commits >> which should actually be grouped rather than mixed all together. >> > > Ahh, so the secret here would then be just to git add files on a > related per-package basis, leaving the other files out of the commit. > that makes sense. There would still be the issue of untracked files > in the repo and the ability to switch back to the 'master' branch to > cherry-pick a commit for pushing, though... I guess we'd just have to > deal with the delay there and try and push all of the changes at once? If you really like to keep a lot of things going at once, I'd strongly recommend doing it in another branch (or more than one), so that you can cherry-pick/merge/etc stuff into master and not have a mess of an index while doing it. You need not publish that branch, and if you do publish it you need not do it on Gentoo's infra (a git repository can be synced with multiple other repositories, and the various heads don't need to stay in sync). In general you want each commit to represent a single "change." That might be a revbump in a single package, or it might be a package move that involves touching 300 packages in a single commit. 
The idea though is that every commit should stand on its own, so that they can be reverted on their own, etc. That's just good practice, and should be what we're trying to do with cvs with the huge limitation that multi-file changes in cvs aren't atomic. There are a lot of guides/tools/etc out there for using git. A popular one is the git-flow workflow. I'm not suggesting that it is really necessary for one person, but anybody not familiar with git should probably read up on it just so that you have some sense of how it can be used. -- Rich
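As a rough illustration of the advice above, the sketch below keeps work in progress on a private branch, cherry-picks one finished commit into master, and then reverts it on its own. The branch name `wip`, the package paths and the identity are invented for the demo.

```shell
#!/bin/sh
# Sketch: private work branch, cherry-pick one self-contained commit into
# master, then revert it in isolation. All names are hypothetical.
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # fix the branch name for the demo

echo 'demo tree' > README
git add README
git commit -q -m "initial"

# Several things in flight, each as its own commit on a private branch.
git checkout -q -b wip
mkdir -p cat/a cat/b
echo 'EAPI=5' > cat/a/a-1.ebuild
git add cat/a
git commit -q -m "cat/a: new package"
echo 'EAPI=5' > cat/b/b-1.ebuild
git add cat/b
git commit -q -m "cat/b: new package"

# Only cat/b is ready: pick that single commit (the tip of wip) into master.
git checkout -q master
git cherry-pick wip > /dev/null
picked=$(git ls-files)

# Because the commit stands on its own, it can be reverted on its own.
git revert --no-edit HEAD > /dev/null
after_revert=$(git ls-files)

printf 'after cherry-pick:\n%s\n\nafter revert:\n%s\n' "$picked" "$after_revert"
```

The `wip` branch is never pushed in this sketch, which matches the point that such branches need not be published.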
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 14/09/14 09:06 PM, Peter Stuge wrote:
> Rich Freeman wrote:
>> If you just want to do 15 standalone commits before you push you
>> can do those sequentially easily enough. A branch would be more
>> appropriate for some kind of mini-project.
> ..
>> That is the beauty of git - branches are really cheap. So are
>> repositories
>
> And commits.
>
> Not only are branches cheap, they are also very easy to create,
> and maybe most importantly they can be created at any time, even
> after the commits.
>
> It's quick and painless to create a bunch of commits which aren't
> really closely related in sequence, and only later clean the whole
> series of commits up while creating different branches for commits
> which should actually be grouped rather than mixed all together.

Ahh, so the secret here would then be just to git add files on a related per-package basis, leaving the other files out of the commit. That makes sense.

There would still be the issue of untracked files in the repo and the ability to switch back to the 'master' branch to cherry-pick a commit for pushing, though... I guess we'd just have to deal with the delay there and try and push all of the changes at once?
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 14/09/14 08:57 PM, Rich Freeman wrote:
> On Sun, Sep 14, 2014 at 7:21 PM, Patrick Lauer
> wrote:
>>
>> iow, git doesn't allow people to work on more than one item at a
>> time?
>>
>> That'd mean I need half a dozen checkouts just to emulate cvs,
>> which somehow doesn't make much sense to me ...
>>
>
> Well, you can work on as many things as you like in git, but it
> doesn't keep track of what changes have to do with what things if
> you don't commit in-between. So, you'll have a big list of changes
> in your index, and you'll have to pick-and-choose what you commit
> at any one time.
>
> If you really want to work on many things "at once" the better way
> to do it is to do a temporary branch per-thing, and when you
> switch between things you switch between branches, and then move
> into master things as they are done.
>
> I assume you mean working on things that will take a while to
> complete. If you just want to do 15 standalone commits before you
> push you can do those sequentially easily enough. A branch would
> be more appropriate for some kind of mini-project.
>
> You can work on branches without pushing those to the master repo.
> Or, if appropriate a project team might choose to push their branch
> to master, or to some other repo (like an overlay). This would
> allow collaborative work on a large commit, with a quick final
> merge into the main tree. That is the beauty of git - branches are
> really cheap. So are repositories - if somebody wants to do all
> their work in github and then push to the main tree, they can do
> that.

Actually I see what Patrick's getting at -- I have similar issues when working with mozilla stuff. If you're using local (temporary) branches, the whole tree is in the state of that current checkout, right? I.e., while I have my firefox-update branch active and working for an 'ebuild ... install', I can't be doing work in my 'freewrl-update' branch unless I make multiple separate repo trees, one for each independently-separate workflow I want to do concurrently.

Ideal here would be the ability to have separate active checkouts of a repo on a per-shell basis, i.e. each shell invocation would be able to work concurrently and distinctly on distinct branches; anyone done that before? Does git do it already?
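One possible answer, offered with hindsight: git releases from 2.5 onward ship `git worktree`, which attaches extra working directories to a single repository. It was not available when this thread was written, so treat the following as a sketch for a newer git. The directory names and the identity are made up; the branch names come from the message above.

```shell
#!/bin/sh
# Sketch (needs git >= 2.5): one repository, two extra working trees,
# each with a different branch checked out. Paths are hypothetical.
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

base=$(mktemp -d)
git init -q "$base/tree"
cd "$base/tree"
echo 'EAPI=5' > foo-1.ebuild
git add foo-1.ebuild
git commit -q -m "initial"

git branch firefox-update
git branch freewrl-update

# Each worktree is an independent checkout sharing the same object store.
git worktree add "$base/firefox" firefox-update > /dev/null 2>&1
git worktree add "$base/freewrl" freewrl-update > /dev/null 2>&1

ff=$(git -C "$base/firefox" rev-parse --abbrev-ref HEAD)
fw=$(git -C "$base/freewrl" rev-parse --abbrev-ref HEAD)

printf '%s -> %s\n' "$base/firefox" "$ff"
printf '%s -> %s\n' "$base/freewrl" "$fw"
```

With this layout one shell can sit in the `firefox` directory running a long build while another commits in `freewrl`; neither checkout changes underneath the other.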
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 14/09/14 07:21 PM, Patrick Lauer wrote:
> On Sunday 14 September 2014 15:42:15 hasufell wrote:
>> Patrick Lauer:
>>>> Are we going to disallow merge commits and ask devs to rebase local
>>>> changes in order to keep the history "clean"?
>>>
>>> Is that going to be sane with our commit frequency?
>>
>> You have to merge or rebase anyway in case of a push conflict, so
>> the only difference is the method and the effect on the history.
>>
>> Currently... CVS allows you to run repoman on an outdated tree
>> and push broken ebuilds with repoman being happy. Git will not
>> allow this.
>
> iow, git doesn't allow people to work on more than one item at a
> time?
>
> That'd mean I need half a dozen checkouts just to emulate cvs,
> which somehow doesn't make much sense to me ...

I think you'd just need to have a (large) handful of different branches, and squash+cherry-pick into master when you're ready to push.
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On Mon, Sep 15, 2014 at 11:26:47PM +1200, Kent Fredric wrote: > None of these are impossible things, but they're much more complex than > "just make a dodgy commit and get somebody to pull it". Much more simple would be to make a dodgy commit by one of the devs. Why use users for that, if "the bad guy(/-s)" could be inside? Isn't that the best way to poison stuff? Just look at the "let's move to git" threads, there're bikeshedding and it's still cvs+rsync (maybe, just maybe, this isn't a coincidence) Piotr Szymaniak. -- Mezczyzna odlozyl gazete z powrotem na stojak i postapil krok w przod, robiac mine, ktora nadala mu wyglad swinskiego pecherza rozpietego na drucianym wieszaku do garniturow. -- Graham Masterton, "The Burning" signature.asc Description: Digital signature
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 15 September 2014 22:10, Jauhien Piatlicki wrote: > So signing of git commits does not guarantee enough security (taking > that SHA1 is weak and can be broken), right? Could we than just use > usual (not thin) manifests? > However, the attackability of SHA1 may be entirely immaterial, because methods to exploit that require compromising other security strategies. If somebody pushes signed commit 0x0001 with parents 0x0002 and 0x0003 with tree 0x0004 with files 0x0005 to 0x0010, those binary blobs are pushed. And there is no way I know of to have those binary blobs replaced with cuckoo blobs. Once they're replicated, Git doesn't try re-replicating the same SHA1s. So your attack vectors entail directly manipulating the git storage on gentoo's servers, or poisoning a mirror under their control, or poisoning the data *PRIOR* to it landing on gentoo servers, or being NSA and poisoning it dynamically when a user attempts to fetch that specific SHA1. None of these are impossible things, but they're much more complex than "just make a dodgy commit and get somebody to pull it". This basically means you could use CRC32 as your hash algorithm and still pose a respectable problem for would-be attackers. As such, I don't presently see git commit signing as a "Security" model, merely a proof of authorship model. Anybody can forge commits with your 'author = ' and 'committer = ', but you're the only person who can sign the commit with your signature. That is to say, you sign that you crafted the *commit*. But you're *not* signing the creation of any of the dependencies. For instance, two commits may have the same tree, but obviously only one person forged that tree object. And two trees may have the same file ( and indeed, this is an expected element of how git works ), but you're not signing that you created all the files in that tree. You may infer that from the chain of authority from the commit itself, but it is not fact. 
And parent objects are also dependencies, and nobody would ever consider claiming they're a signing authority for all of that ;), that would be by proxy signing the creation of the entire repository back to the first commit ever forged! and its for that reason its probably good that git doesn't presently recursively feed all dependencies of a commit into GPG. I don't have 5 hours while every single blob in my repository is uncompressed and fed through GPG :p -- Kent *KENTNL* - https://metacpan.org/author/KENTNL
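The point about what a commit object does and does not contain can be inspected directly. The sketch below creates two commits that share one tree and prints the raw commit object. The file name and the identity are arbitrary, and no signing is performed; it only shows that the commit object names its tree and parents by hash rather than embedding their contents.

```shell
#!/bin/sh
# Sketch: a commit object is a short text that references its tree and
# parent by SHA1. Two distinct commits can share the same tree.
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

repo=$(mktemp -d)
cd "$repo"
git init -q .

echo 'EAPI=5' > foo-1.ebuild
git add foo-1.ebuild
git commit -q -m "first"

# A second commit with no content change reuses the same tree object.
git commit -q --allow-empty -m "second, same content"

tree_first=$(git rev-parse 'HEAD~1^{tree}')
tree_second=$(git rev-parse 'HEAD^{tree}')
commit_first=$(git rev-parse HEAD~1)
commit_second=$(git rev-parse HEAD)

# The raw commit object: "tree <sha1>", "parent <sha1>", author,
# committer and the message. No file contents appear in it.
first_line=$(git cat-file -p HEAD | head -n 1)
git cat-file -p HEAD
```

Everything below the commit object (trees and blobs) is reached only through those hashes, which is the chain of inference the message describes.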
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
Hi,

On 09/15/2014 01:37 AM, Kent Fredric wrote:
> On 15 September 2014 11:25, hasufell wrote:
>
>> Robin said
>>> The Git commit-signing design explicitly signs the entire commit,
>>> including blob contents, to avoid this security problem.
>>
>> Is this correct or not?
>>
>
> I can verify a commit by hand with only the commit object and gpg, but
> without any of the trees or parents.
>
> https://gist.github.com/kentfredric/8448fe55ffab7d314ecb

So signing of git commits does not guarantee enough security (given that SHA1 is weak and can be broken), right? Could we then just use usual (not thin) manifests?

--
Jauhien
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 2014-09-15 at 07:21:35, Patrick Lauer wrote:

> On Sunday 14 September 2014 15:42:15 hasufell wrote:
> > Patrick Lauer:
> > >> Are we going to disallow merge commits and ask devs to rebase local
> > >> changes in order to keep the history "clean"?
> > >
> > > Is that going to be sane with our commit frequency?
> >
> > You have to merge or rebase anyway in case of a push conflict, so the
> > only difference is the method and the effect on the history.
> >
> > Currently... CVS allows you to run repoman on an outdated tree and push
> > broken ebuilds with repoman being happy. Git will not allow this.
>
> iow, git doesn't allow people to work on more than one item at a time?
>
> That'd mean I need half a dozen checkouts just to emulate cvs, which somehow
> doesn't make much sense to me ...

I'd appreciate it if you reduced FUD to a minimum. What hasufell meant is that you normally don't have three-year-old files lying around in a checkout because you did 'cvs up -dP' in another directory. With git, you update everything. What you do locally is totally unrelated.

--
Best regards,
Michał Górny
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 2014-09-14 at 21:30:36, Tim Harder wrote:

> On 2014-09-14 10:46, Michał Górny wrote:
> > On 2014-09-14 at 15:40:06,
> > Davide Pesavento wrote:
> > > How long does the md5-cache regeneration process take? Are you sure it
> > > will be able to keep up with the rate of pushes to the repo during
> > > "peak hours"? If not, maybe we could use a time-based thing similar to
> > > the current cvs->rsync synchronization.
> >
> > This strongly depends on how much data is there to update. A few
> > ebuilds are quite fast, eclass change isn't ;). I was thinking of
> > something along the lines of, in pseudo-code speaking:
> >
> > systemctl restart cache-regen
> >
> > That is, we start the regen on every update. If it finishes in time, it
> > commits the new metadata. If another update occurs during regen, we
> > just restart it to let it catch the new data.
> >
> > Of course, if we can't spare the resources to do intermediate updates,
> > we may as well switch to cron-based update method.
>
> I don't see per push metadata regen working entirely well in this case
> if this is the only way we're generating the metadata cache for users to
> sync. It's easy to imagine a plausible situation where a widely used
> eclass change is made followed by commits less than a minute apart (or
> shorter than however long it would take for metadata regen to occur) for
> at least 30 minutes (rsync refresh period for most user-facing mirrors)
> during a time of high activity.

For a metadata recheck (that is, egencache run with no changes):

a. cold cache ext4:

    real    3m54.321s
    user    0m44.413s
    sys     0m13.497s

b. warm cache ext4:

    real    0m40.672s
    user    0m35.087s
    sys     0m4.687s

I will try to re-run that on btrfs or reiserfs to get more meaningful numbers.

Now, those results back up your claims. However, if we can get that to <10s, I doubt we would have a major issue. My idea works like this:

1. first update is pushed,
1a. egencache starts rechecking and updating cache,
2. second update is pushed,
2a. previous egencache is terminated,
2b. egencache starts rechecking and updating cache,
2c. egencache finishes in time and commits.

The point is, nothing gets committed to the user-reachable location before egencache finishes. And it goes quasi-incrementally, so if another update happens before egencache finished, it only does the 'slow' regen on changed metadata.

I will come back with more results soon.

--
Best regards,
Michał Górny
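The restart-on-push scheme (steps 1 to 2c) can be mimicked with plain shell job control. This is a toy model, not the proposed systemd unit. `regen` is a hypothetical stand-in that sleeps instead of running egencache, and a temporary file plays the user-reachable location. It shows that a regen interrupted by a newer push never publishes anything.

```shell
#!/bin/sh
# Toy model of "restart the regen on every push; publish only when a run
# finishes". The sleep stands in for the real metadata regeneration.
set -eu

work=$(mktemp -d)
published="$work/published"   # stand-in for the user-reachable location
: > "$published"
regen_pid=""

# Hypothetical regen job: slow work first, publish as the very last step.
regen() {
    sleep 2
    echo "$1" >> "$published"
}

# Called for every push: terminate a running regen, then start a new one.
on_push() {
    if [ -n "$regen_pid" ] && kill -0 "$regen_pid" 2>/dev/null; then
        kill "$regen_pid" 2>/dev/null || true
        wait "$regen_pid" 2>/dev/null || true
    fi
    regen "$1" &
    regen_pid=$!
}

on_push update-1      # steps 1 and 1a
sleep 1               # a second push arrives before the regen is done
on_push update-2      # steps 2, 2a and 2b
wait "$regen_pid"     # step 2c: this run finishes and publishes

result=$(cat "$published")
printf 'published: %s\n' "$result"
```

The model also makes Tim's concern visible: if pushes keep arriving faster than one regen takes, the publish step is never reached.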
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 2014-09-14 21:57, Kent Fredric wrote:
> I generate metadata for the perl-experimental overlay periodically as a
> snapshotted variation of the same, and the performance isn't so bad.

Overlays with few eclasses are much different than the main tree.
Anyway, egencache isn't bad; it's just significantly slower than the
alternatives, so it could be sped up quite a lot if necessary.

> However, what I suspect you *could* do with a push hook is regen metadata
> for only things that were modified in that commit, because I believe
> there's a way to regen metadata for only specific files now.
> ie:
> modifications to cat/PN *would* trigger a metadata update, but only for
> that cat/PN
> modifications to eclass/* would *NOT* trigger a metadata update as part of
> the push.
> And doing tree-wide "an eclass was changed" updates could be done with
> lower priority in an asynchronous cron job or something so as not to block
> workflow for several minutes/hours/whatever while some muppet sits there
> watching "git push" do nothing.

If we need to do piecewise regen it seems we would be better off just
sticking with the current scheduled cron job approach. Otherwise it
sounds like one could pull updates without having the correct metadata
for a significant portion of the tree.

Tim
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 15 September 2014 13:30, Tim Harder wrote: > I haven't run portage metadata regen on a beefy machine lately, but I > don't think it could keep up in all cases. Perhaps someone can prove me > wrong. > > Anyway, things could definitely be sped up if portage merges a few speed > tweaks used in pkgcore. Specifically, I think using some of the weakref > and perhaps jitted attrs support along with the eclass caching hacks > would give a 2-4x metadata regen speedup. Otherwise pkgcore could > potentially be used to regen metadata as well or some other tuned regen > tool. > I generate metadata for the perl-experimental overlay periodically as a snapshotted variation of the same, and the performance isn't so bad. However, what I suspect you *could* do with a push hook is regen metadata for only things that were modified in that commit, because I believe there's a way to regen metadata for only specific files now. ie: modifications to cat/PN *would* trigger a metadata update, but only for that cat/PN modifications to eclass/* would *NOT* trigger a metadata update as part of the push. And doing tree-wide "an eclass was changed" updates could be done with lower priority in an asynchronous cron job or something so as not to block workflow for several minutes/hours/whatever while some muppet sits there watching "git push" do nothing. -- Kent *KENTNL* - https://metacpan.org/author/KENTNL
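The split Kent proposes (per-package regen in the push hook, eclass changes deferred to a background job) could look roughly like this. It is a sketch, not an existing hook: `plan_regen` and `IGNORED_TOP_LEVEL` are names invented here, and the list of changed paths is assumed to come from something like `git diff --name-only` between the old and new revisions.

```python
from typing import Iterable, Set, Tuple

# Assumption: top-level directories that are not package categories.
IGNORED_TOP_LEVEL = {"profiles", "metadata", "licenses", "scripts"}


def plan_regen(changed_paths: Iterable[str]) -> Tuple[Set[str], bool]:
    """Split changed paths into packages to regen now and an eclass flag.

    Returns (set of "cat/PN" strings, True if any eclass was touched).
    """
    packages: Set[str] = set()
    eclass_changed = False
    for path in changed_paths:
        parts = path.split("/")
        if parts[0] == "eclass":
            # Tree-wide impact: leave it to the low-priority job.
            eclass_changed = True
        elif len(parts) >= 3 and parts[0] not in IGNORED_TOP_LEVEL:
            # cat/PN/<file> only affects the metadata of that one package.
            packages.add("/".join(parts[:2]))
    return packages, eclass_changed
```

The hook would regenerate only the returned packages and, when the flag is set, queue a full regen elsewhere. Tim's objection still applies: until the deferred job finishes, users can sync a tree whose cache is stale for every consumer of the changed eclass.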
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 2014-09-14 10:46, Michał Górny wrote:
> On 2014-09-14 at 15:40:06, Davide Pesavento wrote:
> > How long does the md5-cache regeneration process take? Are you sure it
> > will be able to keep up with the rate of pushes to the repo during
> > "peak hours"? If not, maybe we could use a time-based thing similar to
> > the current cvs->rsync synchronization.
>
> This strongly depends on how much data is there to update. A few
> ebuilds are quite fast, eclass change isn't ;). I was thinking of
> something along the lines of, in pseudo-code speaking:
>
>   systemctl restart cache-regen
>
> That is, we start the regen on every update. If it finishes in time, it
> commits the new metadata. If another update occurs during regen, we
> just restart it to let it catch the new data.
>
> Of course, if we can't spare the resources to do intermediate updates,
> we may as well switch to cron-based update method.

I don't see per push metadata regen working entirely well in this case
if this is the only way we're generating the metadata cache for users to
sync. It's easy to imagine a plausible situation where a widely used
eclass change is made followed by commits less than a minute apart (or
shorter than however long it would take for metadata regen to occur) for
at least 30 minutes (rsync refresh period for most user-facing mirrors)
during a time of high activity.

I haven't run portage metadata regen on a beefy machine lately, but I
don't think it could keep up in all cases. Perhaps someone can prove me
wrong.

Anyway, things could definitely be sped up if portage merges a few speed
tweaks used in pkgcore. Specifically, I think using some of the weakref
and perhaps jitted attrs support along with the eclass caching hacks
would give a 2-4x metadata regen speedup. Otherwise pkgcore could
potentially be used to regen metadata as well, or some other tuned regen
tool.

Tim
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 15 September 2014 13:06, Peter Stuge wrote: > even after > the commits. > I've even made branches in "detached head" state ( that is, without a branch ) and given them branches after the fact. After all, branches aren't really "things", they're just pointers to SHA1s, that get repointed to new sha1's as part of "git commit". Tags are also simply pointers, they just don't get updated by default. -- Kent *KENTNL* - https://metacpan.org/author/KENTNL
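The "branches are just pointers" point can be illustrated with a toy model. This is not how git stores anything on disk; `ToyRepo` and its method names are invented for this note, and commit ids are plain strings instead of SHA1s.

```python
class ToyRepo:
    """Toy model: refs are names pointing at commit ids, nothing more."""

    def __init__(self):
        self.parents = {}           # commit id -> parent commit id (or None)
        self.refs = {}              # branch or tag name -> commit id
        self.head = None            # commit id HEAD currently points at
        self.current_branch = None  # None means "detached HEAD"

    def commit(self, commit_id):
        self.parents[commit_id] = self.head
        self.head = commit_id
        # Only the checked-out branch is repointed; tags never move here.
        if self.current_branch is not None:
            self.refs[self.current_branch] = commit_id

    def checkout_new_branch(self, name):
        # Creating a branch is just recording a name for the current commit.
        self.refs[name] = self.head
        self.current_branch = name

    def tag(self, name):
        self.refs[name] = self.head


if __name__ == "__main__":
    repo = ToyRepo()
    repo.commit("c1")                  # committed on a detached HEAD
    repo.commit("c2")
    repo.checkout_new_branch("topic")  # branch created after the commits
    repo.tag("v1")
    repo.commit("c3")
    print(repo.refs)
```

After the last commit, the branch `topic` has followed the new commit while the tag `v1` still names the commit it was created on.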
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
Rich Freeman wrote: > If you just want to do 15 standalone commits before you push you can > do those sequentially easily enough. A branch would be more > appropriate for some kind of mini-project. .. > That is the beauty of git - branches are really cheap. > So are repositories And commits. Not only are branches cheap, they are also very easy to create, and maybe most importantly they can be created at any time, even after the commits. It's quick and painless to create a bunch of commits which aren't really closely related in sequence, and only later clean the whole series of commits up while creating different branches for commits which should actually be grouped rather than mixed all together. //Peter
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
Patrick Lauer wrote: > > > That'd mean I need half a dozen checkouts just to emulate cvs, which > > > somehow doesn't make much sense to me ... > > > > Unlike CVS, git doesn't force you to work in "Keep millions of files in > > uncommitted states" mode just to work on a codebase, due to the commit <-> > > replicate seperation. > > But that's the feature! You can have millions of uncommitted files with git too. The person who creates a commit always decides what changes in what files should be included in that commit. (You don't even have to commit all the changes within one file at the same time.) There are some shortcuts for committing all uncommitted changes at once but you don't have to do that. I frequently only commit little bits of my currently uncommitted changes. > I can work on bumping postgresql (takes about 1h walltime to compile and test > all versions) *and* work on a few tiny python packages while doing that. > Without breaking either process. Without multiple checkouts. Same with git. //Peter
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On Sun, Sep 14, 2014 at 7:21 PM, Patrick Lauer wrote: > > iow, git doesn't allow people to work on more than one item at a time? > > That'd mean I need half a dozen checkouts just to emulate cvs, which somehow > doesn't make much sense to me ... > Well, you can work on as many things as you like in git, but it doesn't keep track of what changes have to do with what things if you don't commit in-between. So, you'll have a big list of changes in your index, and you'll have to pick-and-choose what you commit at any one time. If you really want to work on many things "at once" the better way to do it is to do a temporary branch per-thing, and when you switch between things you switch between branches, and then move into master things as they are done. I assume you mean working on things that will take a while to complete. If you just want to do 15 standalone commits before you push you can do those sequentially easily enough. A branch would be more appropriate for some kind of mini-project. You can work on branches without pushing those to the master repo. Or, if appropriate a project team might choose to push their branch to master, or to some other repo (like an overlay). This would allow collaborative work on a large commit, with a quick final merge into the main tree. That is the beauty of git - branches are really cheap. So are repositories - if somebody wants to do all their work in github and then push to the main tree, they can do that. -- Rich
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
Patrick Lauer:
> On Monday 15 September 2014 11:27:34 Kent Fredric wrote:
>> On 15 September 2014 11:21, Patrick Lauer wrote:
>>> iow, git doesn't allow people to work on more than one item at a time?
>>>
>>> That'd mean I need half a dozen checkouts just to emulate cvs, which
>>> somehow doesn't make much sense to me ...
>>
>> Use the Stash. Or just commit items, then swap branches, and then discard
>> the commits sometime later before pushing.
>>
>> Unlike CVS, git doesn't force you to work in "Keep millions of files in
>> uncommitted states" mode just to work on a codebase, due to the commit <->
>> replicate separation.
> But that's the feature!
>
> I can work on bumping postgresql (takes about 1h walltime to compile and test
> all versions) *and* work on a few tiny python packages while doing that.
> Without breaking either process. Without multiple checkouts.
>
> I doubt stash would allow things to progress ... but it's a cute idea.
>

Please read up about git branches. I don't see anything particularly
broken. People use git to work on 10+ different features at a time. It
works.

Also, let's not derail this thread into git vs CVS, thanks.
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 15 September 2014 11:25, hasufell wrote: > Robin said > > The Git commit-signing design explicitly signs the entire commit, > including blob contents, to avoid this security problem. > > Is this correct or not? > I can verify a commit by hand with only the commit object and gpg, but without any of the trees or parents. https://gist.github.com/kentfredric/8448fe55ffab7d314ecb -- Kent *KENTNL* - https://metacpan.org/author/KENTNL
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
Patrick Lauer:
> On Sunday 14 September 2014 15:42:15 hasufell wrote:
>> Patrick Lauer:
>>>> Are we going to disallow merge commits and ask devs to rebase local
>>>> changes in order to keep the history "clean"?
>>>
>>> Is that going to be sane with our commit frequency?
>>
>> You have to merge or rebase anyway in case of a push conflict, so the
>> only difference is the method and the effect on the history.
>>
>> Currently... CVS allows you to run repoman on an outdated tree and push
>> broken ebuilds with repoman being happy. Git will not allow this.
>
> iow, git doesn't allow people to work on more than one item at a time?
>

Completely the opposite. You can work on 400 packages, accumulate the
changes, commit them and push them in one blow instead of writing fragile
scripts or Makefiles that do >400 pushes, fail at some point in the
middle because of a conflict and then try to figure out what you already
pushed and what not.

> That'd mean I need half a dozen checkouts just to emulate cvs, which somehow
> doesn't make much sense to me ...
>

Checkouts? You probably mean that you have to rebase your changes in case
someone pushed before you. That makes perfect sense, because the ebuild
you just wrote might be broken by now, because someone changed profiles/.

We are talking about a one-liner in the shell that will work in the
majority of the cases. If it doesn't work (as in: merge conflict), then
that means there is something REALLY wrong and 2 people are working
uncoordinated on the same file at a time.
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On Monday 15 September 2014 11:27:34 Kent Fredric wrote: > On 15 September 2014 11:21, Patrick Lauer wrote: > > iow, git doesn't allow people to work on more than one item at a time? > > > > That'd mean I need half a dozen checkouts just to emulate cvs, which > > somehow > > doesn't make much sense to me ... > > Use the Stash. Or just commit items, then swap branches, and then discard > the commits sometime later before pushing. > > Unlike CVS, git doesn't force you to work in "Keep millions of files in > uncommitted states" mode just to work on a codebase, due to the commit <-> > replicate seperation. But that's the feature! I can work on bumping postgresql (takes about 1h walltime to compile and test all versions) *and* work on a few tiny python packages while doing that. Without breaking either process. Without multiple checkouts. I doubt stash would allow things to progress ... but it's a cute idea.
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On Sun, Sep 14, 2014 at 11:25:33PM +, hasufell wrote:
> So can we get this clear now.
>
> Robin said
>
> > The Git commit-signing design explicitly signs the entire commit,
> > including blob contents, to avoid this security problem.
>
> Is this correct or not?

That is false. The commit signature explicitly signs the commit, which
includes the root tree hash. That is the only connection between the
signature and the tree contents.

Cheers,
Trevor
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 15 September 2014 11:21, Patrick Lauer wrote: > iow, git doesn't allow people to work on more than one item at a time? > > That'd mean I need half a dozen checkouts just to emulate cvs, which > somehow > doesn't make much sense to me ... > Use the Stash. Or just commit items, then swap branches, and then discard the commits sometime later before pushing. Unlike CVS, git doesn't force you to work in "Keep millions of files in uncommitted states" mode just to work on a codebase, due to the commit <-> replicate seperation. -- Kent *KENTNL* - https://metacpan.org/author/KENTNL
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
Rich Freeman: > On Sun, Sep 14, 2014 at 6:56 PM, hasufell wrote: >> According to Robin, it's not about rebasing, it's about signing all >> commits so that messing with the blob (even if it has the same sha-1) >> will cause signature verification failure. >> > > The only thing that gets signed is the commit message, and the only > thing that ties the commit message to the code is the sha1 of the > top-level tree. If you can attack sha1 either at any tree level or at > the blob level you can defeat the signature. > So can we get this clear now. Robin said > The Git commit-signing design explicitly signs the entire commit, including > blob contents, to avoid this security problem. Is this correct or not?
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On Sun, Sep 14, 2014 at 07:13:21PM -0400, Rich Freeman wrote:
> The only thing that gets signed is the commit message, and the only
> thing that ties the commit message to the code is the sha1 of the
> top-level tree. If you can attack sha1 either at any tree level or at
> the blob level you can defeat the signature.
>
> That is way better than nothing though - I think it is worth pursuing
> until somebody comes up with a way to upgrade git to more secure
> hashes. Most projects don't gpg sign their trees at all, including
> linux.

I'm not worried about the attack (as I explained earlier in this
thread). I'm just arguing for signing first-parent commits to master,
and not worrying about signatures on any side-branch commits. So long as
the merge gets signed, you've got all the security you're going to get.
Leaving the side-branch commits unchanged allows you to preserve any
non-dev commit hashes, which makes it easier for contributors to verify
that their changes have landed (the same way that GitHub is checking to
know when to automatically close pull requests).

Cheers,
Trevor
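A policy of "every first-parent commit on master is signed, side branches may be unsigned" can be checked by walking only the first-parent line. The sketch below runs on an in-memory history; the `Commit` tuple and both function names are invented for illustration, and whether a commit counts as signed is simply given as a flag here instead of being verified with GnuPG.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Commit(NamedTuple):
    parents: Tuple[str, ...]  # parents[0] is the first parent
    signed: bool              # assumed to be the result of a signature check


def first_parent_line(history: Dict[str, Commit], head: str) -> List[str]:
    """Follow only the first parent of each commit, starting at head."""
    line = []
    current: Optional[str] = head
    while current is not None:
        line.append(current)
        parents = history[current].parents
        current = parents[0] if parents else None
    return line


def unsigned_on_first_parent(history: Dict[str, Commit], head: str) -> List[str]:
    """Commits on the first-parent line that would violate the policy."""
    return [c for c in first_parent_line(history, head) if not history[c].signed]


if __name__ == "__main__":
    history = {
        "root": Commit((), True),
        "dev1": Commit(("root",), True),
        "user1": Commit(("root",), False),         # contributor commit, unsigned
        "merge": Commit(("dev1", "user1"), True),  # signed merge by a developer
    }
    print(first_parent_line(history, "merge"))
    print(unsigned_on_first_parent(history, "merge"))
```

The contributor's commit `user1` keeps its original hash and is never visited by the check, because it is only reachable through the second parent of the signed merge. Against a real repository, `git log --first-parent` with the `%G?` format placeholder gives a comparable view.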
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 15 September 2014 11:15, W. Trevor King wrote: > All cherry-pick and am do is apply one commit's diff to a different > parent. Changing the parent hash (which is stored in the commit body > [1]), so old signatures won't apply to the new commit. If there have > been other tree changes between the initial parent and the new parent, > the tree hash will also change, which would also break old signatures. > None of that has anything to do with a malicious blob being pushed > into the tree disguised as a same-hashed good blob. Such a blob will > *not* break any signatures, since GnuPG is *never hashing the blob > contents* when signing commits [1,2]. You're only signing the commit > object, not the tree and blob objects referenced by that commit. > > Cheers, > Trevor > And given that the method of "security" against attacks is established by a chain of custody from a signed commit, through multiple child unsigned SHA1 objects, having a parent being an unsigned commit is no *less* secure than having a tree or file blob being unsigned, it doesn't make perfect sense to me that "all" commits have to be signed. ( Because doing so doesn't give the benefit of security we think it does ). Thus, a "I signed this commit, establishing a chain of trust relying on SHA1 integrity to the previous signed commit" is all that seems truly necessary. Anything else is decreased utility with no increase in security. -- Kent *KENTNL* - https://metacpan.org/author/KENTNL
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On Sunday 14 September 2014 15:42:15 hasufell wrote: > Patrick Lauer: > >> Are we going to disallow merge commits and ask devs to rebase local > >> changes in order to keep the history "clean"? > > > > Is that going to be sane with our commit frequency? > > You have to merge or rebase anyway in case of a push conflict, so the > only difference is the method and the effect on the history. > > Currently... CVS allows you to run repoman on an outdated tree and push > broken ebuilds with repoman being happy. Git will not allow this. iow, git doesn't allow people to work on more than one item at a time? That'd mean I need half a dozen checkouts just to emulate cvs, which somehow doesn't make much sense to me ...
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 15 September 2014 10:56, hasufell wrote: > According to Robin, it's not about rebasing, it's about signing all > commits so that messing with the blob (even if it has the same sha-1) > will cause signature verification failure. > Correct me if I'm wrong, but wouldn't a SHA1 attack on the tree object or file blobs be completely invisible to the commit SHA1? As the Signature only signs content of the commit object, not any of the nodes it refers to. Granted, getting a tree/file object to replicate might be interesting. -- Kent *KENTNL* - https://metacpan.org/author/KENTNL
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On Sun, Sep 14, 2014 at 10:56:33PM +, hasufell wrote:
> W. Trevor King:
> > On Sun, Sep 14, 2014 at 10:38:41PM +, hasufell wrote:
> >> So we'd basically end up using either "git cherry-pick" or "git
> >> am" for "pulling" user stuff, so that we also sign the blobs.
> >
> > Rebasing the original commits doesn't protect you from the
> > birthday attack either, because the vulnerable hash is likely
> > going to still be in the rebased commit's tree. All rebasing does
> > is swap the committer and drop the initial signature.
>
> According to Robin, it's not about rebasing, it's about signing all
> commits so that messing with the blob (even if it has the same
> sha-1) will cause signature verification failure.

All cherry-pick and am do is apply one commit's diff to a different
parent. That changes the parent hash (which is stored in the commit body
[1]), so old signatures won't apply to the new commit. If there have
been other tree changes between the initial parent and the new parent,
the tree hash will also change, which would also break old signatures.
None of that has anything to do with a malicious blob being pushed into
the tree disguised as a same-hashed good blob. Such a blob will *not*
break any signatures, since GnuPG is *never hashing the blob contents*
when signing commits [1,2]. You're only signing the commit object, not
the tree and blob objects referenced by that commit.

Cheers,
Trevor

[1]: http://article.gmane.org/gmane.linux.gentoo.devel/77537
[2]: http://git.kernel.org/cgit/git/git.git/tree/commit.c?id=v2.1.0#n1076
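The chain Trevor describes (signature over the commit object, commit object names the tree by SHA1, tree names the blobs by SHA1) can be reproduced with nothing but `hashlib`. This is a minimal sketch of git's object id scheme for a flat tree of regular files; the helper names, the file name `foo.initd`, and the identity string are made up for the example, and the GnuPG step itself is left out.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """SHA1 over "<kind> <size>\\0<body>", the way git names its objects."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries) -> bytes:
    """entries: (mode, name, object id hex); flat files only, sorted by name."""
    out = b""
    for mode, name, oid in sorted(entries, key=lambda e: e[1]):
        out += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(oid)
    return out


def commit_body(tree_id: str, message: str, ident: str) -> bytes:
    """The text a commit signature would cover (no parents, for brevity)."""
    return (
        f"tree {tree_id}\n"
        f"author {ident}\n"
        f"committer {ident}\n"
        f"\n{message}\n"
    ).encode()


def commit_for(script: bytes) -> bytes:
    blob_id = git_object_id("blob", script)
    tree_id = git_object_id("tree", tree_body([("100644", "foo.initd", blob_id)]))
    # Hypothetical identity and timestamp, for illustration only.
    return commit_body(tree_id, "Add init script", "A Dev <dev@example.org> 0 +0000")


if __name__ == "__main__":
    good = commit_for(b"start() { echo mundane; }\n")
    evil = commit_for(b"start() { echo malicious; }\n")
    print(good.decode())
    print(good == evil)
```

The commit text contains the tree id and nothing of the file contents. The two commits differ only because SHA1 yields different blob ids, and therefore different tree ids, for the two scripts. If an attacker had two scripts with the same SHA1, the commit text, and so any signature over it, would be byte-for-byte identical.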
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On Sun, Sep 14, 2014 at 6:56 PM, hasufell wrote: > According to Robin, it's not about rebasing, it's about signing all > commits so that messing with the blob (even if it has the same sha-1) > will cause signature verification failure. > The only thing that gets signed is the commit message, and the only thing that ties the commit message to the code is the sha1 of the top-level tree. If you can attack sha1 either at any tree level or at the blob level you can defeat the signature. That is way better than nothing though - I think it is worth pursuing until somebody comes up with a way to upgrade git to more secure hashes. Most projects don't gpg sign their trees at all, including linux. -- Rich
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
W. Trevor King: > On Sun, Sep 14, 2014 at 10:38:41PM +, hasufell wrote: >> Yes, there is a possible attack vector mentioned in this comment >> https://bugs.gentoo.org/show_bug.cgi?id=502060#c16 > > From that comment, the point 1.2 is highly unlikely [1]: > > 1. Attacker constructs a init.d script, regular part at the start, > malicious part at the end > 1.1. This would be fairly simple, just construct two start() > functions, one of which is mundane, the other is malicious. > 1.2. Both variants of the script have the same SHA1... > >> So we'd basically end up using either "git cherry-pick" or "git am" >> for "pulling" user stuff, so that we also sign the blobs. > > Rebasing the original commits doesn't protect you from the birthday > attach either, because the vulnerable hash is likely going to still be > in the rebased commit's tree. All rebasing does is swap the committer > and drop the initial signature. > According to Robin, it's not about rebasing, it's about signing all commits so that messing with the blob (even if it has the same sha-1) will cause signature verification failure.
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On Sun, Sep 14, 2014 at 10:38:41PM +, hasufell wrote:
> Yes, there is a possible attack vector mentioned in this comment
> https://bugs.gentoo.org/show_bug.cgi?id=502060#c16

From that comment, the point 1.2 is highly unlikely [1]:

  1. Attacker constructs a init.d script, regular part at the start,
     malicious part at the end
  1.1. This would be fairly simple, just construct two start()
       functions, one of which is mundane, the other is malicious.
  1.2. Both variants of the script have the same SHA1...

> So we'd basically end up using either "git cherry-pick" or "git am"
> for "pulling" user stuff, so that we also sign the blobs.

Rebasing the original commits doesn't protect you from the birthday
attack either, because the vulnerable hash is likely going to still be
in the rebased commit's tree. All rebasing does is swap the committer
and drop the initial signature.

Cheers,
Trevor

[1]: http://article.gmane.org/gmane.comp.version-control.git/210622
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
W. Trevor King: > On Sun, Sep 14, 2014 at 05:40:30PM +0200, Michał Górny wrote: >> Dnia 2014-09-15, o godz. 03:15:14 Kent Fredric napisał(a): >>> Only downside there is the way github pull reqs work is if the >>> final SHA1's that hit tree don't match, the pull req doesn't >>> close. >>> >>> Solutions: >>> >>> - A) Have somebody tasked with reaping old pull reqs with >>> permissions granted. ( Uck ) >>> - B) Always use a merge of some kind to mark the pull req as dead >>> ( for instance, an "ours" merge to mark the branch as deprecated ) >>> >>> Both of those options are kinda ugly. >> >> If you merge a pull request, I suggest doing a proper 'git merge -S' >> anyway to get a developer signature on top of all the changes. > > Some previous package-tree-in-Git efforts suggested that only > Gentoo-dev signatures were acceptable, and that those signatures would > be required on every commit (not just the first-parent line) [1,2]. I > don't see the point of that, so long as Gentoo devs are signing the > first-parent line, but if folks still want Gentoo-dev signatures on > every commit the ‘git merge -S’ approach will not work for closing > PRs. > > Cheers, > Trevor > > [1]: http://article.gmane.org/gmane.linux.gentoo.devel/77572 > id:cagfcs_manfikevtj3cmcq1of-uqavebe2r1okykygwc5vom...@mail.gmail.com > [2]: https://bugs.gentoo.org/show_bug.cgi?id=502060#c0 > Yes, there is a possible attack vector mentioned in this comment https://bugs.gentoo.org/show_bug.cgi?id=502060#c16 So we'd basically end up using either "git cherry-pick" or "git am" for "pulling" user stuff, so that we also sign the blobs. Regular merges would still be possible for developer pull requests, but that's probably not the primary use case anyway.
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On Sun, Sep 14, 2014 at 05:40:30PM +0200, Michał Górny wrote:
> On 2014-09-15 at 03:15:14, Kent Fredric wrote:
> > Only downside there is the way github pull reqs work is if the
> > final SHA1's that hit tree don't match, the pull req doesn't
> > close.
> >
> > Solutions:
> >
> > - A) Have somebody tasked with reaping old pull reqs with
> >   permissions granted. ( Uck )
> > - B) Always use a merge of some kind to mark the pull req as dead
> >   ( for instance, an "ours" merge to mark the branch as deprecated )
> >
> > Both of those options are kinda ugly.
>
> If you merge a pull request, I suggest doing a proper 'git merge -S'
> anyway to get a developer signature on top of all the changes.

Some previous package-tree-in-Git efforts suggested that only
Gentoo-dev signatures were acceptable, and that those signatures would
be required on every commit (not just the first-parent line) [1,2]. I
don't see the point of that, so long as Gentoo devs are signing the
first-parent line, but if folks still want Gentoo-dev signatures on
every commit the ‘git merge -S’ approach will not work for closing
PRs.

Cheers,
Trevor

[1]: http://article.gmane.org/gmane.linux.gentoo.devel/77572
     id:cagfcs_manfikevtj3cmcq1of-uqavebe2r1okykygwc5vom...@mail.gmail.com
[2]: https://bugs.gentoo.org/show_bug.cgi?id=502060#c0
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
Michał Górny wrote:
> What I need others to do is provide the hosting for git repos.

I'm happy to set up repos on my git server with custom hooks and
accounts as needed. It's probably not what we want long-term, but it
might be useful as a proof of concept, so that infra only needs to do
setup one time.

I even have some virtual hosting working: point an A record at the right
IP and it looks like only the desired repos are hosted there. Gitweb,
git-daemon, git over http, and CAcert https with pretty URLs.

//Peter
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
I think the better option is to block rsync and force emerge-webrsync.

Sent from a phone

On 14/09/2014 14:03, Michał Górny wrote:
> The rsync tree
> --
>
> We'd also propagate things to rsync. We'd have to populate it with old
> ChangeLogs, new ChangeLog entries (autogenerated from git) and thick
> Manifests. So users won't notice much of a change.
>

If this changes every ChangeLog, the first rsync from users will
generate a lot of traffic; the rsync network needs to be prepared.
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 14/09/2014 14:03, Michał Górny wrote:
> The rsync tree
> --
>
> We'd also propagate things to rsync. We'd have to populate it with old
> ChangeLogs, new ChangeLog entries (autogenerated from git) and thick
> Manifests. So users won't notice much of a change.
>

If this changes every ChangeLog, the first rsync from users will
generate a lot of traffic; the rsync network needs to be prepared.
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
> "MG" == Michał Górny writes: MG> This means we don't have to wait till someone figures out the perfect MG> way of converting the old CVS repository. You don't need that history MG> most of the time, and you can play with CVS to get it if you really do. MG> In any case, we would likely strip the history anyway to get a small MG> repo to work with. +1 on that. The cvs repo can be converted to an historical git repo on a slower timeframe, and remain available as cvs until then. That old-vs-fresh concept worked fine for other projects (including Linux). -JimC -- James Cloos OpenPGP: 0x997A9F17ED7DAEA6
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On Sun, Sep 14, 2014 at 11:42 AM, hasufell wrote:
> Patrick Lauer:
>>> Are we going to disallow merge commits and ask devs to rebase local
>>> changes in order to keep the history "clean"?
>>
>> Is that going to be sane with our commit frequency?
>>
>
> You have to merge or rebase anyway in case of a push conflict, so the
> only difference is the method and the effect on the history.
>
> Currently... CVS allows you to run repoman on an outdated tree and push
> broken ebuilds with repoman being happy. Git will not allow this.
>

Repoman is going to be a challenge here. With cvs every package is its
own private repository with its own private history, and cvs only cares
if there is a collision within the scope of a single file.

With git your commit is against the whole tree. So, even though it is
trivial to merge, independent commits against two different packages do
collide and need to be rebased or merged.

Repoman can run against a single package fairly quickly, so assuming we
still allow that we could do a pull/rebase/repoman/push workflow even if
people are doing commits every few minutes.

On the other hand, if you're doing a package move or eclass change or
some other change that affects 300 packages, just doing the rebase might
cost you a few minutes (due to actual collisions), and running repoman
against the whole thing before doing a push isn't going to be practical.
Somebody doing a tree-wide commit would almost certainly have to run
repoman before the final rebase/merge, push that out, and then maybe do
another repoman after the fact and clean up any issues.

For all intents and purposes that is what we're doing today anyway,
since repoman+cvs doesn't offer any kind of tree-wide consistency
guarantees unless you're checking out based on a timestamp or something
like that.

--
Rich
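The pull/rebase/repoman/push loop for a single package might be driven by something like the following. It is a sketch under assumptions: `push_with_retry` is a name invented here, and the three steps are passed in as callables so the control flow can be shown without a repository. In practice they would presumably wrap `git pull --rebase`, a repoman run in the package directory, and `git push`.

```python
from typing import Callable


def push_with_retry(
    pull_rebase: Callable[[], None],
    check: Callable[[], bool],
    push: Callable[[], bool],
    max_attempts: int = 5,
) -> int:
    """Rebase, re-run the QA check, then push; retry if the push is rejected.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        pull_rebase()            # pick up whatever landed since the last try
        if not check():          # re-check against the now current tree
            raise RuntimeError("QA check failed after rebase; fix before pushing")
        if push():               # False means somebody else pushed first
            return attempt
    raise RuntimeError(f"push still rejected after {max_attempts} attempts")


if __name__ == "__main__":
    outcomes = iter([False, False, True])  # lose the race twice, then succeed
    attempts = push_with_retry(
        pull_rebase=lambda: None,
        check=lambda: True,
        push=lambda: next(outcomes),
    )
    print(attempts)
```

The check is repeated on every attempt because the rebase may have pulled in changes, for example to profiles/, that break a package which passed a moment earlier. For a tree-wide change the check step is the expensive part, which is the impracticality Rich points out.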
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 2014-09-15, at 03:15:14, Kent Fredric wrote:
> On 15 September 2014 02:40, Michał Górny wrote:
>
> > However, I'm wondering if it would be possible to restrict people from
> > accidentally committing straight into github (e.g. merging pull
> > requests there instead of to our main server).
>
> => Github is just a read only mirror, any pull reqs submitted there will be
> fielded and pushed to gentoo directly.
>
> Only downside there is the way github pull reqs work is if the final SHA1's
> that hit tree don't match, the pull req doesn't close.
>
> Solutions:
>
> - A) Have somebody tasked with reaping old pull reqs with permissions
> granted. ( Uck )
> - B) Always use a merge of some kind to mark the pull req as dead ( for
> instance, an "ours" merge to mark the branch as deprecated )
>
> Both of those options are kinda ugly.

If you merge a pull request, I suggest doing a proper 'git merge -S'
anyway to get a developer signature on top of all the changes.

--
Best regards,
Michał Górny
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
Patrick Lauer:
>> Are we going to disallow merge commits and ask devs to rebase local
>> changes in order to keep the history "clean"?
>
> Is that going to be sane with our commit frequency?
>

You have to merge or rebase anyway in case of a push conflict, so the
only difference is the method and the effect on the history.

Currently... CVS allows you to run repoman on an outdated tree and push
broken ebuilds with repoman being happy. Git will not allow this.
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On Sunday 14 September 2014 15:40:06 Davide Pesavento wrote:
> On Sun, Sep 14, 2014 at 2:03 PM, Michał Górny wrote:
> > We have main developer repo where developers work & commit and are
> > relatively happy. For every push into developer repo, automated magic
> > thingie merges stuff into user sync repo and updates the metadata cache
> > there.
>
> How long does the md5-cache regeneration process take? Are you sure it
> will be able to keep up with the rate of pushes to the repo during
> "peak hours"? If not, maybe we could use a time-based thing similar to
> the current cvs->rsync synchronization.

Best case, only one package is affected: a few seconds.
Worst case, someone touches an eclass like eutils, and it expands to
something on the order of one or two CPU-hours.

> Are we going to disallow merge commits and ask devs to rebase local
> changes in order to keep the history "clean"?

Is that going to be sane with our commit frequency?
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 15 September 2014 02:40, Michał Górny wrote:

> However, I'm wondering if it would be possible to restrict people from
> accidentally committing straight into github (e.g. merging pull
> requests there instead of to our main server).
>

Easy.

Put the Gentoo repo in its own group.
Don't give anyone any kinds of permissions on it.
Have only one approved account for the purpose of pushing commits.
Have a post-push hook that replicates to github as that approved account

=> Github is just a read only mirror, any pull reqs submitted there will be
fielded and pushed to gentoo directly.

Only downside there is the way github pull reqs work is if the final SHA1's
that hit tree don't match, the pull req doesn't close.

Solutions:

- A) Have somebody tasked with reaping old pull reqs with permissions
granted. ( Uck )
- B) Always use a merge of some kind to mark the pull req as dead ( for
instance, an "ours" merge to mark the branch as deprecated )

Both of those options are kinda ugly.

--
Kent

*KENTNL* - https://metacpan.org/author/KENTNL
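A replication hook of the kind described above might be sketched as follows. This is an assumption-laden illustration, not an existing Gentoo hook: on the server side the natural place for it would be git's `post-receive` hook, which receives one "old new refname" line per updated ref on stdin; the remote name `github` and the function names are hypothetical.

```python
import subprocess
import sys
from typing import Iterable, List

MIRRORS = ["github"]  # hypothetical remote names configured in the main repo


def mirror_commands(stdin_lines: Iterable[str], mirrors: List[str]) -> List[List[str]]:
    """Turn post-receive input lines into one `git push` per ref and mirror."""
    cmds = []
    for line in stdin_lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # ignore anything that is not "old new refname"
        _old, new, ref = parts
        # An all-zero new value means the ref was deleted on the main repo;
        # pushing ":ref" deletes it on the mirror, "ref:ref" updates it.
        refspec = ":" + ref if set(new) == {"0"} else ref + ":" + ref
        for remote in mirrors:
            cmds.append(["git", "push", remote, refspec])
    return cmds


def main() -> int:
    """Entry point for use as hooks/post-receive."""
    status = 0
    for cmd in mirror_commands(sys.stdin, MIRRORS):
        if subprocess.run(cmd).returncode != 0:
            status = 1
    return status
```

Since only the hook's account can push to the mirror, anything merged directly on github would make the next replication push fail rather than silently diverge.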
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 2014-09-14, at 15:23:24, Jauhien Piatlicki wrote:
> Another question: will it be possible to maintain a copy of tree on github to
> make contributions for users simpler (similarly to e.g. science overlay)?
> (Can it somehow be combined with proposed signing mechanism?)

Yes. I'm planning to have a mirror on github and bitbucket, and
auto-pushing to both.

However, I'm wondering if it would be possible to restrict people from
accidentally committing straight into github (e.g. merging pull
requests there instead of to our main server).

In fact, I would start my experiments straight on github if not for the
fact that they don't allow us to set our own update hooks.

--
Best regards,
Michał Górny
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 2014-09-14, at 15:40:06, Davide Pesavento wrote:
> On Sun, Sep 14, 2014 at 2:03 PM, Michał Górny wrote:
> > We have main developer repo where developers work & commit and are
> > relatively happy. For every push into developer repo, automated magic
> > thingie merges stuff into user sync repo and updates the metadata cache
> > there.
>
> How long does the md5-cache regeneration process take? Are you sure it
> will be able to keep up with the rate of pushes to the repo during
> "peak hours"? If not, maybe we could use a time-based thing similar to
> the current cvs->rsync synchronization.

This strongly depends on how much data is there to update. A few
ebuilds are quite fast, eclass change isn't ;). I was thinking of
something along the lines of, in pseudo-code speaking:

  systemctl restart cache-regen

That is, we start the regen on every update. If it finishes in time, it
commits the new metadata. If another update occurs during regen, we
just restart it to let it catch the new data.

Of course, if we can't spare the resources to do intermediate updates,
we may as well switch to cron-based update method.

> [...]
> > In any case, we would likely strip the history anyway to get a small
> > repo to work with.
> >
> > I have prepared a basic git update hook that keeps master clean
> > and attached it to the bug [1]. It enforces basic policies, prevents
> > forced updates and checks GPG signatures on left-most history line. It
> > can also be extended to do more extensive tree checks.
>
> Are we going to disallow merge commits and ask devs to rebase local
> changes in order to keep the history "clean"?

I don't think we should cripple git. Just to be clear, 'accidental'
merges won't happen because the automatic merges are unsigned and
the 'update' hook will refuse them. The developers will have to either
rebase and re-sign the commits, or use a signed merge commit, whichever
makes more sense in the particular context.

Signed merge commits will also allow merging user-submitted changes
while preserving original history.

--
Best regards,
Michał Górny
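For readers who want a feel for what such an `update` hook has to check, here is a minimal sketch. It is not the hook attached to the bug; `check_update` and `runner` are hypothetical names, and it assumes the server's GPG keyring contains only developer keys, so that `git verify-commit` succeeding means "signed by a developer".

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

# A runner takes a command and returns (exit status, captured stdout).
Runner = Callable[[Sequence[str]], Tuple[int, str]]


def run(cmd: Sequence[str]) -> Tuple[int, str]:
    """Run a command; return its exit status and standard output."""
    proc = subprocess.run(list(cmd), stdout=subprocess.PIPE, universal_newlines=True)
    return proc.returncode, proc.stdout


def check_update(refname: str, old: str, new: str, runner: Runner = run) -> List[str]:
    """Return a list of policy violations; an empty list means accept."""
    if set(old) == {"0"} or set(new) == {"0"}:
        # Ref creation and deletion are left out of this sketch.
        return [refname + ": creating or deleting refs is not handled here"]
    # Refuse forced updates: the old tip must be an ancestor of the new one.
    status, _ = runner(["git", "merge-base", "--is-ancestor", old, new])
    if status != 0:
        return [refname + ": non-fast-forward update refused"]
    # Walk only the left-most (first-parent) line of the new history.
    status, out = runner(["git", "rev-list", "--first-parent", old + ".." + new])
    if status != 0:
        return [refname + ": could not list new commits"]
    errors = []
    for sha in out.split():
        status, _ = runner(["git", "verify-commit", sha])
        if status != 0:
            errors.append(sha + ": missing or invalid GPG signature")
    return errors
```

Git calls the `update` hook once per ref with the ref name, old value and new value as arguments, and a non-zero exit status rejects that ref, so a wrapper would print the returned messages and exit 1 when the list is non-empty.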
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 2014-09-14, at 15:09:25, Jauhien Piatlicki wrote:
> 14.09.14 14:03, Michał Górny wrote:
> > Hi,
> >
> > I'm quite tired of promises and all that perfectionist non-sense which
> > locks us up with CVS for next 10 years of bikeshed. Therefore, I have
> > prepared a plan how to do git migration, and I believe it's doable in
> > less than 2 weeks (plus the testing). Of course, that assumes infra is
> > going to cooperate quickly or someone else is willing to provide the
> > infra for it.
> >
>
> as always, nice effort, but I foresee lots of bikeshedding in this thread. )

Yes. I'm planning to ignore most of bikeshed and take only serious
answers into consideration. Otherwise, we will be stuck with CVS.

> > This means we don't have to wait till someone figures out the perfect
> > way of converting the old CVS repository. You don't need that history
> > most of the time, and you can play with CVS to get it if you really do.
> > In any case, we would likely strip the history anyway to get a small
> > repo to work with.
>
> Is it so difficult to convert CVS history?

It may be difficult to convert it properly, especially considering
the splitting of ebuild+Manifest commit. Then we need to somehow check
if it was converted properly. I don't even want to waste my time on
this. IMO the history doesn't have such a great value.

> > The rsync tree
> > --
> >
> > We'd also propagate things to rsync. We'd have to populate it with old
> > ChangeLogs, new ChangeLog entries (autogenerated from git) and thick
> > Manifests. So users won't notice much of a change.
> >
>
> How will user check the ebuild integrity with thick manifests using rsync?

The same way he currently does :).

> > The remaining issue is signing of stuff. We could supposedly sign
> > Manifests but IMO it's a waste of resources considered how poor
> > the signing system is for non-git repos.
>
> Again, how will user check the integrity and authenticity if Manifests are
> unsigned?

As far as I'm concerned, user can use the user git tree to get proper
signatures or any other method that has proper signing support already.
If someone wants proper GPG support in rsync, he can work on that.

--
Best regards,
Michał Górny
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 15 September 2014 00:03, Michał Górny wrote:

> This means we don't have to wait till someone figures out the perfect
> way of converting the old CVS repository. You don't need that history
> most of the time, and you can play with CVS to get it if you really do.
>

Once somebody works this out, you can also simply make it available as a
"replacement" ref. See 'git replace'

This would mean, essentially, you could push a ref called
'refs/replace/oldcvs' of value "firstsha1 oldcvssha1" and anyone who
wanted it could manually fetch it, and any one who did fetch it would
get the full history in all of its glory, and then git would
transparently pretend that history was always there anyway.

No rebasing required, and available on a need-to-know basis :)

--
Kent

*KENTNL* - https://metacpan.org/author/KENTNL
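As a rough sketch of how this could be wired up with stock git: replace refs are stored as `refs/replace/<object name of the replaced commit>`, and `git replace --graft` creates one that gives a commit new parents. The helper names, the `origin` remote and the two commit arguments below are hypothetical; the functions only build the command lines so they can be inspected before anything is run.

```python
from typing import List


def graft_commands(first_commit: str, old_cvs_tip: str, remote: str = "origin") -> List[List[str]]:
    """Commands a maintainer would run once to publish the graft.

    first_commit must be the full object name of the first commit of the
    fresh repository; old_cvs_tip is the tip of the converted CVS history.
    """
    ref = "refs/replace/" + first_commit
    return [
        # Create a replacement for first_commit whose parent is old_cvs_tip.
        ["git", "replace", "--graft", first_commit, old_cvs_tip],
        # Publish the replace ref; the old history it points at travels along.
        ["git", "push", remote, ref + ":" + ref],
    ]


def fetch_history_command(remote: str = "origin") -> List[str]:
    """Command for anyone who opts in to the old history."""
    # Replace refs are not covered by the default fetch refspec.
    return ["git", "fetch", remote, "refs/replace/*:refs/replace/*"]
```

After the fetch, commands such as `git log` follow the replacement and show the converted history behind the first commit, while clones that never fetch the replace ref stay small.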
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On Sun, Sep 14, 2014 at 3:55 PM, hasufell wrote:
> Davide Pesavento:
>>> In any case, we would likely strip the history anyway to get a small
>>> repo to work with.
>>>
>>> I have prepared a basic git update hook that keeps master clean
>>> and attached it to the bug [1]. It enforces basic policies, prevents
>>> forced updates and checks GPG signatures on left-most history line. It
>>> can also be extended to do more extensive tree checks.
>>
>> Are we going to disallow merge commits and ask devs to rebase local
>> changes in order to keep the history "clean"?
>>
>
> I'd say it doesn't make sense to create merge commits for conflicts that
> arise by someone having pushed earlier than you.
>
> Merge commits should only be there if they give useful information.
>

I totally agree. But is there a way to automatically enforce this?

> Also... if you merge from a _user_ who is untrusted and allow a
> fast-forward merge, then the signature verification fails. That means
> for such pull requests you either have to use "git am" or "git merge
> --no-ff".
>

Right. In that case you can either sign the merge commit or amend the
user's commit and sign it yourself (re-signing could be needed anyway
if you have to rebase).

Thanks,
Davide
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
Jauhien Piatlicki:
>
> Or well, have our own pull requests review tool.
>

Also only a secondary problem. Mirroring on github/bitbucket whatever
should be fairly straightforward to allow user contributions.

In addition the usual git workflow via e-mail/ML would become more
popular (either via git style patches or plain pull request information
with branch/commit/repository).

So I'd suggest to focus on the git migration first.
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
Davide Pesavento:
>> Main developer repo
>> ---
>>
>> I was able to create a start git repository that takes around 66M
>> as a git pack (this is how much you will have to fetch to start working
>> with it). The repository is stripped clean of history and ChangeLogs,
>> and has thin Manifests only.
>>
>> This means we don't have to wait till someone figures out the perfect
>> way of converting the old CVS repository. You don't need that history
>> most of the time, and you can play with CVS to get it if you really do.
>
> +1
>

+1

>> In any case, we would likely strip the history anyway to get a small
>> repo to work with.
>>
>> I have prepared a basic git update hook that keeps master clean
>> and attached it to the bug [1]. It enforces basic policies, prevents
>> forced updates and checks GPG signatures on left-most history line. It
>> can also be extended to do more extensive tree checks.
>
> Are we going to disallow merge commits and ask devs to rebase local
> changes in order to keep the history "clean"?
>

I'd say it doesn't make sense to create merge commits for conflicts that
arise by someone having pushed earlier than you.

Merge commits should only be there if they give useful information.

Also... if you merge from a _user_ who is untrusted and allow a
fast-forward merge, then the signature verification fails. That means
for such pull requests you either have to use "git am" or "git merge
--no-ff".
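The fast-forward point can be illustrated with a toy model of the commit graph. Nothing here calls git; the `Commit` class, the helper functions and the four-commit history are made up for the example, and "signed" stands for "signed by a trusted developer key".

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Commit:
    parents: Tuple[str, ...]  # first entry is the left-most (first) parent
    signed: bool              # True if signed by a trusted developer key


def first_parent_line(commits: Dict[str, Commit], tip: str, stop: str) -> List[str]:
    """Commits from tip back to (not including) stop, following first parents."""
    line = []
    cur = tip
    while cur != stop:
        line.append(cur)
        cur = commits[cur].parents[0]
    return line


def unsigned_on_mainline(commits: Dict[str, Commit], tip: str, stop: str) -> List[str]:
    """Commits a left-most-line signature check would reject."""
    return [c for c in first_parent_line(commits, tip, stop) if not commits[c].signed]


# "base" is the signed tip of master; u1 and u2 are a user's unsigned commits.
HISTORY = {
    "base": Commit((), True),
    "u1": Commit(("base",), False),
    "u2": Commit(("u1",), False),
    # A --no-ff merge made and signed by a developer; first parent is master.
    "merge": Commit(("base", "u2"), True),
}
```

With a fast-forward, master moves from `base` to `u2`, so `unsigned_on_mainline(HISTORY, "u2", "base")` returns `["u2", "u1"]` and the push is refused. With the signed `--no-ff` merge, `unsigned_on_mainline(HISTORY, "merge", "base")` returns `[]`, because the user's commits sit on the second-parent side and only the merge commit is on the left-most line.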
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
Jauhien Piatlicki:
>
> Again, how will user check the integrity and authenticity if Manifests are
> unsigned?
>

While this is an issue to be solved, it shouldn't be a blocker for the
git migration. There is no regression if this isn't solved.

There is no sane automated method for verifying signed Manifests yet
(that should be on PM level) and signing them isn't even enforced
throughout the tree. Moreover I highly doubt that there is any user who
runs around ebuild directories and checks Manifest signatures by hand.
People who really care use emerge-webrsync.

If we use the proposed solution, then there is an additional method via
the User syncing repo, so it's a win. We can put more effort into
solving this for rsync mirrors later, but I'd rather focus on the git
migration.
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On Sun, Sep 14, 2014 at 2:03 PM, Michał Górny wrote:
> We have main developer repo where developers work & commit and are
> relatively happy. For every push into developer repo, automated magic
> thingie merges stuff into user sync repo and updates the metadata cache
> there.

How long does the md5-cache regeneration process take? Are you sure it
will be able to keep up with the rate of pushes to the repo during
"peak hours"? If not, maybe we could use a time-based thing similar to
the current cvs->rsync synchronization.

[...]
> Main developer repo
> ---
>
> I was able to create a start git repository that takes around 66M
> as a git pack (this is how much you will have to fetch to start working
> with it). The repository is stripped clean of history and ChangeLogs,
> and has thin Manifests only.
>
> This means we don't have to wait till someone figures out the perfect
> way of converting the old CVS repository. You don't need that history
> most of the time, and you can play with CVS to get it if you really do.

+1

> In any case, we would likely strip the history anyway to get a small
> repo to work with.
>
> I have prepared a basic git update hook that keeps master clean
> and attached it to the bug [1]. It enforces basic policies, prevents
> forced updates and checks GPG signatures on left-most history line. It
> can also be extended to do more extensive tree checks.

Are we going to disallow merge commits and ask devs to rebase local
changes in order to keep the history "clean"?

Thanks a lot,
Davide
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
14.09.14 15:25, "C. Bergström" wrote:
> On 09/14/14 08:24 PM, Jauhien Piatlicki wrote:
>> 14.09.14 15:23, Jauhien Piatlicki wrote:
>>> Another question: will it be possible to maintain a copy of tree on github
>>> to make contributions for users simpler (similarly to e.g. science
>>> overlay)? (Can it somehow be combined with proposed signing mechanism?)
>>>
>>>
>> Or well, have our own pull requests review tool.
> NIH? What would be the benefit of that.. before going down this path.. I
> think there's some good tools around which may at least serve as a base to
> (fork) from before starting a ground up project.
>
> Sorry to jump in the middle of the conversation, but I know 1st hand how much
> is involved here.
>

I was not precise. By our own I mean hosted by us, not by github. )
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
On 09/14/14 08:24 PM, Jauhien Piatlicki wrote:
> 14.09.14 15:23, Jauhien Piatlicki wrote:
>> Another question: will it be possible to maintain a copy of tree on github
>> to make contributions for users simpler (similarly to e.g. science
>> overlay)? (Can it somehow be combined with proposed signing mechanism?)
>
> Or well, have our own pull requests review tool.

NIH? What would be the benefit of that.. before going down this path..
I think there's some good tools around which may at least serve as a
base to (fork) from before starting a ground up project.

Sorry to jump in the middle of the conversation, but I know 1st hand
how much is involved here.
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
14.09.14 15:23, Jauhien Piatlicki wrote:
> Another question: will it be possible to maintain a copy of tree on github to
> make contributions for users simpler (similarly to e.g. science overlay)?
> (Can it somehow be combined with proposed signing mechanism?)
>
>

Or well, have our own pull requests review tool.
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
Another question: will it be possible to maintain a copy of tree on
github to make contributions for users simpler (similarly to e.g.
science overlay)? (Can it somehow be combined with proposed signing
mechanism?)
Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)
Hi,

14.09.14 14:03, Michał Górny wrote:
> Hi,
>
> I'm quite tired of promises and all that perfectionist non-sense which
> locks us up with CVS for next 10 years of bikeshed. Therefore, I have
> prepared a plan how to do git migration, and I believe it's doable in
> less than 2 weeks (plus the testing). Of course, that assumes infra is
> going to cooperate quickly or someone else is willing to provide the
> infra for it.
>

as always, nice effort, but I foresee lots of bikeshedding in this thread. )

> This means we don't have to wait till someone figures out the perfect
> way of converting the old CVS repository. You don't need that history
> most of the time, and you can play with CVS to get it if you really do.
> In any case, we would likely strip the history anyway to get a small
> repo to work with.
>

Is it so difficult to convert CVS history?

>
> The rsync tree
> --
>
> We'd also propagate things to rsync. We'd have to populate it with old
> ChangeLogs, new ChangeLog entries (autogenerated from git) and thick
> Manifests. So users won't notice much of a change.
>

How will user check the ebuild integrity with thick manifests using rsync?

> The remaining issue is signing of stuff. We could supposedly sign
> Manifests but IMO it's a waste of resources considered how poor
> the signing system is for non-git repos.
>

Again, how will user check the integrity and authenticity if Manifests
are unsigned?

Also, it would be a good idea to add automatic signature checking to
portage for overlays that use signing (or is it already done?).

--
Jauhien