Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-18 Thread Kent Fredric
On 19 September 2014 07:33, Diamond  wrote:

> Lets assume, that I don't want to scrap old ebuild yet. There's no git
> cp command. git mv is just git rm + git add. That's what does it look
> like (usual revbump with git add in reality):
>
> https://github.com/cerebrum/dr/commit/311df9b04d876f5847416fe5ba699edfab50adb6
> I think that git (at least with default config is a pain in the ass for
> packages at all and we should probably think about better platform for
> portage).
>

Not necessarily. Git can track copies too, with -C.

Also, don't treat GitHub's presentation of things as gospel. It's
better than nothing, but it falls short of git itself.

git log -p --find-copies-harder games-strategy/openra-bin/*.ebuild

You can quite easily convince git to pretend vaguely similar files in the
log are sources of each other with the right options.

Throwing "-M1 -C1" into that command will let git find more.

Just because "I think git can't" doesn't mean "git can't", especially if
you've not asked "how can I do  " :)

For instance:

 > git log --stat -C1 -M1 --find-copies-harder

In your repo finds you these interesting "copies" if you look far enough

> .../files/opentracker.init.d => net-misc/twonky/files/twonky.initd  |  40 +--
> {www-apps/rutorrent => net-misc/minidlna}/metadata.xml              |  16 ++-
> {games-strategy/openra-bin => dev-util/mono-debugger}/metadata.xml  |   8 +--
> media-video/{webcamstudio => webcamstudio-module}/ChangeLog         |  42 +
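
An editorial sketch of the difference between the options mentioned above,
assuming a throwaway repository and a made-up package "foo" (neither appears
in the thread). It does a revbump that keeps the old ebuild, then asks git
about the same commit three ways. Plain -C only considers files modified in
the same commit as copy sources, so it should still report an added file;
adding --find-copies-harder also inspects unmodified files.

```shell
#!/bin/sh
# Throwaway demo; the package name "foo" and its contents are made up.
set -e
command -v git >/dev/null 2>&1 || { echo "git not found, skipping"; exit 0; }
cd "$(mktemp -d)"
git init -q .
# Identity is passed inline so the demo does not depend on global config.
g() { git -c user.name=Demo -c user.email=demo@example.invalid "$@"; }

cat > foo-1.ebuild <<'EOF'
EAPI=5
DESCRIPTION="Demo package"
HOMEPAGE="http://example.invalid/"
SRC_URI="http://example.invalid/foo.tar.gz"
LICENSE="MIT"
SLOT="0"
KEYWORDS="~amd64"
EOF
g add foo-1.ebuild
g commit -q -m "foo: initial import"

# Revbump that keeps the old ebuild: copy, tweak one line, add.
sed 's/~amd64/~amd64 ~x86/' foo-1.ebuild > foo-2.ebuild
g add foo-2.ebuild
g commit -q -m "foo: revbump"

# Ask about the revbump commit with increasing levels of detection.
default=$(git diff --name-status HEAD~1 HEAD)
copies=$(git diff -C --name-status HEAD~1 HEAD)
harder=$(git diff -C --find-copies-harder --name-status HEAD~1 HEAD)
printf 'default: %s\n-C:      %s\nharder:  %s\n' "$default" "$copies" "$harder"
```

The similarity score in the last line is left out of the description on
purpose, since it depends on the file contents. Per the git-diff
documentation, a bare number after -M or -C is read as a decimal fraction,
so -M1 asks for 10% similarity.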


-- 
Kent

*KENTNL* - https://metacpan.org/author/KENTNL


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-18 Thread Peter Stuge
Diamond wrote:
> I stumbled over this problem when started to use git for packages.

Use git show -M to unstumble yourself.


//Peter



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-18 Thread Diamond
On Thu, 18 Sep 2014 16:00:59 -0400
Rich Freeman  wrote:

> What would you propose?  The problem you raise is just as much an
> issue with cvs.  I don't get a continuous history across revbumps in
> cvs today, so I don't really see a problem with moving to git.
I don't know what to propose. I stumbled over this problem when I started
to use git for packages. At least there are other SCM systems too; I
haven't investigated them yet for this issue. Facebook even uses its
own.



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-18 Thread Rich Freeman
On Thu, Sep 18, 2014 at 3:33 PM, Diamond  wrote:
> Lets assume, that I don't want to scrap old ebuild yet. There's no git
> cp command. git mv is just git rm + git add. That's what does it look
> like (usual revbump with git add in reality):
> https://github.com/cerebrum/dr/commit/311df9b04d876f5847416fe5ba699edfab50adb6
> I think that git (at least with default config is a pain in the ass for
> packages at all and we should probably think about better platform for
> portage).
>

What would you propose?  The problem you raise is just as much an
issue with cvs.  I don't get a continuous history across revbumps in
cvs today, so I don't really see a problem with moving to git.

--
Rich



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-18 Thread Diamond
On Thu, 18 Sep 2014 17:04:55 +1200
Kent Fredric  wrote:

> What's more, you can in fact do:
> 
> git mv foo-1.ebuild foo-2.ebuild
> git commit
> 
> and you can still easily tell git to show that as a difference in a
> log.
> 
> Example script to emulate this and example output:
> https://gist.github.com/kentfredric/10e93e9aac875e9edb93
> 
> ( In fact, you don't even have to use 'git mv', as long as you change
> the tree state completely, git is smart enough to track most changes )
> 
Let's assume that I don't want to scrap the old ebuild yet. There's no git
cp command, and git mv is just git rm + git add. This is what it looks
like (a usual revbump, done with git add in reality):
https://github.com/cerebrum/dr/commit/311df9b04d876f5847416fe5ba699edfab50adb6
I think that git (at least with the default config) is a pain in the ass
for packages in general, and we should probably think about a better
platform for portage.



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-17 Thread Kent Fredric
On 18 September 2014 13:01, Rich Freeman  wrote:

> With git a revbump is:
> cp foo-1.ebuild foo-2.ebuild
> git add foo-2.ebuild
> git commit
>
> (I left out changelogs, repoman, etc, since there is no change with
> any of these, and I left out syncing the git repo.)
>
> There really is nothing new here.
>
> >  Especially
> > if you need to see the diff between packagename-0.1-r1 and
> > packagename-0.1-r2 ebuilds? Git doesn't do this by default and it
> > will might be a nightmare to compare such revbumps by hand.
> >
>
> cvs doesn't do anything to compare the contents of different files.
> So, there really is no loss here.
>

What's more, you can in fact do:

git mv foo-1.ebuild foo-2.ebuild
git commit

and you can still easily tell git to show that as a difference in a log.

Example script to emulate this and example output:
https://gist.github.com/kentfredric/10e93e9aac875e9edb93

( In fact, you don't even have to use 'git mv', as long as you change the
tree state completely, git is smart enough to track most changes )
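
An editorial sketch of the rename case described above, assuming a throwaway
repository and a made-up package "foo". It renames the ebuild, changes one
line, and then compares what git reports with rename detection switched off
and on. The last command follows the file's history across the rename.

```shell
#!/bin/sh
# Throwaway demo; the package name "foo" and its contents are made up.
set -e
command -v git >/dev/null 2>&1 || { echo "git not found, skipping"; exit 0; }
cd "$(mktemp -d)"
git init -q .
g() { git -c user.name=Demo -c user.email=demo@example.invalid "$@"; }

printf 'EAPI=5\nDESCRIPTION="Demo package"\nHOMEPAGE="http://example.invalid/"\nLICENSE="MIT"\nSLOT="0"\nKEYWORDS="~amd64"\n' > foo-1.ebuild
g add foo-1.ebuild
g commit -q -m "foo: initial import"

# Revbump that drops the old version: rename, tweak one line, commit.
g mv foo-1.ebuild foo-2.ebuild
sed 's/~amd64/~amd64 ~x86/' foo-2.ebuild > foo-2.ebuild.tmp
mv foo-2.ebuild.tmp foo-2.ebuild
g commit -q -a -m "foo: revbump"

no_detect=$(git diff --no-renames --name-status HEAD~1 HEAD)  # delete + add
detect=$(git diff -M --name-status HEAD~1 HEAD)               # one rename
history=$(git log --follow --format=%s -- foo-2.ebuild)       # both commits
printf '%s\n---\n%s\n---\n%s\n' "$no_detect" "$detect" "$history"
```

The rename is not stored anywhere; it is inferred at display time from the
two tree states, which is why "git mv" itself is optional.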

-- 
Kent

*KENTNL* - https://metacpan.org/author/KENTNL


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-17 Thread Rich Freeman
On Wed, Sep 17, 2014 at 4:02 PM, Diamond  wrote:
> On Mon, 15 Sep 2014 14:51:56 -0400
> Rich Freeman  wrote:
>>
>> In general you want each commit to represent a single "change."  That
>> might be a revbump in a single package, or it might be a package move
>> that involves touching 300 packages in a single commit.
>
> Is it right that you are going to move portage packages to
> git/github/..?

The intent is to move to git.  Git is an scm.  Github is a service
that hosts git repositories with some other value-adds.  There is
interest in mirroring gentoo-x86 on github to help entice more to
contribute (assuming they like working with github), but the nature of
git makes it very easy to mirror a repository on as many sites as you
care to.  That is one of the reasons that it is so popular with FOSS.

> How are you going to make "revbump" with git?

With cvs a revbump is:
cp foo-1.ebuild foo-2.ebuild
cvs add foo-2.ebuild
cvs commit

With git a revbump is:
cp foo-1.ebuild foo-2.ebuild
git add foo-2.ebuild
git commit

(I left out changelogs, repoman, etc, since there is no change with
any of these, and I left out syncing the git repo.)

There really is nothing new here.

>  Especially
> if you need to see the diff between packagename-0.1-r1 and
> packagename-0.1-r2 ebuilds? Git doesn't do this by default and it
> will might be a nightmare to compare such revbumps by hand.
>

cvs doesn't do anything to compare the contents of different files.
So, there really is no loss here.

--
Rich



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-17 Thread Kent Fredric
On 18 September 2014 08:02, Diamond  wrote:

> Git doesn't do this by default and it
> will might be a nightmare to compare such revbumps by hand.
>

git diff -M1 -C1

^ is usually sufficient to show new files as differences between similar
files that were already there, including revbumps.
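
As an editorial aside, two revisions that both exist in the tree can also be
compared directly, without relying on copy detection at all. The sketch
below assumes a throwaway repository and made-up file names; it uses the
"<rev>:<path>" syntax to hand git diff two blobs.

```shell
#!/bin/sh
# Throwaway demo; package name and contents are made up.
set -e
command -v git >/dev/null 2>&1 || { echo "git not found, skipping"; exit 0; }
cd "$(mktemp -d)"
git init -q .
g() { git -c user.name=Demo -c user.email=demo@example.invalid "$@"; }

printf 'EAPI=5\nDESCRIPTION="Demo package"\nSLOT="0"\nKEYWORDS="~amd64"\n' > foo-0.1-r1.ebuild
sed 's/~amd64/~amd64 ~x86/' foo-0.1-r1.ebuild > foo-0.1-r2.ebuild
g add foo-0.1-r1.ebuild foo-0.1-r2.ebuild
g commit -q -m "foo: add -r1 and -r2"

# Diff the two committed revisions against each other as plain blobs.
patch=$(git diff HEAD:foo-0.1-r1.ebuild HEAD:foo-0.1-r2.ebuild)
printf '%s\n' "$patch"
```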



-- 
Kent

*KENTNL* - https://metacpan.org/author/KENTNL


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-17 Thread Diamond
On Mon, 15 Sep 2014 14:51:56 -0400
Rich Freeman  wrote:

> 
> In general you want each commit to represent a single "change."  That
> might be a revbump in a single package, or it might be a package move
> that involves touching 300 packages in a single commit.

Is it right that you are going to move portage packages to
git/github/..? How are you going to do a "revbump" with git, especially
if you need to see the diff between the packagename-0.1-r1 and
packagename-0.1-r2 ebuilds? Git doesn't do this by default, and it
might be a nightmare to compare such revbumps by hand.



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-16 Thread Daniel Campbell
On 09/16/2014 01:56 PM, hasufell wrote:
> Luca Barbato:
>> On 15/09/14 01:21, Patrick Lauer wrote:
>>> On Sunday 14 September 2014 15:42:15 hasufell wrote:
>>>> Patrick Lauer:
>>>>>> Are we going to disallow merge commits and ask devs to rebase local
>>>>>> changes in order to keep the history "clean"?
>>>>>
>>>>> Is that going to be sane with our commit frequency?
>>>>
>>>> You have to merge or rebase anyway in case of a push conflict, so the
>>>> only difference is the method and the effect on the history.
>>>>
>>>> Currently... CVS allows you to run repoman on an outdated tree and push
>>>> broken ebuilds with repoman being happy. Git will not allow this.
>>>
>>> iow, git doesn't allow people to work on more than one item at a time?
>>
>> It does.
>>
> 
> I think we really have to write up a step-by-step guide (not just
> workflow policies) for people who have never seriously worked with git.
> 
> On the other hand... there are thousands of tutorials on the net already.
> 
> For the workflow model, I already have created a draft which is in no
> way finished or even correct and there are still some controversially
> discussed issues.
> https://wiki.gentoo.org/wiki/Gentoo_git_workflow
> 
As a prospective Gentoo developer, having a guide around meant
specifically for Gentoo's practices would be incredibly helpful. I use
git in my own hobby development and learned from Pro Git, et al, but
it'd still be really nice to have, and give developers a place to
point to if a new developer is having troubles.

Just my 2¢.





Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-16 Thread Michał Górny
On 2014-09-16, at 19:05:18,
Luca Barbato wrote:

> On 14/09/14 16:46, Michał Górny wrote:
> > Of course, if we can't spare the resources to do intermediate updates,
> > we may as well switch to cron-based update method.
> 
> The mirror have a sync time, so basically regenerating the cache and
> pushing the tree with further toward the user can happen the same way
> w/out impacting anybody.
> 
> We could just snapshot the tree when the regen starts and push that
> commit to the user git and rsync.
> 
> People is still supposed to play nice and sync not every minute.

People don't have to sync. They will pull, and pulling often doesn't
really hurt servers like rsync does.

Of course, I'm considering the users switching to git there. However,
I don't think limitations of rsync should impact them.

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-16 Thread hasufell
Luca Barbato:
> On 15/09/14 01:21, Patrick Lauer wrote:
>> On Sunday 14 September 2014 15:42:15 hasufell wrote:
>>> Patrick Lauer:
>>>>> Are we going to disallow merge commits and ask devs to rebase local
>>>>> changes in order to keep the history "clean"?
>>>>
>>>> Is that going to be sane with our commit frequency?
>>>
>>> You have to merge or rebase anyway in case of a push conflict, so the
>>> only difference is the method and the effect on the history.
>>>
>>> Currently... CVS allows you to run repoman on an outdated tree and push
>>> broken ebuilds with repoman being happy. Git will not allow this.
>>
>> iow, git doesn't allow people to work on more than one item at a time?
> 
> It does.
> 

I think we really have to write up a step-by-step guide (not just
workflow policies) for people who have never seriously worked with git.

On the other hand... there are thousands of tutorials on the net already.

For the workflow model, I already have created a draft which is in no
way finished or even correct and there are still some controversially
discussed issues.
https://wiki.gentoo.org/wiki/Gentoo_git_workflow



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-16 Thread Rich Freeman
On Tue, Sep 16, 2014 at 1:07 PM, Luca Barbato  wrote:
> On 14/09/14 17:30, Patrick Lauer wrote:
>>> Are we going to disallow merge commits and ask devs to rebase local
>>> changes in order to keep the history "clean"?
>>
>> Is that going to be sane with our commit frequency?
>>
>
> Which is our commit frequency? Worst case we can aggregate changes and
> push them in bulk.
>

I don't think commit frequency will be a problem other than maybe
causing repoman issues if you're doing tree-wide changes (since
repoman takes a while to run).

See:
https://github.com/rich0/gentoo-gitmig-2014-09-15/graphs/commit-activity

Our tree gets 50-150 commits/day on average it seems.  I have no idea
how far back the punchcard view goes, but that should give a relative
sense of distribution.

--
Rich



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-16 Thread Luca Barbato
On 15/09/14 01:21, Patrick Lauer wrote:
> On Sunday 14 September 2014 15:42:15 hasufell wrote:
>> Patrick Lauer:
>>>> Are we going to disallow merge commits and ask devs to rebase local
>>>> changes in order to keep the history "clean"?
>>>
>>> Is that going to be sane with our commit frequency?
>>
>> You have to merge or rebase anyway in case of a push conflict, so the
>> only difference is the method and the effect on the history.
>>
>> Currently... CVS allows you to run repoman on an outdated tree and push
>> broken ebuilds with repoman being happy. Git will not allow this.
> 
> iow, git doesn't allow people to work on more than one item at a time?

It does.

> That'd mean I need half a dozen checkouts just to emulate cvs, which somehow 
> doesn't make much sense to me ...

Your statement sounds strange to me.

Commands you need to know:

git rebase -i
git add (-p)
git commit (-p)
git branch/checkout

Examples:

edit cat/pkg/foo.ebuild
edit cat2/pkg/bar.ebuild
edit profile
git add -p     # to select by line what you want in the commit
git commit     # and you write down the commit message
git commit -p  # to do both at the same time
git commit -p  # again, to lump other changes line by line

OR

edit cat/pkg/foo.ebuild
git commit -a  # everything (that's tracked) you edited gets in a commit
edit cat/pkg/bar.ebuild
git commit -a  # everything (that's tracked) you edited gets in again
...
git rebase -i  # sort out which commits you want to merge, edit, drop, etc.
git push




Git lets you do whatever you do in cvs, but in a _much_ saner and faster way.
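
An editorial sketch of the first workflow above, assuming a throwaway
repository with made-up paths. Because "git add -p" is interactive, the
sketch stages whole files instead; the point is the same: two unrelated
edits sitting in one working tree end up as two separate commits.

```shell
#!/bin/sh
# Throwaway demo; the cat/pkg paths are placeholders, as in the mail above.
set -e
command -v git >/dev/null 2>&1 || { echo "git not found, skipping"; exit 0; }
cd "$(mktemp -d)"
git init -q .
g() { git -c user.name=Demo -c user.email=demo@example.invalid "$@"; }

mkdir -p cat/pkg cat2/pkg
echo 'EAPI=5' > cat/pkg/foo.ebuild
echo 'EAPI=5' > cat2/pkg/bar.ebuild
g add .
g commit -q -m "initial import"

# Work on two things at once...
echo '# fix for foo' >> cat/pkg/foo.ebuild
echo '# fix for bar' >> cat2/pkg/bar.ebuild

# ...then commit them one at a time by staging only what belongs together.
g add cat/pkg/foo.ebuild
g commit -q -m "cat/pkg: fix foo"
g add cat2/pkg/bar.ebuild
g commit -q -m "cat2/pkg: fix bar"

first=$(git diff --name-only HEAD~2 HEAD~1)   # files in the foo commit
second=$(git diff --name-only HEAD~1 HEAD)    # files in the bar commit
leftover=$(git status --porcelain)            # should be empty
printf 'first: %s\nsecond: %s\n' "$first" "$second"
```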





Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-16 Thread Luca Barbato
On 14/09/14 17:30, Patrick Lauer wrote:
>> Are we going to disallow merge commits and ask devs to rebase local
>> changes in order to keep the history "clean"?
> 
> Is that going to be sane with our commit frequency?
> 

Which is our commit frequency? Worst case we can aggregate changes and
push them in bulk.

lu



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-16 Thread Luca Barbato
On 14/09/14 16:46, Michał Górny wrote:
> Of course, if we can't spare the resources to do intermediate updates,
> we may as well switch to cron-based update method.

The mirrors have a sync time, so basically regenerating the cache and
pushing the tree further toward the users can happen the same way
without impacting anybody.

We could just snapshot the tree when the regen starts and push that
commit to the user git and rsync.

People are still supposed to play nice and not sync every minute.

lu





Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-16 Thread Luca Barbato
On 14/09/14 17:15, Kent Fredric wrote:
> On 15 September 2014 02:40, Michał Górny  wrote:
> 
>> However, I'm wondering if it would be possible to restrict people from
>> accidentally committing straight into github (e.g. merging pull
>> requests there instead of to our main server).
>>
> 
> 
> Easy.
> 
> Put the Gentoo repo in its own group.
> Don't give anyone any kinds of permissions on it.
> Have only one approved account for the purpose of pushing commits.
> Have a post-push hook that replicates to github as that approved account
> 
> => Github is just a read only mirror, any pull reqs submitted there will be
> fielded and pushed to gentoo directly.
> 
> Only downside there is the way github pull reqs work is if the final SHA1's
> that hit tree don't match, the pull req doesn't close.
> 
> Solutions:
> 
> - A) Have somebody tasked with reaping old pull reqs with permissions
> granted. ( Uck )
> - B) Always use a merge of some kind to mark the pull req as dead ( for
> instance, an "ours" merge to mark the branch as deprecated )

C) Ask Github nicely for an application key and have a pull-request
bridge to avoid the problem completely.

I'd complete the migration first and discuss this kind of detail later.

lu
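
An editorial sketch of the read-only mirror idea quoted above, assuming two
local bare repositories that stand in for the canonical server and for the
GitHub mirror (the directory names are made up, and a real setup would push
to a remote URL under a dedicated account). A post-receive hook on the
canonical repository replicates every accepted push to the mirror.

```shell
#!/bin/sh
# Throwaway demo; gentoo.git and github-mirror.git are local stand-ins.
set -e
command -v git >/dev/null 2>&1 || { echo "git not found, skipping"; exit 0; }
top=$(mktemp -d)
cd "$top"
g() { git -c user.name=Demo -c user.email=demo@example.invalid "$@"; }

git init -q --bare gentoo.git          # canonical repository
git init -q --bare github-mirror.git   # read-only mirror

# The hook runs on the canonical side after each accepted push.
mkdir -p gentoo.git/hooks
cat > gentoo.git/hooks/post-receive <<EOF
#!/bin/sh
exec git push -q --mirror "$top/github-mirror.git"
EOF
chmod +x gentoo.git/hooks/post-receive

# A developer pushes to the canonical repository only.
git clone -q gentoo.git dev 2>/dev/null
cd dev
echo base > README
g add README
g commit -q -m "base"
g push -q origin HEAD:refs/heads/master

pushed=$(git rev-parse HEAD)
mirrored=$(git --git-dir="$top/github-mirror.git" rev-parse refs/heads/master)
printf 'pushed:   %s\nmirrored: %s\n' "$pushed" "$mirrored"
```

Nobody pushes to the mirror directly in this model, so its contents are
always whatever the canonical repository last accepted.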




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-15 Thread Rich Freeman
On Mon, Sep 15, 2014 at 1:42 PM, Ian Stakenvicius  wrote:
>
> On 14/09/14 09:06 PM, Peter Stuge wrote:
>> Rich Freeman wrote:
>>> If you just want to do 15 standalone commits before you push you
>>> can do those sequentially easily enough.  A branch would be more
>>> appropriate for some kind of mini-project.
>> ..
>>> That is the beauty of git - branches are really cheap. So are
>>> repositories
>>
>> And commits.
>>
>> Not only are branches cheap, they are also very easy to create,
>> and maybe most importantly they can be created at any time, even
>> after the commits.
>>
>> It's quick and painless to create a bunch of commits which aren't
>> really closely related in sequence, and only later clean the whole
>> series of commits up while creating different branches for commits
>> which should actually be grouped rather than mixed all together.
>>
>
> Ahh, so the secret here would then be just to git add files on a
> related per-package basis, leaving the other files out of the commit.
>  that makes sense.  There would still be the issue of untracked files
> in the repo and the ability to switch back to the 'master' branch to
> cherry-pick a commit for pushing, though...  I guess we'd just have to
> deal with the delay there and try and push all of the changes at once?

If you really like to keep a lot of things going at once, I'd strongly
recommend doing it in another branch (or more than one), so that you
can cherry-pick/merge/etc stuff into master and not have a mess of an
index while doing it.  You need not publish that branch, and if you do
publish it you need not do it on Gentoo's infra (a git repository can
be synced with multiple other repositories, and the various heads
don't need to stay in sync).
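
An editorial sketch of that recommendation, assuming a throwaway repository
and made-up branch and file names. Work in progress lives on a private
topic branch, and only the finished commit is cherry-picked onto the main
branch.

```shell
#!/bin/sh
# Throwaway demo; branch and file names are made up.
set -e
command -v git >/dev/null 2>&1 || { echo "git not found, skipping"; exit 0; }
cd "$(mktemp -d)"
git init -q .
g() { git -c user.name=Demo -c user.email=demo@example.invalid "$@"; }

echo base > README
g add README
g commit -q -m "base"
main=$(git symbolic-ref --short HEAD)   # whatever the default branch is called

# Private topic branch: one finished commit, one that is not ready.
g checkout -q -b topic-foo
echo 'KEYWORDS="~amd64"' > foo-2.ebuild
g add foo-2.ebuild
g commit -q -m "foo: bump"
pick=$(git rev-parse HEAD)
echo wip > notes.txt
g add notes.txt
g commit -q -m "WIP: notes"

# Back on the main branch, take only the finished commit.
g checkout -q "$main"
g cherry-pick "$pick" >/dev/null

subjects=$(git log --format=%s "$main")
printf '%s\n' "$subjects"
```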

In general you want each commit to represent a single "change."  That
might be a revbump in a single package, or it might be a package move
that involves touching 300 packages in a single commit.  The idea
though is that every commit should stand on its own, so that they can
be reverted on their own, etc.  That's just good practice, and should
be what we're trying to do with cvs with the huge limitation that
multi-file changes in cvs aren't atomic.

There are a lot of guides/tools/etc out there for using git.  A
popular one is the git-flow workflow.  I'm not suggesting that it is
really necessary for one person, but anybody not familiar with git
should probably read up on it just so that you have some sense of how
it can be used.

--
Rich



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-15 Thread Ian Stakenvicius

On 14/09/14 09:06 PM, Peter Stuge wrote:
> Rich Freeman wrote:
>> If you just want to do 15 standalone commits before you push you
>> can do those sequentially easily enough.  A branch would be more 
>> appropriate for some kind of mini-project.
> ..
>> That is the beauty of git - branches are really cheap. So are
>> repositories
> 
> And commits.
> 
> Not only are branches cheap, they are also very easy to create,
> and maybe most importantly they can be created at any time, even
> after the commits.
> 
> It's quick and painless to create a bunch of commits which aren't 
> really closely related in sequence, and only later clean the whole 
> series of commits up while creating different branches for commits 
> which should actually be grouped rather than mixed all together.
> 

Ahh, so the secret here would then be just to git add files on a
related per-package basis, leaving the other files out of the commit.
That makes sense.  There would still be the issue of untracked files
in the repo and the ability to switch back to the 'master' branch to
cherry-pick a commit for pushing, though...  I guess we'd just have to
deal with the delay there and try to push all of the changes at once?



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-15 Thread Ian Stakenvicius

On 14/09/14 08:57 PM, Rich Freeman wrote:
> On Sun, Sep 14, 2014 at 7:21 PM, Patrick Lauer 
> wrote:
>> 
>> iow, git doesn't allow people to work on more than one item at a
>> time?
>> 
>> That'd mean I need half a dozen checkouts just to emulate cvs,
>> which somehow doesn't make much sense to me ...
>> 
> 
> Well, you can work on as many things as you like in git, but it 
> doesn't keep track of what changes have to do with what things if
> you don't commit in-between.  So, you'll have a big list of changes
> in your index, and you'll have to pick-and-choose what you commit
> at any one time.
> 
> If you really want to work on many things "at once" the better way
> to do it is to do a temporary branch per-thing, and when you
> switch between things you switch between branches, and then move
> into master things as they are done.
> 
> I assume you mean working on things that will take a while to 
> complete.  If you just want to do 15 standalone commits before you 
> push you can do those sequentially easily enough.  A branch would
> be more appropriate for some kind of mini-project.
> 
> You can work on branches without pushing those to the master repo. 
> Or, if appropriate a project team might choose to push their branch
> to master, or to some other repo (like an overlay).  This would
> allow collaborative work on a large commit, with a quick final
> merge into the main tree.  That is the beauty of git - branches are
> really cheap. So are repositories - if somebody wants to do all
> their work in github and then push to the main tree, they can do
> that.
> 

Actually I see what Patrick's getting at -- I have similar issues when
working with mozilla stuff.  If you're using local (temporary)
branches, the whole tree is in the state of that current checkout,
right?  I.e., while I have my firefox-update branch active and working
for an 'ebuild ... install', I can't be doing work in my
'freewrl-update' branch unless I make multiple separate repo trees,
one for each independent workflow I want to do concurrently.

Ideal here would be the ability to have separate active checkouts of a
repo on a per-shell basis, i.e. each shell invocation would be able to
work concurrently on a distinct branch; has anyone done that before?
Does git do it already?
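
Editorial note: later git releases (2.5 and newer, which postdate this
thread) ship "git worktree", which provides several working directories
backed by one repository. A minimal sketch, assuming such a git version and
a throwaway repository; the branch names are the ones from the mail above,
and the directory names are made up.

```shell
#!/bin/sh
# Throwaway demo; requires git 2.5 or newer for "git worktree".
set -e
command -v git >/dev/null 2>&1 || { echo "git not found, skipping"; exit 0; }
top=$(mktemp -d)
cd "$top"
git init -q tree
cd tree
g() { git -c user.name=Demo -c user.email=demo@example.invalid "$@"; }

echo base > README
g add README
g commit -q -m "base"

# One extra working directory per task, each on its own new branch.
g worktree add -b firefox-update ../firefox-wt >/dev/null 2>&1
g worktree add -b freewrl-update ../freewrl-wt >/dev/null 2>&1

# Commit in one working directory; the other is untouched.
( cd ../firefox-wt && echo wip > ff.txt && g add ff.txt && g commit -q -m "firefox: wip" )

a=$(git -C ../firefox-wt symbolic-ref --short HEAD)
b=$(git -C ../freewrl-wt symbolic-ref --short HEAD)
printf 'firefox-wt is on %s\nfreewrl-wt is on %s\n' "$a" "$b"
```

Each shell can simply cd into its own directory and stay on its own branch.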




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-15 Thread Ian Stakenvicius

On 14/09/14 07:21 PM, Patrick Lauer wrote:
> On Sunday 14 September 2014 15:42:15 hasufell wrote:
>> Patrick Lauer:
>>>> Are we going to disallow merge commits and ask devs to rebase
>>>> local changes in order to keep the history "clean"?
>>> 
>>> Is that going to be sane with our commit frequency?
>> 
>> You have to merge or rebase anyway in case of a push conflict, so
>> the only difference is the method and the effect on the history.
>> 
>> Currently... CVS allows you to run repoman on an outdated tree
>> and push broken ebuilds with repoman being happy. Git will not
>> allow this.
> 
> iow, git doesn't allow people to work on more than one item at a
> time?
> 
> That'd mean I need half a dozen checkouts just to emulate cvs,
> which somehow doesn't make much sense to me ...
> 

I think you'd just need to have a (large) handful of different
branches, and squash+cherry-pick into master when you're ready to push.





Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-15 Thread Piotr Szymaniak
On Mon, Sep 15, 2014 at 11:26:47PM +1200, Kent Fredric wrote:
> None of these are impossible things, but they're much more complex than
> "just make a dodgy commit and get somebody to pull it".

Much simpler would be for one of the devs to make a dodgy commit. Why
use users for that, if "the bad guy(s)" could be inside? Isn't that
the best way to poison stuff? Just look at the "let's move to git"
threads: there's bikeshedding, and it's still cvs+rsync (maybe, just
maybe, this isn't a coincidence).


Piotr Szymaniak.
-- 
The man put the newspaper back on the rack and took a step forward,
making a face that gave him the look of a pig's bladder stretched over
a wire suit hanger.
  -- Graham Masterton, "The Burning"




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-15 Thread Kent Fredric
On 15 September 2014 22:10, Jauhien Piatlicki  wrote:

> So signing of git commits does not guarantee enough security (taking
> that SHA1 is weak and can be broken), right? Could we than just use
> usual (not thin) manifests?
>

However, the attackability of SHA1 may be entirely immaterial, because
methods to exploit that require compromising other security strategies.

If somebody pushes signed commit 0x0001  with parents 0x0002 and 0x0003
with tree 0x0004 with files 0x0005 to 0x0010, those binary blobs are
pushed. And there is no way I know of to have those binary blobs replaced
with cuckoo blobs.

Once they're replicated, Git doesn't try re-replicating the same SHA1s.

So your attack vectors entail directly manipulating the git storage on
gentoo's servers, or poisoning a mirror under their control, or poisoning
the data *PRIOR* to it landing on gentoo servers, or being NSA and
poisoning it dynamically when a user attempts to fetch that specific SHA1.

None of these are impossible things, but they're much more complex than
"just make a dodgy commit and get somebody to pull it".

This basically means you could use CRC32 as your hash algorithm and still
pose a respectable problem for would-be attackers.

As such, I don't presently see git commit signing as a "Security" model,
merely a proof of authorship model. Anybody can forge commits with your
'author  = ' and 'committer = ', but you're the only person who can sign
the commit with your signature.

That is to say, you sign that you crafted the *commit*. But you're *not*
signing the creation of any of the dependencies.

For instance, two commits may have the same tree, but obviously only one
person forged that tree object.

And two trees may have the same file ( and indeed, this is an expected
element of how git works ), but you're not signing that you created all the
files in that tree. You may infer that from the chain of authority from the
commit itself, but it is not fact.

And parent objects are also dependencies, and nobody would ever consider
claiming they're a signing authority for all of that ;). That would be
signing, by proxy, the creation of the entire repository back to the first
commit ever forged! It's for that reason that it's probably good that git
doesn't presently feed all dependencies of a commit recursively into GPG.
I don't have 5 hours to spare while every single blob in my repository is
uncompressed and fed through GPG :p
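
An editorial sketch of what a commit object actually contains, assuming a
throwaway repository with one made-up file. The commit text names its tree
(and any parents) only by hash, and a blob ID can be recomputed from the
file content alone. Anything that covers just the commit object therefore
reaches file contents only through that hash chain.

```shell
#!/bin/sh
# Throwaway demo; the file name and contents are made up.
set -e
command -v git >/dev/null 2>&1 || { echo "git not found, skipping"; exit 0; }
cd "$(mktemp -d)"
git init -q .
g() { git -c user.name=Demo -c user.email=demo@example.invalid "$@"; }

echo 'EAPI=5' > foo-1.ebuild
g add foo-1.ebuild
g commit -q -m "initial import"

commit_obj=$(git cat-file -p HEAD)          # raw commit: tree, author, message
tree_id=$(git rev-parse 'HEAD^{tree}')      # hash of the top-level tree
blob_id=$(git rev-parse HEAD:foo-1.ebuild)  # hash recorded for the file
recomputed=$(git hash-object foo-1.ebuild)  # same hash, from content alone

printf '%s\n' "$commit_obj"
printf 'tree: %s\nblob: %s\n' "$tree_id" "$blob_id"
```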


-- 
Kent

*KENTNL* - https://metacpan.org/author/KENTNL


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-15 Thread Jauhien Piatlicki
Hi,

On 09/15/2014 01:37 AM, Kent Fredric wrote:
> On 15 September 2014 11:25, hasufell  wrote:
> 
>> Robin said
>>> The Git commit-signing design explicitly signs the entire commit,
>> including blob contents, to avoid this security problem.
>>
>> Is this correct or not?
>>
> 
> I can verify a commit by hand with only the commit object and gpg, but
> without any of the trees or parents.
> 
> https://gist.github.com/kentfredric/8448fe55ffab7d314ecb
> 
> 

So signing of git commits does not guarantee enough security (given
that SHA1 is weak and can be broken), right? Could we then just use
the usual (not thin) manifests?

--
Jauhien





Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-15 Thread Michał Górny
On 2014-09-15, at 07:21:35,
Patrick Lauer wrote:

> On Sunday 14 September 2014 15:42:15 hasufell wrote:
> > Patrick Lauer:
> > >> Are we going to disallow merge commits and ask devs to rebase local
> > >> changes in order to keep the history "clean"?
> > > 
> > > Is that going to be sane with our commit frequency?
> > 
> > You have to merge or rebase anyway in case of a push conflict, so the
> > only difference is the method and the effect on the history.
> > 
> > Currently... CVS allows you to run repoman on an outdated tree and push
> > broken ebuilds with repoman being happy. Git will not allow this.
> 
> iow, git doesn't allow people to work on more than one item at a time?
> 
> That'd mean I need half a dozen checkouts just to emulate cvs, which somehow 
> doesn't make much sense to me ...

I'd appreciate it if you kept the FUD to a minimum.

What hasufell meant is that you normally don't have three-year-old
files lying around in a checkout because you did 'cvs up -dP' in another
directory. With git, you update everything.

What you do locally is totally unrelated.
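
An editorial sketch of the push behaviour being discussed, assuming a local
bare repository as the shared tree and two made-up developer clones. The
second developer's push is refused because the shared branch has moved on;
after a "pull --rebase" brings the whole tree up to date, the same push goes
through.

```shell
#!/bin/sh
# Throwaway demo; central.git, dev-a and dev-b are made-up names.
set -e
command -v git >/dev/null 2>&1 || { echo "git not found, skipping"; exit 0; }
top=$(mktemp -d)
cd "$top"
g() { git -c user.name=Demo -c user.email=demo@example.invalid "$@"; }

git init -q --bare central.git
git --git-dir=central.git symbolic-ref HEAD refs/heads/master

git clone -q central.git dev-a 2>/dev/null
( cd dev-a && echo base > README && g add README && g commit -q -m "base" \
    && g push -q origin HEAD:refs/heads/master )
git clone -q central.git dev-b

# Developer A pushes first.
( cd dev-a && echo a > a.ebuild && g add a.ebuild && g commit -q -m "a: bump" \
    && g push -q origin HEAD:refs/heads/master )

# Developer B commits on the now outdated tree and tries to push.
cd dev-b
echo b > b.ebuild
g add b.ebuild
g commit -q -m "b: bump"
if g push -q origin HEAD:refs/heads/master 2>/dev/null; then
    first_push=accepted
else
    first_push=rejected
fi

# Update the whole tree, then retry.
g pull -q --rebase origin master
g push -q origin HEAD:refs/heads/master
shared=$(git --git-dir="$top/central.git" log --format=%s master)
printf 'first push: %s\n%s\n' "$first_push" "$shared"
```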

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-15 Thread Michał Górny
On 2014-09-14, at 21:30:36,
Tim Harder wrote:

> On 2014-09-14 10:46, Michał Górny wrote:
> > On 2014-09-14, at 15:40:06,
> > Davide Pesavento wrote:
> > > How long does the md5-cache regeneration process take? Are you sure it
> > > will be able to keep up with the rate of pushes to the repo during
> > > "peak hours"? If not, maybe we could use a time-based thing similar to
> > > the current cvs->rsync synchronization.
> > 
> > This strongly depends on how much data is there to update. A few
> > ebuilds are quite fast, eclass change isn't ;). I was thinking of
> > something along the lines of, in pseudo-code speaking:
> > 
> >   systemctl restart cache-regen
> > 
> > That is, we start the regen on every update. If it finishes in time, it
> > commits the new metadata. If another update occurs during regen, we
> > just restart it to let it catch the new data.
> > 
> > Of course, if we can't spare the resources to do intermediate updates,
> > we may as well switch to cron-based update method.
> 
> I don't see per push metadata regen working entirely well in this case
> if this is the only way we're generating the metadata cache for users to
> sync. It's easy to imagine a plausible situation where a widely used
> eclass change is made followed by commits less than a minute apart (or
> shorter than however long it would take for metadata regen to occur) for
> at least 30 minutes (rsync refresh period for most user-facing mirrors)
> during a time of high activity.

For a metadata recheck (that is, egencache run with no changes):

a. cold cache ext4:

  real    3m54.321s
  user    0m44.413s
  sys     0m13.497s

b. warm cache ext4:

  real    0m40.672s
  user    0m35.087s
  sys     0m4.687s

I will try to re-run that on btrfs or reiserfs to get more meaningful
numbers.

Now, those results back up your claims. However, if we can get that to
<10s, I doubt we would have a major issue. My idea works like this:

1. first update is pushed,
1a. egencache starts rechecking and updating cache,
2. second update is pushed,
2a. previous egencache is terminated,
2b. egencache starts rechecking and updating cache,
2c. egencache finishes in time and commits.

The point is, nothing gets committed to the user-reachable location
before egencache finishes. And it goes quasi-incrementally, so if
another update happens before egencache finished, it only does
the 'slow' regen on changed metadata.
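
The restart-on-push scheme above can be sketched as a small timing model.
This is only an illustration under simplifying assumptions: the function
name and the fixed regen duration are made up, and the model ignores the
quasi-incremental effect (a restarted run having less to redo), so it is
pessimistic compared to the real behaviour.

```python
def commit_times(push_times, regen_seconds):
    """Return the times (in seconds) at which a regen run commits metadata.

    Every push restarts the regen. A run only gets to commit if no newer
    push arrives before it would have finished.
    """
    pushes = sorted(push_times)
    commits = []
    for i, start in enumerate(pushes):
        finish = start + regen_seconds
        next_push = pushes[i + 1] if i + 1 < len(pushes) else None
        # The run survives only if it finishes before the next restart.
        if next_push is None or next_push >= finish:
            commits.append(finish)
    return commits


# One push every 30 seconds for 30 minutes.
busy_half_hour = range(0, 1801, 30)

# With a 40 s recheck (roughly the warm-cache figure above), nothing is
# committed until the burst of pushes is over.
print(commit_times(busy_half_hour, 40))

# With a 10 s recheck, every push gets its own metadata commit.
print(len(commit_times(busy_half_hour, 10)))
```

In this model a regen that takes longer than the gap between pushes
commits nothing until the pushes stop, which is the scenario Tim
describes; once the regen is shorter than the gap, every push is followed
by a commit.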

I will come back with more results soon.

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Tim Harder
On 2014-09-14 21:57, Kent Fredric wrote:
> I generate metadata for the perl-experimental overlay periodically as a
> snapshotted variation of the same, and the performance isn't so bad.

Overlays with few eclasses are much different than the main tree.
Anyway, egencache isn't bad; it's just significantly slower than the
alternatives, so it could be sped up quite a lot if necessary.

> However, what I suspect you *could* do with a push hook is regen metadata
> for only things that were modified in that commit, because I believe
> there's a way to regen metadata for only specific files now.

> ie:
>  modifications to cat/PN *would* trigger a metadata update, but only for
> that cat/PN
>  modifications to eclass/* would *NOT* trigger a metadata update as part of
> the push.

> And doing tree-wide "an eclass was changed" updates could be done with
> lower priority in an asynchronous cron job or something so as not to block
> workflow for several minutes/hours/whatever while some muppet sits there
> watching "git push" do nothing.

If we need to do piecewise regen it seems we would be better off just
sticking with the current scheduled cron job approach. Otherwise it
sounds like one could pull updates without having the correct metadata
for a significant portion of the tree.

Tim




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Kent Fredric
On 15 September 2014 13:30, Tim Harder  wrote:

> I haven't run portage metadata regen on a beefy machine lately, but I
> don't think it could keep up in all cases. Perhaps someone can prove me
> wrong.
>
> Anyway, things could definitely be sped up if portage merges a few speed
> tweaks used in pkgcore. Specifically, I think using some of the weakref
> and perhaps jitted attrs support along with the eclass caching hacks
> would give a 2-4x metadata regen speedup. Otherwise pkgcore could
> potentially be used to regen metadata as well or some other tuned regen
> tool.
>


I generate metadata for the perl-experimental overlay periodically as a
snapshotted variation of the same, and the performance isn't so bad.

However, what I suspect you *could* do with a push hook is regen metadata
for only things that were modified in that commit, because I believe
there's a way to regen metadata for only specific files now.

ie:
 modifications to cat/PN *would* trigger a metadata update, but only for
that cat/PN
 modifications to eclass/* would *NOT* trigger a metadata update as part of
the push.

And doing tree-wide "an eclass was changed" updates could be done with
lower priority in an asynchronous cron job or something so as not to block
workflow for several minutes/hours/whatever while some muppet sits there
watching "git push" do nothing.
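
A rough sketch of how such a hook could sort the paths touched by a push.
The function name and the path rules are assumptions made for
illustration; this is not an existing portage or egencache interface, and
it only looks at ebuilds and eclasses.

```python
def plan_regen(changed_paths):
    """Split changed paths into work for the hook and work for cron.

    Returns (packages, eclass_changed):
      packages        sorted list of cat/PN to regenerate during the push
      eclass_changed  True if a tree-wide regen should be left to cron
    """
    packages = set()
    eclass_changed = False
    for path in changed_paths:
        parts = path.split("/")
        if parts[0] == "eclass":
            # Tree-wide impact: do not block the push on this.
            eclass_changed = True
        elif len(parts) == 3 and parts[2].endswith(".ebuild"):
            # cat/PN/PN-version.ebuild: regenerate only this package.
            packages.add("/".join(parts[:2]))
    return sorted(packages), eclass_changed


# Hypothetical push touching one package, one eclass and a profile file.
print(plan_regen([
    "dev-lang/perl/perl-5.20.1.ebuild",
    "dev-lang/perl/Manifest",
    "eclass/perl-module.eclass",
    "profiles/package.mask",
]))
```

The hook would then regenerate metadata only for the returned packages
and leave a marker for the cron job when the flag is set.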

-- 
Kent

*KENTNL* - https://metacpan.org/author/KENTNL


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Tim Harder
On 2014-09-14 10:46, Michał Górny wrote:
> On 2014-09-14, at 15:40:06,
> Davide Pesavento wrote:
> > How long does the md5-cache regeneration process take? Are you sure it
> > will be able to keep up with the rate of pushes to the repo during
> > "peak hours"? If not, maybe we could use a time-based thing similar to
> > the current cvs->rsync synchronization.
> 
> This strongly depends on how much data is there to update. A few
> ebuilds are quite fast, eclass change isn't ;). I was thinking of
> something along the lines of, in pseudo-code speaking:
> 
>   systemctl restart cache-regen
> 
> That is, we start the regen on every update. If it finishes in time, it
> commits the new metadata. If another update occurs during regen, we
> just restart it to let it catch the new data.
> 
> Of course, if we can't spare the resources to do intermediate updates,
> we may as well switch to cron-based update method.

I don't see per push metadata regen working entirely well in this case
if this is the only way we're generating the metadata cache for users to
sync. It's easy to imagine a plausible situation where a widely used
eclass change is made followed by commits less than a minute apart (or
shorter than however long it would take for metadata regen to occur) for
at least 30 minutes (rsync refresh period for most user-facing mirrors)
during a time of high activity.

I haven't run portage metadata regen on a beefy machine lately, but I
don't think it could keep up in all cases. Perhaps someone can prove me
wrong.

Anyway, things could definitely be sped up if portage merges a few speed
tweaks used in pkgcore. Specifically, I think using some of the weakref
and perhaps jitted attrs support along with the eclass caching hacks
would give a 2-4x metadata regen speedup. Otherwise pkgcore could
potentially be used to regen metadata as well or some other tuned regen
tool.

Tim




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Kent Fredric
On 15 September 2014 13:06, Peter Stuge  wrote:

> even after
> the commits.
>

I've even made commits in "detached HEAD" state (that is, without a
branch) and given them branches after the fact.

After all, branches aren't really "things", they're just pointers to SHA1s,
that get repointed to new sha1's as part of "git commit".

Tags are also simply pointers, they just don't get updated by default.
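
As a toy model of that (not git's real data structures; the class and
method names are invented for illustration): a branch is a name mapped to
a commit id, "commit" moves the checked-out branch, and a tag stays where
it was put.

```python
class TinyRepo:
    """Toy model: refs are just names mapped to commit ids."""

    def __init__(self):
        self.branches = {}                  # branch name -> commit id
        self.tags = {}                      # tag name -> commit id
        self.head = ("branch", "master")    # or ("detached", commit_id)
        self._counter = 0

    def current_commit(self):
        kind, value = self.head
        return self.branches.get(value) if kind == "branch" else value

    def commit(self):
        self._counter += 1
        new_id = f"c{self._counter}"        # stand-in for a real SHA1
        kind, value = self.head
        if kind == "branch":
            self.branches[value] = new_id   # the branch pointer moves
        else:
            self.head = ("detached", new_id)  # only HEAD moves
        return new_id

    def tag(self, name):
        self.tags[name] = self.current_commit()

    def detach(self):
        self.head = ("detached", self.current_commit())

    def branch_here(self, name):
        self.branches[name] = self.current_commit()


repo = TinyRepo()
repo.commit()               # master -> c1
repo.tag("v1")              # v1 -> c1, and it stays there
repo.detach()
repo.commit()               # c2 exists, but no branch points at it
repo.branch_here("topic")   # give it a branch after the fact
print(repo.branches, repo.tags)
```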

-- 
Kent

*KENTNL* - https://metacpan.org/author/KENTNL


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Peter Stuge
Rich Freeman wrote:
> If you just want to do 15 standalone commits before you push you can
> do those sequentially easily enough.  A branch would be more
> appropriate for some kind of mini-project.
..
> That is the beauty of git - branches are really cheap.
> So are repositories

And commits.

Not only are branches cheap, they are also very easy to create, and
maybe most importantly they can be created at any time, even after
the commits.

It's quick and painless to create a bunch of commits which aren't
really closely related in sequence, and only later clean the whole
series of commits up while creating different branches for commits
which should actually be grouped rather than mixed all together.


//Peter



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Peter Stuge
Patrick Lauer wrote:
> > > That'd mean I need half a dozen checkouts just to emulate cvs, which
> > > somehow doesn't make much sense to me ...
> > 
> > Unlike CVS, git doesn't force you to work in "Keep millions of files in
> > uncommitted states" mode just to work on a codebase, due to the commit <->
> > replicate separation.
> 
> But that's the feature!

You can have millions of uncommitted files with git too. The person
who creates a commit always decides what changes in what files should
be included in that commit. (You don't even have to commit all the
changes within one file at the same time.)

There are some shortcuts for committing all uncommitted changes at
once but you don't have to do that. I frequently only commit little
bits of my currently uncommitted changes.


> I can work on bumping postgresql (takes about 1h walltime to compile and test 
> all versions) *and* work on a few tiny python packages while doing that. 
> Without breaking either process. Without multiple checkouts.

Same with git.


//Peter



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Rich Freeman
On Sun, Sep 14, 2014 at 7:21 PM, Patrick Lauer  wrote:
>
> iow, git doesn't allow people to work on more than one item at a time?
>
> That'd mean I need half a dozen checkouts just to emulate cvs, which somehow
> doesn't make much sense to me ...
>

Well, you can work on as many things as you like in git, but it
doesn't keep track of what changes have to do with what things if you
don't commit in-between.  So, you'll have a big list of changes in
your index, and you'll have to pick-and-choose what you commit at any
one time.

If you really want to work on many things "at once" the better way to
do it is to do a temporary branch per-thing, and when you switch
between things you switch between branches, and then move into master
things as they are done.

I assume you mean working on things that will take a while to
complete.  If you just want to do 15 standalone commits before you
push you can do those sequentially easily enough.  A branch would be
more appropriate for some kind of mini-project.

You can work on branches without pushing those to the master repo.
Or, if appropriate a project team might choose to push their branch to
master, or to some other repo (like an overlay).  This would allow
collaborative work on a large commit, with a quick final merge into
the main tree.  That is the beauty of git - branches are really cheap.
So are repositories - if somebody wants to do all their work in github
and then push to the main tree, they can do that.

--
Rich



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread hasufell
Patrick Lauer:
> On Monday 15 September 2014 11:27:34 Kent Fredric wrote:
>> On 15 September 2014 11:21, Patrick Lauer  wrote:
>>> iow, git doesn't allow people to work on more than one item at a time?
>>>
>>> That'd mean I need half a dozen checkouts just to emulate cvs, which
>>> somehow
>>> doesn't make much sense to me ...
>>
>> Use the Stash. Or just commit items, then swap branches, and then discard
>> the commits sometime later before pushing.
>>
>> Unlike CVS, git doesn't force you to work in "Keep millions of files in
>> uncommitted states" mode just to work on a codebase, due to the commit <->
>> replicate separation.
> But that's the feature!
> 
> I can work on bumping postgresql (takes about 1h walltime to compile and test 
> all versions) *and* work on a few tiny python packages while doing that. 
> Without breaking either process. Without multiple checkouts.
> 
> I doubt stash would allow things to progress ... but it's a cute idea.
> 

Please read up about git branches.

I don't see anything particularly broken. People use git to work on 10+
different features at a time. It works.

Also, let's not derail this thread to git vs CVS, thanks.



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Kent Fredric
On 15 September 2014 11:25, hasufell  wrote:

> Robin said
> > The Git commit-signing design explicitly signs the entire commit,
> including blob contents, to avoid this security problem.
>
> Is this correct or not?
>

I can verify a commit by hand with only the commit object and gpg, but
without any of the trees or parents.

https://gist.github.com/kentfredric/8448fe55ffab7d314ecb


-- 
Kent

*KENTNL* - https://metacpan.org/author/KENTNL


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread hasufell
Patrick Lauer:
> On Sunday 14 September 2014 15:42:15 hasufell wrote:
>> Patrick Lauer:
>>>> Are we going to disallow merge commits and ask devs to rebase local
>>>> changes in order to keep the history "clean"?
>>>
>>> Is that going to be sane with our commit frequency?
>>
>> You have to merge or rebase anyway in case of a push conflict, so the
>> only difference is the method and the effect on the history.
>>
>> Currently... CVS allows you to run repoman on an outdated tree and push
>> broken ebuilds with repoman being happy. Git will not allow this.
> 
> iow, git doesn't allow people to work on more than one item at a time?
> 

Completely the opposite. You can work on 400 packages, accumulate the
changes, commit them and push them in one blow instead of writing
fragile scripts or Makefiles that do >400 pushes, fail at some point in
the middle because of a conflict and then try to figure out what you
already pushed and what not.

> That'd mean I need half a dozen checkouts just to emulate cvs, which somehow 
> doesn't make much sense to me ...
> 

checkouts? You probably mean that you have to rebase your changes in
case someone pushed before you. That makes perfect sense, because the
ebuild you just wrote might be broken by now, because someone changed
profiles/.

We are talking about a one-liner in the shell that will work in the
majority of the cases. If it doesn't work (as in: merge conflict), then
that means there is something REALLY wrong and 2 people are working
uncoordinated on the same file at a time.



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Patrick Lauer
On Monday 15 September 2014 11:27:34 Kent Fredric wrote:
> On 15 September 2014 11:21, Patrick Lauer  wrote:
> > iow, git doesn't allow people to work on more than one item at a time?
> > 
> > That'd mean I need half a dozen checkouts just to emulate cvs, which
> > somehow
> > doesn't make much sense to me ...
> 
> Use the Stash. Or just commit items, then swap branches, and then discard
> the commits sometime later before pushing.
> 
> Unlike CVS, git doesn't force you to work in "Keep millions of files in
> uncommitted states" mode just to work on a codebase, due to the commit <->
> replicate separation.
But that's the feature!

I can work on bumping postgresql (takes about 1h walltime to compile and test 
all versions) *and* work on a few tiny python packages while doing that. 
Without breaking either process. Without multiple checkouts.

I doubt stash would allow things to progress ... but it's a cute idea.



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread W. Trevor King
On Sun, Sep 14, 2014 at 11:25:33PM +, hasufell wrote:
> So can we get this clear now.
> 
> Robin said
>
> > The Git commit-signing design explicitly signs the entire commit,
> > including blob contents, to avoid this security problem.
> 
> Is this correct or not?

That is false.  The commit signature explicitly signs the commit,
which includes the root tree hash.  That is the only connection
between the signature and the tree contents.
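
To make that concrete, here is a small sketch that builds the bytes of a
commit object by hand. It assumes git's SHA-1 object format; the identity
line, file name and commit message are made-up placeholders, and the tree
has a single entry so entry ordering does not matter.

```python
import hashlib


def git_object_id(kind, body):
    """SHA-1 of a git object: '<kind> <length>' + NUL + body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries):
    """Tree entries are '<mode> <name>' + NUL + the raw 20-byte object id."""
    return b"".join(
        f"{mode} {name}".encode() + b"\0" + bytes.fromhex(oid)
        for mode, name, oid in entries
    )


def commit_payload(tree_id, message):
    """The commit text; this is what a commit signature is computed over."""
    who = "A Dev <dev@example.org> 1410700000 +0000"   # placeholder identity
    text = f"tree {tree_id}\nauthor {who}\ncommitter {who}\n\n{message}\n"
    return text.encode()


def payload_for(file_content):
    blob_id = git_object_id("blob", file_content)
    entries = [("100644", "foo.initd", blob_id)]
    tree_id = git_object_id("tree", tree_body(entries))
    return commit_payload(tree_id, "add init script")


payload = payload_for(b"start() { :; }\n")
print(payload.decode())
# The file content never appears in the signed bytes, only the tree hash.
print(b"start()" in payload)
```

The blob content reaches the signed text only through the tree hash, so
two blobs with the same SHA-1 would produce byte-identical commit text
and the same signature would verify for both.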

Cheers,
Trevor

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Kent Fredric
On 15 September 2014 11:21, Patrick Lauer  wrote:

> iow, git doesn't allow people to work on more than one item at a time?
>
> That'd mean I need half a dozen checkouts just to emulate cvs, which
> somehow
> doesn't make much sense to me ...
>

Use the Stash. Or just commit items, then swap branches, and then discard
the commits sometime later before pushing.

Unlike CVS, git doesn't force you to work in "Keep millions of files in
uncommitted states" mode just to work on a codebase, due to the commit <->
replicate separation.


-- 
Kent

*KENTNL* - https://metacpan.org/author/KENTNL


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread hasufell
Rich Freeman:
> On Sun, Sep 14, 2014 at 6:56 PM, hasufell  wrote:
>> According to Robin, it's not about rebasing, it's about signing all
>> commits so that messing with the blob (even if it has the same sha-1)
>> will cause signature verification failure.
>>
> 
> The only thing that gets signed is the commit message, and the only
> thing that ties the commit message to the code is the sha1 of the
> top-level tree.  If you can attack sha1 either at any tree level or at
> the blob level you can defeat the signature.
> 

So can we get this clear now.

Robin said
> The Git commit-signing design explicitly signs the entire commit, including 
> blob contents, to avoid this security problem.

Is this correct or not?



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread W. Trevor King
On Sun, Sep 14, 2014 at 07:13:21PM -0400, Rich Freeman wrote:
> The only thing that gets signed is the commit message, and the only
> thing that ties the commit message to the code is the sha1 of the
> top-level tree.  If you can attack sha1 either at any tree level or at
> the blob level you can defeat the signature.
> 
> That is way better than nothing though - I think it is worth pursuing
> until somebody comes up with a way to upgrade git to more secure
> hashes.  Most projects don't gpg sign their trees at all, including
> linux.

I'm not worried about the attack (as I explained earlier in this
thread).  I'm just arguing for signing first-parent commits to master,
and not worrying about signatures on any side-branch commits.  So long
as the merge gets signed, you've got all the security you're going to
get.  Leaving the side-branch commits unchanged allows you to preserve
any non-dev commit hashes, which makes it easier for contributors to
verify that their changes have landed (the same way that GitHub is
checking to know when to automatically close pull requests).

Cheers,
Trevor

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Kent Fredric
On 15 September 2014 11:15, W. Trevor King  wrote:

> All cherry-pick and am do is apply one commit's diff to a different
> parent.  That changes the parent hash (which is stored in the commit body
> [1]), so old signatures won't apply to the new commit.  If there have
> been other tree changes between the initial parent and the new parent,
> the tree hash will also change, which would also break old signatures.
> None of that has anything to do with a malicious blob being pushed
> into the tree disguised as a same-hashed good blob.  Such a blob will
> *not* break any signatures, since GnuPG is *never hashing the blob
> contents* when signing commits [1,2].  You're only signing the commit
> object, not the tree and blob objects referenced by that commit.
>
> Cheers,
> Trevor
>


And given that the method of "security" against attacks is established by a
chain of custody from a signed commit, through multiple child unsigned SHA1
objects, having a parent being an unsigned commit is no *less* secure than
having a tree or file blob being unsigned, it doesn't make perfect sense to
me that "all" commits have to be signed.  ( Because doing so doesn't give
the benefit of security we think it does ).

Thus, a "I signed this commit, establishing a chain of trust relying on
SHA1 integrity to the previous signed commit" is all that seems truly
necessary. Anything else is decreased utility with no increase in security.


-- 
Kent

*KENTNL* - https://metacpan.org/author/KENTNL


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Patrick Lauer
On Sunday 14 September 2014 15:42:15 hasufell wrote:
> Patrick Lauer:
> >> Are we going to disallow merge commits and ask devs to rebase local
> >> changes in order to keep the history "clean"?
> > 
> > Is that going to be sane with our commit frequency?
> 
> You have to merge or rebase anyway in case of a push conflict, so the
> only difference is the method and the effect on the history.
> 
> Currently... CVS allows you to run repoman on an outdated tree and push
> broken ebuilds with repoman being happy. Git will not allow this.

iow, git doesn't allow people to work on more than one item at a time?

That'd mean I need half a dozen checkouts just to emulate cvs, which somehow 
doesn't make much sense to me ...



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Kent Fredric
On 15 September 2014 10:56, hasufell  wrote:

> According to Robin, it's not about rebasing, it's about signing all
> commits so that messing with the blob (even if it has the same sha-1)
> will cause signature verification failure.
>

Correct me if I'm wrong, but wouldn't a SHA1 attack on the tree object or
file blobs be completely invisible to the commit SHA1?

As the Signature only signs content of the commit object, not any of the
nodes it refers to.

Granted, getting a tree/file object to replicate might be interesting.

-- 
Kent

*KENTNL* - https://metacpan.org/author/KENTNL


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread W. Trevor King
On Sun, Sep 14, 2014 at 10:56:33PM +, hasufell wrote:
> W. Trevor King:
> > On Sun, Sep 14, 2014 at 10:38:41PM +, hasufell wrote:
> >> So we'd basically end up using either "git cherry-pick" or "git
> >> am" for "pulling" user stuff, so that we also sign the blobs.
> > 
> > Rebasing the original commits doesn't protect you from the
> > > birthday attack either, because the vulnerable hash is likely
> > going to still be in the rebased commit's tree.  All rebasing does
> > is swap the committer and drop the initial signature.
> 
> According to Robin, it's not about rebasing, it's about signing all
> commits so that messing with the blob (even if it has the same
> sha-1) will cause signature verification failure.

All cherry-pick and am do is apply one commit's diff to a different
parent.  That changes the parent hash (which is stored in the commit body
[1]), so old signatures won't apply to the new commit.  If there have
been other tree changes between the initial parent and the new parent,
the tree hash will also change, which would also break old signatures.
None of that has anything to do with a malicious blob being pushed
into the tree disguised as a same-hashed good blob.  Such a blob will
*not* break any signatures, since GnuPG is *never hashing the blob
contents* when signing commits [1,2].  You're only signing the commit
object, not the tree and blob objects referenced by that commit.

Cheers,
Trevor

[1]: http://article.gmane.org/gmane.linux.gentoo.devel/77537
[2]: http://git.kernel.org/cgit/git/git.git/tree/commit.c?id=v2.1.0#n1076

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Rich Freeman
On Sun, Sep 14, 2014 at 6:56 PM, hasufell  wrote:
> According to Robin, it's not about rebasing, it's about signing all
> commits so that messing with the blob (even if it has the same sha-1)
> will cause signature verification failure.
>

The only thing that gets signed is the commit message, and the only
thing that ties the commit message to the code is the sha1 of the
top-level tree.  If you can attack sha1 either at any tree level or at
the blob level you can defeat the signature.

That is way better than nothing though - I think it is worth pursuing
until somebody comes up with a way to upgrade git to more secure
hashes.  Most projects don't gpg sign their trees at all, including
linux.

--
Rich



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread hasufell
W. Trevor King:
> On Sun, Sep 14, 2014 at 10:38:41PM +, hasufell wrote:
>> Yes, there is a possible attack vector mentioned in this comment
>> https://bugs.gentoo.org/show_bug.cgi?id=502060#c16
> 
> From that comment, the point 1.2 is highly unlikely [1]:
> 
>   1. Attacker constructs a init.d script, regular part at the start,
>  malicious part at the end
>   1.1. This would be fairly simple, just construct two start()
>  functions, one of which is mundane, the other is malicious.
>   1.2. Both variants of the script have the same SHA1...
> 
>> So we'd basically end up using either "git cherry-pick" or "git am"
>> for "pulling" user stuff, so that we also sign the blobs.
> 
> Rebasing the original commits doesn't protect you from the birthday
> attack either, because the vulnerable hash is likely going to still be
> in the rebased commit's tree.  All rebasing does is swap the committer
> and drop the initial signature.
> 

According to Robin, it's not about rebasing, it's about signing all
commits so that messing with the blob (even if it has the same sha-1)
will cause signature verification failure.



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread W. Trevor King
On Sun, Sep 14, 2014 at 10:38:41PM +, hasufell wrote:
> Yes, there is a possible attack vector mentioned in this comment
> https://bugs.gentoo.org/show_bug.cgi?id=502060#c16

From that comment, the point 1.2 is highly unlikely [1]:

  1. Attacker constructs a init.d script, regular part at the start,
 malicious part at the end
  1.1. This would be fairly simple, just construct two start()
 functions, one of which is mundane, the other is malicious.
  1.2. Both variants of the script have the same SHA1...

> So we'd basically end up using either "git cherry-pick" or "git am"
> for "pulling" user stuff, so that we also sign the blobs.

Rebasing the original commits doesn't protect you from the birthday
attack either, because the vulnerable hash is likely going to still be
in the rebased commit's tree.  All rebasing does is swap the committer
and drop the initial signature.

Cheers,
Trevor

[1]: http://article.gmane.org/gmane.comp.version-control.git/210622

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread hasufell
W. Trevor King:
> On Sun, Sep 14, 2014 at 05:40:30PM +0200, Michał Górny wrote:
>> On 2014-09-15, at 03:15:14, Kent Fredric wrote:
>>> Only downside there is the way github pull reqs work is if the
>>> final SHA1's that hit tree don't match, the pull req doesn't
>>> close.
>>>
>>> Solutions:
>>>
>>> - A) Have somebody tasked with reaping old pull reqs with
>>> permissions granted. ( Uck )
>>> - B) Always use a merge of some kind to mark the pull req as dead
>>> ( for instance, an "ours" merge to mark the branch as deprecated )
>>>
>>> Both of those options are kinda ugly.
>>
>> If you merge a pull request, I suggest doing a proper 'git merge -S'
>> anyway to get a developer signature on top of all the changes.
> 
> Some previous package-tree-in-Git efforts suggested that only
> Gentoo-dev signatures were acceptable, and that those signatures would
> be required on every commit (not just the first-parent line) [1,2].  I
> don't see the point of that, so long as Gentoo devs are signing the
> first-parent line, but if folks still want Gentoo-dev signatures on
> every commit the ‘git merge -S’ approach will not work for closing
> PRs.
> 
> Cheers,
> Trevor
> 
> [1]: http://article.gmane.org/gmane.linux.gentoo.devel/77572
>  id:cagfcs_manfikevtj3cmcq1of-uqavebe2r1okykygwc5vom...@mail.gmail.com
> [2]: https://bugs.gentoo.org/show_bug.cgi?id=502060#c0
> 

Yes, there is a possible attack vector mentioned in this comment
https://bugs.gentoo.org/show_bug.cgi?id=502060#c16

So we'd basically end up using either "git cherry-pick" or "git am" for
"pulling" user stuff, so that we also sign the blobs.

Regular merges would still be possible for developer pull requests, but
that's probably not the primary use case anyway.



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread W. Trevor King
On Sun, Sep 14, 2014 at 05:40:30PM +0200, Michał Górny wrote:
> On 2014-09-15, at 03:15:14, Kent Fredric wrote:
> > Only downside there is the way github pull reqs work is if the
> > final SHA1's that hit tree don't match, the pull req doesn't
> > close.
> > 
> > Solutions:
> > 
> > - A) Have somebody tasked with reaping old pull reqs with
> > permissions granted. ( Uck )
> > - B) Always use a merge of some kind to mark the pull req as dead
> > ( for instance, an "ours" merge to mark the branch as deprecated )
> > 
> > Both of those options are kinda ugly.
> 
> If you merge a pull request, I suggest doing a proper 'git merge -S'
> anyway to get a developer signature on top of all the changes.

Some previous package-tree-in-Git efforts suggested that only
Gentoo-dev signatures were acceptable, and that those signatures would
be required on every commit (not just the first-parent line) [1,2].  I
don't see the point of that, so long as Gentoo devs are signing the
first-parent line, but if folks still want Gentoo-dev signatures on
every commit the ‘git merge -S’ approach will not work for closing
PRs.

Cheers,
Trevor

[1]: http://article.gmane.org/gmane.linux.gentoo.devel/77572
 id:cagfcs_manfikevtj3cmcq1of-uqavebe2r1okykygwc5vom...@mail.gmail.com
[2]: https://bugs.gentoo.org/show_bug.cgi?id=502060#c0

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Peter Stuge
Michał Górny wrote:
> What I need others to do is provide the hosting for git repos.

I'm happy to set up repos on my git server with custom hooks and
accounts as needed.

It's probably not what we want long-term, but it might be useful as
proof of concept, so that infra only needs to do setup one time.

I even have some virtual hosting working, point an A to the right IP
and it looks like only desired repos are hosted there.

Gitweb, git-daemon and git over http and CAcert https with pretty URLs.


//Peter




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Ivan Viso Altamirano
I think the better option is to block rsync and force emerge-webrsync.
Sent from a phone
On 14/09/2014 14:03, Michał Górny wrote:
> The rsync tree
> --
>
> We'd also propagate things to rsync. We'd have to populate it with old
> ChangeLogs, new ChangeLog entries (autogenerated from git) and thick
> Manifests. So users won't notice much of a change.
>
If this changes every ChangeLog, the first rsync by users will
generate a lot of traffic; the rsync network needs to be prepared.


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread viv...@gmail.com
On 14/09/2014 14:03, Michał Górny wrote:
> The rsync tree
> --
>
> We'd also propagate things to rsync. We'd have to populate it with old
> ChangeLogs, new ChangeLog entries (autogenerated from git) and thick
> Manifests. So users won't notice much of a change.
>
If this changes every ChangeLog, the first rsync by users will
generate a lot of traffic; the rsync network needs to be prepared.




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread James Cloos
> "MG" == Michał Górny  writes:

MG> This means we don't have to wait till someone figures out the perfect
MG> way of converting the old CVS repository. You don't need that history
MG> most of the time, and you can play with CVS to get it if you really do.
MG> In any case, we would likely strip the history anyway to get a small
MG> repo to work with.

+1 on that.  The cvs repo can be converted to an historical git repo on
a slower timeframe, and remain available as cvs until then.

That old-vs-fresh concept worked fine for other projects (including Linux).

-JimC
-- 
James Cloos  OpenPGP: 0x997A9F17ED7DAEA6



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Rich Freeman
On Sun, Sep 14, 2014 at 11:42 AM, hasufell  wrote:
> Patrick Lauer:
>>> Are we going to disallow merge commits and ask devs to rebase local
>>> changes in order to keep the history "clean"?
>>
>> Is that going to be sane with our commit frequency?
>>
>
> You have to merge or rebase anyway in case of a push conflict, so the
> only difference is the method and the effect on the history.
>
> Currently... CVS allows you to run repoman on an outdated tree and push
> broken ebuilds with repoman being happy. Git will not allow this.
>

Repoman is going to be a challenge here.  With cvs every package is
its own private repository with its own private history and cvs only
cares if there is a collision within the scope of a single file.

With git your commit is against the whole tree.  So, even though it is
trivial to merge, independent commits against two different packages
do collide and need to be rebased or merged.

Repoman can run against a single package fairly quickly, so assuming
we still allow that we could do a pull/rebase/repoman/push workflow
even if people are doing commits every few minutes.  On the other
hand, if you're doing a package move or eclass change or some other
change that affects 300 packages, just doing the rebase might cost you
a few minutes (due to actual collisions), and running repoman against
the whole thing before doing a push isn't going to be practical.
Somebody doing a tree-wide commit would almost certainly have to run
repoman before the final rebase/merge, push that out, and then maybe
do another repoman after-the-fact and maybe clean up any issues.  For
all intents and purposes that is what we're doing today anyway, since
repoman+cvs doesn't offer any kind of tree-wide consistency guarantees
unless you're checking out based on a timestamp or something like
that.
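
For a single package, the pull/rebase/repoman/push cycle could look
roughly like the sketch below. It is only an illustration: the function
name push_with_retry and the CHECK_CMD variable are made up for this
example, and it assumes the check (by default `repoman full`) is run from
the directory of the package that was changed.

```shell
#!/bin/sh
# push_with_retry REMOTE BRANCH [MAX_TRIES]
# Hypothetical helper: rebase onto the remote branch, re-run the QA check,
# then push. If someone else pushed in between, start over.
# CHECK_CMD is an assumption of this sketch; it defaults to "repoman full".
push_with_retry() {
    remote=$1
    branch=$2
    tries=${3:-5}
    attempt=0
    while [ "$attempt" -lt "$tries" ]; do
        attempt=$((attempt + 1))
        # Replay local commits on top of whatever landed upstream.
        git pull --quiet --rebase "$remote" "$branch" || return 1
        # Check the rebased result, not the pre-rebase tree.
        ${CHECK_CMD:-repoman full} || return 1
        # The push is rejected if the remote moved again; then we loop.
        if git push --quiet "$remote" "HEAD:$branch"; then
            return 0
        fi
    done
    return 1
}
```

The loop only stays cheap while the check is per-package; for a change
touching hundreds of packages the check step dominates, which is the
case described above.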

--
Rich



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Michał Górny
On 2014-09-15, at 03:15:14,
Kent Fredric wrote:

> On 15 September 2014 02:40, Michał Górny  wrote:
> 
> > However, I'm wondering if it would be possible to restrict people from
> > accidentally committing straight into github (e.g. merging pull
> > requests there instead of to our main server).
>
> => Github is just a read only mirror, any pull reqs submitted there will be
> fielded and pushed to gentoo directly.
> 
> Only downside there is the way github pull reqs work is if the final SHA1's
> that hit tree don't match, the pull req doesn't close.
> 
> Solutions:
> 
> - A) Have somebody tasked with reaping old pull reqs with permissions
> granted. ( Uck )
> - B) Always use a merge of some kind to mark the pull req as dead ( for
> instance, an "ours" merge to mark the branch as deprecated )
> 
> Both of those options are kinda ugly.

If you merge a pull request, I suggest doing a proper 'git merge -S'
anyway to get a developer signature on top of all the changes.

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread hasufell
Patrick Lauer:
>> Are we going to disallow merge commits and ask devs to rebase local
>> changes in order to keep the history "clean"?
> 
> Is that going to be sane with our commit frequency?
> 

You have to merge or rebase anyway in case of a push conflict, so the
only difference is the method and the effect on the history.

Currently... CVS allows you to run repoman on an outdated tree and push
broken ebuilds with repoman being happy. Git will not allow this.



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Patrick Lauer
On Sunday 14 September 2014 15:40:06 Davide Pesavento wrote:
> On Sun, Sep 14, 2014 at 2:03 PM, Michał Górny  wrote:
> > We have main developer repo where developers work & commit and are
> > relatively happy. For every push into developer repo, automated magic
> > thingie merges stuff into user sync repo and updates the metadata cache
> > there.
> 
> How long does the md5-cache regeneration process take? Are you sure it
> will be able to keep up with the rate of pushes to the repo during
> "peak hours"? If not, maybe we could use a time-based thing similar to
> the current cvs->rsync synchronization.

Best case only one package is affected - a few seconds
Worst case someone touches an eclass like eutils, then it expands to something 
on the order of one or two CPU-hours.
 
> Are we going to disallow merge commits and ask devs to rebase local
> changes in order to keep the history "clean"?

Is that going to be sane with our commit frequency?



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Kent Fredric
On 15 September 2014 02:40, Michał Górny  wrote:

> However, I'm wondering if it would be possible to restrict people from
> accidentally committing straight into github (e.g. merging pull
> requests there instead of to our main server).
>


Easy.

Put the Gentoo repo in its own group.
Don't give anyone any kinds of permissions on it.
Have only one approved account for the purpose of pushing commits.
Have a post-push hook that replicates to github as that approved account

=> Github is just a read only mirror, any pull reqs submitted there will be
fielded and pushed to gentoo directly.
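
On the server side, git's hook for "after a push was accepted" is called
post-receive. A minimal sketch of the replication step follows; the helper
name install_mirror_hook is hypothetical, and the remote (here called
"github") is assumed to be configured already in the server-side
repository with credentials for the one approved account.

```shell
#!/bin/sh
# install_mirror_hook REPO REMOTE
# Hypothetical helper: writes a post-receive hook into the bare repository
# REPO that mirrors every accepted push to the already-configured REMOTE.
install_mirror_hook() {
    repo=$1
    remote=$2
    mkdir -p "$repo/hooks"
    cat > "$repo/hooks/post-receive" <<EOF
#!/bin/sh
# Runs on the server after refs were updated; push all refs to the mirror.
exec git push --quiet --mirror "$remote"
EOF
    chmod +x "$repo/hooks/post-receive"
}
```

Because --mirror overwrites whatever is on the other side, anything
committed directly on the mirror would be discarded by the next push.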

The only downside is the way github pull reqs work: if the final SHA1s
that hit the tree don't match, the pull req doesn't close.

Solutions:

- A) Have somebody tasked with reaping old pull reqs with permissions
granted. ( Uck )
- B) Always use a merge of some kind to mark the pull req as dead ( for
instance, an "ours" merge to mark the branch as deprecated )

Both of those options are kinda ugly.
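
Option B could be sketched as follows. The helper name mark_pr_merged is
made up for this example, and the sketch assumes the pull request's
changes were already applied to the current branch under different SHA1s
(for instance after a rebase and re-sign). The "ours" strategy records the
PR head as a parent without changing the tree.

```shell
#!/bin/sh
# mark_pr_merged PR_REF
# Hypothetical helper: create a merge commit that has PR_REF as its second
# parent but keeps the current tree untouched ("ours" strategy). Use it only
# after the PR's changes have already been applied some other way.
mark_pr_merged() {
    git merge --quiet --no-ff -s ours \
        -m "Mark $1 as merged (changes already applied)" "$1"
}
```

After this, the PR's commits are reachable from the target branch. Under
the signing scheme discussed in this thread the merge would additionally
need -S.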



-- 
Kent

*KENTNL* - https://metacpan.org/author/KENTNL


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Michał Górny
On 2014-09-14, at 15:23:24,
Jauhien Piatlicki wrote:

> Another question: will it be possible to maintain a copy of tree on github to 
> make contributions for users simpler (similarly to e.g. science overlay)? 
> (Can it somehow be combined with proposed signing mechanism?)

Yes. I'm planning to have a mirror on github and bitbucket,
and auto-pushing to both.

However, I'm wondering if it would be possible to restrict people from
accidentally committing straight into github (e.g. merging pull
requests there instead of to our main server).

In fact, I would start my experiments straight on github if not for
the fact that they don't allow us to set our own update hooks.

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Michał Górny
On 2014-09-14, at 15:40:06,
Davide Pesavento wrote:

> On Sun, Sep 14, 2014 at 2:03 PM, Michał Górny  wrote:
> > We have main developer repo where developers work & commit and are
> > relatively happy. For every push into developer repo, automated magic
> > thingie merges stuff into user sync repo and updates the metadata cache
> > there.
> 
> How long does the md5-cache regeneration process take? Are you sure it
> will be able to keep up with the rate of pushes to the repo during
> "peak hours"? If not, maybe we could use a time-based thing similar to
> the current cvs->rsync synchronization.

This strongly depends on how much data there is to update. A few
ebuilds are quite fast, an eclass change isn't ;). I was thinking of
something along the lines of, in pseudo-code:

  systemctl restart cache-regen

That is, we start the regen on every update. If it finishes in time, it
commits the new metadata. If another update occurs during regen, we
just restart it to let it catch the new data.

Of course, if we can't spare the resources to do intermediate updates,
we may as well switch to cron-based update method.
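
Without systemd, the restart-on-update behaviour described above could be
approximated with a PID file. This is a rough sketch, not a tested
deployment: the helper name restart_regen is hypothetical, and the regen
command is passed in by the caller.

```shell
#!/bin/sh
# restart_regen PIDFILE CMD [ARGS...]
# Hypothetical helper: stop a regen that is still running, then start a
# fresh one in the background and remember its PID.
restart_regen() {
    pidfile=$1
    shift
    if [ -f "$pidfile" ]; then
        old=$(cat "$pidfile")
        # kill -0 sends no signal; it only tests whether the process exists.
        if kill -0 "$old" 2>/dev/null; then
            kill "$old"
            # Reaps the old job if it is our child; harmless otherwise.
            wait "$old" 2>/dev/null || true
        fi
    fi
    # Detach output so a calling hook is not kept waiting on the job.
    "$@" >/dev/null 2>&1 &
    echo $! > "$pidfile"
}
```

A hook might call it as `restart_regen /run/cache-regen.pid egencache
--update --repo=gentoo`, where both the PID file path and the exact
egencache invocation are assumptions. A service manager handles stale PID
files and process groups more robustly than this sketch does.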

> [...]
> > In any case, we would likely strip the history anyway to get a small
> > repo to work with.
> >
> > I have prepared a basic git update hook that keeps master clean
> > and attached it to the bug [1]. It enforces basic policies, prevents
> > forced updates and checks GPG signatures on left-most history line. It
> > can also be extended to do more extensive tree checks.
> 
> Are we going to disallow merge commits and ask devs to rebase local
> changes in order to keep the history "clean"?

I don't think we should cripple git. Just to be clear, 'accidental'
merges won't happen because the automatic merges are unsigned
and the 'update' hook will refuse them.

The developers will have to either rebase and re-sign the commits, or
use a signed merge commit, whichever makes more sense in the particular
context.

Signed merge commits will also allow merging user-submitted changes
while preserving original history.
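
To illustrate the kind of check involved (this is not the hook attached
to the bug, only a sketch of the same ideas), an 'update' hook receives
the ref name, the old tip and the new tip as arguments. The function names
below are hypothetical. The sketch accepts only commits for which git
reports a good signature (%G? prints G); a real hook would also have to
check that the signing key belongs to a developer.

```shell
#!/bin/sh
# is_zero ID: true when ID consists only of zeros (git's "no object" id).
is_zero() {
    case $1 in
        *[!0]*) return 1 ;;
        *) return 0 ;;
    esac
}

# check_update REFNAME OLDREV NEWREV
# Returns non-zero (and explains why on stderr) if the update is refused.
check_update() {
    ref=$1
    old=$2
    new=$3
    # Only master is policed in this sketch.
    [ "$ref" = refs/heads/master ] || return 0
    if is_zero "$new"; then
        echo "refusing to delete $ref" >&2
        return 1
    fi
    if is_zero "$old"; then
        range=$new
    else
        # A forced update is one where the old tip is not an ancestor
        # of the new tip.
        if ! git merge-base --is-ancestor "$old" "$new"; then
            echo "refusing non-fast-forward update of $ref" >&2
            return 1
        fi
        range="$old..$new"
    fi
    # Walk the left-most (first-parent) line only and collect every
    # commit whose signature status is not "G".
    unsigned=$(git log --first-parent --format='%H %G?' "$range" |
        awk '$2 != "G" { print $1 }')
    if [ -n "$unsigned" ]; then
        echo "refusing commits without a good signature:" >&2
        echo "$unsigned" >&2
        return 1
    fi
    return 0
}
```

Installed as hooks/update, the script would end with `check_update "$@"`.
An automatic, unsigned merge created by `git pull` sits on the first-parent
line and is therefore refused, while unsigned user commits behind a signed
merge are not inspected.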

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Michał Górny
On 2014-09-14, at 15:09:25,
Jauhien Piatlicki wrote:

> 14.09.14 14:03, Michał Górny wrote:
> > Hi,
> > 
> > I'm quite tired of promises and all that perfectionist non-sense which
> > locks us up with CVS for next 10 years of bikeshed. Therefore, I have
> > prepared a plan how to do git migration, and I believe it's doable in
> > less than 2 weeks (plus the testing). Of course, that assumes infra is
> > going to cooperate quickly or someone else is willing to provide the
> > infra for it.
> > 
> 
> as always, nice effort, but I foresee lots of bikeshedding in this thread. )

Yes. I'm planning to ignore most of the bikeshedding and take only serious
answers into consideration. Otherwise, we will be stuck with CVS.

> > This means we don't have to wait till someone figures out the perfect
> > way of converting the old CVS repository. You don't need that history
> > most of the time, and you can play with CVS to get it if you really do.
> > In any case, we would likely strip the history anyway to get a small
> > repo to work with.
> 
> Is it so difficult to convert CVS history?

It may be difficult to convert it properly, especially considering
the splitting of ebuild+Manifest commit. Then we need to somehow check
if it was converted properly. I don't even want to waste my time on
this. IMO the history doesn't have such a great value.

> > The rsync tree
> > --
> > 
> > We'd also propagate things to rsync. We'd have to populate it with old
> > ChangeLogs, new ChangeLog entries (autogenerated from git) and thick
> > Manifests. So users won't notice much of a change.
> > 
> 
> How will user check the ebuild integrity with thick manifests using rsync?

The same way he currently does :).

> > The remaining issue is signing of stuff. We could supposedly sign
> > Manifests but IMO it's a waste of resources considered how poor
> > the signing system is for non-git repos.
> 
> Again, how will user check the integrity and authenticity if Manifests are 
> unsigned?

As far as I'm concerned, user can use the user git tree to get proper
signatures or any other method that has proper signing support already.

If someone wants proper GPG support in rsync, he can work on that.

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Kent Fredric
On 15 September 2014 00:03, Michał Górny  wrote:

> This means we don't have to wait till someone figures out the perfect
> way of converting the old CVS repository. You don't need that history
> most of the time, and you can play with CVS to get it if you really do.
>

Once somebody works this out, you can also simply make it available as a
"replacement" ref.

See 'git replace'

This would mean, essentially, you could push a ref called
'refs/replace/oldcvs' of value "firstsha1 oldcvssha1" and anyone who wanted
it could manually fetch it, and anyone who did fetch it would get the
full history in all of its glory, and then git would transparently pretend
that history was always there anyway.

No rebasing required, and available on a need-to-know basis :)
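
A small self-contained demonstration of the idea, using throwaway commits
in a temporary directory. Branch names and commit messages are invented
for the demo. Note that git itself names replace refs after the object
being replaced (refs/replace/<sha1 of the replaced commit>), so the ref
name in the description above is best read as shorthand; `git replace
--graft` creates the correctly named ref.

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo"
git config user.email demo@example.org

# Stand-in for the converted CVS history: two commits on their own root.
git symbolic-ref HEAD refs/heads/oldcvs
git commit -q --allow-empty -m "old history 1"
git commit -q --allow-empty -m "old history 2"

# Stand-in for the fresh, history-free repository.
git checkout -q --orphan fresh
git commit -q --allow-empty -m "initial git commit"
git commit -q --allow-empty -m "new work"

# Replace the root of 'fresh' with a copy whose parent is the old history.
first=$(git rev-list --max-parents=0 fresh)
git replace --graft "$first" oldcvs

# Count commits reachable from 'fresh' with and without the replace ref.
with=$(git rev-list --count fresh)
without=$(git --no-replace-objects rev-list --count fresh)
echo "with replace ref: $with commits, without: $without commits"
```

With the replace ref in place the walk continues into the old history
(4 commits here); with replacement disabled only the 2 fresh commits are
seen. Replace refs are not fetched by default, so anyone who wants the
history would opt in, e.g. with
`git fetch origin 'refs/replace/*:refs/replace/*'`.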

-- 
Kent

*KENTNL* - https://metacpan.org/author/KENTNL


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Davide Pesavento
On Sun, Sep 14, 2014 at 3:55 PM, hasufell  wrote:
> Davide Pesavento:
>>> In any case, we would likely strip the history anyway to get a small
>>> repo to work with.
>>>
>>> I have prepared a basic git update hook that keeps master clean
>>> and attached it to the bug [1]. It enforces basic policies, prevents
>>> forced updates and checks GPG signatures on left-most history line. It
>>> can also be extended to do more extensive tree checks.
>>
>> Are we going to disallow merge commits and ask devs to rebase local
>> changes in order to keep the history "clean"?
>>
>
> I'd say it doesn't make sense to create merge commits for conflicts that
> arise by someone having pushed earlier than you.
>
> Merge commits should only be there if they give useful information.
>

I totally agree. But is there a way to automatically enforce this?

> Also... if you merge from a _user_ who is untrusted and allow a
> fast-forward merge, then the signature verification fails. That means
> for such pull requests you either have to use "git am" or "git merge
> --no-ff".
>

Right. In that case you can either sign the merge commit or amend the
user's commit and sign it yourself (re-signing could be needed anyway
if you have to rebase).

Thanks,
Davide



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread hasufell
Jauhien Piatlicki:
> 
> Or well, have our own pull requests review tool.
> 
> 

That is also only a secondary problem. Mirroring on github/bitbucket/whatever
should be a fairly straightforward way to allow user contributions.

In addition the usual git workflow via e-mail/ML would become more
popular (either via git style patches or plain pull request information
with branch/commit/repository).

So I'd suggest to focus on the git migration first.



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread hasufell
Davide Pesavento:
>> Main developer repo
>> ---
>>
>> I was able to create a start git repository that takes around 66M
>> as a git pack (this is how much you will have to fetch to start working
>> with it). The repository is stripped clean of history and ChangeLogs,
>> and has thin Manifests only.
>>
>> This means we don't have to wait till someone figures out the perfect
>> way of converting the old CVS repository. You don't need that history
>> most of the time, and you can play with CVS to get it if you really do.
> 
> +1
> 

+1

>> In any case, we would likely strip the history anyway to get a small
>> repo to work with.
>>
>> I have prepared a basic git update hook that keeps master clean
>> and attached it to the bug [1]. It enforces basic policies, prevents
>> forced updates and checks GPG signatures on left-most history line. It
>> can also be extended to do more extensive tree checks.
> 
> Are we going to disallow merge commits and ask devs to rebase local
> changes in order to keep the history "clean"?
> 

I'd say it doesn't make sense to create merge commits for conflicts that
arise by someone having pushed earlier than you.

Merge commits should only be there if they give useful information.

Also... if you merge from a _user_ who is untrusted and allow a
fast-forward merge, then the signature verification fails. That means
for such pull requests you either have to use "git am" or "git merge
--no-ff".
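
The difference shows up on the first-parent line, which is what the
signature check walks. The demo below uses a temporary repository with
invented branch names and no actual signing; it only prints which commit
ends up on the left-most line in each case.

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email demo@example.org

echo base > file
git add file
git commit -q -m "dev: base"

# A contribution from an untrusted user.
git checkout -q -b user-pr
echo change >> file
git commit -q -am "user: unsigned change"

# Case 1: fast-forward. The user's commit itself becomes the new tip.
git checkout -q -b ff-result master
git merge -q --ff-only user-pr
ff_line=$(git log --first-parent --format=%s master..ff-result)

# Case 2: --no-ff. A merge commit made by the developer becomes the tip.
git checkout -q -b noff-result master
git merge -q --no-ff -m "dev: merge user-pr" user-pr
noff_line=$(git log --first-parent --format=%s master..noff-result)

echo "fast-forward first-parent line: $ff_line"
echo "--no-ff first-parent line:      $noff_line"
```

In the fast-forward case the user's commit is on the checked line, so it
would have to carry a developer signature; in the --no-ff case only the
merge commit is, and that is the commit the developer would sign with -S.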



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread hasufell
Jauhien Piatlicki:
> 
> Again, how will user check the integrity and authenticity if Manifests are 
> unsigned?
> 

While this is an issue to be solved, it shouldn't be a blocker for the
git migration.

There is no regression if this isn't solved. There is no sane automated
method for verifying signed Manifests yet (that should be on PM level)
and signing them isn't even enforced throughout the tree. Moreover I
highly doubt that there is any user who runs around ebuild directories
and checks Manifest signatures by hand.

People who really care use emerge-webrsync.
If we use the proposed solution, then there is an additional method via
the User syncing repo, so it's a win.

We can put more effort into solving this for rsync mirrors later, but
I'd rather focus on the git migration.



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Davide Pesavento
On Sun, Sep 14, 2014 at 2:03 PM, Michał Górny  wrote:
> We have main developer repo where developers work & commit and are
> relatively happy. For every push into developer repo, automated magic
> thingie merges stuff into user sync repo and updates the metadata cache
> there.

How long does the md5-cache regeneration process take? Are you sure it
will be able to keep up with the rate of pushes to the repo during
"peak hours"? If not, maybe we could use a time-based thing similar to
the current cvs->rsync synchronization.

[...]
> Main developer repo
> ---
>
> I was able to create a start git repository that takes around 66M
> as a git pack (this is how much you will have to fetch to start working
> with it). The repository is stripped clean of history and ChangeLogs,
> and has thin Manifests only.
>
> This means we don't have to wait till someone figures out the perfect
> way of converting the old CVS repository. You don't need that history
> most of the time, and you can play with CVS to get it if you really do.

+1

> In any case, we would likely strip the history anyway to get a small
> repo to work with.
>
> I have prepared a basic git update hook that keeps master clean
> and attached it to the bug [1]. It enforces basic policies, prevents
> forced updates and checks GPG signatures on left-most history line. It
> can also be extended to do more extensive tree checks.

Are we going to disallow merge commits and ask devs to rebase local
changes in order to keep the history "clean"?

Thanks a lot,
Davide



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Jauhien Piatlicki
14.09.14 15:25, "C. Bergström" wrote:
> On 09/14/14 08:24 PM, Jauhien Piatlicki wrote:
>> 14.09.14 15:23, Jauhien Piatlicki wrote:
>>> Another question: will it be possible to maintain a copy of tree on github 
>>> to make contributions for users simpler (similarly to e.g. science 
>>> overlay)? (Can it somehow be combined with proposed signing mechanism?)
>>>
>>>
>> Or well, have our own pull requests review tool.
> NIH? What would be the benefit of that.. before going down this path.. I 
> think there's some good tools around which may at least serve as a base to 
> (fork) from before starting a ground up project.
> 
> Sorry to jump in the middle of the conversation, but I know 1st hand how much 
> is involved here.
> 

I was not precise. By our own I mean hosted by us, not by github. )






Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread C. Bergström

On 09/14/14 08:24 PM, Jauhien Piatlicki wrote:
> 14.09.14 15:23, Jauhien Piatlicki wrote:
>> Another question: will it be possible to maintain a copy of tree on github
>> to make contributions for users simpler (similarly to e.g. science
>> overlay)? (Can it somehow be combined with proposed signing mechanism?)
>>
> Or well, have our own pull requests review tool.

NIH? What would be the benefit of that? Before going down this path, I
think there are some good tools around which may at least serve as a base
to fork from before starting a ground-up project.

Sorry to jump in the middle of the conversation, but I know first hand how
much is involved here.




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Jauhien Piatlicki
14.09.14 15:23, Jauhien Piatlicki wrote:
> Another question: will it be possible to maintain a copy of tree on github to 
> make contributions for users simpler (similarly to e.g. science overlay)? 
> (Can it somehow be combined with proposed signing mechanism?)
> 
> 

Or well, have our own pull requests review tool.






Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Jauhien Piatlicki
Another question: will it be possible to maintain a copy of tree on github to 
make contributions for users simpler (similarly to e.g. science overlay)? (Can 
it somehow be combined with proposed signing mechanism?)






Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Jauhien Piatlicki
Hi,

14.09.14 14:03, Michał Górny wrote:
> Hi,
> 
> I'm quite tired of promises and all that perfectionist non-sense which
> locks us up with CVS for next 10 years of bikeshed. Therefore, I have
> prepared a plan how to do git migration, and I believe it's doable in
> less than 2 weeks (plus the testing). Of course, that assumes infra is
> going to cooperate quickly or someone else is willing to provide the
> infra for it.
> 

as always, nice effort, but I foresee lots of bikeshedding in this thread. )

> This means we don't have to wait till someone figures out the perfect
> way of converting the old CVS repository. You don't need that history
> most of the time, and you can play with CVS to get it if you really do.
> In any case, we would likely strip the history anyway to get a small
> repo to work with.
> 

Is it so difficult to convert CVS history?

> 
> The rsync tree
> --
> 
> We'd also propagate things to rsync. We'd have to populate it with old
> ChangeLogs, new ChangeLog entries (autogenerated from git) and thick
> Manifests. So users won't notice much of a change.
> 

How will user check the ebuild integrity with thick manifests using rsync?

> The remaining issue is signing of stuff. We could supposedly sign
> Manifests but IMO it's a waste of resources considered how poor
> the signing system is for non-git repos.
> 

Again, how will user check the integrity and authenticity if Manifests are 
unsigned?

Also, it would be a good idea to add automatic signature checking to portage 
for overlays that use signing (or is it already done?).

--
Jauhien



