Re: [freenet-dev] Github and code review

2015-04-03 Thread Matthew Toseland
On 01/04/15 21:56, Arne Babenhauserheide wrote:
 Am Dienstag, 31. März 2015, 22:29:24 schrieb Ian:
 On Mon, Mar 30, 2015 at 3:16 PM, Arne Babenhauserheide arne_...@web.de
 wrote:
 It’s work from paid contributors for which
 we need structures which reduce the cost of code-review compared to
 what you propose
 I haven't heard of any proposal other than my own that would reduce the
 cost of code-review.  What specifically has been proposed that would reduce
 the cost of code review, and why specifically would it reduce the cost of
 code-review?
 Requiring contributors to refactor large pull-requests into
 semantically meaningful commits.
Or into separate pull requests. Which is what Ian suggested.

Sometimes a change will be big enough that it needs to be treated as a
series of changes of manageable size.

However these changes will still be dependent on each other. For example
my work this summer should have been broken up into something like:

Add infrastructure
Add splitfile storage backend
Add splitfile storage tests
Integrate splitfile storage
Implement back compatibility
Delete remnants of old code
Improve APIs since we've had to change them anyway

(This is an oversimplification, of course...)

These could be separate pull requests, but they have to be treated as a
series, because after integrate splitfile storage, we lose backwards
compatibility.

Ideally we'd like to avoid such situations entirely but when dealing
with persistence, legacy code, and refactoring APIs, we can't always
achieve this, and sometimes it's not desirable (i.e. we want to get all
the API changes in simultaneously - they can be separate pull requests
but should go in for the same build).

The point is, the two approaches are functionally equivalent:
- A series of largish squashed commits.
- A series of pull requests.

IMHO the former is marginally preferable, because a series of pull
requests is a slightly awkward beast. But I don't think it matters much.

Pick one, continue. Committers (including the release manager) should
require one or the other. But that's not the real issue here.

The fundamental disagreement here is *whether we should have advance
code review at all*. Unfortunately this appears to be a case of
conflicting goals...
 This reduces the cost of code review for the reviewer - which is where
 we have a bottleneck.
 It is very clear that the current process is neither streamlined nor is it
 painless, so change is clearly necessary.
 The process is streamlined and painless, when both parties agree to
 make it easy. See the past few pull-requests for examples which show
 that it works well.
And paid staff should stick with the agreed process. But we have
agreement on this now from Xor, so that's fine.
 The code review of fcp-rewrite was exhausting, because (a) the
 pull-request was huge, and (b) the coder got defensive and long-winded
 instead of accepting the review where he agreed or giving short,
 friendly explanations where he thought that it was mistaken - and
 accept that code which does not pass the review cannot be merged.

 (a) will happen from time to time. By requiring refactoring of history
 for huge pull-requests and generally making it as easy as reasonable
 to review we can deal with it (though what’s reasonable might differ -
 in the end it’s the reviewers who have to decide that, because they
 are the bottleneck).

 (b) however is not about the tools. It is about the question what
 behavior we as community can expect from paid developers, and how to
 organize our community in a way which minimizes the friction from very
 different time budgets.

 Best wishes,
 Arne



signature.asc
Description: OpenPGP digital signature
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-04-03 Thread Steve Dougherty
On 04/03/2015 02:19 PM, Matthew Toseland wrote:
 On 01/04/15 21:56, Arne Babenhauserheide wrote:
 Am Dienstag, 31. März 2015, 22:29:24 schrieb Ian:
 On Mon, Mar 30, 2015 at 3:16 PM, Arne Babenhauserheide arne_...@web.de
 wrote:
 It’s work from paid contributors for which
 we need structures which reduce the cost of code-review compared to
 what you propose
 I haven't heard of any proposal other than my own that would reduce the
 cost of code-review.  What specifically has been proposed that would reduce
 the cost of code review, and why specifically would it reduce the cost of
 code-review?
 Requiring contributors to refactor large pull-requests into
 semantically meaningful commits.
 Or into separate pull requests. Which is what Ian suggested.
 
 Sometimes a change will be big enough that it needs to be treated as a
 series of changes of manageable size.
 
 However these changes will still be dependent on each other. For example
 my work this summer should have been broken up into something like:
 
 Add infrastructure
 Add splitfile storage backend
 Add splitfile storage tests
 Integrate splitfile storage
 Implement back compatibility
 Delete remnants of old code
 Improve APIs since we've had to change them anyway
 
 (This is an oversimplification, of course...)
 
 These could be separate pull requests, but they have to be treated as a
 series, because after integrate splitfile storage, we lose backwards
 compatibility.
 
 Ideally we'd like to avoid such situations entirely but when dealing
 with persistence, legacy code, and refactoring APIs, we can't always
 achieve this, and sometimes it's not desirable (i.e. we want to get all
 the API changes in simultaneously - they can be separate pull requests
 but should go in for the same build).
 
 The point is, the two approaches are functionally equivalent:
 - A series of largish squashed commits.
 - A series of pull requests.
 
 IMHO the former is marginally preferable, because a series of pull
 requests is a slightly awkward beast. But I don't think it matters much.
 
 Pick one, continue. Committers (including the release manager) should
 require one or the other.

We all seem to agree that small pull requests that can be reviewed in a
few minutes is the ideal. I propose the contributing guidelines include
something like this:

 ## Pull request scope

 A pull request is easiest to review when it is small enough take only
 a few minutes. If your pull request is much larger than this, we
 might ask that you split it over a series of pull requests.

I don't see why we need to require exactly one technique. What about
keeping the part about squashing with an intro like this:

 If you'd rather keep a single pull request, please squash into
 high-level commits:

If there are objections to including this suggestion even at this level
I will drop it so that we can proceed.

 But that's not the real issue here.
 
 The fundamental disagreement here is *whether we should have advance
 code review at all*. Unfortunately this appears to be a case of
 conflicting goals...

I didn't take that as serious proposals that no one review code at all,
rather an example of how we really do need to figure out a way to review
effectively.



signature.asc
Description: OpenPGP digital signature
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-04-02 Thread xor
DISCLAIMER:
As the one who offended the fred review process, I *do not* participate in this 
discussion to manipulate your decision. It is up to you folks to decide what 
the review standard is.
I will comply to all rules for pull requests :)
If you say I shall squash, I will squash.


Instead, the information below is provided as a service so you have more input 
to use to come to a conclusion, as I feel that some aspects of squashing have 
not been mentioned by anyone else yet.
After I've explained the unmentioned disadvantages of squashing, at text 
position [1] I will also provide an alternative to squashing which you might 
want to discuss.

(Another reason for this mail is: I have the personal desire to excuse myself 
for causing the huge discussion. I'm sorry, seriously. So to have a chance to 
be forgiven, I want to a least explain why I objected so strongly to 
squashing.)

On Sunday, March 29, 2015 02:29:56 PM Bert Massop wrote:
 In short: I'm in favor of squashing when done in a sensible and
 information-preserving way.
 
[...]
 
 This is an interesting observation. There appears to be a certain
 hierarchy to which history can be considered useful for a particular
 purpose. IMHO rewriting / losing history is only a bad thing if the
 history actually serves a purpose: this depends on who will be using the
 log in the future, and for what purpose.
 
 I really don't care about anyone's Oops, fix XYZ or Also fix XYZ at
 ABC commits (and nor should any other developer or reviewer) as long as
 the relevant code has not yet surfaced. Yes, this provides information
 as to how the code was originally created, but it serves no purpose in
 understanding, reviewing or maintaining the code.
 
 I consider this kind of commits *noise* if they end up in the master
 tree (or in the initial commit set of a pull request), though they may
 be useful in my own branch during development (e.g. I can revert more
 granularly, read my work log, etc.). (FYI: I consider myself a noisy
 committer, too.) A pull request may require such commits to be added to
 the publicly known code, at that point they do add information — up to
 the point where the pull request is merged, from then on they no longer
 serve a purpose.

Unfortunately, this is short-sighted.

Consider a branch of a game is developed, it is called add-vehicle-system. 
It contains the initial new (poor) pseudo code:

  class DrumBrake extends Brake { brake() { speed -= 10; } }
  class Bike extends Vehicle { DrumBrake b; void brake() {  b.brake(); }}
  class Hummer extends Car { DrumBrake b; void brake() {  b.brake(); }}
  class Porsche extends Car { DrumBrake b; void brake() {  b.brake(); }}

Now the code isn't merged yet, and some engineer comes along and notices that 
all the vehicles have copy-pasted and by-design poor drum brakes, so he adds 
commits:
  Add disc brake
  Fix Hummer to use disc brake

The code is now:
  class DrumBrake  extends Brake  { brake() { speed -= 10; } }
  class DiscBrake  extends Brake  { brake() { speed -= 100; } }
  class Bike  extends Vehicle { DrumBrake b; void brake() {  b.brake(); }}
  class Hummer extends Car { DiscBrake b; void brake() {  b.brake(); }}
  class Porsche extends Car { DrumBrake b; void brake() {  b.brake(); }}

Unfortunately, the fastest vehicle - the Porsche - was forgotten about and 
still has the poor drum brakes.

This isn't noticed yet though, the whole add-vehicle-system branch is done 
and thus is squashed and merged into a single commit:
Add vehicle implementation

Then someday, someone will notice that the Porsche brakes poorly, and look 
into the history of the code to figure out why this is.
Unfortunately, since all Fix ... to use disc brake commits are gone, he will 
NOT notice that a Fix Porsche to use disc brake commit was lacking.
Thus, he will think: Well, this must have been intentional. I've also heard 
that in the 80s even some sports cars still had drum brakes. Guess its OK. 
and leave the Porsche with the poor brakes as is.


Now you might say that this is an academically constructed example. Yea well I 
cannot remember a practical one yet, sorry :)
But if you don't accept this, then what *is* the point of preserving *any* Git 
history anyway? The users only want the most recent version, just as you said. 
So we might as well delete *all* history.

But we don't, because the purpose of history is specifically to have *old* 
versions available so people can investigate the historical evolution of code 
to help with debugging.
So by demanding to delete old versions by squashing, you demand to invalidate 
the sole purpose of the history itself :)



Now as said, nevertheless, I don't want to cast a vote here.
I do understand that people are overwhelmed by the amount of code I have 
thrown at them, and I *will* squash pull requests when requested to.
Maybe it will work out just fine. I have only used squashing once, maybe I 
actually like it if I get more used to it. There are for sure change 

Re: [freenet-dev] Github and code review

2015-04-01 Thread Arne Babenhauserheide
Am Dienstag, 31. März 2015, 22:29:24 schrieb Ian:
 On Mon, Mar 30, 2015 at 3:16 PM, Arne Babenhauserheide arne_...@web.de
 wrote:
  It’s work from paid contributors for which
  we need structures which reduce the cost of code-review compared to
  what you propose
 I haven't heard of any proposal other than my own that would reduce the
 cost of code-review.  What specifically has been proposed that would reduce
 the cost of code review, and why specifically would it reduce the cost of
 code-review?

Requiring contributors to refactor large pull-requests into
semantically meaningful commits.

This reduces the cost of code review for the reviewer - which is where
we have a bottleneck.

 It is very clear that the current process is neither streamlined nor is it
 painless, so change is clearly necessary.

The process is streamlined and painless, when both parties agree to
make it easy. See the past few pull-requests for examples which show
that it works well.

The code review of fcp-rewrite was exhausting, because (a) the
pull-request was huge, and (b) the coder got defensive and long-winded
instead of accepting the review where he agreed or giving short,
friendly explanations where he thought that it was mistaken - and
accept that code which does not pass the review cannot be merged.

(a) will happen from time to time. By requiring refactoring of history
for huge pull-requests and generally making it as easy as reasonable
to review we can deal with it (though what’s reasonable might differ -
in the end it’s the reviewers who have to decide that, because they
are the bottleneck).

(b) however is not about the tools. It is about the question what
behavior we as community can expect from paid developers, and how to
organize our community in a way which minimizes the friction from very
different time budgets.

Best wishes,
Arne

signature.asc
Description: This is a digitally signed message part.
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-31 Thread Ian
On Mon, Mar 30, 2015 at 3:16 PM, Arne Babenhauserheide arne_...@web.de
wrote:

 It’s work from paid contributors for which
 we need structures which reduce the cost of code-review compared to
 what you propose


I haven't heard of any proposal other than my own that would reduce the
cost of code-review.  What specifically has been proposed that would reduce
the cost of code review, and why specifically would it reduce the cost of
code-review?


 - and not reviewing code isn’t an option anymore


My hope is that if the process of code review can be made less painful,
which is the intent of my proposal, then we will have sufficient resources
to do comprehensive code reviews.

However, if we do not have sufficient resources to do comprehensive code
reviews then our options are either to make do without comprehensive code
reviews, or to suspend development of Freenet.


 since toad left we managed to decentralize the project structures to
 some degree, and allowing unreviewed pushes would void the additional
 safety against compromise and dependendcy on individuals which that
 decentralization offers. We need to get to a point where we can cope
 with having anyone leave the group without notice, and that means that
 every change must be known by at least two people.


I agree, that is a desirable situation.  The best way to achieve that is to
ensure that the code-review process is as streamlined and painless as
possible, which is why I've proposed a process that I know to be
streamlined and painless because I've used it myself (as have many others).

It is very clear that the current process is neither streamlined nor is it
painless, so change is clearly necessary.

Ian.
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-30 Thread Arne Babenhauserheide
Am Sonntag, 29. März 2015, 13:43:42 schrieb Ian:
 Nothing you are advocating will change anything if we simply don't have
 enough people willing to do code reviews.  At least with my proposal it
 will be a much less painful process than it appears to be today.

Florent put that pretty well: There’s no problem with reviewing work
from unpaid contributors. It’s work from paid contributors for which
we need structures which reduce the cost of code-review compared to
what you propose - and not reviewing code isn’t an option anymore:
since toad left we managed to decentralize the project structures to
some degree, and allowing unreviewed pushes would void the additional
safety against compromise and dependendcy on individuals which that
decentralization offers. We need to get to a point where we can cope
with having anyone leave the group without notice, and that means that
every change must be known by at least two people.

Best wishes,
Arne
--
Ich hab' nichts zu verbergen – hab ich gedacht: 

- http://draketo.de/licht/lieder/ich-hab-nichts-zu-verbergen



signature.asc
Description: This is a digitally signed message part.
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-29 Thread Ian
On Sat, Mar 28, 2015 at 7:18 PM, Arne Babenhauserheide arne_...@web.de
wrote:

 Am Samstag, 28. März 2015, 11:32:30 schrieb Ian:
   I agree with Bombe that it’s not nice to lose the history, but with
   git that’s the best we can do. It’s a limitation of the tool.
 
  It's not a limitation of the tool, it's a limitation created by your
 desire
  to misuse the tool.

 We have a need, we use a tool. To fulfil the need with the tool we
 have to misuse the tool. Consequently the tool is too limited for our
 usecase.

 There are several hacks around that - including merge commits with
 explanations and only ever looking at merge commits (hiding all
 non-merges) - and that these hacks are used by others shows that we
 aren’t the only ones with these needs, but git offers no clean
 solution.

 Sadly there also isn’t a clean way forward: Git users tend to be
 pretty defensive of their tool, and as such I expect that moving to a
 different tool would hurt more than it would help. So we’re stuck with
 the hacks.


There is a clear way forward, we can use Github for code review the way
that it was clearly intended to be used, and the way 95% of competent
development teams use it.  That is to review pull requests, not commits.


  It certainly works very well for my team (with a larger codebase
  than Freenet).  We've never had any of the problems you guys seem to
  be so concerned about.

 Do you run a free software community with vastly differing amount of
 worktime per week?


No, but I see nothing about Freenet that would make this work any less well.



 We cannot expect people to be able to review every 4 days.


Yes we can, that is how code review is supposed to happen.  If people can't
be bothered to review code then that is unfortunate, but we can't allow
development to grind to a halt because of it.

The complaints I've seen are not that reviews are too frequent, but rather
that when they do occur people are required to review massive changes.  A
code review should only require a few minutes.


 This is a
 reality we cannot change without having more paid
 developers. Different from a company, we have to structure our
 workflow in a way which keeps review from becoming a bottleneck while
 ensuring that all code is reviewed.


Nothing you are advocating will change anything if we simply don't have
enough people willing to do code reviews.  At least with my proposal it
will be a much less painful process than it appears to be today.

Ian.
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-28 Thread Arne Babenhauserheide
Am Samstag, 28. März 2015, 11:32:30 schrieb Ian:
  I agree with Bombe that it’s not nice to lose the history, but with
  git that’s the best we can do. It’s a limitation of the tool.
 
 It's not a limitation of the tool, it's a limitation created by your desire
 to misuse the tool.

We have a need, we use a tool. To fulfil the need with the tool we
have to misuse the tool. Consequently the tool is too limited for our
usecase.

There are several hacks around that - including merge commits with
explanations and only ever looking at merge commits (hiding all
non-merges) - and that these hacks are used by others shows that we
aren’t the only ones with these needs, but git offers no clean
solution.

Sadly there also isn’t a clean way forward: Git users tend to be
pretty defensive of their tool, and as such I expect that moving to a
different tool would hurt more than it would help. So we’re stuck with
the hacks.

 It certainly works very well for my team (with a larger codebase
 than Freenet).  We've never had any of the problems you guys seem to
 be so concerned about.

Do you run a free software community with vastly differing amount of
worktime per week?

We cannot expect people to be able to review every 4 days. This is a
reality we cannot change without having more paid
developers. Different from a company, we have to structure our
workflow in a way which keeps review from becoming a bottleneck while
ensuring that all code is reviewed.

Best wishes,
Arne
--
Celebrate with ye beauty and gather yer friends for a Pirate Party!
→ http://1w6.org/english/flyerbook-rules#pirate-party ←



signature.asc
Description: This is a digitally signed message part.
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-28 Thread Ian
On Sat, Mar 28, 2015 at 11:13 AM, Arne Babenhauserheide arne_...@web.de
wrote:

 I agree with Bombe that it’s not nice to lose the history, but with
 git that’s the best we can do. It’s a limitation of the tool.


It's not a limitation of the tool, it's a limitation created by your desire
to misuse the tool.

Individual commits should not be reviewed, pull requests should.  I'm
really surprised that people are so resistant to this, this is widely
accepted practice.  It certainly works very well for my team (with a larger
codebase than Freenet).  We've never had any of the problems you guys seem
to be so concerned about.

If the pull request is too big to be reviewed, then it should have been
broken into small pull requests.  Typically a pull request should represent *at
most* 4 days worth of work.

Ian.
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-28 Thread Arne Babenhauserheide
Am Samstag, 28. März 2015, 02:57:48 schrieb Steve Dougherty:
 # Authors squash commits into high-level changes making up final
 version; review pull request as commits.

My stance is:

Keep the commits if they are clean enough to be reviewed, refactor
them mercilessly if that’s necessary to avoid making review a
bottleneck.

If a new contributor provides 5 commits where three would suffice, but
it’s easy to merge (and asking to fix it would take more time than the
reviewing takes), just merge them. If it’s a complex change and
reviewing it is hard, then require the author of the pull request to
make it easy to review by using the commits as second hierarchy.

I agree with Bombe that it’s not nice to lose the history, but with
git that’s the best we can do. It’s a limitation of the tool.

But even though I would love to see Freenet use Mercurial by default
(in which this problem can be resolved with the evolve extension¹), I
think that it would be hard to change. So we’re stuck with that
limitation and have to employ it as efficiently as we can - even if
that means compromising the clean history and unchanging hisotry
ideology.

Best wishes,
Arne

¹: The evolve extension of Mercurial has the concepts of hidden
   changesets with rewrite-markers, so the history which is shown by
   default is clean, but you can always inquire how it came into
   being.

signature.asc
Description: This is a digitally signed message part.
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-28 Thread Steve Dougherty
On 03/27/2015 08:05 PM, Ian wrote:
 On Wed, Mar 25, 2015 at 12:55 PM, David Roden bo...@pterodactylus.net
 wrote:
 On 25.03.2015, at 06:14, Steve Dougherty st...@asksteved.com wrote:

 My inclination is to merge the unit of review. If the diff can be easily
 described in a single commit summary, the pull request is squashed and
 merged as a single commit.

 I really wish people would stop doing this. Commits are atomic units of
 work, building on one another (unless the person creating them is rather
 unorganized or is mixing several things in which case a pull request should
 probably also not be merged until those situations are cleaned up).
 Squashing commits destroys valuable information for no practical reason.

 
 I completely agree.  It serves no useful purpose, and is entirely contrary
 to widely established best-practices for using Github.

The intent of squashing commits is not to destroy useful history. It's
to make history easier to read and allow reviewing changes incrementally
by presenting them in high-level commits which reflect the final state
of the code which passes review.

To be more specific, this is about turning something with changes made
during development / code review like

Clarify test reasoning
Add more tests for new feature
Fix NPE in new feature
Fix inverted branch condition in new feature
Add tests for new feature
Refactor support classes and new feature
Add exciting new feature
Add support needed by new feature

into

Add tests for new feature
Add exciting new feature
Add support needed by new feature

not into

Add exciting new feature

It is my opinion that it is not worth maintaining in repo history a
distinction between an initial implementation and changes made during
code review or refinement. Of course only feature / bugfix branches are
subject to squashing - next/master/release branches are not. git
supports this workflow - git commit has --fixup and --squash. We use
this at my job - a reviewer who is up to date can see the incremental
part in response to review, and a reviewer coming up to date can rebase
autosquash before reading.

We seem to have arrived at two approaches to this. I'm willing to go
with either at this point, and I'd appreciate more perspectives. We'll
update CONTRIBUTING.md with what we agree on.

# Review pull request as a diff; merge without squashing.

David and Ian favor this.

Benefits:

* Common use of GitHub pull requests.
* Does not require the effort / complication of squashing commits.

Disadvantages:

* Keeps changes made toward the final version of the pull request in
  the repo history - harder to review later.
* Difficult to review incrementally. This prevents working with very
  large changes, not that we want them anyway. It displays changes in
  multiple layers simultaneously.
* Hard to review commit messages.

# Authors squash commits into high-level changes making up final
version; review pull request as commits.

Arne and I favor this.

Benefits:

* Easy to review in repo history - high-level context is with each
  commit; avoids storing incremental changes toward final version in
  the stable repo history.
* Allows reviewing changes incrementally - commits can be ordered and
  given messages to aid understanding; diff ordering is not
  customizable.

Disadvantages:

* Atypical use of GitHub pull requests - higher bar of entry for larger
  changes.
* More effort / complication for authors.

Is this an accurate summary?

I'm not clear on the opinions of Florent, Matthew, Roland, (is Bert
around?) or others on these. Thoughts?

Thanks,
Steve



signature.asc
Description: OpenPGP digital signature
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-24 Thread Steve Dougherty
On 03/22/2015 06:21 PM, Roland Haeder wrote:
 On Sun, 22 Mar 2015 21:40:41 +
 Matthew Toseland mj...@cam.ac.uk wrote:
 
 On 22/03/15 19:22, Ian Clarke wrote:
 Unfortunately, sometimes it is necessary to make big changes.
 
 One good example is the initial import of all files for a new project.
 There you commit a huge chunk of files and directories.
 
 As I'm (different project) a newcomer to a project's development, I
 had started cloning the original repository (creating a personal
 clone) and started developing. I did this by committing carefully and
 I tried to avoid huge changes (not always possible, more below) to let
 them easily review them.
 
 Then I stepped towards them and my commits got rejected because I
 didn't make a feature branch or fix branch ... :-( That is kind of
 frustrating because my commits won't make it in.

Oh no! :( That shouldn't have happened - that does not strike me as
reason alone to reject a pull request. Do you have a link to it?

 What I mean here is that the maintainer should not be so bureaucratic
 to newcomers as this could turn them away and they may never come back.

Indeed. I've been making an effort to be more flexible like
https://emu.freenetproject.org/pipermail/devl/2015-February/037983.html



signature.asc
Description: OpenPGP digital signature
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-23 Thread Florent Daigniere
On Sun, 2015-03-22 at 14:11 -0500, Ian wrote:
 On Sun, Mar 22, 2015 at 1:36 AM, Florent Daigniere 
 nextg...@freenetproject.org wrote:
 
  On Sun, 2015-03-22 at 00:42 +, Matthew Toseland wrote:
   On 21/03/15 20:49, Ian Clarke wrote:
On Sat, Mar 21, 2015 at 1:58 PM, Matthew Toseland 
  matt...@toselandcs.co.uk
wrote:
Most of the above boils down to review the diff, not the history.
  That
is probably sensible.
That's part of it, but also that a branch should be created for each
bugfix/feature, which ideally should be as small a unit of work as
  possible
(that can be merged without breaking stuff).
 
  It only is if the diff is of reasonable size... which it is most of the
  time, *except* when it comes from paid devs.
 
 
 Then that needs to change.  With a feature/bugfix-per-branch approach the
 diffs should be reviewable in just a few minutes.  If they aren't, then the
 coder is probably stuffing multiple features/bugfixes into a single branch
 and that's contrary to the guidelines I've outlined.  Whatever happened in
 the past, what matters is what we do going forward.
 

Agreed.

 They code in their cave for months, drop a 100kloc diff doing way more
  than a single feature/bugfix onto the maintainer and expect it to be
  merged; I'm sure that refactoring is good but not when it's forked off
  for 6month... This is the problem we had recently with both toad's and
  xor's code. We're talking about dozens of features and bugfixes AND
  refactoring.
 
 
 Then that needs to change going forward.
 

Agreed.

  Whether there's a single change per commit/branch/pull request doesn't
  matter as long as one of them is enforced.
 
 
 The way Github's code review tools are designed, you definitely want it to
 be per branch (which will have a 1-1 relationship with a pull request), not
 per commit.
 

No disagreement here either. I'd argue that for us git and github are
the wrong tools but I completely agree with your analysis of how we're
mis-using them :)

  Until now the base unit was
  supposed to be one change per commit; I'm all for changing it to
  something else but that won't solve the problem unless it's enforced
  strictly.
 
 
 You can't impose processes on people, they need to agree to them or it
 won't work.  That being said, I don't know why any reasonable person
 wouldn't agree to what I've outlined.  It's a tried and tested approach.
 

OSS projects do; when maintainers don't like the code they just don't
merge it (and that might leads to forks and that's perfectly fine). This
is what's happening now and part of why we're in limbo.

What's sad is that most of the problems are from the code *paid* devs
are producing. Some might argue that it's because they're producing more
than volunteers but I don't think so. 

I believe that their evaluation / incentives model needs to change for
their behaviour to adapt. Maybe it's time to reconsider bounties (pay
per feature/bugfix).

Florent


signature.asc
Description: This is a digitally signed message part
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-23 Thread Ian Clarke
On Sun, Mar 22, 2015 at 4:39 PM, Ian Clarke i...@freenetproject.org wrote:

 That is precisely how every mature project works. Just because neither
 your business ventures nor your involvement in open source are mature
 doesn't mean there aren't projects out there - both open source (Linux,
 Wine) and businesses (Steve's employer, presumably Google, etc) - which
 care about code quality.


 Wow.  Why would you say something so stupid in a forum where future
 employers could potentially see it?

 You forget that I've seen your code, it wouldn't come anywhere close to
 passing a code review by even the least experienced developer on my
 company's engineering team, so please don't presume to lecture me about
 code quality, or how mature software projects work.

 I really hope they teach you how to be less of an asshole during your
 computer science course, or you'll have a very hard time getting or keeping
 a job, either commercial or in academia.  Your previous paragraph would be
 a firing offense at most companies.  Professional engineering teams simply
 don't tolerate asshole behavior any more.


Ugh, this interaction was exactly the type of thing we need to avoid.
Attacking each-other's competence serves no useful purpose whatsoever.

Ian.

-- 
Founder, The Freenet Project
Email: i...@freenetproject.org
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-23 Thread Ian Clarke
On Mon, Mar 23, 2015 at 5:54 AM, Florent Daigniere 
nextg...@freenetproject.org wrote:

 On Sun, 2015-03-22 at 14:11 -0500, Ian wrote:
  The way Github's code review tools are designed, you definitely want it
 to
  be per branch (which will have a 1-1 relationship with a pull request),
 not
  per commit.

 No disagreement here either. I'd argue that for us git and github are
 the wrong tools but I completely agree with your analysis of how we're
 mis-using them :)


I'm not sure, I haven't had experience with any other code review tool, but
they work very well for us at my day-job, the entire team (9 experienced
software engineers) seems happy with them.

 You can't impose processes on people, they need to agree to them or it
  won't work.  That being said, I don't know why any reasonable person
  wouldn't agree to what I've outlined.  It's a tried and tested approach.

 OSS projects do; when maintainers don't like the code they just don't
 merge it (and that might leads to forks and that's perfectly fine). This
 is what's happening now and part of why we're in limbo.


In the early years of the project Freenet had a fairly liberal attitude to
contributions, trust but verify.  I did my best to minimize red-tape for
developers.  This was beneficial because it meant there was a low barrier
to entry for volunteers.  If we didn't have this approach, I doubt many of
the earliest contributors, people like Oskar Sandberg and Scott Miller,
would have become involved (both started with very minor contributions).

And as I mentioned previously, I just don't think we have the manpower for
formalized gatekeepers (ie. we're not the Linux Kernel), so it's simply not
an option to be that rigid.


 What's sad is that most of the problems are from the code *paid* devs
 are producing. Some might argue that it's because they're producing more
 than volunteers but I don't think so.


I think we need to avoid personal criticisms of people (yes, I know that I
attacked Toad earlier in this thread, but I was provoked - I should have
taken the high road, I'd had a few beers at that point).


 I believe that their evaluation / incentives model needs to change for
 their behaviour to adapt. Maybe it's time to reconsider bounties (pay
 per feature/bugfix).


I'm not sure about that.  Monetary motivation works well for salespeople, I
don't think it works well for engineers.  It would also be a significant
amount of work to administer, and I just don't think we have the manpower
to do that.  It could also result in a bias towards work on outwardly
visible bells and whistles, and against internal stuff like refactoring.

Ian.

-- 

*Ian Clarke* / Co-Founder  CTO

*OneSpot, Inc*

Email: i...@onespot.com
Web: http://www.onespot.com
Personal Blog: http://blog.locut.us/
LinkedIn: http://www.linkedin.com/in/iancjclarke
Twitter: http://twitter.com/sanity
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-22 Thread Matthew Toseland
On 22/03/15 21:39, Ian Clarke wrote:
 On Sun, Mar 22, 2015 at 4:20 PM, Matthew Toseland mj...@cam.ac.uk wrote:
 But regardless, nothing in the process I've outlined would inhibit
 correcting problems like this.  Ideally they'd be corrected at the code
 review stage, but if they make it past that then they can be corrected
 with
 a new branch, just like any other bugfix.
 No, they can't.

 Removing binaries (or worse, copyright-infringing files) requires
 rewriting the history.
 Why would removing a binary require rewriting the history, 
Because we don't want those files in the repository at all. We don't
want everyone to have to download jars which are many years out of date
and aren't used anyway when they clone the repository.
 and why would
 anyone commit a copyright-infringing file?
Can happen with installers. Can happen with people taking short-cuts,
although that's less likely as Freenet is fairly specialised. Can happen
as a deliberate attempt to disrupt. Can even happen by accident.
 In my years of using Github I have never ever been in a situation that
 necessitated rewriting history.

 The possibility that someone might do something really dumb and easily
 avoidable should not dictate our code review policy.
Everyone does this when they first use Git. For example, most of the
members of my group project this year included multiple copies of huge
third party javascript libraries, and even jars. Some of our
contributors will be on that level - they won't initially know any
better. This is simply because of lack of experience. If we have
volunteers, many of them will have limited experience. This may also be
true of paid developers, and certainly of most GSoC interns.
 2. Disruptive changes to APIs.
 This has also happened. Especially if the description is incomplete,
 there is a significant risk of refactoring breaking other code (e.g.
 plugins; this was part of the problem with Xor's changes), introducing
 new APIs that don't make sense etc.
 Both code review and a decent continuous integration system should address
 this.  This is a good continuous integration service that is free for OSS
 projects: https://travis-ci.org/  You can see how I use it on this other
 OSS project of mine: http://quickml.org/
 Not all problems can be detected by automated tools.

 Sometimes it is necessary to change classes that are used by plugins,
 and some of those plugins are unofficial, i.e. third party.
 Yes, just like any other software project.  What would make us any
 different, and why would that necessitate deviating from the process I've
 proposed (which is pretty-much standard practice among well-run dev teams
 already)?  
Your point was that continuous integration can detect all problems, so
advance code review is unnecessary. This is not true. CI is worth a lot,
but it doesn't solve all the problems being addressed here.

And it is certainly not true that all well-run development teams accept
unreviewed code! Wine doesn't, Linux doesn't, Steve's company doesn't,
and I'd be pretty amazed if Google did.

As for the rest, I will not be drawn into personal attacks. I care about
Freenet and would like it to succeed, regardless of whether I am paid to
work for it. Some of its goals may be naive in the modern world, but
there is much potential here. Sadly I don't have time to volunteer at
the moment; I'm only involved now because a major crisis has erupted and
I had hoped I could contribute something.



signature.asc
Description: OpenPGP digital signature
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-22 Thread Ian Clarke
On Sun, Mar 22, 2015 at 4:20 PM, Matthew Toseland mj...@cam.ac.uk wrote:

  But regardless, nothing in the process I've outlined would inhibit
  correcting problems like this.  Ideally they'd be corrected at the code
  review stage, but if they make it past that then they can be corrected
 with
  a new branch, just like any other bugfix.
 No, they can't.

 Removing binaries (or worse, copyright-infringing files) requires
 rewriting the history.


Why would removing a binary require rewriting the history, and why would
anyone commit a copyright-infringing file?

In my years of using Github I have never ever been in a situation that
necessitated rewriting history.

The possibility that someone might do something really dumb and easily
avoidable should not dictate our code review policy.

 2. Disruptive changes to APIs.
  This has also happened. Especially if the description is incomplete,
  there is a significant risk of refactoring breaking other code (e.g.
  plugins; this was part of the problem with Xor's changes), introducing
  new APIs that don't make sense etc.
  Both code review and a decent continuous integration system should
 address
  this.  This is a good continuous integration service that is free for OSS
  projects: https://travis-ci.org/  You can see how I use it on this other
  OSS project of mine: http://quickml.org/
 Not all problems can be detected by automated tools.

 Sometimes it is necessary to change classes that are used by plugins,
 and some of those plugins are unofficial, i.e. third party.


Yes, just like any other software project.  What would make us any
different, and why would that necessitate deviating from the process I've
proposed (which is pretty-much standard practice among well-run dev teams
already)?  I'm not sure I understand your point.


  That seems very unlikely to happen, we barely have budget for actual
  developers, let alone dedicated QA people.  And really, who'd want that
 job
  anyway?
 
  Even setting aside budget and finding suitable people, coders should have
  primary responsibility for ensuring that their code is good.  It's
  unhealthy to have a situation where coders think that ensuring their code
  is clean and robust is someone else's problem.
 That is precisely how every mature project works. Just because neither
 your business ventures nor your involvement in open source are mature
 doesn't mean there aren't projects out there - both open source (Linux,
 Wine) and businesses (Steve's employer, presumably Google, etc) - which
 care about code quality.


Wow.  Why would you say something so stupid in a forum where future
employers could potentially see it?

You forget that I've seen your code, it wouldn't come anywhere close to
passing a code review by even the least experienced developer on my
company's engineering team, so please don't presume to lecture me about
code quality, or how mature software projects work.

I really hope they teach you how to be less of an asshole during your
computer science course, or you'll have a very hard time getting or keeping
a job, either commercial or in academia.  Your previous paragraph would be
a firing offense at most companies.  Professional engineering teams simply
don't tolerate asshole behavior any more.

Ian.


-- 
Ian Clarke
Founder, The Freenet Project
Email: i...@freenetproject.org
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-22 Thread Ian
On Sat, Mar 21, 2015 at 7:42 PM, Matthew Toseland matt...@toselandcs.co.uk
wrote:

 On 21/03/15 20:49, Ian Clarke wrote:
  That's part of it, but also that a branch should be created for each
  bugfix/feature, which ideally should be as small a unit of work as
 possible
  (that can be merged without breaking stuff).
 Absolutely. See e.g.:
 http://nvie.com/posts/a-successful-git-branching-model/

 We try to stick to this ... or at least, that was the plan/consensus!


Hmm, it doesn't sound like we do in practice, given that I believe there
were pull requests with hundreds of commits which sounded like it was for
much more than a single feature or bugfix.  Anyway, the point isn't to
complain about what happened in the past, just to agree on our process
going forward.


 The problem is then we spend a lot of time undoing stuff. This can and
 does happen, and has happened, even relatively recently. It's part of
 the reason why most projects use a branch-and-merge model, rather than
 giving all the devs push rights.

 Ian.
 Why we might need to revert or reject changes:
 0. Stupid stuff. E.g. committing jars to repositories.


Committing jars to repositories is kind of careless, don't people have
sensible .gitignore files?  Can a .gitignore be committed to the repo?

But regardless, nothing in the process I've outlined would inhibit
correcting problems like this.  Ideally they'd be corrected at the code
review stage, but if they make it past that then they can be corrected with
a new branch, just like any other bugfix.


  2. Disruptive changes to APIs.
 This has also happened. Especially if the description is incomplete,
 there is a significant risk of refactoring breaking other code (e.g.
 plugins; this was part of the problem with Xor's changes), introducing
 new APIs that don't make sense etc.


Both code review and a decent continuous integration system should address
this.  This is a good continuous integration service that is free for OSS
projects: https://travis-ci.org/  You can see how I use it on this other
OSS project of mine: http://quickml.org/


 I agree that review capacity is potentially a bottleneck. There are
 different ways to solve this:
 - Have paid staff who review and merge stuff.


That seems very unlikely to happen, we barely have budget for actual
developers, let alone dedicated QA people.  And really, who'd want that job
anyway?

Even setting aside budget and finding suitable people, coders should have
primary responsibility for ensuring that their code is good.  It's
unhealthy to have a situation where coders think that ensuring their code
is clean and robust is someone else's problem.


 - Don't accept pull requests if nobody can review them right now.


This will absolutely cause a severe bottleneck, causing development to
grind to a halt, and probably destroying morale in the process.  Who wants
to work their ass of on some code only for it to sit in a branch
indefinitely?


 - Allow the reviewers to make reasonable demands for clear code. A pull
 request is a negotiation between the contributor and the maintainer.


Of course, reviewers can point out unclean code, and a conscientious
developer will want to commit good code so they'll fix it.  If a coder is
ignoring reasonable code review feedback then that might be grounds for
removing commit access.

Ian.
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-22 Thread Ian Clarke
On Sun, Mar 22, 2015 at 6:44 AM, Arne Babenhauserheide arne_...@web.de
wrote:

 Am Samstag, 21. März 2015, 12:45:39 schrieb Ian Clarke:
  It sounds like people are trying to use commits for code review, whereas
  they should be using Github pull requests.

 For huge changes we are using commits as a second level hierarchy:
 When the diff is too big to understand on its own, then the commits
 have to form an easy to follow story which allows understanding the
 change step by step.


I think you've misidentified the problem and therefore the solution.  The
problem is that diffs would ever be too big to understand on their own,
therefore the solution is that the diff should never be too big to
understand on it's own.  A branch/pull-request should rarely represent more
than a few days of work, and thus should rarely take more than a few
minutes to review.


 As such they unpaid devs cannot just sit down for 6 hours and read
 through a huge diff to understand it in its entirety. They need the
 diff more structured.


A 6 hour code review is insane, and should never be necessary.


 - For any isolatable feature or bugfix, create a new branch just for
 that feature or bug request (perhaps put the bug id # in the name of
 the
 branch).  *Do not combine multiple features or bugfixes into a single
 branch.*  If it can be merged independently, it should have it's own
 branch.

 We get into a problem with this, when the changes get too extensive
 without being ready for merging into a release.


It should never be necessary for changes to be that extensive.  In my
day-job we're working on a far more complex codebase than Freenet, and yet
a branch/pull-request almost *never* represents more than 4 days of work,
and therefore code reviews rarely require more than 10-20 minutes.


 Your scheme provides fast development of individual features, but does
 not provide information spread within the group: Only one person has
 seen the current version of the code.


I don't understand that, anyone that wants to is free to look at the
code/pull-requests.


 Longterm this needs to a situation where no one can understand the
 whole code well enough to change it efficiently, which in turn leads
 to pseudo-ownership over certain sub-sections of the code.


If we're producing code that isn't comprehensible without external
explanation, then we're not producing good code.  Well written code
shouldn't require an explanation, it should be self-explanatory.  This book
lays out some excellent guidelines for this kind of code, every programmer
should read it: http://amzn.com/0132350882

Ian.

-- 
Ian Clarke
Founder, The Freenet Project
Email: i...@freenetproject.org
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-22 Thread Ian Clarke
Sure, but let's not get bogged down with complaining about the past, we all
know it's been dysfunctional, let's focus on what we're going to do from
now on.

Ian.

On Sun, Mar 22, 2015 at 2:06 AM, Florent Daigniere 
nextg...@freenetproject.org wrote:

 On Sun, 2015-03-22 at 07:36 +0100, Florent Daigniere wrote:
  On Sun, 2015-03-22 at 00:42 +, Matthew Toseland wrote:
   On 21/03/15 20:49, Ian Clarke wrote:
On Sat, Mar 21, 2015 at 1:58 PM, Matthew Toseland 
 matt...@toselandcs.co.uk
wrote:
Most of the above boils down to review the diff, not the history.
 That
is probably sensible.
That's part of it, but also that a branch should be created for each
bugfix/feature, which ideally should be as small a unit of work as
 possible
(that can be merged without breaking stuff).
 
  It only is if the diff is of reasonable size... which it is most of the
  time, *except* when it comes from paid devs.
 
  They code in their cave for months, drop a 100kloc diff doing way more
  than a single feature/bugfix onto the maintainer and expect it to be
  merged; I'm sure that refactoring is good but not when it's forked off
  for 6month... This is the problem we had recently with both toad's and
  xor's code. We're talking about dozens of features and bugfixes AND
  refactoring.
 
  What blocks development is that the refactoring isn't merged until their
  work is ready... and there's usually a good reason for it; they code
  first and think later (to be fair it's a common trait amongst devs
  working alone) which means that as long as they haven't tried it out
  they're not quite sure of what they want in terms of API... so it's only
  in a mergeable state when it works.
 
  Whether there's a single change per commit/branch/pull request doesn't
  matter as long as one of them is enforced. Until now the base unit was
  supposed to be one change per commit; I'm all for changing it to
  something else but that won't solve the problem unless it's enforced
  strictly.
 
  Florent

 Just to put some perspective on what I've written above:

 Here's toad's diff:
 https://github.com/freenet/fred/pull/287/files

 Here's xor's diff:
 https://github.com/freenet/fred/pull/319/files

 in both cases github gives up rendering it:
 Sorry, we could not display the changes to this file because there were
 too many other changes to display.

 and doesn't even display the commit count:
  This pull request is big! We're only showing the most recent 250
 commits. 

 As for the individual commits, there's plenty which are doing more than
 a bugfix/feature/single semantic change.

 Florent




-- 
Ian Clarke
Founder, The Freenet Project
Email: i...@freenetproject.org
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-22 Thread Matthew Toseland
On 22/03/15 19:22, Ian Clarke wrote:
 I think you've misidentified the problem and therefore the solution.  The
 problem is that diffs would ever be too big to understand on their own,
 therefore the solution is that the diff should never be too big to
 understand on it's own.  A branch/pull-request should rarely represent more
 than a few days of work, and thus should rarely take more than a few
 minutes to review.
...
 It should never be necessary for changes to be that extensive.  In my
 day-job we're working on a far more complex codebase than Freenet, and yet
 a branch/pull-request almost *never* represents more than 4 days of work,
 and therefore code reviews rarely require more than 10-20 minutes.
Unfortunately, sometimes it is necessary to make big changes.

For example, my work last summer: We can't keep on using Db4o because
it's been discontinued. We can't easily move to another object database
because they are not standardised. In any case we need to rewrite it to
a simpler structure for stability and performance reasons. But this
meant rewriting a lot of core code, adding necessary infrastructure, and
then implementing backward compatibility. Since this will break APIs, we
may as well fix some of the other problems at the same time.

Now, if I'd had time, I would have broken this up into a series of pull
requests (or big commits) to be reviewed separately. However, most of
them cannot be merged independantly - either they introduce new
functionality that isn't used (so why should it be in the codebase?), or
they break backwards compatibility (which is a largish chunk of work
itself). It appears that time was a factor in Xor's case too - he was
worried about starting his thesis soon.

I believe Ian's answer will be that time is always a factor and the
correct solution is to cut corners and hope for the best - we will clean
up the code some day. Some day never comes, because there's always
another deadline (for employees)... IMHO this is a management style
which does not work with the kind of code that Freenet is, that modern
OSS is, that Google's infrastructure is. It works well enough with code
that doesn't need to be maintained, and it just about works if you can
rewrite it completely every year, or ship it once and forget it.

Re volunteers, if they know what the rules are and if maintainers
respond quickly there's a good chance that they will help to solve the
problems, and the code will get merged.
 Longterm this needs to a situation where no one can understand the
 whole code well enough to change it efficiently, which in turn leads
 to pseudo-ownership over certain sub-sections of the code.
 If we're producing code that isn't comprehensible without external
 explanation, then we're not producing good code.  Well written code
 shouldn't require an explanation, it should be self-explanatory.  This book
 lays out some excellent guidelines for this kind of code, every programmer
 should read it: http://amzn.com/0132350882
Yes, and allowing everyone to commit with no supervision is a perfect
recipe for the sort of spaghetti code that we would like to avoid.



signature.asc
Description: OpenPGP digital signature
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-22 Thread Ian
On Sun, Mar 22, 2015 at 1:36 AM, Florent Daigniere 
nextg...@freenetproject.org wrote:

 On Sun, 2015-03-22 at 00:42 +, Matthew Toseland wrote:
  On 21/03/15 20:49, Ian Clarke wrote:
   On Sat, Mar 21, 2015 at 1:58 PM, Matthew Toseland 
 matt...@toselandcs.co.uk
   wrote:
   Most of the above boils down to review the diff, not the history.
 That
   is probably sensible.
   That's part of it, but also that a branch should be created for each
   bugfix/feature, which ideally should be as small a unit of work as
 possible
   (that can be merged without breaking stuff).

 It only is if the diff is of reasonable size... which it is most of the
 time, *except* when it comes from paid devs.


Then that needs to change.  With a feature/bugfix-per-branch approach the
diffs should be reviewable in just a few minutes.  If they aren't, then the
coder is probably stuffing multiple features/bugfixes into a single branch
and that's contrary to the guidelines I've outlined.  Whatever happened in
the past, what matters is what we do going forward.


 They code in their cave for months, drop a 100kloc diff doing way more
 than a single feature/bugfix onto the maintainer and expect it to be
 merged; I'm sure that refactoring is good but not when it's forked off
 for 6month... This is the problem we had recently with both toad's and
 xor's code. We're talking about dozens of features and bugfixes AND
 refactoring.


Then that needs to change going forward.


 Whether there's a single change per commit/branch/pull request doesn't
 matter as long as one of them is enforced.


The way Github's code review tools are designed, you definitely want it to
be per branch (which will have a 1-1 relationship with a pull request), not
per commit.


 Until now the base unit was
 supposed to be one change per commit; I'm all for changing it to
 something else but that won't solve the problem unless it's enforced
 strictly.


You can't impose processes on people, they need to agree to them or it
won't work.  That being said, I don't know why any reasonable person
wouldn't agree to what I've outlined.  It's a tried and tested approach.

Ian.
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-22 Thread Matthew Toseland
On 22/03/15 19:07, Ian wrote:
 On Sat, Mar 21, 2015 at 7:42 PM, Matthew Toseland matt...@toselandcs.co.uk
 wrote:


 0. Stupid stuff. E.g. committing jars to repositories.
 Committing jars to repositories is kind of careless, don't people have
 sensible .gitignore files?  Can a .gitignore be committed to the repo?

 But regardless, nothing in the process I've outlined would inhibit
 correcting problems like this.  Ideally they'd be corrected at the code
 review stage, but if they make it past that then they can be corrected with
 a new branch, just like any other bugfix.
No, they can't.

Removing binaries (or worse, copyright-infringing files) requires
rewriting the history.

Rewriting the history is *BAD*. It makes it impossible for a third party
to follow what is going on. It makes life extremely difficult/messy for
our own developers, breaking their unfinished work when they pull from
master.
 2. Disruptive changes to APIs.
 This has also happened. Especially if the description is incomplete,
 there is a significant risk of refactoring breaking other code (e.g.
 plugins; this was part of the problem with Xor's changes), introducing
 new APIs that don't make sense etc.
 Both code review and a decent continuous integration system should address
 this.  This is a good continuous integration service that is free for OSS
 projects: https://travis-ci.org/  You can see how I use it on this other
 OSS project of mine: http://quickml.org/
Not all problems can be detected by automated tools.

Sometimes it is necessary to change classes that are used by plugins,
and some of those plugins are unofficial, i.e. third party.
 I agree that review capacity is potentially a bottleneck. There are
 different ways to solve this:
 - Have paid staff who review and merge stuff.
 That seems very unlikely to happen, we barely have budget for actual
 developers, let alone dedicated QA people.  And really, who'd want that job
 anyway?

 Even setting aside budget and finding suitable people, coders should have
 primary responsibility for ensuring that their code is good.  It's
 unhealthy to have a situation where coders think that ensuring their code
 is clean and robust is someone else's problem.
That is precisely how every mature project works. Just because neither
your business ventures nor your involvement in open source are mature
doesn't mean there aren't projects out there - both open source (Linux,
Wine) and businesses (Steve's employer, presumably Google, etc) - which
care about code quality.

I repeat my earlier point: If we allow everyone to commit without
review, we will spend a lot of our time (and commit history) undoing
stuff. This is one of the reasons behind the rise of distributed version
control in the first place. Git, and Github, are specifically designed
to support pull requests. The implication is that the person merging the
changes is not the same as the person posting them, and that they should
be signed-off, i.e. reviewed.
 - Don't accept pull requests if nobody can review them right now.
 This will absolutely cause a severe bottleneck, causing development to
 grind to a halt, and probably destroying morale in the process.  Who wants
 to work their ass of on some code only for it to sit in a branch
 indefinitely?
 - Allow the reviewers to make reasonable demands for clear code. A pull
 request is a negotiation between the contributor and the maintainer.
 Of course, reviewers can point out unclean code, and a conscientious
 developer will want to commit good code so they'll fix it.  If a coder is
 ignoring reasonable code review feedback then that might be grounds for
 removing commit access.
We are not using SVN. We are not using CVS. NOBODY uses Git and then
gives everyone push access. It just doesn't make sense. The correct,
standard workflow used by just about everyone is to fork and post a pull
request, which may or may not be merged.

I will reply to your points about big changes separately.



signature.asc
Description: OpenPGP digital signature
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-22 Thread Matthew Toseland
On 22/03/15 22:21, Roland Haeder wrote:
 On Sun, 22 Mar 2015 21:40:41 +
 Matthew Toseland mj...@cam.ac.uk wrote:

 On 22/03/15 19:22, Ian Clarke wrote:
 Unfortunately, sometimes it is necessary to make big changes.
 One good example is the initial import of all files for a new project.
 There you commit a huge chunk of files and directories.

 As I'm (different project) a newcomer to a project's development, I
 had started cloning the original repository (creating a personal
 clone) and started developing. I did this by committing carefully and
 I tried to avoid huge changes (not always possible, more below) to let
 them easily review them.

 Then I stepped towards them and my commits got rejected because I
 didn't make a feature branch or fix branch ... :-( That is kind of
 frustrating because my commits won't make it in.

 What I mean here is that the maintainer should not be so bureaucratic
 to newcomers as this could turn them away and they may never come back.

 The maintainer can then still review commits relatively easily by
 setting up a merge branch

 git checkout master # or wherever your development branch is
 git remote add rolands_code git://git.bla.example/some/foo.git --track
 master
 git checkout -b code_reviewed_okay/master
 git checkout master
 git checkout -b rolands_master
 git pull rolands_code

 This way you can check your code with changes from developer
 roland (yes, my name) without mixing it so quickly with your own
 code. Then start the review process by looking at each commit and
 cherry-pick it into code_reviewed_okay/master branch:

 git checkout code_reviewed_okay/master
 git cherry-pick xx
 git checkout rolands_master

 Then you can check them again (on functionality) in the review branch
 (as not all commits from rolands_master are in) and continue with next
 one.

 Maybe there is a better way. :-) But this way may work. After that
 merge all reviewed commits into master:

 git checkout master
 git merge -S code_reviewed_okay/master

 That seems to be a fine way to me.

 My 2 cents. :-)
Your code should not be rejected because it is on master on your own
repository IMHO. However, if you are going to make multiple pull
requests you really need to use separate branches. If you have several
different changes on the same branch then you definitely need to
separate them out before making a pull request.

However, this is pretty easy using git branch, git reset and git
cherry-pick.



signature.asc
Description: OpenPGP digital signature
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-22 Thread Roland Haeder
On Sun, 22 Mar 2015 22:27:24 +
Matthew Toseland mj...@cam.ac.uk wrote:

 Your code should not be rejected because it is on master on your own
 repository IMHO. However, if you are going to make multiple pull
 requests you really need to use separate branches. If you have several
 different changes on the same branch then you definitely need to
 separate them out before making a pull request.
 
 However, this is pretty easy using git branch, git reset and git
 cherry-pick.
 

Okay, I will try it then later. Thank you. :-)


signature.asc
Description: PGP signature
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-22 Thread Roland Haeder
On Sun, 22 Mar 2015 21:40:41 +
Matthew Toseland mj...@cam.ac.uk wrote:

 On 22/03/15 19:22, Ian Clarke wrote:
 Unfortunately, sometimes it is necessary to make big changes.

One good example is the initial import of all files for a new project.
There you commit a huge chunk of files and directories.

As I'm (different project) a newcomer to a project's development, I
had started cloning the original repository (creating a personal
clone) and started developing. I did this by committing carefully and
I tried to avoid huge changes (not always possible, more below) to let
them easily review them.

Then I stepped towards them and my commits got rejected because I
didn't make a feature branch or fix branch ... :-( That is kind of
frustrating because my commits won't make it in.

What I mean here is that the maintainer should not be so bureaucratic
to newcomers as this could turn them away and they may never come back.

The maintainer can then still review commits relatively easily by
setting up a merge branch

git checkout master # or wherever your development branch is
git remote add rolands_code git://git.bla.example/some/foo.git --track
master
git checkout -b code_reviewed_okay/master
git checkout master
git checkout -b rolands_master
git pull rolands_code

This way you can check your code with changes from developer
roland (yes, my name) without mixing it so quickly with your own
code. Then start the review process by looking at each commit and
cherry-pick it into code_reviewed_okay/master branch:

git checkout code_reviewed_okay/master
git cherry-pick xx
git checkout rolands_master

Then you can check them again (on functionality) in the review branch
(as not all commits from rolands_master are in) and continue with next
one.

Maybe there is a better way. :-) But this way may work. After that
merge all reviewed commits into master:

git checkout master
git merge -S code_reviewed_okay/master

That seems to be a fine way to me.

My 2 cents. :-)


signature.asc
Description: PGP signature
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-21 Thread Matthew Toseland
On 21/03/15 17:45, Ian Clarke wrote:
 Talking to a few people, I think our current approach to code review is
 problematic.

 For example, I've been told that some people are arguing that commits are
 too granular, and need to be combined to make code review easier. This is a
 mistake, there is nothing wrong with very granular commits, just as there
 is nothing wrong with more verbose code if it helps clarity.  Pull requests
 should be used for code review, not individual commits.  The number of
 individual commits should be irrelevant.

 It sounds like people are trying to use commits for code review, whereas
 they should be using Github pull requests.

 Here is a process that has worked well in my experience:

- For any isolatable feature or bugfix, create a new branch just for
that feature or bug request (perhaps put the bug id # in the name of the
branch).  *Do not combine multiple features or bugfixes into a single
branch.*  If it can be merged independently, it should have it's own
branch.
- Commits to this branch can be as granular as the programmer wants,
generally the more frequent the commits the better
- When the programmer is ready for feedback, they should create a pull
request between this branch and the main branch - they could post the URL
of the pull request to IRC to encourage feedback
- It's fine for a programmer to request feedback before the feature is
complete, but they should note this in the title of the pull request - eg.
DO_NOT_MERGE
- For code review, the most useful tab is Files changed - which shows
all differences between the feature branch and the main branch (individual
commits are irrelevant here) - comments can be added here
- Once the feature is complete and the review feedback has been read and
perhaps acted upon by the programmer working on the feature, it should be
merged

 The person contributing the code is responsible for asking for and
 incorporating feedback, but they control the process, the process should
 not control them.  So, for example, in some cases the programmer might
 decide that a code review is unnecessary.  That will be their call.  Better
 to trust human judgement over a rigid process.

 Thoughts?

 Ian.
Most of the above boils down to review the diff, not the history. That
is probably sensible.

The last point is everyone can commit anything without review. That's
the fundamental question here: Do we want to require that some
responsible person (release manager, person with push rights) reviews
and signs off on the code before it is pushed?

There are 2 main reasons for this:
1. Security. How useful this is is debatable.
2. Disruptive changes to APIs.



signature.asc
Description: OpenPGP digital signature
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-21 Thread Matthew Toseland
On 21/03/15 18:58, Matthew Toseland wrote:
 On 21/03/15 17:45, Ian Clarke wrote:
 Talking to a few people, I think our current approach to code review is
 problematic.

 For example, I've been told that some people are arguing that commits are
 too granular, and need to be combined to make code review easier. This is a
 mistake, there is nothing wrong with very granular commits, just as there
 is nothing wrong with more verbose code if it helps clarity.  Pull requests
 should be used for code review, not individual commits.  The number of
 individual commits should be irrelevant.

 It sounds like people are trying to use commits for code review, whereas
 they should be using Github pull requests.

 Here is a process that has worked well in my experience:

- For any isolatable feature or bugfix, create a new branch just for
that feature or bug request (perhaps put the bug id # in the name of the
branch).  *Do not combine multiple features or bugfixes into a single
branch.*  If it can be merged independently, it should have it's own
branch.
- Commits to this branch can be as granular as the programmer wants,
generally the more frequent the commits the better
- When the programmer is ready for feedback, they should create a pull
request between this branch and the main branch - they could post the URL
of the pull request to IRC to encourage feedback
- It's fine for a programmer to request feedback before the feature is
complete, but they should note this in the title of the pull request - eg.
DO_NOT_MERGE
- For code review, the most useful tab is Files changed - which shows
all differences between the feature branch and the main branch (individual
commits are irrelevant here) - comments can be added here
- Once the feature is complete and the review feedback has been read and
perhaps acted upon by the programmer working on the feature, it should be
merged

 The person contributing the code is responsible for asking for and
 incorporating feedback, but they control the process, the process should
 not control them.  So, for example, in some cases the programmer might
 decide that a code review is unnecessary.  That will be their call.  Better
 to trust human judgement over a rigid process.

 Thoughts?

 Ian.
 Most of the above boils down to review the diff, not the history. That
 is probably sensible.
It implies that the RM's can reasonably ask for some sort of guide to
the changes to make diff-level review of big patches easier.
 The last point is everyone can commit anything without review. That's
 the fundamental question here: Do we want to require that some
 responsible person (release manager, person with push rights) reviews
 and signs off on the code before it is pushed?

 There are 2 main reasons for this:
 1. Security. How useful this is is debatable.
 2. Disruptive changes to APIs.




signature.asc
Description: OpenPGP digital signature
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-21 Thread Ian Clarke
On Sat, Mar 21, 2015 at 1:58 PM, Matthew Toseland matt...@toselandcs.co.uk
wrote:

 Most of the above boils down to review the diff, not the history. That
 is probably sensible.


That's part of it, but also that a branch should be created for each
bugfix/feature, which ideally should be as small a unit of work as possible
(that can be merged without breaking stuff).


 The last point is everyone can commit anything without review. That's
 the fundamental question here: Do we want to require that some
 responsible person (release manager, person with push rights) reviews
 and signs off on the code before it is pushed?


I think the question is moot, since (so far as I'm aware) we don't have
anyone that can commit to reviewing all code reliably and quickly, so such
a requirement would only serve to create a severe bottleneck in our
development process.

All commits are public, all commits can be reviewed by anyone, but in the
event that nobody is in a position to review something we can't allow
development to grind to a halt.  If people care about reviewing code then
they can and should review code.

Ian.



 There are 2 main reasons for this:
 1. Security. How useful this is is debatable.
 2. Disruptive changes to APIs.


 ___
 Devl mailing list
 Devl@freenetproject.org
 https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl




-- 

*Ian Clarke* / Co-Founder  CTO

*OneSpot, Inc*

Email: i...@onespot.com
Web: http://www.onespot.com
Personal Blog: http://blog.locut.us/
LinkedIn: http://www.linkedin.com/in/iancjclarke
Twitter: http://twitter.com/sanity
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-21 Thread Ian Clarke
[resending with correct from: address]

On Sat, Mar 21, 2015 at 1:58 PM, Matthew Toseland matt...@toselandcs.co.uk
wrote:

 Most of the above boils down to review the diff, not the history. That
 is probably sensible.


That's part of it, but also that a branch should be created for each
bugfix/feature, which ideally should be as small a unit of work as possible
(that can be merged without breaking stuff).


 The last point is everyone can commit anything without review. That's
 the fundamental question here: Do we want to require that some
 responsible person (release manager, person with push rights) reviews
 and signs off on the code before it is pushed?


I think the question is moot, since (so far as I'm aware) we don't have
anyone that can commit to reviewing all code reliably and quickly, so such
a requirement would only serve to create a severe bottleneck in our
development process.

All commits are public, all commits can be reviewed by anyone, but in the
event that nobody is in a position to review something we can't allow
development to grind to a halt.  If people care about reviewing code then
they can and should review code.

Ian.

-- 
Ian Clarke
Founder, The Freenet Project
Email: i...@freenetproject.org
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Github and code review

2015-03-21 Thread Matthew Toseland
On 21/03/15 20:49, Ian Clarke wrote:
 On Sat, Mar 21, 2015 at 1:58 PM, Matthew Toseland matt...@toselandcs.co.uk
 wrote:
 Most of the above boils down to review the diff, not the history. That
 is probably sensible.
 That's part of it, but also that a branch should be created for each
 bugfix/feature, which ideally should be as small a unit of work as possible
 (that can be merged without breaking stuff).
Absolutely. See e.g.:
http://nvie.com/posts/a-successful-git-branching-model/

We try to stick to this ... or at least, that was the plan/consensus!
 The last point is everyone can commit anything without review. That's
 the fundamental question here: Do we want to require that some
 responsible person (release manager, person with push rights) reviews
 and signs off on the code before it is pushed?
 I think the question is moot, since (so far as I'm aware) we don't have
 anyone that can commit to reviewing all code reliably and quickly, so such
 a requirement would only serve to create a severe bottleneck in our
 development process.

 All commits are public, all commits can be reviewed by anyone, but in the
 event that nobody is in a position to review something we can't allow
 development to grind to a halt.  If people care about reviewing code then
 they can and should review code.
After it's been committed.

The problem is then we spend a lot of time undoing stuff. This can and
does happen, and has happened, even relatively recently. It's part of
the reason why most projects use a branch-and-merge model, rather than
giving all the devs push rights.
 Ian.
Why we might need to revert or reject changes:
0. Stupid stuff. E.g. committing jars to repositories.
 There are 2 main reasons for this:
 1. Security. How useful this is is debatable.
E.g. introducing a rootkit which will be run on developer's boxes by
editing the build scripts, or the tests. It may be possible to prevent
this via automated tools. Subtle changes and dirty tricks may be hard to
detect, but some things *do* show up in a quick code review.
 2. Disruptive changes to APIs.
This has also happened. Especially if the description is incomplete,
there is a significant risk of refactoring breaking other code (e.g.
plugins; this was part of the problem with Xor's changes), introducing
new APIs that don't make sense etc.

I agree that review capacity is potentially a bottleneck. There are
different ways to solve this:
- Have paid staff who review and merge stuff.
- Don't accept pull requests if nobody can review them right now.
- Allow the reviewers to make reasonable demands for clear code. A pull
request is a negotiation between the contributor and the maintainer.

Many OSS projects use processes like this...



signature.asc
Description: OpenPGP digital signature
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl