Re: [freenet-dev] Github and code review
On 01/04/15 21:56, Arne Babenhauserheide wrote: Am Dienstag, 31. März 2015, 22:29:24 schrieb Ian: On Mon, Mar 30, 2015 at 3:16 PM, Arne Babenhauserheide arne_...@web.de wrote: It’s work from paid contributors for which we need structures which reduce the cost of code-review compared to what you propose I haven't heard of any proposal other than my own that would reduce the cost of code-review. What specifically has been proposed that would reduce the cost of code review, and why specifically would it reduce the cost of code-review? Requiring contributors to refactor large pull-requests into semantically meaningful commits. Or into separate pull requests. Which is what Ian suggested. Sometimes a change will be big enough that it needs to be treated as a series of changes of manageable size. However these changes will still be dependent on each other. For example my work this summer should have been broken up into something like: Add infrastructure Add splitfile storage backend Add splitfile storage tests Integrate splitfile storage Implement back compatibility Delete remnants of old code Improve APIs since we've had to change them anyway (This is an oversimplification, of course...) These could be separate pull requests, but they have to be treated as a series, because after integrate splitfile storage, we lose backwards compatibility. Ideally we'd like to avoid such situations entirely but when dealing with persistence, legacy code, and refactoring APIs, we can't always achieve this, and sometimes it's not desirable (i.e. we want to get all the API changes in simultaneously - they can be separate pull requests but should go in for the same build). The point is, the two approaches are functionally equivalent: - A series of largish squashed commits. - A series of pull requests. IMHO the former is marginally preferable, because a series of pull requests is a slightly awkward beast. But I don't think it matters much. Pick one, continue. Committers (including the release manager) should require one or the other. But that's not the real issue here. The fundamental disagreement here is *whether we should have advance code review at all*. Unfortunately this appears to be a case of conflicting goals... This reduces the cost of code review for the reviewer - which is where we have a bottleneck. It is very clear that the current process is neither streamlined nor is it painless, so change is clearly necessary. The process is streamlined and painless, when both parties agree to make it easy. See the past few pull-requests for examples which show that it works well. And paid staff should stick with the agreed process. But we have agreement on this now from Xor, so that's fine. The code review of fcp-rewrite was exhausting, because (a) the pull-request was huge, and (b) the coder got defensive and long-winded instead of accepting the review where he agreed or giving short, friendly explanations where he thought that it was mistaken - and accept that code which does not pass the review cannot be merged. (a) will happen from time to time. By requiring refactoring of history for huge pull-requests and generally making it as easy as reasonable to review we can deal with it (though what’s reasonable might differ - in the end it’s the reviewers who have to decide that, because they are the bottleneck). (b) however is not about the tools. It is about the question what behavior we as community can expect from paid developers, and how to organize our community in a way which minimizes the friction from very different time budgets. Best wishes, Arne signature.asc Description: OpenPGP digital signature ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On 04/03/2015 02:19 PM, Matthew Toseland wrote: On 01/04/15 21:56, Arne Babenhauserheide wrote: Am Dienstag, 31. März 2015, 22:29:24 schrieb Ian: On Mon, Mar 30, 2015 at 3:16 PM, Arne Babenhauserheide arne_...@web.de wrote: It’s work from paid contributors for which we need structures which reduce the cost of code-review compared to what you propose I haven't heard of any proposal other than my own that would reduce the cost of code-review. What specifically has been proposed that would reduce the cost of code review, and why specifically would it reduce the cost of code-review? Requiring contributors to refactor large pull-requests into semantically meaningful commits. Or into separate pull requests. Which is what Ian suggested. Sometimes a change will be big enough that it needs to be treated as a series of changes of manageable size. However these changes will still be dependent on each other. For example my work this summer should have been broken up into something like: Add infrastructure Add splitfile storage backend Add splitfile storage tests Integrate splitfile storage Implement back compatibility Delete remnants of old code Improve APIs since we've had to change them anyway (This is an oversimplification, of course...) These could be separate pull requests, but they have to be treated as a series, because after integrate splitfile storage, we lose backwards compatibility. Ideally we'd like to avoid such situations entirely but when dealing with persistence, legacy code, and refactoring APIs, we can't always achieve this, and sometimes it's not desirable (i.e. we want to get all the API changes in simultaneously - they can be separate pull requests but should go in for the same build). The point is, the two approaches are functionally equivalent: - A series of largish squashed commits. - A series of pull requests. IMHO the former is marginally preferable, because a series of pull requests is a slightly awkward beast. But I don't think it matters much. Pick one, continue. Committers (including the release manager) should require one or the other. We all seem to agree that small pull requests that can be reviewed in a few minutes is the ideal. I propose the contributing guidelines include something like this: ## Pull request scope A pull request is easiest to review when it is small enough take only a few minutes. If your pull request is much larger than this, we might ask that you split it over a series of pull requests. I don't see why we need to require exactly one technique. What about keeping the part about squashing with an intro like this: If you'd rather keep a single pull request, please squash into high-level commits: If there are objections to including this suggestion even at this level I will drop it so that we can proceed. But that's not the real issue here. The fundamental disagreement here is *whether we should have advance code review at all*. Unfortunately this appears to be a case of conflicting goals... I didn't take that as serious proposals that no one review code at all, rather an example of how we really do need to figure out a way to review effectively. signature.asc Description: OpenPGP digital signature ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
DISCLAIMER: As the one who offended the fred review process, I *do not* participate in this discussion to manipulate your decision. It is up to you folks to decide what the review standard is. I will comply to all rules for pull requests :) If you say I shall squash, I will squash. Instead, the information below is provided as a service so you have more input to use to come to a conclusion, as I feel that some aspects of squashing have not been mentioned by anyone else yet. After I've explained the unmentioned disadvantages of squashing, at text position [1] I will also provide an alternative to squashing which you might want to discuss. (Another reason for this mail is: I have the personal desire to excuse myself for causing the huge discussion. I'm sorry, seriously. So to have a chance to be forgiven, I want to a least explain why I objected so strongly to squashing.) On Sunday, March 29, 2015 02:29:56 PM Bert Massop wrote: In short: I'm in favor of squashing when done in a sensible and information-preserving way. [...] This is an interesting observation. There appears to be a certain hierarchy to which history can be considered useful for a particular purpose. IMHO rewriting / losing history is only a bad thing if the history actually serves a purpose: this depends on who will be using the log in the future, and for what purpose. I really don't care about anyone's Oops, fix XYZ or Also fix XYZ at ABC commits (and nor should any other developer or reviewer) as long as the relevant code has not yet surfaced. Yes, this provides information as to how the code was originally created, but it serves no purpose in understanding, reviewing or maintaining the code. I consider this kind of commits *noise* if they end up in the master tree (or in the initial commit set of a pull request), though they may be useful in my own branch during development (e.g. I can revert more granularly, read my work log, etc.). (FYI: I consider myself a noisy committer, too.) A pull request may require such commits to be added to the publicly known code, at that point they do add information — up to the point where the pull request is merged, from then on they no longer serve a purpose. Unfortunately, this is short-sighted. Consider a branch of a game is developed, it is called add-vehicle-system. It contains the initial new (poor) pseudo code: class DrumBrake extends Brake { brake() { speed -= 10; } } class Bike extends Vehicle { DrumBrake b; void brake() { b.brake(); }} class Hummer extends Car { DrumBrake b; void brake() { b.brake(); }} class Porsche extends Car { DrumBrake b; void brake() { b.brake(); }} Now the code isn't merged yet, and some engineer comes along and notices that all the vehicles have copy-pasted and by-design poor drum brakes, so he adds commits: Add disc brake Fix Hummer to use disc brake The code is now: class DrumBrake extends Brake { brake() { speed -= 10; } } class DiscBrake extends Brake { brake() { speed -= 100; } } class Bike extends Vehicle { DrumBrake b; void brake() { b.brake(); }} class Hummer extends Car { DiscBrake b; void brake() { b.brake(); }} class Porsche extends Car { DrumBrake b; void brake() { b.brake(); }} Unfortunately, the fastest vehicle - the Porsche - was forgotten about and still has the poor drum brakes. This isn't noticed yet though, the whole add-vehicle-system branch is done and thus is squashed and merged into a single commit: Add vehicle implementation Then someday, someone will notice that the Porsche brakes poorly, and look into the history of the code to figure out why this is. Unfortunately, since all Fix ... to use disc brake commits are gone, he will NOT notice that a Fix Porsche to use disc brake commit was lacking. Thus, he will think: Well, this must have been intentional. I've also heard that in the 80s even some sports cars still had drum brakes. Guess its OK. and leave the Porsche with the poor brakes as is. Now you might say that this is an academically constructed example. Yea well I cannot remember a practical one yet, sorry :) But if you don't accept this, then what *is* the point of preserving *any* Git history anyway? The users only want the most recent version, just as you said. So we might as well delete *all* history. But we don't, because the purpose of history is specifically to have *old* versions available so people can investigate the historical evolution of code to help with debugging. So by demanding to delete old versions by squashing, you demand to invalidate the sole purpose of the history itself :) Now as said, nevertheless, I don't want to cast a vote here. I do understand that people are overwhelmed by the amount of code I have thrown at them, and I *will* squash pull requests when requested to. Maybe it will work out just fine. I have only used squashing once, maybe I actually like it if I get more used to it. There are for sure change
Re: [freenet-dev] Github and code review
Am Dienstag, 31. März 2015, 22:29:24 schrieb Ian: On Mon, Mar 30, 2015 at 3:16 PM, Arne Babenhauserheide arne_...@web.de wrote: It’s work from paid contributors for which we need structures which reduce the cost of code-review compared to what you propose I haven't heard of any proposal other than my own that would reduce the cost of code-review. What specifically has been proposed that would reduce the cost of code review, and why specifically would it reduce the cost of code-review? Requiring contributors to refactor large pull-requests into semantically meaningful commits. This reduces the cost of code review for the reviewer - which is where we have a bottleneck. It is very clear that the current process is neither streamlined nor is it painless, so change is clearly necessary. The process is streamlined and painless, when both parties agree to make it easy. See the past few pull-requests for examples which show that it works well. The code review of fcp-rewrite was exhausting, because (a) the pull-request was huge, and (b) the coder got defensive and long-winded instead of accepting the review where he agreed or giving short, friendly explanations where he thought that it was mistaken - and accept that code which does not pass the review cannot be merged. (a) will happen from time to time. By requiring refactoring of history for huge pull-requests and generally making it as easy as reasonable to review we can deal with it (though what’s reasonable might differ - in the end it’s the reviewers who have to decide that, because they are the bottleneck). (b) however is not about the tools. It is about the question what behavior we as community can expect from paid developers, and how to organize our community in a way which minimizes the friction from very different time budgets. Best wishes, Arne signature.asc Description: This is a digitally signed message part. ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On Mon, Mar 30, 2015 at 3:16 PM, Arne Babenhauserheide arne_...@web.de wrote: It’s work from paid contributors for which we need structures which reduce the cost of code-review compared to what you propose I haven't heard of any proposal other than my own that would reduce the cost of code-review. What specifically has been proposed that would reduce the cost of code review, and why specifically would it reduce the cost of code-review? - and not reviewing code isn’t an option anymore My hope is that if the process of code review can be made less painful, which is the intent of my proposal, then we will have sufficient resources to do comprehensive code reviews. However, if we do not have sufficient resources to do comprehensive code reviews then our options are either to make do without comprehensive code reviews, or to suspend development of Freenet. since toad left we managed to decentralize the project structures to some degree, and allowing unreviewed pushes would void the additional safety against compromise and dependendcy on individuals which that decentralization offers. We need to get to a point where we can cope with having anyone leave the group without notice, and that means that every change must be known by at least two people. I agree, that is a desirable situation. The best way to achieve that is to ensure that the code-review process is as streamlined and painless as possible, which is why I've proposed a process that I know to be streamlined and painless because I've used it myself (as have many others). It is very clear that the current process is neither streamlined nor is it painless, so change is clearly necessary. Ian. ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
Am Sonntag, 29. März 2015, 13:43:42 schrieb Ian: Nothing you are advocating will change anything if we simply don't have enough people willing to do code reviews. At least with my proposal it will be a much less painful process than it appears to be today. Florent put that pretty well: There’s no problem with reviewing work from unpaid contributors. It’s work from paid contributors for which we need structures which reduce the cost of code-review compared to what you propose - and not reviewing code isn’t an option anymore: since toad left we managed to decentralize the project structures to some degree, and allowing unreviewed pushes would void the additional safety against compromise and dependendcy on individuals which that decentralization offers. We need to get to a point where we can cope with having anyone leave the group without notice, and that means that every change must be known by at least two people. Best wishes, Arne -- Ich hab' nichts zu verbergen – hab ich gedacht: - http://draketo.de/licht/lieder/ich-hab-nichts-zu-verbergen signature.asc Description: This is a digitally signed message part. ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On Sat, Mar 28, 2015 at 7:18 PM, Arne Babenhauserheide arne_...@web.de wrote: Am Samstag, 28. März 2015, 11:32:30 schrieb Ian: I agree with Bombe that it’s not nice to lose the history, but with git that’s the best we can do. It’s a limitation of the tool. It's not a limitation of the tool, it's a limitation created by your desire to misuse the tool. We have a need, we use a tool. To fulfil the need with the tool we have to misuse the tool. Consequently the tool is too limited for our usecase. There are several hacks around that - including merge commits with explanations and only ever looking at merge commits (hiding all non-merges) - and that these hacks are used by others shows that we aren’t the only ones with these needs, but git offers no clean solution. Sadly there also isn’t a clean way forward: Git users tend to be pretty defensive of their tool, and as such I expect that moving to a different tool would hurt more than it would help. So we’re stuck with the hacks. There is a clear way forward, we can use Github for code review the way that it was clearly intended to be used, and the way 95% of competent development teams use it. That is to review pull requests, not commits. It certainly works very well for my team (with a larger codebase than Freenet). We've never had any of the problems you guys seem to be so concerned about. Do you run a free software community with vastly differing amount of worktime per week? No, but I see nothing about Freenet that would make this work any less well. We cannot expect people to be able to review every 4 days. Yes we can, that is how code review is supposed to happen. If people can't be bothered to review code then that is unfortunate, but we can't allow development to grind to a halt because of it. The complaints I've seen are not that reviews are too frequent, but rather that when they do occur people are required to review massive changes. A code review should only require a few minutes. This is a reality we cannot change without having more paid developers. Different from a company, we have to structure our workflow in a way which keeps review from becoming a bottleneck while ensuring that all code is reviewed. Nothing you are advocating will change anything if we simply don't have enough people willing to do code reviews. At least with my proposal it will be a much less painful process than it appears to be today. Ian. ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
Am Samstag, 28. März 2015, 11:32:30 schrieb Ian: I agree with Bombe that it’s not nice to lose the history, but with git that’s the best we can do. It’s a limitation of the tool. It's not a limitation of the tool, it's a limitation created by your desire to misuse the tool. We have a need, we use a tool. To fulfil the need with the tool we have to misuse the tool. Consequently the tool is too limited for our usecase. There are several hacks around that - including merge commits with explanations and only ever looking at merge commits (hiding all non-merges) - and that these hacks are used by others shows that we aren’t the only ones with these needs, but git offers no clean solution. Sadly there also isn’t a clean way forward: Git users tend to be pretty defensive of their tool, and as such I expect that moving to a different tool would hurt more than it would help. So we’re stuck with the hacks. It certainly works very well for my team (with a larger codebase than Freenet). We've never had any of the problems you guys seem to be so concerned about. Do you run a free software community with vastly differing amount of worktime per week? We cannot expect people to be able to review every 4 days. This is a reality we cannot change without having more paid developers. Different from a company, we have to structure our workflow in a way which keeps review from becoming a bottleneck while ensuring that all code is reviewed. Best wishes, Arne -- Celebrate with ye beauty and gather yer friends for a Pirate Party! → http://1w6.org/english/flyerbook-rules#pirate-party ← signature.asc Description: This is a digitally signed message part. ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On Sat, Mar 28, 2015 at 11:13 AM, Arne Babenhauserheide arne_...@web.de wrote: I agree with Bombe that it’s not nice to lose the history, but with git that’s the best we can do. It’s a limitation of the tool. It's not a limitation of the tool, it's a limitation created by your desire to misuse the tool. Individual commits should not be reviewed, pull requests should. I'm really surprised that people are so resistant to this, this is widely accepted practice. It certainly works very well for my team (with a larger codebase than Freenet). We've never had any of the problems you guys seem to be so concerned about. If the pull request is too big to be reviewed, then it should have been broken into small pull requests. Typically a pull request should represent *at most* 4 days worth of work. Ian. ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
Am Samstag, 28. März 2015, 02:57:48 schrieb Steve Dougherty: # Authors squash commits into high-level changes making up final version; review pull request as commits. My stance is: Keep the commits if they are clean enough to be reviewed, refactor them mercilessly if that’s necessary to avoid making review a bottleneck. If a new contributor provides 5 commits where three would suffice, but it’s easy to merge (and asking to fix it would take more time than the reviewing takes), just merge them. If it’s a complex change and reviewing it is hard, then require the author of the pull request to make it easy to review by using the commits as second hierarchy. I agree with Bombe that it’s not nice to lose the history, but with git that’s the best we can do. It’s a limitation of the tool. But even though I would love to see Freenet use Mercurial by default (in which this problem can be resolved with the evolve extension¹), I think that it would be hard to change. So we’re stuck with that limitation and have to employ it as efficiently as we can - even if that means compromising the clean history and unchanging hisotry ideology. Best wishes, Arne ¹: The evolve extension of Mercurial has the concepts of hidden changesets with rewrite-markers, so the history which is shown by default is clean, but you can always inquire how it came into being. signature.asc Description: This is a digitally signed message part. ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On 03/27/2015 08:05 PM, Ian wrote: On Wed, Mar 25, 2015 at 12:55 PM, David Roden bo...@pterodactylus.net wrote: On 25.03.2015, at 06:14, Steve Dougherty st...@asksteved.com wrote: My inclination is to merge the unit of review. If the diff can be easily described in a single commit summary, the pull request is squashed and merged as a single commit. I really wish people would stop doing this. Commits are atomic units of work, building on one another (unless the person creating them is rather unorganized or is mixing several things in which case a pull request should probably also not be merged until those situations are cleaned up). Squashing commits destroys valuable information for no practical reason. I completely agree. It serves no useful purpose, and is entirely contrary to widely established best-practices for using Github. The intent of squashing commits is not to destroy useful history. It's to make history easier to read and allow reviewing changes incrementally by presenting them in high-level commits which reflect the final state of the code which passes review. To be more specific, this is about turning something with changes made during development / code review like Clarify test reasoning Add more tests for new feature Fix NPE in new feature Fix inverted branch condition in new feature Add tests for new feature Refactor support classes and new feature Add exciting new feature Add support needed by new feature into Add tests for new feature Add exciting new feature Add support needed by new feature not into Add exciting new feature It is my opinion that it is not worth maintaining in repo history a distinction between an initial implementation and changes made during code review or refinement. Of course only feature / bugfix branches are subject to squashing - next/master/release branches are not. git supports this workflow - git commit has --fixup and --squash. We use this at my job - a reviewer who is up to date can see the incremental part in response to review, and a reviewer coming up to date can rebase autosquash before reading. We seem to have arrived at two approaches to this. I'm willing to go with either at this point, and I'd appreciate more perspectives. We'll update CONTRIBUTING.md with what we agree on. # Review pull request as a diff; merge without squashing. David and Ian favor this. Benefits: * Common use of GitHub pull requests. * Does not require the effort / complication of squashing commits. Disadvantages: * Keeps changes made toward the final version of the pull request in the repo history - harder to review later. * Difficult to review incrementally. This prevents working with very large changes, not that we want them anyway. It displays changes in multiple layers simultaneously. * Hard to review commit messages. # Authors squash commits into high-level changes making up final version; review pull request as commits. Arne and I favor this. Benefits: * Easy to review in repo history - high-level context is with each commit; avoids storing incremental changes toward final version in the stable repo history. * Allows reviewing changes incrementally - commits can be ordered and given messages to aid understanding; diff ordering is not customizable. Disadvantages: * Atypical use of GitHub pull requests - higher bar of entry for larger changes. * More effort / complication for authors. Is this an accurate summary? I'm not clear on the opinions of Florent, Matthew, Roland, (is Bert around?) or others on these. Thoughts? Thanks, Steve signature.asc Description: OpenPGP digital signature ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On 03/22/2015 06:21 PM, Roland Haeder wrote: On Sun, 22 Mar 2015 21:40:41 + Matthew Toseland mj...@cam.ac.uk wrote: On 22/03/15 19:22, Ian Clarke wrote: Unfortunately, sometimes it is necessary to make big changes. One good example is the initial import of all files for a new project. There you commit a huge chunk of files and directories. As I'm (different project) a newcomer to a project's development, I had started cloning the original repository (creating a personal clone) and started developing. I did this by committing carefully and I tried to avoid huge changes (not always possible, more below) to let them easily review them. Then I stepped towards them and my commits got rejected because I didn't make a feature branch or fix branch ... :-( That is kind of frustrating because my commits won't make it in. Oh no! :( That shouldn't have happened - that does not strike me as reason alone to reject a pull request. Do you have a link to it? What I mean here is that the maintainer should not be so bureaucratic to newcomers as this could turn them away and they may never come back. Indeed. I've been making an effort to be more flexible like https://emu.freenetproject.org/pipermail/devl/2015-February/037983.html signature.asc Description: OpenPGP digital signature ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On Sun, 2015-03-22 at 14:11 -0500, Ian wrote: On Sun, Mar 22, 2015 at 1:36 AM, Florent Daigniere nextg...@freenetproject.org wrote: On Sun, 2015-03-22 at 00:42 +, Matthew Toseland wrote: On 21/03/15 20:49, Ian Clarke wrote: On Sat, Mar 21, 2015 at 1:58 PM, Matthew Toseland matt...@toselandcs.co.uk wrote: Most of the above boils down to review the diff, not the history. That is probably sensible. That's part of it, but also that a branch should be created for each bugfix/feature, which ideally should be as small a unit of work as possible (that can be merged without breaking stuff). It only is if the diff is of reasonable size... which it is most of the time, *except* when it comes from paid devs. Then that needs to change. With a feature/bugfix-per-branch approach the diffs should be reviewable in just a few minutes. If they aren't, then the coder is probably stuffing multiple features/bugfixes into a single branch and that's contrary to the guidelines I've outlined. Whatever happened in the past, what matters is what we do going forward. Agreed. They code in their cave for months, drop a 100kloc diff doing way more than a single feature/bugfix onto the maintainer and expect it to be merged; I'm sure that refactoring is good but not when it's forked off for 6month... This is the problem we had recently with both toad's and xor's code. We're talking about dozens of features and bugfixes AND refactoring. Then that needs to change going forward. Agreed. Whether there's a single change per commit/branch/pull request doesn't matter as long as one of them is enforced. The way Github's code review tools are designed, you definitely want it to be per branch (which will have a 1-1 relationship with a pull request), not per commit. No disagreement here either. I'd argue that for us git and github are the wrong tools but I completely agree with your analysis of how we're mis-using them :) Until now the base unit was supposed to be one change per commit; I'm all for changing it to something else but that won't solve the problem unless it's enforced strictly. You can't impose processes on people, they need to agree to them or it won't work. That being said, I don't know why any reasonable person wouldn't agree to what I've outlined. It's a tried and tested approach. OSS projects do; when maintainers don't like the code they just don't merge it (and that might leads to forks and that's perfectly fine). This is what's happening now and part of why we're in limbo. What's sad is that most of the problems are from the code *paid* devs are producing. Some might argue that it's because they're producing more than volunteers but I don't think so. I believe that their evaluation / incentives model needs to change for their behaviour to adapt. Maybe it's time to reconsider bounties (pay per feature/bugfix). Florent signature.asc Description: This is a digitally signed message part ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On Sun, Mar 22, 2015 at 4:39 PM, Ian Clarke i...@freenetproject.org wrote: That is precisely how every mature project works. Just because neither your business ventures nor your involvement in open source are mature doesn't mean there aren't projects out there - both open source (Linux, Wine) and businesses (Steve's employer, presumably Google, etc) - which care about code quality. Wow. Why would you say something so stupid in a forum where future employers could potentially see it? You forget that I've seen your code, it wouldn't come anywhere close to passing a code review by even the least experienced developer on my company's engineering team, so please don't presume to lecture me about code quality, or how mature software projects work. I really hope they teach you how to be less of an asshole during your computer science course, or you'll have a very hard time getting or keeping a job, either commercial or in academia. Your previous paragraph would be a firing offense at most companies. Professional engineering teams simply don't tolerate asshole behavior any more. Ugh, this interaction was exactly the type of thing we need to avoid. Attacking each-other's competence serves no useful purpose whatsoever. Ian. -- Founder, The Freenet Project Email: i...@freenetproject.org ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On Mon, Mar 23, 2015 at 5:54 AM, Florent Daigniere nextg...@freenetproject.org wrote: On Sun, 2015-03-22 at 14:11 -0500, Ian wrote: The way Github's code review tools are designed, you definitely want it to be per branch (which will have a 1-1 relationship with a pull request), not per commit. No disagreement here either. I'd argue that for us git and github are the wrong tools but I completely agree with your analysis of how we're mis-using them :) I'm not sure, I haven't had experience with any other code review tool, but they work very well for us at my day-job, the entire team (9 experienced software engineers) seems happy with them. You can't impose processes on people, they need to agree to them or it won't work. That being said, I don't know why any reasonable person wouldn't agree to what I've outlined. It's a tried and tested approach. OSS projects do; when maintainers don't like the code they just don't merge it (and that might leads to forks and that's perfectly fine). This is what's happening now and part of why we're in limbo. In the early years of the project Freenet had a fairly liberal attitude to contributions, trust but verify. I did my best to minimize red-tape for developers. This was beneficial because it meant there was a low barrier to entry for volunteers. If we didn't have this approach, I doubt many of the earliest contributors, people like Oskar Sandberg and Scott Miller, would have become involved (both started with very minor contributions). And as I mentioned previously, I just don't think we have the manpower for formalized gatekeepers (ie. we're not the Linux Kernel), so it's simply not an option to be that rigid. What's sad is that most of the problems are from the code *paid* devs are producing. Some might argue that it's because they're producing more than volunteers but I don't think so. I think we need to avoid personal criticisms of people (yes, I know that I attacked Toad earlier in this thread, but I was provoked - I should have taken the high road, I'd had a few beers at that point). I believe that their evaluation / incentives model needs to change for their behaviour to adapt. Maybe it's time to reconsider bounties (pay per feature/bugfix). I'm not sure about that. Monetary motivation works well for salespeople, I don't think it works well for engineers. It would also be a significant amount of work to administer, and I just don't think we have the manpower to do that. It could also result in a bias towards work on outwardly visible bells and whistles, and against internal stuff like refactoring. Ian. -- *Ian Clarke* / Co-Founder CTO *OneSpot, Inc* Email: i...@onespot.com Web: http://www.onespot.com Personal Blog: http://blog.locut.us/ LinkedIn: http://www.linkedin.com/in/iancjclarke Twitter: http://twitter.com/sanity ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On 22/03/15 21:39, Ian Clarke wrote: On Sun, Mar 22, 2015 at 4:20 PM, Matthew Toseland mj...@cam.ac.uk wrote: But regardless, nothing in the process I've outlined would inhibit correcting problems like this. Ideally they'd be corrected at the code review stage, but if they make it past that then they can be corrected with a new branch, just like any other bugfix. No, they can't. Removing binaries (or worse, copyright-infringing files) requires rewriting the history. Why would removing a binary require rewriting the history, Because we don't want those files in the repository at all. We don't want everyone to have to download jars which are many years out of date and aren't used anyway when they clone the repository. and why would anyone commit a copyright-infringing file? Can happen with installers. Can happen with people taking short-cuts, although that's less likely as Freenet is fairly specialised. Can happen as a deliberate attempt to disrupt. Can even happen by accident. In my years of using Github I have never ever been in a situation that necessitated rewriting history. The possibility that someone might do something really dumb and easily avoidable should not dictate our code review policy. Everyone does this when they first use Git. For example, most of the members of my group project this year included multiple copies of huge third party javascript libraries, and even jars. Some of our contributors will be on that level - they won't initially know any better. This is simply because of lack of experience. If we have volunteers, many of them will have limited experience. This may also be true of paid developers, and certainly of most GSoC interns. 2. Disruptive changes to APIs. This has also happened. Especially if the description is incomplete, there is a significant risk of refactoring breaking other code (e.g. plugins; this was part of the problem with Xor's changes), introducing new APIs that don't make sense etc. Both code review and a decent continuous integration system should address this. This is a good continuous integration service that is free for OSS projects: https://travis-ci.org/ You can see how I use it on this other OSS project of mine: http://quickml.org/ Not all problems can be detected by automated tools. Sometimes it is necessary to change classes that are used by plugins, and some of those plugins are unofficial, i.e. third party. Yes, just like any other software project. What would make us any different, and why would that necessitate deviating from the process I've proposed (which is pretty-much standard practice among well-run dev teams already)? Your point was that continuous integration can detect all problems, so advance code review is unnecessary. This is not true. CI is worth a lot, but it doesn't solve all the problems being addressed here. And it is certainly not true that all well-run development teams accept unreviewed code! Wine doesn't, Linux doesn't, Steve's company doesn't, and I'd be pretty amazed if Google did. As for the rest, I will not be drawn into personal attacks. I care about Freenet and would like it to succeed, regardless of whether I am paid to work for it. Some of its goals may be naive in the modern world, but there is much potential here. Sadly I don't have time to volunteer at the moment; I'm only involved now because a major crisis has erupted and I had hoped I could contribute something. signature.asc Description: OpenPGP digital signature ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On Sun, Mar 22, 2015 at 4:20 PM, Matthew Toseland mj...@cam.ac.uk wrote: But regardless, nothing in the process I've outlined would inhibit correcting problems like this. Ideally they'd be corrected at the code review stage, but if they make it past that then they can be corrected with a new branch, just like any other bugfix. No, they can't. Removing binaries (or worse, copyright-infringing files) requires rewriting the history. Why would removing a binary require rewriting the history, and why would anyone commit a copyright-infringing file? In my years of using Github I have never ever been in a situation that necessitated rewriting history. The possibility that someone might do something really dumb and easily avoidable should not dictate our code review policy. 2. Disruptive changes to APIs. This has also happened. Especially if the description is incomplete, there is a significant risk of refactoring breaking other code (e.g. plugins; this was part of the problem with Xor's changes), introducing new APIs that don't make sense etc. Both code review and a decent continuous integration system should address this. This is a good continuous integration service that is free for OSS projects: https://travis-ci.org/ You can see how I use it on this other OSS project of mine: http://quickml.org/ Not all problems can be detected by automated tools. Sometimes it is necessary to change classes that are used by plugins, and some of those plugins are unofficial, i.e. third party. Yes, just like any other software project. What would make us any different, and why would that necessitate deviating from the process I've proposed (which is pretty-much standard practice among well-run dev teams already)? I'm not sure I understand your point. That seems very unlikely to happen, we barely have budget for actual developers, let alone dedicated QA people. And really, who'd want that job anyway? Even setting aside budget and finding suitable people, coders should have primary responsibility for ensuring that their code is good. It's unhealthy to have a situation where coders think that ensuring their code is clean and robust is someone else's problem. That is precisely how every mature project works. Just because neither your business ventures nor your involvement in open source are mature doesn't mean there aren't projects out there - both open source (Linux, Wine) and businesses (Steve's employer, presumably Google, etc) - which care about code quality. Wow. Why would you say something so stupid in a forum where future employers could potentially see it? You forget that I've seen your code, it wouldn't come anywhere close to passing a code review by even the least experienced developer on my company's engineering team, so please don't presume to lecture me about code quality, or how mature software projects work. I really hope they teach you how to be less of an asshole during your computer science course, or you'll have a very hard time getting or keeping a job, either commercial or in academia. Your previous paragraph would be a firing offense at most companies. Professional engineering teams simply don't tolerate asshole behavior any more. Ian. -- Ian Clarke Founder, The Freenet Project Email: i...@freenetproject.org ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On Sat, Mar 21, 2015 at 7:42 PM, Matthew Toseland matt...@toselandcs.co.uk wrote: On 21/03/15 20:49, Ian Clarke wrote: That's part of it, but also that a branch should be created for each bugfix/feature, which ideally should be as small a unit of work as possible (that can be merged without breaking stuff). Absolutely. See e.g.: http://nvie.com/posts/a-successful-git-branching-model/ We try to stick to this ... or at least, that was the plan/consensus! Hmm, it doesn't sound like we do in practice, given that I believe there were pull requests with hundreds of commits which sounded like it was for much more than a single feature or bugfix. Anyway, the point isn't to complain about what happened in the past, just to agree on our process going forward. The problem is then we spend a lot of time undoing stuff. This can and does happen, and has happened, even relatively recently. It's part of the reason why most projects use a branch-and-merge model, rather than giving all the devs push rights. Ian. Why we might need to revert or reject changes: 0. Stupid stuff. E.g. committing jars to repositories. Committing jars to repositories is kind of careless, don't people have sensible .gitignore files? Can a .gitignore be committed to the repo? But regardless, nothing in the process I've outlined would inhibit correcting problems like this. Ideally they'd be corrected at the code review stage, but if they make it past that then they can be corrected with a new branch, just like any other bugfix. 2. Disruptive changes to APIs. This has also happened. Especially if the description is incomplete, there is a significant risk of refactoring breaking other code (e.g. plugins; this was part of the problem with Xor's changes), introducing new APIs that don't make sense etc. Both code review and a decent continuous integration system should address this. This is a good continuous integration service that is free for OSS projects: https://travis-ci.org/ You can see how I use it on this other OSS project of mine: http://quickml.org/ I agree that review capacity is potentially a bottleneck. There are different ways to solve this: - Have paid staff who review and merge stuff. That seems very unlikely to happen, we barely have budget for actual developers, let alone dedicated QA people. And really, who'd want that job anyway? Even setting aside budget and finding suitable people, coders should have primary responsibility for ensuring that their code is good. It's unhealthy to have a situation where coders think that ensuring their code is clean and robust is someone else's problem. - Don't accept pull requests if nobody can review them right now. This will absolutely cause a severe bottleneck, causing development to grind to a halt, and probably destroying morale in the process. Who wants to work their ass of on some code only for it to sit in a branch indefinitely? - Allow the reviewers to make reasonable demands for clear code. A pull request is a negotiation between the contributor and the maintainer. Of course, reviewers can point out unclean code, and a conscientious developer will want to commit good code so they'll fix it. If a coder is ignoring reasonable code review feedback then that might be grounds for removing commit access. Ian. ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On Sun, Mar 22, 2015 at 6:44 AM, Arne Babenhauserheide arne_...@web.de wrote: Am Samstag, 21. März 2015, 12:45:39 schrieb Ian Clarke: It sounds like people are trying to use commits for code review, whereas they should be using Github pull requests. For huge changes we are using commits as a second level hierarchy: When the diff is too big to understand on its own, then the commits have to form an easy to follow story which allows understanding the change step by step. I think you've misidentified the problem and therefore the solution. The problem is that diffs would ever be too big to understand on their own, therefore the solution is that the diff should never be too big to understand on it's own. A branch/pull-request should rarely represent more than a few days of work, and thus should rarely take more than a few minutes to review. As such they unpaid devs cannot just sit down for 6 hours and read through a huge diff to understand it in its entirety. They need the diff more structured. A 6 hour code review is insane, and should never be necessary. - For any isolatable feature or bugfix, create a new branch just for that feature or bug request (perhaps put the bug id # in the name of the branch). *Do not combine multiple features or bugfixes into a single branch.* If it can be merged independently, it should have it's own branch. We get into a problem with this, when the changes get too extensive without being ready for merging into a release. It should never be necessary for changes to be that extensive. In my day-job we're working on a far more complex codebase than Freenet, and yet a branch/pull-request almost *never* represents more than 4 days of work, and therefore code reviews rarely require more than 10-20 minutes. Your scheme provides fast development of individual features, but does not provide information spread within the group: Only one person has seen the current version of the code. I don't understand that, anyone that wants to is free to look at the code/pull-requests. Longterm this needs to a situation where no one can understand the whole code well enough to change it efficiently, which in turn leads to pseudo-ownership over certain sub-sections of the code. If we're producing code that isn't comprehensible without external explanation, then we're not producing good code. Well written code shouldn't require an explanation, it should be self-explanatory. This book lays out some excellent guidelines for this kind of code, every programmer should read it: http://amzn.com/0132350882 Ian. -- Ian Clarke Founder, The Freenet Project Email: i...@freenetproject.org ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
Sure, but let's not get bogged down with complaining about the past, we all know it's been dysfunctional, let's focus on what we're going to do from now on. Ian. On Sun, Mar 22, 2015 at 2:06 AM, Florent Daigniere nextg...@freenetproject.org wrote: On Sun, 2015-03-22 at 07:36 +0100, Florent Daigniere wrote: On Sun, 2015-03-22 at 00:42 +, Matthew Toseland wrote: On 21/03/15 20:49, Ian Clarke wrote: On Sat, Mar 21, 2015 at 1:58 PM, Matthew Toseland matt...@toselandcs.co.uk wrote: Most of the above boils down to review the diff, not the history. That is probably sensible. That's part of it, but also that a branch should be created for each bugfix/feature, which ideally should be as small a unit of work as possible (that can be merged without breaking stuff). It only is if the diff is of reasonable size... which it is most of the time, *except* when it comes from paid devs. They code in their cave for months, drop a 100kloc diff doing way more than a single feature/bugfix onto the maintainer and expect it to be merged; I'm sure that refactoring is good but not when it's forked off for 6month... This is the problem we had recently with both toad's and xor's code. We're talking about dozens of features and bugfixes AND refactoring. What blocks development is that the refactoring isn't merged until their work is ready... and there's usually a good reason for it; they code first and think later (to be fair it's a common trait amongst devs working alone) which means that as long as they haven't tried it out they're not quite sure of what they want in terms of API... so it's only in a mergeable state when it works. Whether there's a single change per commit/branch/pull request doesn't matter as long as one of them is enforced. Until now the base unit was supposed to be one change per commit; I'm all for changing it to something else but that won't solve the problem unless it's enforced strictly. Florent Just to put some perspective on what I've written above: Here's toad's diff: https://github.com/freenet/fred/pull/287/files Here's xor's diff: https://github.com/freenet/fred/pull/319/files in both cases github gives up rendering it: Sorry, we could not display the changes to this file because there were too many other changes to display. and doesn't even display the commit count: This pull request is big! We're only showing the most recent 250 commits. As for the individual commits, there's plenty which are doing more than a bugfix/feature/single semantic change. Florent -- Ian Clarke Founder, The Freenet Project Email: i...@freenetproject.org ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On 22/03/15 19:22, Ian Clarke wrote: I think you've misidentified the problem and therefore the solution. The problem is that diffs would ever be too big to understand on their own, therefore the solution is that the diff should never be too big to understand on it's own. A branch/pull-request should rarely represent more than a few days of work, and thus should rarely take more than a few minutes to review. ... It should never be necessary for changes to be that extensive. In my day-job we're working on a far more complex codebase than Freenet, and yet a branch/pull-request almost *never* represents more than 4 days of work, and therefore code reviews rarely require more than 10-20 minutes. Unfortunately, sometimes it is necessary to make big changes. For example, my work last summer: We can't keep on using Db4o because it's been discontinued. We can't easily move to another object database because they are not standardised. In any case we need to rewrite it to a simpler structure for stability and performance reasons. But this meant rewriting a lot of core code, adding necessary infrastructure, and then implementing backward compatibility. Since this will break APIs, we may as well fix some of the other problems at the same time. Now, if I'd had time, I would have broken this up into a series of pull requests (or big commits) to be reviewed separately. However, most of them cannot be merged independantly - either they introduce new functionality that isn't used (so why should it be in the codebase?), or they break backwards compatibility (which is a largish chunk of work itself). It appears that time was a factor in Xor's case too - he was worried about starting his thesis soon. I believe Ian's answer will be that time is always a factor and the correct solution is to cut corners and hope for the best - we will clean up the code some day. Some day never comes, because there's always another deadline (for employees)... IMHO this is a management style which does not work with the kind of code that Freenet is, that modern OSS is, that Google's infrastructure is. It works well enough with code that doesn't need to be maintained, and it just about works if you can rewrite it completely every year, or ship it once and forget it. Re volunteers, if they know what the rules are and if maintainers respond quickly there's a good chance that they will help to solve the problems, and the code will get merged. Longterm this needs to a situation where no one can understand the whole code well enough to change it efficiently, which in turn leads to pseudo-ownership over certain sub-sections of the code. If we're producing code that isn't comprehensible without external explanation, then we're not producing good code. Well written code shouldn't require an explanation, it should be self-explanatory. This book lays out some excellent guidelines for this kind of code, every programmer should read it: http://amzn.com/0132350882 Yes, and allowing everyone to commit with no supervision is a perfect recipe for the sort of spaghetti code that we would like to avoid. signature.asc Description: OpenPGP digital signature ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On Sun, Mar 22, 2015 at 1:36 AM, Florent Daigniere nextg...@freenetproject.org wrote: On Sun, 2015-03-22 at 00:42 +, Matthew Toseland wrote: On 21/03/15 20:49, Ian Clarke wrote: On Sat, Mar 21, 2015 at 1:58 PM, Matthew Toseland matt...@toselandcs.co.uk wrote: Most of the above boils down to review the diff, not the history. That is probably sensible. That's part of it, but also that a branch should be created for each bugfix/feature, which ideally should be as small a unit of work as possible (that can be merged without breaking stuff). It only is if the diff is of reasonable size... which it is most of the time, *except* when it comes from paid devs. Then that needs to change. With a feature/bugfix-per-branch approach the diffs should be reviewable in just a few minutes. If they aren't, then the coder is probably stuffing multiple features/bugfixes into a single branch and that's contrary to the guidelines I've outlined. Whatever happened in the past, what matters is what we do going forward. They code in their cave for months, drop a 100kloc diff doing way more than a single feature/bugfix onto the maintainer and expect it to be merged; I'm sure that refactoring is good but not when it's forked off for 6month... This is the problem we had recently with both toad's and xor's code. We're talking about dozens of features and bugfixes AND refactoring. Then that needs to change going forward. Whether there's a single change per commit/branch/pull request doesn't matter as long as one of them is enforced. The way Github's code review tools are designed, you definitely want it to be per branch (which will have a 1-1 relationship with a pull request), not per commit. Until now the base unit was supposed to be one change per commit; I'm all for changing it to something else but that won't solve the problem unless it's enforced strictly. You can't impose processes on people, they need to agree to them or it won't work. That being said, I don't know why any reasonable person wouldn't agree to what I've outlined. It's a tried and tested approach. Ian. ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On 22/03/15 19:07, Ian wrote: On Sat, Mar 21, 2015 at 7:42 PM, Matthew Toseland matt...@toselandcs.co.uk wrote: 0. Stupid stuff. E.g. committing jars to repositories. Committing jars to repositories is kind of careless, don't people have sensible .gitignore files? Can a .gitignore be committed to the repo? But regardless, nothing in the process I've outlined would inhibit correcting problems like this. Ideally they'd be corrected at the code review stage, but if they make it past that then they can be corrected with a new branch, just like any other bugfix. No, they can't. Removing binaries (or worse, copyright-infringing files) requires rewriting the history. Rewriting the history is *BAD*. It makes it impossible for a third party to follow what is going on. It makes life extremely difficult/messy for our own developers, breaking their unfinished work when they pull from master. 2. Disruptive changes to APIs. This has also happened. Especially if the description is incomplete, there is a significant risk of refactoring breaking other code (e.g. plugins; this was part of the problem with Xor's changes), introducing new APIs that don't make sense etc. Both code review and a decent continuous integration system should address this. This is a good continuous integration service that is free for OSS projects: https://travis-ci.org/ You can see how I use it on this other OSS project of mine: http://quickml.org/ Not all problems can be detected by automated tools. Sometimes it is necessary to change classes that are used by plugins, and some of those plugins are unofficial, i.e. third party. I agree that review capacity is potentially a bottleneck. There are different ways to solve this: - Have paid staff who review and merge stuff. That seems very unlikely to happen, we barely have budget for actual developers, let alone dedicated QA people. And really, who'd want that job anyway? Even setting aside budget and finding suitable people, coders should have primary responsibility for ensuring that their code is good. It's unhealthy to have a situation where coders think that ensuring their code is clean and robust is someone else's problem. That is precisely how every mature project works. Just because neither your business ventures nor your involvement in open source are mature doesn't mean there aren't projects out there - both open source (Linux, Wine) and businesses (Steve's employer, presumably Google, etc) - which care about code quality. I repeat my earlier point: If we allow everyone to commit without review, we will spend a lot of our time (and commit history) undoing stuff. This is one of the reasons behind the rise of distributed version control in the first place. Git, and Github, are specifically designed to support pull requests. The implication is that the person merging the changes is not the same as the person posting them, and that they should be signed-off, i.e. reviewed. - Don't accept pull requests if nobody can review them right now. This will absolutely cause a severe bottleneck, causing development to grind to a halt, and probably destroying morale in the process. Who wants to work their ass of on some code only for it to sit in a branch indefinitely? - Allow the reviewers to make reasonable demands for clear code. A pull request is a negotiation between the contributor and the maintainer. Of course, reviewers can point out unclean code, and a conscientious developer will want to commit good code so they'll fix it. If a coder is ignoring reasonable code review feedback then that might be grounds for removing commit access. We are not using SVN. We are not using CVS. NOBODY uses Git and then gives everyone push access. It just doesn't make sense. The correct, standard workflow used by just about everyone is to fork and post a pull request, which may or may not be merged. I will reply to your points about big changes separately. signature.asc Description: OpenPGP digital signature ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On 22/03/15 22:21, Roland Haeder wrote: On Sun, 22 Mar 2015 21:40:41 + Matthew Toseland mj...@cam.ac.uk wrote: On 22/03/15 19:22, Ian Clarke wrote: Unfortunately, sometimes it is necessary to make big changes. One good example is the initial import of all files for a new project. There you commit a huge chunk of files and directories. As I'm (different project) a newcomer to a project's development, I had started cloning the original repository (creating a personal clone) and started developing. I did this by committing carefully and I tried to avoid huge changes (not always possible, more below) to let them easily review them. Then I stepped towards them and my commits got rejected because I didn't make a feature branch or fix branch ... :-( That is kind of frustrating because my commits won't make it in. What I mean here is that the maintainer should not be so bureaucratic to newcomers as this could turn them away and they may never come back. The maintainer can then still review commits relatively easily by setting up a merge branch git checkout master # or wherever your development branch is git remote add rolands_code git://git.bla.example/some/foo.git --track master git checkout -b code_reviewed_okay/master git checkout master git checkout -b rolands_master git pull rolands_code This way you can check your code with changes from developer roland (yes, my name) without mixing it so quickly with your own code. Then start the review process by looking at each commit and cherry-pick it into code_reviewed_okay/master branch: git checkout code_reviewed_okay/master git cherry-pick xx git checkout rolands_master Then you can check them again (on functionality) in the review branch (as not all commits from rolands_master are in) and continue with next one. Maybe there is a better way. :-) But this way may work. After that merge all reviewed commits into master: git checkout master git merge -S code_reviewed_okay/master That seems to be a fine way to me. My 2 cents. :-) Your code should not be rejected because it is on master on your own repository IMHO. However, if you are going to make multiple pull requests you really need to use separate branches. If you have several different changes on the same branch then you definitely need to separate them out before making a pull request. However, this is pretty easy using git branch, git reset and git cherry-pick. signature.asc Description: OpenPGP digital signature ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On Sun, 22 Mar 2015 22:27:24 + Matthew Toseland mj...@cam.ac.uk wrote: Your code should not be rejected because it is on master on your own repository IMHO. However, if you are going to make multiple pull requests you really need to use separate branches. If you have several different changes on the same branch then you definitely need to separate them out before making a pull request. However, this is pretty easy using git branch, git reset and git cherry-pick. Okay, I will try it then later. Thank you. :-) signature.asc Description: PGP signature ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On Sun, 22 Mar 2015 21:40:41 + Matthew Toseland mj...@cam.ac.uk wrote: On 22/03/15 19:22, Ian Clarke wrote: Unfortunately, sometimes it is necessary to make big changes. One good example is the initial import of all files for a new project. There you commit a huge chunk of files and directories. As I'm (different project) a newcomer to a project's development, I had started cloning the original repository (creating a personal clone) and started developing. I did this by committing carefully and I tried to avoid huge changes (not always possible, more below) to let them easily review them. Then I stepped towards them and my commits got rejected because I didn't make a feature branch or fix branch ... :-( That is kind of frustrating because my commits won't make it in. What I mean here is that the maintainer should not be so bureaucratic to newcomers as this could turn them away and they may never come back. The maintainer can then still review commits relatively easily by setting up a merge branch git checkout master # or wherever your development branch is git remote add rolands_code git://git.bla.example/some/foo.git --track master git checkout -b code_reviewed_okay/master git checkout master git checkout -b rolands_master git pull rolands_code This way you can check your code with changes from developer roland (yes, my name) without mixing it so quickly with your own code. Then start the review process by looking at each commit and cherry-pick it into code_reviewed_okay/master branch: git checkout code_reviewed_okay/master git cherry-pick xx git checkout rolands_master Then you can check them again (on functionality) in the review branch (as not all commits from rolands_master are in) and continue with next one. Maybe there is a better way. :-) But this way may work. After that merge all reviewed commits into master: git checkout master git merge -S code_reviewed_okay/master That seems to be a fine way to me. My 2 cents. :-) signature.asc Description: PGP signature ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On 21/03/15 17:45, Ian Clarke wrote: Talking to a few people, I think our current approach to code review is problematic. For example, I've been told that some people are arguing that commits are too granular, and need to be combined to make code review easier. This is a mistake, there is nothing wrong with very granular commits, just as there is nothing wrong with more verbose code if it helps clarity. Pull requests should be used for code review, not individual commits. The number of individual commits should be irrelevant. It sounds like people are trying to use commits for code review, whereas they should be using Github pull requests. Here is a process that has worked well in my experience: - For any isolatable feature or bugfix, create a new branch just for that feature or bug request (perhaps put the bug id # in the name of the branch). *Do not combine multiple features or bugfixes into a single branch.* If it can be merged independently, it should have it's own branch. - Commits to this branch can be as granular as the programmer wants, generally the more frequent the commits the better - When the programmer is ready for feedback, they should create a pull request between this branch and the main branch - they could post the URL of the pull request to IRC to encourage feedback - It's fine for a programmer to request feedback before the feature is complete, but they should note this in the title of the pull request - eg. DO_NOT_MERGE - For code review, the most useful tab is Files changed - which shows all differences between the feature branch and the main branch (individual commits are irrelevant here) - comments can be added here - Once the feature is complete and the review feedback has been read and perhaps acted upon by the programmer working on the feature, it should be merged The person contributing the code is responsible for asking for and incorporating feedback, but they control the process, the process should not control them. So, for example, in some cases the programmer might decide that a code review is unnecessary. That will be their call. Better to trust human judgement over a rigid process. Thoughts? Ian. Most of the above boils down to review the diff, not the history. That is probably sensible. The last point is everyone can commit anything without review. That's the fundamental question here: Do we want to require that some responsible person (release manager, person with push rights) reviews and signs off on the code before it is pushed? There are 2 main reasons for this: 1. Security. How useful this is is debatable. 2. Disruptive changes to APIs. signature.asc Description: OpenPGP digital signature ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On 21/03/15 18:58, Matthew Toseland wrote: On 21/03/15 17:45, Ian Clarke wrote: Talking to a few people, I think our current approach to code review is problematic. For example, I've been told that some people are arguing that commits are too granular, and need to be combined to make code review easier. This is a mistake, there is nothing wrong with very granular commits, just as there is nothing wrong with more verbose code if it helps clarity. Pull requests should be used for code review, not individual commits. The number of individual commits should be irrelevant. It sounds like people are trying to use commits for code review, whereas they should be using Github pull requests. Here is a process that has worked well in my experience: - For any isolatable feature or bugfix, create a new branch just for that feature or bug request (perhaps put the bug id # in the name of the branch). *Do not combine multiple features or bugfixes into a single branch.* If it can be merged independently, it should have it's own branch. - Commits to this branch can be as granular as the programmer wants, generally the more frequent the commits the better - When the programmer is ready for feedback, they should create a pull request between this branch and the main branch - they could post the URL of the pull request to IRC to encourage feedback - It's fine for a programmer to request feedback before the feature is complete, but they should note this in the title of the pull request - eg. DO_NOT_MERGE - For code review, the most useful tab is Files changed - which shows all differences between the feature branch and the main branch (individual commits are irrelevant here) - comments can be added here - Once the feature is complete and the review feedback has been read and perhaps acted upon by the programmer working on the feature, it should be merged The person contributing the code is responsible for asking for and incorporating feedback, but they control the process, the process should not control them. So, for example, in some cases the programmer might decide that a code review is unnecessary. That will be their call. Better to trust human judgement over a rigid process. Thoughts? Ian. Most of the above boils down to review the diff, not the history. That is probably sensible. It implies that the RM's can reasonably ask for some sort of guide to the changes to make diff-level review of big patches easier. The last point is everyone can commit anything without review. That's the fundamental question here: Do we want to require that some responsible person (release manager, person with push rights) reviews and signs off on the code before it is pushed? There are 2 main reasons for this: 1. Security. How useful this is is debatable. 2. Disruptive changes to APIs. signature.asc Description: OpenPGP digital signature ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On Sat, Mar 21, 2015 at 1:58 PM, Matthew Toseland matt...@toselandcs.co.uk wrote: Most of the above boils down to review the diff, not the history. That is probably sensible. That's part of it, but also that a branch should be created for each bugfix/feature, which ideally should be as small a unit of work as possible (that can be merged without breaking stuff). The last point is everyone can commit anything without review. That's the fundamental question here: Do we want to require that some responsible person (release manager, person with push rights) reviews and signs off on the code before it is pushed? I think the question is moot, since (so far as I'm aware) we don't have anyone that can commit to reviewing all code reliably and quickly, so such a requirement would only serve to create a severe bottleneck in our development process. All commits are public, all commits can be reviewed by anyone, but in the event that nobody is in a position to review something we can't allow development to grind to a halt. If people care about reviewing code then they can and should review code. Ian. There are 2 main reasons for this: 1. Security. How useful this is is debatable. 2. Disruptive changes to APIs. ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl -- *Ian Clarke* / Co-Founder CTO *OneSpot, Inc* Email: i...@onespot.com Web: http://www.onespot.com Personal Blog: http://blog.locut.us/ LinkedIn: http://www.linkedin.com/in/iancjclarke Twitter: http://twitter.com/sanity ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
[resending with correct from: address] On Sat, Mar 21, 2015 at 1:58 PM, Matthew Toseland matt...@toselandcs.co.uk wrote: Most of the above boils down to review the diff, not the history. That is probably sensible. That's part of it, but also that a branch should be created for each bugfix/feature, which ideally should be as small a unit of work as possible (that can be merged without breaking stuff). The last point is everyone can commit anything without review. That's the fundamental question here: Do we want to require that some responsible person (release manager, person with push rights) reviews and signs off on the code before it is pushed? I think the question is moot, since (so far as I'm aware) we don't have anyone that can commit to reviewing all code reliably and quickly, so such a requirement would only serve to create a severe bottleneck in our development process. All commits are public, all commits can be reviewed by anyone, but in the event that nobody is in a position to review something we can't allow development to grind to a halt. If people care about reviewing code then they can and should review code. Ian. -- Ian Clarke Founder, The Freenet Project Email: i...@freenetproject.org ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Github and code review
On 21/03/15 20:49, Ian Clarke wrote: On Sat, Mar 21, 2015 at 1:58 PM, Matthew Toseland matt...@toselandcs.co.uk wrote: Most of the above boils down to review the diff, not the history. That is probably sensible. That's part of it, but also that a branch should be created for each bugfix/feature, which ideally should be as small a unit of work as possible (that can be merged without breaking stuff). Absolutely. See e.g.: http://nvie.com/posts/a-successful-git-branching-model/ We try to stick to this ... or at least, that was the plan/consensus! The last point is everyone can commit anything without review. That's the fundamental question here: Do we want to require that some responsible person (release manager, person with push rights) reviews and signs off on the code before it is pushed? I think the question is moot, since (so far as I'm aware) we don't have anyone that can commit to reviewing all code reliably and quickly, so such a requirement would only serve to create a severe bottleneck in our development process. All commits are public, all commits can be reviewed by anyone, but in the event that nobody is in a position to review something we can't allow development to grind to a halt. If people care about reviewing code then they can and should review code. After it's been committed. The problem is then we spend a lot of time undoing stuff. This can and does happen, and has happened, even relatively recently. It's part of the reason why most projects use a branch-and-merge model, rather than giving all the devs push rights. Ian. Why we might need to revert or reject changes: 0. Stupid stuff. E.g. committing jars to repositories. There are 2 main reasons for this: 1. Security. How useful this is is debatable. E.g. introducing a rootkit which will be run on developer's boxes by editing the build scripts, or the tests. It may be possible to prevent this via automated tools. Subtle changes and dirty tricks may be hard to detect, but some things *do* show up in a quick code review. 2. Disruptive changes to APIs. This has also happened. Especially if the description is incomplete, there is a significant risk of refactoring breaking other code (e.g. plugins; this was part of the problem with Xor's changes), introducing new APIs that don't make sense etc. I agree that review capacity is potentially a bottleneck. There are different ways to solve this: - Have paid staff who review and merge stuff. - Don't accept pull requests if nobody can review them right now. - Allow the reviewers to make reasonable demands for clear code. A pull request is a negotiation between the contributor and the maintainer. Many OSS projects use processes like this... signature.asc Description: OpenPGP digital signature ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl