Re: [Numpy-discussion] let's use patch review

2008-05-16 Thread Stéfan van der Walt
2008/5/16 Anne Archibald [EMAIL PROTECTED]:
 How frequently does numpy receive patches that warrant review? The
 zillion little doc fixes don't, even moderate-sized patches from
 experienced developers probably don't warrant review.

Those moderately-sized patches are the ones that need review,
especially.  Review provides useful information on a couple of levels:

a) Motivation -- why do we want/need this patch
b) Functionality -- does it do what the developer intended it to
c) Implementation -- is it written according to current best practices

Level (a) is normally discussed on the mailing list, if needed.  Level
(b) is covered by unit tests, *if* those were written.  Then, level
(c) is where the main advantage lies: we can learn from one another
how to develop better code.

I am somewhat split in two on this one.  I love the idea of patch
review; it undoubtedly raises the quality of the codebase.  That said,
it comes at a cost in developer time, and I'm not sure we have that
luxury (we don't have a Michael Abshoff, unfortunately).  Making it
optional might be a good compromise, although the person who wrote a
patch isn't the best one to judge whether it should be reviewed (of
course, we all think our code is good!).

Regards
Stéfan
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] let's use patch review

2008-05-15 Thread David Huard
2008/5/14 David Cournapeau [EMAIL PROTECTED]:

 On Wed, 2008-05-14 at 13:58 -1000, Eric Firing wrote:
 
  What does that mean?  How does one know when there is a consensus?

 There can be a system to make this automatic. For example, the code is
 never commited directly to svn, but to a gatekeeper, and people vote by
 an email command to say if they want the patch in; when the total number
 of votes is above some threshold, the gatekeeper commit the patch.


There is about 5 commits/day, I don't think it's a good idea to wait for a
vote on each one of them.



 David

 ___
 Numpy-discussion mailing list
 Numpy-discussion@scipy.org
 http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] let's use patch review

2008-05-15 Thread Ondrej Certik
On Thu, May 15, 2008 at 3:08 PM, David Huard [EMAIL PROTECTED] wrote:
 2008/5/14 David Cournapeau [EMAIL PROTECTED]:

 On Wed, 2008-05-14 at 13:58 -1000, Eric Firing wrote:
 
  What does that mean?  How does one know when there is a consensus?

 There can be a system to make this automatic. For example, the code is
 never commited directly to svn, but to a gatekeeper, and people vote by
 an email command to say if they want the patch in; when the total number
 of votes is above some threshold, the gatekeeper commit the patch.

 There is about 5 commits/day, I don't think it's a good idea to wait for a
 vote on each one of them.

Me neither. I think just one reviewer is enough.

Ondrej
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] let's use patch review

2008-05-15 Thread Francesc Alted
A Wednesday 14 May 2008, Ondrej Certik escrigué:
 Hi,

 I read the recent flamebate about unittests, formal procedures for a
 commit etc. and it was amusing. :)
 I think Stefan is right about the unit tests. I also think that
 Travis is right that there is no formal procedure that can assure
 what we want.
[snip]

For what is worth, Ivan and me were using patch peer review for more 
than a year in the PyTables project, and, although I was quite 
reluctant to adopt it at the begining, the reality is that it resulted 
in *much* better code quality to be added to the repository.

Here it is the strategy ww used:

0. A ticket is opened explaining the feature to add or the thing to fix.

1. The ticket taker (i.e. the responsible of adding the new code) 
creates a new branch (properly labeled) for the desired 
modification/patch.  The ticket is updated with a link to the new 
branch and the ownership of the ticket is transferred to the taker.

2. Once the modification is in the new branch, the ticket is updated 
explaining the actions done, and the ownership of the ticket 
transferred to the reviewer.

3. The peer reviews the code, and write suggestions in the same ticket 
about the new code.  If the reviewer doesn't feel the need to suggest 
anything, he will write this in the ticket also.  The ticket is 
transferred to the original author.

4. The original author should revise the suggestions of his peer, and if 
more actions are needed, he should address them.  After this, he will 
transfer the ticket to the peer for a new review.

5. Phase 3 and 4 are repeated until an agreement is held by both parts, 
and the discussion remains in the ticket (one can temporarily tranfer 
part of the ticket discussion to the mailing list, if necessary).

6. After the agreement, the original author commits the patch in the 
temporary code branch to the affected branches (normally trunk and the 
stable branch of the project) and removes the temporary branch.  The 
author has to tell explicitely in the ticket to which branches he has 
applied the new patch.

7. The ticket is closed.

Of course, this works great with a pair programming paradigm (as was the 
case for PyTables), but for a project as NumPy there are more 
developers than just a pair, so you should decide how to choose the 
reviewer.  One possibility is to form pairs by affinity, so that they 
can act normally together.  Another possibility would be to force all 
the developers to subscribe to the ticket mailing list, and, for each 
ticket that requires a peer review, a call should be sent in order to 
gather a reviewer (who can offer as a volunteer by adding a note to the 
ticket, for example).

I don't need to say that this procedure was not used for small or 
trivial changes (that were fixed directly), but only when the issue was 
important enough to deserve the attention of the mate.

My two cents,

-- 
Francesc Alted
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] let's use patch review

2008-05-15 Thread David Cournapeau
David Huard wrote:

 There is about 5 commits/day, I don't think it's a good idea to wait 
 for a vote on each one of them.

There is definitely a balance to find, and I am not convinced it would 
work well with subversion (it really makes sense to have those review 
with merge request, not per commit). For example, in scons, they have a 
fairly heavy review process which IMO prevents it from getting more 
contribution. In bzr, it works pretty well (they use a gateway system), 
but most main developers are paid for it.

Having a somewhat official review process would also help solving one of 
my problem with trac: when someone sends a patch on trac, we don't know 
it, we have to look for it, and some of them are lost/duplicated. 
Requesting unit tests for new contributors is too much, I think, I hate 
it for other projects where I am less involved than numpy/scipy. But say 
I have 20 minutes to spend on reviewing patches: with a system which a 
list of available patches, it would be easy to do so. Maybe it is 
possible with trac and I just missed it.

cheers,

David
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] let's use patch review

2008-05-15 Thread Anne Archibald
2008/5/15 Francesc Alted [EMAIL PROTECTED]:

 I don't need to say that this procedure was not used for small or
 trivial changes (that were fixed directly), but only when the issue was
 important enough to deserve the attention of the mate.

I think here's the rub: when I hear patch review system it sounds to
me like an obstacle course for getting code into the software. Maybe
it's justified, but I think at the moment there are many many things
that are just awaiting a little bit of attention from someone who
knows the code. A patch review system overapplied would multiply that
number by rather a lot.

How about a purely-optional patch review system? I've submitted
patches I wanted reviewed before they went in the trunk. As it was, I
didn't have SVN access, so I just posted them to trac or emailed them
to somebody, who then pondered and committed them. But a patch review
system - provided people were promptly reviewing patches - would have
fit the bill nicely.

How frequently does numpy receive patches that warrant review? The
zillion little doc fixes don't, even moderate-sized patches from
experienced developers probably don't warrant review. So in the last
while, what are changes that needed review, and what happened to them?

* financial code - email to the list, discussion, eventual inclusion
* matrix change - bug filed, substantial discussion, confusion about
consensus committed, further dicussion, consensus committed, users
complain, patch backed out and issue placed on hold
* MA change - developed independently, discussed on mailing list, committed
* histogram change - filed as bug, discussed on mailing list, committed
* median change - discussed on mailing list, committed
* .npy file format - discussed and implemented at a sprint, committed

Did I miss any major ones? Of course, svn log will give you a list of
minor fixes in the last few months.

It seems to me like the review process at the moment is just discuss
it on the mailing list. Tools to facilitate that would be valuable;
it would be handy to be able to point to a particular version of the
code somewhere on the Web (rather than just in patches attached to
email), for example.

Anne
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] let's use patch review

2008-05-15 Thread David Cournapeau
Anne Archibald wrote:
 I think here's the rub: when I hear patch review system it sounds to
 me like an obstacle course for getting code into the software. Maybe
 it's justified, but I think at the moment there are many many things
 that are just awaiting a little bit of attention from someone who
 knows the code. A patch review system overapplied would multiply that
 number by rather a lot.
   

I think in this discussion, it is easy to see the drawbacks, and not 
seeing the advantages (better code, etc...).

 How about a purely-optional patch review system? I've submitted
 patches I wanted reviewed before they went in the trunk. As it was, I
 didn't have SVN access, so I just posted them to trac or emailed them
 to somebody, who then pondered and committed them. But a patch review
 system - provided people were promptly reviewing patches - would have
 fit the bill nicely.
   

It is not much, but I've just created a trac report to see all (open) 
tickets with an attachment. What would be good would be to create a new 
ticket type, like patch, such as instead of seeing all attachments 
(including build log as now), we see real attachments.

I know next to nothing about databases and how complicated it would be 
to do it in trac, but I know other projects using trac have it, so it is 
doable.

cheers,

David
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] let's use patch review

2008-05-14 Thread Eric Firing
Ondrej Certik wrote:
 Hi,
 
 I read the recent flamebate about unittests, formal procedures for a
 commit etc. and it was amusing. :)
 I think Stefan is right about the unit tests. I also think that Travis
 is right that there is no formal procedure that can assure what we
 want.
 
 I think that a solution is a patch review. Every big/succesful project
 does it. And the workflow as I see it is this:

Are you sure numpy is big enough that a formal mechanism is needed--for 
everyone?  It makes good sense for my (rare) patches to be reviewed, but 
shouldn't some of the core developers be allowed to simply get on with 
it?  As it is, my patches can easily be reviewed because I don't have 
commit access.

 
 1) Travis will fix a bug, and submit it to a patch review. If he is
 busy, that's the only thing he will do
 2) Someone else reviews it. Stefan will be the one who will always
 point out missing tests.

That we can agree on!

 3) There needs to be a common consensus that the patch is ok to go in.

What does that mean?  How does one know when there is a consensus?

 4) when the patch is reviewed and ok to go in, anyone with a commit
 access will commit it.

But it has to be a specific person in each case, not anyone.

 
 I think it's as simple as that.
 
 Sometimes no one has enought time to write a proper test, yet someone
 has a free minute to fix a bug. Then I think it's ok to put the code
 in, as I think it's good to fix a bug now. However,

How does that fit with the workflow above?  Does Travis commit the 
bugfix, or not?

 the issue is definitely not closed and the bug is not fixed (!) until
 someone writes a proper test. I.e. putting code in that is not tested,
 however it doesn't break things, is imho ok, as it will not hurt
 anyone and it will temporarily fix a bug (but of course the code will
 be broken at some point in the future, if there is no test for it).
That is overstating the case; for 788, for example, no one in his right 
mind would undo the one-line correction that Travis made.  Chances are, 
there will be all sorts of breakage and foulups and the revelation of 
new bugs in the future--but not another instance that would be caught by 
the test for 788.
 
[...]
 http://codereview.appspot.com/953
 
 and added some comments. So what do you think?
Looks like it could be useful.  I replied to the comments. I haven't 
read the docs, and I don't know what the next step is when a revision of 
the patch is in order, as it is in this case.

Eric
 
 Ondrej
 
 P.S. efiring, my comments are real questions to your patch. :)
 ___
 Numpy-discussion mailing list
 Numpy-discussion@scipy.org
 http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] let's use patch review

2008-05-14 Thread David Cournapeau
On Wed, 2008-05-14 at 13:58 -1000, Eric Firing wrote:
 
 What does that mean?  How does one know when there is a consensus?

There can be a system to make this automatic. For example, the code is
never commited directly to svn, but to a gatekeeper, and people vote by
an email command to say if they want the patch in; when the total number
of votes is above some threshold, the gatekeeper commit the patch.

David

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion