Re: Commit Times for Issues

2007-11-16 Thread Chris Mattmann
Hi Guys,

 I'd like to chime in here on this one. My +1 for shortening the time to
commit for issues. I fear that development effort on Nutch has been
dwindling for the last year or so, and there is (in my opinion, so feel
free to disagree) certainly a stigma around the trunk and its sacred nature
that discourages people (including myself) from introducing new code there.

 I would like to propose even extending Dennis's idea below and developing a
new philosophy towards the Nutch CM. To me, the big picture change is the
following statement: the trunk is something that can be broken. Let's just
accept that it's possible. If it's broken, someone will report it. Nutch has
a big enough user base now that plays around with new builds and revisions
that this will get caught. And guess what? If the trunk is broken, then it
can be fixed.

 I'll tell you guys a story of one of my bosses here at JPL. He used to work
for a civil defense contractor in the U.S., with a very rigorous design and
software development process. A unit-tests-for-each-line-of-code type of
place. In any case, my boss used to break his company's equivalent of the
trunk daily build process all the time. Well, one day he gets called in to
speak with the vice president of engineering at the company, who proceeds to
tell him: "You're really good at breaking the code, eh?" My boss
immediately jumps up to defend himself, citing the fact that it wasn't a big
problem and that he has fixed it already, but the vice president cuts him
off and says, "You probably think I'm mad. Well let me tell you: I'm not.
You can break the code all you want because you know what it tells me? That
you're actually *DOING WORK* unlike the rest of these people who work here
and do very little."

 The above story has stuck with me and made me feel a lot better about
situations like this one. It taught me that waiting until everything is
perfect before acting isn't always the best thing to do, because you may end
up waiting forever. It's better to make incremental progress (even faltering
while doing so), because what you end up with may be just as good as (or
even better than) what you would get if you tried to be a perfectionist and
only made progress when you felt everything was right.

 My 2 cents,
  Chris


 


On 11/15/07 1:37 PM, Dennis Kubes [EMAIL PROTECTED] wrote:

 So I have been talking with some of the other committers and I wanted to
 lay out a suggestion for standardizing some of the Nutch committer
 workflow processes in the hope of speeding up Nutch development.
 
 The first one I was hoping to tackle is time to commit.  At least for me
 it has been hard to know when to commit something, especially when it
 was trivial or no one commented on the issue.  Here is what is being
 proposed:
 
 Trivial changes = immediate, this at the discretion of the committers
 Minor changes = 24 hours from latest patch or 1 or more +1 from committers
 Major and blocker changes = 4 days from latest patch or 2 or more +1
 from committers
 
 This way if an issue has been active for some time but no one has taken
 a look at it, and it has passed all unit tests, then we can go ahead and
 commit it.  Also this should allow more of the smaller changes to be
 handled faster.
 
 So these of course are just some suggestions; I would love to hear from
 others in the community.  What I think would be best is to come to a
 consensus on this and then have a wiki page describing this and other
 processes for committers.
 
 Dennis Kubes

__
Chris Mattmann, Ph.D.
[EMAIL PROTECTED]
Cognizant Development Engineer
Early Detection Research Network Project
_
Jet Propulsion Laboratory            Pasadena, CA
Office: 171-266B Mailstop:  171-246
___

Disclaimer:  The opinions presented within are my own and do not reflect
those of either NASA, JPL, or the California Institute of Technology.




Re: Commit Times for Issues

2007-11-16 Thread Dennis Kubes
So a few years ago I started a dating site called oneforever.com.  Good
technology, but it took us 9 months to develop the first version,
mostly because we wanted everything to be perfect.  We would work on
something, change it if it was not perfect, and so on.  We never did get
it perfect, we just got it to the point where we had to launch it.


A few months ago I started a different project focused around social
networking and search.  With this project I took the viewpoint of
consistent progress every day.  I would make some improvement to it
every day, no matter how small.  No such thing as perfect, just better.
This project developed much quicker and I think is actually a better
code base.  What's more, it was fun to work on.


All of this is to say that I don't think there is any such thing as
perfection.  I do think there is better, continuously better.  And since
we all enjoy programming (I hope), making something better (not
perfect or best) is the fun part (or at least should be).  I can only
talk from my experience but I think the best part of programming is when
I have found the solution to the problem and it just works.


So as we are developing this *standard* for committers I agree with
Chris that we should make this fun and casual and not be worried about
breaking the trunk.  After all, it's only code (I know, to some people
that is heresy :)).  I actually think we are all in agreement about this.
I would love to hear from some of the other committers or members of
the community before we put these thoughts down on a wiki.


Oh, and I am ok with minor issues having a longer wait time or 1 or more +1.

Dennis Kubes


Re: Commit Times for Issues

2007-11-16 Thread Marcin Okraszewski
I can say something from a contributor's point of view. I've contributed two
rather trivial patches and ... I'm discouraged. The process was simply far too
long. I actually had to ask for someone to take a look at it. Once someone
invests the time to create a patch, write a Jira entry, etc., they rather
expect it to be reviewed and possibly committed. If at least one person needs
something badly enough to develop it, there may well be others who need it
too.

Just to add: I've made several contributions to other projects, but
this is the first time I've felt like this.

As for looking for perfection, it must be balanced in my opinion. If
something trivial is not done perfectly but does not break the architecture
... well, it might be acceptable. But if something would create spaghetti
code, I wouldn't be so much for it. So my rule of thumb would be: once it
breaks good design or introduces too much complexity, it shouldn't be
accepted. If it doesn't affect those, but does what it should, maybe in a
slightly clumsy way, why not? It still solves someone's problem or need.

Regards,
Marcin



Re: Commit Times for Issues

2007-11-16 Thread Dennis Kubes



Marcin Okraszewski wrote:
> I can say something from a contributor point of view. I've contributed two
> rather trivial patches and ... I'm discouraged. Simply the process was far too
> long. Actually I had to ask that someone takes a look for it. Once someone
> invest his time to create patch, write a Jira entry, etc., you rather expect it
> to be reviewed and possibly committed. If there is at least one person who
> needs it that much that is willing to develop it, it may mean there might be
> others who would need it as well.

Some of the committers have also discussed adding something like a
"pending review" workflow to the Nutch JIRA for just these cases.  I
was leaving that for another discussion, but maybe now is the time to
discuss it.

> Just to add. I've done some several contributions to some other projects, but
> this is first time I have a feeling like this.
>
> As looking for perfection, it must be balanced in my opinion. If there is
> something trivial which is not done perfect, which does not break architecture
> ... well, it might be acceptable. But if something would make a spaghetti code,
> I wouldn't be so much for it. So my rule of thumb would be - once it breaks
> well design, introduces too big complexity, it shouldn't be accepted. If it
> doesn't influence those, but does what it should, maybe in a bit clumsy way -
> why not. It still solves someone's problem or need.

I agree.  Quality is still a necessity. Including bad code isn't
progress IMHO.


Dennis



Re: Commit Times for Issues

2007-11-16 Thread Andrzej Bialecki

Dennis Kubes wrote:
> Marcin Okraszewski wrote:
>> I can say something from a contributor point of view. I've contributed
>> two rather trivial patches and ... I'm discouraged. Simply the process
>> was far too long. Actually I had to ask that someone takes a look for
>> it. Once someone invest his time to create patch, write a Jira entry,
>> etc., you rather expect it to be reviewed and possibly committed. If
>> there is at least one person who needs it that much that is willing to
>> develop it, it may mean there might be others who would need it as well.
>
> Some of the committers have also discussed adding something like a
> pending review workflow to Nutch JIRA for just these cases.  Although
> was leaving that for another discussion. maybe now is the time to discuss.

This is an important and IMHO much needed change.

>> As looking for perfection, it must be balanced in my opinion. If there
>> is something trivial which is not done perfect, which does not break
>> architecture ... well, it might be acceptable. But if something would
>> make a spaghetti code, I wouldn't be so much for it. So my rule of
>> thumb would be - once it breaks well design, introduces too big
>> complexity, it shouldn't be accepted. If it doesn't influence those,
>> but does what it should, maybe in a bit clumsy way - why not. It still
>> solves someone's problem or need.
>
> I agree.  Quality is still a necessity. Including bad code isn't
> progress IMHO.


I'm ok with occasional unintentional breakage of trunk. I also second
your feelings toward committing poorly thought-through code - but if we
allow more freedom in trunk/ development we need to be prepared for
such situations to occur. We can view this as a part of the process,
but let's not hesitate to remove or revert commits that, on closer
examination after they are committed, reveal a bad impact, bad design,
or lack of maintenance. A good example of this other side of
the process is the recent removal of GData server from Lucene contrib.


In other words, the process of accepting patches would look something 
like this:


 * if a problem is confirmed,
 * and a patch exists,
 * and the patch solves the problem and passes the tests
 * (and if it's peer-reviewed, in the case of more serious changes)
 * then we should commit it straight away.

This is, by the way, the workflow that Hadoop uses and in my opinion we 
should use it too.
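
The checklist above can be sketched as a single predicate. This is only
an illustration of the rule as stated in this message: the function and
parameter names are hypothetical and do not correspond to any Nutch,
Hadoop, or JIRA API.

```python
def should_commit(*, problem_confirmed: bool, patch_exists: bool,
                  solves_problem: bool, tests_pass: bool,
                  serious_change: bool = False,
                  peer_reviewed: bool = False) -> bool:
    """Return True when a patch meets every condition in the checklist.

    All names here are illustrative assumptions, not an existing API.
    """
    basics_ok = (problem_confirmed and patch_exists
                 and solves_problem and tests_pass)
    if not basics_ok:
        return False
    # Peer review is only required for the more serious changes.
    return peer_reviewed or not serious_change
```

A small fix with a confirmed problem and passing tests would be committed
straight away, while a serious change with the same properties would wait
for a review.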


However, that's one side of the story - the other side is to watch out
for creeping featurism. In my opinion we should not commit changes that
are useful only for niche users, but may require significant changes to
Nutch. I would be ok with some rarely-used changes if they serve
specific scenarios that might be useful for other users - but if it's a
complex change that satisfies the needs of only one user then it
wouldn't be ok - so a certain balance in this experimentation is also
needed.


--
Best regards,
Andrzej Bialecki 
 ___. ___ ___ ___ _ _   __
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Commit Times for Issues

2007-11-15 Thread Dennis Kubes
So I have been talking with some of the other committers and I wanted to
lay out a suggestion for standardizing some of the Nutch committer
workflow processes in the hope of speeding up Nutch development.


The first one I was hoping to tackle is time to commit.  At least for me 
it has been hard to know when to commit something, especially when it 
was trivial or no one commented on the issue.  Here is what is being 
proposed:


Trivial changes = immediate, this at the discretion of the committers
Minor changes = 24 hours from latest patch or 1 or more +1 from committers
Major and blocker changes = 4 days from latest patch or 2 or more +1 
from committers


This way if an issue has been active for some time but no one has taken 
a look at it, and it has passed all unit tests, then we can go ahead and 
commit it.  Also this should allow more of the smaller changes to be 
handled faster.
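
As a rough sketch, the proposed waiting periods could be written down as
a small lookup table plus a check. Everything below is illustrative: the
names are hypothetical, and treating passing unit tests as a precondition
for every severity is an assumption drawn from the paragraph above, not
something the proposal spells out.

```python
from dataclasses import dataclass

# severity -> (hours to wait after the latest patch,
#              committer +1 votes that waive the wait)
# Values come from the proposal; the table itself is an assumption.
POLICY = {
    "trivial": (0, 0),    # immediate, at the committers' discretion
    "minor": (24, 1),
    "major": (96, 2),     # 4 days
    "blocker": (96, 2),   # 4 days
}


@dataclass
class Issue:
    severity: str
    hours_since_latest_patch: float
    committer_plus_ones: int
    unit_tests_pass: bool


def ready_to_commit(issue: Issue) -> bool:
    """True once either the waiting period or the vote threshold is met."""
    if not issue.unit_tests_pass:
        return False
    wait_hours, votes_needed = POLICY[issue.severity]
    return (issue.hours_since_latest_patch >= wait_hours
            or issue.committer_plus_ones >= votes_needed)
```

Under this reading, a minor issue with no comments becomes committable
24 hours after its latest patch, and a major one needs either 4 days or
two +1 votes from committers.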


So these of course are just some suggestions; I would love to hear from
others in the community.  What I think would be best is to come to a
consensus on this and then have a wiki page describing this and other 
processes for committers.


Dennis Kubes


Re: Commit Times for Issues

2007-11-15 Thread Andrzej Bialecki


I agree with the overall plan - we need to speed up the process and
relieve the committers from worrying too much about whether a patch is
ripe enough to commit.


Though I think that in the case of minor changes, the 24-hour period is
too short. By definition they are not trivial, which means they
could use a peer review. Sometimes it's difficult to get a patch
reviewed within 24 hours, and in the coding enthusiasm it's easy to be
too quick ... I'd say 48 hours if no review, or less if the patch is
reviewed and gets +1.
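
To make the difference concrete, here is a minimal sketch that computes
the earliest moment a minor change could be committed under this
suggestion. The function name and parameters are hypothetical, and
ignoring a +1 that predates the latest patch is an assumption, not
something stated in the thread.

```python
from datetime import datetime, timedelta
from typing import Optional


def earliest_minor_commit(latest_patch_at: datetime,
                          plus_one_at: Optional[datetime] = None,
                          wait: timedelta = timedelta(hours=48)) -> datetime:
    """Earliest commit time for a minor change.

    Without a review the patch waits the full period; a +1 on the
    latest patch allows an earlier commit.
    """
    deadline = latest_patch_at + wait
    # Assumption: a +1 given before the latest patch does not count.
    if plus_one_at is not None and plus_one_at >= latest_patch_at:
        return min(deadline, plus_one_at)
    return deadline
```

Passing `wait=timedelta(hours=24)` gives the originally proposed period,
so the two options can be compared for the same patch.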


--
Best regards,
Andrzej Bialecki 
 ___. ___ ___ ___ _ _   __
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com