Dennis Kubes wrote:


Marcin Okraszewski wrote:
I can say something from a contributor's point of view. I've contributed two rather trivial patches and ... I'm discouraged. The process was simply far too long. I actually had to ask someone to take a look at them. Once someone invests their time to create a patch, write a Jira entry, etc., they rather expect it to be reviewed and possibly committed. If there is at least one person who needs it so much that they are willing to develop it, it may mean there are others who need it as well.


Some of the committers have also discussed adding something like a "pending review" workflow to Nutch JIRA for just these cases. Although I was leaving that for another discussion, maybe now is the time to discuss it.

This is an important and IMHO much needed change.


As for looking for perfection, it must be balanced, in my opinion. If there is something trivial that is not done perfectly but does not break the architecture ... well, it might be acceptable. But if something would create spaghetti code, I wouldn't be so much for it. So my rule of thumb would be - once it breaks good design or introduces too much complexity, it shouldn't be accepted. If it doesn't affect those, but does what it should, maybe in a slightly clumsy way - why not? It still solves someone's problem or need.


I agree. Quality is still a necessity. Including bad code isn't progress IMHO.

I'm ok with occasional unintentional breakage of trunk. I also second your feelings toward committing poorly thought-through code - but if we allow more freedom in trunk/ development, we need to be prepared for such situations to occur. We can view this as a part of the process, but let's not hesitate to remove or revert commits that, on closer examination, turn out to have a bad impact, a bad design, or a lack of maintenance, even though they have already been committed. A good example of this other side of the process is the recent removal of the GData server from Lucene contrib.

In other words, the process of accepting patches would look something like this:

 * if a problem is confirmed,
 * and a patch exists,
 * and the patch solves the problem and passes the tests,
 * (and it's peer-reviewed, in the case of more serious changes),
 * then we should commit it straight away.

This is, by the way, the workflow that Hadoop uses and in my opinion we should use it too.
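
The acceptance rule listed above can be written down as a small check. This is only an illustrative sketch: the `Patch` record, its field names, and `should_commit` are hypothetical and do not correspond to anything in Nutch, Hadoop, or JIRA.

```python
from dataclasses import dataclass


@dataclass
class Patch:
    """Hypothetical record of what is known about a submitted patch."""
    problem_confirmed: bool
    patch_exists: bool
    solves_problem: bool
    tests_pass: bool
    serious_change: bool = False  # assumption: someone flags larger changes
    peer_reviewed: bool = False


def should_commit(p: Patch) -> bool:
    """Return True when the patch meets every condition in the list."""
    # A confirmed problem and an actual patch are both required.
    if not (p.problem_confirmed and p.patch_exists):
        return False
    # The patch must solve the problem and pass the tests.
    if not (p.solves_problem and p.tests_pass):
        return False
    # Peer review is required only for more serious changes.
    if p.serious_change and not p.peer_reviewed:
        return False
    return True
```

Under these assumptions a trivial patch that fixes a confirmed problem and passes the tests is committed right away, while a serious change waits until it has been peer-reviewed.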

However, that's one side of the story - the other side is to watch out for creeping featurism. In my opinion we should not commit changes that are useful only for niche users, but may require significant changes to Nutch. I would be ok with some rarely-used changes if they serve specific scenarios that might be useful for other users - but if it's a complex change that satisfies the needs of only one user then it wouldn't be ok - so a certain balance in this experimentation is also needed.

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
