Dennis Kubes wrote:
Marcin Okraszewski wrote:
I can say something from a contributor point of view. I've contributed
two rather trivial patches and ... I'm discouraged. Simply the process
was far too long. Actually I had to ask someone to take a look at
it. Once someone invests his time to create a patch, write a Jira entry,
etc., you rather expect it to be reviewed and possibly committed. If
there is at least one person who needs it that much that is willing to
develop it, it may mean there might be others who would need it as well.
Some of the committers have also discussed adding something like a
"pending review" workflow to Nutch JIRA for just these cases. Although
that was left for another discussion, maybe now is the time to discuss it.
This is an important and IMHO much needed change.
As for looking for perfection, it must be balanced in my opinion. If there
is something trivial which is not done perfectly, but which does not break
the architecture ... well, it might be acceptable. But if something would
make spaghetti code, I wouldn't be so much for it. So my rule of
thumb would be - once it breaks good design or introduces too much
complexity, it shouldn't be accepted. If it doesn't influence those,
but does what it should, maybe in a slightly clumsy way - why not. It still
solves someone's problem or need.
I agree. Quality is still a necessity. Including bad code isn't
progress IMHO.
I'm ok with occasional unintentional breakage of trunk. I also second
your feelings toward committing poorly thought-through code - but if we
allow more freedom in the trunk/ development we need to be prepared that
such situations will occur. We can view this as a part of the process,
but let's not hesitate to remove or revert commits that after closer
examination, even though they are committed, reveal their bad impact or
bad design, or lack of maintenance. A good example of this other side of
the process is the recent removal of GData server from Lucene contrib.
In other words, the process of accepting patches would look something
like this:
* if a problem is confirmed,
* and a patch exists,
* and the patch solves the problem and passes the tests
(* and if it's peer-reviewed in case of more serious changes)
* then we should commit it straight away.
This is, by the way, the workflow that Hadoop uses and in my opinion we
should use it too.
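
As a rough illustration only, the rule in the list above could be written
as a small predicate. The names used here (Patch, should_commit and the
field names) are hypothetical and are not part of Nutch, Hadoop or JIRA.

```python
from dataclasses import dataclass


@dataclass
class Patch:
    """Hypothetical record of a patch's state (not a JIRA or Nutch type)."""
    problem_confirmed: bool
    patch_exists: bool
    solves_problem: bool
    passes_tests: bool
    serious_change: bool = False  # assumption: someone flags "more serious" changes
    peer_reviewed: bool = False


def should_commit(p: Patch) -> bool:
    """Return True when the patch meets every condition in the list above."""
    basic = (
        p.problem_confirmed
        and p.patch_exists
        and p.solves_problem
        and p.passes_tests
    )
    # Peer review is required only for more serious changes.
    reviewed = p.peer_reviewed or not p.serious_change
    return basic and reviewed
```

A trivial patch that fixes a confirmed problem and passes the tests would
qualify right away, while a serious change would wait for a peer review.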
However, that's one side of the story - the other side is to watch out
for creeping featurism. In my opinion we should not commit changes that
are useful only for niche users, but may require significant changes to
Nutch. I would be ok with some rarely-used changes if they serve
specific scenarios that might be useful for other users - but if it's a
complex change that satisfies the needs of only one user then it
wouldn't be ok - so a certain balance in this experimentation is also
needed.
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com