Doug Cutting wrote:
Andrzej Bialecki wrote:
(As a side note: I've been a committer in a couple of Open Source projects for the past 10 years. I have noticed a funny thing about first commits - they almost inevitably go wrong. You either commit some scratch files, or break the whitespace, or commit to the wrong branch, or accidentally delete a whole package... you know, this sort of thing. It just occurred to me it would be worthwhile to remind you about this well-established principle, just in case... ;-) )
I did notice that at least one of your patches used carriage returns instead of newlines.
Strange, that's probably the one I edited in Vim... Others were produced using Eclipse.
The Nutch code uses newlines as its end-of-line convention. There are a few 'fixcrlf' calls in build.xml to repair this for generated sources and documentation. Perhaps we should have one in another task (compile? test? a new task?) that normalizes ends of lines for edited code prior to checkin. What do you think?
Well, it reminds me of the infrastructure the FreeBSD project uses.
Allow me this longish digression... CVS can be instrumented to check a patch just before it is committed, and either fix the patch or refuse the commit if some rule fails. This is implemented e.g. in FreeBSD CVS, and even though it's sometimes irritating, it serves its purpose. The FreeBSD CVS scripts check several aspects of each commit (e.g. the presence of the RCS $Id$ keyword and the absence of unresolved conflict left-overs; they also update the copyright to the current year), and they can abort a commit if a rule fails. In a similar fashion we could abort a commit if e.g. line endings are incorrect - or we could silently correct them with a simple shell script... :-).
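To make the idea concrete, here is a minimal sketch of such a line-ending check, written in Python rather than shell. It is not the FreeBSD script; the function names are made up for illustration, it assumes every path passed in is a text file, and how it would be wired into CVS (for example through a pre-commit hook) is left out.

```python
def find_cr_files(paths):
    """Return the paths whose contents contain a carriage return byte."""
    offenders = []
    for path in paths:
        # Read as bytes so no newline translation hides the CR characters.
        with open(path, "rb") as handle:
            if b"\r" in handle.read():
                offenders.append(path)
    return offenders


def normalize_newlines(data: bytes) -> bytes:
    """Convert CRLF and bare CR line endings to LF."""
    # Replace CRLF first so a CRLF pair does not become two newlines.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

A "refuse the commit" policy would reject the commit when find_cr_files() returns a non-empty list; a "silently correct" policy would rewrite each offending file with normalize_newlines() instead.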
FreeBSD also uses a somewhat baroque RCS template, like the one below:
PR:
Submitted by:
Reviewed by:
Approved by:
Obtained from:
MFC after:
CVS: --------------------------------------------------------------
CVS: PR:            Fill this in if a GNATS PR is affected by the change.
CVS: Submitted by:  Fill this in if someone else sent in the change.
CVS: Reviewed by:   Fill this in if someone else reviewed your
CVS:                modification.
CVS: Approved by:   Fill this in if you needed approval for this commit.
CVS: Obtained from: Fill this in if the change is from third party
CVS:                software.
CVS: MFC after:     N [day[s]|week[s]|month[s]]
CVS:                Fill in to get MFC notification later.
CVS:                (days assumed unless specified)
I'm not advocating adding all of this, that would be suicide at the moment, but some of this infrastructure is really helpful, and conveys the experience of a long-running large-scale community.
There is also something to be said for the semi-formal concept of "mentoring" new committers in FreeBSD, where a new committer is obliged to pass his/her patches through a "mentor" (see the "Reviewed by:" above) - this helps to improve the code quality, and saves one from embarrassing mistakes. It also helps new committers to find reviewers for their code, if no one else is willing to take a look at their patches...
Another nice thing about the FreeBSD CVS setup is the commit logs - SourceForge still has a lot to learn about this, it seems. Here's an example of a commit log:
---------- X ----------
From: [EMAIL PROTECTED]
Date: 11/6/2002 4:35 PM
Subject: cvs commit: ports/mail/qconfirm Makefile distinfo

ijliao 2002/11/06 07:35:39 PST

Modified files:
  mail/qconfirm Makefile distinfo
Log:
  upgrade to 0.6.1

  PR: 44997
  Submitted by: maintainer

Revision  Changes  Path
1.5       +5 -1    ports/mail/qconfirm/Makefile
1.4       +1 -1    ports/mail/qconfirm/distinfo
----------- X ------------
Compare this to the excruciatingly detailed and confusing messages that SourceForge sends by default...
Java doesn't care about the end of line convention, and most editors can handle both equally well, but CVS diffs seem to get confused when someone changes only a few lines but also changes the end-of-line convention on the entire file. Even 'cvs diff -w' doesn't really help.
Hmm... "cvs -Bbd" ? Or maybe it's the same...
So, unless someone has a good workaround for this, I think it is best to keep the repository newline-based.
I agree with the '\n'-only policy - I don't think anyone seriously considers Notepad.exe as their editor of choice :-).
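The diff problem mentioned above is easy to reproduce outside CVS. The sketch below uses Python's difflib as a stand-in for 'cvs diff' (an assumption for illustration only; CVS's own diff is not involved) and counts how many lines a unified diff reports as added or removed. The sample file contents are made up.

```python
import difflib


def changed_line_count(old: str, new: str) -> int:
    """Count the lines a unified diff marks as added or removed."""
    diff = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        lineterm="",
    )
    # Drop the two file header lines ('---' and '+++') before counting.
    body = list(diff)[2:]
    return sum(1 for line in body if line.startswith(("+", "-")))


old = "int a = 1;\nint b = 2;\nint c = 3;\n"
# Only the middle line was edited, but the editor also switched to CRLF.
new_crlf = "int a = 1;\r\nint b = 20;\r\nint c = 3;\r\n"
# The same edit with newline-only line endings preserved.
new_lf = new_crlf.replace("\r\n", "\n")

print(changed_line_count(old, new_crlf))  # 6: every line removed and re-added
print(changed_line_count(old, new_lf))    # 2: just the edited line
```

With the line endings switched, the one-line edit is buried in a diff that touches the whole file, which is the argument for keeping a single convention in the repository.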
What are the other style-related rules? The policy guide refers to the Sun coding guide, but that document is a bit ambiguous, especially on the use of tabs. I assume we do NOT use the literal Tab character for indentation, and the primary indent is always 4 spaces; line continuations should use 8 spaces.
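If the no-tabs rule is indeed the policy, it could be checked mechanically. A minimal sketch, assuming the rule is simply "no literal Tab character in leading whitespace" (the function name is hypothetical, and the 4-space and 8-space rules are not checked here):

```python
def lines_with_tab_indent(source: str):
    """Return 1-based numbers of lines whose leading whitespace contains a tab."""
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Isolate the leading run of spaces and tabs.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            flagged.append(number)
    return flagged
```

For example, lines_with_tab_indent("class A {\n\tint x;\n    int y;\n}\n") flags only line 2.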
--
Best regards,
Andrzej Bialecki

-------------------------------------------------
Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
-------------------------------------------------
FreeBSD developer (http://www.freebsd.org)
_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
