>may you will find that interesting also:
>http://maven.apache.org/using/multiproject.html
I'd rather suggest supporting Apache HttpClient; a huge amount of unnecessary code could then be removed from Nutch. We don't need to calculate the "actual URL" after a redirect, since GetMethod does it all for us. Using HTTP HEAD can improve performance, and there is much more. Google uses the HEAD method, as I noticed from the logs.

What about the NekoHTML parser? The getTextHelper method seems very strange; Java 5 does it all (DOM Level 3). A new Parser plugin could be based on http://htmlparser.sourceforge.net, and again we could remove the buggy getOutlinks().

I have experience with Maven and CruiseControl. All of Maven's stuff (checkstyle, javadoc, xdoc, developer activity report, etc.) could be run via Ant. Not a first priority...
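To illustrate the DOM Level 3 point, here is a minimal sketch of text extraction with Node.getTextContent(), which is available from Java 5 on. The class name TextContentSketch and the helper textOf are made up for this example and are not Nutch code; the input is a small well-formed snippet parsed with the JDK's XML parser rather than NekoHTML. Whether this fully replaces getTextHelper depends on what that method does beyond concatenating text (for example, skipping script or style elements), which I have not checked here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of Nutch.
class TextContentSketch {

    // Hypothetical helper: parses well-formed markup and returns the
    // concatenated text of all descendant text nodes of the root element.
    // Comments and processing instructions are left out by getTextContent().
    static String textOf(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(textOf("<p>Hello <b>Nutch</b> developers</p>"));
    }
}
```

With a DOM built by another parser the call is the same, as long as that parser's nodes implement DOM Level 3.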
