Attached is my patch that enables Nutch to fetch over non-HTTP protocols. The idea is to have a non-HTTP response mimic an HTTP one, so that code changes are kept to a minimum. Specifically: (1) Response.java is made an interface instead of a class. (2) HostQueueKey is tweaked to include the URL scheme (protocol) and port. (3) robots.txt handling is made to work with non-HTTP protocols. (4) HttpResponse.java and Http.java are modified to reflect the changes above. (5) Dummy FtpResponse.java and Ftp.java are added.
With this patch, further extension for other protocols should be straightforward.
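To make the design concrete, here is a rough sketch of points (1) and (2). It is not the code from the patch: the method names on Response, the FtpLikeResponse class, the HostQueueKeys helper, and the default-port table are all assumptions made for illustration only.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical protocol-neutral response. The method names are assumed
// for illustration and are not taken from the patch.
interface Response {
    int getCode();                  // HTTP-style status code
    String getHeader(String name);  // null when the header is absent
    byte[] getContent();
}

// Hypothetical non-HTTP response that mimics an HTTP one, so callers
// written against Response need no protocol-specific branches.
final class FtpLikeResponse implements Response {
    private final int code;
    private final byte[] content;

    FtpLikeResponse(int code, byte[] content) {
        this.code = code;
        this.content = content;
    }

    public int getCode() { return code; }
    public String getHeader(String name) { return null; } // no headers in this sketch
    public byte[] getContent() { return content; }
}

// Hypothetical helper that builds a per-host queue key from scheme, host,
// and port, so that http://host and ftp://host land in separate queues.
final class HostQueueKeys {
    private HostQueueKeys() {}

    static String keyFor(String url) {
        URI uri = URI.create(url);
        if (uri.getScheme() == null || uri.getHost() == null) {
            throw new IllegalArgumentException("absolute URL with host required: " + url);
        }
        String scheme = uri.getScheme().toLowerCase(Locale.ROOT);
        String host = uri.getHost().toLowerCase(Locale.ROOT);
        // Fall back to an assumed default port when the URL omits one.
        int port = uri.getPort() != -1 ? uri.getPort() : defaultPort(scheme);
        return scheme + "://" + host + ":" + port;
    }

    static int defaultPort(String scheme) {
        switch (scheme) {
            case "http":  return 80;
            case "https": return 443;
            case "ftp":   return 21;
            default:      return -1; // unknown scheme: no default assumed
        }
    }
}
```

Keying on scheme and port as well as host means per-host politeness limits and robots.txt lookups can be tracked separately for each service running on the same machine.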
This looks great to me. Thanks.
The FTP code is not included, since it relies on
a customized version of sun.net.www.protocol.ftp and is thus not portable. A separate patch will be submitted after I find a way to clean it up.
So is there any point in my trying to integrate this patch now, or should I rather wait until you've worked this out?
I note that there's no shortage of Java FTP clients:
http://www.javaworld.com/javaworld/jw-04-2003/jw-0404-ftp.html
Thanks again,
Doug
Nutch-developers mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-developers
