The FTP code is not included, since it relies on a customized version of sun.net.www.protocol.ftp and is thus not portable. A separate patch will be submitted after I find a way to clean it up.
So is there any point in my trying to integrate this patch now, or should I rather wait until you've worked this out?
I would prefer to integrate this patch now, since it only provides hooks for protocol-dependent code. I intentionally separated it from the FTP code. Once these hooks are in, someone somewhere might be interested in writing code for protocols other than HTTP or FTP.
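To illustrate what a per-protocol hook can look like in general, here is a minimal sketch that dispatches on the URL scheme. It is not the code from the patch: the names ProtocolHookSketch, ProtocolHandler, register and handlerFor are all hypothetical and chosen only for this example.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of scheme-based dispatch; not the actual Nutch hooks.
final class ProtocolHookSketch {
    /** Hypothetical hook: one implementation per URL scheme. */
    interface ProtocolHandler {
        String fetch(String url);
    }

    private final Map<String, ProtocolHandler> handlers = new HashMap<>();

    /** Registers a handler for a scheme such as "http" or "ftp" (case-insensitive). */
    void register(String scheme, ProtocolHandler handler) {
        handlers.put(scheme.toLowerCase(Locale.ROOT), handler);
    }

    /** Looks up the handler for the scheme in front of the first ':' of the URL. */
    ProtocolHandler handlerFor(String url) {
        int colon = url.indexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("no scheme in URL: " + url);
        }
        String scheme = url.substring(0, colon).toLowerCase(Locale.ROOT);
        ProtocolHandler handler = handlers.get(scheme);
        if (handler == null) {
            throw new IllegalArgumentException("no handler registered for scheme: " + scheme);
        }
        return handler;
    }
}
```

With something of this shape in place, supporting a new protocol amounts to registering one more handler, without touching the code that walks the fetch list.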
Okay, I've integrated your patch into CVS. Thanks for contributing it!
I note that there's no shortage of Java FTP clients:
http://www.javaworld.com/javaworld/jw-04-2003/jw-0404-ftp.html
Back in November, I checked the open-source ones on that list. For a while I wrote code using globus-ftp. However, that line of work was put on hold because the globus-ftp library had exception-handling issues (which, in turn, hang FetcherThread). None of the FTP clients examined so far can handle socket timeouts in a clean and configurable way. I guess this is also one of the reasons that you wrote the Nutch HTTP client code anew.
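For reference, this is roughly what configurable timeouts look like when the socket is handled directly with java.net.Socket. It is only a sketch: the class name TimeoutSketch and its methods are made up for this example, and the timeout values are placeholders that a real fetcher would read from its configuration.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical sketch; class and method names are not from any existing library.
final class TimeoutSketch {
    /** Opens a socket with separate connect and read timeouts, both in milliseconds. */
    static Socket open(String host, int port, int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        Socket socket = new Socket();
        try {
            // Fails with SocketTimeoutException if the connection is not made in time.
            socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
            // Every later blocking read() gives up after readTimeoutMs.
            socket.setSoTimeout(readTimeoutMs);
            return socket;
        } catch (IOException e) {
            socket.close();
            throw e;
        }
    }

    /** Connects to a local server that never sends data and reports whether read() timed out. */
    static boolean readTimesOut(int readTimeoutMs) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket silent = new ServerSocket(0, 1, loopback);
             Socket client = open(loopback.getHostAddress(), silent.getLocalPort(),
                                  1000, readTimeoutMs)) {
            try {
                client.getInputStream().read();
                return false; // data or end-of-stream arrived, which is not expected here
            } catch (SocketTimeoutException e) {
                return true;  // the read gave up instead of hanging the calling thread
            }
        }
    }
}
```

The point is that a stalled server produces an exception the fetcher thread can catch and move past, rather than a thread that blocks indefinitely.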
Yes, that was certainly a big reason. Other reasons were things like: we wanted the (somewhat unusual) ability to truncate long requests; we wanted to make sure that there were no memory leaks (some HTTP clients cache, e.g., cookies in a static table) or lock contention; and generally wanted to have very good control and understanding of our HTTP requests, since we need to be able to make millions in a single JVM session at a very high rate (hundreds per second).
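As an illustration of the truncation point, and assuming that truncating means capping how many bytes of content are read from the stream, a bounded read can be sketched as follows. TruncatingReader and readAtMost are hypothetical names, not the Nutch implementation.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch; not the Nutch HTTP client code.
final class TruncatingReader {
    /** Reads at most maxBytes from the stream and returns what was read. */
    static byte[] readAtMost(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int remaining = maxBytes;
        while (remaining > 0) {
            // Never ask for more than the remaining budget.
            int n = in.read(buffer, 0, Math.min(buffer.length, remaining));
            if (n < 0) {
                break; // end of stream before the limit was reached
            }
            out.write(buffer, 0, n);
            remaining -= n;
        }
        return out.toByteArray();
    }
}
```

The caller is then free to close the connection once the limit is hit, so an oversized document costs a bounded amount of memory and time.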
My current FTP implementation uses the Java URL class (with the help of a slightly hacked sun.net.www.*) and is thus not fully portable, though pretty reliable. I will post it after some cleanup. Any suggestions?
What are the copyright restrictions on the sun.net.www.* code? My guess is that we probably can't accept a hacked version of that code. If the code is already included in most JVMs, can you get away with subclassing things and overriding a few methods?
How hard would it be to write a simple FTP client from scratch? I originally wrote Nutch's HTTP client in about a day. It's evolved since then to support more features, but a working, correct HTTP client is not very difficult to write. Is FTP that much harder?
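To give a rough sense of scale (this is an illustration, not code from this thread): the heart of an FTP control connection is reading replies as defined in RFC 959, where each reply starts with a three-digit code and a multi-line reply runs from a "code-" line to a line starting with the same code followed by a space. The sketch below handles that, plus parsing the host and port out of a 227 reply to PASV. The class name FtpReplySketch and its methods are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of FTP control-channel parsing (RFC 959); not a full client.
final class FtpReplySketch {
    private static final Pattern PASV = Pattern.compile(
            "(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3})");

    final int code;     // three-digit reply code, e.g. 230
    final String text;  // all reply lines, joined with '\n'

    FtpReplySketch(int code, String text) {
        this.code = code;
        this.text = text;
    }

    /** Reads one complete reply, single-line or multi-line, from the control channel. */
    static FtpReplySketch read(BufferedReader in) throws IOException {
        String first = in.readLine();
        if (first == null) {
            throw new IOException("connection closed before reply");
        }
        if (first.length() < 3 || !first.substring(0, 3).chars().allMatch(Character::isDigit)) {
            throw new IOException("malformed reply: " + first);
        }
        String codeText = first.substring(0, 3);
        StringBuilder text = new StringBuilder(first);
        if (first.length() > 3 && first.charAt(3) == '-') {
            // Multi-line reply: keep reading until a line starts with "<code> ".
            String line;
            do {
                line = in.readLine();
                if (line == null) {
                    throw new IOException("connection closed inside multi-line reply");
                }
                text.append('\n').append(line);
            } while (!line.startsWith(codeText + " "));
        }
        return new FtpReplySketch(Integer.parseInt(codeText), text.toString());
    }

    /** Extracts the data-connection address from a 227 reply: (h1,h2,h3,h4,p1,p2). */
    static InetSocketAddress parsePasv(String replyText) throws IOException {
        Matcher m = PASV.matcher(replyText);
        if (!m.find()) {
            throw new IOException("no address in PASV reply: " + replyText);
        }
        String host = m.group(1) + "." + m.group(2) + "." + m.group(3) + "." + m.group(4);
        int port = Integer.parseInt(m.group(5)) * 256 + Integer.parseInt(m.group(6));
        if (port > 65535) {
            throw new IOException("port out of range in PASV reply: " + replyText);
        }
        return InetSocketAddress.createUnresolved(host, port);
    }
}
```

A working fetcher would wrap this in a loop that sends USER, PASS, TYPE I, PASV and RETR, checks the code of each reply, and reads the file from a second socket opened to the address returned by parsePasv.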
Doug
