Kelvin Tan wrote:
> Interesting. I haven't tried it myself. Do you have any code/benchmarks for
> this?
I never committed it anywhere. I initially tried to write Nutch's IPC
mechanism with nio and it was slow and buggy. One problem was that I
needed to switch streams to blocking mode in order to read
arbitrarily large objects, then switch them back to non-blocking mode in
order to select() on them. But you can't change this state and remove
them from the selector without going through the scheduler. So the
benefit of skipping the scheduler wasn't there. If I were willing to
fragment objects into fixed-size chunks then it might have worked, but
that's a lot of work. It's a strange limitation, since with native
sockets one can select and then perform arbitrary stream i/o, not
limited to a single buffer.
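
For reference, the blocking-mode constraint can be reproduced in a few
lines. In java.nio a channel has to be in non-blocking mode while it is
registered with a Selector, and it cannot be switched to blocking mode
while its key is still valid. What follows is only a minimal sketch of
that rule, not the Nutch IPC code: the class name BlockingModeDemo and
the method run() are made up for illustration, and the channel is never
connected, since registration does not require a connection.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.IllegalBlockingModeException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

// Hypothetical demo class, not part of Nutch.
class BlockingModeDemo {

    /**
     * Returns three observations:
     * [0] registering a blocking-mode channel with a selector was rejected,
     * [1] switching a registered channel to blocking mode was rejected,
     * [2] the channel is in blocking mode after its key was cancelled and
     *     the selector ran one more selection operation.
     */
    static boolean[] run() {
        boolean[] seen = new boolean[3];
        try (Selector selector = Selector.open();
             SocketChannel channel = SocketChannel.open()) {

            // New channels start in blocking mode; a selector rejects them.
            try {
                channel.register(selector, SelectionKey.OP_READ);
            } catch (IllegalBlockingModeException e) {
                seen[0] = true;
            }

            // Non-blocking mode is required for registration.
            channel.configureBlocking(false);
            SelectionKey key = channel.register(selector, SelectionKey.OP_READ);

            // While the key is valid, the channel cannot go back to blocking
            // mode, so blocking stream-style reads are not available.
            try {
                channel.configureBlocking(true);
            } catch (IllegalBlockingModeException e) {
                seen[1] = true;
            }

            // To leave the selector: cancel the key, then run a selection
            // operation so the selector deregisters the channel.
            key.cancel();
            selector.selectNow();
            channel.configureBlocking(true);
            seen[2] = channel.isBlocking();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        boolean[] seen = run();
        System.out.println("register while blocking rejected: " + seen[0]);
        System.out.println("configureBlocking(true) while registered rejected: " + seen[1]);
        System.out.println("blocking after cancel + selectNow: " + seen[2]);
    }
}
```

Every switch between the two modes therefore costs a key cancellation
plus another pass through the selector, which is the round trip that
reading in fixed-size chunks would avoid.
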
Also, there's an nio version of Lucene's Directory that's a bit slower
than the non-nio version, but this is not using select() or anything.
> Are you aware of others facing the same problem?
How much non-blocking nio code do you find in real Java code? I have
not seen a lot.
I did find that Sun has implemented a high-performance HTTP server using
nio. This is documented at:
http://blogs.sun.com/roller/resources/fp/grizzly.pdf
From what I can tell the primary benefit is in number of simultaneous
clients, not in throughput. Does a crawler require thousands of
simultaneous connections? If so, then it looks like careful use of nio
could offer some real benefits.
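
To make the "number of simultaneous clients" point concrete, here is a
rough sketch of one thread servicing many connections through a single
Selector. It is an illustration only and has nothing to do with how
Grizzly or Nutch is actually written: the class ManyConnectionsDemo and
the method serve() are hypothetical, the connections are loopback
sockets opened by the same thread, and each one sends a single byte.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class: one thread, one selector, many connections.
class ManyConnectionsDemo {

    /** Opens `clients` loopback connections and returns the bytes read from them. */
    static int serve(int clients) {
        List<SocketChannel> opened = new ArrayList<>();
        int bytesRead = 0;
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 lets the OS pick a free port; the backlog holds all
            // pending connections until the loop below accepts them.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0), clients);
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);

            // Each sender connects and writes one byte, then stays open.
            for (int i = 0; i < clients; i++) {
                SocketChannel sender = SocketChannel.open(server.getLocalAddress());
                opened.add(sender);
                sender.write(ByteBuffer.wrap(new byte[] {(byte) i}));
            }

            ByteBuffer buf = ByteBuffer.allocate(64);
            long deadline = System.currentTimeMillis() + 5000; // safety stop
            while (bytesRead < clients && System.currentTimeMillis() < deadline) {
                selector.select(100);
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isAcceptable()) {
                        SocketChannel accepted = server.accept();
                        if (accepted != null) {
                            opened.add(accepted);
                            accepted.configureBlocking(false);
                            accepted.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        buf.clear();
                        int n = ((SocketChannel) key.channel()).read(buf);
                        if (n > 0) {
                            bytesRead += n;
                        } else if (n < 0) {
                            key.cancel(); // peer closed the connection
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            for (SocketChannel ch : opened) {
                try {
                    ch.close();
                } catch (IOException ignored) {
                    // best-effort cleanup in a demo
                }
            }
        }
        return bytesRead;
    }

    public static void main(String[] args) {
        System.out.println("bytes read: " + serve(20));
    }
}
```

The per-connection cost here is a registered key rather than a thread,
which is where the scaling in connection count would come from. Nothing
in the sketch suggests higher throughput per connection.
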
Doug