Scott Carey wrote:
HTTP is very useful and typically performs very well.  It has lots of
things built-in too.  In addition to what you mention, it  has a
caching mechanism built-in, range queries, and all sorts of ways to
tag along state if needed.  To top it off there are a lot of testing
and debugging tools available for it.  So from that front using it is
very attractive.

Glad you agree!
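
For anyone following along, the range queries mentioned above are just standard HTTP Range headers, so a client can ask a server for a slice of a block without any custom protocol. A rough sketch in Java using the stock HttpURLConnection (the URL and byte offsets here are placeholders for illustration, not real datanode endpoints):

import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class RangeGetExample {
  public static void main(String[] args) throws Exception {
    // Hypothetical block URL; a real deployment would have its own scheme.
    URL url = new URL("http://datanode.example.com:50075/block?id=1234");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    // Request only the first 64 KB of the block via a standard Range header.
    conn.setRequestProperty("Range", "bytes=0-65535");
    int status = conn.getResponseCode(); // 206 Partial Content if the range was honored
    try (InputStream in = conn.getInputStream()) {
      byte[] buf = new byte[8192];
      long total = 0;
      for (int n; (n = in.read(buf)) != -1; ) {
        total += n;
      }
      System.out.println("status=" + status + " bytes=" + total);
    }
  }
}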

However, in my experience zero-copy is not going to be much of a gain
performance-wise for this sort of application, and will limit what
can be done.  As long as a servlet doesn't transcode data and mostly
copies, it will be very fast - many multiples of gigabit ethernet
speed per CPU - far more than most disk setups will handle for a
while.

In MapReduce, datanodes are also running map and reduce tasks, so we'd like datanodes not only to keep up with disks and networks, but also to use minimal CPU doing so. Zero-copy on the datanode has been shown to significantly help MapReduce benchmarks. That said, zero-copy may or may not be significantly better than one-copy. I intend to benchmark that. But the important thing to measure is not just throughput but also idle CPU.
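
As a rough illustration of the difference (a sketch using the standard NIO APIs, not the datanode's actual code path), one-copy pulls each buffer through user space, while zero-copy asks the kernel to move file bytes straight to the socket via FileChannel.transferTo (sendfile on Linux):

import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;

public class TransferExample {
  // One-copy: read into a user-space buffer, then write it to the socket.
  static void copyLoop(FileChannel file, SocketChannel sock) throws Exception {
    ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024);
    while (file.read(buf) != -1) {
      buf.flip();
      while (buf.hasRemaining()) {
        sock.write(buf);
      }
      buf.clear();
    }
  }

  // Zero-copy: the kernel moves bytes from the file to the socket directly.
  static void zeroCopy(FileChannel file, SocketChannel sock) throws Exception {
    long pos = 0, size = file.size();
    while (pos < size) {
      pos += file.transferTo(pos, size - pos, sock);
    }
  }
}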

Additionally, I'm not sure CRC checking should occur on the
client.  TCP/IP already checksums packets, so network data corruption
over HTTP is not a primary concern.   The big concern is silent data
corruption on the disk.

I believe that disks are the largest source of data corruption, but I am not confident they are the only source. HDFS uses end-to-end checksums. As data is written to HDFS it is immediately checksummed on the client. This checksum then lives with the data and is validated on the client immediately before the data is returned to the application. The goal is to catch corruption wherever it may occur: on disks, on the network, or while buffered in memory. In addition, the checksum is validated after data is transmitted to datanodes but before blocks are stored, so that initial network and memory corruptions are caught early and the writing process fails, rather than permitting an application to write corrupt data. Finally, datanodes periodically scan for corrupt blocks on disks, replacing them with non-corrupt replicas, decreasing the chance that over time all replicas become corrupt.
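
To make the end-to-end idea concrete, here is a minimal sketch of a writer-side checksum and reader-side verification. CRC32 over a chunk of bytes is used for illustration; this is not a statement of HDFS's exact chunk size or on-disk/wire format:

import java.util.zip.CRC32;

public class ChecksumExample {
  // Writer side: checksum each chunk as it is produced.
  static long checksum(byte[] data, int off, int len) {
    CRC32 crc = new CRC32();
    crc.update(data, off, len);
    return crc.getValue();
  }

  // Reader side: recompute and compare before returning data to the caller.
  static void verify(byte[] data, int off, int len, long expected) {
    if (checksum(data, off, len) != expected) {
      throw new RuntimeException("checksum mismatch: data corrupted in transit or at rest");
    }
  }

  public static void main(String[] args) {
    byte[] chunk = "example data".getBytes();
    long crc = checksum(chunk, 0, chunk.length);   // computed when the data is written
    verify(chunk, 0, chunk.length, crc);           // recomputed just before the data is used
    System.out.println("chunk verified, crc=" + crc);
  }
}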

Additionally, embedding Tomcat tends to be more tricky than Jetty,
though that can be overcome.  One might argue that we don't even want
a servlet container, we just want an HTTP connector.  The Servlet API
is familiar, but for a high performance transport it might just be
overhead and restrictive.  Direct access to Tomcat's NIO connector
might be significantly lighter-weight and more flexible. Tomcat's NIO
connector implementation works great and I have had great success
with up to 10K connections with the pure Java connector using
ordinary byte buffers and about 20 servlet threads.

I hope to start benchmarking bulk data RPC over the next few weeks. I'll probably start with a servlet using Jetty, then see if I can increase throughput and decrease CPU utilization through the use of things like Tomcat's NIO connector, Grizzly, etc.
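
Something along these lines is probably where I'd start: an embedded Jetty server with one servlet that streams a file back to the client. This assumes a Jetty version exposing the org.eclipse.jetty embedded API with javax.servlet; the port, path, and file location are placeholders:

import java.io.FileInputStream;
import java.io.IOException;
import java.io.OutputStream;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.servlet.ServletContextHandler;
import org.eclipse.jetty.servlet.ServletHolder;

public class BlockServer {
  public static class BlockServlet extends HttpServlet {
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
        throws ServletException, IOException {
      resp.setContentType("application/octet-stream");
      try (FileInputStream in = new FileInputStream("/tmp/testblock"); // placeholder file
           OutputStream out = resp.getOutputStream()) {
        byte[] buf = new byte[64 * 1024];
        for (int n; (n = in.read(buf)) != -1; ) {
          out.write(buf, 0, n);
        }
      }
    }
  }

  public static void main(String[] args) throws Exception {
    Server server = new Server(8080); // placeholder port
    ServletContextHandler ctx = new ServletContextHandler();
    ctx.addServlet(new ServletHolder(new BlockServlet()), "/block");
    server.setHandler(ctx);
    server.start();
    server.join();
  }
}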

Doug
