I am certainly interested in where this experiment leads. I am sure many on the list would be interested too.
Using the native Java API would certainly simplify things (but it is not required).

To find the bottleneck, I would look in the obvious places first:
1. CPU on the client
2. network (netstat on one of the datanodes and on the client would be good)
3. disk I/O on the datanodes (iostat -x)

Is the experimental setup described in more detail somewhere? With very high-bandwidth networks, TCP buffer sizes could be a factor even with LAN latencies. A jira would also be a good place to discuss details.

Raghu.

On Tue, Nov 24, 2009 at 3:35 PM, Michael Thomas <tho...@hep.caltech.edu> wrote:
> Hey guys,
>
> During the SC09 exercise, our data transfer tool was using the FUSE
> interface to HDFS. As Brian said, we were also reading 16 files in
> parallel. This seemed to be the optimal number, beyond which the
> aggregate read rate did not improve.
>
> We have work scheduled to modify our data transfer tool to use the
> native hadoop java APIs, as well as to run some additional tests
> offline to see if the HDFS-FUSE interface is the bottleneck, as we
> suspect.
>
> Regards,
>
> --Mike
>
>
> On 11/24/2009 03:01 PM, Brian Bockelman wrote:
>
>> Hey Raghu,
>>
>> There are a few performance issues. Last week during Supercomputing
>> '09, Caltech was having issues getting more than 2.6 Gbps per HDFS
>> client process (I think they were pulling 16 files per process, but
>> Mike knows the details). I think they'd appreciate any advice you
>> have about tuning HDFS performance.
>>
>> We're starting early R&D for 100Gbps dataflows, and I believe
>> improving our current HDFS performance is on the TODO list.
>>
>> Brian
>>
>> (PS - I'm not saying HDFS is at fault here - it always remains a
>> possibility that we're using it in a sub-optimal manner. If you have
>> any favorite Java performance instrumentation to recommend, we'd also
>> be interested in that.)
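[Editorial aside: on the TCP buffer sizes Raghu mentions, a quick way to check what the JVM and OS are actually giving you is to query SO_RCVBUF on a socket. This is a minimal stdlib sketch, not Hadoop code; the 4 MB value is an arbitrary illustration, and on Linux the kernel may clamp the request to net.core.rmem_max.]

```java
import java.net.Socket;
import java.net.SocketException;

public class TcpBufferCheck {
    public static void main(String[] args) throws SocketException {
        // An unconnected socket is enough to inspect the defaults.
        Socket s = new Socket();
        System.out.println("default SO_RCVBUF: " + s.getReceiveBufferSize());

        // Request a larger receive buffer. This is a hint: the OS may
        // clamp it, and for buffers over 64 KB it should be set before
        // connect() so TCP window scaling can take effect.
        s.setReceiveBufferSize(4 * 1024 * 1024);
        System.out.println("after request: " + s.getReceiveBufferSize());
    }
}
```

If the effective buffer is far below the bandwidth-delay product of the link, throughput will be capped regardless of what HDFS does.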
>>
>> On Nov 24, 2009, at 12:35 PM, Raghu Angadi wrote:
>>
>>> Sequential read is the simplest case and it is pretty hard to improve
>>> upon the current raw performance (the HDFS client does take more CPU
>>> than one might expect; Todd implemented an improvement for the CPU
>>> consumed).
>>>
>>> Just to reiterate what Todd said, there is an implicit read ahead for
>>> sequential reads with TCP buffers and kernel read ahead on Datanodes.
>>>
>>> If you extend the read ahead buffer to be more of a buffer cache for
>>> the block, it could have a big impact for some read access patterns
>>> (e.g. binary search).
>>>
>>> Raghu.
>>>
>>> On Mon, Nov 23, 2009 at 11:23 PM, Martin Mituzas <xietao1...@hotmail.com> wrote:
>>>
>>>> I read the code and find that the call
>>>>   DFSInputStream.read(buf, off, len)
>>>> will cause the DataNode to read len bytes (or less if encountering
>>>> the end of the block). Why does hdfs not read ahead to improve
>>>> performance for sequential reads?
>>>> --
>>>> View this message in context:
>>>> http://old.nabble.com/why-does-not-hdfs-read-ahead---tp26491449p26491449.html
>>>> Sent from the Hadoop core-user mailing list archive at Nabble.com.
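[Editorial aside: the buffering effect Raghu describes can also be added on the client side by wrapping the stream. The sketch below uses only the standard library: a ByteArrayInputStream stands in for a DFSInputStream (both are just InputStreams), and BufferedInputStream plays the role of a read-ahead layer, fetching large chunks from the underlying stream and serving small application reads from memory. The 1 MB "block" and buffer sizes are arbitrary illustrations.]

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadAheadSketch {
    public static void main(String[] args) throws IOException {
        // Stand-in for one HDFS block delivered by a DFSInputStream.
        byte[] block = new byte[1 << 20];
        InputStream raw = new ByteArrayInputStream(block);

        // Each 4 KB application read is served from the 128 KB buffer;
        // the underlying stream only sees occasional large reads, the
        // same effect TCP socket buffering and kernel read ahead give
        // sequential HDFS reads.
        InputStream in = new BufferedInputStream(raw, 128 * 1024);

        byte[] buf = new byte[4096];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
        }
        in.close();
        System.out.println("read " + total + " bytes");
    }
}
```

For strictly sequential access this mostly saves per-call overhead; the bigger win Raghu points at (a block-level buffer cache) would help non-sequential patterns such as binary search, which plain read-ahead does not.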