I am certainly interested in where this experiment leads. I am sure many on
the list would be interested too.

Using the native Java API would certainly simplify things (though it is not required).
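For reference, reading through the native API is just `FileSystem.open()` (which returns an `FSDataInputStream`, a plain `java.io.InputStream`) plus a standard read loop. A minimal sketch; the stand-in `ByteArrayInputStream` below is only so it runs without a cluster — with Hadoop on the classpath you would obtain the stream from `FileSystem.get(conf).open(new Path(...))` instead:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class NativeReadSketch {
    // Drain a stream with a large buffer, as one would drain the
    // FSDataInputStream returned by FileSystem.open(path).
    static long drain(InputStream in, int bufSize) throws IOException {
        byte[] buf = new byte[bufSize];
        long total = 0;
        int n;
        while ((n = in.read(buf, 0, buf.length)) > 0) {
            total += n; // process buf[0..n) here
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for an HDFS stream: 1 MiB of zeros.
        InputStream in = new ByteArrayInputStream(new byte[1 << 20]);
        long bytes = drain(in, 64 * 1024); // 64 KB read buffer
        System.out.println("read " + bytes + " bytes");
    }
}
```

A larger read buffer amortizes per-call overhead on the client side, which matters when the client CPU is the suspect.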

To find the bottleneck, I would look in the obvious places first:
 1. CPU on the client
 2. network (netstat on one of the datanodes and on the client would be good)
 3. disk I/O on the datanodes (iostat -x)
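For item 1, a quick check for whether the client is CPU-bound is to compare the reader thread's CPU time against wall-clock time over a read loop: a ratio near 1.0 means the thread is burning a core rather than waiting on the network or disks. A hypothetical sketch using the stock `ThreadMXBean` (the stream here is an in-memory stand-in; point it at the real HDFS stream to measure an actual client):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class ClientCpuCheck {
    // Drain the stream; return {bytesRead, threadCpuNanos, wallClockNanos}.
    static long[] measure(InputStream in) throws IOException {
        ThreadMXBean tmx = ManagementFactory.getThreadMXBean();
        byte[] buf = new byte[64 * 1024];
        long cpu0 = tmx.getCurrentThreadCpuTime(); // ns of CPU this thread has used so far
        long wall0 = System.nanoTime();
        long total = 0;
        int n;
        while ((n = in.read(buf)) > 0) total += n;
        return new long[] {total,
                tmx.getCurrentThreadCpuTime() - cpu0,
                System.nanoTime() - wall0};
    }

    public static void main(String[] args) throws IOException {
        long[] r = measure(new ByteArrayInputStream(new byte[8 << 20]));
        // cpu/wall near 1.0: reader thread is CPU-bound, not I/O-bound.
        System.out.printf("read %d bytes, cpu/wall = %.2f%n",
                r[0], (double) r[1] / Math.max(r[2], 1));
    }
}
```

If the ratio is low, move on to items 2 and 3 with netstat and iostat on the datanodes.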

Is the experimental setup described in more detail somewhere?

With very high-bandwidth networks, TCP buffer sizes can be a factor even at
LAN latencies.
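The rough sizing rule is the bandwidth-delay product: the socket buffer must hold bandwidth × RTT worth of in-flight data, or TCP stalls waiting for ACKs. Even at a 1 ms LAN RTT, a 10 Gbit/s flow needs about 1.25 MB of buffer, well above common OS defaults. A sketch of the arithmetic:

```java
public class BdpSizing {
    // Minimum TCP buffer (bytes) to keep a link busy: bandwidth * RTT.
    static long minBufferBytes(long bitsPerSecond, double rttSeconds) {
        return (long) (bitsPerSecond / 8.0 * rttSeconds);
    }

    public static void main(String[] args) {
        // 10 Gbit/s at a 1 ms LAN RTT.
        System.out.println(minBufferBytes(10_000_000_000L, 0.001)); // prints 1250000
    }
}
```

So at SC-class bandwidths the per-connection buffer, not the latency, is often what caps a single stream.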

A JIRA would also be a good place to discuss the details.

Raghu.

On Tue, Nov 24, 2009 at 3:35 PM, Michael Thomas <tho...@hep.caltech.edu> wrote:

> Hey guys,
>
> During the SC09 exercise, our data transfer tool was using the FUSE
> interface to HDFS.  As Brian said, we were also reading 16 files in
> parallel.  This seemed to be the optimal number, beyond which the aggregate
> read rate did not improve.
>
> We have work scheduled to modify our data transfer tool to use the native
> Hadoop Java APIs, as well as running some additional tests offline to see if
> the HDFS-FUSE interface is the bottleneck, as we suspect.
>
> Regards,
>
> --Mike
>
>
> On 11/24/2009 03:01 PM, Brian Bockelman wrote:
>
>> Hey Raghu,
>>
>> There are a few performance issues.  Last week during Supercomputing '09,
>> Caltech was having issues with getting more than 2.6 Gbps per HDFS client
>> process (I think they were pulling 16 files per process, but Mike knows the
>> details).  I think they'd appreciate any advice you have about tuning HDFS
>> performance.
>>
>> We're starting early R&D for 100Gbps dataflows, and I believe improving
>> our current HDFS performance is on the TODO list.
>>
>> Brian
>>
>> (PS - I'm not saying HDFS is at fault here - it always remains a
>> possibility that we're using it in a sub-optimal manner.  If you have any
>> favorite Java performance instrumentation to recommend, we'd also be
>> interested in that.)
>>
>> On Nov 24, 2009, at 12:35 PM, Raghu Angadi wrote:
>>
>>> Sequential read is the simplest case and it is pretty hard to improve
>>> upon the current raw performance (the HDFS client does take more CPU than
>>> one might expect; Todd implemented an improvement for the CPU consumed).
>>>
>>> Just to reiterate what Todd said, there is an implicit read ahead for
>>> sequential reads with TCP buffers and kernel read ahead on Datanodes.
>>>
If you extend the read-ahead buffer to be more of a buffer cache for the
>>> block, it could have a big impact for some read access patterns (e.g.
>>> binary search).
>>>
>>> Raghu.
>>>
On Mon, Nov 23, 2009 at 11:23 PM, Martin Mituzas <xietao1...@hotmail.com>
>>> wrote:
>>>
>>>
I read the code and find the call
>>>> DFSInputStream.read(buf, off, len)
>>>> will cause the DataNode to read len bytes (or less if encountering the
>>>> end of the block). Why does hdfs not read ahead to improve performance
>>>> for sequential reads?
>>>> --
>>>> View this message in context:
>>>>
>>>> http://old.nabble.com/why-does-not-hdfs-read-ahead---tp26491449p26491449.html
>>>> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>>>>
>>>>
>>>>
>>
>
>
