Re: File Transfer Rates

Brian Bockelman Tue, 10 Feb 2009 21:15:12 -0800


On Feb 10, 2009, at 11:09 PM, Mark Kerzner wrote:

Brian, large files using command-line hadoop go fast, so it is something about my computer or network. I won't worry about this now, especially in
light of Amit reporting fast writes and reads.

You're creating files using SequenceFile, right? It might be that the creation of the sequence file is the portion which is slow, not the network I/O.

I don't have much knowledge about optimization of SequenceFile creation. I assume that you'll want to start by tweaking compression on and off. Additionally, Jeff (I think) pointed to a Hadoop Archive file, which also might be an alternative for your system. I don't know enough to give you a set of pros and cons, just enough to mention it as an alternative to experiment with.


Sorry I'm not useful here...

Brian

Mark
On Tue, Feb 10, 2009 at 5:00 PM, Brian Bockelman <[email protected]> wrote:
On Feb 10, 2009, at 4:53 PM, Mark Kerzner wrote:

Brian, I have a similar question: why does transfer from a local
filesystem to SequenceFile take so long (about 1 second per Meg)?
Hey Mark,
I saw your question about speed the other day ... unfortunately, I didn't
have any specific advice so I stayed quiet :)

In a correctly configured cluster, performance is mostly limited by
available hardware. If it's obvious that performance is well below hardware limits (such as in your case), it's usually (a) you're not generating files
fast enough or (b) something is configured wrong.
Have you just tried hadoop fs -put .... for some large file hanging around locally? If that doesn't go more than 5MB/s or so (when your hardware can obviously do such a rate), then there's probably a configuration issue.
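A minimal sketch of that check, assuming the `hadoop` command is on your PATH and that you have a large local file to upload. The helper names `throughput_mb_per_s` and `time_put` are made up for illustration, and the 5MB/s figure is only the rough threshold mentioned above, not a benchmark:

```python
import os
import subprocess
import sys
import time


def throughput_mb_per_s(num_bytes, seconds):
    """Return the transfer rate in MB/s (1 MB = 10**6 bytes)."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return num_bytes / 1e6 / seconds


def time_put(local_path, hdfs_path):
    """Time `hadoop fs -put local_path hdfs_path` and return the rate in MB/s."""
    size = os.path.getsize(local_path)
    start = time.perf_counter()
    # check=True raises CalledProcessError if the upload fails.
    subprocess.run(["hadoop", "fs", "-put", local_path, hdfs_path], check=True)
    return throughput_mb_per_s(size, time.perf_counter() - start)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Arguments: <local file> <HDFS destination>
    rate = time_put(sys.argv[1], sys.argv[2])
    print(f"{rate:.1f} MB/s")
```

The measured time includes starting the JVM for the hadoop command, so use a file of at least a few hundred MB to keep that overhead from dominating the result.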
Brian
Thank you,
Mark

On Tue, Feb 10, 2009 at 4:46 PM, Brian Bockelman <[email protected]> wrote:
On Feb 10, 2009, at 4:10 PM, Wasim Bari wrote:

Hi,
Could someone help me find some real figures (transfer rates) for Hadoop file transfer from a local filesystem to HDFS, S3, etc., and among
storage systems (HDFS to S3, etc.)?

Thanks,

Wasim
What are you looking for? Maximum possible transfer rate? Maximum
possible transfer rate per client? Generally, if you're using the Java client, transfer rate to/from HDFS is limited by the hardware you have
and the network connection (if you have 1Gbps per client).
I could give you a graph showing a peak of 9Gbps from our Hadoop instance to the WAN, but that's not very interesting if you don't have a 10Gbps
pipe...

Brian
