I set dfs.datanode.max.xcievers to 4096, but this didn't seem to have
any effect on performance.
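
For reference, the change was just this in conf/hdfs-site.xml on the DN
host, followed by a DN restart:

<!-- note: 0.20 really does spell the property name "xcievers" -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>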

Here are some benchmarks (not sure what typical values are):

----- TestDFSIO ----- : write
           Date & time: Tue Mar 30 04:53:18 EDT 2010
       Number of files: 10
Total MBytes processed: 10000
     Throughput mb/sec: 23.41355598064167
Average IO rate mb/sec: 25.179018020629883
 IO rate std deviation: 7.022948102609891
    Test exec time sec: 74.437

----- TestDFSIO ----- : read
           Date & time: Tue Mar 30 05:02:01 EDT 2010
       Number of files: 10
Total MBytes processed: 10000
     Throughput mb/sec: 10.735545929349373
Average IO rate mb/sec: 10.741226196289062
 IO rate std deviation: 0.24872891783558398
    Test exec time sec: 119.561

----- TestDFSIO ----- : write
           Date & time: Tue Mar 30 05:09:59 EDT 2010
       Number of files: 40
Total MBytes processed: 40000
     Throughput mb/sec: 3.3887489806219473
Average IO rate mb/sec: 5.173769950866699
 IO rate std deviation: 6.293246618896401
    Test exec time sec: 360.765

----- TestDFSIO ----- : read
           Date & time: Tue Mar 30 05:18:20 EDT 2010
       Number of files: 40
Total MBytes processed: 40000
     Throughput mb/sec: 2.345990558443698
Average IO rate mb/sec: 2.3469674587249756
 IO rate std deviation: 0.04731737036312141
    Test exec time sec: 477.568

I also used 40 files in the benchmarks because I have 10 compute nodes
with mapred.tasktracker.map.tasks.maximum set to 4 (10 * 4 = 40
concurrent map tasks). It looks like performance degrades quite a bit
going from 10 files to 40.

I set mapred.tasktracker.map.tasks.maximum to 1 and ran an MR job. This
got map completion times back down to the expected 15-30 seconds, but
did not change the overall running time.
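
For completeness, that change was just this in conf/mapred-site.xml on
each compute node, followed by a TT restart:

<!-- one map slot per TT -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>1</value>
</property>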

Does this just mean that the RAID isn't able to keep up with 10*4=40
parallel requests, but it is able to keep up with 10*1=10 parallel
requests? And if so, is there anything I can do to change this? I know
this isn't how HDFS is meant to be used, but this single DN/RAID setup
has worked for me in the past on a similarly-sized cluster.
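
In case it helps point somewhere, the next thing I plan to try is
watching the RAID device on the DN host while the maps read, something
like:

# sysstat's iostat; %util pinned near 100 with high await would suggest
# the disk can't keep up with 40 concurrent readers
iostat -x 5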

Ed

On Tue, Mar 30, 2010 at 4:29 AM, Ankur C. Goel <gan...@yahoo-inc.com> wrote:
>
> M/R performance is known to be better when using just a bunch of disks
> (JBOD) instead of RAID.
>
> From your setup, it looks like your single datanode must be running hot
> on I/O.
>
> The parameter dfs.datanode.handler.count only controls the number of
> datanode threads serving IPC requests.
> These are NOT used for actual block transfers. Try upping
> dfs.datanode.max.xcievers.
>
> You can then run the I/O benchmarks to measure the throughput:
>
> hadoop jar $HADOOP_INSTALL/hadoop-*-test.jar TestDFSIO -write -nrFiles 10 -fileSize 1000
>
> -...@nkur
>
> On 3/30/10 12:46 PM, "Ed Mazur" <ma...@cs.umass.edu> wrote:
>
> Hi,
>
> I have a 12 node cluster where instead of running a DN on each compute
> node, I'm running just one DN backed by a large RAID (with a
> dfs.replication of 1). The compute node storage is limited, so the
> idea behind this was to free up more space for intermediate job data.
> So the cluster has that one node with the DN, a master node with the
> JT/NN, and 10 compute nodes each with a TT. I am running 0.20.1+169.68
> from Cloudera.
>
> The problem is that MR job performance is now worse than when using a
> traditional HDFS setup. A job that took 76 minutes before now takes
> 169 minutes. I've used this single DN setup before on a
> similarly-sized cluster without any problems, so what can I do to find
> the bottleneck?
>
> -Loading data into HDFS was fast, under 30 minutes to load ~240GB, so
> I'm thinking this is a DN <-> map task communication problem.
>
> -With a traditional HDFS setup, map tasks were taking 10-30 seconds,
> but they now take 45-90 seconds or more.
>
> -I grep'd the DN logs to find how long the 67633152-byte HDFS reads
> (map inputs) were taking. With the central DN, the reads were an order
> of magnitude slower than with traditional HDFS (e.g. 82008147000 vs.
> 8238455000 ns).
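>
> In case my method is off, this is roughly what I ran (the log name
> depends on the setup, and I'm assuming the clienttrace lines, whose
> trailing duration field is in nanoseconds):
>
> # pull the read times for the 67633152-byte map-input blocks
> grep 'HDFS_READ' logs/hadoop-*-datanode-*.log | grep 'bytes: 67633152'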
>
> -I tried increasing dfs.datanode.handler.count to 10, but this didn't
> seem to have any effect.
>
> -Could low memory be an issue? The machine the DN is running on only
> has 2GB and there is less than 100MB free without the DN running. I
> haven't observed any swapping going on though.
>
> -I looked at netstat during a job. I wasn't too sure what to look for,
> but I didn't see any substantial send/receive buffering.
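>
> What I ran, more or less (with netstat -tn, Recv-Q and Send-Q are
> columns 2 and 3):
>
> # show TCP connections with bytes sitting in either queue
> netstat -tn | awk 'NR > 2 && ($2 > 0 || $3 > 0)'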
>
> I've tried everything I can think of, so I'd really appreciate any tips. 
> Thanks.
>
> Ed
>
>
