OK, here you go:

I have 3 servers:
datanode on servers 1, 2, 3
namenode on server 1
secondarynamenode on server 2

All servers are at the Hetzner datacenter and connected through a 100Mbit link; pings between them are about 0.1ms.
Each server has 24GB RAM and an Intel Core i7 3GHz CPU.
Disk is 700GB RAID.

The bindings-related configuration is the following:

server 1:
core-site.xml
--------------------------------------
<name>fs.default.name</name>
<value>hdfs://5.6.7.11:8020</value>
--------------------------------------

hdfs-site.xml
--------------------------------------
<name>dfs.datanode.address</name>
<value>0.0.0.0:50010</value>

<name>dfs.datanode.http.address</name>
<value>0.0.0.0:50075</value>

<name>dfs.http.address</name>
<value>5.6.7.11:50070</value>

<name>dfs.secondary.https.port</name>
<value>50490</value>

<name>dfs.https.port</name>
<value>50470</value>

<name>dfs.https.address</name>
<value>5.6.7.11:50470</value>

<name>dfs.secondary.http.address</name>
<value>5.6.7.12:50090</value>
--------------------------------------

server 2:
core-site.xml
--------------------------------------
<name>fs.default.name</name>
<value>hdfs://5.6.7.11:8020</value>
--------------------------------------

hdfs-site.xml
--------------------------------------
<name>dfs.datanode.address</name>
<value>0.0.0.0:50010</value>

<name>dfs.datanode.http.address</name>
<value>0.0.0.0:50075</value>

<name>dfs.http.address</name>
<value>5.6.7.11:50070</value>

<name>dfs.secondary.https.port</name>
<value>50490</value>

<name>dfs.https.port</name>
<value>50470</value>

<name>dfs.https.address</name>
<value>5.6.7.11:50470</value>

<name>dfs.secondary.http.address</name>
<value>5.6.7.12:50090</value>
--------------------------------------

server 3:
core-site.xml
--------------------------------------
<name>fs.default.name</name>
<value>hdfs://5.6.7.11:8020</value>
--------------------------------------

hdfs-site.xml
--------------------------------------
<name>dfs.datanode.address</name>
<value>0.0.0.0:50010</value>

<name>dfs.datanode.http.address</name>
<value>0.0.0.0:50075</value>

<name>dfs.http.address</name>
<value>127.0.0.1:50070</value>

<name>dfs.secondary.https.port</name>
<value>50490</value>

<name>dfs.https.port</name>
<value>50470</value>

<name>dfs.https.address</name>
<value>127.0.0.1:50470</value>

<name>dfs.secondary.http.address</name>
<value>5.6.7.12:50090</value>
--------------------------------------

netstat output:

server 1
tcp   0   0 5.6.7.11:8020    0.0.0.0:*   LISTEN   10870/java
tcp   0   0 5.6.7.11:50070   0.0.0.0:*   LISTEN   10870/java
tcp   0   0 0.0.0.0:50010    0.0.0.0:*   LISTEN   10997/java
tcp   0   0 0.0.0.0:50075    0.0.0.0:*   LISTEN   10997/java
tcp   0   0 0.0.0.0:50020    0.0.0.0:*   LISTEN   10997/java

server 2
tcp   0   0 0.0.0.0:50010    0.0.0.0:*   LISTEN   23683/java
tcp   0   0 0.0.0.0:50075    0.0.0.0:*   LISTEN   23683/java
tcp   0   0 0.0.0.0:50020    0.0.0.0:*   LISTEN   23683/java
tcp   0   0 5.6.7.12:50090   0.0.0.0:*   LISTEN   23778/java

server 3
tcp   0   0 0.0.0.0:50010    0.0.0.0:*   LISTEN   894/java
tcp   0   0 0.0.0.0:50075    0.0.0.0:*   LISTEN   894/java
tcp   0   0 0.0.0.0:50020    0.0.0.0:*   LISTEN   894/java

If I'm transferring big files between servers, I'm getting about 9Mb/s and even 10Mb/s with rsync.

On 10/09/12 11:56 PM, Harsh J wrote:
> Hi,
>
> OK, can you detail your network infrastructure used here, and also
> make sure your daemons are binding to the right interfaces as well
> (use netstat to check perhaps)? What rate of transfer do you get for
> simple file transfers (ftp, scp, etc.)?
>
> On Wed, Oct 10, 2012 at 12:24 PM, Alexey <alexx...@gmail.com> wrote:
>> Hello Harsh,
>>
>> I noticed such issues from the start.
>> Yes, I mean the dfs.balance.bandwidthPerSec property; I set this property to
>> 5000000.
>>
>> On 10/09/12 11:50 PM, Harsh J wrote:
>>> Hey Alexey,
>>>
>>> Have you noticed this right from the start itself? Also, what exactly
>>> do you mean by "Limited replication bandwidth between datanodes -
>>> 5Mb." - Are you talking of dfs.balance.bandwidthPerSec property?
>>>
>>> On Wed, Oct 10, 2012 at 10:53 AM, Alexey <alexx...@gmail.com> wrote:
>>>> Additional info: I also tried to use openjdk instead of sun's - issue
>>>> still persists
>>>>
>>>> On 10/09/12 03:12 AM, Alexey wrote:
>>>>> Hi,
>>>>>
>>>>> I have an issue with hadoop dfs. I have 3 servers (24Gb RAM on each).
>>>>> The servers are not overloaded, they just have hadoop installed. One
>>>>> has datanode and namenode, second - datanode only, third - datanode and
>>>>> secondarynamenode.
>>>>>
>>>>> Hadoop datanodes have a max memory limit of 8Gb. Default replication factor
>>>>> - 2. Limited replication bandwidth between datanodes - 5Mb.
>>>>>
>>>>> I've set up hadoop to communicate between nodes by IP address.
>>>>> Everything works - I can read/write files on each datanode, etc. But
>>>>> the issue is that hadoop dfs commands execute very slowly; even
>>>>> "hadoop dfs -ls /" takes about 3 seconds to execute, although it has only
>>>>> one folder /user in it.
>>>>> Files are also uploading to the hdfs very slowly - hundreds of
>>>>> kilobytes/second.
>>>>>
>>>>> I'm using Debian stable x86-64 distribution and hadoop running through
>>>>> sun-java6-jdk 6.26-0squeeze1
>>>>>
>>>>> Please give me any suggestions on what I need to adjust/check to resolve
>>>>> this issue.
>>>>>
>>>>> As I said before - overall hdfs configuration is correct, because
>>>>> everything works except performance.
>>>>>
>>>>> --
>>>>> Best regards
>>>>> Alexey
>>>>>
>>>>
>>>> --
>>>> Best regards
>>>> Alexey
>>>
>>>
>>>
>>
>> --
>> Best regards
>> Alexey
>
>
>

--
Best regards
Alexey
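
P.S. As a rough sanity check on the figures in this thread, here is a minimal sketch of the unit arithmetic. It is not a measurement. It assumes that the rsync figure of "9Mb/s" means megabytes per second and that the link is 100 megabits per second; neither unit is confirmed above. The 300 KB/s value for HDFS uploads is a hypothetical stand-in for "hundreds of kilobytes/second". The value 5000000 for dfs.balance.bandwidthPerSec is interpreted as bytes per second, which is how that property is documented as far as I know. The function names are made up for illustration.

```python
# Rough unit arithmetic for the throughput figures quoted in the thread.
# Assumptions (not confirmed in the thread): link = 100 megabits/s,
# rsync rate = 9 megabytes/s, HDFS upload rate = 300 KB/s (illustrative).

def mbit_to_mbyte(mbit_per_s: float) -> float:
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbit_per_s / 8.0


def utilisation(rate_mbyte_per_s: float, link_mbit_per_s: float) -> float:
    """Fraction of the link's theoretical ceiling used by a given transfer rate."""
    return rate_mbyte_per_s / mbit_to_mbyte(link_mbit_per_s)


LINK_MBIT = 100.0                          # link speed stated in the message
link_ceiling = mbit_to_mbyte(LINK_MBIT)    # theoretical ceiling in MB/s
balancer_cap = 5_000_000 / 1_000_000       # dfs.balance.bandwidthPerSec, bytes/s -> MB/s

if __name__ == "__main__":
    print(f"link ceiling:  {link_ceiling:.1f} MB/s")
    print(f"balancer cap:  {balancer_cap:.1f} MB/s")
    print(f"rsync share:   {utilisation(9.0, LINK_MBIT):.0%} of the link (assumed 9 MB/s)")
    print(f"hdfs share:    {utilisation(0.3, LINK_MBIT):.1%} of the link (assumed 300 KB/s)")
```

Under these assumptions rsync already uses most of the link, while the HDFS uploads use only a few percent of it, so the raw network does not look like the limiting factor.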