Re: Want to unsubscribe myself

2014-09-03 Thread Wilm Schumacher
Hi,

you have to send an e-mail from your mail account to:
user-unsubscr...@hadoop.apache.org

Btw: I would like to propose an automatic regular e-mail with some
information about this mailing list, e.g. with "how to unsubscribe" ;).

Best wishes

Wilm


HDFS balance

2014-09-03 Thread Georgi Ivanov
Hi,
We have an 11-node cluster.
Every hour a cron job is started to upload one file (~1GB) to Hadoop on
node1 (plain hadoop fs -put).

This way node1 is getting full, because the first replica is always
stored on the node where the command is executed.
Every day I run a re-balance, but this seems not to be enough.
The effect of this is:
host1: 4.7TB/5.3TB
host[2-10]: 4.1TB/5.3TB

So I am always out of space on host1.

What I could do is spread the job across all the nodes and execute it
on a random host.
I don't really like this solution, as it involves some NFS mounts,
security issues, etc.

Is there any better solution?

Thanks in advance.
George



Re: HDFS balance

2014-09-03 Thread AnilKumar B
Better to create a client/gateway node (where no DN is running) and
schedule your cron from that machine.
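
A minimal sketch of that setup, assuming a gateway host that has the Hadoop
client and cluster configuration but runs no DataNode (paths and schedule are
illustrative):

# crontab entry on the gateway/client node; with no local DataNode, HDFS
# places the first replica on a cluster-chosen DataNode instead of filling
# up one local disk
0 * * * * /usr/bin/hadoop fs -put /data/export/hourly.dat /ingest/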

Thanks & Regards,
B Anil Kumar.


On Wed, Sep 3, 2014 at 1:25 PM, Georgi Ivanov wrote:

> Hi,
> We have an 11-node cluster.
> Every hour a cron job is started to upload one file (~1GB) to Hadoop on
> node1 (plain hadoop fs -put).
>
> This way node1 is getting full, because the first replica is always
> stored on the node where the command is executed.
> Every day I run a re-balance, but this seems not to be enough.
> The effect of this is:
> host1: 4.7TB/5.3TB
> host[2-10]: 4.1TB/5.3TB
>
> So I am always out of space on host1.
>
> What I could do is spread the job across all the nodes and execute it
> on a random host.
> I don't really like this solution, as it involves some NFS mounts,
> security issues, etc.
>
> Is there any better solution?
>
> Thanks in advance.
> George
>
>


Re: Hadoop 2.0.0 stopping itself

2014-09-03 Thread Harsh J
The general@ mailing list is NOT for end-user questions and
discussion. Please use the user@hadoop.apache.org mailing list for
such issues.

Please also read http://hadoop.apache.org/mailing_lists.html prior to
posting emails here.

As to your problem, you will need to look at your logs to determine
why the service crashes. It is likely that the NameNode is not
starting up due to missing data or an unformatted storage directory,
but you will find the precise reason by looking for a FATAL error in
your master service logs.
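
For example (the log location varies by distribution; this path is a guess):

grep FATAL /var/log/hadoop/hadoop-*-namenode-*.log | tail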

On Wed, Sep 3, 2014 at 2:30 PM, Juan García wrote:
> Hello everybody,
>
> when I start Hadoop using start-dfs and start-yarn, all nodes are correctly
> initiated. I check that using jps. After a while, I'm not sure how long
> (10 or 20 min), if I check again, the other nodes are down and jps only
> shows the jps process. Can I configure Hadoop not to stop? Is this an
> error? I'm very new to Hadoop.
>
> Thanks in advance.



-- 
Harsh J


OEV - NameNode crash Edits file for 1.0.3

2014-09-03 Thread Natarajan, Prabakaran 1. (NSN - IN/Bangalore)
Hi

My NameNode crashed. It gave a NullPointerException at
FSDirectory.addChild.

1)  My Hadoop version is 1.0.3
2)  I patched the code to identify the files throwing the
NullPointerException - I found those files.
3)  Now I want to remove those file operations from the edits file
4)  I converted the edits file to edits.xml using the Hadoop 2 tool OEV
5)  I removed the unwanted file ops and converted edits.xml back to binary
format using the same OEV tool from Hadoop 2 (sketched below)
6)  If I put this edits file in place, the NameNode throws an "Unexpected
version" IOException in FSNamesystem.java
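
Roughly, the OEV round trip for steps 4 and 5 (file names are placeholders):

hdfs oev -p xml -i edits -o edits.xml         # binary edits to XML
# ... remove the offending file ops from edits.xml by hand ...
hdfs oev -p binary -i edits.xml -o edits.new  # XML back to binary edits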

   How can I recover my NameNode?  The NameNode starts if I perform the other
option of skipping the files that throw the NullPointerException (later I was
able to build a new FsImage).
  Is this approach correct? Is there any better option?


Thanks and Regards
Prabakaran.N  aka NP
Nokia Networks, Bangalore
When "I" is replaced by "We" - even Illness becomes "Wellness"






Unsubscribe

2014-09-03 Thread Don Hilborn
unsubscribe
Don Hilborn, Solutions Engineer, Hortonworks
Mobile: 832-444-5463
Email: dhilb...@hortonworks.com
Website: http://www.hortonworks.com/


Hortonworks where business data becomes business insight

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.


RE: HDFS balance

2014-09-03 Thread John Lilley
Can you run the load from an "edge node" that is not a DataNode?
john

John Lilley
Chief Architect, RedPoint Global Inc.
1515 Walnut Street | Suite 300 | Boulder, CO 80302
T: +1 303 541 1516  | M: +1 720 938 5761 | F: +1 781-705-2077
Skype: jlilley.redpoint | john.lil...@redpoint.net | www.redpoint.net


-Original Message-
From: Georgi Ivanov [mailto:iva...@vesseltracker.com] 
Sent: Wednesday, September 03, 2014 1:56 AM
To: user@hadoop.apache.org
Subject: HDFS balance

Hi,
We have an 11-node cluster.
Every hour a cron job is started to upload one file (~1GB) to Hadoop on node1
(plain hadoop fs -put).

This way node1 is getting full, because the first replica is always stored on
the node where the command is executed.
Every day I run a re-balance, but this seems not to be enough.
The effect of this is:
host1: 4.7TB/5.3TB
host[2-10]: 4.1TB/5.3TB

So I am always out of space on host1.

What I could do is spread the job across all the nodes and execute it on a
random host.
I don't really like this solution, as it involves some NFS mounts, security
issues, etc.

Is there any better solution?

Thanks in advance.
George



RE: How can I increase the speed balancing?

2014-09-03 Thread John Lilley
I have also found that neither
dfsadmin -setBalancerBandwidth
nor
dfs.datanode.balance.bandwidthPerSec
has any notable effect on the apparent balancer rate. This is on Hadoop 2.2.0.
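
For reference, a sketch of the two settings in question (the 100 MB/s value
is illustrative):

# dynamic setting, in bytes per second; takes effect without a DataNode restart
hdfs dfsadmin -setBalancerBandwidth 104857600
# static counterpart: dfs.datanode.balance.bandwidthPerSec in hdfs-site.xml
# (DataNodes must be restarted to pick it up)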

john

From: cho ju il [mailto:tjst...@kgrid.co.kr]
Sent: Wednesday, September 03, 2014 12:55 AM
To: user@hadoop.apache.org
Subject: RE: How can I increase the speed balancing?


Bandwidth is enough.

And I use the command "bin/hdfs dfsadmin -setBalancerBandwidth 52428800".

Yet balancing is slow.

I think it is because the file transfer speed is slow and only 5 files are
moved per server.

-Original Message-
From: "Srikanth upputuri"
To: "user@hadoop.apache.org";
Cc:
Sent: 2014-09-03 (Wed) 14:10:24
Subject: RE: How can I increase the speed balancing?

I am not sure what you meant by ‘Bandwidth is not a lack of data nodes’ but
have you configured the balancer bandwidth property
‘dfs.datanode.balance.bandwidthPerSec’? If not, it defaults to 1KB/s. You can
increase this to improve the balancer speed. You may also set it dynamically
using the command ‘dfsadmin -setBalancerBandwidth newbandwidth’ before running
the balancer.





From: cho ju il [mailto:tjst...@kgrid.co.kr]
Sent: 03 September 2014 06:31
To: user@hadoop.apache.org
Subject: How can I increase the speed balancing?



hadoop version 2.4.1

Balancing speed is slow.

I think because the file transfer speed is slow.

Bandwidth is not a lack of data nodes.

How can I increase the speed balancing?

*** balancer log

2014-09-03 09:44:18,041 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/192.168.0.207:40010

2014-09-03 09:44:18,041 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/192.168.0.205:40010

2014-09-03 09:44:18,041 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/192.168.0.203:40010

2014-09-03 09:44:18,041 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/192.168.0.210:40010

2014-09-03 09:44:18,041 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/192.168.0.211:40010

2014-09-03 09:44:18,042 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/192.168.0.114:40010

2014-09-03 09:44:18,042 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/192.168.0.206:40010

2014-09-03 09:44:18,042 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/192.168.0.201:40010

2014-09-03 09:44:18,043 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/192.168.0.202:40010

2014-09-03 09:44:18,043 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/192.168.0.204:40010

2014-09-03 09:44:18,043 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 3 
over-utilized: [Source[192.168.0.203:40010, utilization=99.99693907952093], 
Source[192.168.0.201:40010, utilization=99.99713240471648], 
Source[192.168.0.202:40010, utilization=99.99652052169367]]

2014-09-03 09:44:18,043 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 2 
underutilized: [BalancerDatanode[192.168.0.211:40010, 
utilization=62.735024524531006], BalancerDatanode[192.168.0.114:40010, 
utilization=2.3174560700459224]]

2014-09-03 09:44:18,044 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
Need to move 70.30 TB to make the cluster balanced.

2014-09-03 09:44:18,044 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
Decided to move 10 GB bytes from 192.168.0.203:40010 to 192.168.0.211:40010

2014-09-03 09:44:18,044 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
Decided to move 10 GB bytes from 192.168.0.201:40010 to 192.168.0.114:40010

2014-09-03 09:44:18,044 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
Will move 20 GB in this iteration

2014-09-03 09:44:23,643 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
Successfully moved blk_1077216256_3475617 with size=1746577 from 
192.168.0.201:40010 to 192.168.0.114:40010 through 192.168.0.205:40010

2014-09-03 09:45:38,730 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
Successfully moved blk_1076170742_2430103 with size=16746627 from 
192.168.0.203:40010 to 192.168.0.211:40010 through 192.168.0.203:40010

2014-09-03 09:46:42,649 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
Successfully moved blk_1077216315_3475676 with size=38338949 from 
192.168.0.201:40010 to 192.168.0.114:40010 through 192.168.0.206:40010

2014-09-03 09:50:13,239 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
Successfully moved blk_1074229029_488207 with size=89080307 from 
192.168.0.203:40010 to 192.168.0.211:40010 through 192.168.0.203:40010

2014-09-03 09:50:58,659 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
Successfully moved blk_1077216317_3475678 with size=134217728 from 
192.168.0.201:40010 to 192.168.0.114:40010 through 192.168.0.210:40010

2014-09-03 09:51:15,660 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
Successfully moved blk_1075275002_1534363 with size=

Re: Unsubscribe

2014-09-03 Thread Ted Yu
See http://hadoop.apache.org/mailing_lists.html

Please send email to user-unsubscr...@hadoop.apache.org


On Wed, Sep 3, 2014 at 5:23 AM, Don Hilborn wrote:

> unsubscribe
> Don Hilborn, Solutions Engineer, Hortonworks
> Mobile: 832-444-5463
> Email: dhilb...@hortonworks.com
> Website: http://www.hortonworks.com/
>
>
> Hortonworks where business data becomes business insight


Re: Hadoop 2.0.0 stopping itself

2014-09-03 Thread Juan García
First of all, sorry for the inconvenience. Thanks for the advice. I will
check my logs.



On 03/09/14 at #4, Harsh J wrote:

The general@ mailing list is NOT for end-user questions and
discussion. Please use the user@hadoop.apache.org mailing list for
such issues.

Please also read http://hadoop.apache.org/mailing_lists.html prior to
posting emails here.

As to your problem, you will need to look at your logs to determine
why the service crashes. It is likely that the NameNode is not
starting up due to missing data or an unformatted storage directory,
but you will find the precise reason by looking for a FATAL error in
your master service logs.

On Wed, Sep 3, 2014 at 2:30 PM, Juan García wrote:

Hello everybody,

when I start Hadoop using start-dfs and start-yarn, all nodes are correctly
initiated. I check that using jps. After a while, I'm not sure how long
(10 or 20 min), if I check again, the other nodes are down and jps only shows
the jps process. Can I configure Hadoop not to stop? Is this an error? I'm
very new to Hadoop.

Thanks in advance.







question about matching java API with libHDFS

2014-09-03 Thread Demai Ni
hi, folks,

I am currently using Java to access HDFS. For example, I am using the API
"DFSClient.getNamenode().getBlockLocations(...)" to retrieve file block
information.

Now I need to move the same logic into C/C++, so I am looking at libHDFS
and this wiki page: http://wiki.apache.org/hadoop/LibHDFS. I am also using
hdfs_test.c for reference. However, I couldn't find an easy way to figure
out whether the above Java API is exposed through libHDFS.

Probably not, since I couldn't find it. That leads to my next question: is
there an easy way to plug into the libHDFS framework to include an
additional API?
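
For what it's worth, the nearest call exposed in hdfs.h appears to be
hdfsGetHosts(), which returns the replica hostnames for each block of a file
within a byte range; a minimal sketch follows (the path and range are
placeholders):

#include "hdfs.h"
#include <stdio.h>

int main(void) {
    /* connect using the default configuration (fs.default.name) */
    hdfsFS fs = hdfsConnect("default", 0);
    if (!fs) return 1;
    /* replica hostnames for each block in the first 128 MB of the file */
    char ***hosts = hdfsGetHosts(fs, "/tmp/somefile", 0, 128 * 1024 * 1024);
    if (hosts) {
        for (int b = 0; hosts[b]; ++b)
            for (int h = 0; hosts[b][h]; ++h)
                printf("block %d: replica on %s\n", b, hosts[b][h]);
        hdfsFreeHosts(hosts);   /* free the NULL-terminated host arrays */
    }
    hdfsDisconnect(fs);
    return 0;
}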

thanks a lot for your suggestions

Demai


StandbyCheckpointer: Exception in doCheckpoint

2014-09-03 Thread cho ju il
hadoop version 2.4.1
Checkpointing fails.
 
** namenode log, 
2014-09-03 13:35:05,338 ERROR org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Exception in doCheckpoint
java.io.IOException: No image directories available!
        at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImageInAllDirs(FSImage.java:1047)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:1020)
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.doCheckpoint(StandbyCheckpointer.java:182)
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.access$1400(StandbyCheckpointer.java:62)
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$CheckpointerThread.doWork(StandbyCheckpointer.java:336)
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$CheckpointerThread.access$600(StandbyCheckpointer.java:243)
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$CheckpointerThread$1.run(StandbyCheckpointer.java:263)
        at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$CheckpointerThread.run(StandbyCheckpointer.java:259)
 
 
 
** fsimage did not checkpoint
-rw-rw-r--. 1 hadoop hadoop 34985111 2014-09-03 11:34 fsimage_9498201
-rw-rw-r--. 1 hadoop hadoop       62 2014-09-03 11:34 fsimage_9498201.md5
-rw-rw-r--. 1 hadoop hadoop 35047073 2014-09-03 12:34 fsimage_9501479
-rw-rw-r--. 1 hadoop hadoop       62 2014-09-03 12:34 fsimage_9501479.md5
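
The "No image directories available" error means that none of the configured
fsimage directories were usable when the checkpoint ran, so it is worth
checking that every directory listed under dfs.namenode.name.dir on the
standby exists and is writable by the NameNode user. A sketch of the relevant
hdfs-site.xml entry (paths are illustrative):

<property>
  <name>dfs.namenode.name.dir</name>
  <value>/data/1/dfs/nn,/data/2/dfs/nn</value>
</property>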


Datanode can not start with error "Error creating plugin: org.apache.hadoop.metrics2.sink.FileSink"

2014-09-03 Thread ch huang
hi, maillist:

   I have a 10-worker-node Hadoop cluster using CDH 4.4.0. On one of my
datanodes, one of its disks is full.

When I restart this datanode, I get this error:


STARTUP_MSG:   java = 1.7.0_45
/
2014-09-04 10:20:00,576 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: registered UNIX signal
handlers for [TERM, HUP, INT]
2014-09-04 10:20:01,457 INFO org.apache.hadoop.metrics2.impl.MetricsConfig:
loaded properties from hadoop-metrics2.properties
2014-09-04 10:20:01,465 WARN
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Error creating sink
'file'
org.apache.hadoop.metrics2.impl.MetricsConfigException: Error creating
plugin: org.apache.hadoop.metrics2.sink.FileSink
at
org.apache.hadoop.metrics2.impl.MetricsConfig.getPlugin(MetricsConfig.java:203)
at
org.apache.hadoop.metrics2.impl.MetricsSystemImpl.newSink(MetricsSystemImpl.java:478)
at
org.apache.hadoop.metrics2.impl.MetricsSystemImpl.configureSinks(MetricsSystemImpl.java:450)
at
org.apache.hadoop.metrics2.impl.MetricsSystemImpl.configure(MetricsSystemImpl.java:429)
at
org.apache.hadoop.metrics2.impl.MetricsSystemImpl.start(MetricsSystemImpl.java:180)
at
org.apache.hadoop.metrics2.impl.MetricsSystemImpl.init(MetricsSystemImpl.java:156)
at
org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.init(DefaultMetricsSystem.java:54)
at
org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.initialize(DefaultMetricsSystem.java:50)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1792)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1728)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1751)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1904)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1925)
Caused by: org.apache.hadoop.metrics2.MetricsException: Error creating
datanode-metrics.out
at org.apache.hadoop.metrics2.sink.FileSink.init(FileSink.java:53)
at
org.apache.hadoop.metrics2.impl.MetricsConfig.getPlugin(MetricsConfig.java:199)
... 12 more
Caused by: java.io.FileNotFoundException: datanode-metrics.out (Permission
denied)
at java.io.FileOutputStream.open(Native Method)
at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
at java.io.FileWriter.<init>(FileWriter.java:107)
at org.apache.hadoop.metrics2.sink.FileSink.init(FileSink.java:48)
... 13 more
2014-09-04 10:20:01,488 INFO
org.apache.hadoop.metrics2.impl.MetricsSinkAdapter: Sink ganglia started
2014-09-04 10:20:01,546 INFO
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
period at 5 second(s).
2014-09-04 10:20:01,546 INFO
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system
started
2014-09-04 10:20:01,547 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Configured hostname is ch15
2014-09-04 10:20:01,569 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Opened streaming server at
/0.0.0.0:50010
2014-09-04 10:20:01,572 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is
10485760 bytes/s
2014-09-04 10:20:01,607 INFO org.mortbay.log: Logging to
org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
org.mortbay.log.Slf4jLog
2014-09-04 10:20:01,657 INFO org.apache.hadoop.http.HttpServer: Added
global filter 'safety'
(class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
2014-09-04 10:20:01,660 INFO org.apache.hadoop.http.HttpServer: Added
filter static_user_filter
(class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to
context datanode
2014-09-04 10:20:01,660 INFO org.apache.hadoop.http.HttpServer: Added
filter static_user_filter
(class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to
context static
2014-09-04 10:20:01,660 INFO org.apache.hadoop.http.HttpServer: Added
filter static_user_filter
(class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to
context logs
2014-09-04 10:20:01,664 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Opened info server at
0.0.0.0:50075
2014-09-04 10:20:01,668 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: dfs.webhdfs.enabled = true
2014-09-04 10:20:01,670 INFO org.apache.hadoop.http.HttpServer:
addJerseyResourcePackage:
packageName=org.apache.hadoop.hdfs.server.datanode.web.resources;org.apache.hadoop.hdfs.web.resources,
pathSpec=/webhdfs/v1/*
2014-09-04 10:20:01,676 INFO org.apache.hadoop.http.HttpServer:
HttpServer.start() threw a non Bind IOException
java.net.BindException: Port in use: 0.0.0.0:50075
at
org.apache.hadoop.http.HttpServer.openListener(HttpServer.java:729)
at org.apache.hadoop.http.HttpServer.start(HttpServer.java:673)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:424)
at
org.apac
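
The "Permission denied" comes from the FileSink trying to create
datanode-metrics.out relative to the daemon's working directory; the later
"Port in use: 0.0.0.0:50075" likely means the previous DataNode process is
still holding the port. A sketch of the relevant hadoop-metrics2.properties
entries, using an absolute, writable path (illustrative):

datanode.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
datanode.sink.file.filename=/var/log/hadoop/datanode-metrics.out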

RE: How can I increase the speed balancing?

2014-09-03 Thread Srikanth upputuri
AFAIK, this setting is meant to throttle bandwidth usage by the balancer so
that the balancing traffic will not severely impact the performance of running
jobs. Increasing this value will show an effect only when there is enough
total available bandwidth on the network; on an already overloaded network,
changing this value may not show much improvement. I suggest you look at the
total network capacity and the network usage on your datanodes to assess
whether there is sufficient room to increase your balancer bandwidth. Can you
also try running the balancer when there is no other traffic to see whether
changing this value has any impact?

Correction: Default balancer bandwidth is 1MB/s, not 1KB/s as I mentioned in my 
previous post. Sorry for the typo.
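
For reference, the static property as it would appear in hdfs-site.xml (the
100 MB/s value is illustrative; DataNodes must be restarted to pick it up):

<property>
  <name>dfs.datanode.balance.bandwidthPerSec</name>
  <value>104857600</value>
</property>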

From: John Lilley [mailto:john.lil...@redpoint.net]
Sent: 03 September 2014 17:38
To: user@hadoop.apache.org
Subject: RE: How can I increase the speed balancing?

I have also found that neither
dfsadmin -setBalancerBandwidth
nor
dfs.datanode.balance.bandwidthPerSec
has any notable effect on the apparent balancer rate. This is on Hadoop 2.2.0.

john

From: cho ju il [mailto:tjst...@kgrid.co.kr]
Sent: Wednesday, September 03, 2014 12:55 AM
To: user@hadoop.apache.org
Subject: RE: How can I increase the speed balancing?


Bandwidth is enough.

And I use the command "bin/hdfs dfsadmin -setBalancerBandwidth 52428800".

Yet balancing is slow.

I think it is because the file transfer speed is slow and only 5 files are
moved per server.

-Original Message-
From: "Srikanth 
upputuri"mailto:srikanth.upput...@huawei.com>>
To: 
"user@hadoop.apache.org"mailto:user@hadoop.apache.org>>;
Cc:
Sent: 2014-09-03 (수) 14:10:24
Subject: RE: How can I increase the speed balancing?

I am not sure what you meant by ‘Bandwidth is not a lack of data nodes’ but
have you configured the balancer bandwidth property
‘dfs.datanode.balance.bandwidthPerSec’? If not, it defaults to 1KB/s. You can
increase this to improve the balancer speed. You may also set it dynamically
using the command ‘dfsadmin -setBalancerBandwidth newbandwidth’ before running
the balancer.





From: cho ju il [mailto:tjst...@kgrid.co.kr]
Sent: 03 September 2014 06:31
To: user@hadoop.apache.org
Subject: How can I increase the speed balancing?



hadoop version 2.4.1

Balancing speed is slow.

I think because the file transfer speed is slow.

Bandwidth is not a lack of data nodes.

How can I increase the speed balancing?

*** balancer log

2014-09-03 09:44:18,041 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/192.168.0.207:40010

2014-09-03 09:44:18,041 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/192.168.0.205:40010

2014-09-03 09:44:18,041 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/192.168.0.203:40010

2014-09-03 09:44:18,041 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/192.168.0.210:40010

2014-09-03 09:44:18,041 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/192.168.0.211:40010

2014-09-03 09:44:18,042 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/192.168.0.114:40010

2014-09-03 09:44:18,042 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/192.168.0.206:40010

2014-09-03 09:44:18,042 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/192.168.0.201:40010

2014-09-03 09:44:18,043 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/192.168.0.202:40010

2014-09-03 09:44:18,043 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/192.168.0.204:40010

2014-09-03 09:44:18,043 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 3 
over-utilized: [Source[192.168.0.203:40010, utilization=99.99693907952093], 
Source[192.168.0.201:40010, utilization=99.99713240471648], 
Source[192.168.0.202:40010, utilization=99.99652052169367]]

2014-09-03 09:44:18,043 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 2 
underutilized: [BalancerDatanode[192.168.0.211:40010, 
utilization=62.735024524531006], BalancerDatanode[192.168.0.114:40010, 
utilization=2.3174560700459224]]

2014-09-03 09:44:18,044 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
Need to move 70.30 TB to make the cluster balanced.

2014-09-03 09:44:18,044 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
Decided to move 10 GB bytes from 192.168.0.203:40010 to 192.168.0.211:40010

2014-09-03 09:44:18,044 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
Decided to move 10 GB bytes from 192.168.0.201:40010 to 192.168.0.114:40010

2014-09-03 09:44:18,044 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
Will move 20 GB in this iteration

2014-09-03 09:44:23,643 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 
Successfully moved blk_1077216256_3475617 with size=1746577 from 
19