We found that a full Java PermGen space (permanent generation) caused the
datanode process to be slow. With the high replication factor, the delay seems
to build up with the number of datanodes.
Increasing the PermGen size improved performance significantly and made Impala a
lot faster as well.
Are there any other Java settings that someone recommends looking at to improve
performance?
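
For anyone who wants to check the same thing on their own datanodes, here is a minimal sketch. It assumes a Java 7 or earlier JVM, where `jstat -gc <pid>` reports the PermGen capacity and usage in the `PC` and `PU` columns (in KB). The function name and the sample values are made up for illustration and are not measurements from our clusters.

```python
def permgen_utilization(jstat_output: str) -> float:
    """Return PermGen used/capacity as a fraction, given `jstat -gc` text.

    Assumes the first non-empty line is the header and the last line holds
    the most recent sample, as printed by a Java 7 (or earlier) jstat.
    """
    rows = [ln.split() for ln in jstat_output.strip().splitlines() if ln.strip()]
    header, values = rows[0], rows[-1]
    sample = dict(zip(header, values))
    return float(sample["PU"]) / float(sample["PC"])


# Hypothetical sample output, for illustration only.
SAMPLE = """
 S0C    S1C    S0U    S1U      EC       EU        OC         OU       PC     PU    YGC     YGCT    FGC    FGCT     GCT
1024.0 1024.0  0.0   512.0  8192.0   4096.0   20480.0    10240.0  65536.0 65000.0   10    0.100    2    0.200    0.300
"""

if __name__ == "__main__":
    print(f"PermGen utilization: {permgen_utilization(SAMPLE):.1%}")
```

A value close to 100% would suggest raising the limit with `-XX:MaxPermSize`, for example through `HADOOP_DATANODE_OPTS` in `hadoop-env.sh`. On Java 8 and later PermGen no longer exists, so this check does not apply there.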

Cheers,
Laurens



-----Original Message-----
From: Laurens Bronwasser
Sent: Tuesday, September 02, 2014 03:47 PM W. Europe Standard Time
To: user@hadoop.apache.org
Cc: Julien Lehuen; Tyler McDougall
Subject: Re: Replication factor affecting write performance

Thanks Stanley for responding.

Cluster B has 12 data nodes. Cluster A has 18 (slower, older)
data nodes.
Within each cluster, all data nodes and name nodes are under one switch. The
clusters do not share a switch, though; they are in different data
centers.

Perhaps this helps: below is a trace of the combined logs of all data nodes
after writing one block of 500MB with replication factor 12 on Cluster B.
Note that the logs are combined and sorted with millisecond precision, so
certain lines may have happened in a different order:

2014-09-02 15:18:41,135 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: 
/10.141.0.87:55341 dest: /10.141.0.159:50010 hinterlands.trading.imc.intra
2014-09-02 15:18:41,137 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: 
/10.141.0.159:46759 dest: /10.141.0.168:50010 redridge.trading.imc.intra
2014-09-02 15:18:41,138 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: 
/10.141.0.168:37742 dest: /10.141.0.157:50010 eversong.trading.imc.intra
2014-09-02 15:18:41,139 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: 
/10.141.0.157:36874 dest: /10.141.0.156:50010 elwynn.trading.imc.intra
2014-09-02 15:18:41,140 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: 
/10.141.0.156:43356 dest: /10.141.0.170:50010 searing.trading.imc.intra
2014-09-02 15:18:41,141 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: 
/10.141.0.170:35201 dest: /10.141.0.155:50010 duskwood.trading.imc.intra
2014-09-02 15:18:41,142 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: 
/10.141.0.155:52119 dest: /10.141.0.174:50010 tirisfal.trading.imc.intra
2014-09-02 15:18:41,143 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: 
/10.141.0.174:45775 dest: /10.141.0.173:50010 sorrows.trading.imc.intra
2014-09-02 15:18:41,144 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: 
/10.141.0.173:55624 dest: /10.141.0.171:50010 shimmering.trading.imc.intra
2014-09-02 15:18:41,145 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: 
/10.141.0.171:45532 dest: /10.141.0.172:50010 silverpine.trading.imc.intra
2014-09-02 15:18:41,146 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: 
/10.141.0.172:35460 dest: /10.141.0.158:50010 gilneas.trading.imc.intra
2014-09-02 15:18:41,147 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Receiving BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587 src: 
/10.141.0.158:59490 dest: /10.141.0.169:50010 scarlet.trading.imc.intra
2014-09-02 15:19:11,506 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/10.141.0.158:59490, dest: /10.141.0.169:50010, bytes: 536000000, op: 
HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 
f18f77e8-db9a-4e91-9160-401ae4b2f15b, blockid: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 
30357265647 scarlet.trading.imc.intra
2014-09-02 15:19:11,506 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/10.141.0.172:35460, dest: /10.141.0.158:50010, bytes: 536000000, op: 
HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 
66b3bcb2-36a8-4684-878f-2b90cec9d558, blockid: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 
30357606668 gilneas.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
PacketResponder: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, 
type=HAS_DOWNSTREAM_IN_PIPELINE terminating gilneas.trading.imc.intra
2014-09-02 15:19:11,506 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
PacketResponder: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, 
type=LAST_IN_PIPELINE, downstreams=0:[] terminating scarlet.trading.imc.intra
2014-09-02 15:19:11,507 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/10.141.0.171:45532, dest: /10.141.0.172:50010, bytes: 536000000, op: 
HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 
3d587891-c434-4a45-acb1-e41122ab71f9, blockid: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 
30357881405 silverpine.trading.imc.intra
2014-09-02 15:19:11,507 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/10.141.0.173:55624, dest: /10.141.0.171:50010, bytes: 536000000, op: 
HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 
3b0fcf75-78d9-4bcd-b5e0-941a8dd40da8, blockid: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 
30358283146 shimmering.trading.imc.intra
2014-09-02 15:19:11,507 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/10.141.0.174:45775, dest: /10.141.0.173:50010, bytes: 536000000, op: 
HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 
60d4ad85-90d0-42e1-958e-b73ff55a4d2c, blockid: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 
30358543752 sorrows.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
PacketResponder: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, 
type=HAS_DOWNSTREAM_IN_PIPELINE terminating shimmering.trading.imc.intra
2014-09-02 15:19:11,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
PacketResponder: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, 
type=HAS_DOWNSTREAM_IN_PIPELINE terminating silverpine.trading.imc.intra
2014-09-02 15:19:11,508 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/10.141.0.155:52119, dest: /10.141.0.174:50010, bytes: 536000000, op: 
HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 
8bc724e7-0f5c-4347-be32-7b9812286c5c, blockid: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 
30358704352 tirisfal.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
PacketResponder: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, 
type=HAS_DOWNSTREAM_IN_PIPELINE terminating sorrows.trading.imc.intra
2014-09-02 15:19:11,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
PacketResponder: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, 
type=HAS_DOWNSTREAM_IN_PIPELINE terminating tirisfal.trading.imc.intra
2014-09-02 15:19:11,509 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/10.141.0.156:43356, dest: /10.141.0.170:50010, bytes: 536000000, op: 
HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 
6f0e01d4-81a0-4289-b21a-4ca03bc6a191, blockid: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 
30359566398 searing.trading.imc.intra
2014-09-02 15:19:11,509 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/10.141.0.170:35201, dest: /10.141.0.155:50010, bytes: 536000000, op: 
HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 
08b14350-3484-4a0e-8520-8167733c62bd, blockid: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 
30359128561 duskwood.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
PacketResponder: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, 
type=HAS_DOWNSTREAM_IN_PIPELINE terminating duskwood.trading.imc.intra
2014-09-02 15:19:11,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
PacketResponder: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, 
type=HAS_DOWNSTREAM_IN_PIPELINE terminating searing.trading.imc.intra
2014-09-02 15:19:11,510 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/10.141.0.157:36874, dest: /10.141.0.156:50010, bytes: 536000000, op: 
HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 
ddcdea62-71e0-4279-a96c-aeea90580249, blockid: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 
30359733153 elwynn.trading.imc.intra
2014-09-02 15:19:11,510 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/10.141.0.159:46759, dest: /10.141.0.168:50010, bytes: 536000000, op: 
HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 
c953d2ac-57ba-47a4-ad49-24161a18770b, blockid: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 
30360356635 redridge.trading.imc.intra
2014-09-02 15:19:11,510 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/10.141.0.168:37742, dest: /10.141.0.157:50010, bytes: 536000000, op: 
HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 
ab89db9e-c99c-4a6e-9e87-cec30626cef9, blockid: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 
30360119140 eversong.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
PacketResponder: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, 
type=HAS_DOWNSTREAM_IN_PIPELINE terminating elwynn.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
PacketResponder: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, 
type=HAS_DOWNSTREAM_IN_PIPELINE terminating eversong.trading.imc.intra
2014-09-02 15:19:11,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
PacketResponder: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, 
type=HAS_DOWNSTREAM_IN_PIPELINE terminating redridge.trading.imc.intra
2014-09-02 15:19:11,511 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/10.141.0.87:55341, dest: /10.141.0.159:50010, bytes: 536000000, op: 
HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: 
9ba59389-bce8-4cf0-815c-b96a20699d13, blockid: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: 
30328724294 hinterlands.trading.imc.intra
2014-09-02 15:19:11,511 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
PacketResponder: 
BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, 
type=HAS_DOWNSTREAM_IN_PIPELINE terminating hinterlands.trading.imc.intra
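
To put a number on the trace above, the sketch below computes the effective write throughput from one `clienttrace` record. It assumes the `duration` field is in nanoseconds, which matches the roughly 30 seconds between the 15:18:41 and 15:19:11 timestamps. The regular expression and the function name are illustrative, not part of Hadoop.

```python
import re

# Capture the byte count and the duration from a clienttrace record.
# DOTALL lets the pattern span the line wraps seen in the mailed log.
CLIENTTRACE = re.compile(r"bytes:\s*(\d+),.*?duration:\s*(\d+)", re.DOTALL)


def throughput_mb_per_s(record: str) -> float:
    """Return throughput in MB/s (1 MB = 10**6 bytes) for one record."""
    match = CLIENTTRACE.search(record)
    if match is None:
        raise ValueError("not a clienttrace record")
    nbytes, nanos = int(match.group(1)), int(match.group(2))
    return (nbytes / 1e6) / (nanos / 1e9)


# Record for the last datanode in the pipeline, copied from the trace.
RECORD = (
    "src: /10.141.0.158:59490, dest: /10.141.0.169:50010, bytes: 536000000, op: "
    "HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1486044046_1, offset: 0, srvID: "
    "f18f77e8-db9a-4e91-9160-401ae4b2f15b, blockid: "
    "BP-403847522-10.141.0.153-1407450167649:blk_1080927884_7187587, duration: "
    "30357265647"
)

if __name__ == "__main__":
    print(f"{throughput_mb_per_s(RECORD):.1f} MB/s")
```

Under that assumption the block was written at roughly 17.7 MB/s, and every datanode in the pipeline reports nearly the same duration, so the whole 12-node pipeline moves at the same pace.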




From: Stanley Shi <s...@pivotal.io>
Reply-To: user@hadoop.apache.org
Date: Tuesday, September 2, 2014 at 4:30 AM
To: user@hadoop.apache.org
Cc: Julien Lehuen <julien.leh...@imc.nl>, Tyler McDougall
<tyler.mcdoug...@imc.nl>
Subject: Re: Replication factor affecting write performance

What's the network setup and topology?
Also, the size of the cluster?


On Mon, Sep 1, 2014 at 4:10 PM, Laurens Bronwasser
<laurens.bronwas...@imc.nl> wrote:
And now with the right label on the Y-axis.

[Inline image: chart not preserved]

From: Microsoft Office User <laurens.bronwas...@imc.nl>
Date: Monday, September 1, 2014 at 9:56 AM
To: user@hadoop.apache.org
Cc: Julien Lehuen <julien.leh...@imc.nl>, Tyler McDougall
<tyler.mcdoug...@imc.nl>
Subject: Replication factor affecting write performance

Hi,
We have a setup with two clusters.
One cluster shows a very strong degradation when we increase the replication 
factor.
Another cluster shows hardly any degradation with increased replication factor.

Any idea how to find out the bottleneck in the slower cluster?



________________________________

The information in this e-mail is intended only for the person or entity to 
which it is addressed.

It may contain confidential and /or privileged material. If someone other than 
the intended recipient should receive this e-mail, he / she shall not be 
entitled to read, disseminate, disclose or duplicate it.

If you receive this e-mail unintentionally, please inform us immediately by 
"reply" and then delete it from your system. Although this information has been 
compiled with great care, neither IMC Financial Markets & Asset Management nor 
any of its related entities shall accept any responsibility for any errors, 
omissions or other inaccuracies in this information or for the consequences 
thereof, nor shall it be bound in any way by the contents of this e-mail or its 
attachments. In the event of incomplete or incorrect transmission, please 
return the e-mail to the sender and permanently delete this message and any 
attachments.

Messages and attachments are scanned for all known viruses. Always scan 
attachments before opening them.



--
Regards,
Stanley Shi,

