Perhaps turning on fsimage compression may help. See documentation of dfs.image.compress at http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml.
You can also try to throttle the bandwidth it uses via dfs.image.transfer.bandwidthPerSec. On Tue, Aug 13, 2013 at 3:45 PM, lei liu <liulei...@gmail.com> wrote: > I write one programm to test NameNode performance. Please see the > EditLogPerformance.java > > I use 60 threads to execute the EditLogPerformance.javacode, the testing > result is below content: > 2013-08-13 17:43:01,479 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10392810 speed:1055 > 2013-08-13 17:43:11,482 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10407310 speed:725 > 2013-08-13 17:43:21,484 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10407358 speed:2 > 2013-08-13 17:43:31,487 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10407490 speed:6 > 2013-08-13 17:43:41,490 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10407624 speed:6 > 2013-08-13 17:43:51,493 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10408690 speed:53 > 2013-08-13 17:44:01,496 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10422220 speed:676 > 2013-08-13 17:44:11,499 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10445216 speed:1149 > 2013-08-13 17:44:21,502 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10465166 speed:997 > 2013-08-13 17:44:31,505 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10486614 speed:1072 > 2013-08-13 17:44:41,508 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10506778 speed:1008 > 2013-08-13 17:44:51,511 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10526660 speed:994 > 2013-08-13 17:45:01,514 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10548092 speed:1071 > 2013-08-13 17:45:11,517 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10569892 speed:1090 > 2013-08-13 17:45:21,520 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10593296 speed:1170 > 2013-08-13 17:45:31,523 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10614478 speed:1059 > 2013-08-13 17:45:41,526 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10636006 speed:1076 > 2013-08-13 17:45:51,529 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10656430 speed:1021 > 2013-08-13 17:46:01,532 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10677774 speed:1067 > 2013-08-13 17:46:11,534 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10699096 speed:1066 > 2013-08-13 17:46:21,537 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10720970 speed:1093 > 2013-08-13 17:46:31,540 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10741432 speed:1023 > 2013-08-13 17:46:41,543 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10760854 speed:971 > 2013-08-13 17:46:51,546 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10781680 speed:1041 > 2013-08-13 17:47:01,549 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10802302 speed:1031 > 2013-08-13 17:47:11,552 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10823888 speed:1079 > 2013-08-13 17:47:21,555 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10845276 speed:1069 > 2013-08-13 17:47:31,558 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10865470 speed:1009 > 2013-08-13 17:47:41,561 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10885046 speed:978 > 2013-08-13 17:47:51,564 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10905606 speed:1028 > 2013-08-13 17:48:01,567 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10926854 speed:1062 > 2013-08-13 17:48:11,570 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10946446 speed:979 > 2013-08-13 17:48:21,573 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10966554 speed:1005 > 2013-08-13 17:48:31,576 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:10986794 speed:1012 > 2013-08-13 17:48:41,579 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:11007484 speed:1034 > 2013-08-13 17:48:51,581 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:11028400 speed:1045 > 2013-08-13 17:49:01,584 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:11049312 speed:1045 > 2013-08-13 17:49:11,587 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:11070396 speed:1054 > 2013-08-13 17:49:21,590 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:11087408 speed:850 > 2013-08-13 17:49:31,593 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:11087418 speed:0 > 2013-08-13 17:49:41,596 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:11087546 speed:6 > 2013-08-13 17:49:51,599 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:11087716 speed:8 > 2013-08-13 17:50:01,602 INFO my.EditLogPerformance > (EditLogPerformance.java:run(37)) - totalCount:11091608 speed:194 > > > > The speed is less than ten sometimes. I find when Active NameNode download > the fsimage file, the speed is less than ten. So I think download fsimage > file that affects the performance of Active NameNode. > > > There are below info in Standby NameNode: > 2013-08-13 17:48:12,412 INFO > org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering > checkpoint because there have been 2558038 txns since the last checkpoint, > which exceeds the configured threshold 1000000 > 2013-08-13 17:48:12,413 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: > Saving image file > /home/musa.ll/hadoop2/cluster-data/name/current/fsimage.ckpt_0000000000521186406 > using no compression > 2013-08-13 17:49:19,085 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: > Image file of size 3385425100 saved in 66 seconds. > 2013-08-13 17:49:19,655 INFO > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection > to > http://10.232.98.77:20021/getimage?putimage=1&txid=521186406&port=20021&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600 > 2013-08-13 17:53:21,107 INFO > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took > 241.45s at 0.00 KB/s > 2013-08-13 17:53:21,107 INFO > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with > txid 521186406 to namenode at 10.232.98.77:20021 > > > There are below info in Active NameNode: > 2013-08-13 17:49:19,659 INFO > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Opening connection > to > http://dw78.kgb.sqa.cm4:20021/getimage?getimage=1&txid=521186406&storageInfo=-40:1499625118:0:CID-921af0aa-b831-4828-965c-3b71a5149600 > 2013-08-13 17:53:20,610 INFO > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took > 240.95s at 13720.96 KB/s > 2013-08-13 17:53:20,610 INFO > org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file > fsimage.ckpt_0000000000521186406 size 3385425100 bytes. > > > > > > > > > > > > > 2013/8/13 Jitendra Yadav <jeetuyadav200...@gmail.com> >> >> Hi, >> >> Can you please let me know that how you identified the slowness between >> primary and standby namnode? >> >> Also please share the network connection bandwidth between these two >> servers. >> >> Thanks >> >> On Tue, Aug 13, 2013 at 11:52 AM, lei liu <liulei...@gmail.com> wrote: >>> >>> The fsimage file size is 1658934155 >>> >>> >>> 2013/8/13 Harsh J <ha...@cloudera.com> >>>> >>>> How large are your checkpointed fsimage files? >>>> >>>> On Mon, Aug 12, 2013 at 3:42 PM, lei liu <liulei...@gmail.com> wrote: >>>> > When Standby Namenode is doing checkpoint, upload the image file to >>>> > Active >>>> > NameNode, the Active NameNode is very slow. What is reason result to >>>> > the >>>> > Active NameNode is slow? >>>> > >>>> > >>>> > Thanks, >>>> > >>>> > LiuLei >>>> > >>>> >>>> >>>> >>>> -- >>>> Harsh J >>> >>> >> > -- Harsh J