[jira] [Created] (HDFS-15602) Support new Instance by non default constructor by ReflectionUtils
maobaolong created HDFS-15602: - Summary: Support new Instance by non default constructor by ReflectionUtils Key: HDFS-15602 URL: https://issues.apache.org/jira/browse/HDFS-15602 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.4.0 Reporter: maobaolong Assignee: maobaolong
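A minimal sketch of what such a helper could look like, assuming a hypothetical newInstance(Class, Class[], Object[]) overload next to the existing ReflectionUtils.newInstance(Class, Configuration); the name and signature here are illustrative, not the actual HDFS-15602 patch:
{code:java}
// Hypothetical sketch: instantiate a class through a non-default constructor.
public static <T> T newInstance(Class<T> clazz, Class<?>[] argTypes, Object[] args) {
  try {
    java.lang.reflect.Constructor<T> ctor = clazz.getDeclaredConstructor(argTypes);
    ctor.setAccessible(true);        // also allow non-public constructors
    return ctor.newInstance(args);   // invoke with the supplied arguments
  } catch (ReflectiveOperationException e) {
    throw new RuntimeException("Could not instantiate " + clazz.getName(), e);
  }
}
{code}
A caller could then write, for example, newInstance(MyPolicy.class, new Class<?>[]{Configuration.class}, new Object[]{conf}) instead of being limited to classes with a no-arg constructor.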
[jira] [Created] (HDFS-15399) Support include or exclude datanode by configure file
maobaolong created HDFS-15399: - Summary: Support include or exclude datanode by configure file Key: HDFS-15399 URL: https://issues.apache.org/jira/browse/HDFS-15399 Project: Hadoop HDFS Issue Type: New Feature Components: datanode Reporter: maobaolong Assignee: maobaolong When I want to reject a datanode, or only want specific datanodes to join the SCM, this feature would let me limit the datanode list.
[jira] [Comment Edited] (HDFS-9411) HDFS NodeLabel support
[ https://issues.apache.org/jira/browse/HDFS-9411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120730#comment-17120730 ] maobaolong edited comment on HDFS-9411 at 6/1/20, 2:52 AM: --- +1. This feature is pretty good. Is there any update on this? Huawei may have already built this feature into their product; what about the open-source version? was (Author: maobaolong): +1. This feature is pretty good. Is there any update on this? > HDFS NodeLabel support > -- > > Key: HDFS-9411 > URL: https://issues.apache.org/jira/browse/HDFS-9411 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Vinayakumar B >Assignee: Vinayakumar B >Priority: Major > Attachments: HDFS Node Labels-21-08-2017.pdf, > HDFSNodeLabels-15-09-2016.pdf, HDFSNodeLabels-20-06-2016.pdf, > HDFS_ZoneLabels-16112015.pdf > > > HDFS currently stores data blocks on different datanodes chosen by the > BlockPlacement Policy. These datanodes are random within the > scope (local-rack/different-rack/nodegroup) of the network topology. > In a multi-tenant scenario (a tenant can be a user or a service), blocks of any tenant > can be on any datanode. > Depending on each tenant's applications, a datanode might sometimes get busy, > slowing down another tenant's applications. It would be better if admins > had a provision to logically divide the cluster among the tenants. > NodeLabels gives users more options to specify constraints for selecting > specific nodes with specific requirements. > A high-level design doc will follow soon.
[jira] [Issue Comment Deleted] (HDFS-9411) HDFS NodeLabel support
[ https://issues.apache.org/jira/browse/HDFS-9411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-9411: - Comment: was deleted (was: +1. This feature is pretty good. Is there any update on this?) > HDFS NodeLabel support > -- > > Key: HDFS-9411 > URL: https://issues.apache.org/jira/browse/HDFS-9411 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Vinayakumar B >Assignee: Vinayakumar B >Priority: Major > Attachments: HDFS Node Labels-21-08-2017.pdf, > HDFSNodeLabels-15-09-2016.pdf, HDFSNodeLabels-20-06-2016.pdf, > HDFS_ZoneLabels-16112015.pdf > > > HDFS currently stores data blocks on different datanodes chosen by the > BlockPlacement Policy. These datanodes are random within the > scope (local-rack/different-rack/nodegroup) of the network topology. > In a multi-tenant scenario (a tenant can be a user or a service), blocks of any tenant > can be on any datanode. > Depending on each tenant's applications, a datanode might sometimes get busy, > slowing down another tenant's applications. It would be better if admins > had a provision to logically divide the cluster among the tenants. > NodeLabels gives users more options to specify constraints for selecting > specific nodes with specific requirements. > A high-level design doc will follow soon.
[jira] [Commented] (HDFS-9411) HDFS NodeLabel support
[ https://issues.apache.org/jira/browse/HDFS-9411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120730#comment-17120730 ] maobaolong commented on HDFS-9411: -- +1. This feature is pretty good. Is there any update on this? > HDFS NodeLabel support > -- > > Key: HDFS-9411 > URL: https://issues.apache.org/jira/browse/HDFS-9411 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Vinayakumar B >Assignee: Vinayakumar B >Priority: Major > Attachments: HDFS Node Labels-21-08-2017.pdf, > HDFSNodeLabels-15-09-2016.pdf, HDFSNodeLabels-20-06-2016.pdf, > HDFS_ZoneLabels-16112015.pdf > > > HDFS currently stores data blocks on different datanodes chosen by the > BlockPlacement Policy. These datanodes are random within the > scope (local-rack/different-rack/nodegroup) of the network topology. > In a multi-tenant scenario (a tenant can be a user or a service), blocks of any tenant > can be on any datanode. > Depending on each tenant's applications, a datanode might sometimes get busy, > slowing down another tenant's applications. It would be better if admins > had a provision to logically divide the cluster among the tenants. > NodeLabels gives users more options to specify constraints for selecting > specific nodes with specific requirements. > A high-level design doc will follow soon.
[jira] [Commented] (HDFS-9411) HDFS NodeLabel support
[ https://issues.apache.org/jira/browse/HDFS-9411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120729#comment-17120729 ] maobaolong commented on HDFS-9411: -- +1. This feature is pretty good. Is there any update on this? > HDFS NodeLabel support > -- > > Key: HDFS-9411 > URL: https://issues.apache.org/jira/browse/HDFS-9411 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Vinayakumar B >Assignee: Vinayakumar B >Priority: Major > Attachments: HDFS Node Labels-21-08-2017.pdf, > HDFSNodeLabels-15-09-2016.pdf, HDFSNodeLabels-20-06-2016.pdf, > HDFS_ZoneLabels-16112015.pdf > > > HDFS currently stores data blocks on different datanodes chosen by the > BlockPlacement Policy. These datanodes are random within the > scope (local-rack/different-rack/nodegroup) of the network topology. > In a multi-tenant scenario (a tenant can be a user or a service), blocks of any tenant > can be on any datanode. > Depending on each tenant's applications, a datanode might sometimes get busy, > slowing down another tenant's applications. It would be better if admins > had a provision to logically divide the cluster among the tenants. > NodeLabels gives users more options to specify constraints for selecting > specific nodes with specific requirements. > A high-level design doc will follow soon.
[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error
[ https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17083820#comment-17083820 ] maobaolong commented on HDFS-15240: --- Great job; looking forward to seeing your patch on the trunk branch. > Erasure Coding: dirty buffer causes reconstruction block error > -- > > Key: HDFS-15240 > URL: https://issues.apache.org/jira/browse/HDFS-15240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding >Reporter: HuangTao >Assignee: HuangTao >Priority: Major > Fix For: 3.4.0 > > Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, > HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch > > > When reading some lzo files we found some blocks were broken. > I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) from > the DN directly, chose 6 blocks (b0-b5) to decode the other 3 (b6', b7', b8'), > and found the longest common substring (LCS) between b6' (decoded) and b6 (read > from the DN), and likewise for b7'/b7 and b8'/b8. > After selecting 6 blocks of the block group in every combination and iterating > through all cases, I found one case where the length of the LCS is the block > length minus 64KB; 64KB is exactly the length of the ByteBuffer used by > StripedBlockReader. So the corrupt reconstruction block was produced by a dirty > buffer. > The following log snippet (showing only 2 of the 28 cases) is my check program's > output. In my case, I knew the 3rd block was corrupt, so the other 5 blocks were > needed to decode another 3 blocks; I then found that the 1st block's LCS length > is the block length minus 64KB. > It means blocks (0,1,2,4,5,6) were used to reconstruct the 3rd block, and the > dirty buffer was used before the 1st block was read. > Note that StripedBlockReader read from offset 0 of the 1st block > after the dirty buffer had been used. > {code:java} > decode from [0, 2, 3, 4, 5, 7] -> [1, 6, 8] > Check Block(1) first 131072 bytes longest common substring length 4 > Check Block(6) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4 > decode from [0, 2, 3, 4, 5, 6] -> [1, 7, 8] > Check Block(1) first 131072 bytes longest common substring length 65536 > CHECK AGAIN: Block(1) all 27262976 bytes longest common substring length > 27197440 # this one > Check Block(7) first 131072 bytes longest common substring length 4 > Check Block(8) first 131072 bytes longest common substring length 4{code} > Now I know the dirty buffer causes the reconstruction block error, but how does > the dirty buffer come about? > After digging into the code and the DN log, I found that the following DN log is the > root cause. > {code:java} > [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel > java.nio.channels.SocketChannel[connected local=/:52586 > remote=/:50010]. 18 millis timeout left.
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped > block: BP-714356632--1519726836856:blk_-YY_3472979393 > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94) > at > org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > Reading from a DN may time out (the read is held by a future F) and emit the INFO log above, but > by then the futures map that contains future F has already been cleared, > {code:java} > return new StripingChunkReadResult(futures.remove(future), > StripingChunkReadResult.CANCELLED); {code} > so futures.remove(future) returns null and causes the NPE, and the EC reconstruction fails. In the > finally phase, the code snippet in *getStripedReader().close()* > {code:java} > reconstructor.freeBuffer(reader.getReadBuffer()); > reader.freeReadBuffer(); > reader.closeBlockReader(); {code} > frees the buffer first, but the StripedBlockReader still holds the buffer and > writes into it.
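Based on the analysis above, a minimal sketch of the kind of null-check guard that would avoid the unboxing NPE in getNextCompletedStripedRead; this illustrates the idea only and may differ from the actual HDFS-15240 patch:
{code:java}
// futures.remove(future) can return null once the entry was cleared after a
// read timeout; unboxing that null into an int-typed chunk index throws the NPE.
Integer chunkIndex = futures.remove(future);
if (chunkIndex == null) {
  // Entry already cleared by the timeout path: report a timeout instead of
  // failing the whole reconstruction with a NullPointerException.
  return new StripingChunkReadResult(StripingChunkReadResult.TIMEOUT);
}
return new StripingChunkReadResult(chunkIndex, StripingChunkReadResult.CANCELLED);
{code}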
[jira] [Commented] (HDFS-15133) Use rocksdb to store NameNode inode and blockInfo
[ https://issues.apache.org/jira/browse/HDFS-15133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17032130#comment-17032130 ] maobaolong commented on HDFS-15133: --- There is still some work to do: - How do we measure the size of the inodeTable? For the heap store the size is easy to get, but how do we get the size of the rocksdb table? - I want to use protobuf for serialization and deserialization, so there is a lot of work to write the several proto messages; a PBHelper is needed, and several Codecs for the subclasses of INode have to be implemented. If everything goes well, we will see the result soon. > Use rocksdb to store NameNode inode and blockInfo > - > > Key: HDFS-15133 > URL: https://issues.apache.org/jira/browse/HDFS-15133 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > Attachments: image-2020-01-28-12-30-33-015.png > > > Maybe we don't need to checkpoint to an fsimage file; the rocksdb checkpoint can > achieve the same goal. > This is the ozone and alluxio way to manage the metadata of the master node.
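On the first question, RocksDB does not keep an exact row count, but it exposes an estimate through a standard property; a minimal sketch (the wrapper class and method names here are assumptions):
{code:java}
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

public final class RocksdbSizeUtil {
  // "rocksdb.estimate-num-keys" is a standard RocksDB property. It returns an
  // estimate rather than an exact count, so callers must tolerate small drift.
  public static long estimateKeyCount(RocksDB db) throws RocksDBException {
    return Long.parseLong(db.getProperty("rocksdb.estimate-num-keys"));
  }
}
{code}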
[jira] [Commented] (HDFS-15133) Use rocksdb to store NameNode inode and blockInfo
[ https://issues.apache.org/jira/browse/HDFS-15133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17032122#comment-17032122 ] maobaolong commented on HDFS-15133: --- [~hemanthboyina] Thank you very much for your comments. 1) I have not started this work yet; we can use protobuf for serialization and deserialization. 2) Indeed, we need to add our inodeTable. 3) Yeah, that was careless of me. Thank you again for your careful review; I have pushed a PR and look forward to your review. https://github.com/maobaolong/hadoop/pull/1 > Use rocksdb to store NameNode inode and blockInfo > - > > Key: HDFS-15133 > URL: https://issues.apache.org/jira/browse/HDFS-15133 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > Attachments: image-2020-01-28-12-30-33-015.png > > > Maybe we don't need to checkpoint to an fsimage file; the rocksdb checkpoint can > achieve the same goal. > This is the ozone and alluxio way to manage the metadata of the master node.
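For the serialization work mentioned above, the HDDS table layer plugs in per-type codecs; a sketch of what a protobuf-backed codec for INodeFile might look like, assuming HDDS's Codec contract (toPersistedFormat/fromPersistedFormat) and a hypothetical INodeFileProto message plus PBHelper conversions that do not exist yet:
{code:java}
import java.io.IOException;
import org.apache.hadoop.hdds.utils.db.Codec;

public class INodeFileCodec implements Codec<INodeFile> {
  @Override
  public byte[] toPersistedFormat(INodeFile inode) throws IOException {
    // PBHelper.convert and INodeFileProto are hypothetical: they are the
    // proto messages and helper described as future work in this comment.
    return PBHelper.convert(inode).toByteArray();
  }

  @Override
  public INodeFile fromPersistedFormat(byte[] rawData) throws IOException {
    return PBHelper.convert(INodeFileProto.parseFrom(rawData));
  }
}
{code}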
[jira] [Commented] (HDFS-15133) Use rocksdb to store NameNode inode and blockInfo
[ https://issues.apache.org/jira/browse/HDFS-15133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17028299#comment-17028299 ] maobaolong commented on HDFS-15133: --- [~brahmareddy] OK, I have made the inodeStore configurable, with HeapInodeStore as the default value. A new commit has been pushed to my repository https://github.com/maobaolong/hadoop/tree/rocks-metastore > Use rocksdb to store NameNode inode and blockInfo > - > > Key: HDFS-15133 > URL: https://issues.apache.org/jira/browse/HDFS-15133 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > Attachments: image-2020-01-28-12-30-33-015.png > > > Maybe we don't need to checkpoint to an fsimage file; the rocksdb checkpoint can > achieve the same goal. > This is the ozone and alluxio way to manage the metadata of the master node.
[jira] [Comment Edited] (HDFS-15133) Use rocksdb to store NameNode inode and blockInfo
[ https://issues.apache.org/jira/browse/HDFS-15133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025530#comment-17025530 ] maobaolong edited comment on HDFS-15133 at 1/29/20 2:01 AM: https://github.com/maobaolong/hadoop/tree/rocks-metastore This is a POC of my idea; it does not work yet, it is just a prototype, but you can see my work so far. was (Author: maobaolong): https://github.com/maobaolong/hadoop/tree/rocks-metastore > Use rocksdb to store NameNode inode and blockInfo > - > > Key: HDFS-15133 > URL: https://issues.apache.org/jira/browse/HDFS-15133 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > Attachments: image-2020-01-28-12-30-33-015.png > > > Maybe we don't need to checkpoint to an fsimage file; the rocksdb checkpoint can > achieve the same goal. > This is the ozone and alluxio way to manage the metadata of the master node.
[jira] [Commented] (HDFS-15133) Use rocksdb to store NameNode inode and blockInfo
[ https://issues.apache.org/jira/browse/HDFS-15133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025530#comment-17025530 ] maobaolong commented on HDFS-15133: --- https://github.com/maobaolong/hadoop/tree/rocks-metastore > Use rocksdb to store NameNode inode and blockInfo > - > > Key: HDFS-15133 > URL: https://issues.apache.org/jira/browse/HDFS-15133 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > Attachments: image-2020-01-28-12-30-33-015.png > > > Maybe we don't need to checkpoint to an fsimage file; the rocksdb checkpoint can > achieve the same goal. > This is the ozone and alluxio way to manage the metadata of the master node.
[jira] [Issue Comment Deleted] (HDFS-15133) Use rocksdb to store NameNode inode and blockInfo
[ https://issues.apache.org/jira/browse/HDFS-15133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-15133: -- Comment: was deleted (was: https://github.com/maobaolong/hadoop/tree/rocks-metastore) > Use rocksdb to store NameNode inode and blockInfo > - > > Key: HDFS-15133 > URL: https://issues.apache.org/jira/browse/HDFS-15133 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > Attachments: image-2020-01-28-12-30-33-015.png > > > Maybe we don't need to checkpoint to an fsimage file; the rocksdb checkpoint can > achieve the same goal. > This is the ozone and alluxio way to manage the metadata of the master node.
[jira] [Commented] (HDFS-15133) Use rocksdb to store NameNode inode and blockInfo
[ https://issues.apache.org/jira/browse/HDFS-15133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025531#comment-17025531 ] maobaolong commented on HDFS-15133: --- https://github.com/maobaolong/hadoop/tree/rocks-metastore > Use rocksdb to store NameNode inode and blockInfo > - > > Key: HDFS-15133 > URL: https://issues.apache.org/jira/browse/HDFS-15133 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > Attachments: image-2020-01-28-12-30-33-015.png > > > Maybe we don't need to checkpoint to an fsimage file; the rocksdb checkpoint can > achieve the same goal. > This is the ozone and alluxio way to manage the metadata of the master node.
[jira] [Commented] (HDFS-15133) Use rocksdb to store NameNode inode and blockInfo
[ https://issues.apache.org/jira/browse/HDFS-15133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025528#comment-17025528 ] maobaolong commented on HDFS-15133: --- [~hemanthboyina] Thank you for your question. Indeed, if the RocksInodeStore worked perfectly there would be no need for a HeapInodeStore, but everything has two sides. Although rocksInodeStore mode can raise the maximum number of inodes, and we will try to use a cache to improve its performance, I think HeapInodeStore still works well in some scenarios. Some companies use the original heap mode successfully and don't have to try the rocksInodeStore, so we must provide a way to let them run the namenode the original way; HeapInodeStore is that original way to manage inodes. > Use rocksdb to store NameNode inode and blockInfo > - > > Key: HDFS-15133 > URL: https://issues.apache.org/jira/browse/HDFS-15133 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > Attachments: image-2020-01-28-12-30-33-015.png > > > Maybe we don't need to checkpoint to an fsimage file; the rocksdb checkpoint can > achieve the same goal. > This is the ozone and alluxio way to manage the metadata of the master node.
[jira] [Comment Edited] (HDFS-15133) Use rocksdb to store NameNode inode and blockInfo
[ https://issues.apache.org/jira/browse/HDFS-15133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025012#comment-17025012 ] maobaolong edited comment on HDFS-15133 at 1/29/20 1:44 AM: [~ayushtkn] As far as I know, all of the inodes are stored in INodeMap.map, aren't they? I am not very sure. was (Author: maobaolong): As far as I know, all of the inodes are stored in INodeMap.map, aren't they? I am not very sure. > Use rocksdb to store NameNode inode and blockInfo > - > > Key: HDFS-15133 > URL: https://issues.apache.org/jira/browse/HDFS-15133 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > Attachments: image-2020-01-28-12-30-33-015.png > > > Maybe we don't need to checkpoint to an fsimage file; the rocksdb checkpoint can > achieve the same goal. > This is the ozone and alluxio way to manage the metadata of the master node.
[jira] [Commented] (HDFS-15133) Use rocksdb to store NameNode inode and blockInfo
[ https://issues.apache.org/jira/browse/HDFS-15133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025012#comment-17025012 ] maobaolong commented on HDFS-15133: --- As far as I know, all of the inodes are stored in INodeMap.map, aren't they? I am not very sure. > Use rocksdb to store NameNode inode and blockInfo > - > > Key: HDFS-15133 > URL: https://issues.apache.org/jira/browse/HDFS-15133 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > Attachments: image-2020-01-28-12-30-33-015.png > > > Maybe we don't need to checkpoint to an fsimage file; the rocksdb checkpoint can > achieve the same goal. > This is the ozone and alluxio way to manage the metadata of the master node.
[jira] [Commented] (HDFS-15133) Use rocksdb to store NameNode inode and blockInfo
[ https://issues.apache.org/jira/browse/HDFS-15133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025011#comment-17025011 ] maobaolong commented on HDFS-15133: --- [~hemanthboyina] Thank you for your help. - I found that TypedTable.iterator in HDDS can call RocksDB.newIterator() indirectly. - Yeah, I agree with "delete all the entries in rocksdb, so we should be deleting using DeleteFilesInRange()". To answer your questions: 1) HeapInodeStore and RocksInodeStore are two independent implementations of InodeStore, so there is no link between HeapInodeStore and RocksInodeStore. 2) There are no changes to the INodeMap tree structure; I honor the original structure. 3) I think InodeMap is a manager class; the true structure storing the inodes is its member InodeMap.map. All inodes live in this container, so I am trying to put all of the inodes into rocksdb by replacing InodeMap.map with an InodeStore. Finally, I am not sure I have explained this exactly, so please see the commit diff of my rocks-metastore branch; my repo is https://github.com/maobaolong/hadoop. > Use rocksdb to store NameNode inode and blockInfo > - > > Key: HDFS-15133 > URL: https://issues.apache.org/jira/browse/HDFS-15133 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > Attachments: image-2020-01-28-12-30-33-015.png > > > Maybe we don't need to checkpoint to an fsimage file; the rocksdb checkpoint can > achieve the same goal. > This is the ozone and alluxio way to manage the metadata of the master node.
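A minimal sketch of the InodeStore abstraction described above; the interface name comes from the branch, but the method set is an assumption modeled on how INodeMap.map (a GSet) is used today:
{code:java}
// Sketch only: HeapInodeStore would keep the original GSet-backed behavior,
// while RocksInodeStore would persist inodes through the relocated db layer.
public interface InodeStore {
  INodeWithAdditionalFields get(INode inode);   // lookup
  void put(INodeWithAdditionalFields inode);    // insert or update
  void remove(INode inode);                     // delete
  long size();                                  // easy on heap, estimated on rocksdb
  void clear();                                 // reset the backing store
}
{code}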
[jira] [Comment Edited] (HDFS-15133) Use rocksdb to store NameNode inode and blockInfo
[ https://issues.apache.org/jira/browse/HDFS-15133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024858#comment-17024858 ] maobaolong edited comment on HDFS-15133 at 1/28/20 4:30 AM: [~brahmareddy] [~ayushtkn], After some work over the past few days, here is an update. - I copied the package `org.apache.hadoop.hdds.utils.db` from the hadoop-ozone project into the hadoop-common module of the hadoop project. In effect, I think I have completed the sub-task HDFS-15137. - I am refactoring INodeMap, trying to abstract the `GSet map` into a concept named `InodeStore`. It has two implementations: one named HeapInodeStore, which, as the name suggests, is the original mode, and the other named RocksInodeStore, which works in rocksdb mode. In effect, I am doing part of https://issues.apache.org/jira/browse/HDFS-15138. But I have met some problems: - It is hard to measure the size of a rocksdb. - How to implement the method getMapIterator gracefully for the RocksInodeStore. - A clear method has to be implemented to reset the rocksdb for the RocksInodeStore. Future work: - I will commit my work to my branch named rocks-metastore; my repo is https://github.com/maobaolong/hadoop. It still contains compile errors, so I will commit several days later. - Feel free to discuss with me here or in the slack hdfs channel. - Feel free to take these jira tasks or help me correct my mistakes. - Do some minor testing. This picture mainly shows the relationships of the new classes. !image-2020-01-28-12-30-33-015.png! was (Author: maobaolong): [~brahmareddy] [~ayushtkn], After some work over the past few days, here is an update. - I copied the package `org.apache.hadoop.hdds.utils.db` from the hadoop-ozone project into the hadoop-common module of the hadoop project. In effect, I think I have completed the sub-task HDFS-15137. - I am refactoring INodeMap, trying to abstract the `GSet map` into a concept named `InodeStore`. It has two implementations: one named HeapInodeStore, which, as the name suggests, is the original mode, and the other named RocksInodeStore, which works in rocksdb mode. In effect, I am doing part of https://issues.apache.org/jira/browse/HDFS-15138. But I have met some problems: - It is hard to measure the size of a rocksdb. - How to implement the method getMapIterator gracefully for the RocksInodeStore. - A clear method has to be implemented to reset the rocksdb for the RocksInodeStore. Future work: - I will commit my work to my branch named rocks-metastore; my repo is https://github.com/maobaolong/hadoop. It still contains compile errors, so I will commit several days later. - Feel free to discuss with me here or in the slack hdfs channel. - Feel free to take these jira tasks or help me correct my mistakes. - Do some minor testing. > Use rocksdb to store NameNode inode and blockInfo > - > > Key: HDFS-15133 > URL: https://issues.apache.org/jira/browse/HDFS-15133 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > Attachments: image-2020-01-28-12-30-33-015.png > > > Maybe we don't need to checkpoint to an fsimage file; the rocksdb checkpoint can > achieve the same goal. > This is the ozone and alluxio way to manage the metadata of the master node.
[jira] [Commented] (HDFS-15133) Use rocksdb to store NameNode inode and blockInfo
[ https://issues.apache.org/jira/browse/HDFS-15133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024858#comment-17024858 ] maobaolong commented on HDFS-15133: --- [~brahmareddy] [~ayushtkn], After some work over the past few days, here is an update. - I copied the package `org.apache.hadoop.hdds.utils.db` from the hadoop-ozone project into the hadoop-common module of the hadoop project. In effect, I think I have completed the sub-task HDFS-15137. - I am refactoring INodeMap, trying to abstract the `GSet map` into a concept named `InodeStore`. It has two implementations: one named HeapInodeStore, which, as the name suggests, is the original mode, and the other named RocksInodeStore, which works in rocksdb mode. In effect, I am doing part of https://issues.apache.org/jira/browse/HDFS-15138. But I have met some problems: - It is hard to measure the size of a rocksdb. - How to implement the method getMapIterator gracefully for the RocksInodeStore. - A clear method has to be implemented to reset the rocksdb for the RocksInodeStore. Future work: - I will commit my work to my branch named rocks-metastore; my repo is https://github.com/maobaolong/hadoop. It still contains compile errors, so I will commit several days later. - Feel free to discuss with me here or in the slack hdfs channel. - Feel free to take these jira tasks or help me correct my mistakes. - Do some minor testing. > Use rocksdb to store NameNode inode and blockInfo > - > > Key: HDFS-15133 > URL: https://issues.apache.org/jira/browse/HDFS-15133 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > > Maybe we don't need to checkpoint to an fsimage file; the rocksdb checkpoint can > achieve the same goal. > This is the ozone and alluxio way to manage the metadata of the master node.
[jira] [Created] (HDFS-15139) Use RDBStore and TypedTable to manage the blockinfo of namenode
maobaolong created HDFS-15139: - Summary: Use RDBStore and TypedTable to manage the blockinfo of namenode Key: HDFS-15139 URL: https://issues.apache.org/jira/browse/HDFS-15139 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 3.3.0 Reporter: maobaolong Replace BlockManager.BlocksMap.blocks from a GSet to rocksdb.
[jira] [Created] (HDFS-15138) Use RDBStore and TypedTable to manage the inode of namenode
maobaolong created HDFS-15138: - Summary: Use RDBStore and TypedTable to manage the inode of namenode Key: HDFS-15138 URL: https://issues.apache.org/jira/browse/HDFS-15138 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 3.3.0 Reporter: maobaolong Replace FSDirectory.inodeMap.map from a GSet to rocksdb.
[jira] [Created] (HDFS-15137) Move RDBStore logic from apache-ozone into hadoop-commons module of apache-hadoop
maobaolong created HDFS-15137: - Summary: Move RDBStore logic from apache-ozone into hadoop-commons module of apache-hadoop Key: HDFS-15137 URL: https://issues.apache.org/jira/browse/HDFS-15137 Project: Hadoop HDFS Issue Type: Sub-task Reporter: maobaolong
[jira] [Commented] (HDFS-15133) Use rocksdb to store NameNode inode and blockInfo
[ https://issues.apache.org/jira/browse/HDFS-15133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17020727#comment-17020727 ] maobaolong commented on HDFS-15133: --- [~iwasakims] [~hemanthboyina] After reviewing HDFS-5389 and HDFS-8286, I saw similar designs, but both aim to invent a new approach to achieve the goal. This jira mainly hopes to reuse the approach to managing rocksdb that is already implemented in HDDS. RDBStore and TypedTable can serve as the kv store manager, so we can start the work by moving the RDBStore-related code into hadoop-common, so that ozone, hdfs, yarn, and other components can use this feature without any extra effort. > Use rocksdb to store NameNode inode and blockInfo > - > > Key: HDFS-15133 > URL: https://issues.apache.org/jira/browse/HDFS-15133 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > > Maybe we don't need to checkpoint to an fsimage file; the rocksdb checkpoint can > achieve the same goal. > This is the ozone and alluxio way to manage the metadata of the master node.
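To make the reuse concrete, a small sketch of how a namenode table could be opened through those HDDS abstractions once they live in hadoop-common; DBStoreBuilder, Table, and LongCodec are existing HDDS classes, while the db name, path, table name, and INodeFileCodec are illustrative assumptions:
{code:java}
// Sketch under assumptions: the HDDS db layer relocated into hadoop-common.
DBStore store = DBStoreBuilder.newBuilder(conf)
    .setName("namenode.db")                            // assumed db file name
    .setPath(Paths.get("/data0/nn-db"))                // assumed metadata dir
    .addTable("inode")                                 // one column family per table
    .addCodec(Long.class, new LongCodec())
    .addCodec(INodeFile.class, new INodeFileCodec())   // hypothetical codec
    .build();

Table<Long, INodeFile> inodeTable =
    store.getTable("inode", Long.class, INodeFile.class);
inodeTable.put(inode.getId(), inode);                  // typed kv access on rocksdb
{code}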
[jira] [Updated] (HDFS-15133) Use rocksdb to store NameNode inode and blockInfo
[ https://issues.apache.org/jira/browse/HDFS-15133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-15133: -- Description: Maybe we don't need to checkpoint to an fsimage file; the rocksdb checkpoint can achieve the same goal. This is the ozone and alluxio way to manage the metadata of the master node. was: Maybe we don't need to checkpoint to an fsimage file; the rocksdb checkpoint can achieve the same goal. > Use rocksdb to store NameNode inode and blockInfo > - > > Key: HDFS-15133 > URL: https://issues.apache.org/jira/browse/HDFS-15133 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > > Maybe we don't need to checkpoint to an fsimage file; the rocksdb checkpoint can > achieve the same goal. > This is the ozone and alluxio way to manage the metadata of the master node.
[jira] [Updated] (HDFS-15133) Use rocksdb to store NameNode inode and blockInfo
[ https://issues.apache.org/jira/browse/HDFS-15133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-15133: -- Component/s: namenode Affects Version/s: 3.3.0 > Use rocksdb to store NameNode inode and blockInfo > - > > Key: HDFS-15133 > URL: https://issues.apache.org/jira/browse/HDFS-15133 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > > Maybe we don't need to checkpoint to an fsimage file; the rocksdb checkpoint can > achieve the same goal.
[jira] [Created] (HDFS-15133) Use rocksdb to store NameNode inode and blockInfo
maobaolong created HDFS-15133: - Summary: Use rocksdb to store NameNode inode and blockInfo Key: HDFS-15133 URL: https://issues.apache.org/jira/browse/HDFS-15133 Project: Hadoop HDFS Issue Type: Improvement Reporter: maobaolong Maybe we don't need to checkpoint to an fsimage file; the rocksdb checkpoint can achieve the same goal.
[jira] [Commented] (HDFS-14586) Trash missing delete the folder which near timeout checkpoint
[ https://issues.apache.org/jira/browse/HDFS-14586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16870692#comment-16870692 ] maobaolong commented on HDFS-14586: --- [~hexiaoqiao] I have another viewpoint: we need to keep the checkpoint's generation timestamp so it can be measured against, which is why he does the check at delete time. I think both ways are correct, but this way keeps the timestamp. > Trash missing delete the folder which near timeout checkpoint > - > > Key: HDFS-14586 > URL: https://issues.apache.org/jira/browse/HDFS-14586 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hu yongfa >Assignee: hu yongfa >Priority: Major > Attachments: HDFS-14586.001.patch > > > When the trash checkpoint timeout arrives, trash deletes the old checkpoint folder first, > then creates a new checkpoint folder. > Since the delete action may take a long time, such as 2 minutes, the new > checkpoint folder is created late. > At the next checkpoint timeout, trash skips deleting the new checkpoint folder, > because the new checkpoint folder is > less than one checkpoint interval old.
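To make the timing concrete with assumed numbers: with a 60-minute checkpoint interval, the emptier fires at T=60, spends 2 minutes deleting the old checkpoint, and only creates the new checkpoint folder at T=62. At the next run, T=120, that folder is 58 minutes old, less than one interval, so it is skipped and survives until T=180, a full interval longer than intended.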
[jira] [Commented] (HDFS-14586) Trash missing delete the folder which near timeout checkpoint
[ https://issues.apache.org/jira/browse/HDFS-14586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869107#comment-16869107 ] maobaolong commented on HDFS-14586: --- [~huyongfa] This is exactly what our company wants most. Thank you for your contribution! > Trash missing delete the folder which near timeout checkpoint > - > > Key: HDFS-14586 > URL: https://issues.apache.org/jira/browse/HDFS-14586 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hu yongfa >Priority: Major > Attachments: HDFS-14586.001.patch > > > When the trash checkpoint timeout arrives, trash deletes the old checkpoint folder first, > then creates a new checkpoint folder. > Since the delete action may take a long time, such as 2 minutes, the new > checkpoint folder is created late. > At the next checkpoint timeout, trash skips deleting the new checkpoint folder, > because the new checkpoint folder is > less than one checkpoint interval old.
[jira] [Created] (HDDS-1606) ozone s3g cannot started caused by NoInitialContextException: xxx java.naming.factory.initial
maobaolong created HDDS-1606: Summary: ozone s3g cannot started caused by NoInitialContextException: xxx java.naming.factory.initial Key: HDDS-1606 URL: https://issues.apache.org/jira/browse/HDDS-1606 Project: Hadoop Distributed Data Store Issue Type: Bug Affects Versions: 0.5.0 Environment: ozone-site.xml
{code:xml}
<property>
  <name>ozone.enabled</name>
  <value>true</value>
</property>
<property>
  <name>ozone.metadata.dirs</name>
  <value>/data0/disk1/meta</value>
</property>
<property>
  <name>ozone.scm.datanode.id</name>
  <value>/data0/disk1/meta/node/datanode.id</value>
</property>
<property>
  <name>ozone.om.address</name>
  <value>ozonemanager.hadoop.apache.org</value>
</property>
<property>
  <name>ozone.om.db.dirs</name>
  <value>/data0/om-db-dirs</value>
</property>
<property>
  <name>ozone.scm.names</name>
  <value>172.16.150.142</value>
</property>
<property>
  <name>ozone.om.address</name>
  <value>172.16.150.142</value>
</property>
<property>
  <name>hdds.datanode.http.enabled</name>
  <value>true</value>
</property>
<property>
  <name>ozone.s3g.domain.name</name>
  <value>s3g.internal</value>
</property>
{code}
Reporter: maobaolong
$ ozone s3g
/software/servers/jdk1.8.0_121/bin/java -Dproc_s3g -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5008 -Dhadoop.log.dir=/software/servers/ozone-0.5.0-SNAPSHOT/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/software/servers/ozone-0.5.0-SNAPSHOT -Dhadoop.id.str=hadp -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.ozone.s3.Gateway
2019-05-29 16:46:28,056 INFO hdfs.DFSUtil: Starting Web-server for s3gateway at: http://0.0.0.0:9878
2019-05-29 16:46:28,079 INFO util.log: Logging initialized @8123ms
2019-05-29 16:46:28,164 INFO server.AuthenticationFilter: Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2019-05-29 16:46:28,178 INFO http.HttpRequestLog: Http request log for http.requests.s3gateway is not defined
2019-05-29 16:46:28,188 INFO http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2019-05-29 16:46:28,191 INFO http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context s3gateway
2019-05-29 16:46:28,191 INFO http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2019-05-29 16:46:28,191 INFO http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2019-05-29 16:46:28,206 [main] INFO - Starting Ozone S3 gateway
2019-05-29 16:46:28,212 INFO http.HttpServer2: Jetty bound to port 9878
2019-05-29 16:46:28,213 INFO server.Server: jetty-9.3.24.v20180605, build timestamp: 2018-06-06T01:11:56+08:00, git hash: 84205aa28f11a4f31f2a3b86d1bba2cc8ab69827
2019-05-29 16:46:28,241 INFO handler.ContextHandler: Started o.e.j.s.ServletContextHandler@68f4865{/logs,file:///software/servers/ozone-0.5.0-SNAPSHOT/logs/,AVAILABLE}
2019-05-29 16:46:28,242 INFO handler.ContextHandler: Started o.e.j.s.ServletContextHandler@39d9314d{/static,jar:file:/software/servers/ozone-0.5.0-SNAPSHOT/share/ozone/lib/hadoop-ozone-s3gateway-0.5.0-SNAPSHOT.jar!/webapps/static,AVAILABLE}
ERROR StatusLogger No Log4j 2 configuration file found. Using default configuration (logging only errors to the console), or user programmatically provided configurations. Set system property 'log4j2.debug' to show Log4j 2 internal initialization logging.
See https://logging.apache.org/log4j/2.x/manual/configuration.html for instructions on how to configure Log4j 2
2019-05-29 16:46:28,974 WARN webapp.WebAppContext: Failed startup of context o.e.j.w.WebAppContext@7487b142{/,file:///tmp/jetty-0.0.0.0-9878-s3gateway-_-any-2799631504400193724.dir/webapp/,UNAVAILABLE}{/s3gateway}
org.jboss.weld.exceptions.DefinitionException: Exception List with 1 exceptions:
Exception 0 :
java.lang.RuntimeException: javax.naming.NoInitialContextException: Need to specify class name in environment or system property, or as an applet parameter, or in an application resource file: java.naming.factory.initial
at com.sun.jersey.server.impl.cdi.CDIExtension.initialize(CDIExtension.java:201)
at com.sun.jersey.server.impl.cdi.CDIExtension.beforeBeanDiscovery(CDIExtension.java:302)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.jboss.weld.injection.StaticMethodInjectionPoint.invoke(StaticMethodInjectionPoint.java:88)
at org.jboss.weld.injection.MethodInvocationStrategy$SpecialParamPlusBeanManagerStrategy.invoke(MethodInvocationStrategy.java:144)
at org.jboss.weld.event.ObserverMethodImpl.sendEvent(ObserverMethodImpl.java:29
[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16842801#comment-16842801 ] maobaolong commented on HDFS-14353: --- [~elgoiri] Thanks for reminding me; I have corrected it. PTAL. > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > HDFS-14353.003.patch, HDFS-14353.004.patch, HDFS-14353.005.patch, > HDFS-14353.006.patch, HDFS-14353.007.patch, HDFS-14353.008.patch, > HDFS-14353.009.patch, screenshot-1.png > >
[jira] [Updated] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-14353: -- Attachment: HDFS-14353.009.patch > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > HDFS-14353.003.patch, HDFS-14353.004.patch, HDFS-14353.005.patch, > HDFS-14353.006.patch, HDFS-14353.007.patch, HDFS-14353.008.patch, > HDFS-14353.009.patch, screenshot-1.png > >
[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840906#comment-16840906 ] maobaolong commented on HDFS-14353: --- [~elgoiri] Thank you for the code you provided; I used it to replace my code. PTAL. > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > HDFS-14353.003.patch, HDFS-14353.004.patch, HDFS-14353.005.patch, > HDFS-14353.006.patch, HDFS-14353.007.patch, HDFS-14353.008.patch, > screenshot-1.png > >
[jira] [Updated] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-14353: -- Attachment: HDFS-14353.008.patch > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > HDFS-14353.003.patch, HDFS-14353.004.patch, HDFS-14353.005.patch, > HDFS-14353.006.patch, HDFS-14353.007.patch, HDFS-14353.008.patch, > screenshot-1.png > >
[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839371#comment-16839371 ] maobaolong commented on HDFS-14353: --- [~elgoiri] Thank you for your advice; it's really a good idea. I have made the change, thank you. > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > HDFS-14353.003.patch, HDFS-14353.004.patch, HDFS-14353.005.patch, > HDFS-14353.006.patch, HDFS-14353.007.patch, screenshot-1.png > >
[jira] [Updated] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-14353: -- Attachment: HDFS-14353.007.patch > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > HDFS-14353.003.patch, HDFS-14353.004.patch, HDFS-14353.005.patch, > HDFS-14353.006.patch, HDFS-14353.007.patch, screenshot-1.png > >
[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16833254#comment-16833254 ] maobaolong commented on HDFS-14353: --- [~elgoiri] Thank you for your advice; I've added a comment on the relevant line. > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > HDFS-14353.003.patch, HDFS-14353.004.patch, HDFS-14353.005.patch, > HDFS-14353.006.patch, screenshot-1.png > >
[jira] [Updated] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-14353: -- Attachment: HDFS-14353.006.patch > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > HDFS-14353.003.patch, HDFS-14353.004.patch, HDFS-14353.005.patch, > HDFS-14353.006.patch, screenshot-1.png > >
[jira] [Comment Edited] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829882#comment-16829882 ] maobaolong edited comment on HDFS-14353 at 4/30/19 2:19 AM: [~elgoiri] I guess he will not reply these days; I see that his last activity was 21/Dec/18 19:28, and his last activity on hadoop was 09/May/18 18:38. was (Author: maobaolong): [~elgoiri] I guess he will not reply these days; I see that his last activity was 21/Dec/18 19:28. > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > HDFS-14353.003.patch, HDFS-14353.004.patch, HDFS-14353.005.patch, > screenshot-1.png > >
[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829882#comment-16829882 ] maobaolong commented on HDFS-14353: --- [~elgoiri] I guess he will not reply these days; I see that his last activity was 21/Dec/18 19:28. > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > HDFS-14353.003.patch, HDFS-14353.004.patch, HDFS-14353.005.patch, > screenshot-1.png > >
[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829816#comment-16829816 ] maobaolong commented on HDFS-14353: --- [~elgoiri] This issue is caused by HDFS-12482. [~eddyxu] Please take a look. > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > HDFS-14353.003.patch, HDFS-14353.004.patch, HDFS-14353.005.patch, > screenshot-1.png > >
[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827786#comment-16827786 ] maobaolong commented on HDFS-14353: --- [~elgoiri] Thank you for pointing this out. But to tell you the truth, I was just following the logic in ErasureCodingWorker#processErasureCodingTasks. After thinking it over, I believe the original author (Lei Xu) wanted to adapt the method getDatanode().incrementXmitsInProcess(int); I think he did not want to see a zero there. > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > HDFS-14353.003.patch, HDFS-14353.004.patch, HDFS-14353.005.patch, > screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16825640#comment-16825640 ] maobaolong commented on HDFS-14353: --- [~elgoiri] Thank you for your advice. The following is the new diff code block:
{code:java}
float xmitWeight = getErasureCodingWorker().getXmitWeight();
// if the weighted xmits is smaller than 1, xmitsSubmitted should be set to 1
int xmitsSubmitted = Math.max((int) (getXmits() * xmitWeight), 1);
{code}
Is it OK? > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > HDFS-14353.003.patch, HDFS-14353.004.patch, HDFS-14353.005.patch, > screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-14353: -- Attachment: HDFS-14353.005.patch > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > HDFS-14353.003.patch, HDFS-14353.004.patch, HDFS-14353.005.patch, > screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-14353: -- Attachment: HDFS-14353.004.patch > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > HDFS-14353.003.patch, HDFS-14353.004.patch, screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16823615#comment-16823615 ] maobaolong commented on HDFS-14353: --- [~elgoiri] Of course, thank you for reminding me. I have uploaded a new patch; PTAL after the Jenkins report. > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > HDFS-14353.003.patch, screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-14353: -- Attachment: HDFS-14353.003.patch > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > HDFS-14353.003.patch, screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16822599#comment-16822599 ] maobaolong commented on HDFS-14353: --- [~elgoiri] [~knanasi] These failed tests relate to the NameNode, but all my changes relate to the DN, so I do not think these test failures are caused by my patch. > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16822293#comment-16822293 ] maobaolong commented on HDFS-14353: --- [~elgoiri] [~knanasi] Thank you for reviewing my patch. The old unit test testErasureCodingWorkerXmitsWeight could not verify that XmitsInProgress is zero after reconstruction, so I modified it: without my source change the modified test failed and XmitsInProgress was negative; with my source change the test passes. > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-14353: -- Attachment: HDFS-14353.002.patch > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch, > screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-14353: -- Attachment: HDFS-14353.001.patch > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001.patch, screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-14353: -- Attachment: (was: HDFS-14353.001) > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16816966#comment-16816966 ] maobaolong commented on HDFS-14353: --- [~elgoiri] Would you like to help me review this patch? I think it resolves a real bug. > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001, screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10648) Expose Balancer metrics through Metrics2
[ https://issues.apache.org/jira/browse/HDFS-10648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16816957#comment-16816957 ] maobaolong commented on HDFS-10648: --- [~mwagner] So we need to make the Balancer a long-running service process. > Expose Balancer metrics through Metrics2 > > > Key: HDFS-10648 > URL: https://issues.apache.org/jira/browse/HDFS-10648 > Project: Hadoop HDFS > Issue Type: New Feature > Components: balancer & mover, metrics >Reporter: Mark Wagner >Priority: Major > Labels: metrics > > The Balancer currently prints progress information to the console. For > deployments that run the balancer frequently, it would be helpful to collect > those metrics for publishing to the available sinks. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13804) DN maxDataLength is useless except DN webui. I suggest to get maxDataLength from NN heartbeat.
[ https://issues.apache.org/jira/browse/HDFS-13804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16816955#comment-16816955 ] maobaolong commented on HDFS-13804: --- [~jojochuang] If the DN could get the maxDataLength after it registers with the NN, i.e. the NN returned the maxDataLength to the DN, then the DN would never need to configure maxDataLength itself. Do you think this makes sense? > DN maxDataLength is useless except DN webui. I suggest to get maxDataLength > from NN heartbeat. > -- > > Key: HDFS-13804 > URL: https://issues.apache.org/jira/browse/HDFS-13804 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode >Reporter: maobaolong >Priority: Major > Labels: DataNode > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
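A minimal sketch of the idea in the comment above, using hypothetical type and method names (none of these are actual Hadoop classes): the registration response from the NN, rather than local configuration, becomes the source of truth for maxDataLength on the DN.
{code:java}
// Hypothetical types for illustration only; not actual Hadoop code.
class RegistrationResponse {
  final int maxDataLength; // value chosen by the NN

  RegistrationResponse(int maxDataLength) {
    this.maxDataLength = maxDataLength;
  }
}

class DataNodeSide {
  // Local fallback, e.g. the 64 MB default of ipc.maximum.data.length.
  private volatile int maxDataLength = 64 * 1024 * 1024;

  void onRegistered(RegistrationResponse resp) {
    // Adopt the NN's value; no per-DN configuration is needed any more.
    this.maxDataLength = resp.maxDataLength;
  }
}
{code}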
[jira] [Commented] (HDFS-14344) Erasure Coding: Miss EC block after decommission and restart NN
[ https://issues.apache.org/jira/browse/HDFS-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800432#comment-16800432 ] maobaolong commented on HDFS-14344: --- [~ayushtkn] Do you think HDFS-14353 can cause this issue? > Erasure Coding: Miss EC block after decommission and restart NN > --- > > Key: HDFS-14344 > URL: https://issues.apache.org/jira/browse/HDFS-14344 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ec, erasure-coding, namenode >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Critical > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794038#comment-16794038 ] maobaolong commented on HDFS-14353: --- [~ayushtkn] Please take a look at this issue; it may lead to missing blocks. > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001, screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790129#comment-16790129 ] maobaolong commented on HDFS-14353: --- [~linyiqun] Would you like to take a look at this patch? We ran into this issue after using Hadoop 3.x with the EC feature. Thank you in advance. > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001, screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-14353: -- Fix Version/s: 3.3.0 Status: Patch Available (was: Open) > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14353.001, screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-14353: -- Attachment: HDFS-14353.001 > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Attachments: HDFS-14353.001, screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong reassigned HDFS-14353: - Assignee: maobaolong > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Attachments: screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12482) Provide a configuration to adjust the weight of EC recovery tasks to adjust the speed of recovery
[ https://issues.apache.org/jira/browse/HDFS-12482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16789448#comment-16789448 ] maobaolong commented on HDFS-12482: --- [~eddyxu] Please take a look at HDFS-14353. I think StripedBlockReconstructor#run should be:
{code:java}
@Override
public void run() {
  try {
    // ...
  } catch (Throwable e) {
    // ...
  } finally {
    getDatanode().decrementXmitsInProgress(
        Math.max((int) (getXmits() * xmitWeight), 1));
    // ...
  }
}
{code}
> Provide a configuration to adjust the weight of EC recovery tasks to adjust > the speed of recovery > - > > Key: HDFS-12482 > URL: https://issues.apache.org/jira/browse/HDFS-12482 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding >Affects Versions: 3.0.0-alpha4 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu >Priority: Minor > Labels: hdfs-ec-3.0-nice-to-have > Fix For: 3.0.0 > > Attachments: HDFS-12482.00.patch, HDFS-12482.01.patch, > HDFS-12482.02.patch, HDFS-12482.03.patch, HDFS-12482.04.patch, > HDFS-12482.05.patch > > > The relative speed of EC recovery comparing to 3x replica recovery is a > function of (EC codec, number of sources, NIC speed, and CPU speed, and etc). > Currently the EC recovery has a fixed {{xmitsInProgress}} of {{max(# of > sources, # of targets)}} comparing to {{1}} for 3x replica recovery, and NN > uses {{xmitsInProgress}} to decide how much recovery tasks to schedule to the > DataNode this we can add a coefficient for user to tune the weight of EC > recovery tasks. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16789445#comment-16789445 ] maobaolong commented on HDFS-14353: --- [~eddyxu] Please take a look. I think StripedBlockReconstructor#run should be:
{code:java}
@Override
public void run() {
  try {
    // ...
  } catch (Throwable e) {
    // ...
  } finally {
    getDatanode().decrementXmitsInProgress(
        Math.max((int) (getXmits() * xmitWeight), 1));
    // ...
  }
}
{code}
> Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > Attachments: screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16789407#comment-16789407 ] maobaolong commented on HDFS-14353: --- I think the problem is caused by the following logic. Increase: xmitsSubmitted = Math.max((int) (task.getXmits() * xmitWeight), 1); getDatanode().incrementXmitsInProcess(xmitsSubmitted); Decrease, in StripedBlockReconstructor#run: getDatanode().decrementXmitsInProgress(getXmits()); In my case the increase was 1 and the decrease was 3. My operation: append 3 datanode hostnames to the exclude_datanode_hosts file and execute "hdfs dfsadmin -refreshNodes". PS: the 3 excluded nodes store the EC blocks of an RS(3,2) file. > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > Attachments: screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
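To make the arithmetic above concrete, here is a small self-contained demo (illustrative only, not Hadoop source; the 0.5 weight is an assumed example value) showing how a weighted, clamped increment paired with an unweighted decrement drives the counter negative:
{code:java}
public class XmitsAsymmetryDemo {
  public static void main(String[] args) {
    double xmitWeight = 0.5; // assumed example weight
    int xmits = 3;           // task xmits, e.g. for an RS(3,2) reconstruction
    int xmitsInProgress = 0;

    // Increment path (as in ErasureCodingWorker#processErasureCodingTasks):
    xmitsInProgress += Math.max((int) (xmits * xmitWeight), 1); // +1

    // Decrement path (as in StripedBlockReconstructor#run):
    xmitsInProgress -= xmits; // -3

    System.out.println(xmitsInProgress); // prints -2: the metric goes negative
  }
}
{code}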
[jira] [Comment Edited] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16789208#comment-16789208 ] maobaolong edited comment on HDFS-14353 at 3/11/19 6:47 AM: The code I suspect is around the classes ErasureCodingWorker and StripedBlockReconstructor.
{code:java}
public void processErasureCodingTasks(
    Collection<BlockECReconstructionInfo> ecTasks) {
  for (BlockECReconstructionInfo reconInfo : ecTasks) {
    int xmitsSubmitted = 0;
    try {
      StripedReconstructionInfo stripedReconInfo =
          new StripedReconstructionInfo(
              reconInfo.getExtendedBlock(), reconInfo.getErasureCodingPolicy(),
              reconInfo.getLiveBlockIndices(), reconInfo.getSourceDnInfos(),
              reconInfo.getTargetDnInfos(), reconInfo.getTargetStorageTypes(),
              reconInfo.getTargetStorageIDs());
      // It may throw IllegalArgumentException from task#stripedReader
      // constructor.
      final StripedBlockReconstructor task =
          new StripedBlockReconstructor(this, stripedReconInfo);
      if (task.hasValidTargets()) {
        // See HDFS-12044. We increase xmitsInProgress even the task is only
        // enqueued, so that
        // 1) NN will not send more tasks than what DN can execute and
        // 2) DN will not throw away reconstruction tasks, and instead keeps
        // an unbounded number of tasks in the executor's task queue.
        xmitsSubmitted = Math.max((int) (task.getXmits() * xmitWeight), 1);
        getDatanode().incrementXmitsInProcess(xmitsSubmitted);
        stripedReconstructionPool.submit(task);
      } else {
        LOG.warn("No missing internal block. Skip reconstruction for task:{}",
            reconInfo);
      }
    } catch (Throwable e) {
      getDatanode().decrementXmitsInProgress(xmitsSubmitted);
      LOG.warn("Failed to reconstruct striped block {}",
          reconInfo.getExtendedBlock().getLocalBlock(), e);
    }
  }
}
{code}
was (Author: maobaolong): The code I suspect is around the classes ErasureCodingWorker and StripedBlockReconstructor. > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > Attachments: screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16789208#comment-16789208 ] maobaolong commented on HDFS-14353: --- The code I suspect is around the classes ErasureCodingWorker and StripedBlockReconstructor. > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > Attachments: screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-14353: -- Attachment: screenshot-1.png > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > Attachments: screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
[ https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16789194#comment-16789194 ] maobaolong commented on HDFS-14353: --- !screenshot-1.png! > Erasure Coding: metrics xmitsInProgress become to negative. > --- > > Key: HDFS-14353 > URL: https://issues.apache.org/jira/browse/HDFS-14353 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, erasure-coding >Affects Versions: 3.3.0 >Reporter: maobaolong >Priority: Major > Attachments: screenshot-1.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-14353) Erasure Coding: metrics xmitsInProgress become to negative.
maobaolong created HDFS-14353: - Summary: Erasure Coding: metrics xmitsInProgress become to negative. Key: HDFS-14353 URL: https://issues.apache.org/jira/browse/HDFS-14353 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, erasure-coding Affects Versions: 3.3.0 Reporter: maobaolong -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-14344) Erasure Coding: Miss EC block after decommission and restart NN
maobaolong created HDFS-14344: - Summary: Erasure Coding: Miss EC block after decommission and restart NN Key: HDFS-14344 URL: https://issues.apache.org/jira/browse/HDFS-14344 Project: Hadoop HDFS Issue Type: Sub-task Components: ec, erasure-coding, namenode Affects Versions: 3.3.0 Reporter: maobaolong -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14247) Repeat adding node description into network topology
[ https://issues.apache.org/jira/browse/HDFS-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756957#comment-16756957 ] maobaolong commented on HDFS-14247: --- [~marvelrock] Thank you for this improvement; your effort makes the source code clearer. > Repeat adding node description into network topology > > > Key: HDFS-14247 > URL: https://issues.apache.org/jira/browse/HDFS-14247 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.3.0 >Reporter: HuangTao >Priority: Minor > Attachments: HDFS-14247.001.patch > > > I find there is a duplicate code to add nodeDescr to networktopology in the > DatanodeManager.java#registerDatanode. > It firstly call networktopology.add(nodeDescr), and then call > addDatanode(nodeDescr) to add nodeDescr again -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11409) DatanodeInfo getNetworkLocation and setNetworkLocation shoud use volatile instead of synchronized
[ https://issues.apache.org/jira/browse/HDFS-11409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16745815#comment-16745815 ] maobaolong commented on HDFS-11409: --- [~vagarychen] Hi Liang, thank you for your performance improvement. I think we can do something even better: if we remove the volatile, we get a further performance improvement. After studying and thinking deeply about the volatile keyword, its main effect is to keep visibility; in this scenario, we need performance more than visibility. > DatanodeInfo getNetworkLocation and setNetworkLocation shoud use volatile > instead of synchronized > - > > Key: HDFS-11409 > URL: https://issues.apache.org/jira/browse/HDFS-11409 > Project: Hadoop HDFS > Issue Type: Improvement > Components: performance >Reporter: Chen Liang >Assignee: Chen Liang >Priority: Minor > Fix For: 2.9.0, 3.0.0-alpha4, 2.8.4 > > Attachments: HDFS-11409.001.patch > > > {{DatanodeInfo}} has synchronized methods {{getNetworkLocation}} and > {{setNetworkLocation}}. While they doing nothing more than setting and > getting variable {{location}}. > Since {{location}} is not being modified based on its current value and is > independent from any other variables. This JIRA propose to remove > synchronized methods but only make {{location}} volatile. Such that threads > will not be blocked on get/setNetworkLocation. > Thanks [~szetszwo] for the offline disscussion. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
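A minimal sketch of the pattern the patch adopts, with illustrative names (not the actual DatanodeInfo code): a volatile field keeps writes visible to all reader threads without blocking them on a monitor, which is exactly the trade-off discussed above.
{code:java}
// Illustrative sketch, not the actual DatanodeInfo implementation.
class NodeInfo {
  // volatile: each write is immediately visible to readers, and neither
  // accessor acquires a lock, so threads never block on get/set.
  private volatile String location = "/default-rack";

  String getNetworkLocation() {
    return location;
  }

  void setNetworkLocation(String loc) {
    this.location = loc;
  }
}
{code}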
[jira] [Commented] (HDFS-14199) make output of "dfs -getfattr -R -d " differentiate folder, file and symbol link
[ https://issues.apache.org/jira/browse/HDFS-14199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16745688#comment-16745688 ] maobaolong commented on HDFS-14199: --- [~ZangLin] Thank you for this improvement. This patch is a better way to distinguish a directory from a file. LGTM. > make output of "dfs -getfattr -R -d " differentiate folder, file and symbol > link > - > > Key: HDFS-14199 > URL: https://issues.apache.org/jira/browse/HDFS-14199 > Project: Hadoop HDFS > Issue Type: Improvement > Components: shell >Affects Versions: 3.2.0, 3.3.0 >Reporter: Zang Lin >Priority: Minor > Attachments: HDFS-14199.001 > > > The current output of "hdfs dfs -getfattr -R -d" print all type of file > with "file:" , it doesn't differentiate the type such as folder, symbol link. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14182) Datanode usage histogram is clicked to show ip list
[ https://issues.apache.org/jira/browse/HDFS-14182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16739964#comment-16739964 ] maobaolong commented on HDFS-14182: --- [~linyiqun] Hi, do you mind reviewing this patch? It helps our SREs work with Hadoop more easily. Thank you in advance. > Datanode usage histogram is clicked to show ip list > --- > > Key: HDFS-14182 > URL: https://issues.apache.org/jira/browse/HDFS-14182 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: fengchuang >Assignee: fengchuang >Priority: Major > Attachments: HDFS-14182.001.patch, HDFS-14182.002.patch, showip.jpeg > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14182) Datanode usage histogram is clicked to show ip list
[ https://issues.apache.org/jira/browse/HDFS-14182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16732918#comment-16732918 ] maobaolong commented on HDFS-14182: --- LGTM +1 > Datanode usage histogram is clicked to show ip list > --- > > Key: HDFS-14182 > URL: https://issues.apache.org/jira/browse/HDFS-14182 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: fengchuang >Assignee: fengchuang >Priority: Major > Attachments: HDFS-14182.001.patch, HDFS-14182.002.patch, showip.jpeg > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14171) Performance improvement in Tailing EditLog
[ https://issues.apache.org/jira/browse/HDFS-14171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16729909#comment-16729909 ] maobaolong commented on HDFS-14171: --- LGTM. It makes it possible for our cluster to use the newest Hadoop package from trunk. [~jojochuang] Would you like to take a look? Thank you. > Performance improvement in Tailing EditLog > -- > > Key: HDFS-14171 > URL: https://issues.apache.org/jira/browse/HDFS-14171 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.9.0, 3.0.0-alpha1 >Reporter: Kenneth Yang >Priority: Minor > Attachments: HDFS-14171.000.patch > > > Stack: > {code:java} > Thread 456 (Edit log tailer): > State: RUNNABLE > Blocked count: 1139 > Waited count: 12 > Stack: > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getNumLiveDataNodes(DatanodeManager.java:1259) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode.areThresholdsMet(BlockManagerSafeMode.java:570) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode.checkSafeMode(BlockManagerSafeMode.java:213) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode.adjustBlockTotals(BlockManagerSafeMode.java:265) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.completeBlock(BlockManager.java:1087) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.forceCompleteBlock(BlockManager.java:1118) > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.updateBlocks(FSEditLogLoader.java:1126) > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:468) > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:258) > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:161) > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:892) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:321) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:460) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:410) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:427) > org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:414) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:423) > Thread 455 (pool-16-thread-1): > {code} > code: > {code:java} > private boolean areThresholdsMet() { > assert namesystem.hasWriteLock(); > int datanodeNum = blockManager.getDatanodeManager().getNumLiveDataNodes(); > synchronized (this) { > return blockSafe >= blockThreshold && datanodeNum >= datanodeThreshold; > } > } > {code} > According to the code, each time the method areThresholdsMet() is called, the > value of {color:#ff}datanodeNum{color} needs to be calculated. > However, in the scenario of {color:#ff}datanodeThreshold{color} is equal > to 0(0 is the default value of the configuration), This expression > datanodeNum >= datanodeThreshold always returns true. > Calling the method {color:#ff}getNumLiveDataNodes(){color} is time > consuming at a scale of 10,000 datanode clusters. Therefore, we add the > judgment condition, and only when the datanodeThreshold is greater than 0, > the datanodeNum is calculated, which improves the performance greatly. > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
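Based on the description above, the fix amounts to guarding the expensive live-datanode count behind the threshold check; a minimal sketch of that shape (illustrative, not the exact patch):
{code:java}
private boolean areThresholdsMet() {
  assert namesystem.hasWriteLock();
  int datanodeNum = 0;
  if (datanodeThreshold > 0) {
    // Only pay for the O(#datanodes) live-node count when the threshold
    // is actually in effect; with the default of 0 the comparison below
    // is trivially true.
    datanodeNum = blockManager.getDatanodeManager().getNumLiveDataNodes();
  }
  synchronized (this) {
    return blockSafe >= blockThreshold && datanodeNum >= datanodeThreshold;
  }
}
{code}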
[jira] [Commented] (HDFS-14171) Performance improvement in Tailing EditLog
[ https://issues.apache.org/jira/browse/HDFS-14171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16728387#comment-16728387 ] maobaolong commented on HDFS-14171: --- [~kennethlnnn] In my company we have 10k nodes; we upgraded a cluster from Hadoop-2.7.1 to Hadoop-3.2, and the performance degraded seriously. > Performance improvement in Tailing EditLog > -- > > Key: HDFS-14171 > URL: https://issues.apache.org/jira/browse/HDFS-14171 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.9.0, 3.0.0-alpha1 >Reporter: Kenneth Yang >Priority: Minor > Attachments: HDFS-14171.000.patch > > > Stack: > {code:java} > Thread 456 (Edit log tailer): > State: RUNNABLE > Blocked count: 1139 > Waited count: 12 > Stack: > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getNumLiveDataNodes(DatanodeManager.java:1259) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode.areThresholdsMet(BlockManagerSafeMode.java:570) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode.checkSafeMode(BlockManagerSafeMode.java:213) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode.adjustBlockTotals(BlockManagerSafeMode.java:265) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.completeBlock(BlockManager.java:1087) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.forceCompleteBlock(BlockManager.java:1118) > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.updateBlocks(FSEditLogLoader.java:1126) > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:468) > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:258) > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:161) > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:892) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:321) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:460) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:410) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:427) > org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:414) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:423) > Thread 455 (pool-16-thread-1): > {code} > code: > {code:java} > private boolean areThresholdsMet() { > assert namesystem.hasWriteLock(); > int datanodeNum = blockManager.getDatanodeManager().getNumLiveDataNodes(); > synchronized (this) { > return blockSafe >= blockThreshold && datanodeNum >= datanodeThreshold; > } > } > {code} > According to the code, each time the method areThresholdsMet() is called, the > value of {color:#ff}datanodeNum{color} needs to be calculated. > However, in the scenario of {color:#ff}datanodeThreshold{color} is equal > to 0(0 is the default value of the configuration), This expression > datanodeNum >= datanodeThreshold always returns true. > Calling the method {color:#ff}getNumLiveDataNodes(){color} is time > consuming at a scale of 10,000 datanode clusters. Therefore, we add the > judgment condition, and only when the datanodeThreshold is greater than 0, > the datanodeNum is calculated, which improves the performance greatly. > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14171) Performance improvement in Tailing EditLog
[ https://issues.apache.org/jira/browse/HDFS-14171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16728387#comment-16728387 ] maobaolong edited comment on HDFS-14171 at 12/24/18 12:47 PM: -- [~kennethlnnn] In my company we have 10k nodes; we upgraded a cluster from Hadoop-2.7.1 to Hadoop-3.2, and the performance degraded seriously. I think this is the reason: in our cluster we do configure the datanodeThreshold key to 0, because we consider enough blocks sufficient to exit safe mode safely. Thank you for your improvement; it is the best way to resolve our performance problem. was (Author: maobaolong): [~kennethlnnn] In my company we have 10k nodes; we upgraded a cluster from Hadoop-2.7.1 to Hadoop-3.2, and the performance degraded seriously. I think this is the reason: in our cluster we do configure the datanodeThreshold key to 0, because we consider enough blocks sufficient to exit safe mode safely. > Performance improvement in Tailing EditLog > -- > > Key: HDFS-14171 > URL: https://issues.apache.org/jira/browse/HDFS-14171 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.9.0, 3.0.0-alpha1 >Reporter: Kenneth Yang >Priority: Minor > Attachments: HDFS-14171.000.patch > > > Stack: > {code:java} > Thread 456 (Edit log tailer): > State: RUNNABLE > Blocked count: 1139 > Waited count: 12 > Stack: > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getNumLiveDataNodes(DatanodeManager.java:1259) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode.areThresholdsMet(BlockManagerSafeMode.java:570) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode.checkSafeMode(BlockManagerSafeMode.java:213) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode.adjustBlockTotals(BlockManagerSafeMode.java:265) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.completeBlock(BlockManager.java:1087) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.forceCompleteBlock(BlockManager.java:1118) > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.updateBlocks(FSEditLogLoader.java:1126) > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:468) > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:258) > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:161) > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:892) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:321) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:460) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:410) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:427) > org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:414) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:423) > Thread 455 (pool-16-thread-1): > {code} > code: > {code:java} > private boolean areThresholdsMet() { > assert namesystem.hasWriteLock(); > int datanodeNum = blockManager.getDatanodeManager().getNumLiveDataNodes(); > synchronized (this) { > return blockSafe >= blockThreshold && datanodeNum >= datanodeThreshold; > } > } > {code} > According to the code, each time the method areThresholdsMet() is called, the > value of {color:#ff}datanodeNum{color} needs to be calculated. > However, in the scenario of {color:#ff}datanodeThreshold{color} is equal > to 0(0 is the default value of the configuration), This expression > datanodeNum >= datanodeThreshold always returns true. > Calling the method {color:#ff}getNumLiveDataNodes(){color} is time > consuming at a scale of 10,000 datanode clusters. Therefore, we add the > judgment condition, and only when the datanodeThreshold is greater than 0, > the datanodeNum is calculated, which improves the performance greatly. > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14171) Performance improvement in Tailing EditLog
[ https://issues.apache.org/jira/browse/HDFS-14171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16728387#comment-16728387 ] maobaolong edited comment on HDFS-14171 at 12/24/18 12:46 PM: -- [~kennethlnnn] In my company we have 10k nodes; we upgraded a cluster from Hadoop-2.7.1 to Hadoop-3.2, and the performance degraded seriously. I think this is the reason: in our cluster we do configure the datanodeThreshold key to 0, because we consider enough blocks sufficient to exit safe mode safely. was (Author: maobaolong): [~kennethlnnn] In my company we have 10k nodes; we upgraded a cluster from Hadoop-2.7.1 to Hadoop-3.2, and the performance degraded seriously. I think this is the reason: in our cluster we do configure the key to 0, because we consider enough blocks sufficient to exit safe mode safely. > Performance improvement in Tailing EditLog > -- > > Key: HDFS-14171 > URL: https://issues.apache.org/jira/browse/HDFS-14171 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.9.0, 3.0.0-alpha1 >Reporter: Kenneth Yang >Priority: Minor > Attachments: HDFS-14171.000.patch > > > Stack: > {code:java} > Thread 456 (Edit log tailer): > State: RUNNABLE > Blocked count: 1139 > Waited count: 12 > Stack: > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getNumLiveDataNodes(DatanodeManager.java:1259) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode.areThresholdsMet(BlockManagerSafeMode.java:570) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode.checkSafeMode(BlockManagerSafeMode.java:213) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode.adjustBlockTotals(BlockManagerSafeMode.java:265) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.completeBlock(BlockManager.java:1087) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.forceCompleteBlock(BlockManager.java:1118) > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.updateBlocks(FSEditLogLoader.java:1126) > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:468) > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:258) > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:161) > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:892) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:321) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:460) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:410) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:427) > org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:414) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:423) > Thread 455 (pool-16-thread-1): > {code} > code: > {code:java} > private boolean areThresholdsMet() { > assert namesystem.hasWriteLock(); > int datanodeNum = blockManager.getDatanodeManager().getNumLiveDataNodes(); > synchronized (this) { > return blockSafe >= blockThreshold && datanodeNum >= datanodeThreshold; > } > } > {code} > According to the code, each time the method areThresholdsMet() is called, the > value of {color:#ff}datanodeNum{color} needs to be calculated. > However, in the scenario of {color:#ff}datanodeThreshold{color} is equal > to 0(0 is the default value of the configuration), This expression > datanodeNum >= datanodeThreshold always returns true. > Calling the method {color:#ff}getNumLiveDataNodes(){color} is time > consuming at a scale of 10,000 datanode clusters. Therefore, we add the > judgment condition, and only when the datanodeThreshold is greater than 0, > the datanodeNum is calculated, which improves the performance greatly. > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14171) Performance improvement in Tailing EditLog
[ https://issues.apache.org/jira/browse/HDFS-14171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16728387#comment-16728387 ] maobaolong edited comment on HDFS-14171 at 12/24/18 12:45 PM: -- [~kennethlnnn] In my company we have 10k nodes; we upgraded a cluster from Hadoop-2.7.1 to Hadoop-3.2, and the performance degraded seriously. I think this is the reason: in our cluster we do configure the key to 0, because we consider enough blocks sufficient to exit safe mode safely. was (Author: maobaolong): [~kennethlnnn] In my company we have 10k nodes; we upgraded a cluster from Hadoop-2.7.1 to Hadoop-3.2, and the performance degraded seriously. > Performance improvement in Tailing EditLog > -- > > Key: HDFS-14171 > URL: https://issues.apache.org/jira/browse/HDFS-14171 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.9.0, 3.0.0-alpha1 >Reporter: Kenneth Yang >Priority: Minor > Attachments: HDFS-14171.000.patch > > > Stack: > {code:java} > Thread 456 (Edit log tailer): > State: RUNNABLE > Blocked count: 1139 > Waited count: 12 > Stack: > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getNumLiveDataNodes(DatanodeManager.java:1259) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode.areThresholdsMet(BlockManagerSafeMode.java:570) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode.checkSafeMode(BlockManagerSafeMode.java:213) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode.adjustBlockTotals(BlockManagerSafeMode.java:265) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.completeBlock(BlockManager.java:1087) > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.forceCompleteBlock(BlockManager.java:1118) > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.updateBlocks(FSEditLogLoader.java:1126) > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:468) > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:258) > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:161) > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:892) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:321) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:460) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:410) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:427) > org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:414) > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:423) > Thread 455 (pool-16-thread-1): > {code} > code: > {code:java} > private boolean areThresholdsMet() { > assert namesystem.hasWriteLock(); > int datanodeNum = blockManager.getDatanodeManager().getNumLiveDataNodes(); > synchronized (this) { > return blockSafe >= blockThreshold && datanodeNum >= datanodeThreshold; > } > } > {code} > According to the code, each time the method areThresholdsMet() is called, the > value of {color:#ff}datanodeNum{color} needs to be calculated. > However, in the scenario of {color:#ff}datanodeThreshold{color} is equal > to 0(0 is the default value of the configuration), This expression > datanodeNum >= datanodeThreshold always returns true. > Calling the method {color:#ff}getNumLiveDataNodes(){color} is time > consuming at a scale of 10,000 datanode clusters. Therefore, we add the > judgment condition, and only when the datanodeThreshold is greater than 0, > the datanodeNum is calculated, which improves the performance greatly. > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13902) Add jmx conf and stacks menus to the datanode page
[ https://issues.apache.org/jira/browse/HDFS-13902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606667#comment-16606667 ] maobaolong commented on HDFS-13902: --- [~fengchuang] Looks good; we can easily open the stacks, jmx, and conf page links. Thank you. > Add jmx conf and stacks menus to the datanode page > --- > > Key: HDFS-13902 > URL: https://issues.apache.org/jira/browse/HDFS-13902 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.0.3 >Reporter: fengchuang >Priority: Minor > Attachments: HDFS-13902.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-7663) Erasure Coding: Append on striped file
[ https://issues.apache.org/jira/browse/HDFS-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16604149#comment-16604149 ] maobaolong commented on HDFS-7663: -- [~walter.k.su] Is there any update? It is an important feature. > Erasure Coding: Append on striped file > -- > > Key: HDFS-7663 > URL: https://issues.apache.org/jira/browse/HDFS-7663 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 3.0.0-alpha1 >Reporter: Jing Zhao >Assignee: Walter Su >Priority: Major > Attachments: HDFS-7663.00.txt, HDFS-7663.01.patch > > > Append should be easy if we have variable length block support from > HDFS-3689, i.e., the new data will be appended to a new block. We need to > revisit whether and how to support appending data to the original last block. > 1. Append to a closed striped file, with NEW_BLOCK flag enabled (this) > 2. Append to an under-construction striped file, with NEW_BLOCK flag enabled > (HDFS-9173) > 3. Append to a striped file, by appending to last block group (follow-on) > This jira attempts to implement the #1, and also track #2, #3. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13881) Export or Import a dirImage
[ https://issues.apache.org/jira/browse/HDFS-13881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-13881: -- Description: Now, we want to copy a directory metadata tree from one cluster's namenode to another cluster's namenode. So, I suggest that the namenode support exporting a directory to an image file, which another namenode can then import. > Export or Import a dirImage > --- > > Key: HDFS-13881 > URL: https://issues.apache.org/jira/browse/HDFS-13881 > Project: Hadoop HDFS > Issue Type: New Feature > Components: namenode >Affects Versions: 3.1.1 >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > > Now, we want to copy a directory metadata tree from one cluster's namenode to > another cluster's namenode. > So, I suggest that the namenode support exporting a directory to an image file, > which another namenode can then import. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-13881) Export or Import a dirImage
maobaolong created HDFS-13881: - Summary: Export or Import a dirImage Key: HDFS-13881 URL: https://issues.apache.org/jira/browse/HDFS-13881 Project: Hadoop HDFS Issue Type: New Feature Components: namenode Affects Versions: 3.1.1 Reporter: maobaolong Assignee: maobaolong -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-13804) DN maxDataLength is useless except DN webui. I suggest to get maxDataLength from NN heartbeat.
maobaolong created HDFS-13804: - Summary: DN maxDataLength is useless except DN webui. I suggest to get maxDataLength from NN heartbeat. Key: HDFS-13804 URL: https://issues.apache.org/jira/browse/HDFS-13804 Project: Hadoop HDFS Issue Type: New Feature Components: datanode Reporter: maobaolong -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13269) After too many open file exception occurred, the standby NN never do checkpoint
[ https://issues.apache.org/jira/browse/HDFS-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16567653#comment-16567653 ] maobaolong commented on HDFS-13269: --- It is indeed an item worth improving. > After too many open file exception occurred, the standby NN never do > checkpoint > --- > > Key: HDFS-13269 > URL: https://issues.apache.org/jira/browse/HDFS-13269 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.2.0 >Reporter: maobaolong >Priority: Major > > do saveNameSpace in dfsadmin. > The output as following: > > {code:java} > saveNamespace: No image directories available! > {code} > The Namenode log show: > > > {code:java} > [2018-01-13T10:32:19.903+08:00] [INFO] [Standby State Checkpointer] : > Triggering checkpoint because there have been 10159265 txns since the last > checkpoint, which exceeds the configured threshold 1000 > [2018-01-13T10:32:19.903+08:00] [INFO] [Standby State Checkpointer] : Save > namespace ... > ... > [2018-01-13T10:37:10.539+08:00] [WARN] [1985938863@qtp-61073295-1 - Acceptor0 > HttpServer2$SelectChannelConnectorWithSafeStartup@HOST_A:50070] : EXCEPTION > java.io.IOException: Too many open files > at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) > at > sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422) > at > sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250) > at > org.mortbay.jetty.nio.SelectChannelConnector$1.acceptChannel(SelectChannelConnector.java:75) > at > org.mortbay.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.java:686) > at > org.mortbay.io.nio.SelectorManager.doSelect(SelectorManager.java:192) > at > org.mortbay.jetty.nio.SelectChannelConnector.accept(SelectChannelConnector.java:124) > at > org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:708) > at > org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) > [2018-01-13T10:37:15.421+08:00] [ERROR] [FSImageSaver for /data0/nn of type > IMAGE_AND_EDITS] : Unable to save image for /data0/nn > java.io.FileNotFoundException: > /data0/nn/current/fsimage_40247283317.md5.tmp (Too many open files) > at java.io.FileOutputStream.open0(Native Method) > at java.io.FileOutputStream.open(FileOutputStream.java:270) > at java.io.FileOutputStream.(FileOutputStream.java:213) > at java.io.FileOutputStream.(FileOutputStream.java:162) > at > org.apache.hadoop.hdfs.util.AtomicFileOutputStream.(AtomicFileOutputStream.java:58) > at > org.apache.hadoop.hdfs.util.MD5FileUtils.saveMD5File(MD5FileUtils.java:157) > at > org.apache.hadoop.hdfs.util.MD5FileUtils.saveMD5File(MD5FileUtils.java:149) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImage(FSImage.java:990) > at > org.apache.hadoop.hdfs.server.namenode.FSImage$FSImageSaver.run(FSImage.java:1039) > at java.lang.Thread.run(Thread.java:745) > [2018-01-13T10:37:15.421+08:00] [ERROR] [Standby State Checkpointer] : Error > reported on storage directory Storage Directory /data0/nn > [2018-01-13T10:37:15.421+08:00] [WARN] [Standby State Checkpointer] : About > to remove corresponding storage: /data0/nn > [2018-01-13T10:37:15.429+08:00] [ERROR] [Standby State Checkpointer] : > Exception in doCheckpoint > java.io.IOException: Failed to save in any storage directories while saving > namespace. 
> at > org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImageInAllDirs(FSImage.java:1176) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:1107) > at > org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.doCheckpoint(StandbyCheckpointer.java:185) > at > org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.access$1400(StandbyCheckpointer.java:62) > at > org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$CheckpointerThread.doWork(StandbyCheckpointer.java:353) > at > org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$CheckpointerThread.access$700(StandbyCheckpointer.java:260) > at > org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$CheckpointerThread$1.run(StandbyCheckpointer.java:280) > at > org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415) > at > org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$CheckpointerThread.run(StandbyCheckpointer.java:276) > ... > [2018-01-13T15:52:33.783+08:00] [INFO] [Standby State Checkpointer] : Save > namespace ... > [2018-01-13T15:52:33.783+08:00] [ERROR] [Standby Stat
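As a related note, once a storage directory has been removed after such a failure, the NameNode only re-attempts it at checkpoint time when directory restore is enabled. A hedged example of the relevant hdfs-site.xml option (the standard dfs.namenode.name.dir.restore setting, disabled by default):

{code:xml}
<!-- Let the NameNode retry a previously failed dfs.namenode.name.dir
     during the next checkpoint, instead of leaving the directory
     removed until the process is restarted. -->
<property>
  <name>dfs.namenode.name.dir.restore</name>
  <value>true</value>
</property>
{code}

This does not fix the underlying descriptor leak, but it can keep a transient "Too many open files" episode from permanently disabling checkpoints.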
[jira] [Created] (HDFS-13783) Balancer: make balancer to be a long service process for easy to monitor it.
maobaolong created HDFS-13783: - Summary: Balancer: make balancer to be a long service process for easy to monitor it. Key: HDFS-13783 URL: https://issues.apache.org/jira/browse/HDFS-13783 Project: Hadoop HDFS Issue Type: New Feature Components: balancer & mover Affects Versions: 3.0.3 Reporter: maobaolong If the balancer ran as a long-lived service process, like the namenode and datanode, we could collect balancer metrics: they would tell us the balancer's status and the number of blocks it has moved, and we could get or set the balancing plan through the balancer web UI. Many things become possible with a long-running balancer service. So, shall we start to plan the new Balancer? I hope this feature can make it into the next release of Hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
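To make the metrics part of the proposal concrete, here is a rough sketch of how a long-running balancer could publish counters through Hadoop's metrics2 framework; the class and metric names below are invented for illustration and are not part of any existing patch:

{code:java}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.apache.hadoop.metrics2.lib.MutableCounterLong;

// Hypothetical metrics source for a balancer service process. Once
// registered with the metrics system, these counters are visible over
// JMX like any other Hadoop daemon's metrics.
@Metrics(about = "Balancer service metrics", context = "dfs")
public class BalancerMetrics {
  @Metric("Bytes moved since the service started")
  MutableCounterLong bytesMoved;

  @Metric("Blocks moved since the service started")
  MutableCounterLong blocksMoved;

  public static BalancerMetrics create() {
    // Register this source so the annotated fields are instantiated
    // and exported by the default metrics system.
    return DefaultMetricsSystem.instance().register(
        "Balancer", "Metrics for a long-running balancer service",
        new BalancerMetrics());
  }
}
{code}

A monitoring system could then scrape these counters the same way it already scrapes NameNode and DataNode metrics.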
[jira] [Commented] (HDFS-13480) RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key
[ https://issues.apache.org/jira/browse/HDFS-13480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16480440#comment-16480440 ] maobaolong commented on HDFS-13480: --- [~linyiqun] [~elgoiri] Thank you for the reviews over the past few days; I have removed the unused imports I had introduced. PTAL. > RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key > --- > > Key: HDFS-13480 > URL: https://issues.apache.org/jira/browse/HDFS-13480 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Attachments: HDFS-13480.001.patch, HDFS-13480.002.patch, > HDFS-13480.002.patch, HDFS-13480.003.patch, HDFS-13480.004.patch > > > Now, if i enable the heartbeat.enable, but i do not want to monitor any > namenode, i get an ERROR log like: > {code:java} > [2018-04-19T14:00:03.057+08:00] [ERROR] > federation.router.Router.serviceInit(Router.java 214) [main] : Heartbeat is > enabled but there are no namenodes to monitor > {code} > and if i disable the heartbeat.enable, we cannot get any mounttable update, > because the following logic in Router.java: > {code:java} > if (conf.getBoolean( > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE, > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE_DEFAULT)) { > // Create status updater for each monitored Namenode > this.namenodeHeartbeatServices = createNamenodeHeartbeatServices(); > for (NamenodeHeartbeatService hearbeatService : > this.namenodeHeartbeatServices) { > addService(hearbeatService); > } > if (this.namenodeHeartbeatServices.isEmpty()) { > LOG.error("Heartbeat is enabled but there are no namenodes to > monitor"); > } > // Periodically update the router state > this.routerHeartbeatService = new RouterHeartbeatService(this); > addService(this.routerHeartbeatService); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
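For readers following the patches, here is a sketch of how the quoted serviceInit logic can be decoupled so that disabling namenode monitoring no longer stops router state updates; the namenode-heartbeat key name below is an assumption for illustration, since the actual key is defined by the patch:

{code:java}
// Guard the namenode monitors with their own (assumed) key...
if (conf.getBoolean(
    RBFConfigKeys.DFS_ROUTER_NAMENODE_HEARTBEAT_ENABLE, // assumed key name
    RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE_DEFAULT)) {
  // Create status updater for each monitored Namenode
  this.namenodeHeartbeatServices = createNamenodeHeartbeatServices();
  for (NamenodeHeartbeatService heartbeatService :
      this.namenodeHeartbeatServices) {
    addService(heartbeatService);
  }
  if (this.namenodeHeartbeatServices.isEmpty()) {
    LOG.error("Namenode heartbeat is enabled but there are no namenodes"
        + " to monitor");
  }
}
// ...while the router state updater keeps the original switch, so the
// mount table and router state still refresh without namenode monitors.
if (conf.getBoolean(
    RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE,
    RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE_DEFAULT)) {
  // Periodically update the router state
  this.routerHeartbeatService = new RouterHeartbeatService(this);
  addService(this.routerHeartbeatService);
}
{code}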
[jira] [Updated] (HDFS-13480) RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key
[ https://issues.apache.org/jira/browse/HDFS-13480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-13480: -- Attachment: HDFS-13480.004.patch > RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key > --- > > Key: HDFS-13480 > URL: https://issues.apache.org/jira/browse/HDFS-13480 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Attachments: HDFS-13480.001.patch, HDFS-13480.002.patch, > HDFS-13480.002.patch, HDFS-13480.003.patch, HDFS-13480.004.patch > > > Now, if i enable the heartbeat.enable, but i do not want to monitor any > namenode, i get an ERROR log like: > {code:java} > [2018-04-19T14:00:03.057+08:00] [ERROR] > federation.router.Router.serviceInit(Router.java 214) [main] : Heartbeat is > enabled but there are no namenodes to monitor > {code} > and if i disable the heartbeat.enable, we cannot get any mounttable update, > because the following logic in Router.java: > {code:java} > if (conf.getBoolean( > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE, > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE_DEFAULT)) { > // Create status updater for each monitored Namenode > this.namenodeHeartbeatServices = createNamenodeHeartbeatServices(); > for (NamenodeHeartbeatService hearbeatService : > this.namenodeHeartbeatServices) { > addService(hearbeatService); > } > if (this.namenodeHeartbeatServices.isEmpty()) { > LOG.error("Heartbeat is enabled but there are no namenodes to > monitor"); > } > // Periodically update the router state > this.routerHeartbeatService = new RouterHeartbeatService(this); > addService(this.routerHeartbeatService); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13480) RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key
[ https://issues.apache.org/jira/browse/HDFS-13480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16478607#comment-16478607 ] maobaolong commented on HDFS-13480: --- [~linyiqun] I've fixed the code style, added the javadoc for assertRouterHeartbeater, and improved the documentation. PTAL. Thank you for your comments. > RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key > --- > > Key: HDFS-13480 > URL: https://issues.apache.org/jira/browse/HDFS-13480 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Attachments: HDFS-13480.001.patch, HDFS-13480.002.patch, > HDFS-13480.002.patch, HDFS-13480.003.patch > > > Now, if i enable the heartbeat.enable, but i do not want to monitor any > namenode, i get an ERROR log like: > {code:java} > [2018-04-19T14:00:03.057+08:00] [ERROR] > federation.router.Router.serviceInit(Router.java 214) [main] : Heartbeat is > enabled but there are no namenodes to monitor > {code} > and if i disable the heartbeat.enable, we cannot get any mounttable update, > because the following logic in Router.java: > {code:java} > if (conf.getBoolean( > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE, > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE_DEFAULT)) { > // Create status updater for each monitored Namenode > this.namenodeHeartbeatServices = createNamenodeHeartbeatServices(); > for (NamenodeHeartbeatService hearbeatService : > this.namenodeHeartbeatServices) { > addService(hearbeatService); > } > if (this.namenodeHeartbeatServices.isEmpty()) { > LOG.error("Heartbeat is enabled but there are no namenodes to > monitor"); > } > // Periodically update the router state > this.routerHeartbeatService = new RouterHeartbeatService(this); > addService(this.routerHeartbeatService); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13480) RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key
[ https://issues.apache.org/jira/browse/HDFS-13480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-13480: -- Attachment: HDFS-13480.003.patch > RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key > --- > > Key: HDFS-13480 > URL: https://issues.apache.org/jira/browse/HDFS-13480 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Attachments: HDFS-13480.001.patch, HDFS-13480.002.patch, > HDFS-13480.002.patch, HDFS-13480.003.patch > > > Now, if i enable the heartbeat.enable, but i do not want to monitor any > namenode, i get an ERROR log like: > {code:java} > [2018-04-19T14:00:03.057+08:00] [ERROR] > federation.router.Router.serviceInit(Router.java 214) [main] : Heartbeat is > enabled but there are no namenodes to monitor > {code} > and if i disable the heartbeat.enable, we cannot get any mounttable update, > because the following logic in Router.java: > {code:java} > if (conf.getBoolean( > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE, > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE_DEFAULT)) { > // Create status updater for each monitored Namenode > this.namenodeHeartbeatServices = createNamenodeHeartbeatServices(); > for (NamenodeHeartbeatService hearbeatService : > this.namenodeHeartbeatServices) { > addService(hearbeatService); > } > if (this.namenodeHeartbeatServices.isEmpty()) { > LOG.error("Heartbeat is enabled but there are no namenodes to > monitor"); > } > // Periodically update the router state > this.routerHeartbeatService = new RouterHeartbeatService(this); > addService(this.routerHeartbeatService); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13480) RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key
[ https://issues.apache.org/jira/browse/HDFS-13480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16477176#comment-16477176 ] maobaolong commented on HDFS-13480: --- [~elgoiri] Thank you for your great review; it makes my test much cleaner. I've addressed your comments, PTAL. > RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key > --- > > Key: HDFS-13480 > URL: https://issues.apache.org/jira/browse/HDFS-13480 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Attachments: HDFS-13480.001.patch, HDFS-13480.002.patch, > HDFS-13480.002.patch > > > Now, if i enable the heartbeat.enable, but i do not want to monitor any > namenode, i get an ERROR log like: > {code:java} > [2018-04-19T14:00:03.057+08:00] [ERROR] > federation.router.Router.serviceInit(Router.java 214) [main] : Heartbeat is > enabled but there are no namenodes to monitor > {code} > and if i disable the heartbeat.enable, we cannot get any mounttable update, > because the following logic in Router.java: > {code:java} > if (conf.getBoolean( > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE, > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE_DEFAULT)) { > // Create status updater for each monitored Namenode > this.namenodeHeartbeatServices = createNamenodeHeartbeatServices(); > for (NamenodeHeartbeatService hearbeatService : > this.namenodeHeartbeatServices) { > addService(hearbeatService); > } > if (this.namenodeHeartbeatServices.isEmpty()) { > LOG.error("Heartbeat is enabled but there are no namenodes to > monitor"); > } > // Periodically update the router state > this.routerHeartbeatService = new RouterHeartbeatService(this); > addService(this.routerHeartbeatService); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13480) RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key
[ https://issues.apache.org/jira/browse/HDFS-13480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-13480: -- Attachment: HDFS-13480.002.patch > RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key > --- > > Key: HDFS-13480 > URL: https://issues.apache.org/jira/browse/HDFS-13480 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Attachments: HDFS-13480.001.patch, HDFS-13480.002.patch, > HDFS-13480.002.patch > > > Now, if i enable the heartbeat.enable, but i do not want to monitor any > namenode, i get an ERROR log like: > {code:java} > [2018-04-19T14:00:03.057+08:00] [ERROR] > federation.router.Router.serviceInit(Router.java 214) [main] : Heartbeat is > enabled but there are no namenodes to monitor > {code} > and if i disable the heartbeat.enable, we cannot get any mounttable update, > because the following logic in Router.java: > {code:java} > if (conf.getBoolean( > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE, > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE_DEFAULT)) { > // Create status updater for each monitored Namenode > this.namenodeHeartbeatServices = createNamenodeHeartbeatServices(); > for (NamenodeHeartbeatService hearbeatService : > this.namenodeHeartbeatServices) { > addService(hearbeatService); > } > if (this.namenodeHeartbeatServices.isEmpty()) { > LOG.error("Heartbeat is enabled but there are no namenodes to > monitor"); > } > // Periodically update the router state > this.routerHeartbeatService = new RouterHeartbeatService(this); > addService(this.routerHeartbeatService); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13480) RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key
[ https://issues.apache.org/jira/browse/HDFS-13480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474030#comment-16474030 ] maobaolong commented on HDFS-13480: --- [~linyiqun] [~elgoiri] Thank you for your review. * It makes sense to keep the existing config name. * I've updated hdfs-rbf-default.xml. * Added a simple test covering the two switches. * Updated the documentation (I am not good at this; if I missed anywhere that should change, please feel free to tell me). I've attached a new patch. PTAL. > RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key > --- > > Key: HDFS-13480 > URL: https://issues.apache.org/jira/browse/HDFS-13480 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Attachments: HDFS-13480.001.patch, HDFS-13480.002.patch > > > Now, if i enable the heartbeat.enable, but i do not want to monitor any > namenode, i get an ERROR log like: > {code:java} > [2018-04-19T14:00:03.057+08:00] [ERROR] > federation.router.Router.serviceInit(Router.java 214) [main] : Heartbeat is > enabled but there are no namenodes to monitor > {code} > and if i disable the heartbeat.enable, we cannot get any mounttable update, > because the following logic in Router.java: > {code:java} > if (conf.getBoolean( > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE, > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE_DEFAULT)) { > // Create status updater for each monitored Namenode > this.namenodeHeartbeatServices = createNamenodeHeartbeatServices(); > for (NamenodeHeartbeatService hearbeatService : > this.namenodeHeartbeatServices) { > addService(hearbeatService); > } > if (this.namenodeHeartbeatServices.isEmpty()) { > LOG.error("Heartbeat is enabled but there are no namenodes to > monitor"); > } > // Periodically update the router state > this.routerHeartbeatService = new RouterHeartbeatService(this); > addService(this.routerHeartbeatService); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13480) RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key
[ https://issues.apache.org/jira/browse/HDFS-13480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-13480: -- Attachment: HDFS-13480.002.patch > RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key > --- > > Key: HDFS-13480 > URL: https://issues.apache.org/jira/browse/HDFS-13480 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Attachments: HDFS-13480.001.patch, HDFS-13480.002.patch > > > Now, if i enable the heartbeat.enable, but i do not want to monitor any > namenode, i get an ERROR log like: > {code:java} > [2018-04-19T14:00:03.057+08:00] [ERROR] > federation.router.Router.serviceInit(Router.java 214) [main] : Heartbeat is > enabled but there are no namenodes to monitor > {code} > and if i disable the heartbeat.enable, we cannot get any mounttable update, > because the following logic in Router.java: > {code:java} > if (conf.getBoolean( > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE, > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE_DEFAULT)) { > // Create status updater for each monitored Namenode > this.namenodeHeartbeatServices = createNamenodeHeartbeatServices(); > for (NamenodeHeartbeatService hearbeatService : > this.namenodeHeartbeatServices) { > addService(hearbeatService); > } > if (this.namenodeHeartbeatServices.isEmpty()) { > LOG.error("Heartbeat is enabled but there are no namenodes to > monitor"); > } > // Periodically update the router state > this.routerHeartbeatService = new RouterHeartbeatService(this); > addService(this.routerHeartbeatService); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13480) RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key
[ https://issues.apache.org/jira/browse/HDFS-13480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16471511#comment-16471511 ] maobaolong commented on HDFS-13480: --- [~elgoiri] [~linyiqun] Please take a look. Thank you. > RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key > --- > > Key: HDFS-13480 > URL: https://issues.apache.org/jira/browse/HDFS-13480 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Attachments: HDFS-13480.001.patch > > > Now, if i enable the heartbeat.enable, but i do not want to monitor any > namenode, i get an ERROR log like: > {code:java} > [2018-04-19T14:00:03.057+08:00] [ERROR] > federation.router.Router.serviceInit(Router.java 214) [main] : Heartbeat is > enabled but there are no namenodes to monitor > {code} > and if i disable the heartbeat.enable, we cannot get any mounttable update, > because the following logic in Router.java: > {code:java} > if (conf.getBoolean( > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE, > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE_DEFAULT)) { > // Create status updater for each monitored Namenode > this.namenodeHeartbeatServices = createNamenodeHeartbeatServices(); > for (NamenodeHeartbeatService hearbeatService : > this.namenodeHeartbeatServices) { > addService(hearbeatService); > } > if (this.namenodeHeartbeatServices.isEmpty()) { > LOG.error("Heartbeat is enabled but there are no namenodes to > monitor"); > } > // Periodically update the router state > this.routerHeartbeatService = new RouterHeartbeatService(this); > addService(this.routerHeartbeatService); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13480) RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key
[ https://issues.apache.org/jira/browse/HDFS-13480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-13480: -- Status: Patch Available (was: Open) > RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key > --- > > Key: HDFS-13480 > URL: https://issues.apache.org/jira/browse/HDFS-13480 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Attachments: HDFS-13480.001.patch > > > Now, if i enable the heartbeat.enable, but i do not want to monitor any > namenode, i get an ERROR log like: > {code:java} > [2018-04-19T14:00:03.057+08:00] [ERROR] > federation.router.Router.serviceInit(Router.java 214) [main] : Heartbeat is > enabled but there are no namenodes to monitor > {code} > and if i disable the heartbeat.enable, we cannot get any mounttable update, > because the following logic in Router.java: > {code:java} > if (conf.getBoolean( > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE, > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE_DEFAULT)) { > // Create status updater for each monitored Namenode > this.namenodeHeartbeatServices = createNamenodeHeartbeatServices(); > for (NamenodeHeartbeatService hearbeatService : > this.namenodeHeartbeatServices) { > addService(hearbeatService); > } > if (this.namenodeHeartbeatServices.isEmpty()) { > LOG.error("Heartbeat is enabled but there are no namenodes to > monitor"); > } > // Periodically update the router state > this.routerHeartbeatService = new RouterHeartbeatService(this); > addService(this.routerHeartbeatService); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13480) RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key
[ https://issues.apache.org/jira/browse/HDFS-13480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-13480: -- Attachment: HDFS-13480.001.patch > RBF: Separate namenodeHeartbeat and routerHeartbeat to different config key > --- > > Key: HDFS-13480 > URL: https://issues.apache.org/jira/browse/HDFS-13480 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: maobaolong >Assignee: maobaolong >Priority: Major > Attachments: HDFS-13480.001.patch > > > Now, if i enable the heartbeat.enable, but i do not want to monitor any > namenode, i get an ERROR log like: > {code:java} > [2018-04-19T14:00:03.057+08:00] [ERROR] > federation.router.Router.serviceInit(Router.java 214) [main] : Heartbeat is > enabled but there are no namenodes to monitor > {code} > and if i disable the heartbeat.enable, we cannot get any mounttable update, > because the following logic in Router.java: > {code:java} > if (conf.getBoolean( > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE, > RBFConfigKeys.DFS_ROUTER_HEARTBEAT_ENABLE_DEFAULT)) { > // Create status updater for each monitored Namenode > this.namenodeHeartbeatServices = createNamenodeHeartbeatServices(); > for (NamenodeHeartbeatService hearbeatService : > this.namenodeHeartbeatServices) { > addService(hearbeatService); > } > if (this.namenodeHeartbeatServices.isEmpty()) { > LOG.error("Heartbeat is enabled but there are no namenodes to > monitor"); > } > // Periodically update the router state > this.routerHeartbeatService = new RouterHeartbeatService(this); > addService(this.routerHeartbeatService); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13245) RBF: State store DBMS implementation
[ https://issues.apache.org/jira/browse/HDFS-13245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-13245: -- Attachment: (was: HDFS-13245.001) > RBF: State store DBMS implementation > > > Key: HDFS-13245 > URL: https://issues.apache.org/jira/browse/HDFS-13245 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs >Reporter: maobaolong >Assignee: Yiran Wu >Priority: Major > Attachments: HDFS-13245.001.patch, HDFS-13245.002.patch, > HDFS-13245.003.patch, HDFS-13245.004.patch, HDFS-13245.005.patch, > HDFS-13245.006.patch, HDFS-13245.007.patch, HDFS-13245.008.patch, > HDFS-13245.009.patch, HDFS-13245.010.patch > > > Add a DBMS implementation for the State Store. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13245) RBF: State store DBMS implementation
[ https://issues.apache.org/jira/browse/HDFS-13245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HDFS-13245: -- Attachment: HDFS-13245.001 > RBF: State store DBMS implementation > > > Key: HDFS-13245 > URL: https://issues.apache.org/jira/browse/HDFS-13245 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs >Reporter: maobaolong >Assignee: Yiran Wu >Priority: Major > Attachments: HDFS-13245.001, HDFS-13245.001.patch, > HDFS-13245.002.patch, HDFS-13245.003.patch, HDFS-13245.004.patch, > HDFS-13245.005.patch, HDFS-13245.006.patch, HDFS-13245.007.patch, > HDFS-13245.008.patch, HDFS-13245.009.patch, HDFS-13245.010.patch > > > Add a DBMS implementation for the State Store. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org