[jira] [Commented] (HDFS-8789) Block Placement policy migrator
[ https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603955#comment-17603955 ] ZanderXu commented on HDFS-8789: After a quick look at HDFS-14053, I see that it can migrate old blocks so that they satisfy the new block placement policy. But if all the old blocks are migrated by the NameNode, its processing performance will suffer even if we limit the migration speed. It would be nice to have a peripheral migration tool that can migrate old blocks automatically, efficiently, and with minimal impact. Besides this migrator, [~sodonnell], have you disabled block migration after the NameNode becomes active? I ask because once the NameNode becomes active, it runs processMisReplicatedBlocks. > Block Placement policy migrator > --- > > Key: HDFS-8789 > URL: https://issues.apache.org/jira/browse/HDFS-8789 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Chris Trezzo >Assignee: Chris Trezzo >Priority: Major > Attachments: HDFS-8789-trunk-STRAWMAN-v1.patch > > > As we start to add new block placement policies to HDFS, it will be necessary > to have a robust tool that can migrate HDFS blocks between placement > policies. This jira is for the design and implementation of that tool. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-8789) Block Placement policy migrator
[ https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603948#comment-17603948 ] ZanderXu commented on HDFS-8789: Thanks [~sodonnell] for your timely comment. I will look into HDFS-14053. Thanks
[jira] [Commented] (HDFS-8789) Block Placement policy migrator
[ https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603945#comment-17603945 ] Stephen O'Donnell commented on HDFS-8789: I don't think this tool is needed, as we have had HDFS-14053 committed since this Jira was opened, which allows you to migrate blocks on a path-by-path basis. There are no plans from our side to move this forward.
[jira] [Commented] (HDFS-8789) Block Placement policy migrator
[ https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603933#comment-17603933 ] ZanderXu commented on HDFS-8789:

We plan to use upgrade domains in our production environment, so we will need this tool before deploying them. After looking into upgrade domains (UD) and this migrator tool, I see a few possible improvements:
* After UD is deployed, the NameNode should not try to migrate the old blocks during processMisReplicatesAsync once it becomes active. The old blocks should be migrated by this migrator tool instead.
* Maybe we can integrate this migrator into the mover. We could simply add one new processFile method to the mover that migrates the blocks that do not satisfy the block placement policy.

[~weichiu] [~sodonnell] [~ctrezzo] Do you have plans to push this PR forward? If so, I have done some work on it and am interested in carrying it forward.
[jira] [Commented] (HDFS-8789) Block Placement policy migrator
[ https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911798#comment-16911798 ] Wei-Chiu Chuang commented on HDFS-8789:

I went through the patch very quickly.
(1) If this tool runs for an extended period of time, say more than a day, it may fail in a Kerberized environment because it does not renew Kerberos credentials, just like HDFS-11741.
(2) I'm still not clear about the process of running this tool. Is an administrator supposed to run it before restarting the NN and enabling BlockPlacementPolicyWithUpgradeDomain? But the NameNode would treat the new replicas as mis-replicated and recalculate, wouldn't it? The migrator would have to finish the migration faster than the NameNode can move replicas back.
(3) What if I simply skip the migrator step and enable UD? The NameNode would see a huge number of mis-replicated blocks, but it should eventually move the replicas to the correct locations. Of course, we need to fix HDFS-14637 first.
(4) I wonder if it makes sense to integrate the migrator into the balancer/mover. There are already the balancer, mover, and diskbalancer, and having another similar tool in the arsenal can be a support burden. From what I see, the balancer doesn't support anything other than the default placement policy (BlockPlacementPolicyDefault). Is the migrator tool sort of like a balancer designed to support UD?
[jira] [Commented] (HDFS-8789) Block Placement policy migrator
[ https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908008#comment-16908008 ] Wei-Chiu Chuang commented on HDFS-8789: Hi [~ctrezzo]! Thanks for the tool. This'll be useful for us. You mentioned this tool when we met. I am interested in carrying this forward and getting it into an Apache Hadoop release.
[jira] [Commented] (HDFS-8789) Block Placement policy migrator
[ https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880506#comment-16880506 ] Stephen O'Donnell commented on HDFS-8789:

I've been looking into upgradeDomain (UD). To try to answer the question:
{quote}
What happens on NN restart (to be precise, during the replication queues initialization)? Each block will be checked against {{BlockPlacementPolicy#verifyBlockPlacement()}} and will get added to the replication queue. When calculating repl work, {{chooseTarget()}} is supposed to help correct any violation. This happens when the network topology or the placement policy changes. Does it also work for upgrade domain block placement policy?
{quote}
What I have found is that if you start with a cluster with no UD and then enable UD, then on restart the NN does notice that all the blocks violate the placement policy, and it adds them to the replication queue. However, I believe there are some issues in the logic used to correct the problems in that area of the code. There are at least two issues I have come across:
# With UD enabled but no racks configured, the queued replication work never makes any progress: blockManager.validateReconstructionWork() checks whether the new replica increases the number of racks, and if it does not, it skips the work and tries again later.
# In blockManager.scheduleReconstruction there is some logic that says if `numReplicas.liveReplicas() >= requiredRedundancy` then we need only 1 new replica. This would also be the case for rack redundancy (we always want 2 racks), but for UD, we may need 2 new replicas if all 3 existing replicas are on the same UD.

I will open a new Jira for this to see if we can get it fixed, but it may be slightly trickier than it sounds with the current code structure. Note that we also have HDFS-14053 committed since this Jira was opened, which allows mis-replicated blocks to be processed via fsck on a path-by-path basis.
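To make the second issue above concrete, here is a minimal, self-contained sketch of the arithmetic. It is not taken from BlockManager; the class name, the method name, and the idea of passing the required number of distinct domains as a parameter are all assumptions made for illustration only.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Hadoop: counts how many additional
// replicas (each placed in a not-yet-used domain) would be needed for a
// block to span the required number of distinct domains.
final class UpgradeDomainShortfall {
  private UpgradeDomainShortfall() {}

  static int additionalReplicasNeeded(List<String> replicaDomains, int requiredDomains) {
    // Collapse the per-replica domain labels into the set of domains in use.
    Set<String> distinct = new HashSet<>(replicaDomains);
    // Never negative: a block that already spans enough domains needs nothing.
    return Math.max(0, requiredDomains - distinct.size());
  }
}
```

With three replicas all in one domain, a requirement of 2 distinct domains (the rack-style case) yields 1 new replica, while a requirement of 3 distinct upgrade domains yields 2, which is the case a fixed "one new replica" assumption would not cover.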
[jira] [Commented] (HDFS-8789) Block Placement policy migrator
[ https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16004804#comment-16004804 ] Kihwal Lee commented on HDFS-8789: What happens on NN restart (to be precise, during the replication queues initialization)? Each block will be checked against {{BlockPlacementPolicy#verifyBlockPlacement()}} and will get added to the replication queue. When calculating repl work, {{chooseTarget()}} is supposed to help correct any violation. This happens when the network topology or the placement policy changes. Does it also work for upgrade domain block placement policy? It will stress the namenode if there are a lot of replicas to be migrated, but will do the job. I.e., option number 2 might already be there. If we are worried about overwhelming the NN, we could have a separate thread do a throttled scan of block placement policy violations. The replication candidate generation rate would also be throttled. The repl queue init would only deal with truly under-replicated cases.
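The throttled-scan idea in the comment above could look roughly like the following sketch. It is only an illustration under stated assumptions: the class, its parameters (blocksPerBatch, pauseMillis), and the use of plain block IDs with a caller-supplied predicate are hypothetical and do not correspond to existing NameNode code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.LongPredicate;

// Hypothetical sketch: checks blocks in small batches and pauses between
// batches, so the rate at which placement violations are discovered (and
// hence replication candidates generated) stays bounded.
final class ThrottledPlacementScanner {
  private final int blocksPerBatch;
  private final long pauseMillis;

  ThrottledPlacementScanner(int blocksPerBatch, long pauseMillis) {
    if (blocksPerBatch <= 0 || pauseMillis < 0) {
      throw new IllegalArgumentException("batch size must be > 0 and pause >= 0");
    }
    this.blocksPerBatch = blocksPerBatch;
    this.pauseMillis = pauseMillis;
  }

  // Examines at most blocksPerBatch blocks and returns the violating ones.
  List<Long> scanBatch(Iterator<Long> blocks, LongPredicate violatesPolicy) {
    List<Long> violations = new ArrayList<>();
    for (int i = 0; i < blocksPerBatch && blocks.hasNext(); i++) {
      long blockId = blocks.next();
      if (violatesPolicy.test(blockId)) {
        violations.add(blockId);
      }
    }
    return violations;
  }

  // Scans everything that remains, sleeping between batches. Stops early
  // (keeping what was found so far) if the scanning thread is interrupted.
  List<Long> scanAll(Iterator<Long> blocks, LongPredicate violatesPolicy) {
    List<Long> all = new ArrayList<>();
    while (blocks.hasNext()) {
      all.addAll(scanBatch(blocks, violatesPolicy));
      if (blocks.hasNext()) {
        try {
          Thread.sleep(pauseMillis);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          break;
        }
      }
    }
    return all;
  }
}
```

In a real NameNode the predicate would presumably wrap the placement check and the results would feed the replication queue; here both are left to the caller so the sketch stays self-contained.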
[jira] [Commented] (HDFS-8789) Block Placement policy migrator
[ https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001194#comment-16001194 ] Chris Trezzo commented on HDFS-8789:

[~kihwal] Thanks for the comment! As I said, this is just a strawman and is nowhere near ready for commit. I just posted it as a starting point for conversation, i.e. where should the functionality of block placement policy migration be supported, and what are some high-level approaches people would be interested in? I can see four options off the top of my head for supporting block placement policy migration:
1. Client side
2. Namenode
3. Additional server side daemon
4. Nowhere (i.e. we leave it up to users to build their own tools)
[jira] [Commented] (HDFS-8789) Block Placement policy migrator
[ https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001021#comment-16001021 ] Kihwal Lee commented on HDFS-8789:

This can be fatal to namenode.
{code}
blocksWlocs = nnc.getBlocks(node.datanode, Long.MAX_VALUE);
{code}
[jira] [Commented] (HDFS-8789) Block Placement policy migrator
[ https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15995424#comment-15995424 ] Chris Trezzo commented on HDFS-8789: [~kihwal] [~zhz] I am not sure what your plans are to support migration from one block placement policy to another (i.e. migrating from default to upgrade domains), but I have a patch posted on this jira which contains the migrator that we used to convert our clusters. It definitely does not handle all use cases, so I would be interested to hear your approaches as well. Thanks!
[jira] [Commented] (HDFS-8789) Block Placement policy migrator
[ https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15180649#comment-15180649 ] Ming Ma commented on HDFS-8789: Maybe we can use the migrator as the first scenario for doing block movement scheduling inside the namenode? Changing the balancer from client-side to server-side requires more work to keep the command line backward compatible, given it is used by admins and automation. The migrator is a brand new tool, and we can use it as the starting point for a proper design that accommodates the migrator, the balancer and other scenarios done inside the namenode. Other useful statistics provided by the migrator tool, such as block size distribution, block rack diversity distribution and block replication distribution, are somewhat independent of the migrator. Maybe we can have another tool to generate those stats, or maybe fsck.
[jira] [Commented] (HDFS-8789) Block Placement policy migrator
[ https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15135414#comment-15135414 ] Chris Trezzo commented on HDFS-8789:

Also, here is the usage message for the command:
{noformat}
USAGE for Migrator:
  -help
      Displays this help message.
  -placementpolicy
      The target BlockPlacementPolicy that the migrator will migrate blocks to.
      If not specified the migrator will use BlockPlacementPolicyDefault.
  -fixblocks
      Phase 3 dispatches block moves on the cluster. If this flag is not
      specified, Phase 3 simply prints out the dry run moves, but does not
      dispatch anything.
  -blockpool [blockpoolid]
      Only run the migrator on the specified block pool. If not specified,
      the migrator will run on all blockpools.
  -datanode [hostname]
      Only run the migrator on the specified datanode. If not specified,
      the migrator will run on all datanodes.
{noformat}
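As a rough illustration of how the flags in the usage message above combine, here is a small parser sketch. It is not the code from the attached patch: the class and field names are hypothetical, and it assumes that -placementpolicy takes the policy class name as its next argument, which the usage text implies but does not show.

```java
import java.util.Optional;

// Hypothetical option holder for the migrator flags listed in the usage
// message. Defaults follow the usage text: dry run unless -fixblocks is
// given, BlockPlacementPolicyDefault as the target policy, and no
// block pool or datanode filter.
final class MigratorOptions {
  boolean help;
  boolean fixBlocks; // false means Phase 3 only prints dry-run moves
  String placementPolicy = "BlockPlacementPolicyDefault";
  Optional<String> blockPool = Optional.empty();
  Optional<String> datanode = Optional.empty();

  static MigratorOptions parse(String[] args) {
    MigratorOptions opts = new MigratorOptions();
    for (int i = 0; i < args.length; i++) {
      switch (args[i]) {
        case "-help":
          opts.help = true;
          break;
        case "-fixblocks":
          opts.fixBlocks = true;
          break;
        case "-placementpolicy": // assumed to take a value
          opts.placementPolicy = value(args, ++i);
          break;
        case "-blockpool":
          opts.blockPool = Optional.of(value(args, ++i));
          break;
        case "-datanode":
          opts.datanode = Optional.of(value(args, ++i));
          break;
        default:
          throw new IllegalArgumentException("Unknown option: " + args[i]);
      }
    }
    return opts;
  }

  // Returns the argument at index i, or fails if the flag had no value.
  private static String value(String[] args, int i) {
    if (i >= args.length) {
      throw new IllegalArgumentException("Missing value for " + args[i - 1]);
    }
    return args[i];
  }
}
```

For example, parsing `-datanode dn1.example.com` alone (the hostname is made up) would describe a dry run restricted to that one datanode across all block pools; adding `-fixblocks` would turn the same selection into an actual dispatch of moves.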
[jira] [Commented] (HDFS-8789) Block Placement policy migrator
[ https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738070#comment-14738070 ] Xu Chen commented on HDFS-8789: Hi [~ctrezzo], what would the "robust tool" you mention look like? A command-line tool? Something like RaidNode? Or built-in code? Would blocks be migrated automatically when a client accesses them?
[jira] [Commented] (HDFS-8789) Block Placement policy migrator
[ https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630162#comment-14630162 ] Chris Trezzo commented on HDFS-8789: I have an initial patch for a migrator that we can use as a starting point. I will post it shortly. > Block Placement policy migrator > --- > > Key: HDFS-8789 > URL: https://issues.apache.org/jira/browse/HDFS-8789 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Chris Trezzo >Assignee: Chris Trezzo > > As we start to add new block placement policies to HDFS, it will be necessary > to have a robust tool that can migrate HDFS blocks to the new placement > policy. This jira is for the design and implementation of that tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)