[jira] [Commented] (HDFS-8789) Block Placement policy migrator

2022-09-14 Thread ZanderXu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603955#comment-17603955
 ] 

ZanderXu commented on HDFS-8789:


After a quick look at HDFS-14053, I see that it can migrate old blocks to satisfy 
the new block placement policy. But if all the old blocks are migrated by the 
namenode, it will affect the namenode's processing performance even if we limit 
the migration speed. It would be nice to have a peripheral migration tool that 
can migrate old blocks automatically, efficiently, and with minimal impact.

Besides this migrator, have you disabled block migration after the namenode 
becomes active, [~sodonnell]? I ask because after the namenode becomes active, 
it runs processMisReplicatedBlocks. 

> Block Placement policy migrator
> ---
>
> Key: HDFS-8789
> URL: https://issues.apache.org/jira/browse/HDFS-8789
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Chris Trezzo
> Assignee: Chris Trezzo
> Priority: Major
> Attachments: HDFS-8789-trunk-STRAWMAN-v1.patch
>
>
> As we start to add new block placement policies to HDFS, it will be necessary 
> to have a robust tool that can migrate HDFS blocks between placement 
> policies. This jira is for the design and implementation of that tool.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-8789) Block Placement policy migrator

2022-09-14 Thread ZanderXu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603948#comment-17603948
 ] 

ZanderXu commented on HDFS-8789:


Thanks [~sodonnell] for your timely comment. I will look into HDFS-14053.




[jira] [Commented] (HDFS-8789) Block Placement policy migrator

2022-09-14 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603945#comment-17603945
 ] 

Stephen O'Donnell commented on HDFS-8789:
-

I don't think this tool is needed, as we have had HDFS-14053 committed since 
this Jira was opened, which allows you to migrate blocks on a path-by-path basis.

There are no plans from our side to move this forward.




[jira] [Commented] (HDFS-8789) Block Placement policy migrator

2022-09-14 Thread ZanderXu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603933#comment-17603933
 ] 

ZanderXu commented on HDFS-8789:


We plan to use upgrade domains in our prod environment, so this tool will be 
necessary for us before deploying them. 

After looking into upgradeDomain (UD) and this migrator tool, I think there are 
some improvements we could make:
 * After deploying UD, the namenode should not try to migrate the old blocks 
during processMisReplicatesAsync after becoming active. We should migrate the 
old blocks with this migrator tool instead.
 * Maybe we can integrate this migrator into the mover. We could simply add one 
new processFile method to the mover to migrate the blocks that do not satisfy 
the block placement policy.
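
As a rough illustration of what such a per-block step would have to decide, here is a minimal, self-contained sketch. It is not code from the mover or from the attached patch: the class PlacementMovePlanner, the Move type and the planMoves method are hypothetical names, rack constraints are ignored, and it assumes the policy wants replicas spread over min(replication, requiredDomains) distinct upgrade domains.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only; all names here are hypothetical, not Hadoop APIs. */
final class PlacementMovePlanner {
    /** A proposed replica move from one datanode to another. */
    static final class Move {
        final String source;
        final String target;
        Move(String source, String target) {
            this.source = source;
            this.target = target;
        }
        @Override
        public String toString() {
            return source + " -> " + target;
        }
    }

    /**
     * @param replicas        datanode -> upgrade domain, for nodes holding a replica
     * @param candidates      datanode -> upgrade domain, for nodes without a replica
     * @param requiredDomains number of distinct domains the policy asks for (assumed)
     */
    static List<Move> planMoves(Map<String, String> replicas,
                                Map<String, String> candidates,
                                int requiredDomains) {
        // Count how many replicas currently sit in each upgrade domain.
        Map<String, Integer> perDomain = new HashMap<>();
        for (String domain : replicas.values()) {
            perDomain.merge(domain, 1, Integer::sum);
        }
        int target = Math.min(replicas.size(), requiredDomains);
        List<Move> moves = new ArrayList<>();
        for (Map.Entry<String, String> replica : replicas.entrySet()) {
            if (perDomain.size() >= target) {
                break; // already spread over enough domains
            }
            String domain = replica.getValue();
            if (perDomain.get(domain) <= 1) {
                continue; // the only replica in its domain stays where it is
            }
            // Move this replica to the first candidate in a domain not used yet.
            for (Map.Entry<String, String> candidate : candidates.entrySet()) {
                if (!perDomain.containsKey(candidate.getValue())) {
                    moves.add(new Move(replica.getKey(), candidate.getKey()));
                    perDomain.merge(domain, -1, Integer::sum);
                    perDomain.put(candidate.getValue(), 1);
                    break;
                }
            }
        }
        return moves;
    }

    public static void main(String[] args) {
        Map<String, String> replicas = new LinkedHashMap<>();
        replicas.put("dn1", "ud1");
        replicas.put("dn2", "ud1");
        replicas.put("dn3", "ud1");
        Map<String, String> candidates = new LinkedHashMap<>();
        candidates.put("dn4", "ud1");
        candidates.put("dn5", "ud2");
        candidates.put("dn6", "ud3");
        System.out.println(planMoves(replicas, candidates, 3));
    }
}
```

A real implementation would also have to respect rack placement, node load and storage types, which this sketch leaves out on purpose.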

[~weichiu] [~sodonnell] [~ctrezzo] Do you have plans to push this PR forward?  
If so, I have done some work on it and am interested in carrying it forward.




[jira] [Commented] (HDFS-8789) Block Placement policy migrator

2019-08-20 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911798#comment-16911798
 ] 

Wei-Chiu Chuang commented on HDFS-8789:
---

I went through the patch very quickly.

(1) If this tool runs for an extended period of time, say more than a day, it 
may fail in a Kerberized environment since it does not renew Kerberos 
credentials, just like HDFS-11741.
(2) I'm still not clear about the process of running this tool -- is an 
administrator supposed to run this tool before restarting the NN and enabling 
BlockPlacementPolicyWithUpgradeDomain? But the NameNode would treat the new 
replica as a mis-replicated block replica and recalculate, wouldn't it? The 
migrator has to finish migration faster than the NameNode can move replicas back.
(3) What if I simply skip the migrator step and enable UD? The NameNode would 
see a huge number of mis-replicated blocks, but it should eventually move the 
replicas to the correct location. Of course, we need to fix HDFS-14637 first.
(4) I wonder if it makes sense to integrate the migrator into the 
balancer/mover. There are already balancer/mover/diskbalancer. Having another 
similar tool in the arsenal can be a support burden. From what I see, the 
balancer doesn't support anything other than the default placement policy 
(BlockPlacementPolicyDefault). Is the migrator tool sort of like a balancer 
designed to support UD?




[jira] [Commented] (HDFS-8789) Block Placement policy migrator

2019-08-15 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908008#comment-16908008
 ] 

Wei-Chiu Chuang commented on HDFS-8789:
---

Hi [~ctrezzo]! 
Thanks for the tool. This'll be useful for us.
You mentioned this tool when we met. I am interested in carrying this 
forward and getting it into an Apache Hadoop release.




[jira] [Commented] (HDFS-8789) Block Placement policy migrator

2019-07-08 Thread Stephen O'Donnell (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880506#comment-16880506
 ] 

Stephen O'Donnell commented on HDFS-8789:
-

I've been looking into upgradeDomain (UD). To try to answer the question:

{quote}

What happens on NN restart (to be precise, during the replication queues 
initialization)? 
 Each block will be checked against 
{{BlockPlacementPolicy#verifyBlockPlacement()}} and will get added to the 
replication queue. When calculating repl work, {{chooseTarget()}} is supposed 
to help correct any violation. This happens when the network topology or the 
placement policy changes. Does it also work for upgrade domain block placement 
policy?

{quote}

What I have found is that if you start with a cluster with no UD, and then 
enable UD, then on restart the NN does notice that all the blocks violate the 
placement policy and it adds them to the replication queue. However, I believe 
there are some issues in the logic used to correct the problems in that area of 
the code.

There are at least two issues I have come across:
 # With UD enabled, but no racks configured, the queued replication work never 
makes any progress, because blockManager.validateReconstructionWork() checks 
whether the new replica increases the number of racks, and if it does not, it 
skips it and tries again later.
 # In blockManager.scheduleReconstruction there is some logic that says if 
`numReplicas.liveReplicas() >= requiredRedundancy` then we need only 1 new 
replica. This would also be the case for rack redundancy (we always want 2 
racks), where 1 new replica is enough, but for UD, we may need 2 new replicas 
if all 3 existing are on the same UD.
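
To make the arithmetic behind the second issue concrete, here is a minimal, self-contained sketch. It is not the BlockManager logic: DomainSpread and additionalDomainsNeeded are hypothetical names, and the sketch assumes the goal is min(replication, requiredDomains) distinct racks or upgrade domains.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only; not the BlockManager implementation. */
final class DomainSpread {
    /**
     * Returns how many replicas must land in domains not used yet so that the
     * block spans min(replication, requiredDomains) distinct domains.
     * "Domain" stands for either a rack or an upgrade domain.
     */
    static int additionalDomainsNeeded(List<String> replicaDomains, int requiredDomains) {
        Set<String> distinct = new HashSet<>(replicaDomains);
        int target = Math.min(replicaDomains.size(), requiredDomains);
        return Math.max(0, target - distinct.size());
    }

    public static void main(String[] args) {
        // Rack case: three replicas on one rack, two racks wanted.
        System.out.println(additionalDomainsNeeded(Arrays.asList("r1", "r1", "r1"), 2)); // prints 1
        // UD case: three replicas in one upgrade domain, three domains wanted.
        System.out.println(additionalDomainsNeeded(Arrays.asList("ud1", "ud1", "ud1"), 3)); // prints 2
    }
}
```

With a rack requirement of 2 the answer can never exceed 1 for a fully replicated block, which is why scheduling a single new replica works there but not for UD.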

I will open a new Jira for this to see if we can get it fixed, but it may be 
slightly trickier than it sounds with the current code structure.

Note that we have also had HDFS-14053 committed since this Jira was opened, 
which allows mis-replicated blocks to be processed via fsck on a path-by-path 
basis.




[jira] [Commented] (HDFS-8789) Block Placement policy migrator

2017-05-10 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16004804#comment-16004804
 ] 

Kihwal Lee commented on HDFS-8789:
--

What happens on NN restart (to be precise, during the replication queues 
initialization)? 
Each block will be checked against 
{{BlockPlacementPolicy#verifyBlockPlacement()}} and will get added to the 
replication queue. When calculating repl work, {{chooseTarget()}} is supposed 
to help correct any violation. This happens when the network topology or the 
placement policy changes.  Does it also work for upgrade domain block placement 
policy? It will stress the namenode if there are a lot of replicas to be 
migrated, but will do the job.  I.e. option number 2 might already be there.

If we are worried about overwhelming the NN, we could have a separate thread do 
a throttled scan of block placement policy violations. The replication 
candidate generation rate would also be throttled. The repl queue init would 
only deal with truly under-replicated cases. 
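
A throttled scan of this kind could look roughly like the following self-contained sketch. It is only an illustration of the idea, not NameNode code: ThrottledPlacementScanner, scanOnePass and isDone are hypothetical names, blocks are represented by plain values, and the policy check is passed in as a predicate. A background thread would call scanOnePass at a fixed interval, so both the scan rate and the candidate generation rate are bounded by maxBlocksPerPass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** Illustrative only; the class and method names are hypothetical. */
final class ThrottledPlacementScanner<B> {
    private final Iterator<B> blocks;
    private final Predicate<B> violatesPolicy;
    private final int maxBlocksPerPass;

    ThrottledPlacementScanner(Iterator<B> blocks, Predicate<B> violatesPolicy,
                              int maxBlocksPerPass) {
        if (maxBlocksPerPass <= 0) {
            throw new IllegalArgumentException("maxBlocksPerPass must be positive");
        }
        this.blocks = blocks;
        this.violatesPolicy = violatesPolicy;
        this.maxBlocksPerPass = maxBlocksPerPass;
    }

    /** Examines at most maxBlocksPerPass blocks and returns the violators found. */
    List<B> scanOnePass() {
        List<B> violations = new ArrayList<>();
        int examined = 0;
        while (examined < maxBlocksPerPass && blocks.hasNext()) {
            B block = blocks.next();
            examined++;
            if (violatesPolicy.test(block)) {
                violations.add(block);
            }
        }
        return violations;
    }

    /** True once every block has been examined. */
    boolean isDone() {
        return !blocks.hasNext();
    }

    public static void main(String[] args) {
        // Pretend even block IDs violate the placement policy.
        ThrottledPlacementScanner<Integer> scanner = new ThrottledPlacementScanner<>(
                Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).iterator(),
                id -> id % 2 == 0,
                4);
        while (!scanner.isDone()) {
            // In a real daemon this call would run on a timer, not in a tight loop.
            System.out.println(scanner.scanOnePass());
        }
    }
}
```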




[jira] [Commented] (HDFS-8789) Block Placement policy migrator

2017-05-08 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001194#comment-16001194
 ] 

Chris Trezzo commented on HDFS-8789:


[~kihwal] Thanks for the comment! As I said, this is just a strawman and is 
nowhere near ready for commit. I just posted it as a starting point for 
conversation, i.e. where should the functionality of block placement policy 
migration be supported, and what are some high-level approaches people would be 
interested in?

I can see four options off the top of my head for supporting block placement 
policy migration:
1. Client side
2. Namenode
3. Additional server side daemon
4. Nowhere (i.e. we leave it up to users to build their own tools)




[jira] [Commented] (HDFS-8789) Block Placement policy migrator

2017-05-08 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001021#comment-16001021
 ] 

Kihwal Lee commented on HDFS-8789:
--

This can be fatal to the namenode. 
{code}
blocksWlocs = nnc.getBlocks(node.datanode, Long.MAX_VALUE);
{code}




[jira] [Commented] (HDFS-8789) Block Placement policy migrator

2017-05-03 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15995424#comment-15995424
 ] 

Chris Trezzo commented on HDFS-8789:


[~kihwal] [~zhz] I am not sure what your plans are to support migration from 
one block placement policy to another (i.e. migrating from default to upgrade 
domains), but I have a patch posted on this jira which contains the migrator 
that we used to convert our clusters. It definitely does not handle all use 
cases, so I would be interested to hear your approaches as well. Thanks!




[jira] [Commented] (HDFS-8789) Block Placement policy migrator

2016-03-04 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15180649#comment-15180649
 ] 

Ming Ma commented on HDFS-8789:
---

Maybe we can use the migrator as the first scenario for doing block movement 
scheduling inside the namenode? Changing the balancer from client-side to 
server-side requires more work to keep the command line backward compatible, 
given it is used by admins and automation. The migrator is a brand new tool and 
we can use it as the starting point for a proper design that accommodates the 
migrator, balancer and other scenarios inside the namenode.

Other useful statistics provided by the migrator tool, such as block size 
distribution, block rack diversity distribution and block replication 
distribution, are somewhat independent of the migrator. Maybe we can have 
another tool generate those stats, or maybe fsck.



[jira] [Commented] (HDFS-8789) Block Placement policy migrator

2016-02-05 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15135414#comment-15135414
 ] 

Chris Trezzo commented on HDFS-8789:


Also, here is the usage message for the command:
{noformat}
USAGE for Migrator:
-help
    Displays this help message.
-placementpolicy
    The target BlockPlacementPolicy that the migrator will migrate blocks
    to. If not specified the migrator will use BlockPlacementPolicyDefault.
-fixblocks
    Phase 3 dispatches block moves on the cluster. If this flag is not
    specified, Phase 3 simply prints out the dry run moves, but does not
    dispatch anything.
-blockpool [blockpoolid]
    Only run the migrator on the specified block pool. If not specified,
    the migrator will run on all blockpools.
-datanode [hostname]
    Only run the migrator on the specified datanode. If not specified, the
    migrator will run on all datanodes.
{noformat}
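
For readers who want to see how these flags and their defaults fit together, here is a minimal, self-contained sketch of an option parser. It is not taken from the patch: MigratorOptions and its fields are hypothetical names, and the sketch assumes that -placementpolicy takes the policy class name as its next argument, which the usage text implies but does not show.

```java
/** Illustrative only; hypothetical holder for the flags in the usage message. */
final class MigratorOptions {
    boolean help = false;
    String placementPolicy = "BlockPlacementPolicyDefault"; // default named in the usage text
    boolean fixBlocks = false; // false means Phase 3 is a dry run
    String blockPool = null;   // null means all block pools
    String datanode = null;    // null means all datanodes

    static MigratorOptions parse(String[] args) {
        MigratorOptions options = new MigratorOptions();
        for (int i = 0; i < args.length; i++) {
            switch (args[i]) {
                case "-help":
                    options.help = true;
                    break;
                case "-fixblocks":
                    options.fixBlocks = true;
                    break;
                case "-placementpolicy":
                    // Assumption: the policy name follows the flag.
                    options.placementPolicy = value(args, ++i, "-placementpolicy");
                    break;
                case "-blockpool":
                    options.blockPool = value(args, ++i, "-blockpool");
                    break;
                case "-datanode":
                    options.datanode = value(args, ++i, "-datanode");
                    break;
                default:
                    throw new IllegalArgumentException("Unknown option: " + args[i]);
            }
        }
        return options;
    }

    /** Returns the argument at index i, or fails if the flag has no value. */
    private static String value(String[] args, int i, String flag) {
        if (i >= args.length) {
            throw new IllegalArgumentException(flag + " requires a value");
        }
        return args[i];
    }

    public static void main(String[] args) {
        MigratorOptions options = parse(args);
        System.out.println("policy=" + options.placementPolicy
                + " fixBlocks=" + options.fixBlocks
                + " blockPool=" + options.blockPool
                + " datanode=" + options.datanode);
    }
}
```

Keeping -fixblocks off by default matches the usage text above: without it, Phase 3 only prints the moves it would make.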



[jira] [Commented] (HDFS-8789) Block Placement policy migrator

2015-09-09 Thread Xu Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738070#comment-14738070
 ] 

Xu Chen commented on HDFS-8789:
---

Hi [~ctrezzo], as you say, what would that "robust tool" look like? A command 
line tool? RaidNode? Or code built into HDFS? Would block migration happen 
automatically on client access?



[jira] [Commented] (HDFS-8789) Block Placement policy migrator

2015-07-16 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630162#comment-14630162
 ] 

Chris Trezzo commented on HDFS-8789:


I have an initial patch for a migrator that we can use as a starting point. I 
will post it shortly.
