[ 
https://issues.apache.org/jira/browse/HDDS-7776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679802#comment-17679802
 ] 

Stephen O'Donnell commented on HDDS-7776:
-----------------------------------------

[~kerneltime] We are working on moving to a ReplicationManager model where we 
can limit the number of inflight replications in the system on a per-node 
basis. Due to long-lived pipelines and EC decommission, the bottleneck for 
replication is often going to be the source node, and we would like to make 
sure a source node is not over-subscribed.

In the current model, commands are sent to a target, which pulls from the 
sources. If there are a large number of under-replicated containers with only 
a few sources (e.g. decommissioning an EC host, or one node dropping out of a 
long-lived pipeline), there will be many potential targets to replicate to, but 
only a few sources. If we schedule X commands on, say, 20 targets, then 2 
sources may need to service all those commands in parallel. It is easier if we 
can decide in RM how many commands a source can handle and have the source push 
the data to various targets. That way, we just need to count the commands 
assigned to a DN to limit the load. Targets are more likely to be chosen at 
random, while sources in Ozone generally are not. Moving the replication to 
this model will let us better manage the load in RM and hopefully avoid 
over-subscribing nodes.
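
To illustrate the counting idea, here is a minimal sketch and not actual Ozone 
code. The class SourceLoadLimiter, its methods, and the per-source cap are all 
hypothetical names chosen for this example. It assumes RM keeps an inflight 
counter per source DN and only hands out a push command while the source is 
below its cap.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/**
 * Hypothetical sketch (not part of Ozone): caps the number of inflight
 * push-replication commands assigned to each source datanode.
 */
class SourceLoadLimiter {
  // Assumed limit on concurrent push commands per source.
  private final int maxInflightPerSource;
  // Source datanode id -> number of commands currently inflight.
  private final Map<String, Integer> inflight = new HashMap<>();

  SourceLoadLimiter(int maxInflightPerSource) {
    this.maxInflightPerSource = maxInflightPerSource;
  }

  /**
   * Picks the least-loaded source that still has capacity and reserves one
   * slot on it. Returns empty if every candidate source is at its cap.
   */
  Optional<String> tryAssign(List<String> candidateSources) {
    String best = null;
    int bestCount = Integer.MAX_VALUE;
    for (String source : candidateSources) {
      int count = inflight.getOrDefault(source, 0);
      if (count < maxInflightPerSource && count < bestCount) {
        best = source;
        bestCount = count;
      }
    }
    if (best == null) {
      return Optional.empty();
    }
    inflight.merge(best, 1, Integer::sum);
    return Optional.of(best);
  }

  /** Releases one slot when a push command completes or times out. */
  void onCompleted(String source) {
    inflight.computeIfPresent(source, (k, v) -> v > 1 ? v - 1 : null);
  }

  int inflightFor(String source) {
    return inflight.getOrDefault(source, 0);
  }
}
```

With a cap of 2 and two sources, the limiter hands out at most four commands no 
matter how many targets are waiting. The remaining under-replicated containers 
stay queued until a source reports a completion and frees a slot.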

> Container replication in push model
> -----------------------------------
>
>                 Key: HDDS-7776
>                 URL: https://issues.apache.org/jira/browse/HDDS-7776
>             Project: Apache Ozone
>          Issue Type: Task
>          Components: Ozone Datanode, SCM
>            Reporter: Attila Doroszlai
>            Assignee: Attila Doroszlai
>            Priority: Major
>              Labels: pull-request-available
>
> Container replication works in a pull model: SCM asks the target datanode to 
> download the container, listing available source datanodes.
> The goal of this task is to add an alternative implementation, where the 
> source datanode pushes the container to some target datanode.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

