[jira] [Updated] (HDFS-17868) introduce BlockPlacementPolicyCrossDC for multi datacenter stretched hdfs cluster

YUBI LEE (Jira) Fri, 02 Jan 2026 00:38:09 -0800


     [ https://issues.apache.org/jira/browse/HDFS-17868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


YUBI LEE updated HDFS-17868:
----------------------------
    Description: 
Inspired by the ideas presented in https://dan.naver.com/25/sessions/692 and 
https://www.youtube.com/watch?v=1h4k_Dbt0t8, I implemented a new block 
placement policy called BlockPlacementPolicyCrossDC.
Thanks to [~acedia28] for the original ideas. It would be great if [~acedia28] 
could share an improved or more mature version of this block placement policy.

This implementation introduces the following configuration options
(default values are shown in parentheses):

{code}
dfs.block.replicator.cross.dc.async.enabled (false)
dfs.block.replicator.cross.dc.preferred.datacenter
dfs.block.replicator.cross.dc.bandwidth.limit.mb (5120)
dfs.block.replicator.cross.dc.bandwidth.refill.period.sec (1)
dfs.block.replicator.cross.dc.sync.paths
dfs.block.replicator.cross.dc.limited.sync.paths
{code}
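
As a rough illustration only (this is not the code from the pull request), the options above could be read as in the sketch below. The value types (a boolean, whole numbers, comma-separated path lists) and the empty defaults for the options listed without a value are assumptions, and plain java.util.Properties stands in for the Hadoop configuration classes to keep the example self-contained. The class name CrossDcSettings is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.stream.Collectors;

// Hypothetical holder for the options listed above; not part of Hadoop.
final class CrossDcSettings {
    static final String PREFIX = "dfs.block.replicator.cross.dc.";

    final boolean asyncEnabled;
    final String preferredDatacenter;   // null when unset (assumed: no default)
    final long bandwidthLimitMb;
    final long refillPeriodSec;
    final List<String> syncPaths;
    final List<String> limitedSyncPaths;

    CrossDcSettings(Properties p) {
        // Defaults mirror the values shown in parentheses in the list above.
        asyncEnabled = Boolean.parseBoolean(p.getProperty(PREFIX + "async.enabled", "false"));
        preferredDatacenter = p.getProperty(PREFIX + "preferred.datacenter");
        bandwidthLimitMb = Long.parseLong(p.getProperty(PREFIX + "bandwidth.limit.mb", "5120"));
        refillPeriodSec = Long.parseLong(p.getProperty(PREFIX + "bandwidth.refill.period.sec", "1"));
        // Assumption: the path options hold comma-separated lists.
        syncPaths = splitPaths(p.getProperty(PREFIX + "sync.paths", ""));
        limitedSyncPaths = splitPaths(p.getProperty(PREFIX + "limited.sync.paths", ""));
    }

    // Splits on commas, trims whitespace, and drops empty entries.
    static List<String> splitPaths(String raw) {
        return Arrays.stream(raw.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty(PREFIX + "limited.sync.paths", "/data/bulk, /data/logs");
        CrossDcSettings s = new CrossDcSettings(p);
        System.out.println(s.bandwidthLimitMb + " MB per " + s.refillPeriodSec
                + " s, limited sync paths " + s.limitedSyncPaths);
    }
}
```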

Based on the ideas from the session mentioned above, this policy supports three 
different HDFS block write modes:

1. Synchronous write
The standard HDFS behavior, where block replicas are synchronously written to 
all target DataNodes.

2. Limited synchronous write
Uses bucket4j to rate-limit cross-datacenter traffic.
While traffic stays within the configured bandwidth limit, writes are performed 
synchronously; writes that would exceed the limit fall back to asynchronous 
replication.

3. Asynchronous write
The client is initially given only DataNode candidates located in the same 
datacenter.
The resulting under-replicated blocks are later replicated to other 
datacenters by background HDFS replication.
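
To make the three modes concrete, here is a minimal sketch of how a write could be classified. It is not the code from the pull request. Several things in it are assumptions: the precedence (sync paths first, then limited sync paths, then the async flag), prefix matching on paths, and one token standing for 1 MB of cross-datacenter traffic. The small TokenBucket class is a hand-rolled stand-in for bucket4j with a simple "refill to full every period" rule, and all class and method names are hypothetical.

```java
import java.util.List;

// Hypothetical sketch of the three write modes; not part of Hadoop.
final class CrossDcWriteModeSketch {
    enum WriteMode { SYNC, LIMITED_SYNC, ASYNC }

    // Minimal token bucket standing in for bucket4j (simplified refill rule).
    static final class TokenBucket {
        private final long capacity;
        private final long refillPeriodNanos;
        private long tokens;
        private long lastRefillNanos;

        TokenBucket(long capacity, long refillPeriodNanos, long nowNanos) {
            this.capacity = capacity;
            this.refillPeriodNanos = refillPeriodNanos;
            this.tokens = capacity;
            this.lastRefillNanos = nowNanos;
        }

        // Takes `amount` tokens if available; otherwise leaves the bucket unchanged.
        synchronized boolean tryConsume(long amount, long nowNanos) {
            long periods = (nowNanos - lastRefillNanos) / refillPeriodNanos;
            if (periods > 0) {
                tokens = capacity; // refill to full once at least one period has elapsed
                lastRefillNanos += periods * refillPeriodNanos;
            }
            if (amount > tokens) {
                return false;
            }
            tokens -= amount;
            return true;
        }
    }

    // True when `path` equals one of the prefixes or lies underneath it.
    static boolean matches(String path, List<String> prefixes) {
        for (String prefix : prefixes) {
            String dir = prefix.endsWith("/") ? prefix : prefix + "/";
            if (path.equals(prefix) || path.startsWith(dir)) {
                return true;
            }
        }
        return false;
    }

    // Assumed precedence: sync paths, then limited sync paths, then the async flag.
    static WriteMode configuredMode(String path, List<String> syncPaths,
                                    List<String> limitedSyncPaths, boolean asyncEnabled) {
        if (matches(path, syncPaths)) {
            return WriteMode.SYNC;
        }
        if (matches(path, limitedSyncPaths)) {
            return WriteMode.LIMITED_SYNC;
        }
        return asyncEnabled ? WriteMode.ASYNC : WriteMode.SYNC;
    }

    // Decides whether this block's remote replicas are written in the client pipeline.
    static boolean writeCrossDcSynchronously(WriteMode mode, TokenBucket bucket,
                                             long blockMb, long nowNanos) {
        switch (mode) {
            case SYNC:
                return true;
            case ASYNC:
                return false;
            default:
                // LIMITED_SYNC: synchronous only while the bandwidth budget lasts.
                return bucket.tryConsume(blockMb, nowNanos);
        }
    }

    public static void main(String[] args) {
        long now = System.nanoTime();
        TokenBucket bucket = new TokenBucket(5120, 1_000_000_000L, now);
        WriteMode mode = configuredMode("/data/bulk/part-0",
                List.of("/data/critical"), List.of("/data/bulk"), true);
        System.out.println(mode + " sync=" + writeCrossDcSynchronously(mode, bucket, 128, now));
    }
}
```

When the limited mode runs out of budget, the block is handled as in the asynchronous mode: the client pipeline stays inside one datacenter and the remaining replicas are left to background replication.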

  was:
I got ideas from https://dan.naver.com/25/sessions/692, 
https://www.youtube.com/watch?v=1h4k_Dbt0t8, I implemented 
"BlockPlacementPolicyCrossDC" policy. Thanks to [~acedia28].
It would be better if [~acedia28] shares the better version of the block 
placement policy.

It introduces some configurations:
(default value written in parenthesis)

{code}
dfs.block.replicator.cross.dc.async.enabled (false)
dfs.block.replicator.cross.dc.preferred.datacenter
dfs.block.replicator.cross.dc.bandwidth.limit.mb (5120)
dfs.block.replicator.cross.dc.bandwidth.refill.period.sec (1)
dfs.block.replicator.cross.dc.sync.paths
dfs.block.replicator.cross.dc.limited.sync.paths
{code}

According to ideas from the session I mentioned above, this policy introduces 3 
ways to write hdfs block.

- sync write: the original hdfs way
- limited sync write: using bucket4j, sync write < threshold, async write > 
threshold.
- async write: return datanode candidates only which locate the same datacenter 
to hdfs client, under replicated blocks will replicated later in asynchronous 
way.




> introduce BlockPlacementPolicyCrossDC for multi datacenter stretched hdfs 
> cluster
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-17868
>                 URL: https://issues.apache.org/jira/browse/HDFS-17868
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: block placement
>            Reporter: YUBI LEE
>            Priority: Major
>              Labels: pull-request-available



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
