[ 
https://issues.apache.org/jira/browse/HBASE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14714485#comment-14714485
 ] 

Yu Li commented on HBASE-6617:
------------------------------

Hi [~zjushch],

Thanks for the review.

I've considered your point carefully, but I still think one replication source 
per WAL group is the better approach, for the reasons below:

1. w.r.t. the semantics of ReplicationSource, I believe the relationship 
between source and peer is "many-to-one" rather than "one-to-one". One 
replication source stands for one kind of source, and no matter how many kinds 
of source there are, we need to replicate them all to the specified peer. 
Before multi-WAL we simply had the special case of only one kind of source. 
Consider the heterogeneous storage implementation in HDFS: after support for 
different kinds of disks was added, the block report granularity changed from 
node-level to disk-level. I think multiple WALs are quite similar to that.

2. From a business point of view, one WAL group may stand for one business. In 
our scenario we created a grouping strategy based on namespace, which lets 
regions of the same business write into the same log group. In this case one 
source per group lets us know the replication latency of each business, at the 
regionserver or cluster level.

3. w.r.t. deleting a ReplicationSource instance, you can find the logic in 
ReplicationSourceManager#removePeer, where the source is terminated first and 
then removed from the source list.

4. w.r.t. source metrics, we will use "peerId@groupId" as the id, and when 
reporting, the metric name will look like 
"source.<peerId@groupId>.ageOfLastShippedOp"; you can find the whole logic in 
the constructor of MetricsSource. If you'd still prefer a metrics collection 
that tracks something like "per-regionserver latency to one peer", we could 
add a "MetricsReplicationPeerSourceSource", similar to 
MetricsReplicationGlobalSourceSource, for use with a strategy like randomly 
bounded region group.
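
To make points 3 and 4 concrete, here is a rough, self-contained sketch. It is 
not the code from the patch: the class PerGroupSourceSketch, its nested Source 
class and all method names are hypothetical stand-ins I made up for 
illustration. Only the "peerId@groupId" id format, the 
"source.<id>.ageOfLastShippedOp" metric name and the terminate-then-remove 
order come from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the HBase implementation.
class PerGroupSourceSketch {

    /** Stand-in for a replication source bound to one peer and one WAL group. */
    static final class Source {
        final String peerId;
        final String groupId;
        boolean terminated = false;

        Source(String peerId, String groupId) {
            this.peerId = peerId;
            this.groupId = groupId;
        }

        /** Id in the "peerId@groupId" form. */
        String id() {
            return peerId + "@" + groupId;
        }

        /** Metric name in the "source.<peerId@groupId>.ageOfLastShippedOp" form. */
        String ageOfLastShippedOpMetricName() {
            return "source." + id() + ".ageOfLastShippedOp";
        }

        void terminate() {
            terminated = true;
        }
    }

    // Many sources may point at the same peer, one per WAL group.
    private final List<Source> sources = new ArrayList<>();

    void addSource(String peerId, String groupId) {
        sources.add(new Source(peerId, groupId));
    }

    int sourceCount() {
        return sources.size();
    }

    /** Terminates every source of the peer first, then removes them from the list. */
    List<Source> removePeer(String peerId) {
        List<Source> removed = new ArrayList<>();
        for (Source s : sources) {
            if (s.peerId.equals(peerId)) {
                s.terminate();
                removed.add(s);
            }
        }
        sources.removeAll(removed);
        return removed;
    }

    public static void main(String[] args) {
        PerGroupSourceSketch manager = new PerGroupSourceSketch();
        manager.addSource("1", "ns_a");
        manager.addSource("1", "ns_b");
        manager.addSource("2", "ns_a");

        for (Source s : manager.removePeer("1")) {
            // prints e.g. source.1@ns_a.ageOfLastShippedOp terminated=true
            System.out.println(s.ageOfLastShippedOpMetricName()
                + " terminated=" + s.terminated);
        }
        System.out.println("remaining sources: " + manager.sourceCount());
    }
}
```

With a namespace-based grouping, each groupId here would correspond to one 
business, so the per-source metric gives the per-business latency mentioned in 
point 2.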

Feel free to let me know your thoughts.

> ReplicationSourceManager should be able to track multiple WAL paths
> -------------------------------------------------------------------
>
>                 Key: HBASE-6617
>                 URL: https://issues.apache.org/jira/browse/HBASE-6617
>             Project: HBase
>          Issue Type: Improvement
>          Components: Replication
>            Reporter: Ted Yu
>            Assignee: Yu Li
>             Fix For: 2.0.0, 1.3.0
>
>         Attachments: HBASE-6617.patch, HBASE-6617_v2.patch, 
> HBASE-6617_v3.patch
>
>
> Currently ReplicationSourceManager uses logRolled() to receive notification 
> about a new HLog and remembers it in latestPath.
> When the region server has multiple WAL support, we need to keep track of 
> multiple Paths in ReplicationSourceManager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
