[ 
https://issues.apache.org/jira/browse/HDDS-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18063091#comment-18063091
 ] 

Ivan Andika edited comment on HDDS-3936 at 3/5/26 2:32 AM:
-----------------------------------------------------------

Ideally we would use the OM ID (not the OM node ID), similar to the SCM ID 
(which we already use to identify a unique SCM in transfer leadership). The OM 
ID should be unique for each OM, whereas OM node IDs do not prevent a mismatch 
between client-side and server-side configuration (e.g. the same OM server can 
be referred to as om1 in the client-side conf but as om4 in the server-side 
conf). This requires the client to fetch the OM ID information from the OM 
service, but since RpcClient already calls getServiceList, we can simply 
include the OM ID in its response and use it for any suggested leader info.
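
A rough sketch of how the client side of this could look, to make the idea 
concrete. Everything here is hypothetical: SuggestedLeaderResolver is not an 
existing Ozone class, and the sketch assumes that the service list would carry 
an (OM ID, RPC address) pair per OM and that the client can match those 
addresses against the ones in its own configuration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/**
 * Hypothetical helper (not an existing Ozone class): maps a server-assigned
 * OM ID from a "not leader" response to the node ID used in the client conf.
 */
final class SuggestedLeaderResolver {
  // OM ID (server-assigned, unique per OM) -> node ID as named by the client
  private final Map<String, String> omIdToClientNodeId = new HashMap<>();

  /**
   * @param clientNodeIdToAddress node ID -> RPC address from the client conf
   * @param omIdToAddress OM ID -> RPC address, assumed to come from an
   *                      extended service list (an assumption of this sketch)
   */
  SuggestedLeaderResolver(Map<String, String> clientNodeIdToAddress,
                          Map<String, String> omIdToAddress) {
    // Invert the client conf so that addresses can be looked up directly.
    Map<String, String> addressToClientNodeId = new HashMap<>();
    for (Map.Entry<String, String> e : clientNodeIdToAddress.entrySet()) {
      addressToClientNodeId.put(e.getValue(), e.getKey());
    }
    // Keep only the OMs whose address the client also knows about.
    for (Map.Entry<String, String> e : omIdToAddress.entrySet()) {
      String nodeId = addressToClientNodeId.get(e.getValue());
      if (nodeId != null) {
        omIdToClientNodeId.put(e.getKey(), nodeId);
      }
    }
  }

  /** Returns the client-side node ID for the suggested leader, if known. */
  Optional<String> resolve(String suggestedLeaderOmId) {
    return Optional.ofNullable(omIdToClientNodeId.get(suggestedLeaderOmId));
  }
}
```

When resolve returns an empty Optional (unknown or missing OM ID), the client 
could presumably fall back to the current sequential failover.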

Also see HDDS-14769 on why the client cannot parse the suggested leader info.


> OM client failover ignores suggested leader info
> ------------------------------------------------
>
>                 Key: HDDS-3936
>                 URL: https://issues.apache.org/jira/browse/HDDS-3936
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: OM HA
>    Affects Versions: 1.0.0
>            Reporter: Attila Doroszlai
>            Priority: Major
>
> If OM client hits follower OM, failover is performed sequentially, ignoring 
> suggested leader info:
> {code}
> 2020-07-08 17:20:05,249 [main] DEBUG Hadoop3OmTransport:140 - RetryProxy: 
> OM:om1 is not the leader. Suggested leader is OM:om3.
> 2020-07-08 17:20:05,277 [main] DEBUG Hadoop3OmTransport:140 - RetryProxy: 
> OM:om2 is not the leader. Suggested leader is OM:om3.
> {code}


