[jira] [Updated] (HDDS-3902) OM HA client failover switcher to a wrong OM server
[ https://issues.apache.org/jira/browse/HDDS-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marton Elek updated HDDS-3902:
------------------------------
    Labels: 0.7.0  (was: )

> OM HA client failover switcher to a wrong OM server
> ---------------------------------------------------
>
>                 Key: HDDS-3902
>                 URL: https://issues.apache.org/jira/browse/HDDS-3902
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>          Components: OM HA
>            Reporter: Marton Elek
>            Priority: Major
>              Labels: 0.7.0
>
> Found this problem with the PR/branch HDDS-3878, but it seems to be independent.
> 1. ozone sh volume create /vol1 works well with HA
> 2. ozone freon omkg (rpc client) doesn't work
> {code}
> ozone freon omkg | grep "Failing over"
> 2020-06-30 14:15:31 DEBUG OMFailoverProxyProvider:271 - Failing over OM proxy to index: 1, nodeId: om2
> 2020-06-30 14:15:31 DEBUG OMFailoverProxyProvider:271 - Failing over OM proxy to index: 2, nodeId: om3
> 2020-06-30 14:15:34 DEBUG OMFailoverProxyProvider:271 - Failing over OM proxy to index: 0, nodeId: omNodeIdDummy
> {code}
> om2 seems to be the leader, but for some reason the failover logic is switching back to an unknown node (?)
> {code}
> 2020-06-30 14:16:35 DEBUG OMFailoverProxyProvider:271 - Failing over OM proxy to index: 2, nodeId: om3
> 2020-06-30 14:16:35 DEBUG Client:63 - getting client out of cache: org.apache.hadoop.ipc.Client@f5acb9d
> 2020-06-30 14:16:35 DEBUG Client:497 - The ping interval is 6 ms.
> 2020-06-30 14:16:35 DEBUG Client:795 - Connecting to ozone-om-2.ozone-om.default.svc.cluster.local/10.42.0.175:9862
> 2020-06-30 14:16:35 DEBUG Client:1074 - IPC Client (363509958) connection to ozone-om-2.ozone-om.default.svc.cluster.local/10.42.0.175:9862 from root: starting, having connections 3
> 2020-06-30 14:16:35 DEBUG Client:1137 - IPC Client (363509958) connection to ozone-om-2.ozone-om.default.svc.cluster.local/10.42.0.175:9862 from root sending #0 org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest
> 2020-06-30 14:16:36 DEBUG Client:1191 - IPC Client (363509958) connection to ozone-om-2.ozone-om.default.svc.cluster.local/10.42.0.175:9862 from root got value #0
> 2020-06-30 14:16:36 DEBUG ProtobufRpcEngine:254 - Call: submitRequest took 439ms
> 2020-06-30 14:16:36 DEBUG Client:1137 - IPC Client (363509958) connection to ozone-om-2.ozone-om.default.svc.cluster.local/10.42.0.175:9862 from root sending #1 org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest
> 2020-06-30 14:16:36 DEBUG Client:1191 - IPC Client (363509958) connection to ozone-om-2.ozone-om.default.svc.cluster.local/10.42.0.175:9862 from root got value #1
> 2020-06-30 14:16:36 DEBUG ProtobufRpcEngine:254 - Call: submitRequest took 2ms
> 2020-06-30 14:16:36 DEBUG Client:1137 - IPC Client (363509958) connection to ozone-om-2.ozone-om.default.svc.cluster.local/10.42.0.175:9862 from root sending #2 org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest
> 2020-06-30 14:16:36 DEBUG Client:1191 - IPC Client (363509958) connection to ozone-om-2.ozone-om.default.svc.cluster.local/10.42.0.175:9862 from root got value #2
> 2020-06-30 14:16:36 DEBUG ProtobufRpcEngine:254 - Call: submitRequest took 1ms
> 2020-06-30 14:16:36 DEBUG Client:1137 - IPC Client (363509958) connection to ozone-om-2.ozone-om.default.svc.cluster.local/10.42.0.175:9862 from root sending #3 org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest
> 2020-06-30 14:16:36 DEBUG Client:1191 - IPC Client (363509958) connection to ozone-om-2.ozone-om.default.svc.cluster.local/10.42.0.175:9862 from root got value #3
> 2020-06-30 14:16:36 DEBUG ProtobufRpcEngine:254 - Call: submitRequest took 1ms
> 2020-06-30 14:16:36 DEBUG Client:63 - getting client out of cache: org.apache.hadoop.ipc.Client@f5acb9d
> 2020-06-30 14:16:36 DEBUG Groups:312 - GroupCacheLoader - load.
> 2020-06-30 14:16:36 DEBUG Client:1137 - IPC Client (363509958) connection to ozone-om-0.ozone-om.default.svc.cluster.local/10.42.0.173:9862 from root sending #5 org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest
> 2020-06-30 14:16:36 DEBUG Client:1137 - IPC Client (363509958) connection to ozone-om-0.ozone-om.default.svc.cluster.local/10.42.0.173:9862 from root sending #11 org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest
> 2020-06-30 14:16:36 DEBUG Client:1137 - IPC Client (363509958) connection to ozone-om-0.ozone-om.default.svc.cluster.local/10.42.0.173:9862 from root sending #8 org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest
> 2020-06-30 14:16:36 DEBUG Client:1137 - IPC Client (363509958) connection to ozone-om-0.ozone-om.default.svc.cluster.local/10.42.0.173:9862 from root sending
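
The suspicious failover in the first log excerpt of the report (index 0 mapped to `omNodeIdDummy`) can be spotted mechanically. The following is only a minimal sketch for triaging such logs, not part of Ozone: it assumes each log entry sits on a single unwrapped line, the helper name `find_unexpected_failovers` is hypothetical, and the expected node ID set `om1`, `om2`, `om3` is an assumption (only `om2` and `om3` appear in the report).

```python
import re

# Matches the DEBUG message format quoted in the report.
FAILOVER_RE = re.compile(r"Failing over OM proxy to index: (\d+), nodeId: (\S+)")


def find_unexpected_failovers(log_text, expected_node_ids):
    """Return (index, node_id) pairs whose node ID is not in expected_node_ids."""
    unexpected = []
    for match in FAILOVER_RE.finditer(log_text):
        index, node_id = int(match.group(1)), match.group(2)
        if node_id not in expected_node_ids:
            unexpected.append((index, node_id))
    return unexpected


# The three failover lines from the report, unwrapped.
LOG = """\
2020-06-30 14:15:31 DEBUG OMFailoverProxyProvider:271 - Failing over OM proxy to index: 1, nodeId: om2
2020-06-30 14:15:31 DEBUG OMFailoverProxyProvider:271 - Failing over OM proxy to index: 2, nodeId: om3
2020-06-30 14:15:34 DEBUG OMFailoverProxyProvider:271 - Failing over OM proxy to index: 0, nodeId: omNodeIdDummy
"""

# Assumption: the configured OM node IDs are om1, om2 and om3.
print(find_unexpected_failovers(LOG, {"om1", "om2", "om3"}))
# prints [(0, 'omNodeIdDummy')]
```

Run against the excerpt, the sketch flags only the third failover, which matches the reporter's observation that the client switches to a node ID that is not one of the real OM instances.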
[jira] [Updated] (HDDS-3902) OM HA client failover switcher to a wrong OM server
[ https://issues.apache.org/jira/browse/HDDS-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marton Elek updated HDDS-3902:
------------------------------
    Issue Type: Bug  (was: Improvement)

> OM HA client failover switcher to a wrong OM server
> ---------------------------------------------------
>
>                 Key: HDDS-3902
>                 URL: https://issues.apache.org/jira/browse/HDDS-3902
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: OM HA
>            Reporter: Marton Elek
>            Priority: Blocker
>              Labels: 0.7.0
[jira] [Updated] (HDDS-3902) OM HA client failover switcher to a wrong OM server
[ https://issues.apache.org/jira/browse/HDDS-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marton Elek updated HDDS-3902:
------------------------------
    Priority: Blocker  (was: Major)

> OM HA client failover switcher to a wrong OM server
> ---------------------------------------------------
>
>                 Key: HDDS-3902
>                 URL: https://issues.apache.org/jira/browse/HDDS-3902
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>          Components: OM HA
>            Reporter: Marton Elek
>            Priority: Blocker
>              Labels: 0.7.0
[jira] [Updated] (HDDS-3902) OM HA client failover switcher to a wrong OM server
[ https://issues.apache.org/jira/browse/HDDS-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marton Elek updated HDDS-3902:
------------------------------
    Target Version/s:   (was: 0.6.0)

> OM HA client failover switcher to a wrong OM server
> ---------------------------------------------------
>
>                 Key: HDDS-3902
>                 URL: https://issues.apache.org/jira/browse/HDDS-3902
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>          Components: OM HA
>            Reporter: Marton Elek
>            Priority: Major
[jira] [Updated] (HDDS-3902) OM HA client failover switcher to a wrong OM server
[ https://issues.apache.org/jira/browse/HDDS-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marton Elek updated HDDS-3902:
------------------------------
    Priority: Major  (was: Blocker)

> OM HA client failover switcher to a wrong OM server
> ---------------------------------------------------
>
>                 Key: HDDS-3902
>                 URL: https://issues.apache.org/jira/browse/HDDS-3902
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>          Components: OM HA
>            Reporter: Marton Elek
>            Priority: Major