[ https://issues.apache.org/jira/browse/HDDS-3022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arpit Agarwal updated HDDS-3022:
--------------------------------
    Target Version/s: 0.6.0  (was: 0.5.0)
              Labels: TriagePending  (was: )

> Datanode unable to close Pipeline after disk out of space
> ---------------------------------------------------------
>
>                 Key: HDDS-3022
>                 URL: https://issues.apache.org/jira/browse/HDDS-3022
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>    Affects Versions: 0.5.0
>            Reporter: Vivek Ratnavel Subramanian
>            Assignee: Shashikant Banerjee
>            Priority: Major
>              Labels: TriagePending
>         Attachments: ozone_logs.zip
>
>
> Datanode gets into a loop and keeps throwing errors while trying to close pipeline
> {code:java}
> 2020-02-14 00:25:10,208 INFO org.apache.ratis.server.impl.RaftServerImpl: 285cac09-7622-45e6-be02-b3c68ebf8b10@group-701265EC9F07: changes role from FOLLOWER to CANDIDATE at term 6240 for changeToCandidate
> 2020-02-14 00:25:10,208 ERROR org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis: pipeline Action CLOSE on pipeline PipelineID=02e7e10e-2d50-4ace-a18b-701265ec9f07.Reason : 285cac09-7622-45e6-be02-b3c68ebf8b10 is in candidate state for 31898494ms
> 2020-02-14 00:25:10,208 INFO org.apache.ratis.server.impl.RoleInfo: 285cac09-7622-45e6-be02-b3c68ebf8b10: start LeaderElection
> 2020-02-14 00:25:10,223 INFO org.apache.ratis.server.impl.LeaderElection: 285cac09-7622-45e6-be02-b3c68ebf8b10@group-701265EC9F07-LeaderElection37032: begin an election at term 6241 for 0: [d432c890-5ec4-4cf1-9078-28497a08ab85:10.65.6.227:9858, 285cac09-7622-45e6-be02-b3c68ebf8b10:10.65.24.80:9858, cabbdef8-ed6c-4fc7-b7b2-d1ddd07da47e:10.65.8.165:9858], old=null
> 2020-02-14 00:25:10,259 INFO org.apache.ratis.server.impl.LeaderElection: 285cac09-7622-45e6-be02-b3c68ebf8b10@group-701265EC9F07-LeaderElection37032 got exception when requesting votes: java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: d432c890-5ec4-4cf1-9078-28497a08ab85: group-701265EC9F07 not found.
> 2020-02-14 00:25:10,270 INFO org.apache.ratis.server.impl.LeaderElection: 285cac09-7622-45e6-be02-b3c68ebf8b10@group-701265EC9F07-LeaderElection37032 got exception when requesting votes: java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: cabbdef8-ed6c-4fc7-b7b2-d1ddd07da47e: group-701265EC9F07 not found.
> 2020-02-14 00:25:10,270 INFO org.apache.ratis.server.impl.LeaderElection: 285cac09-7622-45e6-be02-b3c68ebf8b10@group-701265EC9F07-LeaderElection37032: Election REJECTED; received 0 response(s) [] and 2 exception(s); 285cac09-7622-45e6-be02-b3c68ebf8b10@group-701265EC9F07:t6241, leader=null, voted=285cac09-7622-45e6-be02-b3c68ebf8b10, raftlog=285cac09-7622-45e6-be02-b3c68ebf8b10@group-701265EC9F07-SegmentedRaftLog:OPENED:c4,f4,i14, conf=0: [d432c890-5ec4-4cf1-9078-28497a08ab85:10.65.6.227:9858, 285cac09-7622-45e6-be02-b3c68ebf8b10:10.65.24.80:9858, cabbdef8-ed6c-4fc7-b7b2-d1ddd07da47e:10.65.8.165:9858], old=null
> 2020-02-14 00:25:10,270 INFO org.apache.ratis.server.impl.LeaderElection: Exception 0: java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: d432c890-5ec4-4cf1-9078-28497a08ab85: group-701265EC9F07 not found.
> 2020-02-14 00:25:10,270 INFO org.apache.ratis.server.impl.LeaderElection: Exception 1: java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: cabbdef8-ed6c-4fc7-b7b2-d1ddd07da47e: group-701265EC9F07 not found.
> 2020-02-14 00:25:10,270 INFO org.apache.ratis.server.impl.RaftServerImpl: 285cac09-7622-45e6-be02-b3c68ebf8b10@group-701265EC9F07: changes role from CANDIDATE to FOLLOWER at term 6241 for DISCOVERED_A_NEW_TERM
> 2020-02-14 00:25:10,270 INFO org.apache.ratis.server.impl.RoleInfo: 285cac09-7622-45e6-be02-b3c68ebf8b10: shutdown LeaderElection
> 2020-02-14 00:25:10,270 INFO org.apache.ratis.server.impl.RoleInfo: 285cac09-7622-45e6-be02-b3c68ebf8b10: start FollowerState
> 2020-02-14 00:25:10,680 WARN org.apache.ratis.grpc.server.GrpcLogAppender: 285cac09-7622-45e6-be02-b3c68ebf8b10@group-DD847EC75388->d432c890-5ec4-4cf1-9078-28497a08ab85-GrpcLogAppender: HEARTBEAT appendEntries Timeout, request=AppendEntriesRequest:cid=12669,entriesCount=0,lastEntry=null
> 2020-02-14 00:25:10,752 ERROR org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis: pipeline Action CLOSE on pipeline PipelineID=7ad5ce51-d3fa-4e71-99f2-dd847ec75388.Reason : 285cac09-7622-45e6-be02-b3c68ebf8b10 has not seen follower/s d432c890-5ec4-4cf1-9078-28497a08ab85 for 31623987ms cabbdef8-ed6c-4fc7-b7b2-d1ddd07da47e for 31618878ms
> 2020-02-14 00:25:10,894 INFO org.apache.ratis.server.impl.FollowerState: 285cac09-7622-45e6-be02-b3c68ebf8b10@group-0068FD2EA2C9-FollowerState: change to CANDIDATE, lastRpcTime:5021ms, electionTimeout:5017ms
> 2020-02-14 00:25:10,894 INFO org.apache.ratis.server.impl.RoleInfo: 285cac09-7622-45e6-be02-b3c68ebf8b10: shutdown FollowerState
> 2020-02-14 00:25:10,894 INFO org.apache.ratis.server.impl.RaftServerImpl: 285cac09-7622-45e6-be02-b3c68ebf8b10@group-0068FD2EA2C9: changes role from FOLLOWER to CANDIDATE at term 6220 for changeToCandidate
> 2020-02-14 00:25:10,894 ERROR org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis: pipeline Action CLOSE on pipeline PipelineID=179ac1d0-e5d5-4898-bef7-0068fd2ea2c9.Reason : 285cac09-7622-45e6-be02-b3c68ebf8b10 is in candidate state for 31805092ms
> 2020-02-14 00:25:10,894 INFO org.apache.ratis.server.impl.RoleInfo: 285cac09-7622-45e6-be02-b3c68ebf8b10: start LeaderElection
> 2020-02-14 00:25:10,917 INFO org.apache.ratis.server.impl.LeaderElection: 285cac09-7622-45e6-be02-b3c68ebf8b10@group-0068FD2EA2C9-LeaderElection37033: begin an election at term 6221 for 0: [d432c890-5ec4-4cf1-9078-28497a08ab85:10.65.6.227:9858, 285cac09-7622-45e6-be02-b3c68ebf8b10:10.65.24.80:9858, cabbdef8-ed6c-4fc7-b7b2-d1ddd07da47e:10.65.8.165:9858], old=null
> 2020-02-14 00:25:10,921 INFO org.apache.ratis.server.impl.LeaderElection: 285cac09-7622-45e6-be02-b3c68ebf8b10@group-0068FD2EA2C9-LeaderElection37033 got exception when requesting votes: java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: cabbdef8-ed6c-4fc7-b7b2-d1ddd07da47e: group-0068FD2EA2C9 not found.
> 2020-02-14 00:25:10,921 INFO org.apache.ratis.server.impl.LeaderElection: 285cac09-7622-45e6-be02-b3c68ebf8b10@group-0068FD2EA2C9-LeaderElection37033 got exception when requesting votes: java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: d432c890-5ec4-4cf1-9078-28497a08ab85: group-0068FD2EA2C9 not found.
> 2020-02-14 00:25:10,921 INFO org.apache.ratis.server.impl.LeaderElection: 285cac09-7622-45e6-be02-b3c68ebf8b10@group-0068FD2EA2C9-LeaderElection37033: Election REJECTED; received 0 response(s) [] and 2 exception(s); 285cac09-7622-45e6-be02-b3c68ebf8b10@group-0068FD2EA2C9:t6221, leader=null, voted=285cac09-7622-45e6-be02-b3c68ebf8b10, raftlog=285cac09-7622-45e6-be02-b3c68ebf8b10@group-0068FD2EA2C9-SegmentedRaftLog:OPENED:c0,f0,i8, conf=0: [d432c890-5ec4-4cf1-9078-28497a08ab85:10.65.6.227:9858, 285cac09-7622-45e6-be02-b3c68ebf8b10:10.65.24.80:9858, cabbdef8-ed6c-4fc7-b7b2-d1ddd07da47e:10.65.8.165:9858], old=null
> 2020-02-14 00:25:10,921 INFO org.apache.ratis.server.impl.LeaderElection: Exception 0: java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: cabbdef8-ed6c-4fc7-b7b2-d1ddd07da47e: group-0068FD2EA2C9 not found.
> 2020-02-14 00:25:10,921 INFO org.apache.ratis.server.impl.LeaderElection: Exception 1: java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: d432c890-5ec4-4cf1-9078-28497a08ab85: group-0068FD2EA2C9 not found.
> 2020-02-14 00:25:10,921 INFO org.apache.ratis.server.impl.RaftServerImpl: 285cac09-7622-45e6-be02-b3c68ebf8b10@group-0068FD2EA2C9: changes role from CANDIDATE to FOLLOWER at term 6221 for DISCOVERED_A_NEW_TERM
> 2020-02-14 00:25:10,921 INFO org.apache.ratis.server.impl.RoleInfo: 285cac09-7622-45e6-be02-b3c68ebf8b10: shutdown LeaderElection
> 2020-02-14 00:25:10,921 INFO org.apache.ratis.server.impl.RoleInfo: 285cac09-7622-45e6-be02-b3c68ebf8b10: start FollowerState
> 2020-02-14 00:25:11,134 WARN org.apache.ratis.grpc.server.GrpcLogAppender: 285cac09-7622-45e6-be02-b3c68ebf8b10@group-DD847EC75388->cabbdef8-ed6c-4fc7-b7b2-d1ddd07da47e-GrpcLogAppender: HEARTBEAT appendEntries Timeout, request=AppendEntriesRequest:cid=12669,entriesCount=0,lastEntry=null
> 2020-02-14 00:25:11,218 ERROR org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis: pipeline Action CLOSE on pipeline PipelineID=7ad5ce51-d3fa-4e71-99f2-dd847ec75388.Reason : 285cac09-7622-45e6-be02-b3c68ebf8b10 has not seen follower/s d432c890-5ec4-4cf1-9078-28497a08ab85 for 31624453ms cabbdef8-ed6c-4fc7-b7b2-d1ddd07da47e for 31619344ms
> 2020-02-14 00:25:11,347 WARN org.apache.ratis.grpc.server.GrpcLogAppender: 285cac09-7622-45e6-be02-b3c68ebf8b10@group-2338B042C07B->d432c890-5ec4-4cf1-9078-28497a08ab85-GrpcLogAppender: HEARTBEAT appendEntries Timeout, request=AppendEntriesRequest:cid=12579,entriesCount=0,lastEntry=null
> 2020-02-14 00:25:11,361 WARN org.apache.ratis.grpc.server.GrpcLogAppender: 285cac09-7622-45e6-be02-b3c68ebf8b10@group-2338B042C07B->cabbdef8-ed6c-4fc7-b7b2-d1ddd07da47e-GrpcLogAppender: HEARTBEAT appendEntries Timeout, request=AppendEntriesRequest:cid=12577,entriesCount=0,lastEntry=null
> 2020-02-14 00:25:11,399 ERROR org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis: pipeline Action CLOSE on pipeline PipelineID=6a851c59-0345-4ad8-ac31-2338b042c07b.Reason : 285cac09-7622-45e6-be02-b3c68ebf8b10 has not seen follower/s d432c890-5ec4-4cf1-9078-28497a08ab85 for 31396085ms cabbdef8-ed6c-4fc7-b7b2-d1ddd07da47e for 31391530ms
> 2020-02-14 00:25:11,406 ERROR org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis: pipeline Action CLOSE on pipeline PipelineID=6a851c59-0345-4ad8-ac31-2338b042c07b.Reason : 285cac09-7622-45e6-be02-b3c68ebf8b10 has not seen follower/s d432c890-5ec4-4cf1-9078-28497a08ab85 for 31396092ms cabbdef8-ed6c-4fc7-b7b2-d1ddd07da47e for 31391537ms
> 2020-02-14 00:25:11,423 WARN org.apache.ratis.grpc.server.GrpcLogAppender: 285cac09-7622-45e6-be02-b3c68ebf8b10@group-BA1E8724EE74->d432c890-5ec4-4cf1-9078-28497a08ab85-GrpcLogAppender: HEARTBEAT appendEntries Timeout, request=AppendEntriesRequest:cid=12817,entriesCount=0,lastEntry=null
> 2020-02-14 00:25:11,490 ERROR org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis: pipeline Action CLOSE on pipeline PipelineID=1ed1be53-b526-41af-bdf9-ba1e8724ee74.Reason : 285cac09-7622-45e6-be02-b3c68ebf8b10 has not seen follower/s d432c890-5ec4-4cf1-9078-28497a08ab85 for 31946345ms cabbdef8-ed6c-4fc7-b7b2-d1ddd07da47e for 31945978ms
> 2020-02-14 00:25:11,909 INFO org.apache.ratis.server.impl.FollowerState: 285cac09-7622-45e6-be02-b3c68ebf8b10@group-D506E1A1894E-FollowerState: change to CANDIDATE, lastRpcTime:5094ms, electionTimeout:5093ms
> 2020-02-14 00:25:11,909 INFO org.apache.ratis.server.impl.RoleInfo: 285cac09-7622-45e6-be02-b3c68ebf8b10: shutdown FollowerState
> {code}