From: Ligade, Shailesh (ITADD) (CON) <[email protected]>
Sent: Tuesday, October 11, 2022 8:44 AM
To: [email protected]; Ligade, Shailesh (ITADD) (CON) <[email protected]>
Subject: RE: Accumulo 1.10.0 master log connection refused error
Looking at the fate print/dump output, I do see:
repo: {
"org.apache.accumulo.master.tableOps.CompactRange" {
tableId: xx
namespace: default
}
}
Does that mean it is stuck on a table compaction operation that it can't
finish for whatever reason, and hence it drops the tserver connection?
Is it safe to fail/delete this fate operation? What are the alternatives, if any?
I appreciate your help.
-S
From: Shailesh Ligade via user <[email protected]>
Sent: Tuesday, October 11, 2022 8:09 AM
To: [email protected]
Subject: [EXTERNAL EMAIL] - Accumulo 1.10.0 master log connection refused error
Hello,
I have a 25-node cluster with two masters. From time to time (every 4 to 5
hours) I get the following, each time on a different tserver:
org.apache.thrift.transport.TTransportException: java.net.ConnectException:
Connection refused
Error closing output stream
java.io.IOException: The stream is closed
SocketOutputStream.write(SocketOutputStream.java:118)
...
master.LiveTServerSet$TServerConnection.compact(LiveTServerSet.java:214)
master.tableOps.CompactionDriver.isReady(CompactionDriver.java:168)
master.tableOps.CompactionDriver.isReady(CompactionDriver.java:54)
master.tableOps.TraceRepo.isReady(TraceRepo.java:47)
fate.Fate$TransactionRunner.run(Fate.java:72)
Every time it is the same exception. What might be the issue? Is it stuck in
some fate operation?
After this, the tserver restarts (I run it under systemd, with the auto-restart
flag set).
How can I debug this further?
I appreciate any response.
-S