Jean-Daniel Cryans created KUDU-1525:
----------------------------------------
Summary: Create metrics for errors
Key: KUDU-1525
URL: https://issues.apache.org/jira/browse/KUDU-1525
Project: Kudu
Issue Type: Improvement
Components: supportability
Reporter: Jean-Daniel Cryans
There's a class of issue that can be hard to debug, namely when things fail
semi-silently on the client side. We currently have glog_warning_messages and
glog_error_messages, but it would be good to have more granular metrics. A few
I have in mind:
- RPC errors, basically any "recv error".
- server-level errors, like when the server says TOO BUSY.
- any kind of insert rejection. Right now we have row key duplicates and
memory pressure, but we're missing things like txn_tracker rejections and
"not a leader".
- Raft errors, like dropping a follower because we don't have the WALs around
and it's lagging too much.
There are probably more, but the above would be a good start.
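
To make the list above a bit more concrete, here is a rough sketch of what per-category error counters could look like. It is standalone C++ and does not use Kudu's metrics framework; ErrorKind, ErrorCounters and every category name are hypothetical, picked only to mirror the bullets above.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical error categories mirroring the list above.
// None of these names exist in Kudu; they are for illustration only.
enum class ErrorKind : std::size_t {
  kRpcRecvError,         // RPC-level failures such as "recv error"
  kServerTooBusy,        // server-level pushback
  kTxnTrackerRejection,  // insert rejected by the txn tracker
  kNotLeaderRejection,   // insert rejected because the replica is not leader
  kRaftFollowerEvicted,  // follower dropped because it lagged past the WALs
  kNumKinds              // sentinel: must stay last
};

// One monotonically increasing counter per category. Relaxed atomics are
// enough here because each counter is read independently of the others.
class ErrorCounters {
 public:
  // Records one occurrence of the given error category.
  void Increment(ErrorKind kind) {
    counts_[Index(kind)].fetch_add(1, std::memory_order_relaxed);
  }

  // Returns how many times the given category has been recorded.
  std::uint64_t Get(ErrorKind kind) const {
    return counts_[Index(kind)].load(std::memory_order_relaxed);
  }

 private:
  static constexpr std::size_t kSize =
      static_cast<std::size_t>(ErrorKind::kNumKinds);

  static std::size_t Index(ErrorKind kind) {
    return static_cast<std::size_t>(kind);
  }

  // Value-initialized, so every counter starts at zero.
  std::array<std::atomic<std::uint64_t>, kSize> counts_{};
};
```

A call site would bump exactly one counter where the error is detected, e.g. counters.Increment(ErrorKind::kServerTooBusy), so each failure mode shows up under its own name instead of only in the aggregate glog counts.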