[jira] [Commented] (CASSANDRA-2405) should expose 'time since last successful repair' for easier aes monitoring

Pavel Yaskevich (JIRA) Fri, 03 Jun 2011 04:02:43 -0700

    [ https://issues.apache.org/jira/browse/CASSANDRA-2405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13043297#comment-13043297 ]


Pavel Yaskevich commented on CASSANDRA-2405:
--------------------------------------------

bq. It seems we store the time in microseconds but then, when computing the 
time since last repair we use System.currentTimeMillis() - stored_time.

Please take a look at the storeLastSuccessfulRepairTime method: it stores 
System.currentTimeMillis(), and timeSinceLastSuccessfulRepair also uses 
System.currentTimeMillis(), so both values are in milliseconds.
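
To make the unit argument concrete, here is a minimal sketch of the two methods 
with the same millisecond clock on both sides. It is not the code from the 
patch: the class name, the in-memory map standing in for the system table, the 
method signatures, the injected clock and the -1 return for "never repaired" 
are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical sketch: the real patch persists to a system table, not a map.
final class RepairTimeSketch {
    // column family name -> wall-clock time of last successful repair, in ms
    private final Map<String, Long> lastRepairMillis = new ConcurrentHashMap<>();

    // Clock in milliseconds; pass System::currentTimeMillis in real use.
    private final LongSupplier clockMillis;

    RepairTimeSketch(LongSupplier clockMillis) {
        this.clockMillis = clockMillis;
    }

    // Store "now" in milliseconds for the given column family.
    void storeLastSuccessfulRepairTime(String columnFamily) {
        lastRepairMillis.put(columnFamily, clockMillis.getAsLong());
    }

    // Elapsed milliseconds since the stored time, or -1 if none was stored.
    // Both operands come from the same clock, so the units always match.
    long timeSinceLastSuccessfulRepair(String columnFamily) {
        Long stored = lastRepairMillis.get(columnFamily);
        return stored == null ? -1L : clockMillis.getAsLong() - stored;
    }
}
```

Injecting the clock is only a convenience for testing; constructing the sketch 
with System::currentTimeMillis gives the behaviour described above.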

bq. I would be in favor of calling the system table REPAIR_INFO, because the 
truth is I think it would make sense to record a number of other statistics on 
repair and it doesn't hurt to make the system table less specific. That also 
means we should probably not force any type for the value (though that can be 
easily changed later, so it's not a big deal for this patch).

Will do

bq. I think we usually put the code to query the system table in SystemTable, 
so I would move it from AntiEntropy to there.

Will do

About history: I like the idea of keeping a history of each successful 
repair. I will check whether the end time on each node can be tracked easily; 
either way, I will add start/end time tracking for the coordinator.
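
As a rough illustration of coordinator-side start/end tracking, the sketch 
below keeps one entry per successful repair. Everything in it is hypothetical: 
the class, the Entry fields, the method names and the in-memory list are 
assumptions for illustration, not the design that ends up in the patch, and 
per-node end times are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a coordinator-side history of successful repairs.
final class RepairHistorySketch {
    // One successful repair as seen by the coordinator; times in milliseconds.
    static final class Entry {
        final String columnFamily;
        final long startMillis;
        final long endMillis;

        Entry(String columnFamily, long startMillis, long endMillis) {
            this.columnFamily = columnFamily;
            this.startMillis = startMillis;
            this.endMillis = endMillis;
        }

        long durationMillis() {
            return endMillis - startMillis;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    // Append an entry once a repair has completed successfully.
    synchronized void recordSuccess(String columnFamily, long startMillis, long endMillis) {
        if (endMillis < startMillis)
            throw new IllegalArgumentException("end time precedes start time");
        entries.add(new Entry(columnFamily, startMillis, endMillis));
    }

    // All recorded repairs for one column family, in insertion order.
    synchronized List<Entry> historyFor(String columnFamily) {
        List<Entry> result = new ArrayList<>();
        for (Entry e : entries)
            if (e.columnFamily.equals(columnFamily))
                result.add(e);
        return result;
    }

    // End time of the most recent successful repair, or -1 if there is none.
    synchronized long lastSuccessfulEnd(String columnFamily) {
        long latest = -1L;
        for (Entry e : entries)
            if (e.columnFamily.equals(columnFamily) && e.endMillis > latest)
                latest = e.endMillis;
        return latest;
    }
}
```

With a history like this, the "time since last successful repair" value is 
simply the current time minus lastSuccessfulEnd for the column family.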

> should expose 'time since last successful repair' for easier aes monitoring
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2405
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2405
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Pavel Yaskevich
>            Priority: Minor
>             Fix For: 0.8.1
>
>         Attachments: CASSANDRA-2405-v2.patch, CASSANDRA-2405.patch
>
>
> The practical implementation issues of actually ensuring repair runs is 
> somewhat of an undocumented/untreated issue.
> One hopefully low hanging fruit would be to at least expose the time since 
> last successful repair for a particular column family, to make it easier to 
> write a correct script to monitor for lack of repair in a non-buggy fashion.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
