[ https://issues.apache.org/jira/browse/CASSANDRA-2405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021188#comment-13021188 ]
Peter Schuller commented on CASSANDRA-2405:
-------------------------------------------

The best solution I can think of is to populate the information on CF creation with the timestamp that represents the time the CF was created on the node. If the node was bootstrapped as usual, that would have happened after the local CF creation. If it was not (e.g. forcefully inserted into the ring), then some operator has explicitly made the choice of entering it into the ring "inconsistently" anyway so it doesn't matter.

If this is easy to do, I think it would make for a really clean solution from the point of view of the user. The nodetool command would always return valid data except if something is truly broken; not even a single edge case to deal with. Simplicity rocks for this type of thing (for writing a monitoring script to trigger an alarm).

If that's overkill/non-easy, I dunno - slight preference for throwing an exception just because I really dislike silent failures and returning an out-of-band integer seems more likely to go unnoticed if somehow it never changes because repair is *never* run, for example. I.e., either your monitoring script treats -1 as an error anyway (so it's no worse in terms of triggering the alarm unnecessarily than an exception), or it doesn't - in which case you have a silent failure mode in the case of perpetual lack of repair running.

> should expose 'time since last successful repair' for easier aes monitoring
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2405
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2405
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Peter Schuller
>            Assignee: Pavel Yaskevich
>            Priority: Minor
>             Fix For: 0.7.5
>
>         Attachments: CASSANDRA-2405.patch
>
>
> The practical implementation issues of actually ensuring repair runs is
> somewhat of an undocumented/untreated issue.
> One hopefully low hanging fruit would be to at least expose the time since
> last successful repair for a particular column family, to make it easier to
> write a correct script to monitor for lack of repair in a non-buggy fashion.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
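
The trade-off in the comment between a -1 sentinel and an exception can be sketched from the monitoring script's side. This is only an illustration, not the patch's behavior: the function names, the RepairNeverRan exception, and the idea that the age arrives as a number of seconds are all assumptions made for the example.

```python
class RepairNeverRan(Exception):
    """Hypothetical error: no successful repair has ever been recorded."""


def check_sentinel_naive(seconds_since_repair, max_age):
    # Naive threshold check against an API that reports -1 for "never repaired".
    # -1 is always below the threshold, so a never-repaired CF looks healthy:
    # this is the silent failure mode described in the comment.
    return "ALARM" if seconds_since_repair > max_age else "OK"


def check_sentinel_careful(seconds_since_repair, max_age):
    # The script must remember to treat the out-of-band value as an error.
    if seconds_since_repair < 0:
        return "ALARM"
    return "ALARM" if seconds_since_repair > max_age else "OK"


def check_exception(read_age, max_age):
    # With an exception, the "never repaired" case cannot be mistaken for a
    # valid age; read_age is a hypothetical callable returning seconds.
    try:
        age = read_age()
    except RepairNeverRan:
        return "ALARM"
    return "ALARM" if age > max_age else "OK"
```

With a ten-day threshold (864000 seconds), check_sentinel_naive(-1, 864000) reports "OK" even though repair never ran, while the other two variants raise the alarm. Seeding the value with the CF creation time, as proposed in the comment, would remove the special case entirely.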