I'm implementing a health check for a service of mine that uses Riak. I came across this code in https://github.com/basho/riak-java-client/issues/456:

RiakCluster cluster = clientInstance.getRiakCluster();
List<RiakNode> nodes = cluster.getNodes();
for (RiakNode node : nodes)
{
  State state = node.getNodeState();
}

and it's great. From what I can tell, it depends on some background processing that keeps track of the state of the nodes. I did a quick test though: if I run 'riak stop' from the command line and then run this loop with no intervening operations, the nodes report RUNNING. Even after more than three minutes, they still report RUNNING.

However, if I do run an intervening operation that fails (an actual data query, for example), the nodes then report HEALTH_CHECKING. Then, after 'riak start', the nodes report RUNNING again. I suppose that's good.

So, I'm trying to decide how to implement the health check. The above loop doesn't seem to be enough, but do I really need to do something like:

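final RiakFuture<Void, Void> future = cluster.execute(new PingOperation());

try {
  future.await();
  future.get(); // throws ExecutionException if the ping failed
  // good
} catch (ExecutionException e) {
  // bad
} catch (InterruptedException e) {
  Thread.currentThread().interrupt(); // preserve the interrupt flag
  // bad
}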

Maybe it's sufficient to only do this if all the nodes report RUNNING? I suppose there's always a small window in time where a node could still report a bad state even though a ping would show it was up again, so I'm torn. Any suggestions on whether pinging every time is correct, or whether there's something more efficient (and safe)?
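
For concreteness, here is a rough sketch of the combined check I have in mind: use the cached node state only as a cheap negative signal, and confirm with a ping bounded by a timeout. It uses only the JDK so it can be read on its own. The NodeState enum, the HealthCheckSketch class, and the isHealthy method are hypothetical stand-ins, not part of the Riak client. In real use the states would come from node.getNodeState() and the ping supplier would presumably be something like () -> cluster.execute(new PingOperation()), assuming the returned RiakFuture can be treated as a java.util.concurrent.Future.

```java
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

final class HealthCheckSketch {

    /** Hypothetical stand-in for the client's node state type. */
    enum NodeState { RUNNING, HEALTH_CHECKING, SHUTDOWN }

    /**
     * Returns true only if every node reports RUNNING and the ping
     * completes successfully within timeoutMillis.
     */
    static boolean isHealthy(List<NodeState> states,
                             Supplier<Future<?>> ping,
                             long timeoutMillis) {
        // Cheap pre-check: if the client already knows a node is not
        // RUNNING, report unhealthy without sending a ping.
        for (NodeState state : states) {
            if (state != NodeState.RUNNING) {
                return false;
            }
        }

        // RUNNING may be stale (e.g. after 'riak stop' with no traffic),
        // so confirm with an actual round trip, bounded by a timeout.
        Future<?> future = ping.get();
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        } catch (ExecutionException | TimeoutException e) {
            future.cancel(true); // give up on a failed or slow ping
            return false;
        }
    }
}
```

This still pings on every check whenever the nodes look fine, which is exactly the case where the cached state can't be trusted. Requiring all nodes to be RUNNING is just one possible policy; for a multi-node cluster it might make more sense to require only one.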

Thanks for your help.

-DB

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com