Hi,

We are using an Ignite ComputeGrid, and it is mostly working nicely. 

Recently we had a Node with "Noisy Neighbors" in AWS that wreaked havoc in
our ComputeGrid.
Even though that Node was quite slow, it was never removed from the
map/reduce, which slowed down all computes.

We have already built a system that lets us add Nodes to, or remove them from,
the ComputeGrid based on when they are actually "ready to compute",
because our Nodes take considerable time to be truly ready for computation
(i.e. quite a bit of preparation is required).
So, to accomplish this, we use a dynamic Ignite ClusterGroup when we create
the compute.

```
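// Ask our own monitor which nodes currently count as "ready to compute".
ClusterGroup readyNodes =
    readyForComputeMonitor.getNodesReadyForCompute(ignite.cluster());
log.debug(dumpClusterGroup(readyNodes));
// Create the compute facade restricted to that group of nodes.
return ignite.compute(readyNodes);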
```

So. My question.
Does Ignite keep any information that we can use to determine if a Node is
healthy?
I.e. some way that we can locate any outliers in the ComputeGrid?

For example, the Node in our recent incident was at 100% CPU and was much,
much slower in the reduce phase.
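
To make it concrete, here is a rough sketch of the kind of outlier check we
would like to plug into getNodesReadyForCompute. It is plain Java with no
Ignite calls: the class name, the node ids, and the two thresholds are
hypothetical, and we are only assuming that the per-node numbers could come
from something like ClusterNode.metrics() (e.g. getCurrentCpuLoad() and
getAverageJobExecuteTime()). Whether those metrics are reliable enough for
this is part of what we are asking.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OutlierFilter {

    /** Median of a non-empty list of values. */
    static double median(List<Double> values) {
        List<Double> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        int n = sorted.size();
        return n % 2 == 1
            ? sorted.get(n / 2)
            : (sorted.get(n / 2 - 1) + sorted.get(n / 2)) / 2.0;
    }

    /**
     * Returns the ids of nodes that look healthy. A node is dropped if its
     * CPU load is at or above maxCpu (0.0 to 1.0), or if its average job time
     * is more than slowFactor times the median across all nodes.
     * Both thresholds are illustrative assumptions, not tuned values.
     */
    static List<String> healthyNodes(Map<String, Double> cpuLoad,
                                     Map<String, Double> avgJobMs,
                                     double maxCpu,
                                     double slowFactor) {
        List<String> healthy = new ArrayList<>();
        if (avgJobMs.isEmpty()) {
            return healthy;
        }
        double med = median(new ArrayList<>(avgJobMs.values()));
        for (Map.Entry<String, Double> e : avgJobMs.entrySet()) {
            // Nodes without a CPU sample are treated as idle (an assumption).
            double cpu = cpuLoad.getOrDefault(e.getKey(), 0.0);
            boolean tooBusy = cpu >= maxCpu;
            boolean tooSlow = e.getValue() > slowFactor * med;
            if (!tooBusy && !tooSlow) {
                healthy.add(e.getKey());
            }
        }
        return healthy;
    }

    public static void main(String[] args) {
        // Made-up sample values; node-c mimics the node from our incident.
        Map<String, Double> cpu = new LinkedHashMap<>();
        Map<String, Double> job = new LinkedHashMap<>();
        cpu.put("node-a", 0.35); job.put("node-a", 120.0);
        cpu.put("node-b", 0.40); job.put("node-b", 130.0);
        cpu.put("node-c", 1.00); job.put("node-c", 900.0);
        System.out.println(healthyNodes(cpu, job, 0.95, 3.0));
    }
}
```

With the made-up numbers above, node-c would be excluded on both counts (CPU
at 100% and job time well above three times the median), which is the
behaviour we would have wanted during the incident.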

Any help/advice would be much appreciated.

Thanks, 
-- Chris 




