[
https://issues.apache.org/jira/browse/GOSSIP-74?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15931881#comment-15931881
]
ASF GitHub Bot commented on GOSSIP-74:
--------------------------------------
Github user edwardcapriolo commented on a diff in the pull request:
https://github.com/apache/incubator-gossip/pull/43#discussion_r106815075
--- Diff: src/test/java/org/apache/gossip/accrual/FailureDetectorTest.java ---
@@ -17,59 +17,104 @@
*/
package org.apache.gossip.accrual;
-import java.net.URI;
-
-import org.apache.gossip.LocalMember;
+import org.apache.gossip.GossipSettings;
import org.junit.Assert;
-import org.junit.Ignore;
import org.junit.jupiter.api.Test;
import org.junit.platform.runner.JUnitPlatform;
import org.junit.runner.RunWith;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Random;
+
+
@RunWith(JUnitPlatform.class)
public class FailureDetectorTest {
-
+
+ @FunctionalInterface
+ interface TriConsumer<A, B, C> {
+ void accept(A a, B b, C c);
+ }
+
+ static final Double failureThreshold = new GossipSettings().getConvictThreshold();
+
+ List<Integer> generateTimeList(int begin, int end, int step) {
+ List<Integer> values = new ArrayList<>();
+
+ Random rand = new Random();
+
+ for (int i = begin; i < end; i += step) {
+ int delta = (int) ((rand.nextDouble() - 0.5) * step / 2);
+
+ values.add(i + delta);
+ }
+ return values;
+ }
+
@Test
- public void aNormalTest(){
- int samples = 1;
- int windowSize = 1000;
- LocalMember member = new LocalMember("", URI.create("udp://127.0.0.1:1000"),
- "", 0L, null, windowSize, samples, "normal");
- member.recordHeartbeat(5);
- member.recordHeartbeat(10);
- Assert.assertEquals(new Double(0.3010299956639812), member.detect(10));
+ public void normalDistribution() {
+ FailureDetector fd = new FailureDetector(1, 1000, "normal");
+
+ List<Integer> values = generateTimeList(0, 10000, 100);
+
+ Double deltaSum = 0.0;
+ Integer deltaCount = 0;
+ for (int i = 0; i < values.size() - 1; i++) {
+ fd.recordHeartbeat(values.get(i));
+ if (i != 0) {
+ deltaSum += values.get(i) - values.get(i - 1);
+ deltaCount++;
+ }
+ }
+ Integer lastRecorded = values.get(values.size() - 2);
+
+ //after "step" delay we need to be considered UP
+ Assert.assertTrue(fd.computePhiMeasure(values.get(values.size() - 1)) < failureThreshold);
+
+ //if we check phi-measure after mean delay we get value for 0.5 probability(normal distribution)
+ Assert.assertEquals(fd.computePhiMeasure(lastRecorded + Math.round(deltaSum / deltaCount)), -Math.log10(0.5), 0.1);
}
@Test
- public void aTest(){
- int samples = 1;
--- End diff --
I would like you to keep a small test that looks something like this. It is
easier for me to understand what the algorithm is doing when there are some
fixed numbers here. Can you add a small test back with some baked-in numbers?
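
For illustration only, a fixed-number check of the kind requested above might look like the sketch below. It does not use the project's FailureDetector class: the class name PhiFixedNumbers, its helper methods, the choice of population variance, and the erf approximation are all assumptions made to keep the example self-contained. It computes phi as -log10(1 - CDF(elapsed)) for a normal distribution fitted to the deltas between consecutive heartbeats.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch; not the project's FailureDetector.
class PhiFixedNumbers {

  // Abramowitz-Stegun 7.1.26 approximation of erf (absolute error around 1.5e-7).
  static double erf(double x) {
    double sign = x < 0 ? -1.0 : 1.0;
    double ax = Math.abs(x);
    double t = 1.0 / (1.0 + 0.3275911 * ax);
    double poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t
        - 0.284496736) * t + 0.254829592) * t;
    return sign * (1.0 - poly * Math.exp(-ax * ax));
  }

  // Cumulative distribution function of a normal distribution.
  static double normalCdf(double x, double mean, double sd) {
    return 0.5 * (1.0 + erf((x - mean) / (sd * Math.sqrt(2.0))));
  }

  // phi = -log10(probability that the next heartbeat arrives later than "now").
  static double phi(long[] heartbeats, long now) {
    int n = heartbeats.length - 1;
    double[] deltas = new double[n];
    for (int i = 0; i < n; i++) {
      deltas[i] = heartbeats[i + 1] - heartbeats[i]; // delta between neighbours
    }
    final double mean = Arrays.stream(deltas).average().orElse(0.0);
    // Assumption: population variance (divide by n), chosen for round numbers.
    double variance = Arrays.stream(deltas).map(d -> (d - mean) * (d - mean)).sum() / n;
    double sd = Math.sqrt(variance); // standard deviation, not variance
    double elapsed = now - heartbeats[heartbeats.length - 1];
    return -Math.log10(1.0 - normalCdf(elapsed, mean, sd));
  }

  public static void main(String[] args) {
    // Deltas are 90, 110, 90, 110: mean 100 ms, standard deviation 10 ms.
    long[] heartbeats = {0, 90, 200, 290, 400};
    // One mean interval after the last heartbeat: approximately -log10(0.5) = 0.301.
    System.out.println(phi(heartbeats, 500));
    // One standard deviation later the suspicion level is higher.
    System.out.println(phi(heartbeats, 510));
  }
}
```

With numbers this small the mean and standard deviation can be checked by hand, which makes it easy to see why phi sits near -log10(0.5) exactly one mean interval after the last heartbeat.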
> Critical bugs in FailureDetector
> --------------------------------
>
> Key: GOSSIP-74
> URL: https://issues.apache.org/jira/browse/GOSSIP-74
> Project: Gossip
> Issue Type: Bug
> Reporter: Maxim Rusak
> Assignee: Maxim Rusak
>
> FailureDetector currently has at least 2 bugs (compared to the original paper):
> 1. latestHeartbeatMs is not updated on each heartbeat, so
> descriptiveStatistics holds not the deltas between consecutive heartbeats but
> the time elapsed since the first heartbeat.
> 2. When we create the normalDistribution we pass the variance, not the standard
> deviation.
> Together they make FailureDetector effectively insensitive to failures because
> of the extremely high deviation.
> Example: http://pastebin.com/xaeF52PP
> Here we send 100 heartbeats, one per second (for example), then check the
> state after 2000 seconds; compared to the threshold, the node is still
> considered alive.
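
To make the effect of the two bugs concrete, the sketch below compares how far an elapsed time of 2000 seconds lies from the mean, measured in units of the spread passed to the distribution. It is a hypothetical reconstruction based only on the description above, not the project's code: the class HeartbeatStatsSketch, the methods fixedZ and buggyZ, the 10 ms jitter, and the use of population variance are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reconstruction of the described behaviour; not the project's code.
class HeartbeatStatsSketch {

  // 100 heartbeat timestamps in ms, about one per second with +/-10 ms jitter.
  static long[] sampleHeartbeats() {
    long[] hb = new long[100];
    for (int i = 1; i < hb.length; i++) {
      hb[i] = hb[i - 1] + (i % 2 == 1 ? 990 : 1010);
    }
    return hb;
  }

  static double mean(List<Double> xs) {
    double sum = 0.0;
    for (double x : xs) sum += x;
    return sum / xs.size();
  }

  // Assumption: population variance (divide by the sample count).
  static double variance(List<Double> xs) {
    double m = mean(xs);
    double squares = 0.0;
    for (double x : xs) squares += (x - m) * (x - m);
    return squares / xs.size();
  }

  // Corrected behaviour: samples are deltas between consecutive heartbeats,
  // the spread is the standard deviation, elapsed time counts from the last heartbeat.
  static double fixedZ(long[] hb, long now) {
    List<Double> samples = new ArrayList<>();
    for (int i = 1; i < hb.length; i++) samples.add((double) (hb[i] - hb[i - 1]));
    double elapsed = now - hb[hb.length - 1];
    return (elapsed - mean(samples)) / Math.sqrt(variance(samples));
  }

  // Described bugs: samples are times since the FIRST heartbeat (the latest
  // heartbeat is never updated), and the variance is used where the
  // standard deviation belongs.
  static double buggyZ(long[] hb, long now) {
    List<Double> samples = new ArrayList<>();
    for (int i = 1; i < hb.length; i++) samples.add((double) (hb[i] - hb[0]));
    double elapsed = now - hb[0];
    return (elapsed - mean(samples)) / variance(samples);
  }

  public static void main(String[] args) {
    long[] hb = sampleHeartbeats();
    long now = hb[hb.length - 1] + 2_000_000L; // 2000 seconds of silence
    System.out.println("buggy z-score: " + buggyZ(hb, now));
    System.out.println("fixed z-score: " + fixedZ(hb, now));
  }
}
```

Under these assumptions the buggy z-score stays close to zero, so a normal CDF evaluates to roughly 0.5 and phi stays near -log10(0.5), below any reasonable convict threshold. The corrected z-score is very large, so the node would be convicted.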