[ 
https://issues.apache.org/jira/browse/GOSSIP-74?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15931881#comment-15931881
 ] 

ASF GitHub Bot commented on GOSSIP-74:
--------------------------------------

Github user edwardcapriolo commented on a diff in the pull request:

    https://github.com/apache/incubator-gossip/pull/43#discussion_r106815075
  
    --- Diff: src/test/java/org/apache/gossip/accrual/FailureDetectorTest.java ---
    @@ -17,59 +17,104 @@
      */
     package org.apache.gossip.accrual;
     
    -import java.net.URI;
    -
    -import org.apache.gossip.LocalMember;
    +import org.apache.gossip.GossipSettings;
     import org.junit.Assert;
    -import org.junit.Ignore;
     import org.junit.jupiter.api.Test;
     import org.junit.platform.runner.JUnitPlatform;
     import org.junit.runner.RunWith;
     
    +import java.util.ArrayList;
    +import java.util.List;
    +import java.util.Random;
    +
    +
     @RunWith(JUnitPlatform.class)
     public class FailureDetectorTest {
    -  
    +
    +  @FunctionalInterface
    +  interface TriConsumer<A, B, C> {
    +    void accept(A a, B b, C c);
    +  }
    +
    +  static final Double failureThreshold = new GossipSettings().getConvictThreshold();
    +
    +  List<Integer> generateTimeList(int begin, int end, int step) {
    +    List<Integer> values = new ArrayList<>();
    +
    +    Random rand = new Random();
    +
    +    for (int i = begin; i < end; i += step) {
    +      int delta = (int) ((rand.nextDouble() - 0.5) * step / 2);
    +
    +      values.add(i + delta);
    +    }
    +    return values;
    +  }
    +
       @Test
    -  public void aNormalTest(){
    -    int samples = 1;
    -    int windowSize = 1000;
    -    LocalMember member = new LocalMember("", URI.create("udp://127.0.0.1:1000"), 
    -            "", 0L, null, windowSize, samples, "normal");
    -    member.recordHeartbeat(5);
    -    member.recordHeartbeat(10);
    -    Assert.assertEquals(new Double(0.3010299956639812), member.detect(10));
    +  public void normalDistribution() {
    +    FailureDetector fd = new FailureDetector(1, 1000, "normal");
    +
    +    List<Integer> values = generateTimeList(0, 10000, 100);
    +
    +    Double deltaSum = 0.0;
    +    Integer deltaCount = 0;
    +    for (int i = 0; i < values.size() - 1; i++) {
    +      fd.recordHeartbeat(values.get(i));
    +      if (i != 0) {
    +        deltaSum += values.get(i) - values.get(i - 1);
    +        deltaCount++;
    +      }
    +    }
    +    Integer lastRecorded = values.get(values.size() - 2);
    +
    +    //after "step" delay we need to be considered UP
    +    Assert.assertTrue(fd.computePhiMeasure(values.get(values.size() - 1)) < failureThreshold);
    +
    +    //if we check phi-measure after mean delay we get value for 0.5 probability(normal distribution)
    +    Assert.assertEquals(fd.computePhiMeasure(lastRecorded + Math.round(deltaSum / deltaCount)), -Math.log10(0.5), 0.1);
       }
     
       @Test
    -  public void aTest(){
    -    int samples = 1;
    --- End diff --
    
    I would like you to keep a small test that looks something like this. It is 
easier for me to understand what the algorithm is doing when there are some 
fixed numbers here. Can you add a small test back with some baked numbers?
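
A fixed-number check of the kind requested might look roughly like the sketch below. It is not the project's FailureDetector: the class name `PhiFixedNumbersSketch`, its helper methods, and the heartbeat timestamps are all made up for illustration, and the normal CDF is a hand-rolled approximation so that the example has no dependencies. It only assumes that phi is `-log10` of the probability that the next heartbeat arrives later than the observed silence, under a normal model of the intervals.

```java
class PhiFixedNumbersSketch {

  /** Abramowitz-Stegun 7.1.26 approximation of erf (absolute error about 1.5e-7). */
  static double erf(double x) {
    double sign = x < 0 ? -1.0 : 1.0;
    double ax = Math.abs(x);
    double t = 1.0 / (1.0 + 0.3275911 * ax);
    double poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
        + t * (-1.453152027 + t * 1.061405429))));
    return sign * (1.0 - poly * Math.exp(-ax * ax));
  }

  /** Cumulative distribution function of a normal distribution. */
  static double normalCdf(double x, double mean, double sd) {
    return 0.5 * (1.0 + erf((x - mean) / (sd * Math.sqrt(2.0))));
  }

  /**
   * Hypothetical phi: -log10 of the probability that a heartbeat interval
   * is longer than the silence observed since the last heartbeat.
   */
  static double phi(long[] heartbeats, long now) {
    int n = heartbeats.length - 1; // number of intervals
    double sum = 0.0;
    for (int i = 1; i <= n; i++) {
      sum += heartbeats[i] - heartbeats[i - 1];
    }
    double mean = sum / n;
    double squares = 0.0;
    for (int i = 1; i <= n; i++) {
      double d = (heartbeats[i] - heartbeats[i - 1]) - mean;
      squares += d * d;
    }
    double sd = Math.sqrt(squares / (n - 1)); // standard deviation, not variance
    double silence = now - heartbeats[n];
    return -Math.log10(1.0 - normalCdf(silence, mean, sd));
  }

  public static void main(String[] args) {
    // Baked numbers: intervals are 90, 110, 90, 110, so the mean interval is 100.
    long[] beats = {0, 90, 200, 290, 400};
    // Silence equal to the mean interval: close to -log10(0.5), about 0.301.
    System.out.println(phi(beats, 500));
    // A longer silence must give a larger phi.
    System.out.println(phi(beats, 520));
  }
}
```

With fixed timestamps the expected value at the mean delay does not depend on the spread of the intervals, which is what makes it easy to reason about by hand.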


> Critical bugs in FailureDetector
> --------------------------------
>
>                 Key: GOSSIP-74
>                 URL: https://issues.apache.org/jira/browse/GOSSIP-74
>             Project: Gossip
>          Issue Type: Bug
>            Reporter: Maxim Rusak
>            Assignee: Maxim Rusak
>
> FailureDetector currently has (at least) 2 bugs (compared to the original paper):
> 1. latestHeartbeatMs is not updated on each heartbeat, so 
> descriptiveStatistics holds not the deltas between consecutive heartbeats but 
> the time elapsed since the first heartbeat.
> 2. When we create normalDistribution we pass the variance, not the standard 
> deviation.
> Together they make FailureDetector almost completely insensitive because of the extremely high deviation.
> Example: http://pastebin.com/xaeF52PP
> Here we send 100 heartbeats, one per second (for example), then check the 
> state after 2000 seconds; compared to the threshold, the member is still considered alive.
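
The effect described in the issue can be illustrated with plain statistics, without the project's classes. The sketch below is a simplified model, not the code from the pastebin: `HeartbeatStatsSketch`, its methods, and the synthetic timestamps (100 heartbeats about one second apart, in milliseconds) are assumptions made for illustration. It compares the standard score of a 2000-second silence when the samples are deltas between heartbeats and the spread is the standard deviation, against the case where the samples are times since the first heartbeat and the variance is passed as the spread.

```java
import java.util.ArrayList;
import java.util.List;

class HeartbeatStatsSketch {

  /** 100 synthetic heartbeats in ms; intervals alternate between 1100 and 900. */
  static long[] sampleTimes() {
    long[] times = new long[100];
    for (int i = 0; i < times.length; i++) {
      times[i] = i * 1000L + (i % 2 == 0 ? 0L : 100L);
    }
    return times;
  }

  /** Arithmetic mean of the samples. */
  static double mean(List<Double> xs) {
    double sum = 0.0;
    for (double x : xs) {
      sum += x;
    }
    return sum / xs.size();
  }

  /** Unbiased sample variance (divides by n - 1). */
  static double variance(List<Double> xs) {
    double m = mean(xs);
    double squares = 0.0;
    for (double x : xs) {
      squares += (x - m) * (x - m);
    }
    return squares / (xs.size() - 1);
  }

  /** Intended samples: intervals between consecutive heartbeats. */
  static List<Double> deltas(long[] times) {
    List<Double> out = new ArrayList<>();
    for (int i = 1; i < times.length; i++) {
      out.add((double) (times[i] - times[i - 1]));
    }
    return out;
  }

  /** Samples under bug 1 (as modelled here): time elapsed since the first heartbeat. */
  static List<Double> sinceFirst(long[] times) {
    List<Double> out = new ArrayList<>();
    for (int i = 1; i < times.length; i++) {
      out.add((double) (times[i] - times[0]));
    }
    return out;
  }

  /** Standard score of a silence for a given mean and spread. */
  static double zScore(double silence, double mean, double spread) {
    return (silence - mean) / spread;
  }

  public static void main(String[] args) {
    long[] times = sampleTimes();
    double silence = 2_000_000.0; // 2000 s without a heartbeat, in ms

    List<Double> good = deltas(times);
    List<Double> bad = sinceFirst(times);

    // Intended behaviour: deltas as samples, standard deviation as spread.
    System.out.println(zScore(silence, mean(good), Math.sqrt(variance(good))));
    // Both bugs: times since first heartbeat, variance passed as spread (bug 2).
    System.out.println(zScore(silence, mean(bad), variance(bad)));
  }
}
```

In this model the intended computation puts the silence thousands of standard deviations above the mean, while the combined bugs leave it a small fraction of one "deviation" away, which is consistent with the member still looking alive.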



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
