Geometric mean is still vulnerable to this, just less so than arithmethic mean. Besides, it is likely much more expensive computationally - or do modern day chips come with instructions to calculate the Nth root (could be, I haven't really examined their capabilities) ?
Is there any particular advantage geometric mean has over median ? On Mon, 10 Apr 2006 20:06:25 -0700, Ian Clarke wrote: > I believe another way to suppress the influence of outliers is to use a > geometric mean, ie. rather than adding N numbers and dividing my N, you > multiply the N numbers, and take the Nth root of the result. > > Ian. > > On 10 Apr 2006, at 18:02, John B?ckstrand wrote: > >> I was thinking about the mean ping times that are used for handling node >> load as I understood it, and had the idea to use a median instead. >> >> Using an average means one single node can artificially push the avg >> ping very high, unless you have more than 13 other active peers (the >> influence from a 30s ping (is this the max, btw?) with 13 peers is >> 2000ms, so even with ping=0 for the other 13 peers, the mean ping will >> be 2000ms. This could be a very easy way to DoS the network, no? Just >> create a very popular node (read: SkarX) and then answer pings/packets >> very slowly. >> >> A median suppresses outliers, and outliers in a latency calculation will >> mostly be, I suspect, those with high latency due to DoS:ing or just >> having very bad connections (or absurd overutilization). Another form of >> latency outliers is having local nodes, which could have very low >> latency not affected by internet connection load. >> >> Anyway, attached is a patch to print the median aswell. I had guessed >> that a few nodes would have "insanely" high ping and be one reason for >> overloads, and that seems to be the case atleast on my darknet node, >> which has more active connections. The difference isnt major or anythng >> though: mean is 353ms, and median is 311ms, caused mainly by one node >> with a latency of about 3000ms. I will try to graph these two aswell for >> some extended period of time for my node. >> >> --- >> John B?ckstrand >> Index: src/freenet/node/NodePinger.java >> =================================================================== --- >> src/freenet/node/NodePinger.java (revision 8505) +++ >> src/freenet/node/NodePinger.java (working copy) @@ -1,5 +1,9 @@ >> package freenet.node; >> >> +import java.util.ArrayList; >> +import java.util.Collections; >> +import java.lang.Double; >> + >> import freenet.support.Logger; >> >> public class NodePinger implements Runnable { >> @@ -30,20 +34,32 @@ >> void recalculateMean(PeerNode[] peers) { >> int peerCount = 0; >> double total = 1.0; >> + ArrayList pingTimes = new ArrayList(); >> for(int i=0;i<peers.length;i++) { >> PeerNode peer = peers[i]; >> if(!peer.isConnected()) continue; >> peerCount++; >> double avgPingTime = peer.averagePingTime(); >> + pingTimes.add(new Double(avgPingTime)); >> double avgThrottledPacketSendTime = >> peer.throttledPacketSendAverage.currentValue(); >> double value = Math.max(avgPingTime, >> avgThrottledPacketSendTime); >> Logger.minor(this, "Peer: "+peer.getPeer()+", >> avgPingTime: >> "+avgPingTime+", avg throttled send time: "+avgThrottledPacketSendTime); >> total *= value; >> } >> + >> + Collections.sort(pingTimes); >> + >> + for(int i=0; i<pingTimes.size(); i++){ + >> Logger.minor(this, "at " + >> i + " " + pingTimes.get(i)); + } >> + >> if(peerCount > 0) { >> total = Math.pow(total, 1.0 / peerCount); meanPing = >> total; >> - Logger.minor(this, "Mean ping: "+meanPing+"ms"); + >> + double median = ((Double) >> pingTimes.get((pingTimes.size()-1)/ >> 2)).doubleValue(); >> + >> + Logger.minor(this, "Mean ping: "+meanPing+"ms, median: >> " + median + >> "ms"); >> } >> } >> } >> _______________________________________________ Devl mailing list >> Devl at freenetproject.org >> http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl