Hi,

I decided to take a look at routing to see if it could be improved before NG routing
is implemented.  Think what I have come up with works better that the current
code.

I have done a few things.  First I instrumented the module.  This way we can see what
events are occurring.  Then I made the number for ip level and application level 
connects
be accounted independently, both globally and per node.  I then changed the model used
to estimate CP to use the above numbers.  Here is how it all works.  Every event is 
assigned
a weight and this is used to update the CP as follows:

CP = CP + .1 * (weight - CP)

let 

RC = route_connect_successes / route_connect_trials
AC = application_connect_successes / application_connect_trials

then

routeConnected 
        increments route_connect_trials
routeAccepted 
        increments route_connect_successes 
routeSucceeded 
        increments application_connect_successes and application_connect_trials 
        adjust CP with a weight of 1.0
connectFailed
        increments route_connect_trials
        adjust CP with a weight of 0.0
authFailed
        adjust CP with a weight of 0.0
        Note: what we do here does not really mater since this event does not happen 
(I have yet to see it)
timedOut
        adjust CP with a weight of (0.05)
        Note: this may also increment route_connect_trials
transferFailed
        adjust CP with a weight of (0.05)
        Note: what we do here does not really mater since this event does not often
verityFailed
        adjust CP with a weight of (0.0)
        Note: what we do here does not really mater since this event does not happen 
(I have yet to see it)
queryRejected
        increments application_connect_trials
        if this a cached connection
                adjust CP with a weight of max(AC, 0.01)
        else
                adjust CP with a weight of max(AC*RC, 0.01)
        Note: the minimum is there since, in ark terms, only a weight of 0.0 in 
interesting.  Anything else
                implies we found a node...

In order to conserve nodes with lots of references two things were done.   On my test 
nodes RC=0.6 and 
AC=.7,  so I set the starting CP =0.4=(RC*AC).   Nodes reset the data source to 
themselves to ask for more 
traffic.  So the second thing I do is intercept 'reference' and adjust  CP towards 1.0 
depending on the 
number of references for the node using:

CP = CP + (.9-.8*((max_refs-refs)/max_refs)**2)*(1.0 - CP)

The net affect of the above changes is that, using 64 entries in the routing table 
with a max of 64 refs
I find an average of 12 refs per node after a few days of uptime.

Feedback from some busy nodes trying this would be really nice.  The patch applies to 
6043 and 6503.

Comments?
Ed Tomlinson

---------------
--- CPAlgoRoutingTable.java.orig        2003-06-06 21:42:30.000000000 -0400
+++ CPAlgoRoutingTable.java     2003-06-15 11:23:00.000000000 -0400
@@ -1,4 +1,4 @@
-package freenet.node.rt;
+Package freenet.node.rt;
 
 import freenet.BadAddressException;
 import freenet.Address;
@@ -66,12 +66,19 @@
        this.minDowntimeARKLookup = minDowntimeARKLookup;
        this.minCP = minCP;
        this.maxLookupThreads = maxLookupThreads;
+       this.cachedAttempts = 0;
+       this.cachedSuccesses = 0;
+       aFailed = 0;
+       aTimedOut = 0;
+       tFailed = 0;
     }
     
     int consecutiveFailuresLookupARK;
     int minDowntimeARKLookup;
     float minCP;
     int maxLookupThreads;
+    int cachedAttempts, cachedSuccesses;
+    int aFailed, aTimedOut, tFailed;
 
     public RTDiagSnapshot getSnapshot() {
         long nowMs = System.currentTimeMillis();
@@ -80,6 +87,7 @@
         totalRefs = 0;
         contactedRefs = 0;
        requestingRefs = 0;
+       totalKeys = 0;
 
         // ^^^ the fact that you have to do this suggests these
         //     should be local variables not instance variables  -tc
@@ -111,6 +119,7 @@
                }
                
                ReferenceSet refSet = ReferenceSet.getProperty(memory, "keys");
+               totalKeys += refSet.size();
                Enumeration refs = refSet.references();
                while (refs.hasMoreElements()) {
                    if(shouldLog)
@@ -153,10 +162,16 @@
         Object[] props = new Object[TABLE_PROPERTIES.length];
         props[0] = new Integer(totalRefs);
         props[1] = new Integer(contactedRefs);
-        props[2] = new Integer(requestingRefs);
-        props[3] = new Integer(totalAttempts);
-        props[4] = new Integer(totalSuccesses);
-        props[5] = getClass().getName();
+        props[2] = new Integer(totalKeys);
+        props[3] = new Integer(requestingRefs);
+        props[4] = new Integer(totalAttempts);
+        props[5] = new Integer(totalSuccesses);
+        props[6] = new Integer(cachedAttempts);
+        props[7] = new Integer(cachedSuccesses);
+        props[8] = new Integer(aFailed);
+        props[9] = new Integer(aTimedOut);
+        props[10] = new Integer(tFailed);
+        props[11] = getClass().getName();
         
        Key[] keyList = (Key[])keys.toArray(new Key[keys.size()]);
        if(shouldLog)
@@ -195,21 +210,11 @@
         }
     }
     
-
-    private final synchronized void incTrials(RoutingMemory mem) {
-        CPAlgoData cpd = getProperty(mem, RMKEY, node);
-        cpd.trials++;
-        // We don't call mem.setProperty() because trials isn't
-        // persisted.
-    }
-    
-    private final synchronized void fail(RoutingMemory mem) {
-       fail(mem, false);
-    }
-    
-    private final synchronized void fail(RoutingMemory mem, boolean weak) {
+    private final synchronized void fail(RoutingMemory mem, float done, boolean 
cached) {
         CPAlgoData cpd = getProperty(mem, RMKEY, node);
-        if (cpd.decreaseContactProbability(weak)) {
+       if (done == 1.0f)
+           cpd.increaseContactProbability(done);
+       else if (cpd.decreaseContactProbability(done)) {
            remove(mem, null);
             return;
         }
@@ -221,52 +226,30 @@
        remove((RoutingMemory)(routingStore.getNode(i)), i);
     }
 
-    private final synchronized void succeed(RoutingMemory mem, boolean cached,
-                                           boolean attenuated, boolean weak) {
-        CPAlgoData cpd = getProperty(mem, RMKEY, node);
-        cpd.successes++;
-        if (!cached) {
-            // Uncached connections don't count.
-            cpd.increaseContactProbability(weak);
-        }
-        mem.setProperty(RMKEY, cpd);
-    }
-
-    /**
-     * Succeed and/or fail on weak or strong in a single step
-     * @param strongSuccess whether we strongly (connection level) succeded
-     * @param weakSuccess whether we weakly (query level) succeeded
-     * @param ignoreStrong whether to leave the strong component unchanged
-     * - generally set when we are dealing with a cached connection
+    /*
+     * Nodes set references to themselves when they want more
+     * traffic - lets take that into account.  This should help 
+     * conserve nodes with lots of keys, which should lead to  
+     * better route selections.
      */
-    protected final synchronized void succeedFail(RoutingMemory mem,
-                                                 boolean strongSuccess,
-                                                 boolean weakSuccess,
-                                                 boolean ignoreStrong) {
-       CPAlgoData cpd = getProperty(mem, RMKEY, node);
-       if(!ignoreStrong) {
-           if(!strongSuccess) {
-               if(cpd.decreaseContactProbability(false)) {
-                   remove(mem, null);
-                   return;
-               }
-           } else {
-               cpd.increaseContactProbability(false);
+    public synchronized void reference(Key k, NodeReference nr) {
+       super.reference(k, nr);
+       RoutingMemory mem = routingStore.getNode(nr.getIdentity());
+       if (mem != null) {
+           ReferenceSet refSet = ReferenceSet.getProperty(mem, "keys");
+           CPAlgoData cpd = getProperty(mem, RMKEY, node);
+           if (refSet.size() > 1) {
+               cpd.perturbCP(refSet.size());  
+               mem.setProperty(RMKEY, cpd);
+               Core.logger.log(this, "key reference perturbed CP",
+                   Core.logger.DEBUG);
            }
-       }
-       if(weakSuccess) {
-           cpd.increaseContactProbability(true);
        } else {
-           if(cpd.decreaseContactProbability(true)) {
-               remove(mem, null);
-               return;
-           }
+           Core.logger.log(this, "key reference found null mem", 
+               Core.logger.DEBUG);
        }
-       if(weakSuccess || strongSuccess)
-           cpd.successes++;
-       mem.setProperty(RMKEY, cpd);
     }
-    
+
     protected boolean isRoutable(RoutingMemory mem,
                                 boolean desperate) {
        return isRoutable(mem, desperate, "");
@@ -332,43 +315,96 @@
                                length>500 ? Core.logger.MINOR : Core.logger.DEBUG);
        }
     }
-    
+   
+    // 33% done. 
     protected synchronized void routeConnected(RoutingMemory mem) {
-        incTrials(mem);
-        totalAttempts++;
+       synchronized(mem) {
+           CPAlgoData cpd = getProperty(mem, RMKEY, node);
+            cpd.trials++;
+            totalAttempts++;
+           cpd.state = 1;
+       }
     }
 
+    // 66% done (routed at ip, auth ok)
     protected synchronized void routeAccepted(RoutingMemory mem) {
-        // hmm.. what?
+       synchronized(mem) {
+           CPAlgoData cpd = getProperty(mem, RMKEY, node);
+           if (cpd.state == 1) {
+               cpd.successes++;
+               totalSuccesses++;
+               cpd.state = 0;
+           }
+       }
     }
 
+    // 100% done here
     protected synchronized void routeSucceeded(RoutingMemory mem, boolean cached) {
-        // Don't reward cached connections.
-        succeedFail(mem, true, true, cached);
-        totalSuccesses++;
+       synchronized(mem) {
+           CPAlgoData cpd = getProperty(mem, RMKEY, node);
+           if (!cached && cpd.state == 1) {
+               cpd.successes++;
+               totalSuccesses++;
+               cpd.state = 0;
+           }
+           cpd.cTrials++;
+           cpd.cSuccesses++;
+           cachedAttempts++;
+           cachedSuccesses++;
+       }
+        fail(mem,1.0f,cached);
     }
 
+    // 0% done (did not get routeConnected)
     protected synchronized void connectFailed(RoutingMemory mem) {
-        incTrials(mem);
-        totalAttempts++;
-        fail(mem);
+       synchronized(mem) {
+           CPAlgoData cpd = getProperty(mem, RMKEY, node);
+            cpd.trials++;
+            totalAttempts++;
+           cpd.state = 0;
+       }
+       aFailed++; 
+        fail(mem,0.0f,false);
     }
 
+    // 0% done if this happens (have routeConnected but routeAccepted cannot occur)
     protected synchronized void authFailed(RoutingMemory mem) {
-       fail(mem);
+       synchronized(mem) {
+           CPAlgoData cpd = getProperty(mem, RMKEY, node);
+           cpd.state = 0;
+       }
+       fail(mem,0.0f,false);
        // don't deref it completely in case this is corruption or something
-       // don't ignore it because we want to clear out nodes that change identity 
reasonably quickly
+       // don't ignrore it because we want to clear out nodes that change identity 
reasonably quickly
     }
 
+    // 5% done - seems only to be called during setup of the route.  We give it 5% 
since
+    // we did find something, but it was too busy to respond
     protected synchronized void timedOut(RoutingMemory mem) {
-        fail(mem);
+       synchronized(mem) {
+           CPAlgoData cpd = getProperty(mem, RMKEY, node);
+           if (cpd.state == 0) {
+                cpd.trials++;
+                totalAttempts++;
+           } 
+           cpd.state = 0;
+       }
+       aTimedOut++;
+        fail(mem,0.05f,false);
     }
 
+    // 5% done - suspect this happens after routeSucceeded, no counts are affected.  
This 
+    // does not happen enought to really matter...
     protected synchronized void transferFailed(RoutingMemory mem) {
-        fail(mem);
+       synchronized(mem) {
+           CPAlgoData cpd = getProperty(mem, RMKEY, node);
+           cpd.state = 0;
+       }
+       tFailed++;
+        fail(mem,0.05f,false);
     }
 
-    // this is not a tyop!
+    // this is not a tyop!  No Fail call since we are removing the node...
     protected synchronized void verityFailed(RoutingMemory mem) {
         // dereference node
         remove(mem, null);
@@ -404,6 +440,8 @@
     // Reduce the chances of routing to nodes that always 
     // replies with QueryRejected.
     protected synchronized void queryRejected(RoutingMemory mem, boolean cached, long 
attenuation) {
+       float rp = 0.5f;
+       float cp = 0.7f;
         
         // I would have preferred to make queryRejected() in addition
         // to routeSucceeded() to keep transport and application
@@ -412,9 +450,29 @@
         // Counter-intutitive.  The routing has succeeded at the 
         // transport layer,  but we got a QueryRejected 
         // back at the application layer.
-        //
-       succeedFail(mem, true, false, cached);
-        totalSuccesses++;
+       
+       synchronized(mem) {
+           CPAlgoData cpd = getProperty(mem, RMKEY, node);
+           if (!cached && cpd.state == 1) {
+               cpd.successes++;
+               totalSuccesses++;
+               cpd.state = 0;
+           }
+           cpd.cTrials++;
+           cachedAttempts++;
+
+           // we bottom out at 0.01 since this is not a routing failure
+           if (cpd.trials > 0)
+                rp = java.lang.Math.max(((float) cpd.successes) / ((float) 
cpd.trials), 0.01f);
+
+           if (cpd.cTrials > 0)
+               cp = java.lang.Math.max(((float) cpd.cSuccesses) / ((float) 
cpd.cTrials), 0.01f);
+        }
+
+       if (cached)
+           fail(mem, cp, cached);      // it was cached, next try probably will be too
+       else
+           fail(mem, rp*cp, cached);   // we started from scratch probably will next 
time 
 
         // Disable routing attenuation. 20020421 --gj
         //         
@@ -452,9 +510,10 @@
           "ARK version", "Fetching ARK", "ARK URL"};
 
     private final static String[] TABLE_PROPERTIES 
-        = {"Number of node references", "Contacted node references",
-           "Node references requesting ARKs", "Total Trials", 
-          "Total Successes", "Implementation" };
+        = {"Number of node references", "Contacted node references", "Total Reference 
Keys",
+           "Node references requesting ARKs", "Route Trials", 
+          "Route Successes", "Cached Trials", "Cached Successes", 
+          "Route Failures", "Route TimedOut", "TransferFailed", "Implementation" };
 
     // ^^^ i don't understand why you are doing this here
     //     instead of just passing the NodeReference back  -tc
@@ -492,6 +551,7 @@
     // Diagnostic stats set by makeRefInfo
     private int totalRefs = 0;
     private int requestingRefs = 0;
+    private int totalKeys = 0;
     private int contactedRefs = 0;
 
     private final Long LONG_MINUS_ONE = new Long(-1);
@@ -634,12 +694,17 @@
      **/
     class CPAlgoData implements DataObject {
        
-       // persisted.
-       float contactProbability = 1.0f;
+       // persisted.  Initial is rp=0.55 x cp=0.70 or about 0.4
+       float contactProbability = 0.4f;
+
+       // better, but not foolproof timeout handling (0=initial, 1=routed)
+       int state = 0;
        
        // Diagnostic info
        long successes = 0;
        long trials = 0;
+       long cSuccesses = 0;
+       long cTrials = 0;
        
        // Number of consecutive failures
        long consecutiveFailures = 0;
@@ -745,50 +810,57 @@
            }
            return contactProbability;
        }
+
+       final void perturbCP(int keys) {
+           float t = (float)(Node.rtMaxRefs-keys)/(float)Node.rtMaxRefs;
+           nextCP(1.0f, 0.9f-0.8f*t*t); 
+       }
+
+       final void nextCP(float value) {
+           nextCP(value, 0.1f);
+       }
        
+       final void nextCP(float value, float k) {
+           contactProbability = contactProbability + k*(value - contactProbability);
+       }
+
        // returns true if the node should be dereferenced.
-       final boolean decreaseContactProbability(boolean weak) {
-           contactProbability = 
-               (float)(weak ? ((96.0 * contactProbability) / 100.0) :
-                       (((4.0 * contactProbability) /* + 0.0 */) / 5.0));
-           // Decrease by 4% for weak, 20% for strong
-           if(contactProbability < minCP) contactProbability = minCP;
-           consecutiveFailures++;
-           // When consecutiveFailures reaches some value, start an ARK lookup
-           if(consecutiveFailures >= consecutiveFailuresLookupARK &&
-              lastSuccessfulContact + minDowntimeARKLookup <
-              System.currentTimeMillis()) {
-               synchronized(lookupLock) {
-                   if(node == null) throw new IllegalStateException
+       final boolean decreaseContactProbability(float done) {
+           nextCP(done);
+           // When consecutiveFailures reaches some value, start an ARK lookup and
+           // its only a failure, in ARK terms, if the passed probability is zero.
+           if(done > 0f) 
+               consecutiveFailures = 0;
+           else {      
+               consecutiveFailures++;
+               if (consecutiveFailures >= consecutiveFailuresLookupARK &&
+                  lastSuccessfulContact + minDowntimeARKLookup <
+                  System.currentTimeMillis()) {
+                       synchronized(lookupLock) {
+                       if(node == null) throw new IllegalStateException
                                         ("creating LookupARK with null node");
-                   if(lookup == null && getARKURI() != null) {
-                       synchronized(lookupThreadsLock) {
-                           if(lookupThreads < maxLookupThreads)
-                               lookup = new LookupARK(getARKURI());
-                       }
+                       if(lookup == null && getARKURI() != null) {
+                           synchronized(lookupThreadsLock) {
+                               if(lookupThreads < maxLookupThreads)
+                                   lookup = new LookupARK(getARKURI());
+                           }
                        // FIXME: Kill the LookupARK whose ref is least recently used
+                       }
                    }
-               }
+               }
            }
            lastRetryMs = System.currentTimeMillis();
            return false;
        }
        
-       final void increaseContactProbability(boolean weak) {
+       final void increaseContactProbability(float done) {
            if(justFetchedARK) {
                node.diagnostics.occurrenceCounting("successLookupARK",1);
                justFetchedARK = false;
            }
-           if(contactProbability < minCP) contactProbability = minCP;
+           nextCP(done);
+           lastSuccessfulContact = lastRetryMs = System.currentTimeMillis();
            consecutiveFailures = 0;
-           float f = (float)
-               (weak ? (((19 * contactProbability) + 1.0) / 20.0) :
-                (((3.0 * contactProbability) + 1.0) / 4.0));
-           // Move closer to 1.0 by 5% for weak
-           // 25% for strong
-           if(f>1) f = 1;
-           contactProbability = f;
-           lastRetryMs = lastSuccessfulContact = System.currentTimeMillis();
            synchronized(lookupLock) {
                if(lookup != null) {
                    lookup.kill();
@@ -796,7 +868,7 @@
                }
            }
        }
-       
+
        final boolean fetchingARK() {
            return lookup != null;
        }
----------------
_______________________________________________
devl mailing list
[EMAIL PROTECTED]
http://hawk.freenetproject.org:8080/cgi-bin/mailman/listinfo/devl

Reply via email to