[Sorry, I sent this to gtk-gnutella-users; gtk-gnutella-devel is the
better forum.]
> Bill Pringlemeir wrote:
>> Examples,
>>  Three hop ultra cycle         Four hop ultra cycle
>>
>>  U1 <-------> U2               U1 <---------> U2
>>   \           /                ^              ^
>>    \         /                 |              |
>>     \       /                  |              |
>>      \     /                   |              |
>>       \   /                    v              v
>>        U3                      U4 <---------> U3
>> If U1 is a gtkg ultra, any query could be routed back to itself.
>> If this is detected and the ultra relaying the query is
>> disconnected, then these cycles don't happen... but I think they
>> do.
On 31 Oct 2006, [EMAIL PROTECTED] wrote:
> If a certain amount of dupes is seen from one connection, the peer
> gets disconnected.
How is "dupes" defined? I thought it was two similar queries from the
same ultra. I didn't think gtkg tracked queries initiated from itself
(or its leaves). I guess that the change from 7 hops to 4 hops will
have lessened the number of cycles in the network. However, I think
it also increased the effect of the cycles. It seems that horizon
estimates have gone down since the hop count was decreased to four.
Perhaps this is due to Gtkg ultras not connecting to each other (or
hsep nodes). It also seems that searches have been less fruitful, but
that may be due to other factors (like LimeWire discarding SHA
searches, making bitzi references hard to find).
Host caching and sharing of cache information keeps nodes localized.
Of course, host caching is beneficial, but it will increase cycles. A
close formula for the number of leaves connected is,
  Lc (1 + Uc + Uc^2 (1 - n3) + Uc^3 (1 - n3 - n4) (1 - n3))
where n3 and n4 are the probabilities of three and four hop cycles, Lc
is an average leaf count and Uc is an average ultra count. If the
cycle probabilities are zero and there are 100 leaf nodes and 35
ultras, then there are 100 ( 1 + 35 (1 + 35 ( 1 + 35) ) ) = 4 413 600
possible leaves in any search.
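As a quick check of the arithmetic, the formula above can be written
as a small Python function. This is only a sketch; the function and
parameter names are made up for illustration and are not anything
from gtkg.

```python
def reachable_leaves(lc, uc, n3=0.0, n4=0.0):
    """Estimate the leaves reached by a query, per the formula above.

    lc: average leaf count, uc: average ultra count,
    n3 / n4: probability of a three / four hop cycle.
    """
    ultras_0 = 1                                   # the querying ultra itself
    ultras_1 = uc                                  # its ultra neighbours
    ultras_2 = uc**2 * (1 - n3)                    # reduced by three hop cycles
    ultras_3 = uc**3 * (1 - n3 - n4) * (1 - n3)    # reduced by both kinds
    return lc * (ultras_0 + ultras_1 + ultras_2 + ultras_3)


print(reachable_leaves(100, 35))  # 4413600.0 when there are no cycles
```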
Here is a table with probabilities of cycles...
With n3 == n4:

n3/n4 | leaves  | full %
------+---------+---------
0.00  | 4413600 | 100 %
0.05  | 3785788 | 85.8 %
0.10  | 3200850 | 72.5 %
0.15  | 2658788 | 60 %
0.20  | 2159600 | 49 %
0.25  | 1703288 | 38.6 %
0.30  | 1289850 | 29 %
0.35  |  919288 | 21 %
0.40  |  591600 | 13.4 %
0.45  |  306788 | 7 %
0.50  |   64850 | 1.4 %
The table looks overly pessimistic, since a value of 0.1 means that
10% of connections are in a three cycle and another 10% are in a four
cycle. However, it does show that the two combine to give a more than
linear decrease in the nodes available to search.
The leaf counts themselves are rather optimistic, as they assume that
no leaves are sharing any of the ultras in the tree. However, that
wouldn't make a difference to the "full %" column. Equal probabilities
(n3 == n4) are probably not the real behavior, as there will likely be
more cycles with four nodes.
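To get a feel for the case where four cycles are more common than
three cycles, the same formula can be evaluated with n4 > n3. The
(n3, n4) pairs below are arbitrary picks for illustration, not
measurements, and the function name is hypothetical.

```python
def reachable_leaves(lc, uc, n3, n4):
    # Same formula as in the text, with Lc = lc and Uc = uc.
    return lc * (1 + uc + uc**2 * (1 - n3) + uc**3 * (1 - n3 - n4) * (1 - n3))


full = reachable_leaves(100, 35, 0.0, 0.0)  # cycle-free reference value

# Illustrative pairs only: n4 is assumed larger than n3.
for n3, n4 in [(0.05, 0.10), (0.05, 0.20), (0.10, 0.30)]:
    leaves = reachable_leaves(100, 35, n3, n4)
    print(f"n3={n3:.2f} n4={n4:.2f} leaves={leaves:9.0f} "
          f"full={100 * leaves / full:5.1f} %")
```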
These scenarios seem more likely with the more broad-based GNET as the
host cache is populated by nodes at most 4 hops away. Even though many
cycles exist with the 7 hop GNET, there is the possibility that the
host cache will be populated with a distant node (due to search
results from these nodes).
Possibly downloading a popular file with many "referers" will give more
distant nodes [to be put in the host cache], but this will depend on
user interaction with Gtkg to get a diverse host cache.
These cycles aren't all bad. They provide redundancy to the network.
However, it is very bad if there is a high proportion of "3 cycle"
ultra connections. The paper I saw giving numbers for the 4 hop GNET
didn't seem to take these cycles into account. It is
possible for an ultra to track these cycles and rate neighboring
ultras to improve network quality.
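One way such tracking might look, sketched in Python rather than
gtkg's C: the ultra remembers the IDs of queries it originated (for
itself or its leaves) and counts, per neighbour, how often one of them
is relayed back. Everything here (the class, the method names, the
rating) is hypothetical and is not how gtkg works; a real version
would also need to expire old query IDs and derive the cycle length
from the message's hop count.

```python
from collections import Counter


class CycleTracker:
    """Counts how often our own queries are relayed back by each neighbour."""

    def __init__(self):
        self.own_ids = set()     # IDs of queries this ultra originated
        self.sent = Counter()    # neighbour -> number of queries sent to it
        self.echoed = Counter()  # (neighbour, cycle_len) -> own queries returned

    def on_send(self, query_id, neighbours):
        self.own_ids.add(query_id)
        for n in neighbours:
            self.sent[n] += 1

    def on_receive(self, query_id, neighbour, cycle_len):
        """Return True if the query is one of ours, i.e. it travelled a cycle."""
        if query_id not in self.own_ids:
            return False
        self.echoed[(neighbour, cycle_len)] += 1
        return True

    def cycle_rate(self, neighbour, cycle_len):
        """Fraction of our queries that came back via this neighbour."""
        sent = self.sent[neighbour]
        return self.echoed[(neighbour, cycle_len)] / sent if sent else 0.0


tracker = CycleTracker()
tracker.on_send("q1", ["U2", "U3"])
tracker.on_send("q2", ["U2", "U3"])
tracker.on_receive("q1", "U3", 3)   # q1 went U1 -> U2 -> U3 -> U1
print(tracker.cycle_rate("U3", 3))  # 0.5
```

A neighbour with a high rate for three cycles would then be a
candidate for replacement by a more distant ultra.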
fwiw,
Bill Pringlemeir.
_______________________________________________
Gtk-gnutella-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/gtk-gnutella-devel