Raphael Manfredi wrote:
> Quoting Christian Biere <[EMAIL PROTECTED]> from ml.softs.gtk-gnutella.devel:
> :In theory, yes. Gtk-Gnutella doesn't really do that. However, that
> :alone doesn't really solve the bootstrap problem, and I don't think
> :Gtk-Gnutella ever needs to fall back to a cache once you were
> :connected. However, if there's a condition that makes it impossible
> :for Gtk-Gnutella to connect to peers, it will happily contact the
> :caches over and over again. That's why I bumped the UHC lock to
> :24 hours. Someone else reduced it to 10 minutes.
>
> Hmm... "Someone else"?
>
> The problem with UHCs is when you're UDP-firewalled. Currently, we don't
> use the lack of replies from UHCs as an indication that we are
> UDP-firewalled. Because some of the UHCs in our hardwired list could be
> down, that conclusion could be wrong. However, after say 10 unsuccessful
> attempts at contacting a UHC, GTKG should consider itself definitively
> UDP-firewalled, no longer attempt to contact UHCs, and fall back to GWCs.
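Something along those lines would certainly help. Just to make it
concrete, here is roughly what I have in mind - only a sketch with
made-up names and thresholds, not actual GTKG code:

    /*
     * Sketch of the give-up heuristic: after too many unanswered UHC
     * pings, assume we are UDP-firewalled and stop wasting UDP traffic,
     * falling back to the (more expensive) GWCs instead.
     */
    #include <stdbool.h>

    #define UHC_MAX_ATTEMPTS 10 /* give up on UDP after this many silent tries */

    struct bootstrap_state {
        unsigned uhc_attempts;  /* UHC pings sent without any pong received */
        bool udp_firewalled;    /* sticky once we have given up on UDP */
    };

    /* Called whenever we need more hosts and the internal cache is empty. */
    bool
    should_use_uhc(struct bootstrap_state *bs)
    {
        if (bs->udp_firewalled)
            return false;              /* already gave up, use GWCs instead */

        if (bs->uhc_attempts >= UHC_MAX_ATTEMPTS) {
            bs->udp_firewalled = true; /* consider ourselves UDP-firewalled */
            return false;
        }

        bs->uhc_attempts++;
        return true;
    }

    /* A single pong proves that UDP works; reset the counter. */
    void
    uhc_pong_received(struct bootstrap_state *bs)
    {
        bs->uhc_attempts = 0;
        bs->udp_firewalled = false;
    }

One pong is enough to prove that UDP works, so the counter is reset as
soon as anything comes back at all.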
It would also be better if the user had a chance to figure out what the hell
is going on before GTKG runs away contacting caches and tons of peers, before
you've had the chance to set up at least a half-decent configuration. Even if
that worked for 90% of all people, the other 10% can be a serious issue with
respect to cache resources.

> The only downside to this is that we don't know that we "don't know yet".
> When GTKG starts up for the first time, it has to assume it is completely
> firewalled, yet we cannot let it contact the GWCs immediately: we prefer
> the lower-cost UHCs at that point.
>
> :In return, I increased it back to an hour at least. You see, negotiating
> :reasonable timeouts works like a bazaar (just in case you thought it was
> :based on logic or something).
>
> Timeouts are black magic. Only experimentation can determine whether they
> are appropriate.

I definitely checked the worst-case scenario and decided that the chosen
timeouts were far too low. Too low even to fix any (firewall) problems while
it's trying to get connected. So if there's a real problem, all those attempts
are wasted anyway.

> 24 hours was far too large, 10 minutes was probably too low. Yet, as you
> said, GTKG will only contact a cache when its internal cache is empty. So
> we know that it will not abuse caches when things work out fine. It's when
> things do not proceed as planned (e.g. the UHC pongs get lost or are
> blocked by a firewall) that things start to be interesting...

Well, it also contacts caches when the number of cached peers drops below the
number of missing hosts, which is all too often caused by throwing usable IP
addresses away.
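In other words, the decision to hit a UHC comes down to something like
this - again a simplified sketch with invented names, and I'm not
claiming this is literally what the source does:

    /*
     * Contact a UHC only when the host cache really cannot satisfy the
     * demand, and never more often than the lock interval allows.
     */
    #include <stdbool.h>
    #include <time.h>

    #define UHC_LOCK_DELAY (60 * 60)  /* seconds between UHC requests */

    struct host_cache {
        unsigned hosts_available;   /* usable addresses we still hold */
        unsigned hosts_missing;     /* connections we would like to fill */
        time_t   last_uhc_request;  /* 0 if we never asked a UHC */
    };

    bool
    may_contact_uhc(const struct host_cache *hc, time_t now)
    {
        /* As long as the cache can still feed us, leave the UHCs alone. */
        if (hc->hosts_available >= hc->hosts_missing)
            return false;

        /* Respect the lock: at most one request per UHC_LOCK_DELAY. */
        if (hc->last_uhc_request != 0 &&
            now - hc->last_uhc_request < UHC_LOCK_DELAY)
            return false;

        return true;
    }

Throwing addresses away makes hosts_available drop for no good reason,
and that is exactly what pushes the first condition over the edge.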
> :The central issue however is (just as in kindergarten) discipline.
> :Few clients (if any) use the caches truly as a bootstrap system. Most
> :of them fall back to them far too easily and do not restrict
> :themselves to an acceptable amount of requests. You can only fix
> :your own software; talking to other developers is (usually) as
> :satisfying as a discussion with a brick wall.
>
> It's true that some clients are abusing the caches. However, I suspect
> only GTKG has such a large host cache.

This doesn't help much when it's banging away at 40 connection attempts per
second, especially when it's running unattended. When the router craps out
and you're offline for a while - which doesn't seem horribly unlikely - the
cache gets empty no matter what. Yes, it'll detect being offline, but if the
connection is flapping that fails.

> Other clients cache only a low amount (100 or so) of IP addresses, and so
> it's entirely possible that those addresses no longer work when the client
> is stopped for some time.

But they probably check the uptime of those hosts. Something GTKG doesn't do.

> :Now something which is also very important: No system will ever
> :scale if clients fall back to *all* the caches they know about. Basically,
> :if there are 10000 caches, clients would bang 10000 caches instead
> :of 20. Ok, the time necessary to connect to all of them would buy
> :you some delay, but in the end it still sucks.
>
> What matters is the quality of the data in the caches. Apart from badly
> written clients, no one would do that if they connect all right at the
> first GWC connection.

Yes, maybe *if* they connect all right. If they don't, they'll keep contacting
all the caches they know about - probably forever and ever if they're not shut
down at some point. GTKG does exactly the same. Just block UDP and set
tls_enforce to TRUE in order to simulate the BISPFH. After about 30 minutes,
GTKG will have tried all UHCs thrice and all GWebCaches. Keep in mind that
just because you don't see any real traffic, your UHC requests might very well
have hit the UHCs, and the GWebCaches have been hit even though you just saw a
hang-up. And exactly this happens when you use a Gnutella peer behind an .edu
firewall. They use L7 filters; that's absolutely obvious from the symptoms.
Of course, any badly configured firewall can have the same or similar effects.

> :Therefore, clients must really *give up* *xor* use sufficiently large
> :*exponential* delays when contacting caches. That may sound simple,
> :but how do you teach a brick?
>
> Exponential is good up to a certain point. The real issue here is that
> the first cache contacted should allow the client to bootstrap. No sane
> client should contact more than a few caches, and they have to do that
> because data quality in some caches is poor.

Well, but as I wrote: if you contact all of them because none seems to work,
you effectively hit them all and the system does not scale at all. That's as
if all TCP/IP hosts were configured to fall back to the root DNS servers if a
resolution fails. Of course that's not the case; you only use 1-3
public/private DNS servers and the rest is handled by a hierarchy of cascaded
caches.
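Just so we are talking about the same thing, by "give up *xor*
exponential delays" I mean roughly this kind of policy - hypothetical
numbers, pick whatever you like:

    /*
     * Every failed round doubles the delay before the next cache
     * request, and after a handful of rounds the client simply stops
     * and waits for the user instead of hammering the caches forever.
     */
    #define CACHE_BASE_DELAY 600  /* 10 minutes before the first retry */
    #define CACHE_MAX_ROUNDS 6    /* then give up entirely */

    struct cache_retry {
        unsigned failed_rounds;   /* rounds without a single usable host */
    };

    /*
     * Returns the delay in seconds before the next cache request may
     * be sent, or -1 once the client should give up.
     */
    long
    next_cache_delay(struct cache_retry *cr)
    {
        if (cr->failed_rounds >= CACHE_MAX_ROUNDS)
            return -1;                 /* give up: no more cache requests */

        /* 600s, 1200s, 2400s, ... doubling with every failed round. */
        return (long) CACHE_BASE_DELAY << cr->failed_rounds++;
    }

With something like that, a broken setup hits a cache a handful of
times and then stops, instead of retrying forever.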
> If people can't bootstrap easily, they will go away from Gnutella.
> I'd say if after 10 minutes they're not connected, they'll switch to
> some other network after a first try.

If the caches are overloaded, there's nothing to bootstrap anyway. They may as
well go away before the network does. You can't enforce plug and play. If it
doesn't connect, let them figure out how to fix it and retry then. Retrying
aggressively doesn't get you anywhere and may even cause trouble for the
network as a whole.

> So setting up timeouts of 24 hours is not doing any good to that.

Either you or I must be misunderstanding this 24-hour timeout. It only takes
full effect after Gtk-Gnutella has contacted all known UHCs 3 times. What
magic do you expect to happen? Sure, this might be overkill in some rare
cases, but then you can simply restart Gtk-Gnutella to get another round.
Maybe you should (have) check(ed) main.log more often. There are clearly a
couple of GTKGs that have problems getting connected (presumably due to a
mismatch of external/internal port) and ping the UHCs over and over again.
Fortunately, the number of GTKG users, and of GTKG users affected by such
problems, is sufficiently low for now. However, the caches are almost on
their knees already, while Gnutella is still supposed to grow.

Also, again and again, it's called "bootstrap". What business does any client
have with those caches *hours* after it was started and after it was actually
connected to several peers? There's really no point in contacting them ever
again once you've found a single running peer. Clients should reconnect for
more X-Try-* headers or, better yet, re-ping peers over UDP using GGEP SCP,
but they must not contact a cache again. And please don't tell me you're
worried about clients getting hit too often then. Obviously few care how
often a cache gets hit (in the worst case). GTKG is far too trigger-happy
with banning "unstable" IPs and throwing addresses away because some arcane
vendor limit is reached. I mean, it can cause any damage it likes to itself,
but this must not be achieved by utilizing and effectively wasting cache
resources. UHC also means that you ping peers from your pool for more
addresses; GTKG just throws them away for no reason!

We can all happily blame other clients for being 100x worse, but that doesn't
really help anyone. Oh, and I don't write this elsewhere because I've already
written most of it several times, and I know that those vendors don't give a
damn or just don't grasp it.

--
Christian
