Bill Pringlemeir wrote:
> On 18 Sep 2005, [EMAIL PROTECTED] wrote:
> > Don't misunderstand me: I really love HTTP, I know it well enough.
> > But it has an intrinsic overhead that I think is too large for GWCs.

> Ok, I didn't find too much on UHC. However, it seems to be a lot
> better than GWC because regular nodes on the network are used afaiu?
In theory, yes. Gtk-Gnutella doesn't really do that. However, that alone
doesn't really solve the bootstrap problem, and I don't think Gtk-Gnutella
ever needs to fall back to a cache once it has been connected. However, if
some condition makes it impossible for Gtk-Gnutella to connect to peers, it
will happily contact the caches over and over again. That's why I bumped
the UHC lock to 24 hours. Someone else reduced it to 10 minutes; in return,
I increased it back to at least an hour. You see, negotiating reasonable
timeouts works like a bazaar (just in case you thought it was based on
logic or something).

> So, I would propose a server that only handles "ip", "refer" and
> "vote". I am using the "ip" like the GWC concept. The mechanism
> could work like this,

> Client A connects to bootstrap server and supplies it's IP and port.
> A secret key (Sa) is sent to the client A.

That doesn't look like a good idea. The cache must connect to clients to
gather addresses, not vice versa; otherwise the cache will explode sooner
or later. In fact, updates currently outweigh all other requests in
frequency at GWebCaches, and with TCP that's a real problem.

Personally, I'd prefer it if you could describe your scheme without the
encryption (as you say, it's just a gimmick anyway); I'm not sure whether
I really understood your proposal.

Also, passing information about caches using client-to-client transfers
should be carefully reconsidered. If anyone implemented that to publish
GWebCache URLs, the sky would almost certainly come down. Probably not a
single vendor checked and normalized URLs, and most of them *still* don't.
I wouldn't be surprised at all if some clients considered blah.example.org
and BLAH.example.org to be different hosts and happily tried both variants
to "bootstrap". In the GWebCache system, the caches work as a filter - at
least some of them do.

One problem I see with publishing and collecting UHCs by hostname is that
someone could eventually point dozens or hundreds of them at a single IP
address (or a range of IP addresses) to bring a host or network down.

The central issue, however, is (just as in kindergarten) discipline. Few
clients (if any) truly use the caches as a bootstrap system. Most fall
back to them far too easily and do not restrict themselves to an
acceptable number of requests. You can only fix your own software; talking
to other developers is (usually) about as satisfying as a discussion with
a brick wall.

Anyway, your vote mechanism might be redundant. Gnutella peers are
constantly exchanging fresh peer addresses; this works in-band as well as
out-of-band. Peers can (and most do) indicate their average daily uptime
as well as UHC support. So basically, once you're on the net, the client
must simply collect addresses of peers with high uptimes (but also some
with lower uptimes, for more diversity). Then it should ping them once in
a while (out-of-band) to see whether they're still online. This way, your
local hostcache should almost always be usable even after being offline
for a week. However, I think this doesn't really solve the "bootstrap"
problem, and you cannot prevent peers from banging the caches (due to
sloppy coding or actually on purpose) anyway.
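Just to illustrate the idea, here is a rough sketch (in Python; the UDP
"PING" payload and the on-disk layout are placeholders, not real Gnutella
ping/pong messages) of such a local hostcache: keep mostly high-uptime
peers plus a few low-uptime ones for diversity, probe some of them out of
band once in a while, and drop whatever hasn't been seen for about a week.

# Hypothetical sketch of a local hostcache -- not actual gtk-gnutella code.
# Addresses and uptimes would really come from pongs / handshake headers.
import random
import socket
import time

class HostCache:
    def __init__(self):
        # addr (host, port) -> (advertised average daily uptime [s], last seen)
        self.peers = {}

    def add(self, addr, daily_uptime):
        self.peers[addr] = (daily_uptime, time.time())

    def candidates(self, n=20, diversity=0.25):
        """Mostly high-uptime peers, plus a few random ones for diversity."""
        ranked = sorted(self.peers, key=lambda a: self.peers[a][0], reverse=True)
        keep = ranked[:int(n * (1 - diversity))]
        rest = [a for a in ranked if a not in keep]
        keep += random.sample(rest, min(len(rest), n - len(keep)))
        return keep

    def probe(self, addr, timeout=5.0):
        """Out-of-band liveness check; any reply counts as 'still online'."""
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.settimeout(timeout)
        try:
            s.sendto(b"PING", addr)      # placeholder, not a real Gnutella ping
            s.recvfrom(1024)
            self.peers[addr] = (self.peers[addr][0], time.time())
            return True
        except OSError:
            return False
        finally:
            s.close()

    def expire(self, max_age=7 * 24 * 3600):
        """Drop entries that haven't been confirmed alive for about a week."""
        now = time.time()
        self.peers = {a: v for a, v in self.peers.items()
                      if now - v[1] < max_age}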
I considered using "cheap" web servers that would simply serve a hostcache
file (just a list of peer addresses). This wouldn't require any CGI or
custom server software. The file itself would be pushed out periodically by
the intelligently designed caches and would also contain a timestamp (as
you can't trust Last-Modified) so that clients don't use peer addresses
from stale caches. This way you could easily install hundreds of dumb
caches. Of course, access permissions could be a problem if those are not
your own servers but rather servers run by volunteers. Still, there are
countless ways to transfer files (SMTP, HTTP, HTTPS, FTP, something
custom), and something like every 10 minutes should be sufficient as an
update frequency.

Now something which is also very important: no system will ever scale if
clients fall back to *all* the caches they know about. Basically, if there
are 10000 caches, clients would bang 10000 caches instead of 20. Ok, the
time necessary to connect to all of them would buy you some delay, but in
the end it still sucks. Therefore, clients must really *give up* *xor* use
sufficiently large *exponential* delays when contacting caches. That may
sound simple, but how do you teach a brick?
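To make those last two points concrete, here is a minimal sketch (Python
again; the file format - a Unix timestamp on the first line, then one
ip:port per line - and all the constants are assumptions for illustration,
not a spec) of a client that rejects stale cache files by their embedded
timestamp and contacts caches with exponential delays before giving up:

# Minimal sketch, not a spec: the file format and constants are assumptions.
import time
import urllib.request

MAX_FILE_AGE = 3600        # ignore files whose embedded timestamp is older
BASE_DELAY = 60            # first retry delay in seconds
MAX_ATTEMPTS = 6           # then give up instead of hammering the caches

def fetch_hostcache(url, timeout=10):
    """Fetch one dumb cache file; return peer addresses, or [] if stale."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        lines = resp.read().decode("ascii", "replace").splitlines()
    if not lines:
        return []
    stamp = int(lines[0])                 # embedded timestamp, not Last-Modified
    if time.time() - stamp > MAX_FILE_AGE:
        return []                         # stale copy on a dumb mirror
    peers = []
    for line in lines[1:]:
        host, _, port = line.partition(":")
        if host and port.isdigit():
            peers.append((host, int(port)))
    return peers

def bootstrap(cache_urls):
    """Try a handful of caches with exponential delays, then give up."""
    if not cache_urls:
        return []
    delay = BASE_DELAY
    for attempt in range(MAX_ATTEMPTS):
        url = cache_urls[attempt % len(cache_urls)]  # a few caches, never all
        try:
            peers = fetch_hostcache(url)
            if peers:
                return peers
        except (OSError, ValueError):
            pass
        time.sleep(delay)
        delay *= 2                                   # exponential back-off
    return []                                        # give up; retry much later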
--
Christian
