Hi,

are there any plans to support GWC2? The protocol is quite brain-damaged
because you can't always tell the two versions apart from the GWC side. I
couldn't care less, but 40% or so of all GWCs are already V2.0. It's somewhat
interesting that this has been ignored for about one year by the GDF
now. I've seen some reasonable comments by Philippe Verbosity and he
obviously doesn't like the V2.0 specs either. I've noted no progress
on this to improve things whatsoever, though.

Also, I've noticed that the vast majority of GWC requests do not contain
an IP address (or, more importantly, a port value). That's of course
because GWCs are usually contacted while booting the servent, so it
doesn't ``know'' whether it's firewalled or NATed and behaves quite
anxiously. However, most servents probably already have an idea of
their IP address when contacting a GWC because they have (unsuccessfully)
tried to connect to other peers. What about doing a self-connect test
and, if the IP matches, assuming it's not firewalled? You can also use
the status of previous sessions to estimate the likelihood of being
firewalled. Thus, if the calculated probability of being
firewalled is low enough, the peer should send its IP address and
port anyway. I think this would push many more fresh addresses to
the caches. I suppose these ``cheats'' would also help a little those
who keep reporting ``GTKG thinks I'm firewalled but I ain't, damnit''.
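
To make the idea a bit more concrete, here's a rough sketch in Python
(not GTKG code; the function names, the weights and the 0.2 threshold
are all made up by me, and the real heuristics would need tuning):

```python
def firewalled_probability(past_sessions, self_connect_ok=None):
    """Rough estimate of the chance that we are firewalled.

    past_sessions:   list of booleans, True if that session turned out
                     to be firewalled (hypothetical per-session record).
    self_connect_ok: True/False result of a self-connect test,
                     or None if no test was run.
    """
    firewalled = sum(1 for s in past_sessions if s)
    # Laplace smoothing: with no history at all this yields 0.5.
    p = (firewalled + 1) / (len(past_sessions) + 2)
    # The 0.25 weight is an arbitrary assumption, not a measured value.
    if self_connect_ok is True:
        p *= 0.25
    elif self_connect_ok is False:
        p = 1 - (1 - p) * 0.25
    return p


def should_send_address(past_sessions, self_connect_ok=None, threshold=0.2):
    """Send IP and port to the GWC only if we are probably reachable."""
    return firewalled_probability(past_sessions, self_connect_ok) <= threshold
```

With no history and no self-connect test the estimate stays at 0.5, so
the servent keeps quiet exactly as it does today; only a clean history
or a successful self-connect pushes it below the threshold.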

The GWC specs recommend that peers send this information after an
appropriate online duration but I assume this would currently overload
them pretty soon. It doesn't look like a lot of servents are doing
this. I don't know whether GTKG does.

I don't know how GTKG chooses the GWC URL if it sends one in a GWC
request but I'd think it's a good idea to send one which is known to
work. I'm not sure, though; the idea isn't mine, it's in the specs. It's just
that I've seen some lamers propagating bogus URLs pointing to something
which is obviously not a GWC. Last but not least, I suggest not following
URLs matching .*\.htm.*. That's usually a generic 404 page of a discontinued
GWC and such a URL would usually contain static information anyway.
(Of course, it could be *anything* - I'm talking about probabilities
here.)
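
The filter itself is trivial. A minimal sketch in Python, using the
pattern from above; the helper name is hypothetical and matching
case-insensitively is my own assumption:

```python
import re

# URLs that look like static HTML pages rather than GWC scripts.
STATIC_PAGE = re.compile(r".*\.htm.*", re.IGNORECASE)


def worth_following(url):
    """Return False for URLs that are probably not a live GWC."""
    return STATIC_PAGE.fullmatch(url) is None
```

This would reject something like http://example.com/404.html but let
http://example.com/gwc.php through.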

It could also be worthwhile to check out how efficient GWCs really are,
i.e., how many of their peers and URLs actually work. Maybe GTKG
could collect some stats about that. And BTW, GTKG should also keep
a negative list for GWCs. Otherwise, it'll probably retry
known-to-be-dead GWCs session after session. Or am I missing something?
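
What I have in mind for the negative list is roughly the following
(Python sketch, not GTKG code; the class name, the JSON file and the
one-week retry interval are my own assumptions, and letting entries
expire is just one possible policy):

```python
import json
import time


class GwcNegativeList:
    """Remembers dead GWC URLs across sessions."""

    def __init__(self, path, retry_after=7 * 24 * 3600):
        self.path = path
        self.retry_after = retry_after  # seconds before a dead GWC is retried
        try:
            with open(path) as f:
                self.dead = json.load(f)  # maps URL to time of last failure
        except (OSError, ValueError):
            # No file yet, or an unreadable one: start with an empty list.
            self.dead = {}

    def mark_dead(self, url, now=None):
        self.dead[url] = time.time() if now is None else now

    def mark_alive(self, url):
        self.dead.pop(url, None)

    def is_blocked(self, url, now=None):
        """True if the URL failed recently and should not be contacted."""
        now = time.time() if now is None else now
        failed_at = self.dead.get(url)
        return failed_at is not None and now - failed_at < self.retry_after

    def save(self):
        with open(self.path, "w") as f:
            json.dump(self.dead, f)
```

The servent would call is_blocked() before contacting a cache and
save() on shutdown, so a dead GWC costs one failed attempt per retry
interval instead of one per session.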

You might further want to hit the spec guys with a cluebat to support
hostnames besides IP addresses. GWC2 might already support this because
you can store arbitrary information along with an IP address or URL
but I don't know whether or how many caches actually support this. And
if it's not standardized it's pretty unreliable anyway.

Mayhap the usual suspect wants to launch GWC-sane?

-- 
Christian
 
But you probably won't listen to me anyway, will you?
