On Jun 2, 2006, at 8:18 AM, Matthew Toseland wrote:
> On Fri, Jun 02, 2006 at 08:13:55AM -0400, Colin Davis wrote:
>>
>> I think the first thing to do is to add a warning, when the node is
>> spending more than XXX% of it's bandwidth/time checking node status.
>
> Possibly.

Great. This is fairly trivial, but it lets people know that they are
being stupid.

>>
>> The second is to change the delay between checks to see if a node is
>> back up.
>
> No. Some NATs have tunnel timeouts of under 30 seconds. We cannot
> therefore increase the delay over that amount. That is the unfortunate
> reality; everyone is NATted, and it's a PITA, but that's life.

Forgive my ignorance, but it sounds like we're confusing two distinct
issues.

The first issue is maintaining a connection to a node. For this, I
understand that a heartbeat needs to be sent out every ~30 secs, to
keep the connection running through the NAT punchthrough.

The second is determining whether the node is up at all. I don't see
any reason this has to be the same heartbeat, with the same frequency.
If I ask each of my disconnected nodes whether it is up (and tell it
that I am up) only once an hour, but do it immediately on startup, that
doesn't affect the heartbeat above.

Example: Nodes A, B, C and D.

Nodes A and B are running 24/7. They exchange heartbeats every 30 secs,
and punch through NATs.

Nodes C and D are transient. They run when the user has free bandwidth,
etc. There is no reason that A and B should be trying to connect to
these machines every 30 secs.

If C and D send a connection request to all the nodes on their list on
startup, telling them that they are up and trying to establish the
connection, then they are OK. The reply gets back: since it arrives
within 30 secs, it goes through the passthrough.

A and B can gradually increase the interval between their checks for C
and D. If they don't get a reply after 30 secs, they increase it to a
minute. If they don't get a reply after a minute, they increase it to
10 minutes, etc. Eventually, they hit a max set in freenet.ini, and
don't check any less frequently than that (i.e., always check at least
once an hour).

But if node C or D comes back online, it sends a connection request to
A and B /anyway/! The fact that A and B aren't checking for another
hour doesn't hurt them.

What am I missing?
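
To make the proposal concrete, here is a rough sketch of the back-off
for probing disconnected nodes. It is only an illustration: the class
and method names are made up, the 30 s / 1 min / 10 min steps are the
ones from my example, and the one-hour cap stands in for whatever
maximum would be set in freenet.ini. It does not touch the ~30 sec
keepalive used for nodes that are already connected.

```python
# Hypothetical sketch of the proposed probe back-off; not actual Freenet code.
# Step values come from the example in this mail; the cap is an assumed
# stand-in for a maximum configured in freenet.ini.
BACKOFF_STEPS = [30, 60, 600]  # seconds: 30 s, 1 min, 10 min
MAX_INTERVAL = 3600            # seconds: always probe at least once an hour


class PeerProbe:
    """Tracks how long to wait before re-checking one disconnected peer."""

    def __init__(self, steps=BACKOFF_STEPS, max_interval=MAX_INTERVAL):
        # Never let a step exceed the configured maximum.
        self.steps = [min(s, max_interval) for s in steps]
        self.max_interval = max_interval
        self.failures = 0

    def next_interval(self):
        """Seconds to wait before the next probe of this peer."""
        if self.failures < len(self.steps):
            return self.steps[self.failures]
        return self.max_interval

    def on_no_reply(self):
        """The probe went unanswered: move to the next, longer interval."""
        self.failures += 1

    def on_contact(self):
        """The peer answered, or contacted us on its own startup: reset."""
        self.failures = 0
```

The point of on_contact() is that a transient node announcing itself at
startup resets the schedule straight away, so a long probe interval on
A and B never delays the reconnection.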
