Re: [freenet-dev] Re: QueryReject patterns and NGRouting
On Mon, Nov 10, 2003 at 02:34:48AM -0500, Ken Corson wrote:
> Toad wrote:
> > On Fri, Nov 07, 2003 at 10:29:45PM -0500, Zlatin Balevsky wrote:
> > > Toad wrote:
> > > > One radical solution: Remove the code to reject queries when the bandwidth limit is exceeded!
> > > which returns us in the state 5010-5018 where the node has accepted umpteen transfers, each going at snail speed. Making that prevalent across the network will be catastrophic.
> > Not with NGRouting IMNSHO.
> Well, certainly LESS with NGR. But we still should consider setting a limit on the number of simultaneous transfers (trailers) per network link / pair of peers. If both nodes agree on the number, then the requestor will recognize the conditions under which it should not query for more data, and when it is okay to resume Q's again...

Please stop saying this and read the COPIOUS discussion there has been on it FIRST.

> > > Another possibility is to go even further back in 5007-5010 where the node would still accept everything but wouldn't round-robin to the senders, which effectively meant the queries were served in a FIFO queue.
> Yes, we not only want to smooth the acceptance rate of incoming requests for the local node, we want to do it fairly for each requestor as well (smoothing per route). Round-robin is the perfect term here.
>
> Ken

-- 
Matthew J Toseland - [EMAIL PROTECTED]
Freenet Project Official Codemonkey - http://freenetproject.org/
ICTHUS - Nothing is impossible. Our Boss says so.

___ Devl mailing list [EMAIL PROTECTED] http://dodo.freenetproject.org/cgi-bin/mailman/listinfo/devl
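The round-robin fairness Ken endorses above can be sketched as a small scheduler that cycles across per-peer transfer queues instead of serving everything from one FIFO. This is an illustrative sketch only; the class and method names are invented, not taken from the Freenet codebase.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

// Round-robin over per-peer queues: each call to next() serves the first
// peer with pending work, then moves that peer to the back of the rotation,
// so no single requestor can monopolize the send slots (as happens with a
// single FIFO queue).
class RoundRobinScheduler {
    private final Map<String, Queue<String>> queues = new LinkedHashMap<>();

    void enqueue(String peer, String transfer) {
        queues.computeIfAbsent(peer, p -> new ArrayDeque<>()).add(transfer);
    }

    // Pop one transfer from the first non-empty peer queue, then rotate
    // that peer to the back so the other peers are served before it again.
    String next() {
        for (Map.Entry<String, Queue<String>> e : queues.entrySet()) {
            if (!e.getValue().isEmpty()) {
                String t = e.getValue().poll();
                Queue<String> q = queues.remove(e.getKey());
                queues.put(e.getKey(), q);
                return t;
            }
        }
        return null; // nothing pending from any peer
    }
}
```

With peers A (two transfers) and B (one), the scheduler alternates A, B, A rather than draining A first, which is the per-route smoothing Ken describes.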
Re: [freenet-dev] Re: QueryReject patterns and NGRouting
Martin Stone Davis wrote:
> Let me use a biological analogy: Say your goal is to have a whole bunch of babies (don't ask me why you'd want this). Also, say you're a woman from another planet (say, Mars), and you can actually get pregnant while you're pregnant. However, you can't have more than a certain number, since you can't poop them all out of your vagina that fast. So instead, the best strategy is to abort the most recently-conceived fetuses, and let those normal 3-headed Martian babies survive.

Martin- And people wonder why no women get involved in open source development?! Really, Martin, somehow I followed your wacky analogy of Jon Brock's suggestion (btw, real women are supposed to be from Venus). I personally did not find your Doctor-Patient scheduling analogy to be retarded at all. It is a little different from what I've been thinking, but most importantly, it was the first suggestion I've seen by anyone for how to implement active Q rate control between a pair of nodes, one that includes a punishment as incentive. People have often accused me of making far-out analogies! Now I feel strangely normal by comparison :O

Jon: on a more serious note, how do you know the upstream node is in any better condition to handle the request? This seems like avoiding our local responsibility... I suppose it is fair to say that if a node doesn't reject a request, that means it is willing to handle it. We could try to shift our responsibility. But the local node made the same commitment, first. In the time elapsed between accepting a request and receiving the response, our overall load may have swung against us.

We should attempt to control load at the link level (between peers) as well, perhaps by limiting simultaneous trailers between a pair of nodes. Apologies if this is already done and I just don't know it yet. The appropriate limit per link will be different for every pair of nodes, based on their bandwidth capacities and their numbers of peers.
ken
Re: [freenet-dev] Re: QueryReject patterns and NGRouting
Toad wrote:
> On Fri, Nov 07, 2003 at 10:29:45PM -0500, Zlatin Balevsky wrote:
> > Toad wrote:
> > > One radical solution: Remove the code to reject queries when the bandwidth limit is exceeded!
> > which returns us in the state 5010-5018 where the node has accepted umpteen transfers, each going at snail speed. Making that prevalent across the network will be catastrophic.
> Not with NGRouting IMNSHO.

Well, certainly LESS with NGR. But we still should consider setting a limit on the number of simultaneous transfers (trailers) per network link / pair of peers. If both nodes agree on the number, then the requestor will recognize the conditions under which it should not query for more data, and when it is okay to resume Q's again...

> > Another possibility is to go even further back in 5007-5010 where the node would still accept everything but wouldn't round-robin to the senders, which effectively meant the queries were served in a FIFO queue.

Yes, we not only want to smooth the acceptance rate of incoming requests for the local node, we want to do it fairly for each requestor as well (smoothing per route). Round-robin is the perfect term here.

Ken
Re: [freenet-dev] Re: QueryReject patterns and NGRouting
On Monday 10 November 2003 01:02 am, Ken Corson wrote:
> Jon: on a more serious note, how do you know the upstream node is in any better condition to handle the request? This seems like avoiding our local responsibility... I suppose it is fair to say that if a node doesn't reject a request, that means it is willing to handle it. We could try to shift our responsibility. But the local node made the same commitment, first. In the time elapsed between accepting a request and receiving the response, our overall load may have swung against us.
>
> We should attempt to control load at the link level (between peers) as well, perhaps by limiting simultaneous trailers between a pair of nodes. Apologies if this is already done and I just don't know it yet. The appropriate limit per link will be different for every pair of nodes, based on their bandwidth capacities and their numbers of peers.

Since you have to send out a message anyway, i.e. back to the node that made the request, it would be just as fast to send the request to the upstream node and tell it to send the data back to the first node. That way nobody has to establish a connection unless they get the data. However, you still have to establish a connection (very costly), and making that happen more often under load will not improve things. So it would be best if this were done only when the upstream node already had a connection to the downstream one. However, that is so rarely true it would hardly be of any use. So to make it more useful, this could be made the default behavior if the upstream node happens to notice that it has a connection to the node listed as the requester. Then, so the intermediate node knows the data actually was transferred and does not spend time sitting around waiting for it, the requesting node tells it if it gets the data. This would all work just fine. But it becomes a minor routing improvement and not a solution to load problems.
[freenet-dev] Re: QueryReject patterns and NGRouting
Jon Brock wrote:
> --- Toad [EMAIL PROTECTED] wrote:
> > Any other suggestions? Any detail as to why/how a particular option would work?
> Right now when the bandwidth limit is exceeded, all queries are rejected, right? What about if you just stop sending new files? Return a pointer to the node it thinks most likely has the file according to its routing table, or even where it got it from. If the resulting query is successful, the other node should be credited as well (for ngrouting purposes).

This would be self-defeating. The whole point is to have more successes. If we stop transferring a file (the devs call it a key), then we'll be reducing the chances of a success.

Let me use a biological analogy: Say your goal is to have a whole bunch of babies (don't ask me why you'd want this). Also, say you're a woman from another planet (say, Mars), and you can actually get pregnant while you're pregnant. However, you can't have more than a certain number, since you can't poop them all out of your vagina that fast. So instead, the best strategy is to abort the most recently-conceived fetuses, and let those normal 3-headed Martian babies survive. Your method would be to cut out that cute 20-fingered tyke from its mommy's belly and stick it into some surrogate mommy (or maybe even an incubator), risking the life (lives?) of little Tom, Dick, and Harry. Sure, you'd be able to have more live babies coming out of you, but why not let the other Martian women make their babies naturally? In the end, you'd end up with more babies, since you wouldn't be taking chances with them by performing the C-section.

> I guess the idea is that if bandwidth is used up, but cpu isn't, then only bandwidth-intensive things should be stopped. On the other hand, if cpu is used up and bandwidth isn't, and we get a request for a key we have, we should still answer it. Checking the local datastore for a key isn't very hard compared to routing, right?
> Another idea is to have a queue, and give priority to requests that are closer to one of the node's specialization areas. If all bandwidth is used up, but a new request comes in for a key that we have and is in our specialization, pause the most irrelevant transfer for a while and send that. Wouldn't that also cause ngrouting on the paused node to be less likely to route irrelevant requests later on? Then you would just need code that begins searching for other sources when the transfer drops below an acceptable speed.

I'm confused... Could you put that in terms of Martian babies?

-Martin
[freenet-dev] Re: QueryReject patterns and NGRouting
Toad wrote:
> Currently the situation, even with the recently integrated probabilistic rejection, is as follows: We start off with no load. We accept some queries. Eventually we use up our outbound bandwidth, and due to either messageSendTimeRequest or the output bandwidth limit, we reject queries until our currently transferring requests have been fulfilled. With our current running average code, at this point the node's pSearchFailed estimate will go through the floor, and it won't recover because it won't be routed to.
>
> Possible solutions proposed:
>
> 1. Try the nodes again after some fixed, perhaps increasing, backoff, once we are into QR mode. One way to do this is to abuse the pSearchFailed estimator as edt has suggested; another way would be to randomly fork requests occasionally such that each node in the RT is visited at least every N seconds as long as the node has some load. The search failed estimator will recover quite fast if it gets retried and is not queryrejecting.
>
> 2. Use a really long term average.
>
> 3. Have the node somehow guess when it will next be available for queries, and tell the requesting node, which then uses that as a backoff time. Somebody suggested this too, essentially. You could perhaps guesstimate it from the transfer rate... but sadly the transfer rate will vary over time.
>
> Any other suggestions? Any detail as to why/how a particular option would work?

Option 3 is close to my Doctor-Patient analogy where appointments are scheduled (see "Solving the QR problem with scheduled appointments"). There are two points to that strategy: The first is to benefit the requestees, by reducing the number of queries made down to an amount closer to that which the network can actually handle. The second is to benefit the requesters, by improving their ability to predict when another node will accept a query from them. I would approve of any of the other options if they could be made to achieve these two goals.
As far as I can see, options 1 and 2 benefit the requesters (probably 1 is better than 2 in this regard), but neither option helps the requestees. My plan of enforced scheduled appointments benefits both.

Before I get into detail on how the requestee (Doctor) node should conduct scheduling, let me revise the Appointment scheme. A simpler alternative to making appointments is to just worry about when the node is likely to be available, and not worry about when it will get busy again (i.e. closer to option 3 than my original proposal). In this case, the Doctor node responds to queries it rejects by saying, "Not now, and please don't disturb me again for at least x seconds." If a requesting node violates this command, then it gets penalized by the requestee, who will then tend to QR all requests from the offending node for a limited period of time.

Now, about how the Doctor should set x. Really, any plausible method would be fine. Toad, you suggested guesstimating it from the transfer rate, but then criticized it due to the transfer rate varying over time. But I say, guess away! It's okay to not have a perfect prediction of how long we will be busy, just as long as we stop requesters from sending us a bunch of queries during the times when we are almost certain not to be able to accept them.

Here's a method based on past experience only: Suppose the node notices that with x=0 (the current scheme), for 50% of the time, its QRs last for 5 seconds straight, and for the other 50%, they last 15 seconds. Then for half the requesting nodes, it would want to set x=5, and for the rest, set x=15. So, my proposal is for our node to set x according to past experience. If the node starts noticing periods of inactivity where it could accept queries, it can start reducing x. There are more details to be filled in here, but I'll stop and await your input.

-Martin
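Martin's "set x from past experience" rule could be sketched as follows, assuming the node logs how long each solid-QueryReject episode lasts and then hands each requester a "don't disturb me for x seconds" hint drawn from the observed distribution. The median/worst-case split mirrors the 5 s / 15 s example above; all names here are illustrative, not Freenet code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Record the duration of each observed QueryReject episode, then answer
// rejected requesters with a backoff hint x taken from the empirical
// distribution: half the requesters get the median duration, the other
// half the longest observed one.
class BackoffHint {
    private final List<Double> qrDurations = new ArrayList<>();

    void recordQrEpisode(double seconds) {
        qrDurations.add(seconds);
    }

    // Alternate requesters between the median and the worst case, so the
    // retry traffic is spread out rather than arriving all at once.
    double hintFor(int requesterIndex) {
        if (qrDurations.isEmpty()) return 0; // no history yet: x = 0
        List<Double> sorted = new ArrayList<>(qrDurations);
        Collections.sort(sorted);
        int idx = (requesterIndex % 2 == 0)
                ? (sorted.size() - 1) / 2   // median: x = 5 in the example
                : sorted.size() - 1;        // longest: x = 15 in the example
        return sorted.get(idx);
    }
}
```

If the node later notices idle periods where it could have accepted queries, it can shrink the recorded durations (or simply discard old ones), which is the "start reducing x" step Martin mentions.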
[freenet-dev] Re: QueryReject patterns and NGRouting
Martin Stone Davis wrote:
> Toad wrote:
> > Currently the situation, even with the recently integrated probabilistic rejection, is as follows: We start off with no load. We accept some queries. Eventually we use up our outbound bandwidth, and due to either messageSendTimeRequest or the output bandwidth limit, we reject queries until our currently transferring requests have been fulfilled. With our current running average code, at this point the node's pSearchFailed estimate will go through the floor, and it won't recover because it won't be routed to.
> >
> > Possible solutions proposed:
> >
> > 1. Try the nodes again after some fixed, perhaps increasing, backoff, once we are into QR mode. One way to do this is to abuse the pSearchFailed estimator as edt has suggested; another way would be to randomly fork requests occasionally such that each node in the RT is visited at least every N seconds as long as the node has some load. The search failed estimator will recover quite fast if it gets retried and is not queryrejecting.

Actually, perhaps we could easily adjust this solution to meet my goal of reducing the number of queries made: Any time the requestee notices that the requester is not backing off, it punishes the requester by QR:ing all of its requests for a limited period of time.

> > 2. Use a really long term average.

This won't work, since then the prediction will still hardly ever match reality. The reality is that the node is usually either 100% QR or 100% QA. The goals are to get requesters to predict *better* as well as reducing the number of queries to match capacity.

> > 3. Have the node somehow guess when it will next be available for queries, and tell the requesting node, which then uses that as a backoff time. Somebody suggested this too, essentially. You could perhaps guesstimate it from the transfer rate... but sadly the transfer rate will vary over time.
> >
> > Any other suggestions? Any detail as to why/how a particular option would work?
[snipped: my expansion of option 3]

-Martin
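The "fixed, perhaps increasing, backoff" of option 1 discussed above could be sketched as a small piece of per-node state kept by the requesting side: double the wait after each QueryReject, reset it on an accepted query. The constants and names are invented for illustration; this is not the node's actual pSearchFailed estimator code.

```java
// Per-remote-node backoff state on the requesting side: after each
// QueryReject the wait before retrying that node doubles (capped), and an
// accepted query clears the backoff so the node is routed to normally again.
class BackoffTracker {
    private static final long INITIAL_MS = 1000;  // first retry after 1 s (assumed)
    private static final long MAX_MS = 60_000;    // cap the wait at one minute (assumed)

    private long backoffMillis = 0;
    private long retryAt = 0;

    void onQueryRejected(long nowMillis) {
        backoffMillis = (backoffMillis == 0)
                ? INITIAL_MS
                : Math.min(backoffMillis * 2, MAX_MS);
        retryAt = nowMillis + backoffMillis;
    }

    void onQueryAccepted() {
        backoffMillis = 0;  // node recovered; no backoff in force
        retryAt = 0;
    }

    // Routing should skip this node until the backoff window has elapsed.
    boolean shouldRoute(long nowMillis) {
        return nowMillis >= retryAt;
    }
}
```

Because the reset happens on the first accepted query, a node that has stopped QueryRejecting recovers immediately, which matches Toad's observation that the search-failed estimator recovers quickly once retried.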
Re: [freenet-dev] Re: QueryReject patterns and NGRouting
Zlatin Balevsky wrote:
> So far NGRouting hasn't demonstrated ability to figure out its head from its arse (pardon my french). Why not remove all code but NGRouting and let it figure out everything on its own, including world hunger and price of fish /sarcasm

Possibly, but my feeling is that there could still be "route to the slowest node"-style bugs in the NGR code. With this in mind, it is great that Tom Kaitchuck and others are investigating that code carefully.

Ian.
Re: [freenet-dev] Re: QueryReject patterns and NGRouting
On November 07, 2003 10:29 pm, Zlatin Balevsky wrote:
> > One radical solution: Remove the code to reject queries when the bandwidth limit is exceeded!
> which returns us in the state 5010-5018 where the node has accepted umpteen transfers, each going at snail speed. Making that prevalent across the network will be catastrophic.

I see that now, with just the probabilistic rejection.

> Another possibility is to go even further back in 5007-5010 where the node would still accept everything but wouldn't round-robin to the senders, which effectively meant the queries were served in a FIFO queue.

What I have tried previously is to queue trailers (in PeerHandler) and not start sending them until we have bandwidth. To make this work I had to set up a variant of decayingRunningAverage that used a 5-10 second window to predict bandwidth. Using the current bandwidth numbers, based on a 1 minute window, led to 'deadtime' where we did not send at all... With the above code in place I would normally have 8-15 transmitting connections and a queue of 1-4M. Which seems much more reasonable given that I use a 10K limit.

Ed
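Ed's short-window bandwidth predictor could look something like the following: an exponentially decaying average whose half-life is a few seconds, so the estimate tracks recent throughput instead of a one-minute window. This is a sketch under that assumption, with invented names; it is not the actual decayingRunningAverage implementation.

```java
// Exponentially decaying running average of observed throughput. The
// weight given to old data halves every halfLifeMillis, so with a
// half-life of a few seconds the estimate follows a 5-10 second window
// of recent behaviour rather than a one-minute average.
class ShortWindowAverage {
    private final double halfLifeMillis;
    private double value = 0;
    private long lastUpdate = -1;

    ShortWindowAverage(double halfLifeMillis) {
        this.halfLifeMillis = halfLifeMillis;
    }

    // Blend in a new bytes-per-second sample, discounting the old estimate
    // by how much time has passed since the previous sample.
    void report(double bytesPerSecond, long nowMillis) {
        if (lastUpdate < 0) {
            value = bytesPerSecond; // first sample: take it as-is
        } else {
            double age = nowMillis - lastUpdate;
            double keep = Math.pow(0.5, age / halfLifeMillis);
            value = keep * value + (1 - keep) * bytesPerSecond;
        }
        lastUpdate = nowMillis;
    }

    double currentEstimate() {
        return value;
    }
}
```

A trailer queue would then release the next queued transfer only when `currentEstimate()` says there is spare bandwidth under the output limit, avoiding the 'deadtime' Ed describes with a one-minute window.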
Re: [freenet-dev] Re: QueryReject patterns and NGRouting
On November 08, 2003 04:54 am, Martin Stone Davis wrote:

If we try this, here is a way it could work. We use the annulment (however it is spelt) field in the QR message to transmit sendqueuesize/outputbandwidthlimit, or sendqueuesize/outputbandwidthused if there is no limit. In the standard node estimator we add a new running average. It predicts the probability that a node will fail a request after the annulment time has passed. We replace pSf with this once the annulment time a node passes us expires. Tracking this new probability (pAfterFailed) should tell us if the scheme is working.

Ed

> Toad wrote:
> > Currently the situation, even with the recently integrated probabilistic rejection, is as follows: We start off with no load. We accept some queries. Eventually we use up our outbound bandwidth, and due to either messageSendTimeRequest or the output bandwidth limit, we reject queries until our currently transferring requests have been fulfilled. With our current running average code, at this point the node's pSearchFailed estimate will go through the floor, and it won't recover because it won't be routed to.
> >
> > Possible solutions proposed:
> >
> > 1. Try the nodes again after some fixed, perhaps increasing, backoff, once we are into QR mode. One way to do this is to abuse the pSearchFailed estimator as edt has suggested; another way would be to randomly fork requests occasionally such that each node in the RT is visited at least every N seconds as long as the node has some load. The search failed estimator will recover quite fast if it gets retried and is not queryrejecting.
> >
> > 2. Use a really long term average.
> >
> > 3. Have the node somehow guess when it will next be available for queries, and tell the requesting node, which then uses that as a backoff time. Somebody suggested this too, essentially. You could perhaps guesstimate it from the transfer rate... but sadly the transfer rate will vary over time.
> >
> > Any other suggestions? Any detail as to why/how a particular option would work?
> Option 3 is close to my Doctor-Patient analogy where appointments are scheduled (see "Solving the QR problem with scheduled appointments"). There are two points to that strategy: The first is to benefit the requestees, by reducing the number of queries made down to an amount closer to that which the network can actually handle. The second is to benefit the requesters, by improving their ability to predict when another node will accept a query from them. I would approve of any of the other options if they could be made to achieve these two goals.
>
> As far as I can see, options 1 and 2 benefit the requesters (probably 1 is better than 2 in this regard), but neither option helps the requestees. My plan of enforced scheduled appointments benefits both.
>
> Before I get into detail on how the requestee (Doctor) node should conduct scheduling, let me revise the Appointment scheme. A simpler alternative to making appointments is to just worry about when the node is likely to be available, and not worry about when it will get busy again (i.e. closer to option 3 than my original proposal). In this case, the Doctor node responds to queries it rejects by saying, "Not now, and please don't disturb me again for at least x seconds." If a requesting node violates this command, then it gets penalized by the requestee, who will then tend to QR all requests from the offending node for a limited period of time.
>
> Now, about how the Doctor should set x. Really, any plausible method would be fine. Toad, you suggested guesstimating it from the transfer rate, but then criticized it due to the transfer rate varying over time. But I say, guess away! It's okay to not have a perfect prediction of how long we will be busy, just as long as we stop requesters from sending us a bunch of queries during the times when we are almost certain not to be able to accept them.
> Here's a method based on past experience only: Suppose the node notices that with x=0 (the current scheme), for 50% of the time, its QRs last for 5 seconds straight, and for the other 50%, they last 15 seconds. Then for half the requesting nodes, it would want to set x=5, and for the rest, set x=15. So, my proposal is for our node to set x according to past experience. If the node starts noticing periods of inactivity where it could accept queries, it can start reducing x. There are more details to be filled in here, but I'll stop and await your input.
>
> -Martin
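Ed's pAfterFailed idea above could be sketched as a second running average that is updated, and consulted, only once a node's announced annulment time has expired, with pSearchFailed used before that point. The fixed-weight averaging rule and all names here are illustrative, not the node's actual estimator code.

```java
// Two failure estimators for a single remote node: the usual pSearchFailed,
// plus pAfterFailed, which only tracks requests routed after the node's
// announced annulment time has passed. Routing consults whichever applies
// at the current time.
class FailureEstimator {
    private double pSearchFailed = 0.5; // neutral priors (assumed)
    private double pAfterFailed = 0.5;
    private long annulmentExpiry = 0;   // when the node said it would be free

    void setAnnulment(long expiryMillis) {
        annulmentExpiry = expiryMillis;
    }

    // Fixed-weight running average update for whichever estimator the
    // outcome belongs to, based on when the request was made.
    void reportOutcome(boolean failed, long nowMillis) {
        double sample = failed ? 1.0 : 0.0;
        if (nowMillis >= annulmentExpiry) {
            pAfterFailed = 0.9 * pAfterFailed + 0.1 * sample;
        } else {
            pSearchFailed = 0.9 * pSearchFailed + 0.1 * sample;
        }
    }

    // Once the annulment time expires, replace pSf with pAfterFailed.
    double pFailure(long nowMillis) {
        return nowMillis >= annulmentExpiry ? pAfterFailed : pSearchFailed;
    }
}
```

Comparing the two averages over time is exactly the check Ed proposes: if pAfterFailed stays well below pSearchFailed, honouring the annulment hint is paying off.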
Re: [freenet-dev] Re: QueryReject patterns and NGRouting
On Fri, Nov 07, 2003 at 10:29:45PM -0500, Zlatin Balevsky wrote:
> > One radical solution: Remove the code to reject queries when the bandwidth limit is exceeded!
> which returns us in the state 5010-5018 where the node has accepted umpteen transfers, each going at snail speed. Making that prevalent across the network will be catastrophic.

Not with NGRouting IMNSHO.

> Another possibility is to go even further back in 5007-5010 where the node would still accept everything but wouldn't round-robin to the senders, which effectively meant the queries were served in a FIFO queue.

Heh.

> > NGRouting can figure out when nodes are slow due to long term overload a lot more easily and less alchemically than it can deal with query rejections. Or so the theory goes. Am I smoking crack here?
> So far NGRouting hasn't demonstrated ability to figure out its head from its arse (pardon my french). Why not remove all code but NGRouting and let it figure out everything on its own, including world hunger and price of fish /sarcasm

Sarcastic remark snipped.
Re: [freenet-dev] Re: QueryReject patterns and NGRouting
On Sat, Nov 08, 2003 at 12:21:01PM +, Ian Clarke wrote:
> Zlatin Balevsky wrote:
> > So far NGRouting hasn't demonstrated ability to figure out its head from its arse (pardon my french). Why not remove all code but NGRouting and let it figure out everything on its own, including world hunger and price of fish /sarcasm
> Possibly, but my feeling is that there could still be "route to the slowest node"-style bugs in the NGR code. With this in mind, it is great that Tom Kaitchuck and others are investigating that code carefully.

I don't think so, because the last estimate seems to more or less correlate to the order by transfers on the node status page. But it's possible there are major NGR bugs - there was one with transfer rates fixed recently. However, I am pretty sure that our current QueryReject behaviour is not conducive to NGRouting working well.

> Ian.
[freenet-dev] Re: QueryReject patterns and NGRouting
Ian Clarke wrote:
> Toad wrote:
> > Currently the situation, even with the recently integrated probabilistic rejection, is as follows: We start off with no load. We accept some queries. Eventually we use up our outbound bandwidth, and due to either messageSendTimeRequest or the output bandwidth limit, we reject queries until our currently transferring requests have been fulfilled.
> Your solutions all seem to be addressing the wrong end of this problem by accepting that this situation will happen and trying to make NGR deal with it - but the real question is: Why are nodes getting themselves into a situation where they have to QR solidly for periods of time? The QRing should gradually increase with time such that the situation described above doesn't occur.

Because nodes learn that we're overloaded far slower than we _get_ overloaded. Probabilistic QR should help some, but I think the best thing we've got is Ian's idea (which is not far off from Martin's): Let nodes know what we can handle, and then give them reasons not to try to send us (much) more than that. No, it's not entirely leech-proof, but it's still a good bit better than we've got. What we have here is a failure to communicate. ;)

--hobbs
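Ian's requirement that "QRing should gradually increase" could be sketched as a rejection probability that ramps smoothly with the node's load fraction, instead of flipping from accept-everything to reject-everything when the bandwidth limit trips. The 50% knee below is an invented threshold, not a value from the Freenet codebase.

```java
import java.util.Random;

// Probabilistic QueryReject that ramps with load: below half capacity we
// reject nothing; between half and full capacity the rejection probability
// rises linearly to 1.0, so incoming queries are throttled gradually
// rather than in an all-or-nothing swing.
class GradualRejector {
    private final Random random;

    GradualRejector(Random random) {
        this.random = random;
    }

    // loadFraction is current usage divided by the bandwidth limit.
    static double rejectProbability(double loadFraction) {
        if (loadFraction <= 0.5) return 0;
        if (loadFraction >= 1.0) return 1;
        return (loadFraction - 0.5) / 0.5; // linear ramp over the top half
    }

    boolean shouldReject(double loadFraction) {
        return random.nextDouble() < rejectProbability(loadFraction);
    }
}
```

The gradual ramp is what lets remote NGRouting estimators see a smoothly worsening node rather than one that is 100% QA one minute and 100% QR the next, which is exactly the prediction problem Martin describes.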
[freenet-dev] Re: QueryReject patterns and NGRouting
> One radical solution: Remove the code to reject queries when the bandwidth limit is exceeded!

which returns us in the state 5010-5018 where the node has accepted umpteen transfers, each going at snail speed. Making that prevalent across the network will be catastrophic.

Another possibility is to go even further back in 5007-5010 where the node would still accept everything but wouldn't round-robin to the senders, which effectively meant the queries were served in a FIFO queue.

> NGRouting can figure out when nodes are slow due to long term overload a lot more easily and less alchemically than it can deal with query rejections. Or so the theory goes. Am I smoking crack here?

So far NGRouting hasn't demonstrated ability to figure out its head from its arse (pardon my french). Why not remove all code but NGRouting and let it figure out everything on its own, including world hunger and price of fish /sarcasm