On 10/7/06, Zoran Vasiljevic <[EMAIL PROTECTED]> wrote:

On 07.10.2006, at 00:39, Stephen Deasey wrote:

>
> But the call may take longer than the caller budgeted for, due to all
> the hidden timeouts, which are additive.
>

True.

> So the caller's time budget is 5 seconds, and that's what they pass to
> -evaltimeout. But by default both the sendtimeout and recvtimeout are
> 5 seconds. So the total time spent on a successful call to ns_proxy
> eval could be over 15 seconds, which is 3x the time budget.
>
> The time budget is a single target value. In the future, for the
> majority of users, this is going to be set per URL.  For /checkout you
> may allow up to 60 seconds to serve the page before deciding you're
> overloaded. For /ads you may give up much sooner. Your server is busy
> and you need to shed load, so you shed the least important traffic.
>
> For a single page with some time budget, which depends on the URL,
> some of it may be used up in a call to ns_cache_eval before there is a
> chance to call ns_proxy eval. I.e., the time budget is pretty dynamic.
>
> I don't see how the multiple fine-grained settings of ns_proxy can be
> used effectively in the simple case of a web page with a time budget
> which runs multiple commands with timeouts.
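The additive worst case quoted above works out like this. A trivial sketch; the variable names mirror the timeouts discussed in the thread and are illustrative, not actual ns_proxy configuration names:

```python
# Worst case for one ns_proxy eval when each stage has its own timeout.
# Values follow the example in the thread: a 5 s caller budget, with
# 5 s defaults for the send and receive stages as well.
sendtimeout = 5.0   # waiting to write the script to the slave
evaltimeout = 5.0   # waiting for the slave to evaluate it
recvtimeout = 5.0   # waiting to read the result back

worst_case = sendtimeout + evaltimeout + recvtimeout
print(worst_case)   # 15.0 -- 3x the 5 s the caller budgeted
```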

How would you handle this issue then?
Given you have a 5-second budget to run the proxy command,
and the fact that running the command involves round-trips
to some remote proxy on another host, how would you implement
that? You have to send the data to the proxy and then you need
to wait for all the data to come back. Would you break a
potentially valid request because it needs 5.01 secs to get the
last byte transferred back, just because your total limit was set to
5 seconds?

If you can give a good solution, I will implement that immediately.


The caller doesn't have a time budget for executing the code in the
slave, they have a budget for sending the code, executing it, and
receiving the result. So yes, if it takes 5.01 secs with one byte
remaining, you fail. No crystal ball.

Exactly the same problem arises if you have an additional timeout of
1sec for receiving the result. What if it takes 1.01 sec with one byte
remaining?  Where do you draw the line? The difference is that now
you've implicitly stated that your time budget is 6 secs, but you're
less flexible because you've partitioned it. Increasing the original
time budget to 6 secs would have exactly the same effect, but avoid
spurious errors due to timeouts on one counter with time remaining on
the other.
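A minimal sketch of this single-budget approach, assuming hypothetical send/run/recv helpers. This is not the ns_proxy implementation, just an illustration of deriving each stage's timeout from one shared deadline:

```python
import time

def remaining(deadline):
    """Return time left until the overall deadline, or raise if spent."""
    left = deadline - time.monotonic()
    if left <= 0:
        raise TimeoutError("overall time budget exhausted")
    return left

def proxy_eval(send, run, recv, budget):
    """Run three blocking stages against one shared deadline.

    send/run/recv are callables taking a timeout argument; each stage
    gets whatever is left of the single budget, not its own counter.
    (Hypothetical helpers for illustration -- not the ns_proxy API.)
    """
    deadline = time.monotonic() + budget
    send(remaining(deadline))          # write the script to the slave
    run(remaining(deadline))           # wait for evaluation
    return recv(remaining(deadline))   # read back the result
```

Each blocking stage gets only what is left of the one budget, so the caller's 5 seconds is a true upper bound instead of being multiplied by the number of internal waits.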


But if we make this change, then we need no per-stage error counters,
because they will make no sense. Effectively
we have a budget of -evaltimeout which is divided across all
possible points where we will/must wait. A timeout expiring at
any of these points no longer carries any meaningful information.
Right?


No, you still need to count each error type.  The caller of the code
can't do much at the time to solve the problem, but someone needs to
solve it, and to do that you need information.

So for example, if calls were timing out while sending code to the
slave, you wouldn't tune the code in the slave to run faster, because
that's not the problem.  If you're timing out in the mutex wait for a slave
handle, maybe the pool size needs to be increased.  Perhaps the fact
that *some* bytes have been received but the receive timed out can be
used to distinguish the case where a slave successfully executes but
the comm channel fails?

The more useful info we can gather the better, I think.
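A sketch of such per-stage accounting, with purely illustrative stage names (this is not the actual ns_proxy statistics interface):

```python
from collections import Counter

# Hypothetical per-stage timeout counters; each stage name points an
# operator at a different remedy, as discussed above.
errors = Counter()

def record_timeout(stage):
    """Count where a timeout happened so it can be diagnosed later."""
    errors[stage] += 1

record_timeout("getwait")    # no free slave handle: grow the pool?
record_timeout("sendwait")   # couldn't write the script: comm problem
record_timeout("evalwait")   # slave too slow: tune the script
record_timeout("recvwait")   # partial result read: channel failure?
```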
