On 10/4/06, Zoran Vasiljevic <[EMAIL PROTECTED]> wrote:

On 03.10.2006, at 01:01, Stephen Deasey wrote:
>
>
> I was also wondering about the ns_proxy send/wait/receive.  Why are
> wait and receive separate commands?

Because you can wait, get the timeout, do something and then go repeat
waiting. It makes sense. You can't achieve this with a longer wait
timeout OR with a timed [ns_proxy eval].
All right, you can argue: one can [ns_proxy receive proxy ?timeout?]
in which case you have the same behaviour. Correct. But what difference
would it make, really?


Two commands make it twice as hard to use. All that's needed is a
timeout switch:

   set result [ns_proxy wait -timeout 10 $handle]



> Also, do there need to be so many timeouts?  The waittimeout is 100
> msec. That means if all else goes well, but the wait time takes 101
> msec, the evaluation will not be successful. But looking at the other
> defaults, the script evaluator was prepared to wait (gettimeout 5000 +
> evaltimeout infinite + sendtimeout 1000 + recvtimeout 1000), or
> between six seconds and forever...  Wouldn't a single timeout do?

There are lots of "places" where something can go "wrong" in the
communication path. Hence so many timeouts. At every place you send
something or receive something, there is a timeout. The timeout for
sending a chunk of data to the proxy isn't the same as the timeout for
waiting for the proxy to respond after feeding it some command to
execute. Basically, one can hardwire "sane" values for those
communication timeouts (and there are sane values set there as
defaults) but somebody may need to adjust them at runtime. You however
do not need to do that, as sane defaults are provided everywhere.


The caller has a time budget. That's the total amount of time they're
prepared to wait for a result.

The underlying implementation may break the process down into
sub-tasks, but the caller doesn't really care or know about this. If
you look at the sub-tasks you might be able to say what a reasonable
timeout might be.  But note: this is an optimisation. A single time
budget works for all the different sub-tasks; a special timeout for
some sub-task only allows that task to *fail* quicker.

It's not free though. You get the odd effect of failing with a timeout
when there's plenty of time left in the budget.
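
To make the single-budget idea concrete, here is a rough sketch in
plain Tcl. It is not how ns_proxy is implemented: run_with_budget and
the step names are made up for illustration, and the steps are
stand-ins for get/send/receive. One deadline is computed up front and
each sub-task is only told how much time is left:

```tcl
# Hypothetical helper (not part of ns_proxy): run named sub-task scripts
# against one shared deadline. Each script sees the remaining time in
# milliseconds in the variable "remaining".
proc run_with_budget {budgetMs steps} {
    set deadline [expr {[clock milliseconds] + $budgetMs}]
    set results {}
    foreach {name script} $steps {
        set remaining [expr {$deadline - [clock milliseconds]}]
        if {$remaining <= 0} {
            # One generic code; the step name is kept as extra detail.
            return -code error -errorcode [list NS_TIMEOUT $name] \
                "time budget exhausted before step \"$name\""
        }
        lappend results [apply [list {remaining} $script] $remaining]
    }
    return $results
}

# Usage: three stand-in steps sharing one 5000 ms budget.
set out [run_with_budget 5000 {
    get  {return "got handle"}
    send {return "sent script"}
    recv {return "result"}
}]
```

The sketch only checks the deadline between steps; a real step would
have to pass "remaining" on to whatever blocking call it makes. Putting
the step name in the second element of the errorCode is one possible
way to keep a single NS_TIMEOUT code without throwing away where the
time ran out.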

The error handling is also weird. As it's currently implemented,
there's a different error code for each kind of timeout failure.  The
caller is forced to deal with all the different ways a timeout might
occur. With a generic NS_TIMEOUT errorCode this can be skipped, but
now you're losing information.

I think it needs to keep stats on the different failures, with a
ns_proxy_stats command to track it. This is the data you will use to
help you size the pool according to load and server ability.

It's interesting to note that an individual error may not actually be
an error. The goal is to size the pool according to resources
available for maximum performance. If a caller times out because there
are no handles, well maybe the system is doing its job?

On the other hand, if 80% of the callers are failing due to timeout,
well then you have a problem. Maybe your pool is undersized, or maybe
your server is overloaded.  It's the percentage of failures which
determines whether there's a problem with the system.
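
Something like the following is what I have in mind for the
bookkeeping, again only as a sketch in plain Tcl. The proc names and
the dict layout are invented here (there is no ns_proxy_stats command
yet), and the outcome labels just reuse the existing timeout names:

```tcl
# Hypothetical per-pool counters; names and layout are illustrative only.
# Records one call outcome ("ok", or a failure kind such as "gettimeout").
proc stats_record {statsVar outcome} {
    upvar 1 $statsVar stats
    dict incr stats total
    dict incr stats $outcome
}

# Percentage of recorded calls that did not end in "ok".
proc stats_failure_pct {stats} {
    set total [expr {[dict exists $stats total] ? [dict get $stats total] : 0}]
    if {$total == 0} { return 0.0 }
    set ok [expr {[dict exists $stats ok] ? [dict get $stats ok] : 0}]
    return [expr {100.0 * ($total - $ok) / $total}]
}

# Usage: five calls, two of which timed out at different stages.
set pool [dict create]
foreach outcome {ok ok gettimeout ok evaltimeout} {
    stats_record pool $outcome
}
set pct [stats_failure_pct $pool]
```

The per-kind counters keep the detail that a generic timeout code
drops, while the caller only ever looks at the overall percentage.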


A single concept of timeout with statistics kept on failures would be
easier to implement and describe, would prevent spurious timeouts, and
would allow administrators to size the proxy pools.
