bigprincipalkk commented on issue #15810:
URL: https://github.com/apache/dubbo/issues/15810#issuecomment-3770769507

   @nithin-cherry Thank you for your response. Let me first summarize my current understanding of the code.
   The measured latency is updated only once, when the client receives the server's response, at which point the EWMA latency is refreshed. Later, when an actual call is made, the current latency is predicted either by deferring to the stored EWMA value or by applying a penalty, and the prediction is then smoothed via EWMA. This strikes me as incomplete for two reasons.
   First, after every successful metric update, a penalty path is triggered that predicts the current latency as 2 × timeout. When the timeout is either too small or too large, this dilutes the influence of the true latency. Could we adjust it so that, if the elapsed time since the last update is less than the timeout, we simply use the EWMA latency, and otherwise let the predicted latency grow linearly up to 2 × timeout, with that value as a hard cap? A rough sketch of this rule is below.
   Second, the present EWMA strategy uses a count-based decay with a fixed β = 0.5. A time-based decay, i.e. w_prev = exp(-timeDelta / τ), would work well for both high-QPS and low-QPS scenarios. Under the current approach, when QPS is very low the EWMA latency still reflects delays measured long ago, which is usually undesirable. A sketch of a time-based EWMA follows.
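   A minimal sketch of such a time-based EWMA, assuming the elapsed time is measured in milliseconds and τ controls how quickly old samples fade; again, the names are illustrative rather than Dubbo's current code:

```java
/**
 * Time-based EWMA: the weight of the previous estimate decays
 * exponentially with the time elapsed since the last sample.
 */
static double timeBasedEwma(double previousEwma, double newSampleMs,
                            long deltaMs, double tauMs) {
    // After tau ms the old estimate keeps ~37% of its weight,
    // after 3 * tau ms only ~5%, regardless of request rate.
    double wPrev = Math.exp(-(double) deltaMs / tauMs);
    return wPrev * previousEwma + (1.0 - wPrev) * newSampleMs;
}
```

   With this form, stale measurements fade smoothly as real time passes instead of persisting at full weight until enough new requests arrive.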
   Those are my two questions. I may simply be missing some deeper context, and I would appreciate your guidance.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

