> How big the message worker's queue may grow until it becomes a problem?

Denis, you never know. A node may be flooded with messages because of
increased timeouts and network problems. I remember some cases with
hundreds of messages in the queue on large topologies. Please, no O(n)
approaches =)

> So, we may never come to a point, when an actual
TcpDiscoveryMetricsUpdateMessage is processed.

Good catch! You can put a hard limit and process an enqueued MetricsUpdate
message if the last one of the kind was processed more than metricsUpdFreq
milliseconds ago.
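
A minimal sketch of that rule, to make the idea concrete. The class name
MetricsUpdateThrottle, the method shouldProcess and the newerQueued flag
are hypothetical and not taken from the Ignite code base. It assumes the
worker can tell in O(1) whether a newer MetricsUpdate message is already
enqueued (for example via a counter), so no queue scan is needed.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper, not part of Ignite: decides whether to process or skip a metrics update. */
public class MetricsUpdateThrottle {
    /** Hard limit: maximum time allowed between two processed metrics updates. */
    private final long metricsUpdFreqNanos;

    /** Time (System.nanoTime() scale) when a metrics update was last processed. */
    private long lastProcessedNanos;

    /** False until the first metrics update has been processed. */
    private boolean processedOnce;

    public MetricsUpdateThrottle(long metricsUpdFreqMillis) {
        this.metricsUpdFreqNanos = TimeUnit.MILLISECONDS.toNanos(metricsUpdFreqMillis);
    }

    /**
     * @param newerQueued true if a newer metrics update is already in the queue
     *                    (assumed to be tracked in O(1) by the caller).
     * @param nowNanos    current time, e.g. System.nanoTime().
     * @return true if the message should be processed, false if it may be skipped.
     */
    public synchronized boolean shouldProcess(boolean newerQueued, long nowNanos) {
        boolean overdue = !processedOnce
            || nowNanos - lastProcessedNanos > metricsUpdFreqNanos;

        // Skip only when a newer message is waiting AND the hard limit is not yet exceeded.
        if (newerQueued && !overdue)
            return false;

        processedOnce = true;
        lastProcessedNanos = nowNanos;

        return true;
    }
}
```

With this shape the skip decision stays O(1) per message, and a metrics
update is still processed at least once per metricsUpdFreq as long as such
messages keep arriving.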

Denis, also note that the initial problem is message queue growth. When we
choose to skip messages, it means that the node cannot process certain
messages and is most probably experiencing problems. We need to think of
killing such nodes. I would suggest we allow queue overflow for 1 min, but
if the situation does not return to normal, then the node should fire a
special event and then kill itself. Thoughts?
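
To illustrate the overflow proposal, here is a rough sketch under stated
assumptions. QueueOverflowWatchdog, its Action enum and the maxQueueSize
threshold are hypothetical names and values, not existing Ignite APIs. The
caller would fire the event and stop the node when FIRE_EVENT_AND_STOP is
returned.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper, not part of Ignite: tracks how long the queue has been overflowing. */
public class QueueOverflowWatchdog {
    /** What the caller should do after a check. */
    public enum Action { NONE, FIRE_EVENT_AND_STOP }

    /** Queue size above which the queue counts as overflowing (assumed, configurable). */
    private final int maxQueueSize;

    /** How long overflow is tolerated, e.g. 1 minute. */
    private final long graceNanos;

    /** Time when the current overflow period started. */
    private long overflowStartNanos;

    /** True while the queue has been continuously above the limit. */
    private boolean overflowing;

    public QueueOverflowWatchdog(int maxQueueSize, long graceMillis) {
        this.maxQueueSize = maxQueueSize;
        this.graceNanos = TimeUnit.MILLISECONDS.toNanos(graceMillis);
    }

    /**
     * Call periodically, or on every enqueue, with the current queue size.
     *
     * @param queueSize current number of queued messages.
     * @param nowNanos  current time, e.g. System.nanoTime().
     */
    public synchronized Action check(int queueSize, long nowNanos) {
        if (queueSize <= maxQueueSize) {
            // Back to normal: forget the previous overflow period.
            overflowing = false;

            return Action.NONE;
        }

        if (!overflowing) {
            // Overflow just started: begin the grace period.
            overflowing = true;
            overflowStartNanos = nowNanos;

            return Action.NONE;
        }

        return nowNanos - overflowStartNanos >= graceNanos
            ? Action.FIRE_EVENT_AND_STOP
            : Action.NONE;
    }
}
```

The grace period restarts whenever the queue drops back under the limit,
so only a sustained overflow leads to the node being stopped.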

--Yakov
