On 1.7.2015 15.44, Juliusz Chroboczek wrote:
In my previous mails, I used the term "persistent state desynchronisation".
Since this apparently worried some people, I guess I'll explain.
DNCP uses dynamic timers (Trickle) to flood a hash of the global state and
separate unicast request/response pairs to propagate actual data. Bad
things will happen if the hash is successfully flooded but the data cannot
be propagated, for example because it doesn't fit in the maximum packet
size, because a node is not publishing enough NEIGHBOR TLVs, or because of
implementation bugs.
More precisely, under such circumstances each attempt to flood the
inconsistent network hash is followed by a flurry of request/response
pairs and a reset of the Trickle timers to their minimum value (200ms).
The result is persistent spam.
Therefore, (1) we must make sure that a compliant implementation does not
cause state to become persistently inconsistent, and (2) we should develop
some mechanism that detects persistently desynchronised neighbours and
rate-limits them. (1) needs to go in the spec, while (2) is a "mere"
implementation detail.
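For what it's worth, (2) could be sketched roughly as below. This is only
an illustration, not anything from the draft or an existing
implementation: the class name, the method names, and the 25600ms cap are
all assumptions; only the 200ms base comes from the text above. It backs
off exponentially per neighbour while synchronisation keeps failing and
forgets the penalty as soon as the neighbour converges.

```python
class NeighbourLimiter:
    """Hypothetical per-neighbour backoff for repeated failed syncs."""

    def __init__(self, base_ms=200, max_ms=25600):
        self.base_ms = base_ms      # 200ms, as quoted above
        self.max_ms = max_ms        # assumed cap, not from the draft
        self.failures = {}          # neighbour -> consecutive failures
        self.next_allowed = {}      # neighbour -> earliest next request

    def may_request(self, neighbour, now_ms):
        # Unknown neighbours are never limited.
        return now_ms >= self.next_allowed.get(neighbour, 0)

    def record(self, neighbour, now_ms, converged):
        if converged:
            # State agrees again: drop any penalty.
            self.failures.pop(neighbour, None)
            self.next_allowed.pop(neighbour, None)
            return
        n = self.failures.get(neighbour, 0) + 1
        self.failures[neighbour] = n
        # Double the delay with each consecutive failure, up to the cap.
        delay = min(self.base_ms * 2 ** (n - 1), self.max_ms)
        self.next_allowed[neighbour] = now_ms + delay
```

A node would call may_request() before sending a unicast request and
record() once it knows whether the reply brought the two sides into
agreement.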
I believe your definition of the worst case here is slightly pessimistic
(cf. the appendix B change notes). In practice, a Trickle reset occurs if
and only if the _locally_ calculated network state changes. If incoming
data doesn't agree with what we already have, we just do a request+reply
in every Trickle interval (local+remote) in which that fails, but the
Trickle timers do back off eventually if we cannot actually change the
local state (of remote nodes).
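To make the difference between the two readings concrete, here is a toy
simulation. It is a sketch under stated assumptions, not DNCP code: the
names are made up, the 7-doubling cap is assumed, and the two hashes are
hard-wired never to converge. With reset_on_mismatch=True the timer is
reset on every mismatching hash (the pessimistic case); with False it is
reset only when the locally calculated hash changes.

```python
from dataclasses import dataclass

I_MIN_MS = 200                  # minimum interval quoted in this thread
I_MAX_MS = I_MIN_MS * 2 ** 7    # assumed cap (7 doublings)


@dataclass
class TrickleTimer:
    interval_ms: int = I_MIN_MS

    def reset(self):
        self.interval_ms = I_MIN_MS

    def expire(self):
        # Interval ran out without a reset: double it, up to the cap.
        self.interval_ms = min(self.interval_ms * 2, I_MAX_MS)


def simulate(rounds, reset_on_mismatch):
    """Return the interval in effect at the start of each round."""
    timer = TrickleTimer()
    local_hash, remote_hash = "A", "B"    # never converge in this toy
    seen = []
    for _ in range(rounds):
        seen.append(timer.interval_ms)
        mismatch = remote_hash != local_hash  # triggers request+reply
        new_local_hash = local_hash           # reply could not be applied
        changed = new_local_hash != local_hash
        if changed or (reset_on_mismatch and mismatch):
            timer.reset()
        else:
            timer.expire()
        local_hash = new_local_hash
    return seen


print(simulate(4, reset_on_mismatch=True))
print(simulate(4, reset_on_mismatch=False))
```

In the first case the interval stays at 200ms forever; in the second it
doubles each round until it reaches the cap, so the border traffic thins
out to one exchange per maximum interval.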
Obviously, this results in a partitioned network with 'spam' unicast
exchanges occurring every now and then on the border, so it is not
desirable, but it does not cause network-wide problems or any real
state churn beyond extra unicast packets that are essentially ignored or
replied to and then ignored.
Cheers,
-Markus