Concerns about DNCP
===================

DNCP is an elegant and small protocol that distributes HNCP data across the Homenet. DNCP works by flooding a hash of the full network state over link-local multicast, and synchronising the actual state piecewise by using link-local unicast request/response pairs. This document outlines a number of concerns with DNCP as described in draft-ietf-homenet-dncp-06.

I very strongly support the work that is being done on DNCP and HNCP. I will be glad to see something very similar to the current drafts adopted by the Homenet WG.

Packet format
-------------

### There is no header

A DNCP packet consists of just a sequence of TLVs. This means that there is no way to version the DNCP protocol. Should a change in the TLV format be required, a new UDP port will need to be allocated. (A new multicast group is not sufficient, since DNCP also uses unicast.)

I recommend adding a fixed-size header with a version number.

### NODE-ENDPOINT is stateful

The DNCP/HNCP protocol suite uses the elegant technique of recursively embedded TLVs: hop-to-hop data is at the top level of the packet, end-to-end data is in the NODE-STATE TLV, which may in turn contain TLVs that themselves contain embedded TLVs. The TLVs are mostly stateless, in the sense that they can be sent in any order or even in independent packets.

NODE-ENDPOINT is an exception. NODE-ENDPOINT identifies the sender of the packet, and applies to all TLVs in that packet. The current specification implies that NODE-ENDPOINT may appear anywhere in the packet, which would force the receiver to make two passes over the packet.

Conceptually, NODE-ENDPOINT is a packet header, and it is best treated that way. Ideally, the information it contains would be part of the fixed-size packet header suggested above (but after the version number, which should be parsable even when the NODE-ENDPOINT format changes). Alternatively, specify that NODE-ENDPOINT MUST be the very first TLV in a packet, or at least appear before all currently-defined TLVs, which merely formalises what existing implementations already do.

### NODE-ENDPOINT is underspecified

It is not clear whether NODE-ENDPOINT is required in all packets, and if not, which TLVs are allowed in a packet without a NODE-ENDPOINT. Existing implementations appear to differ in that respect.

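To make the two header recommendations above concrete, here is a minimal sketch of a receiver that checks a version number first and then requires NODE-ENDPOINT as the first TLV, so that a single pass over the packet suffices. The 4-octet header layout, the version value and the NODE-ENDPOINT type code are assumptions made for illustration; none of them is taken from the draft. The TLV walk assumes 16-bit type and length fields, with each value padded to a 4-octet boundary.

```python
import struct

HEADER_LEN = 4      # hypothetical: 1 octet of version + 3 reserved octets
VERSION = 1         # hypothetical version number
NODE_ENDPOINT = 3   # assumed type code, for illustration only


def parse_tlvs(buf):
    """Walk a TLV sequence once: 16-bit type, 16-bit length, value,
    then padding up to the next 4-octet boundary (assumed encoding)."""
    tlvs = []
    off = 0
    while off < len(buf):
        if len(buf) - off < 4:
            raise ValueError("truncated TLV header")
        tlv_type, length = struct.unpack_from("!HH", buf, off)
        off += 4
        if off + length > len(buf):
            raise ValueError("truncated TLV value")
        tlvs.append((tlv_type, buf[off:off + length]))
        off += (length + 3) & ~3  # skip the value and its padding
    return tlvs


def parse_packet(data):
    """Return (endpoint_value, remaining_tlvs), or raise ValueError."""
    if len(data) < HEADER_LEN:
        raise ValueError("short packet")
    version = data[0]
    if version != VERSION:
        # The version is checked before any TLV is interpreted.
        raise ValueError("unsupported version %d" % version)
    tlvs = parse_tlvs(data[HEADER_LEN:])
    if not tlvs or tlvs[0][0] != NODE_ENDPOINT:
        raise ValueError("NODE-ENDPOINT must be the first TLV")
    return tlvs[0][1], tlvs[1:]
```

Because the version octet is read before the TLV walk starts, a later version could change the TLV encoding without needing a new UDP port; a receiver would simply reject the versions it does not understand.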
### Node data is underspecified

The NODE-STATE TLV carries the end-to-end hash of the "node data". However, what exactly constitutes the "node data" is never specified. For example, there is padding applied between TLVs (look, Ma, I've saved 4ns parsing my packet), and it is nowhere specified whether this padding participates in the hashing (it does).

It turns out that in the absence of fragmentation the "node data" is just the raw binary data in the NODE-STATE TLV. This is the reasonable thing to do, and must be specified in this manner. It must also be made clear that this binary data MUST NOT be modified in transit (parsing/reformatting is not likely to work reliably), and that its hash SHOULD (or is that MAY?) be validated on reception.

### Normalisation is apparently useless

DNCP specifies that the TLVs within a node state be sorted. Since both the raw binary data and the hash are end-to-end, it is not clear why this partial normalisation is useful.

At the very least, the draft should make it clear that this normalisation should not be relied upon, and that peers MUST forward the binary data unchanged. I recommend simply dropping the normalisation, although this will require changes to the fragmentation scheme.

### FRAGMENT-COUNT is stateful in the reverse direction

FRAGMENT-COUNT is stateful, but in the reverse direction: it changes the interpretation of the enclosing NODE-STATE TLV. Usually, the NODE-STATE TLV carries exactly the data being hashed; with a FRAGMENT-COUNT, it carries only part of that data, and FRAGMENT-COUNT itself is not hashed.

FRAGMENT-COUNT is not end-to-end data, and it doesn't belong within the NODE-STATE TLV. It should either be a field in the NODE-STATE TLV itself (not in an embedded TLV), or be put in a TLV that contains the NODE-STATE TLV.

### Fragmentation is specified at the TLV level

The section about fragmentation is not quite clear to me.
It appears to specify fragmentation in terms of TLVs (every fragment must consist of a valid series of TLVs), and appears to assume that the receiver is doing something smart in order to avoid reassembly timeouts.

Unfortunately, this encoding makes it very difficult to implement a simpler scheme, where fragmentation and reassembly are TLV-agnostic and act entirely at the byte-stream level. In particular, a fragment TLV does not specify an initial byte offset, and the total length of the reassembled packet is not known beforehand, which makes buffer management in the receiver challenging.

I recommend a TLV-agnostic fragmentation scheme, with fragment offsets counted in octets and a total reassembled size explicitly encoded.

### The fragment timeout is not specified

It is not clear when the receiver can discard an incomplete defragmentation buffer. This might not be an issue if a TLV-based fragmentation scheme is used.

### Keep-alive intervals are flooded

The KEEP-ALIVE-INTERVAL is within the node state, and hence flooded across the network. This information is of no interest to remote nodes; the TLV should be at the top level of the packet in order to reduce the amount of information being flooded.

Protocol dynamics
-----------------

### Need to check sender

Most of the packets in DNCP have exactly the same meaning whether they are sent to a unicast or a multicast address. There is one exception: a packet carrying inconsistent network state resets the keepalive timer only when it is sent over unicast. This requires that the receiver check the destination address of packets, which is non-portable and might be impossible on some systems.

I recommend removing the requirement to distinguish between unicast and multicast. If such a distinction is needed, make it explicit in the TLV contents.

### Not clear when multicasts and unsolicited packets can be sent

As it is written, the draft requires that all TLVs other than NETWORK-STATE be sent over unicast and does not allow unsolicited packets other than NETWORK-STATE. However, the draft authors indicate that unsolicited multicast packets are allowed.

The draft should specify clearly when multicast and unsolicited packets are allowed. In particular, it should mention whether it is legal to reply to a request over multicast (which may make sense, at least on some link layers), and which packets can be sent unsolicited over multicast.

### Non-standard port

The draft specifies that it is required to answer a request from a node that does not publish a node state. What should happen when this request comes from a non-standard port? Monitoring software may need to run on the same node as a normal DNCP implementation.

### Non-publishing nodes

A node may want to participate in the full protocol without publishing a node state in order to reduce the amount of data being flooded. Doing this naively might cause persistent state desynchronisation, leading to repeated resetting of the Trickle timers.

Ideally, the draft would specify exactly which behaviours are allowed for non-publishing nodes. At any rate, a warning about the dangers of non-publishing nodes should be included in the draft.