Ben,

On Wed, Dec 08, 2021 at 06:00:44PM +0200, Ben Maddison wrote:
> > While a bit pedantic, I strongly suggest "TLVs SHOULD be sorted by their
> > code point.".
[...]
> That makes sense, thanks.
> Having never written a routine that does this, I am slightly surprised
> that it is still cheaper, even if the implementation cannot *require* that
> it will arrive sorted - but I am more than happy to take yours and Jeff's
> word for it!

Since it's not a state secret and a very common strategy for building
canonicalized collections:

1. Syntactically and semantically verify the contents of the received
TLV set.
2. Presuming step 1 passes, determine if for those rules whether there are
places in the data that should be cleaned/stripped.  As an example, an NLRI
MAY be encoded with trailing bits after the prefix length. (This causes fun
bugs elsewhere in the code in many cases!)  Step 2 might be done in tandem
with step 1 depending on whether the buffer in question is yours to do with
as you please or whether you're dealing with a shared buffer.
3. Presuming step 2 is done and is sorted in a canonical order[1], you can
simply take the memory in question and use it as a long key in a collection
to find an object with the same contents.  The collection in question is
usually some nice bit of work appropriate to the key properties, but
anything that's efficient will work.  Some implementations are partial to
height balanced trees, others to radix tables or similar like PATRICIA.

There is significant room for optimizations in step 3.

Why does sorted matter?  Because if it's not, you need to insert a step
between 2 and 3 to canonicalize the data prior to the lookup step.
Otherwise you'll happily find an object (or not) in your collection that
matches the bit pattern but duplicates the semantics of the object.

E.g. {1,2,3} and {1,3,2} may be the same state, but you now have two objects
representing it.  Having a common object means you can now link state
together from that canonicalized object and use it to find related things as
another key.

-- Jeff

[1] The BMP TLV draft somewhat frustratingly permits multiple instances of
the same type in a TLV set.  Repetitions breed ambiguityin canonicalization.
A result of this is that when "pick one" is an answer, sometimes you get
behavior that depends on the implementation's sorting choice in this
instance.

_______________________________________________
GROW mailing list
GROW@ietf.org
https://www.ietf.org/mailman/listinfo/grow

Reply via email to