Thanks a lot, Artur for the thorough review and many thought provoking high-level considerations from your review.
Here is diff -05 vs. -07: https://author-tools.ietf.org/diff?doc_1=draft-ietf-anima-brski-discovery-05&doc_2=draft-ietf-anima-brski-discovery-07 This contains both the fixes for you and Matt Kovatsch as well as those for Steffen Fries (-06). For your main concern of scope & size let me summarize upfront, there is more detailled feedback inline, but high-level: The actual goal of the document is to specify how BRSKI implementations should use discovery mechanism to achieve autoconfiguration, resilient, support load-splitting, AND extensibility of variations simultaneously deployed - using arbitrary discovery methods (because every industry/deployment-type seems to prefer a different one). In other work: all the functions that can only be enabled through dynamic, in-network discovery mechanisms. So, the pieces you seem to consider to be "too much / surplus" and suggest to move into separate documents - those are actually the core. The whole data model and registry - they are dependencies. And yes, the writeup of that dependency (data-mode for variations, IANA registry) takes the most part of the document. But it's still a dependency. So, we can certainly discuss whether we should split up the document into multiple ones, but given how we had core documents in ANIMA around 120 pages, this document being half that, i think it will be a lot easier to proceed with this chartered content through a single document. And the actual pieces you refer to would each be less than 5 pages. Not much saving in the overall scope - but a lot less ease of understanding what the remaining 50+ pages of dependencies are good for. And with respect to whether or not the text contains too many "operational" instead of "specification" aspects: a) I reread the sections you whee worried about and they are full of MUST/SHOULD/MAY, so i think its well on the side of "specification" b) ANIMA is OPS. We shouldn't be ashamed of more operational considerations wrapped into a spec. Although i think there is not a lot of this in this spec. And for BRSKI, i think we have "operational" considerations now because a) RFC8995 was already 116 pages and took alrady 7 years - and we really didn't have more operational considerations when we wrote it. c) I really consider the explanations mostly to be implementation considerations. And i have a terrible history of dealing with RTG specs where there are no or barely any spec writers left who could explain why spec decisions where made in those past RFCs. I am even rev'ing one such RFC right now (RFC1112). So don't get me started. And i have to note that i think that INT/ART often does a better job of not too terse specifications. But i hope a lot of the improved explanations will at least make all of this a lot easier to understand! Thanks again. Details below inline Cheers Toerless On Wed, Apr 09, 2025 at 02:35:44PM +0000, Artur Hecker wrote: > Hi > > > I was asked to do a review of the draft-ietf-anima-brski-discovery feedback, > so here it comes. The document is quite lengthy, so is this review, sorry for > this. > > The reviewed ID was: draft-ietf-anima-brski-discovery-05. > > The review below consists of two parts: general comments and editing remarks > on a per-section basis in the detailed comments. > > If you have any questions, I remain at your full disposal for further > discussions. > > > Best > artur > > <review> > > > 1. GENERAL COMMENTS > > The document is a bit too long and, in my opinion, goes in too many > side-discussions, not necessarily of primary importance for this > specification. Explicit examples welcome of what you consider to be unnecessary. I have been writing this from the background of having been involved in implementations of these type of technologies where mistakes where made and my writeup in this draft being intended to avoid such issues. > The beginning of the document (maybe the entire intro) could be rewritten to > better clarify what this document intends to specify, before discussing why > it's complicated and all the constraints. Ack. I have tried to do that. > The essence of this document are the IANA tables in Section 4 - it should be > probably shortened to reflect this. All other considerations could go into > separate "best practice" like drafts. Notably: > > > * Clarify early that this document is not proposing any discovery method > per se. Abstract: This document does not define any new discovery methods ... > * Give examples of discovery methods, which you believe are suitable for > what is proposed in this document. Personally, I believe giving some concrete > examples early in the text and clarifying what you want to specify would make > the text easier to read. Please check re-written "Challenges" section. > * Can it not be a DHCP option, in particular for IP/IPv6 based setups? If really desirable, i can write into an appendix a section as to why DHCP is in my opinion a really undesirable and difficult discovery option. Let me know. (DHCP is really a terrible configuration nerd-knob-monster. You would have to write a new DHCP WG draft fo any new attribute to signal BRSKI roles and variation strings. And then you need to configure to forward these attributes. And there is no protocol for responders to announce themselves to those DHCP servers, so everything needs to be configured, and how do you figure out when a server (e.g.: registrar) goes away.... There are also often not implemented drafts to propagate and bundle set of attributes propagation. Therribly complex and one-off solutions. In contrast, all the discovery options we describe have fully decentralized, peer-to-peer options, so no single point of failure. Yes, DNS-SD has unicast options with servers, but they do have announcements from servers now (DNSSD-SRP), and as a "trusted third party" allows to provide more security, yada yada Not a good topic to keep the document terse ;-) > * Introduction is probably too long. The given argumentation is hard to > understand before the explanations would come. I hope the added text in the beginning fixes that. > Proxies appear too early in the text: the document proposes a discovery > method for BRSKI, the intro starts discussing all kinds of issues with > proxies. Why? I added a paragraph to the beginning of "2.1.3. Variation agnostic support for Join Proxies" that hopefully sets the right stage to understand why discovery for BRSKI is particularily complex because of proxies. > * The definition of context (or variation context) and the use of it in > the document is overloaded: > * Context or variation context, please use one of the two Ok, now the text is only using the term context. > * "BRSKI" is both context name and the whole protocol family name Ok, i changed context name BRSKI to tBRSKI. Not quite sure if this deconfuses things, lets hear from other reviewers. > * Clarify reasons for calling it "responder socket" as opposed to using > URI/URL and the formats herein But what is discovered is a socket, aka: (ip-address, proto, port). Of course, one could use URI terminology to describe TCP or UDP sockets, but i am not aware that any RFC is doing this. I think it would rather confuse the matter. Socket is used consistently through all IETF for this purpose AFAIK. > * Please provide a rationale for choosing some non-standard encoding as > suggested in this text over e.g. JSON. > * Would seem to lack the type-value pair notion, this opens a debate > updateability vs. parsing complexity? Can you clarify which encoding you are referring to ? The only encoding that this document really introduces is really that of the variation string, and that came after discussing a lot of options, and a single string ultimately was what everybody could agree to because of its ability to be used in any discovery protocol and ease of parsing (list of variation strings as opposed to more complex set of variation type/value TLVs). > * To be potentially discussed in the security considerations: > * If variants allow for less secure methods, this is a potential > vulnerability (security downgrade attacks, problematic in particular with > proxies) I've thought about this as a question but could not come up with useful scenarios that would make sense to describe. There are currently not variations that expose differences in the security of the connections, and writing theoreticals about possible future options to do that would just unnecessarily make the document even longer than it already is. And even more so: there is a bigger battle about completely retiring TLS 1.2 in the IETF and letting those poor devices with slowly evolving toolchains bear the brunt of the problems that this evolution will bring when not done carefully. And i'd hate to drag this document into that can of worms. > * For this document to improve chances of picking non-malicious > announcements (s. 3.2.4), it should prioritize the "highest security" > responders, which so far it does not. Otherwise it should prescribe a > sufficiently secure option as default and discuss required validations along > this default mechanism. Specifically in 3.2.4, I am not sure this is useful: > high weights in service announcements are probably as easy to forge, as > service announcements per se. I've added a paragraph about this at the end of the security section. > * Potentially, a "network name" is missing: it would seem that pledge > does not care what it connects to. It just tries to discover *some*, *any* > registrar. Is it always the case? Shouldn't the discovery hint on a name? > * Consider the following use case: physical infrastructure with a > dozen of "virtual networks" that I can pledge to, once I have established a > physical link. The discovery would yield say 20 answers. Yet, with high > probability I am only interested in the 2 redundant proposals (responders, > variants) coming from the actual virtual network that I want to and can > connect to. Where is that? How would it be implemented? I don't quite understand what you think is the most likely use-case example for this. We did discuss i think similar problems towards rfc8994, but always felt those where further off. So i would really need a strong use case with enough details to expand the scope of the document to include something like this. Otherise it's more easily done as a followup document. > * Detailed considerations for proxies (stateless/stateful, selection, > resilience, etc): not sure that these considerations belong into the > discovery specification like this text. > * Many of these things may not be of primary importance for the > discovery itself, as the discovery does not have to use any proxy whatsoever. > * The opposite also holds: proxies might be useful without any > discovery. > * The document becomes too heavy and would be difficult to maintain if > we combine both in one. > * Suggestion: probably, a separate draft, can be considered for these > detailed proxy considerations. > * The current text on proxy considerations per se seems valuable > e.g. for developers, however, it has little to do with the actual discovery > specification of that document. > * This separate draft could summarize best practices for proxies, > i.e., imaginable/common proxy modes, proxy deployment thoughts, proxy vs. > discovery coexistence, recommended proxy behaviour (selection of endpoints, > connection dispatch, etc.) Most of these things do not look like a > specification, but rather as a BP to me. What is described in section 3.3 and its subsections is not a BCP, but it is different options for state machineries of a variation agnostic proxy. I think it is very similar to other protocol state machineries i have seen in protocol specifications. If you think the text reads too informally, i could think of adding pseudo-code, but from past experience there are always strong voices of not ONLY having pseudocode, but also "targeted to human" text. > * Hence: I suggest that these be extracted, put in a different > document and simply referenced in that one. > * Only discovery-specific thoughts with regard to proxies (e.g. > transparent support of new variations, or "arbitrary variation support") > could potentially stay, in particular if the discussion explains the design > choices of the variations, contexts, defaults, etc. made in this document > through the argument of making proxy support simpler. If they rather explain > how this arbitrary variation support can be best implemented in a proxy, it > should be moved to the other document. The "transparent variation support options" subsectsion of 2.2 is less than 4 pages. It seems like a lot of refacturing and possible duplication of effort needed to split this work into two drafts for this purpose, and i don't think it will improve ease of readability for implementers. * Note that the selection as such is not discovery-specific, as a typical proxy could have a preconfigured list of known registrars as well, as part of its configuration. How to best map N clients to available K servers within a proxy is a fundamental proxy/gateway problem, independent of discovery. Preconfigured registrars are outside scope because it is not discovery based. The other aspects you mention are covered by the scope definition in the abstract: "This document defines the procedures necessary for interoperable announcement, discovery and selection in the face of redundant responders, making deployment more resilient by allowing easy and automatic redundancy and load-splitting." Similar to rfc6763, "discovery" is just the keyword for those other aspect: announcement, discovery, selection, redundany, load-splitting. Of course, if there i a strong feeling in the WG that we should be more exhaustive as to this scope in the title (unlike e.g.: rfc6763), then i can add that to the title. > * Not sure why the discussion on construction of IDevID and the > verification thereof (3.4.2ff, in particular 3.4.3) should be in this > document. Because this is also a problem of using discovery method(s) to announce and discover all or specific pledges and to form the discoverable name in such a way that it will be as much as possible unambiguous (across multiple vendors) as well as having ownership cryptographicaly be validated (because it takes info from IDevID). > * Same argument as for the whole proxy discussion, the point being: > whether there is discovery, or whether the pledges and registrars are > respectively preconfigured, the identification string construction and its > respective matching remain exact the same. It's not because of the discovery > that these complexities arise. I don't see the point. You actually must use some form of discovery to discover a pledge. You can not configure the pledges IP-address that you need to connect to it. And in the case of pledge discovery the relevant aspects to discover and authenticate a pledge is related to standardizing on how the instance name are formed. Also the avoidance of possible cross-vendor collisions. > * Not sure why the discussion on host names in 3.5 is absolutely > required. Isn't it a matter for DNS-SD anyhow, regardless whether it's BRSKI > or not? I don't think "absolutely necessary" is a good criteria. Remember, ANIMA is an OPS group, not e.g.: an RTG group. And the task of announcement/discovery/selection is one that comes with a large amount of unnecessary freedom of existing protocol decision making (as in the instance name in DNS-SD). Hence it makes a lot of sense to constrain the viable implementation alternatives to those that minimize or ideally completely eliminate the need for any configuration (self-configuring as a key property of ANIMA solutions) and best diagnostics. That is what the MAY/SHOULD/MUST requirements in this section allude to. ANIMA is an OPS WG, not a RTG working group. There is not necessarily the history of attempting to compress specification to the point where barely the authors know how to implement it ;-)) > * Granted, some of the discussed facts are BRSKI-specific. Can we not > at least shorten the text by simply stating these facts? It would streamline > the text. Something like: "BRSI registrars do not need host names". Etc. But > the overall discussion seems a bit off for that draft. But "BRSKI registrars do not need host names" is exactly that attempt to compress specifications to the degree of being wrong and not understandable. Every DNS-SD service instance needs instance name and host name to encode the information. They are just not significant unless you have human users or other entities that pick an instance not purely on attributes like priority/weight or variation, but based on 'oh, that printer name sounds like giving e the best print quality'. But given how you do need to specify instance name and host name to encode the DNS-SD information, it does make sense to give MUST/SHOULD/MAY guidance to make implementations as consistent and as auto-configuring as beneficial. That is what the section does. > * Current security considerations seem to be rather related to DNS-SD and > less specific to the actual specification in this draft. DNS-SD related > security vulnerabilities do not need to be rediscussed in this draft. > Potentially better security considerations are suggested above. I just read through the sections, and there is nothing i think that is already discussed in RFC6763 (DNS-SD). Instead, the main point relates to the hardening of the pledge BRSKI protocol where the relationship to DNS-SD is simply that by being discoverable (through DNS-SD browsing), the ability to perform (D)DoS attacks against pledges is simply increased. Note again, that the sc The other (last) paragraph is actually an IMHO very nice security option that i am sure was not written anywhere else because it relies on SRP (which just recently became RFC) plus the use of IDevID as TLS authenticator for the SRSP connection - to ultimately authenticate on the DNS-SD server the ownership of the instance name for the pledge. So maybe in summary for the high-level points: The scope of the document is really about ho But definitely appreciate the > > > 1. DETAILED COMMENTS > > (per section, with section numbers from the draft. s/x/y = substitute x by y) > > > 1. Overview > > > > - s/new use-case preferences may prefer/new use-case preferences may result > in > > - s/with interoperability, that the mechanisms defined in this document > intent to help sove/with interoperability, which the mechanisms defined in > this document intend to help solve > Text got replaced by new Challenges text. > 2.1.1 > > > > - s/When an initiator such as a BRSKI proxy or BRSKI pledge uses a > > mechanism such as Section 1 to discover an instance of a role it > > intends to connect to, such as a registrar, it may discover more than > > one such instance./"" > > o Note: the phrase is present two times, albeit in slightly different forms. Duplicate removed, thanks. > - s/FOr example/For example fixed. > - Note: "Different BRSKI deployments may prefer different discovery > > mechanisms, such as Section 1, Section 1, Section 1 or others" (check > Sections). The "Section 1" issues arise from an ambitious, but failing attempt for referencing the keywords. I've so far given up on the idea, so these "Section 1" issues are gone (referencing terminology section keywords..). > - s/This make it/This makes it fixed. > 3.1 > > - s/supporting specific a specific BRSKI role/supporting a specific BRSKI > role fixed. > - s/whether that responders variation/whether that responder's variation fixed. > 3.1.4 > > - s/Todays these are based on whether Section 1 is used or not/Today, these > are based on whether Section 1 is used or not. fixed typo. > o Note: Section 1 is unclear in this context. see above. > 3.1.7 > > - s/This document does also not introduce variation contexts for/Also, this > document does not introduce variation contexts for Fixed. > o Note: it seems redundant to write "variation context" given the provided > definition. Right, see above. > o A bit later, in 3.1.8 the document says that it introduces variation > contexts in form of IANA tables. Thus, maybe the phrase above should be > reworded to: "This document only introduces variation contexts for specific > roles". fixed to: Also, this document does not introduce contexts for discovery of other BRSKI roles beyond those mentioned, such as discovery discovery of MASA by registrars. > 3.1.8.2 > > - Note: For example, Section 1 specifies the empty string "" as the > objective-value in Section 1 discovery. > > o In this phrase, the reference to "Section 1" is unclear. Section 1 of > this document does not have anything of this kind. This was menat to render to [BRSKI] (aka: RFC8995), instead of "Section 1". I'll have to chat with RFC Editor in person if there is a chance to do what i really wanted to do... But for the time being, all those references to keywords will now go to the entries in the references section, where it does work. > 3.1.8.3 > > > > - Note: "one-off" variation string is not very clear without further > explanations. Variations may be identified through other "irregular" strings, such as "", which are not created from concatenation of varation type values, whenever necessary for backward compatibility. > 3.1.9 > > - Note: "Section 1" again. Unclear reference. > > - /s/dcoument/document See above. > 3.2 > > > > - s/specific do the target deployment/specific to the target deployment Fixed. > > > > 3.2.1 > > > > - s/Inter-DC service redundany/Inter-DC service redundancy fixed. > > > > 3.2.2 > > > > - s/Problems such as a hanging//unresponsible/Problems such as a > hanging//unresponsive fixed. > - s/when the responder did/when the responder died fixed. > 3.2.3 > > - s/Stateless proxies can not/Stateless proxies cannot fixed > > - s/stateless responsivness/stateless responsiveness fixed > > - s/service annuncements/service announcements fixed > > - s/superceeded/superseded fixed > > > > 3.3.2 > > > > - s/3.3.2.1. Direction Connections Mode/3.3.2.1. Direct Connection Mode fixed > > - s/registrars service announcement/registrar's service announcement fixed > > - s/for that pledges connection/for that pledge's connection fixed > > - s/registrar sockets it maps to between/registrar sockets they map to > between fixed > > > > 3.3.2.2 > > > > - Note: probably the order of description here should be slightly changed, > or the text should be rewritten slightly - current text: > > o It then connects pledges to the best available registrar socket from that > set. > > o It then announces this socket with the parameters of the best discovered > registrar socket <...>. > > However a proxy does these two things in the opposite order, usually. Fixed. > - Note: "Stateless proxies can not learn unresponsiveness" the whole > paragraph is a repetition from 3.2.3, verbatim, including typos. Is that > redundant text required? No, i removed duplicate paragraphs between those two section. > - Note: the paragraphs just after it are also repetitions from 3.2.3, > largely. Please check, if this cannot be organized differently, maybe rather > re-structure in Considerations for Stateless proxies and Considerations for > Stateful Proxies, and then introduce particularities for registrars and > responder selection accordingly? Should hopefully be fine now. I kept the copy that applies to both proxy operations mode options in the shared section, and what only applied to best registrar mode to that section. > 3.3.2.3 > > - s/UDP for connections pledges/UDP for connections from pledges fixed > > - s/implementation can not discover/implementation cannot discover fixed > > > > > > 3.3.4 > > > > - s/example the common cade of/example the common case of > fixed > > > 3.4. > > > > - Note: Section 1 reference again. Ack. > > > > 3.4.1 > > > > - s/can easier enumerate expected pleges/can enumerate expected pledges > more easily fixed > > 3.4.2 > > > > - s/Two different manufacturer/Two different manufacturers Fixed > > - Note: the serial number refers to the physical device. I am not sure > about the precise use case for BRSKI, but I can easily see an additional > problematic situation, namely, where two or more pledges run on the same > physical device. This can happen, when the physical device with a serial > number visible from outside is some form of bay, collection, host, > load-balancer, etc., which internally has e.g. several line cards, several > boards, several storage devices or whatever. Also, I can imagine an accessing > host being a laptop or a smartphone (BYOD), in which resp. only some > particular secure enclave is supposed to have access. There can be even > several such environments, e.g. one for product line work, and some other for > more general admin tasks. These two would hardly be the same as the physical > laptop and have to be distinguished from each other; besides, the physical > laptop might be exchanged e.g. by the issuing enterprise every 2-3 years, yet > the basic software environment might stay quite unchanged, updates > notwithstanding. This is what we call "composite devices", and i know that colleagues in my previous employe where looking a lot into their life cycle and composition workflows. Unfortunately, we did not have anyone in ANIMA who brought this use-case to the table so that we could write recommendations specifically for those cases. I do actually think that the mechanisms of BRSKI including the discovery in this document should work perfectly well for these scenarios. The discovery mechanisms do allow easily for multiple independent instanceces/enclaves to operate independently from each othrer, but yet even sharing a single IP/IPv6 address (something which may not even be necessary for all composite device cases). There is no use of hard-coded/well-known port-number for example for BRSKI. There is of course a need for the discovery functionality to be acessible/useable from multiple clients in case they have to share a single network address because the discovery protocols do typically rely on hard-coded port number (e.g.: mDNS, GRASP, CORE-LF). But that is primarily an implementation isue (which would be fun to document ;-) > 3.4.3 > > > > - s/use simple numerica/use simple numerical fixed. > > > > 3.5.1.3 > > > > - s/may need to to discover/may need to discover fixed > > - s/which printer to to pick/which printer to pick fixed > > - s/This requirement exist/This requirement exists fixed > > - s/operators are not unnecessary required/operators are not unnecessarily > required fixed > > - s/such as a registrar understand/such as a registrar understands > fixed > > - s/OUI where assigned/OUI were assigned fixed > > - s/when they have globals scope/when they have global scope fixed > > - s/unique name, is actually unique/unique name is actually unique fixed > > - s/names. instance names/names. Instance names fixed > > - Note: please review and simplify/correct the following phrase: > > o "to pick the same host name requirements based encoding format for both > type of DNS RR nams when using encoded addresses in them" Fixed to: For operational simplicity, instance names SHOULD be constructed in the same manner as target hostnames in an implementation. For example by replacing ":" with "-". > > - s/vendors that usecBRSKI/vendors that use cBRSKI Fixed. > > - s/SRP is useable/SRP is usable fixed > > - s/appening some string/appending some string fixed > > - s/and it's numeric process/and its numeric process fixed > > > > 3.5.2.2 > > > > - s/In Figure 5, A separate/In Figure 5, a separate fixed > > - s/value that also support/value that also supports fixed > > > > 3.5.3.1 > > > > - s/Section 1 introduces a stand alone/Section 1 introduces a standalone fixed > > - s/In summaery/In summary fixed > > - Note: check the "Section 1" occurrences see above. > > > > 3.5.3.2 > > > > - Note: check the "Section 1" occurrences see above. > > - s/assumeption is/assumption is fixed. > > > > 3.5.3.3 > > > > - s/extensions to what was is specified/extensions to what is specified fixed > > - Note: check/correct the follow-up phrases: > > o "This document does not specifiy how to discover It uses terms" fixed. > > - Note: there is a requirement (MUST) to register something with IANA. Does > it make sense? > > o Shouldn't a specification either register it or assume that it is > registered? The whole purpose of IANA registries is that they become the primary point of truth for implementers - and hence the source for interop and non-collision of chosen code-points. This is what justifies if not mandates the rfc2119 language. (given how rfc2119 language is primarily meant to descrive interop related requirements). . The specification of the registry table itself defines if and what type of documentation needs to exist. > - s/likewise, Additional BRSKI/likewise, additional BRSKI fixed. > > > > 4.1.2 > > > > - s/IANA is asked to add an entries/IANA is asked to add entries fixed. > - s/useable/usable fixed. > > > 5 > > > > - Note: please check the occurrences of "Section 1" again > > o "In Section 1, pledges are easier subject to DoS attacks than in Section > 1," see above. Thanks a lot! EOF > > > > </review> > _______________________________________________ > Anima mailing list -- [email protected] > To unsubscribe send an email to [email protected] _______________________________________________ Anima mailing list -- [email protected] To unsubscribe send an email to [email protected]
