Benjamin Kaduk has entered the following ballot position for draft-ietf-capport-architecture-08: Discuss
When responding, please keep the subject line intact and reply to all email addresses included in the To and CC lines. (Feel free to cut this introductory paragraph, however.)

Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html for more information about IESG DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-capport-architecture/

----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

(1) and (2) should be easy to fix; (3) may well be "fixed" by telling me I'm too naive :)

(1) Given that section 1 describes other options, the abstract should not limit to just DHCP and RA as options for provisioning the API URL.

(2) Section 4.1 says that:

   5. The Captive Portal API server indicates to the Enforcement Device that the User Equipment is allowed to access the external network.

but I believe this should be the "Captive Portal Server" (or, as the previous point has it, the "web portal").

(3) Probably a "discuss discuss", but ... in Section 1 we have:

   * Solutions SHOULD NOT require the forging of responses from DNS or HTTP servers, or any other protocol. In particular, solutions SHOULD NOT require man-in-the-middle proxy of TLS traffic.

I'd like to understand the motivation for this one a little better. Naively, it seems like we could get away with "MUST NOT require" while still allowing it to be done. Am I missing something obvious?

----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

I'd like to see some more discussion of which signals are authenticated and how, and what kind of authorization checks are possible.
In well-run networks DHCP and RA signals should be relatively trustworthy, but clients don't always have a good indicator for whether a given network falls into that category. Are there (other) mechanisms that can be used to give trust in the authenticity of a given Captive Portal API URI and that that API is authorized to provide unconstrained access for the network in question? We require TLS for accessing the API server, but (as I note inline) there are more details that can be given about this TLS usage. What can be done to authenticate and authorize the Captive Portal Server? Most importantly (and most appropriately for an architecture document), which of these properties are strictly required vs. merely optional?

These are not Discuss-level points because an architecture does not strictly-speaking need to specify all of them, but having some indication of how we plan to achieve them would give greater confidence that this architecture will be a useful one.

I'm happy to see the response to the genart reviewer's comment regarding "a" vs. "the" capport architecture; thanks!

Abstract

   This document describes a CAPPORT architecture. DHCP or Router Advertisements, an optional signaling protocol, and an HTTP API are used to provide the solution. The role of Provisioning Domains

nit: there's perhaps a bit of a lack of parallelism in the list structure, where we talk about specific mechanisms for provisioning without describing the more abstract concept of provisioning, and list that alongside an abstract mention of "a signaling protocol" and the both-abstract-and-concrete "HTTP API".

Section 1

   Implementations generally require a web server, some method to allow/block traffic, and some method to alert the user. Common methods of

nit: I'd suggest clarifying that this is "implementations of captive portals" (or is it "captive networks"?).

   alerting the user involve modifying HTTP or DNS traffic.

nit: perhaps "at present" or "prior to this work"?
If I understand correctly one of the goals of this work is to shift the balance of captive portals away from these practices (while acknowledging that fully eliminating them is not feasible in the near future).

   * Solutions MAY allow a device to be alerted that it is in a captive network when attempting to use any application on the network.

I'm also not sure I understand this one, especially in light of the following (paraphrased) "SHOULD allow learning of captivity before application attempts to use the network". What's the alternative to "MAY allow", not-allowing such detection at all?

   * The architecture MUST provide a path of incremental migration, acknowledging a huge variety of portals and end-user device implementations and software versions.

nit: "preexisting" or similar would go a long way here.

   * Network provisioning protocols provide end-user devices with a

side note: using the word "provisioning" to describe things like DHCP and RA feels odd to me, presumably due to my background and what I expect provisioning to be. I can see why it makes sense to use the term for this purpose, though. Perhaps an additional adjective could help clarify what is meant, though I don't have a suggestion at hand.

   for this purpose are available in [RFC7710bis]. Other protocols (such as RADIUS), Provisioning Domains [I-D.pfister-capport-pvd], or static configuration may also be used. A device MAY query this

side note: personally, I'd expand to "may also be used to convey this API URI", though it's probably not required for clarity.

   The device MAY take immediate action to satisfy the portal (according to its configuration/policy).

side note: it's not entirely clear to me that we need a normative MAY for this.

Section 2.1

   have Internet access). The User Equipment communication is typically restricted by the Enforcement Device, described in Section 2.4, until site-specific requirements have been met.

It seems like these "site-specific requirements" must be the "Captive Portal Conditions" that we just defined.

   * SHOULD have a mechanism for notifying the user of the Captive Portal

It is pretty important that this mechanism be non-spoofable by, e.g., untrusted websites. I think we should mention something about "non-spoofable" here.

   * MAY prevent applications from using networks that do not grant full network access. E.g., a device connected to a mobile network may be connecting to a captive WiFi network; the operating system MAY avoid updating the default route until network access restrictions have been lifted (excepting access to the Captive

nit: maybe say in which direction the update would go and/or something about why the move to wifi is desirable?

   None of the above requirements are mandatory because (a) we do not wish to say users or devices must seek full access to the captive network, (b) the requirements may be fulfilled by manually visiting the captive portal web application, and (c) legacy devices must continue to be supported.

side note: in my opinion, it's possible to support legacy devices in practice without baking their limitations into the spec.

   If User Equipment supports the Captive Portal API, it MUST validate the API server's TLS certificate (see [RFC2818]). An Enforcement

We should probably cite RFC 6125 here and say something about how the UE gets a name to validate the server's certificate against (and what name type to use).

   [I-D.ietf-capport-api] for more information. If certificate validation fails, User Equipment MUST NOT proceed with any of the behavior described above.

I'm not sure which behavior the "behavior described above" is. "[accessing...] OCSP responders, CRLs, and NTP servers" doesn't seem quite right since that's *how* you determine that certificate validation fails, but the bits further up about "navigate [to] the Captive Portal user interface" do not seem to clearly call out a single behavior or set of behaviors by the UE.

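
To make the RFC 6125 point concrete, here is a minimal sketch of one possible answer, not something the draft specifies: the UE takes the host component of the provisioned API URI as its reference identifier and lets the TLS library match it against the certificate. The function names and the example URI are hypothetical; only the Python standard-library calls are real.

```python
import http.client
import ssl
from urllib.parse import urlsplit


def reference_identifier(api_uri: str) -> str:
    """Return the host name to validate the API server's certificate against.

    Assumption for this sketch: the name comes from the provisioned API URI.
    """
    parts = urlsplit(api_uri)
    if parts.scheme != "https":
        raise ValueError("Captive Portal API URI must use https")
    if not parts.hostname:
        raise ValueError("Captive Portal API URI has no host component")
    return parts.hostname


def strict_tls_context() -> ssl.SSLContext:
    """Build a client context that requires a valid, name-matching certificate."""
    ctx = ssl.create_default_context()
    # These are already the defaults; set explicitly so the intent is visible.
    ctx.check_hostname = True
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def open_api_connection(api_uri: str) -> http.client.HTTPSConnection:
    """Prepare (but do not yet open) a connection to the API server."""
    host = reference_identifier(api_uri)
    port = urlsplit(api_uri).port  # None means the https default port
    return http.client.HTTPSConnection(host, port=port, context=strict_tls_context())
```

With this shape, a handshake whose certificate does not match the host in the API URI fails with an ssl.SSLCertVerificationError, at which point the UE would stop rather than continue to the portal.
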
Section 2.2.2

   Although still a work in progress, [I-D.pfister-capport-pvd] proposes a mechanism for User Equipment to be provided with PvD Bootstrap Information containing the URI for the JSON-based API described in Section 2.3.

I don't think "JSON-based" is supported by the text of § 2.3 (and isn't really appropriate for an architecture doc in most cases, anyway).

Section 2.3

   The purpose of a Captive Portal API is to permit a query of Captive Portal state without interrupting the user. This API thereby removes the need for User Equipment to perform clear-text "canary" HTTP queries to check for response tampering.

nit: probably don't need to be specific about HTTP, here.

   At minimum, the API MUST provide: (1) the state of captivity and (2) a URI for the Captive Portal Server.

Is there anything useful to say about the URI scheme for the captive portal server URI? I guess I could probably (grudgingly) come up with a case where http-not-s would be tolerable, but given that we admit the possibility of "payment" as a captive portal condition, I don't want us to encourage sending payment or other sensitive information over schemes inappropriate for such information.

   A caller to the API needs to be presented with evidence that the content it is receiving is for a version of the API that it supports.

What about evidence that the content it is receiving is intended to be used with, and authorized to speak for, the network it is joining?

   When User Equipment receives Captive Portal Signals, the User Equipment MAY query the API to check the state. The User Equipment

nit: we seem to use "the state of its captivity" most places.

   The API MUST use TLS to ensure server authentication. The implementation of the API MUST ensure both confidentiality and integrity of any information provided by or required by it.

It's a little weird to split the TLS requirements between here and Section 2.1, though I guess if we're splitting things by role it's probably unavoidable.
(I made my RFC 6125 comment in Section 2.1 and it probably doesn't need to appear in both places.)

Section 2.4

   * May signal User Equipment using the Captive Portal Signaling protocol if certain traffic is blocked.

nit: I think that "optionally signals" might be a better fit for the list structure as used in the other bullet points.

Section 2.5

   When User Equipment first connects to a network, or when there are changes in status, the Enforcement Device could generate a signal toward the User Equipment. This signal indicates that the User Equipment might need to contact the API Server to receive updated information. For instance, this signal might be generated when the end of a session is imminent, or when network access was denied.

Would this signal also be used when the UE has successfully met the Captive Portal Conditions?

Section 2.6

   * The User Equipment queries the API to learn of its state of captivity. If captive, the User Equipment presents the portal user interface from the Web Portal Server to the user.

[we previously discussed this UE behavior as optional. I don't mind having the text be descriptive like this, since it's describing the diagram, and the diagram is not binding on all UEs, but it seemed worth noting just in case.]

Section 3.1

   An Identifier is a characteristic of the User Equipment used by the components of a Captive Portal to uniquely determine which specific User Equipment is interacting with them. An Identifier MAY be a

Do we want to say anything about the scope within which the uniqueness must hold? ("No" is probably fine.)

Section 3.2.1

   Each instance of User Equipment interacting with the Captive Network MUST be given an identifier that is unique among User Equipment interacting at that time.

side note: "MUST be given" gets a knee-jerk "by whom?" response from me. It's probably okay for this document to not specify, though, as it may depend on the nature of the Captive Network.

   Over time, the User Equipment assigned to an identifier value MAY change. Allowing the identified device to change over time ensures that the space of possible identifying values need not be overly large.

Is the identifier assigned to a given UE on the same network expected to be able to change as well? This may have some privacy considerations...

Section 3.2.2

   are active at the same time. This property is particularly important when the User Equipment is extended externally to devices such as billing systems, or where the identity of the User Equipment could imply liability.

nit(?): is it the UE that is extended externally or the identifier thereof?

Section 3.2.4

   In some situations, the User Equipment may have multiple IP addresses, while still satisfying all of the recommended properties.

nit: as written, "while still satisfying all of the recommended properties" is describing the UE, but the context of Section 3.4 suggests that we want to be talking about the recommended properties for identifiers.

Section 3.5

   Accessing the API MAY depend on contextual information. However, the URIs provided in the API SHOULD be unique to the UE and not dependent on contextual information to function correctly.

Should the per-UE APIs and/or the mapping between UE and per-UE API be unguessable? (Do we want to reference Capability URLs [https://www.w3.org/TR/capability-urls/]?)

Section 4

I might consider explicitly saying "non-normative" somewhere in here.

Section 4.1

   4. If necessary, the User navigates the web portal to gain access to the external network.

nit: "navigates to"

Section 4.2

   3. The User Equipment's UI indicates that the length of time left for its access has fallen below a threshold

   4. The User Equipment visits the API again to validate the expiry time

side note: I feel like there's implicitly some User action in here, though I don't know that we need to actually say anything about it. (Otherwise we wouldn't have the UI indicating things.)

Section 4.3

   Whenever a new Portal URI is received by end User Equipment, it SHOULD discard the old URI and use the new one for future requests to the API.

What kind of validation/authorization checks need to be applied to the new Portal URI?

(nit: we probably should check the terminology in this section; the Section 1.2 lexicon would call this information the "Captive Portal API Server URI" and not a "Portal URI".)

Section 7

This mechanism rather inherently requires having multiple entities track the UE's identity (and, thus, likely be tracking a proxy for the user's identity). It seems appropriate to include some discussion of the privacy considerations of this tracking, and whether/what kind of anonymity support is appropriate!

Section 7.1

   Given that a user chooses to visit a Captive Portal URI, the URI location SHOULD be securely provided to the user's device. E.g., the DHCPv6 AUTH option can sign this information.

I'm not sure that I understand the intent behind the "Given that" construction here. Is it trying to emphasize user choice, and thus the need for informed choice?

Section 7.2

[In the vein of my previous remarks, there are many ways to use TLS, and usually we provide more details on how we expect TLS to be used.]

Section 7.3

   The API MUST ensure the integrity of this information, as well as its confidentiality.

Who/what is the attacker(s) that we need to preserve confidentiality from?

Section 7.4

   * Accesses to the API Server are rate limited, limiting the impact of a repeated attack.

One might consider a flooding attack that tries to get the UE to use all its (rate-limited) connections to get some information that is not the information that it's most important for the UE to have. If there's only a single operation that can be performed at the API Server (which I believe is the intent?) there is no such attack, but it may be worth mentioning that there is no such attack.

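
For readers less familiar with what "rate limited" might look like in practice, the following is a rough sketch of a token-bucket limiter, one common technique. The draft does not say which mechanism is intended or which side enforces it, so the class, its parameters, and the choice of token bucket are all assumptions made for illustration.

```python
class TokenBucket:
    """Hypothetical limiter: at most `capacity` queries in a burst,
    refilled at `rate` queries per second."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum stored tokens (burst size)
        self.tokens = capacity    # start with a full bucket
        self.last = now           # time of the previous refill

    def allow(self, now: float) -> bool:
        """Return True if one API query may proceed at time `now`."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Example: a burst of two queries is allowed, the third is refused,
# and one more is allowed after a second has passed.
bucket = TokenBucket(rate=1.0, capacity=2.0, now=0.0)
burst = [bucket.allow(0.0) for _ in range(3)]
later = bucket.allow(1.0)
```

Since every query spends the same token regardless of what triggered it, a limiter of this shape only avoids the starvation concern above if each query returns everything the UE needs, which matches the single-operation reading of the API.
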
Section 8.1

Interestingly, none of the places where we reference 7710bis have surrounding text that clearly incurs a normative dependency.

Appendix A

We explain the use of the "canary" term here, but have already used it twice (with no forward-reference) in the body of the document.

   Another test that can be performed is a DNS lookup to a known address with an expected answer. If the answer differs from the expected answer, the equipment detects that a captive portal is present. DNS queries over TCP or HTTPS are less likely to be modified than DNS queries over UDP due to the complexity of implementation.

Is the reader supposed to draw the conclusion that DoTCP/DoH provide less-reliable captive-portal detection than Do53? (I assume "TCP" is not a typo for "TLS", here, though am unsure enough to want to check.)

   Malicious or misconfigured networks with a captive portal present may not intercept these requests and choose to pass them through or decide to impersonate, leading to the device having a false negative.

nit: I suggest "these 'canary' requests" to clarify which requests we're talking about.

_______________________________________________
Captive-portals mailing list
Captive-portals@ietf.org
https://www.ietf.org/mailman/listinfo/captive-portals