Hi all,

Sorry to have taken so long to get back to this, and thank you for continuing to make updates in response to the changes in the framework and other profiles.
In general, the protocol mechanisms defined here are in good shape; thank you! I made a GitHub PR with some changes that seemed easier to phrase in the form of a patch than a prose comment: https://github.com/ace-wg/mqtt-tls-profile/pull/77

I did find a couple of significant issues that need to be addressed before IETF LC, but I think any needed changes will be pretty localized.

Specifically, there's no requirement for ACE access tokens to be self-delineating, so we can't actually programmatically tell whether there's content after the token in a CONNECT message; the mechanisms in Sections 2.2.4.1/2.2.4.2 seem to assume that we can do so for determining whether the CONNECT is just providing a token or also providing PoP over a TLS exporter value. I think that this just means we need to have an explicit "token length" field (or similar).

There are also a few places where we seem to be putting requirements on Broker behavior that are in direct conflict with normative requirements of the MQTT specification. We can't override the external spec, so we'll need to check and reword in any places where there are conflicts. (I'm not an expert on MQTT and only read the spec as part of doing this review, so it's entirely possible that I'm misinterpreting the MQTT spec in some or all of these locations.)

A few other general notes before the section-by-section notes:

There is very little in this document about the HTTP-based interactions with the AS. I think the intent is to defer to the core framework for that, but being a little more explicit about what is being pulled in and how would be helpful.

If we're using TLS Exporters and allow TLS-not-1.3, we need to make some additional requirements on TLS usage in order for the exporter values to be safe. Typically this takes the form of requiring the extended master secret extension along with guidance on what cipher suites to use; I guess RFC 7925 (rather than 7525) would be the default reference for other TLS usage.
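To make the "token length" suggestion above concrete, here is a minimal sketch of how a receiver could separate an opaque token from whatever follows it. The layout (a two-byte big-endian length, then the token, then optional trailing PoP bytes) and the function names are assumptions for illustration only, not the draft's wire format.

```python
import struct
from typing import Tuple


def build_auth_data(token: bytes, pop: bytes = b"") -> bytes:
    """Prefix the token with a two-byte big-endian length, then append PoP bytes.

    Hypothetical layout: the field order is assumed for illustration.
    """
    if len(token) > 0xFFFF:
        raise ValueError("token does not fit in a two-byte length field")
    return struct.pack("!H", len(token)) + token + pop


def split_auth_data(auth_data: bytes) -> Tuple[bytes, bytes]:
    """Return (token, trailing_bytes) without inspecting the token contents."""
    if len(auth_data) < 2:
        raise ValueError("missing two-byte token length")
    (token_len,) = struct.unpack_from("!H", auth_data, 0)
    end = 2 + token_len
    if end > len(auth_data):
        raise ValueError("token length exceeds available data")
    # Anything after the token is the MAC/signature, if one was sent.
    return auth_data[2:end], auth_data[end:]
```

With an explicit length, an empty trailer unambiguously means "token only", even when the token is an arbitrary reference value that the receiver cannot parse.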
This document seems to mostly use British English. AFAIR, that's okay, but if it's inconsistent the RFC Editor will prefer American English. I didn't attempt to check for this (though there are tools like https://github.com/larseggert/ietf-reviewtool that will do so).

Section 1

   their subscribers. In the rest of the document the terms "RS", "MQTT Server" and "Broker" are used interchangeably.

We will probably get a reviewer asking why we can't pick a single term and standardize on it. However, I expect that there are places where we want to emphasize one aspect or another of its behavior, so I don't think we should actually do so.

Similarly for the places where we mention that CoAP can be used (but don't reference a concrete specification).

   Clients use MQTT PUBLISH message to publish to a topic. This document does not protect the payload of the PUBLISH message from the Broker. Hence, the payload is not signed or encrypted specifically for the subscribers. This functionality MAY be implemented using the proposal outlined in the ACE Pub-Sub Profile [I-D.ietf-ace-pubsub-profile].

I suggest s/MAY/may/, which would avoid any need to make ace-pubsub-profile a normative reference.

   reference or JSON Web Token (JWT) [RFC7519]. For JWTs, this document follows [RFC7800] for PoP semantics for JWTs. The Client-AS and RS-AS MAY also use protocols other than HTTP, e.g. Constrained Application Protocol (CoAP) [RFC7252] or MQTT; it is recommended that TLS is used to secure these communication channels between Client-AS and RS-AS. Implementations MAY also use "application/ace+cbor" content type, and CBOR encoding [RFC8949], and CBOR Web Token (CWT) [RFC8392] and associated PoP semantics to reduce the protocol memory and bandwidth requirements. For more information, see Proof-of-Possession Key Semantics for CBOR Web Tokens (CWTs) [RFC8747].
One thing that can be surprising to readers not versed in the ecosystem is that RFCs 7800 and 8747 only talk about the token claims that are used to build the PoP system, and don't actually define mechanisms for providing the proof of possession. We might want a forward-reference to § 2 where we do actually specify mechanisms to prove possession of the indicated key.

Section 2.2.1

The way we name these authentication options with specific (quoted) strings suggests that they will be used as a protocol element. But where? Is the literal string "Known(RPK/PSK)" used in both cases (vs. having distinct strings for RPK and PSK)?

Also, the hyphen character tends to more often be used as a joiner than a separator, so it's easy to misread this as a triple of "TLS", "Anon-MQTT", "None"/etc. (I originally was going to ask why these all had a "TLS" prefix...) It might be better to use a semicolon or comma instead of hyphen.

   o "TLS:Anon-MQTT:None": This option is used only for the topics that do not require authorization, including the "authz-info" topic.

Are there topics other than "authz-info" that don't require authorization? We might need to add some hedging language to the earlier statement that "Client and the Broker MUST perform mutual authentication" if so.

   o "TLS:Anon-MQTT:ace": The token is transported inside the CONNECT message, and MUST be validated using one of the methods described in Section 2.2.2. This option also supports a tokenless connection request for AS discovery.

We should probably look carefully at the "for AS discovery" phrasing, in light of the late changes in how the framework talks about the AS Request Creation Hints.

   o "TLS:Known(RPK/PSK)-MQTT:none": For the RPK, the token MUST have been published to the "authz-info" topic. For the PSK, the token MAY be, alternatively, provided as an "identity" in the "identities" field in the client's "pre_shared_key" extension in TLS 1.3.
   The TLS session set-up is as described in DTLS profile for ACE [I-D.ietf-ace-dtls-authorize].

A couple notes here:

First, the DTLS profile primarily uses (D)TLS 1.2 terminology, with the DTLS 1.3 flow as an addendum. I'm actually happy for us to only or primarily talk about the TLS 1.3 idiom here in this document, but we may want to think about what phrasing we use for the relationship between our usage and the DTLS profile.

Second, the DTLS profile is for, well, DTLS, not TLS. We should acknowledge that and state the (lack of) effect on the procedures to follow.

   It is RECOMMENDED that the Client implements "TLS:Anon-MQTT:ace" as a first choice when working with protected topics. However, depending on the Client capability, Client MAY implement "TLS:Known(RPK/PSK)-MQTT:none", and consequently "TLS:Anon-MQTT:None" to submit its token to "authz-info".

It's good that we provide guidance on which of the authentication schemes are preferred (since we offer both ACE-layer and TLS-layer schemes). However, we will surely be asked to defend why there are two possible ways of doing it instead of just one, and this text doesn't really do a good job of that. What might cause a need to implement TLS:Known(RPK/PSK)-MQTT:none?

   The Broker MUST support "TLS:Anon-MQTT:ace". To support Clients with different capabilities, the Broker MAY provide multiple client authentication options, e.g. support "TLS:Known(RPK)-MQTT:none" and "TLS:Anon-MQTT:None", to enable RPK-based client authentication, but fall back to "TLS:Anon-MQTT:ace" if the Client does not send a client certificate (i.e. it sends an empty Certificate message) during the TLS handshake.

[Just a potential nit about the wording: "fall back" implies some ordering requirements, but IIRC use of RPK requires sending the token to authz-info before starting the new TLS connection, which doesn't quite match the steps as ordered in this description.]

   The Broker MUST be authenticated during the TLS handshake.
   If the Client authentication uses TLS:Known(RPK/PSK), then the Broker is authenticated using the respective method. Otherwise, to authenticate the Broker, the client MUST validate a public key from a X.509 certificate or an RPK from the Broker against the 'rs_cnf' parameter in the token response. The AS MAY include the thumbprint of the RS's X.509 certificate in the 'rs_cnf' (thumbprint as defined in [I-D.ietf-cose-x509]). In this case, the client MUST validate the RS certificate against this thumbprint.

Just to confirm: we consciously chose to not reference RFC 6125 for "normal" X.509 server certificate validation procedures, in favor of the "constrained environment" procedures we describe here?

Section 2.2.2

   The Broker MUST verify the validity of the token (i.e. through local validation or introspection, if the token is a reference) as described in Section 2.2.5. If the token is not valid, the Broker MUST discard the token. Depending on the QoS level of the PUBLISH

I see that this is covered in the framework (and we reference the appropriate section already), but I'd consider reiterating that "not valid" includes "the Broker is not an intended audience of the token".

   It must be noted that when the RS sends the 'Not authorized' response, this corresponds to the token being invalid, and not that the actual PUBLISH message was not authorized. Given that the "authz-info" is a public topic, this response is not expected to cause confusion.

Thanks for including this note. I assume that it would be challenging to get OASIS to allocate a new reason code for us to use, and we didn't attempt to pursue that path.

Section 2.2.3

   Similarly, the Broker MUST NOT process any packets before it has sent a CONNACK. The only exceptions are DISCONNECT or an AUTH response from the Client.

nit: in some pedantic sense, these two sentences are in conflict, since the latter violates the "MUST NOT".
Who cares and how much has varied over time, especially on the IESG, so it's not clear that we actually need to change anything right now.

   +------------------------------------------------------+
   |CPT=1 | Rsvd.|Remaining len.| Protocol name len. = 4 |
   +------------------------------------------------------+

Figure 2 appears to show that the Remaining Length field of the fixed header occupies a single octet, but IIUC it's encoded as a variable byte integer. Fixing that would also let us put the separator that appears after it on a proper byte boundary.

   | 'M' 'Q' 'T' 'T' |
   +------------------------------------------------------+

This figure does not show the two-byte length prefix for the string.

   The Will Flag indicates that a Will message needs to be sent if the [...] Interval in the Will Properties. Section 5 explains how the Broker deals with the retained messages in further detail.

We might be able to get away with leaving the description of Will operation to the MQTT spec, and not have to say so much about it here.

   In MQTT v5.0, the Client signals a clean session (i.e. the session does not continue an existing session), by setting the Clean Start Flag to 1 and, the Session Expiry Interval to 0 in the CONNECT message. [...]

I don't understand why setting the Session Expiry Interval to 0 is needed to produce a clean session. As I understand it, setting this interval to 0 merely directs the server to not store session state after the client disconnects from this session, which is unrelated to whether or not this is a new session or a reused session.

   In this profile, the Broker SHOULD always start with a clean session regardless of how these parameters are set. Starting a clean session helps the Broker avoid keeping unnecessary session state for unauthorised clients. If the Broker starts a clean

This SHOULD seems highly problematic to me.
It looks like it contradicts a hard requirement of MQTT 5.0 ("If a CONNECT packet is received with Clean Start set to 0 and there is a Session associated with the Client Identifier, the Server MUST resume communications with the Client based on state from the existing Session"). If we want to recommend that the broker does not maintain session state, that should be implemented by setting the Session Expiry Interval in the CONNACK, not as part of the CONNECT processing.

   When reconnecting to a Broker that supports session continuation, the Client MUST still provide a token, in addition to using the same Client identifier, setting the Clean Start to 0 and supplying a Session Expiry interval in the CONNECT message. The Broker MUST

(As above, setting a Session Expiry interval seems to relate to the subsequent connection, not the current one, so this feels over-specified.)

   Note that, according to the MQTT standard, the Broker uses the Client identifier to identify the session state. In the case of a Client identifier collision, a client may take over another client's session. [...]

Just to confirm: the ACE token is not used to provide authorization to use a given client identifier; the client identifier is just used as an unauthenticated identifier? We might consider calling that out explicitly.

   topics. Therefore, while this issue is not expected to affect security, it may affect QoS (i.e. PUBLISH or QoS messages saved for Client A may be delivered to a Client B). [...]

I think this is just an aside and not something we need to cover in this document, but what happens if (e.g.) PUBCOMP goes to client B when PUBREC went to client A? Does the message actually get delivered? Does anything deadlock?
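As an illustration of the two Figure 2 comments in Section 2.2.3 above (Remaining Length is a variable byte integer, and the protocol name carries a two-byte length prefix), here is a minimal sketch of both encodings as I understand them from the MQTT 5.0 spec. The function names are mine, chosen for illustration.

```python
def encode_variable_byte_integer(value: int) -> bytes:
    """Encode an MQTT Variable Byte Integer: 7 data bits per byte, least
    significant group first, top bit set on every byte except the last."""
    if not 0 <= value <= 268_435_455:  # largest value that fits in four bytes
        raise ValueError("value out of range for a Variable Byte Integer")
    out = bytearray()
    while True:
        byte = value % 128
        value //= 128
        if value > 0:
            byte |= 0x80  # continuation bit: more bytes follow
        out.append(byte)
        if value == 0:
            return bytes(out)


def encode_utf8_string(text: str) -> bytes:
    """Encode an MQTT UTF-8 Encoded String: two-byte big-endian length, then data."""
    data = text.encode("utf-8")
    if len(data) > 0xFFFF:
        raise ValueError("string too long for a two-byte length prefix")
    return len(data).to_bytes(2, "big") + data


print(encode_variable_byte_integer(127).hex())  # 7f
print(encode_variable_byte_integer(128).hex())  # 8001
print(encode_utf8_string("MQTT").hex())         # 00044d515454
```

So a CONNECT whose remaining length exceeds 127 bytes needs more than one octet for that field, which is quite plausible once a token is carried in the packet.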
Section 2.2.4.1, 2.2.4.2

(Expounding on the high-level comment from above,) It seems that we're using the presence/absence of extra data after the token to indicate whether or not the MAC/Signature over exporter content is present and thus which message flow is being used for authentication. However, this is only possible if the token itself is self-describing, which I do not think is guaranteed. Consider, for example, the case of a token that's a reference value that must be introspected in order to retrieve claim information. Such tokens can be arbitrary byte strings, so I think we need some other in-band way to differentiate between authentication methods.

Section 2.2.4

   To use AUTH, the Client MUST set the Authentication Method as a property of a CONNECT packet by using the property identifier 21 (0x15). This is followed by a UTF-8 Encoded String containing the name of the Authentication Method, which MUST be set to 'ace'. [...]

(I assume there's not an MQTT registry that we can register "ace" in as an Authentication Method.)

Section 2.2.4.1

   For this option, the Authentication Data MUST contain the two-byte integer token length, the token, and the keyed message digest (MAC) or the Client signature (as shown in Figure 4). [...]

Does this go in the AUTH (which is what §2.2.4 claims to cover) or the initial CONNECT? (Hint: later in the paragraph the broker replies to it with a CONNACK.)

   Section 7.5 of [RFC8446]). This content is exported from the TLS session using the exporter label 'EXPORTER-ACE-MQTT-Sign-Challenge', an empty context, and length of 32 bytes. [...]

While an empty context should provide ample protection, it seems to me that we could consider using the client identifier as the context. There can also be value in incorporating information on the server identity in the output. If the SNI extension is used, that information would already be included in the key schedule, though we do not currently seem to mandate SNI usage.
Current best practices for new deployments are to always use SNI and always use ALPN, so we should probably consider both of those.

   a CONNACK with the appropriate response code. The client cannot reauthenticate using this method during the same session (see Section 4).

Depending on what "session" means, this restriction may be too strict. We should probably be more clear about what "session" means...if it's the MQTT session, I think it's okay to use this method when re-connecting to take over the session.

Section 2.2.4.2

I wonder a little bit whether we need to hardcode the nonce length or could let it be variable, with the corresponding change in level of protection provided. In non-constrained setups we would typically use a 128- or even 256-bit nonce for this purpose, not a 64-bit one, and it's somewhat surprising to preclude the stronger usage. (Especially so since we use a 256-bit value from the TLS exporter.) If we do allow for length variation, we'll need to add length prefixes to the MAC/signature input (we should probably do that anyway since the client nonce is variable-length).

Section 2.2.5

   To authenticate the Client, the RS validates the signature or the MAC, depending on how the PoP protocol is implemented. HS256 (HMAC-SHA-256) [RFC6234] and Ed25519 [RFC8032] are mandatory to implement depending on the choice of symmetric or asymmetric validation.

I think there is a decent argument (and that it's likely some other AD will make it) that we need to make both HS256 and Ed25519 mandatory to implement for the Broker, leaving only clients with the choice. Otherwise we can get into scenarios where interop is impossible.

   Validation of the signature or MAC MUST fail if the signature algorithm is set to "none", when the key used for the signature algorithm cannot be determined, or the computed and received signature/MAC do not match.

Where would the "none" appear?
We haven't said anything about a COSE encoding for the signature or MAC value, or anything like that...I assumed it was going to be the "raw" output from the relevant primitive (EdDSA, HMAC, etc.).

Section 2.2.6.2

   On success, the reason code of the CONNACK is "0x00 (Success)". The AS informs the client that selected profile is "mqtt_tls" using the "ace_profile" parameter in the token response. If the Broker starts

The line about the AS returning "mqtt_tls" as the selected profile feels out of place here, where we're discussing successful MQTT authorization.

   a new session, it MUST also set Session Present to 0 in the CONNACK packet to signal a clean session to the Client. Otherwise, it MUST set Session Present to 1.

(As above,) my understanding is that the Broker does not have agency over whether a new session is started, and must honor the client's request. So these "MUST set" seem out of place.

   If the Broker accepts the connection, it MUST store the token until the end of the connection. On Client or Broker disconnection, the Client is expected to transport a token again on the next connection attempt.

This seems to deviate somewhat from the framework, that expects the RS to be prepared to store at least one token for future use, and recommends storing one token per PoP key.

   The Broker SHOULD also use a cache time out to introspect tokens regularly.

We will surely be asked to provide guidance on what timescale "regularly" indicates, if we do not proactively provide some guidance.

Section 3

   The scope field contains the publish and subscribe permissions for the Client. The scope is a JSON array, each item following the Authorization Information Format (AIF) for ACE [I-D.ietf-ace-aif]. Using the Concise Data Definition Language (CDDL) [RFC8610], the specific data model for MQTT is:

This seems a little dicey, since we claim to allow JWT tokens as well as CWT.
JWT "scope" is pretty tightly nailed down to be a space-separated list (though CWT gives much greater freedom). We probably need to have some text about this situation, with a phrase about how "in order to be compatible with the JWT scope format, we use a single scope value with internal structure", that this structure is also compatible with the CWT rules, and noting that our AIF usage prevents any internal spaces, so the interpretation of the scope value is unambiguous even when unmodified JWT libraries are used.

   If the Will Flag is set, then the Broker MUST check that the token allows the publication of the Will message (i.e. the Will Topic filter is in the scope array).

We might want to say a little more about when this check happens. My intuition is that it occurs during the CONNECT processing where the actual Will message is provided, and that the connection would be rejected if the Will message is unauthorized. But this section is titled "Authorizing PUBLISH and SUBSCRIBE Messages", so one might be forgiven for assuming that CONNECT is out of scope...

Section 3.2

   subscription to the particular topic). The Broker sends a PUBLISH message with the Topic name to all the valid subscribers.

(nit) It seems a little strange to mention specifically the Topic name but say nothing about there being a payload along with it.

Section 3.3

   On receiving the SUBSCRIBE message, the Broker MUST use the type of message (i.e. SUBSCRIBE) and the Topic Filter in the message header to match against the scope field of the stored token or introspection result. The Topic Filters MUST be equal or a subset of at least one of the 'topic_filter' fields in the scope array found in the Client's token.

I suggest being very explicit about whether the wildcards in the token scope are expanded as part of matching the topic filter in the request, or if the SUBSCRIBE message must use topic filters that match byte-for-byte the permissions granted by the token.

Section 4

   Authentication Data.
   The Broker accepts reauthentication requests if the Client has already submitted a token (may be expired) and validated via the challenge-response PoP. Otherwise, the Broker MUST deny the request. If the reauthentication fails, the Broker MUST send a DISCONNECT with the reason code "0x87 (Not Authorized)".

Is this correct? It seems to say that if the initial CONNECT used the TLS exporter for PoP, then it's forbidden to use the challenge-response method for PoP and thus impossible to reauthenticate on that connection. I don't understand why such a limitation would be needed or useful.

Section 5

   In the case of a Client DISCONNECT, the Broker deletes all the session state but MUST keep the retained messages. By setting a

As written, this seems to imply that all client DISCONNECT messages result in the loss of session state. I didn't (quickly) find a clear statement one way or the other in the MQTT spec, but it does seem that it's allowed to send a nonzero session expiry interval in a DISCONNECT, which seems to imply that such DISCONNECT messages do not cause all session state to be discarded.

   Hence, the new subscribers can receive the last sent message from the publisher for that particular topic without waiting for the next PUBLISH message. The Broker MUST continue publishing the retained messages as long as the associated tokens are valid.

This MUST seems to be repeating a requirement from the MQTT spec, which may not merit normative language from us.

   In case of disconnections due to network errors or server disconnection due to a protocol error (which includes authorization errors), the Will message is sent if the Client supplied a Will in the CONNECT message. The Client's token scope array MUST include the

This "if the Client supplied a Will in the CONNECT message" implies that authorization checks are performed at time of CONNECT...

   Will Topic. The Will message MUST be published to the Will Topic

...
but "scope array MUST include the Will Topic" suggests an authorization check when the Will message is actually sent. Is it one, the other, or both checks that must pass?

Section 6.1

[My earlier comments about the MQTT header layout apply to Figure 11 as well.]

   The Broker SHOULD NOT accept session continuation. To this end, the Broker ignores how the Clean Session Flag is set, and on connection success, the Broker MUST set the Session Present flag to 0 in the CONNACK packet to indicate a clean session to the Client. [...]

As above, this seems to violate the MQTT normative requirements (but I mostly only read about MQTT 5, not 3.1.1).

   The CONNECT in MQTT v3.1.1 does not have a field to indicate the authentication method. To signal that the Username field contains an ACE token, this field MUST be prefixed with 'ace' keyword, which is followed by the access token. [...]

An example of what this looks like would be helpful. It sounds like we just take the first three bytes for the sentinel and then go directly to the access token, vs having some kind of "separator" as part of the sentinel value.

   In MQTT v3.1.1, the MQTT Username is a UTF-8 encoded string (i.e. is prefixed by a 2-byte length field followed by UTF-8 encoded character data) and may be up to 65535 bytes. Therefore, an access token that is not a valid UTF-8 MUST be Base64 [RFC4648] encoded. (The MQTT Password allows binary data up to 65535 bytes.)

Data-dependent encoding transformation without explicit signaling is a really bad idea. I think we need to always base64-encode the token. We should also specify a section reference to RFC 4648 (for plain base64 vs base64url) and probably make a statement about whether padding characters are retained or omitted.

Section 6.2

   o RS-Client PUBLISH authorization failure: When RS is forwarding PUBLISH messages to the subscribed Clients, it may discover that some of the subscribers are no more authorized due to expired tokens.
   These token expirations SHOULD lead to disconnecting the Client rather than silently dropping messages.

(I'm not actually sure how much the MQTT spec says about this type of scenario, and thus whether the "SHOULD" is the right term to use.)

Section 7.1

   This document registers 'EXPORTER-ACE-MQTT-Sign-Challenge' (introduced in Section 2.2.4.1 in this document) in the TLS Exporter Label Registry [RFC8447].

We need to specify the other columns in the registry. I think we can have:

DTLS-OK: Y
Recommended: Y
Reference: [this document]

Section 7.2

   This document registers the 'application/ace+json' media type for messages of the protocols defined in this document carrying parameters encoded in JSON.

Thanks for sending this to the media-types list for review (https://mailarchive.ietf.org/arch/msg/media-types/85kGXBBKaWqIoCSU5k7GrE5FRWw/). It's unfortunate that nobody replied to that thread, but I don't know that there's more that we can do.

Section 7.3

   The following registrations are done for the ACE OAuth Profile Registry following the procedure specified in [I-D.ietf-ace-oauth-authz]. [...]

   o CBOR Value:

I think it's clearer for reviewers if we put "TBA" or "to be assigned by IANA" here.

   o Description: Profile for delegating Client authentication and authorization using MQTT as the application protocol and TLS For transport layer security.

It seems like it might be preferred to talk about using MQTT for the C/RS interactions and HTTP for the interactions with the AS. Separately, mention that TLS is used for confidentiality/integrity protection and server authentication; also, client authentication can be provided either via TLS or using in-band proof of possession at the MQTT application layer.

Section 8

   revoked topics.
   If the RS caches the token introspection responses, then the RS SHOULD use a reasonable cache timeout to introspect

("reasonable cache timeout" again)

   If the RS supports the public "authz-info" topic, described in Section 2.2.2, then this may be vulnerable to a DDoS attack, where many Clients use the "authz-info" public topic to transport fictitious tokens, which RS may need to store indefinitely.

We do say that the RS only stores "valid" tokens, which includes being generated by a trusted AS and having RS as the audience. So it's not clear that this statement is accurate if the attack is to involve "fictitious" tokens. Similar attacks are possible, though.

Section 10.1

The normative use of AIF means that we'll have to wait for AIF to catch up to us at some point, whether at the RFC Editor or sooner.

RFC 8447 probably does not need to be normative; it is just mentioned as the thing that is the reference for the IANA registry of TLS exporter values.

Section 10.2

I think the DTLS profile needs to be a normative reference, since we defer to it for TLS session set-up.

Likewise, RFC 6234 and RFC 8032 specify MTI algorithms for authentication, which would make them normative.

Appendix A

I think we need to update the checklist to match the current template from https://datatracker.ietf.org/doc/html/draft-ietf-ace-oauth-authz-43#appendix-C

-Ben