Hi! I conducted an AD review of draft-ietf-i2nsf-nsf-monitoring-data-model-08. Thanks for this detailed info and data model.
My high-level comments are as follows:

** There is a lot of flexibility in this data model in the cardinality of the fields and the sheer number of free-form fields. The positive of this approach is that it should be able to represent a wide variety of tools/NSFs. The negative is that significant profiling or out-of-band knowledge will be needed to make many of these fields machine readable. It would be helpful to discuss this in the text.

** There are a number of taxonomies in this data model. They will be helpful to parse and triage alarms. However, I have some concerns about the completeness of those currently specified and the lack of discussion on how they might be extended. It would also be helpful to cover this in the text.

** (Mentioned below in detail) I found the philosophical framing in Sections 3 and 4 not entirely in sync. Additionally, the taxonomy of different kinds of data introduced in Section 4 did not clearly align for me against the data model in Section 10.

** (Mentioned below in detail) I found a few places where there were assumptions and architectural elements outside of the base I2NSF architecture (RFC8329 and draft-ietf-i2nsf-applicability-18). Where that occurs, more detail would be helpful, or reconsideration of whether this is necessary.

Now the more specific comments:

** This document provides both an information model and seemingly a YANG module to implement it. I may have missed it, but it would be helpful to state that obvious fact.

** Section 1. Editorial. s/Monitoring procedures intent to acquire/Monitoring procedures acquire/

** Section 1.
This sentence didn't parse for me:

OLD
Monitoring procedures intent to acquire vital types of data with respect to NSFs, (e.g., alarms, records, and counters) via data in motion (e.g., queries, notifications, and events)

NEW
This interface enables the sharing of vital data from the NSFs (e.g., alarms, records, and counters) to the Security Controller through a variety of mechanisms (e.g., queries, notifications, and events).

** Section 1. s/for an NSF for an NSF/for an NSF/

** Section 1. Recommend sticking to the named I2NSF architecture of RFC8329, so s/e.g., Security Controller and NSF Data Analyzer/e.g., Security Controller/

** Section 1. Is the phrase "... provides visibility for an NSF for an NSF data collector ..." the same thing as saying "...provides visibility into an NSF for the NSF data collector"?

** Section 1. How important is it to introduce the new architectural element of the "NSF data collector" as a super-set of the previously defined Security Controller and the never previously mentioned or defined "NSF Data Analyzer"? I ask because the I2NSF reference architectures in RFC8329 and draft-ietf-i2nsf-applicability don't have it. I see value in keeping the language consistent with previous drafts by defining a new role for the Security Controller (from draft-ietf-i2nsf-applicability) or Network Management Operator System (from RFC8329) instead. If it's imperative to invent this new architecture term, please explain how it is architecturally different from these previously defined components.

** Section 1. Editorial. Per "The information model for the NSF monitoring interface presented in this document is a complementary information model to the information model ...", recommend against using the term "information model" three times in the same sentence, for readability.

** Section 3. I found these use cases clear. However, after reading Section 4, I had trouble relating the terminology and nuances to this section.
-- There is discussion of "events" and "activity logs" in these use cases. However, Section 4 notes events, notifications and records. How do notifications and records align with these use cases?
-- These use cases primarily discuss acting on events and being triggered by activity. However, in subsequent sections (e.g., Section 7), there is mention of "alerts" and "alarms". How are they related?

** Section 4. Editorial. Per "In order to maintain ...", this sentence has a double negative (two "not").

** Section 4. Per "Three basic domains about the monitoring information originating from a system entity [RFC4949] or an NSF are highlighted in this document", what is the relationship between a "system entity" and an NSF? What's a "system entity" in the I2NSF framework/architecture?

** Section 4.

The Alarm Management Framework in [RFC3877] defines an Event as something that happens as a thing of of interest.

-- Typo. s/of of/of/
-- The definition from RFC3877 is "Something that happens which may be of interest". Editorially, the exact words are clearer.

** Section 4. The citation doesn't seem right. Per:

It defines a fault as a change in status, crossing a threshold, or an external input to the system.

Section 3.1 of RFC3877 says a fault is "Lasting error or warning condition." The quoted text of "... as a change in status, crossing a threshold, or an external input to the system" comes from the definition of an event, which states them as examples: "A fault, a change in status, crossing a threshold, or an external input to the system, for example."

** Section 4. Per "... the scope of the Alarm Management Framework's Events is still applicable due to its broad definition", can you please clarify what is being invoked from RFC3877 beyond these definitions?

** Section 4.1. The section used the term "retention" in a way I didn't expect. I'm most familiar with retention practices in the area of alerts and logs as discussing what information is kept and for how long.
Does that need to be covered here?

** Section 4.1. Editorial.

OLD
Typically, a system entity populates standardized interface, such as SNMP, NETCONF, RESTCONF or CoMI to provide and emit created information directly via NSF Monitoring Interface

NEW
Typically, a system entity populates standardized interfaces, such as SNMP, NETCONF, RESTCONF or CoMI to emit information via the NSF Monitoring Interface

** Section 4.1. Per "Alternatively, the created information is ...", is it "alternatively" or "additionally"?

** Section 4.1. Typo. s/ Monistoring/Monitoring/

** Section 4.1. Per the paragraph beginning with "Information retained on a system entity ...", I had trouble following this guidance.
-- Is the text suggesting that the data be emitted in some way other than the standardized data model described in this document?

** Section 4.1. Per "An I2NSF User is required to process fresh [RFC4949] records ...",
-- how does an I2NSF user know the records are fresh?
-- what is a "homogenizing function"?
-- per the architecture, how is the I2NSF user "...provid[ing] them to other I2NSF Components", as the user only interacts with the Controller?

** Section 4.1. Per "When retained or emitted, the information required to support monitoring processes has to be processed by an I2NSF User at some point in the workflow. Typical locations of these I2NSF Users are: ...", unless I'm misunderstanding what is meant by location, this behavior doesn't seem to align with the reference architecture of Figure 1 of draft-ietf-i2nsf-applicability or Section 3 of RFC8329, which suggest that only the security controller is directly interacting with the NSF.

** Section 4.2. There is a distinction being made between an event and a notification, but that distinction isn't clear to me. Are events and notifications the same things except events come from an I2NSF component, but notifications do not? If notifications aren't part of the I2NSF architecture, how are they in scope for this data model?
Do multiple notifications aggregate to be an event?

** Section 4.2. There is a deliberate taxonomy being created around events, notifications and records, with each having properties different enough to warrant distinct categories. This distinction is not clear to me when manifested in the YANG module. Are all top level containers with "*-event-*" in their names events? Are records the containers with "log" in their name? What are notifications? Additionally, "alarms" are later introduced in the text. How do they relate?

** Section 4.4. It isn't clear from the text whether records are shared via the monitoring interface.

** Section 4.4. Per "Unlike information emitted via notifications and events, records do not require immediate attention from an analyst but may be useful for visibility and retroactive cyber forensic":
-- I don't see where in Section 4.2 it is noted that notifications require immediate attention from an analyst
-- What is the relationship between the analyst and the "I2NSF user"?

** Section 5. This section appears to be restating very similar information to Section 4.3. Are both needed?

** Section 5. Per the various examples of "data model and interaction model for data in motion", their applicability to I2NSF isn't clear. This document is specifying the encoding of data in a particular information and data model. IPFIX and NetFlow don't seem germane, and represent an alternative to some of the information described here.

** Section 5.1. This section explicitly calls out YANG Push and YANG Subscribed Notification. Are these recommended protocols? I ask because the previous sections (4.3 and 5.0) already described the generic utility of push and notification capabilities. Here the capabilities are being described again specifically.

** Sections 6 and 7. There is no guidance on whether any of these data items are mandatory.

** Section 6.
Per the Basic Information model:
-- message: is this field identifying the type of message or the message itself?
-- nsf-name: is the name a FQDN? Or any arbitrary string?
-- Should there be a timestamp?

** Section 7. Editorial. s/as alarm/as an alarm/ and s/with basic information/with the basic information/

** Section 7.*. The definitions of "acquisition method", "emission-type" and "dampening-type" aren't clear from the one-word descriptions. Earlier text hints at acquisition method and emission type, but dampening isn't mentioned. Recommend defining these up front before use here.

** Section 7.1.*.
-- A simple definition of memory, cpu, disk, etc. would be helpful to explicitly state in the text
-- each of these alarms has a "message" field. Is the proposed text the actual message? The text seems to read like the definition of that type of alarm.

** Section 7.1.5. interface-state. "up, down and congested" seems to be mixing different properties. Are the underlying semantics up-not-congested, up-but-congested, and down?

** Section 7.2.1 and 7.2.2.
-- Can a user only belong to one group?
-- Recommend defining "authentication"
-- per "login-ip-address": is this the true "login IP" or the IP address of the action that triggered the alarm?

** Section 7.2.2. Recommend defining the scope of "configuration change".

** Section 7.2.3. Should there be some indicator as to why this flow was shared?

** Section 7.3 and 7.5. I was expecting to find symmetry between the NSF events described here and those described by the "content-security-controls" and "attack-mitigation-control" in draft-ietf-i2nsf-nsf-facing-interface. I assumed that if the NSF has a certain capability then it would be able to generate events for it using this info/data model. Instead I found representations for modeling an alert for which capabilities don't seem to exist, and capabilities for which there wasn't a corresponding way to create alarms.
Specifically:
-- voip-volte, pkt-capture and mail-filtering don't seem to have an analog here
-- here there is an intrusion event, but draft-ietf-i2nsf-nsf-facing-interface makes a distinction between ids and ips
-- botnet, session and vulnerability scanning are defined here but don't have an analog in draft-ietf-i2nsf-nsf-facing-interface

** Section 7.3.1. Per attack-type
-- saying "Any one of ..." and then ending it with an "and etc" is confusing because this is no longer an enumerated list.
-- was it intentional to have this list not align with the types of DDoS attacks enumerated in draft-ietf-i2nsf-nsf-facing-interface?

** Section 7.3.1. Per dst-ip, should this be expressed as a network mask or domain name for additional flexibility?

** Section 7.3.1. Rule-name
-- Why is the "rule-name" encoded here but not in any of the other types of events (as it seems like it would be useful)?
-- Is this the name of the I2NSF Policy Rule or a rule specific to the configuration on the NSF?

** Section 7.3.1. Per "profile", can "security profile" please be defined?

** Section 7.3.3 and 7.3.4 (and same applies to YANG module)
-- What is a src/dst-zone?
-- Should the IP/port/raw_info fields describe the flow in which the malware was seen rather than a packet? (Per Section 7.3.4) I ask because it would seem unlikely that a network-based malware inspection tech would operate on a per-packet basis (or that the file with the malware payload would actually fit in a single packet). Per Section 7.3.4, most modern IDS/IPS also operate on streams/flows.

** Section 7.3.5. role.
-- The roles seem like they would benefit from further generalization. Given the diversity of C2 server approaches, would it be clearer to s/IRC and Web/C2/
-- How would a peer-to-peer zombie be handled, that is, where the compromised hosts can talk to each other?
-- Is an "other" needed to catch additional cases?

** Section 7.4.
What is the relationship of a system log to the event, notification and record taxonomy introduced earlier?

** Section 7.4.1. Per "Administrator"
-- Editorially, this is one of the few uppercase field names
-- is this a user name?

** Section 7.4.2.
-- Are the CPU, disk, sessions, traffic rates, traffic speed numbers aggregates across all CPUs and interfaces on a system?
-- How does one provide per CPU/interface/disk stats?

** Section 7.4.3
-- what is "access"? Is this describing the means by which the user accessed the system? Is PPP the Point-to-Point Protocol? I don't recognize "SVN".
-- What is "online-duration"? and "logout-duration"?

** Section 7.5.2. Editorial. Using the term "victim-id" suggests to me that a particular system has been exploited. I was under the impression that a "vulnerability scanning log" merely found a vulnerability.

** Section 7.6.1. What is the temporal frame of reference for these counters? For example, is the peak computed since the last time the counter was polled? Since some internal state was reset?

** Section 7.7.1. What are "*-regions"?

** YANG. typedef dpi-type. What is the difference between "data-filtering" and "application-behavior-control"? Wouldn't a subset of application behavior be filtering data it sends?

** YANG. typedef operation-type. "Configuration" seems underspecified, unless it is intended to mean any operation done by a user beyond login or logout. For example, if I list the contents of a data store, I'm not doing a configuration. Is that something that would be in scope to log?

** YANG. typedef operation-type. The information model distinguishes between a privileged and unprivileged user. Does that distinction apply here?

** YANG. typedef login-mode.
-- Saying "root" seems Unix-centric.
-- "mode" doesn't seem like the right term. Root, user and guest are roles, but the means of logging in is the same.

** YANG. identity periodical. Editorial. Should this be "periodic"?

** YANG.
There are a number of enumerations whose descriptions are simply repetitions of the name. Please scrub the model and add improved descriptions. For example:
-- YANG. identity dampening-type, no-dampening and on-repetition. The descriptions for these identities are not meaningful as they simply repeat their names.
-- YANG. identity authentication-mode and event-type. The specific authentication modes do not have meaningful descriptions and merely repeat their names.
-- YANG. identity access-mode. The description does not make it clear what this or the derived enums are.

** YANG. identity virus-type.
-- Typo. s/caan/can/
-- the taxonomy of virus-type seems to be incomplete. How does one characterize non-self replicating, non-trojan and pure binary malware?
-- the descriptions of the derived virus-type enums are not meaningful and simply repeat their names

** YANG. identity req-method. Is there a reason why the HTTP request methods are incomplete?

** YANG. identity whitelist and blacklist. In the spirit of inclusive language, please do not use these terms. Consider accept-list/allow-list and deny-list instead.

** YANG. identity user-defined. What does a "user-defined" list mean when compared to an accept or deny list?

** YANG. identity malicious-category. How is this different from a deny list ("identity blacklist")?

** YANG. Per the various protocol identities (i.e., identity ftp, icmpv6, etc.), can you explain how these are used? (Not saying there is an issue, I just don't understand.)

** YANG. identity http. s/HTPP/HTTP/

** YANG. grouping common-monitoring-data. In this grouping, the severity type uses enums of critical, high, middle and low. However, in the information model, Section 6, the severity is described as a number from 0..3. Shouldn't they be the same?

** YANG. leaf vendor-name. Is this field also free form, like "leaf message"?

** YANG. leaf nsf-name. Is this either an IP or an FQDN (or host name), or is the "name" here an arbitrary label?

** YANG.
grouping i2nsf-nsf-event-type-content-extended. What are "src-zone" and "dst-zone"?

** YANG. When "grouping log-action" is invoked in various cases in the model, should there be flexibility to have multiple actions? (If I'm reading the YANG correctly, it's 0..1.)

** YANG. grouping attack-rates. Editorial. Please spell out PPS and BPS in the description.

** YANG. grouping traffic-rates. What are the units and phenomenon measured by the "leaf total-traffic"?

** YANG. grouping traffic-rates. For the counters that are *-average-*, how is the time horizon conveyed?

** YANG. leaf src-user. Is that the I2NSF user? A user name?

** YANG. case i2nsf-traffic-flows.
-- This container represents flows, but all of the leaves seem to talk about packets. Recommend s/of the packet/of the flow/
-- per "leaf arrival-rate", how is this computed? Is this the average arrival rate for all packets in the flow?

** YANG. notification i2nsf-log. Case i2nsf-nsf-system-access-log.
-- Per "leaf administrator", is this the username of the "administrator"?
-- Per "leaf result", how would a binary result be encoded?
-- is there some way to convey the format of the input or output (e.g., the content is a bash shell command)?

** YANG. container i2nsf-system-res-util-log.
-- per "leaf system-status", what kind of text would be expected in this free form string?
-- per "cpu, memory, disk-usage", what are the units of the uint8 value?
-- per "session, process-num", is uint8 sufficiently large to represent a session or a process count?

** YANG. container i2nsf-system-user-activity-log.
-- what do the fields online-duration and logout-duration mean?
-- per additional-info, how should the list of values after the "e.g., Successful User ..." be read? It seems to be suggesting a set of enumerated values.

** YANG. container i2nsf-nsf-detection-ddos.
-- Per "leaf attack-src/dst-ip", can the text "...
If there are a large number of IPv4 (or IPv6) addresses, then pick a certain number of resources according to different rules" be clarified? I didn't follow the guidance about picking resources.

** YANG. i2nsf-nsf-detection-virus.
-- per "leaf file-type", could this be expressed as an IANA media type (https://www.iana.org/assignments/media-types/media-types.xhtml)?
-- should a hash of the file also be an option?
-- per "leaf os", what is meant by the "simple" adjective for "Simple OS information"?

** YANG. container i2nsf-nsf-detection-web-attack
-- What is the "uri-category"?
-- What is the "rsp-code"? Is that the HTTP response code?
-- Is the req-client-app the user agent string?
-- recommend that the descriptions reflect the precise HTTP header field names if appropriate

** YANG. container i2nsf-nsf-log-vuln-scan.
-- is there a reason why there isn't a mechanism for structured vulnerability information (e.g., CVE) or severity (e.g., CVSS)?

** Section 14. Per the usual YANG template, this text is silent on read operations and RPC. Please clarify.

** Section 14. Clarifying text.

OLD
... which can be created, modified and deleted ... are considered sensitive

NEW
... which can be created, modified and deleted ... are considered sensitive as they all could potentially impact security monitoring and mitigation activities. Write operations (e.g., edit-config) applied to these data nodes without proper protection could result in missed alarms or incorrect alarm information being returned to the NSF collector.

** Section 14. Per "The monitoring YANG module should be protected by the secure communication channel, to ensure its confidentiality and integrity.", can the intent of this sentence please be clarified? The first paragraph of this section already established that either SSH or TLS must be used.

** Section 14.
In another side, the NSF and NSF data collector can all be faked, which lead to undesirable results (i.e., leakage of an NSF's important operational information, and faked NSF sending false information to mislead the NSF data collector). The mutual authentication is essential to protected against this kind of attack. The current mainstream security technologies (i.e., TLS, DTLS, IPsec, and X.509 PKI) can be employed appropriately to provide the above security functions.

There are a few threats here and they should be separated. Mutual authentication doesn't appear to mitigate all of them.
-- compromised NSF (valid credentials): can send falsified information to the NSF collector to mislead detection or mitigation activities, and/or to hide activity. There is no in-framework mechanism to mitigate this, and this is an issue for all monitoring infrastructures.
-- compromised NSF collector (valid credentials): has visibility into all collected security alarms; entire detection and mitigation infrastructure may be suspect
-- impersonating NSF: system trying to send false information; client authentication would help the NSF collector identify this invalid NSF in the "push" model (NSF-to-collector); "pull" model (collector-to-NSF) should already be addressed
-- impersonating NSF collector: legitimate NSF is tricked into communicating with a rogue NSF collector; for "push" (NSF-to-collector), without valid credentials, this should already not work; for "pull" (collector-to-NSF), mutual auth would mitigate

** ID nits returned the following:

== Unused Reference: 'RFC2119' is defined on line 3817, but no explicit reference was found in the text
[Roman] This means the boilerplate text on RFC2119 is not used correctly

== Unused Reference: 'I-D.ietf-i2nsf-capability' is defined on line 3944, but no explicit reference was found in the text
[Roman] This should be removed.
** Downref: Normative reference to an Unknown state RFC: RFC 956
[Roman] This document is referenced in the introductory paragraph of Section 10, but doesn't appear to be used.

** Obsolete normative reference: RFC 2616 (Obsoleted by RFC 7230, RFC 7231, RFC 7232, RFC 7233, RFC 7234, RFC 7235)

** Downref: Normative reference to an Informational RFC: RFC 3954
[Roman] This reference can be informative

** Downref: Normative reference to an Informational RFC: RFC 4949
[Roman] This reference can be informative

** Downref: Normative reference to an Historic RFC: RFC 6587
[Roman] This reference can be informative. However, are you sure it shouldn't be rfc5425 (syslog over TLS)?

** Downref: Normative reference to an Informational RFC: RFC 8329
[Roman] This reference can be informative

== Outdated reference: draft-ietf-netconf-subscribed-notifications has been published as RFC 8639
== Outdated reference: draft-ietf-netconf-yang-push has been published as RFC 8641
[Roman] These two references just need to be replaced by their corresponding RFCs.

Regards,
Roman