[I2nsf] AD Review of draft-ietf-i2nsf-nsf-monitoring-data-model-08

Roman Danyliw Fri, 25 Jun 2021 11:26:55 -0700

Hi!

I conducted an AD review of draft-ietf-i2nsf-nsf-monitoring-data-model-08.  
Thanks for this detailed info and data model.


My high-level comments are as follows:

** There is a lot of flexibility in this data model in the cardinality of the 
fields and the sheer number of free form fields.  The positive of this approach 
is that it should be able to represent a wide variety of tools/NSF.  The 
negative of this is that significant profiling or out of band knowledge will be 
needed to make many of these fields machine readable.  It would be helpful to 
discuss this in the text.

** There are a number of taxonomies in this data model.  They will be helpful 
to parse and triage alarms.  However, I have some concerns about the 
completeness of those currently specified and the lack of discussion on how 
they might be extended.  It would also be helpful to cover this in the text.

** (Mentioned below in detail) I found the philosophical framing in Sections 3 
and 4 not entirely in sync.  Additionally, the taxonomy of different kinds of 
data introduced in Section 4 did not clearly align for me against the data 
model in Section 10.

** (Mentioned below in detail) I found a few places where there were assumptions 
and architectural elements outside of the base I2NSF architecture (RFC8329 and 
draft-ietf-i2nsf-applicability-18).  Where that occurs more detail would be 
helpful, or reconsideration of whether this is necessary.

Now the more specific comments:

** This document provides both an information model and seemingly a YANG module 
to implement it.  I may have missed it, but it would be helpful to state that 
obvious fact.

** Section 1. Editorial. s/Monitoring procedures intent to acquire/Monitoring 
procedures acquire/

** Section 1.  This sentence didn't parse for me:

OLD
Monitoring procedures
   intent to acquire  vital types of data with respect to NSFs, (e.g.,
   alarms, records, and counters) via data in motion (e.g., queries,
   notifications, and events)

NEW
This interface enables the sharing of vital data from the NSFs (e.g., alarms, 
records, and counters) to the Security Controller through a variety of 
mechanisms (e.g., queries, notifications, and events).

** Section 1.  s/for an NSF for an NSF/for an NSF/

** Section 1.  Recommend sticking to the named I2NSF architecture of RFC8329 so 
s/e.g., Security Controller and NSF Data Analyzer/e.g., Security Controller/

** Section 1.  Is the phrase "... provides visibility for an NSF for an NSF 
data collector ...", the same thing as saying "...provides visibility into an 
NSF for the NSF data collector"?

** Section 1.  How important is it to introduce the new architectural element 
of the "NSF data collector" as a super-set of the previously defined Security 
Controller and the never previously mentioned or defined "NSF Data Analyzer"?  
I ask because the I2NSF reference architectures in RFC8329 and 
draft-ietf-i2nsf-applicability don't have it.  
I see value in keeping the language consistent with previous drafts by defining 
a new role for the Security Controller (from draft-ietf-i2nsf-applicability) or 
Network Management Operator System (from RFC8329) instead.  If it's imperative 
to invent this new architecture term, please explain how it is architecturally 
different than these previously defined components.

** Section 1.  Editorial.  Per "The information model for the NSF monitoring 
interface presented in this document is a complementary information    model to 
the information model ...", recommend against using the term "information 
model" three times in the same sentence for readability.

** Section 3.  I found these use cases clear.  However, after reading Section 
4, I had trouble relating the terminology and nuances to this section.

-- There is discussion of "events" and "activity logs" in these use cases.  
However, Section 4 notes events, notifications and records.  How do 
notifications and records align with these use cases?

-- These use cases primarily discuss acting on events and being triggered by 
activity.  However, in subsequent sections (e.g., Section 7), there is note of 
"alerts" and "alarms".  How are they related?

** Section 4.  Editorial. Per "In order to maintain ...", this sentence has a 
double negative (two "not")

** Section 4.  Per "Three basic domains about the monitoring information 
originating from a system entity [RFC4949] or an NSF are highlighted in this 
document", what is the relationship between a "system entity" and NSF?  What's 
a "system entity" in the I2NSF framework/architecture?

** Section 4.

The Alarm Management Framework in [RFC3877] defines an Event as
   something that happens as a thing of of interest.  

-- Typo. s/of of/of/

-- The definition from RFC3877 is "Something that happens which may be of 
interest".  Editorially, the exact words are clearer.

** Section 4.  The citation doesn't seem right.  Per:

It defines a fault
   as a change in status, crossing a threshold, or an external input to
   the system.

Section 3.1 of RFC3877 says a fault is "Lasting error or warning condition."  
The quoted text of "... as a change in status, crossing a threshold, or an 
external input to the system" comes from the definition of an event and it 
states them as examples: "A fault, a change in status, crossing a threshold, or 
an external input to the system, for example."

** Section 4.  Per "... the scope of the Alarm Management Framework's Events is 
still applicable due to its broad definition", can you please clarify what is 
being invoked from RFC3877 beyond these definitions.

** Section 4.1.  The section uses the term "retention" in a way I didn't expect.  
I'm most familiar with retention practices in the area of alerts and logs as 
discussing what information is kept and for how long.  Does that need to be 
covered here?

** Section 4.1. Editorial.

OLD
   Typically, a system entity populates standardized interface, such as
   SNMP, NETCONF, RESTCONF or CoMI to provide and emit created
   information directly via NSF Monitoring Interface

NEW
Typically, a system entity populates standardized interfaces, such as
SNMP, NETCONF, RESTCONF or CoMI to emit information via the NSF Monitoring 
Interface

** Section 4.1.  Per "Alternatively,   the created information is ...", is it 
"alternatively" or "additionally"?  

** Section 4.1. Typo. s/ Monistoring/Monitoring/

** Section 4.1.  Per the paragraph beginning with "Information retained on a 
system entity ...", I had trouble following this guidance -- Is the text 
suggesting that the data be emitted in some way other than the standardized 
data model described in this document?

** Section 4.1.  Per "An I2NSF User is required to process fresh [RFC4949] 
records ...", 

-- how does an I2NSF user know the records are fresh?

-- what is a "homogenizing function"?

-- per the architecture, how is the I2NSF user "...provid[ing] them to other 
I2NSF Components", as the user only interacts with the Controller?

** Section 4.1.  Per "When retained or emitted, the information required to 
support monitoring processes has to be processed by an I2NSF User at some point 
in the workflow.  Typical locations of these I2NSF Users are: ...", unless I'm 
misunderstanding what is meant by location, this behavior doesn't seem to align 
with the reference architecture of Figure 1 of draft-ietf-i2nsf-applicability 
or Section 3 of RFC8329, which suggest that only the security controller is 
directly interacting with the NSF.

** Section 4.2.  There is a distinction being made between an event and a 
notification, but that distinction isn't clear to me - are events and 
notifications the same things except events come from an I2NSF component, but 
notifications do not?  If notifications aren't part of the I2NSF architecture, 
how are they in scope to this data model? Do multiple notifications aggregate 
to be an event?

** Section 4.2.  There is a deliberate taxonomy being created around events, 
notifications and records with each having significantly different properties 
to warrant distinct categories. This distinction is not clear to me when 
manifested in the YANG module.  Are all top level containers with "*-event-*" 
in their names events?  Are records the containers with "log" in their name?  
What are notifications?  

Additionally, "alarms" are later introduced in the text, how do they relate?

** Section 4.4.  It isn't clear from the text whether records are shared via 
the monitoring interface.

** Section 4.4.  Per "Unlike information emitted via notifications and events,  
     records do not require immediate attention from an analyst but may be 
useful for visibility and retroactive cyber forensic":
-- I don't see text in Section 4.2 noting that notifications require 
immediate attention from an analyst.

-- What is the relationship between the analyst and the "I2NSF user"?

** Section 5. This section appears to be restating very similar information to 
Section 4.3.  Are both needed?

** Section 5.  Per the various examples of "data model and interaction model 
for data in motion", their applicability to I2NSF isn't clear.  This document 
is specifying the encoding of data in a particular information and data model.  
IPFIX and NetFlow don't seem germane, and represent an alternative to some of 
the information described here.

** Section 5.1.  This section explicitly calls out YANG Push and YANG 
Subscribed Notification.  Are these  recommended protocols?  I ask because the 
previous sections (4.3 and 5.0), already described the generic utility of push 
and notification capabilities.  Here the capabilities are being described again 
specifically.

** Section 6 and 7.  There is no guidance on whether any of these data items 
are mandatory.  

** Section 6.  Per the Basic Information model:

-- message: is this field identifying the type of message or the message 
itself?  

-- nsf-name: is the name a FQDN? Or any arbitrary string?

-- Should there be a timestamp?

** Section 7.  Editorial.  s/as alarm/as an alarm/ and s/with basic 
information/with the basic information/

** Section 7.*.  The definitions of "acquisition method", "emission-type" and 
"dampening-type" aren't clear from the one-word descriptions.  Earlier text 
hints at acquisition method and emission type, but dampening isn't mentioned.  
Recommend up front definition before use here.

** Section 7.1.*.  
-- A simple definition of memory, cpu, disk, etc. would be helpful to 
explicitly say in the text
-- each of these alarms has a "message" field.  Is the proposed text the actual 
message?  The text seems to read like the definition of that type of alarm.

** Section 7.1.5. interface-state.  "up, down and congested" seems to be mixing 
different properties.  Are the underlying semantics up-not-congested, 
up-but-congested, and down?

** Section 7.2.1 and 7.2.2.
-- Can a user only belong to one group?
-- Recommend defining "authentication"
-- per "login-ip-address" - is this the true "login IP" or the IP address of 
the action that triggered the alarm?

** Section 7.2.2.  Recommend defining the scope of "configuration change"

** Section 7.2.3.  Should there be some indicator as to why this flow was 
shared?

** Section 7.3 and 7.5.  I was expecting to find symmetry between the NSF 
events described here and those described by the "content-security-controls" 
and "attack-mitigation-control" in draft-ietf-i2nsf-nsf-facing-interface.  I 
assumed that if the NSF has a certain capability then it would be able to 
generate events for it using this info/data model.  Instead I found 
representations for modeling alerts for which capabilities don't seem to 
exist, and capabilities for which there wasn't a corresponding way to create 
alarms.  Specifically: 

-- voip-volte, pkt-capture and mail-filtering don't seem to have an analog here
-- here there is an intrusion event but draft-ietf-i2nsf-nsf-facing-interface 
makes a distinction between ids and ips
-- botnet, session and vulnerability scanning are defined here but don't have an 
analog in draft-ietf-i2nsf-nsf-facing-interface

** Section 7.3.1.  Per attack-type
-- saying "Any one of ..." and then ending it with an "and etc" is confusing 
because this is no longer an enumerated list.
-- was it intentional to have this list not align with the types of DDOS 
attacks enumerated in draft-ietf-i2nsf-nsf-facing-interface?

** Section 7.3.1.  Per dst-ip, should this be expressed as a network mask or 
domain name for additional flexibility?

** Section 7.3.1.  Rule-name
-- Why is the "rule-name" encoded here but not in any of the other types of 
events (as it seems like it would be useful)?
-- Is this the name of the I2NSF Policy Rule or rule specific to the 
configuration on the NSF?

** Section 7.3.1.  Per "profile", can "security profile" please be defined.

** Section 7.3.3 and 7.3.4 (and same applies to YANG module)
-- What is a src/dst-zone?
-- Should the IP/port/raw_info fields describe the flow in which the malware 
was seen rather than a packet?  (Per Section 7.3.4) I ask because it would seem 
unlikely that a network-based malware inspection tech would operate on a 
per-packet basis (or that the file with the malware payload actually fit in a 
single packet).  Per Section 7.3.4, most modern IDS/IPS also operate on 
streams/flows.

** Section 7.3.5.  role.  
-- The roles seem like they would benefit from further generalization.  Given 
the diversity of C2 server approaches, would it be clearer to s/IRC and Web/C2/?
-- How would peer-to-peer zombie be handled, that is where the compromised 
hosts can talk to each other?
-- Is an "other" needed to catch additional cases?

** Section 7.4.  What is the relationship of a system log to the event, 
notification and record taxonomy introduced earlier?

** Section 7.4.1, Per "Administrator"
-- Editorially, this is one of the few uppercase field names
-- is this a user name?

** Section 7.4.2.
-- Are the CPU, disk, sessions, traffic rates, traffic speed numbers aggregates 
across all CPUs and interfaces on a system?
-- How does one provide per CPU/interface/disk stats?

** Section 7.4.3
-- what is "access"?  Is this describing the means by which the user accessed 
the system?  Is PPP the Point-to-Point Protocol?  I don't recognize "SVN".
-- What is "online-duration"? and "logout-duration"?

** Section 7.5.2  Editorial.  Using the term "victim-id" suggests to me that a 
particular system has been exploited.  I was under the impression that a 
"vulnerability scanning log" merely found a vulnerability.

** Section 7.6.1.  What is the temporal frame of reference for these counters?  
For example, is the peak computed since the last time the counter was polled?  
Since some internal state was reset?

** Section 7.7.1, What are "*-regions"?

** YANG.  typedef dpi-type.  What is the difference between "data-filtering" 
and "application-behavior-control"?  Wouldn't a subset of application behavior 
be filtering data it sends?

** YANG.  typedef operation-type.  "Configuration" seems underspecified, unless 
it is intended to mean any operation done by a user beyond login or logout.  
For example, if I list the contents of a data store, I'm not doing a 
configuration.  Is that something that would be in scope to log?

** YANG.  typedef operation-type.  The information model distinguishes between 
a privileged and unprivileged user.  Does that distinction apply here?

** YANG.  typedef login-mode.
-- Saying "root" seems Unix-centric.
-- "mode" doesn't seem like the right term.  Root, user and guest are roles, 
but the means of logging in is the same.

** YANG.  identity periodical.  Editorial.  Should this be  "periodic"?

** YANG.  There are a number of enumerations whose descriptions are simply 
repetitions of the name.  Please scrub the model and add improved descriptions. 
 For example:
-- YANG.  identity dampening-type, no-dampening and on-repetition.  The 
descriptions for these identities are not meaningful as they simply repeat 
their names.
-- YANG.  identity authentication-mode and event-type.  The specific 
authentication modes do not have meaningful descriptions and merely repeat 
their names.
-- YANG.  identity access-mode.  The description does not make it clear what 
this or the derived enums are.

** YANG.  identity virus-type.  
-- Typo.  s/caan/can/
-- the taxonomy of virus-type seems to be incomplete.  How does one 
characterize non-self replicating, non-trojan and pure binary malware?
-- the descriptions of the derived virus-type enums are not meaningful and 
simply repeat their names

** YANG.  identity req-method.  Is there a reason why the HTTP request methods 
are incomplete?

** YANG.  identity whitelist and blacklist.  In the spirit of inclusive 
language, please do not use these terms.  Consider accept/allow vs. deny-list.  

** YANG.  identity user-defined.  What does a "user-defined" list mean when 
compared to an accept or deny list?

** YANG.  identity malicious-category.  How is this different than a deny list 
("identity blacklist")?

** YANG.  Per the various protocol identities (i.e., identity ftp, icmpv6, 
etc.), can you explain how these are used (not saying there is an issue, I 
just don't understand)?

** YANG.  identity http. s/HTPP/HTTP/

** YANG.  grouping common-monitoring-data.  In this grouping, the severity type 
uses enums of critical, high, middle and low.  However, in the information 
model, Section 6, the severity is described as a number from 0..3.  Shouldn't 
they be the same?
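
To make the mismatch concrete, here is a minimal Python sketch of one way the 
two representations could be aligned.  The ordering (0 as low through 3 as 
critical) and the names Severity and severity_label are assumptions for 
illustration only; neither is taken from the draft.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Assumed ordering, not confirmed by the draft: 0 is the least severe.
    LOW = 0
    MIDDLE = 1
    HIGH = 2
    CRITICAL = 3


def severity_label(number: int) -> str:
    """Map a 0..3 severity number to the corresponding enum-style name."""
    # Severity(number) raises ValueError for anything outside 0..3.
    return Severity(number).name.lower()


print(severity_label(3))
```

Whichever direction the draft picks, stating the mapping once would let the 
information model and the YANG module agree.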

** YANG.  leaf vendor-name.  Is this field also free form, like "leaf message"?

** YANG.  leaf nsf-name.  Is this either an IP or an FQDN (or host name), or is 
the "name" here an arbitrary label?

** YANG.  grouping i2nsf-nsf-event-type-content-extended.  What are "src-zone" 
and "dst-zone"?

** YANG.  When "grouping log-action" is invoked in various cases in the model, 
should there be flexibility to have multiple actions?  (If I'm reading the YANG 
correctly, it's 0..1.)

** YANG.  grouping attack-rates.  Editorial.  Please spell out PPS and BPS in 
the description.

** YANG.  grouping traffic-rates.  What are the units and phenomenon measured 
by the "leaf total-traffic"?

** YANG.  grouping traffic-rates.  For the counters that are *-average-*, how 
is the time horizon conveyed?

** YANG. leaf src-user.  Is that the I2NSF user? A user name?

** YANG.  case i2nsf-traffic-flows.  
-- This container represents flows, but all of the leaves seem to talk about 
packets.  Recommend s/of the packet/of the flow/
-- per "leaf arrival-rate", how is this computed?  Is this the average arrival 
rate for all packets in the flow?

** YANG.  notification i2nsf-log.  Case i2nsf-nsf-system-access-log.  
-- Per "leaf administrator", is this the username of the "administrator"?
-- Per "leaf result", how would a binary result be encoded?
-- is there some way to convey the format of the input or output (e.g., the 
content is a bash shell command)?

** YANG. container i2nsf-system-res-util-log.
-- per "leaf system-status", what kind of text would be expected in this free 
form string?
-- per "cpu, memory, disk-usage", what are the units of the uint8 value?
-- per "session, process-num", is uint8 sufficiently large to represent a 
session or process count?
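
As a quick illustration of the range concern, a YANG uint8 holds values from 
0 to 255.  The Python sketch below checks whether a value fits; the helper 
name fits_uint8 and the sample numbers are hypothetical, not from the draft.

```python
UINT8_MAX = 2**8 - 1  # a YANG uint8 covers 0..255


def fits_uint8(value: int) -> bool:
    """Return True if value can be carried in a uint8 leaf."""
    return 0 <= value <= UINT8_MAX


# A percentage (0..100) fits comfortably in a uint8.
print(fits_uint8(100))
# A hypothetical count of 4096 concurrent sessions does not.
print(fits_uint8(4096))
```

If the usage leaves are percentages, uint8 is enough; a session or process 
count would likely need a wider type.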

** YANG. container i2nsf-system-user-activity-log.  
-- what do the fields online-duration and logout-duration mean?
-- per additional-info,  how should the list of values after the "e.g., 
Successful User ..." be read?  It seems to be suggesting a set of enumerated 
values.

** YANG.  container i2nsf-nsf-detection-ddos.  
-- Per "leaf attack-src/dst-ip", can the text "... If there are a large number 
of IPv4 (or IPv6) addresses, then pick a certain number of resources according 
to different rules" be clarified?  I didn't follow the guidance about picking 
resources.

** YANG.  i2nsf-nsf-detection-virus.
-- per "leaf file-type", could this be expressed as a media type from 
https://www.iana.org/assignments/media-types/media-types.xhtml?
-- should a hash of the file also be an option?
-- per "leaf os", what is meant by the "simple" adjective for "Simple OS 
information"?

** YANG.  container i2nsf-nsf-detection-web-attack
-- What is the "uri-category"?
-- What is the "rsp-code", is that the HTTP response code?
-- Is the req-client-app, the user agent string?
-- recommend that the descriptions reflect the precise HTTP header field names 
if appropriate

** YANG.  container i2nsf-nsf-log-vuln-scan. 
-- is there a reason why there isn't a mechanism for structured vulnerability 
information (e.g., CVE) or severity (e.g., CVSS)?

** Section 14.  Per the usual YANG template, this text is silent on read 
operations and RPC.  Please clarify.

** Section 14.  Clarifying text.
OLD
... which can be created, modified and deleted ... are considered sensitive

NEW
... which can be created, modified and deleted ... are considered sensitive as 
they all could potentially impact security monitoring and mitigation 
activities.  Write operations (e.g., edit-config) applied to these data nodes 
without proper protection could result in missed alarms or incorrect alarm 
information being returned to the NSF collector.

** Section 14.  Per "The monitoring YANG module should be protected by the 
secure communication channel, to ensure its confidentiality and integrity.", 
can the intent of this sentence please be clarified.  The first paragraph of 
this section already established that either SSH or TLS must be used.

** Section 14.

In another side, the NSF and NSF data collector can
   all be faked, which lead to undesirable results (i.e., leakage of an
   NSF's important operational information, and faked NSF sending false
   information to mislead the NSF data collector).   The mutual
   authentication is essential to protected against this kind of attack.
   The current mainstream security technologies (i.e., TLS, DTLS, IPsec,
   and X.509 PKI) can be employed appropriately to provide the above
   security functions.

There are a few threats here and they should be separated.  Mutual 
authentication doesn't appear to mitigate all of them.
-- compromised NSF (valid credentials): can send falsified information to the 
NSF collector to mislead detection or mitigation activities; and/or to hide 
activity.  There is no in-framework mechanism to mitigate this, and it is an 
issue for all monitoring infrastructures.
-- compromised NSF collector (has valid credentials): has visibility into all 
collected security alarms; entire detection and mitigation infrastructure may 
be suspect
-- impersonating NSF: system trying to send false information; client 
authentication would help the NSF collector identify this invalid NSF in the 
"push" model (NSF-to-collector); "pull" model (collector-to-NSF) should already 
be addressed
-- impersonating NSF collector: legitimate NSF is tricked into communicating 
with a rogue NSF collector; for "push" (NSF-to-collector), without valid 
credentials, this should already not work; for "pull" (collector-to-NSF), 
mutual auth would mitigate
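
The four threats above can be summarized in a small Python table.  This is 
only a sketch of one reading of the list: the field names are hypothetical, 
and the "compromised" rows are marked as unmitigated on the assumption that 
authentication succeeds whenever the credentials are valid.

```python
# One row per threat from the list above.
THREATS = {
    "compromised NSF": {
        "valid_credentials": True,
        "mitigated_by_mutual_auth": False,
    },
    "compromised NSF collector": {
        "valid_credentials": True,
        "mitigated_by_mutual_auth": False,
    },
    "impersonating NSF": {
        "valid_credentials": False,
        "mitigated_by_mutual_auth": True,
    },
    "impersonating NSF collector": {
        "valid_credentials": False,
        "mitigated_by_mutual_auth": True,
    },
}

# Threats that mutual authentication alone does not address.
unmitigated = [
    name for name, row in THREATS.items()
    if not row["mitigated_by_mutual_auth"]
]
print(unmitigated)
```

Separating the rows this way shows why the text should not present mutual 
authentication as covering every case.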

** ID nits returned the following:
  == Unused Reference: 'RFC2119' is defined on line 3817, but no explicit
     reference was found in the text

[Roman] This means the boilerplate text on RFC2119 is not used correctly

  == Unused Reference: 'I-D.ietf-i2nsf-capability' is defined on line 3944,
     but no explicit reference was found in the text

[Roman] This should be removed.

  ** Downref: Normative reference to an Unknown state RFC: RFC  956

[Roman] This document is referenced in the introductory paragraph of Section 
10, but doesn't appear to be used.

  ** Obsolete normative reference: RFC 2616 (Obsoleted by RFC 7230, RFC 7231,
     RFC 7232, RFC 7233, RFC 7234, RFC 7235)

  ** Downref: Normative reference to an Informational RFC: RFC 3954

[Roman] This reference can be informative

  ** Downref: Normative reference to an Informational RFC: RFC 4949

[Roman] This reference can be informative

  ** Downref: Normative reference to an Historic RFC: RFC 6587

[Roman] This reference can be informative.  However, are you sure it shouldn't 
be rfc5425 (syslog over TLS)?

  ** Downref: Normative reference to an Informational RFC: RFC 8329

[Roman] This reference can be informative

  == Outdated reference: draft-ietf-netconf-subscribed-notifications has been
     published as RFC 8639

  == Outdated reference: draft-ietf-netconf-yang-push has been published as
     RFC 8641
[Roman] These two references just need to be replaced by their corresponding RFCs.

Regards,
Roman
