Hi Haoyu,

Thanks for the prompt updates; they look good to me.  I've requested IETF LC.

Regards,
Rob


> -----Original Message-----
> From: Haoyu Song <haoyu.s...@futurewei.com>
> Sent: 13 October 2021 16:05
> To: Rob Wilton (rwilton) <rwil...@cisco.com>; draft-ietf-opsawg-
> ntf....@ietf.org
> Cc: opsawg@ietf.org; 'opsawg-chairs' <opsawg-cha...@ietf.org>
> Subject: RE: AD review of draft-ietf-opsawg-ntf-07 [2]
> 
> Hi Rob,
> 
> Thank you very much for your second review! We have made all the
> modifications you pointed out.
> https://datatracker.ietf.org/doc/draft-ietf-opsawg-ntf/09/
> Please help to move it forward. Thanks again!
> 
> Best regards,
> Haoyu
> 
> -----Original Message-----
> From: Rob Wilton (rwilton) <rwil...@cisco.com>
> Sent: Tuesday, October 12, 2021 2:08 PM
> To: Haoyu Song <haoyu.s...@futurewei.com>; draft-ietf-opsawg-
> ntf....@ietf.org
> Cc: opsawg@ietf.org; 'opsawg-chairs' <opsawg-cha...@ietf.org>
> Subject: RE: AD review of draft-ietf-opsawg-ntf-07 [2]
> 
> Hi Haoyu,
> 
> Thanks for applying the markups.
> 
> I've given -08 another read through, and I think that there are still some
> tweaks to the grammar that I would recommend.  I've also included some
> automated warnings from a grammar tool that are probably also worth fixing
> (you would get similar warnings during IESG review anyway).  I think that
> once you have fixed these we should be ready to go.
> 
> 3.2.  Use Cases
> 
>    *  Security: Network intrusion detection and prevention systems need
>       to monitor network traffic and activities and act upon anomalies.
>       Given increasingly sophisticated attack vector coupled with
>       increasingly severe consequences of security breaches, new tools
>       and techniques need to be developed, relying on wider and deeper
>       visibility into networks.  The ultimate goal is to achieve the
>       ideal security with no or minimal human intervention.
> 
> RW: suggest
> no or minimal human => no, or only minimal, human intervention
> 
> 
> * Last sentence suggest:
> 
> The ultimate goal is to achieve the ideal security with no, or only minimal,
> human intervention.
> 
>       networks.  While a policy or an intent is enforced, the compliance
>       needs to be verified and monitored continuously relying on
>       visibility that is provided through network telemetry data, any
>       violation needs to be reported immediately, and updates need to be
>       applied to ensure the intent remains in force.
> 
> RW: Suggest:
> 
> While a policy or intent is enforced, the compliance
> needs to be verified and monitored continuously by relying on
> visibility that is provided through network telemetry data.  Any
> violation must be notified immediately, potentially resulting in
> updates to how the policy or intent is applied in the network to ensure
> that it remains in force, or otherwise alerting the network administrator
> to the policy or intent violation.
> 
>    *  ...
>       overwhelming.  While machine learning technologies can be used for
>       root cause analysis, it up to the network to sense and provide the
>       relevant diagnostic data which are either actively fed into or
>       passively retrieved by machine learning applications.
> 
> RW: Suggest:
> actively fed into or passively retrieved by => actively fed into, or passively
> retrieved, by
> 
> 
> 4.  Network Telemetry Framework
> 
> RW: (Section 4.3)are applied. => (Section 4.3) are applied.
> 
> 
> 4.1.  Top Level Modules, diagram:
> 
> RW:
> 1. I'm still not sure that I would list "ACL" under a control plane object.
> 2. Thinking about it, I think that this table would be more consistent if the
> columns were ordered with management plane before control plane, e.g.,:
>    +---------+--------------+--------------+---------------+-----------+
>    | Module  | Management   | Control      | Forwarding    | External  |
>    |         | Plane        | Plane        | Plane         | Data      |
> 
> 
> 4.1.1.  Management Plane Telemetry
> 
>    *  Convenient Data Subscription: An application should have the
>       freedom to choose the data export means such as the data types (as
>       described in Figure 4) and the export means and frequency (e.g.,
>       on-change or periodic subscription).
> 
> RW:
> I don't think that the client is really choosing the data types, but
> instead choosing which data to export, and how it is exported.  How about:
> 
> Convenient Data Subscription: An application should have the
> freedom to choose which data is exported (see section 4.3) and the
> means and frequency of how that data is exported (e.g.,
> on-change or periodic subscription).
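
[Editorial illustration, not part of the draft or the review.] The difference between the two subscription modes named above can be sketched as follows. This is a minimal sketch under stated assumptions: the Subscription class, the export() helper, and the example path are all hypothetical names chosen for illustration, not an API from the draft, YANG-Push, or gNMI.

```python
from dataclasses import dataclass
from typing import List, Tuple

Sample = Tuple[float, int]  # (timestamp in seconds, observed value)


@dataclass
class Subscription:
    """Hypothetical subscription: which data to export, and how."""
    path: str            # which data the application wants (illustrative path)
    mode: str            # "on-change" or "periodic"
    period: float = 0.0  # seconds between exports; used only in periodic mode


def export(sub: Subscription, samples: List[Sample]) -> List[Sample]:
    """Return the subset of samples that would be exported for this subscription."""
    exported: List[Sample] = []
    last_value = None
    last_export_time = None
    for ts, value in samples:
        if sub.mode == "on-change":
            # Export the first sample, then only when the value differs.
            send = last_export_time is None or value != last_value
        elif sub.mode == "periodic":
            # Export the first sample, then whenever the period has elapsed.
            send = last_export_time is None or ts - last_export_time >= sub.period
        else:
            raise ValueError(f"unknown subscription mode: {sub.mode}")
        if send:
            exported.append((ts, value))
            last_export_time = ts
        last_value = value
    return exported


samples = [(0, 10), (1, 10), (2, 12), (3, 12), (4, 12)]
on_change = export(Subscription("/interfaces/eth0/in-octets", "on-change"), samples)
periodic = export(Subscription("/interfaces/eth0/in-octets", "periodic", period=2), samples)
```

With these samples, the on-change subscription exports only when the counter value differs from the previous sample, while the periodic subscription exports every two seconds regardless of whether the value changed.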
> 
>    *  High Speed Data Transport: In order to keep up with the velocity
>       of information, a server needs to be able to send large amounts of
>       data at high frequency.  Compact encoding formats or data
>       compression schemes are needed to compress the data and improve
>       the data transport efficiency.  The subscription mode, by
>       replacing the query mode, reduces the interactions between clients
>       and servers and helps to improve the server's efficiency.
> 
> RW:
> are needed to compress the data => are needed to reduce the quantity of
> data
> 
> 
> 4.1.2.  Control Plane Telemetry
> 
> RW:
> (e.g., the IGP monitoring => (e.g., IGP monitoring
> 
> 
> 4.1.3.  Forwarding Plane Telemetry
> 
> RW: Perhaps:
> between forwarding and telemetry => between forwarding performance and
> telemetry
> 
> RW:
> described in Appendix => Please add a reference to the section where
> postcard telemetry is described, perhaps A.3.5?
> 
> Very minor nit:
> Search for "e.g. " and replace with "e.g., "
> 
> 4.2.  Second Level Function Components
> 
> RW: Sorry, I had a typo in my previous suggested text, correction:
> 
> The telemetry module as each plane => The telemetry module at each plane
> 
> RW:
> provisioned in device => provisioned in the device
> 
> 4.3.  Data Acquisition Mechanism and Type Abstraction
> 
>    *  Event-triggered Data: The data are conditionally acquired based on
>       the occurrence of some events.  For example, a network interface
>       changing its operational state from up to down can be a trigger
>       event.  Such data can be actively pushed through subscription or
>       passively polled through query.  There are many ways to model
>       events, including using Finite State Machine (FSM) or Event
>       Condition Action (ECA) [I-D.wwx-netmod-event-yang].
> 
> RW: For example, a network interface changing its operational state from up
> to down can be a trigger event. =>
> 
> An example of event-triggered data could be an interface changing
> operational state between up and down.
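
[Editorial illustration, not part of the draft or the review.] The interface example above can be sketched as a small monitor that produces data only when the operational state changes. This is a sketch under stated assumptions: InterfaceMonitor, StateChange, and the interface name "eth0" are hypothetical, and the callback stands in for whatever push mechanism a real subscription would use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class StateChange:
    """Hypothetical event record for an operational state transition."""
    interface: str
    old_state: str
    new_state: str


class InterfaceMonitor:
    """Emits an event only when an interface's operational state changes."""

    def __init__(self) -> None:
        self._states: Dict[str, str] = {}
        self._subscribers: List[Callable[[StateChange], None]] = []

    def subscribe(self, callback: Callable[[StateChange], None]) -> None:
        # Subscribers receive events as they occur (push model).
        self._subscribers.append(callback)

    def observe(self, interface: str, state: str) -> None:
        previous = self._states.get(interface)
        self._states[interface] = state
        # The first observation only records a baseline; a repeated
        # state is not an event, so nothing is emitted for it.
        if previous is not None and previous != state:
            event = StateChange(interface, previous, state)
            for callback in self._subscribers:
                callback(event)


events: List[StateChange] = []
monitor = InterfaceMonitor()
monitor.subscribe(events.append)
for state in ["up", "up", "down", "down", "up"]:
    monitor.observe("eth0", state)
```

Five observations yield two events here (up to down, then down to up); the repeated states generate no data, which is the point of conditional acquisition.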
> 
> 
> 4.4.  Mapping Existing Mechanisms into the Framework
> 
> RW: Figure 5: Existing Work Mapping II => Figure 5: Existing Work Mapping
> 
> 
> 6.  Security Considerations
> 
> RW: vulnerability. => vulnerabilities.
> 
> Spellings to check:
> de-facto,
> exensive,
> secuirty,
> telemtry,
> tradeoff,
> 
> Grammar Warnings:
> Section: abstract, draft text:
> This document clarifies the terminologies and classifies the modules and
> components of a network telemetry system from several different
> perspectives.
> Warning:  Consider using several.
> Suggested change:  "several"
> 
> Section: 1, draft text:
> All the modules are internally structured in the same way, including
> components that allow to configure data sources with regards to what data
> to generate and how to make that available to client applications,
> components that instrument the underlying data sources, and components
> that perform the actual rendering, encoding, and exporting of the generated
> data.
> Warning:  Use in regard to, with regard to, or more simply regarding.
> Suggested change:  "in regard to"
> 
> Section: 2, draft text:
> - gRPC Remote Procedure Call, a open source high performance RPC
> framework that gNMI is based on.
> Warning:  Use an instead of 'a' if the following word starts with a vowel
> sound, e.g. 'an article', 'an hour'
> Suggested change:  "an"
> 
> Section: 3.2, draft text:
> The ultimate goal is to achieve the ideal security with no or minimal human
> intervention.
> Warning:  Did you mean now (=at this moment) instead of 'no' (negation)?
> Suggested change:  "now"
> 
> Section: 3.3, draft text:
> - Some of the conventional OAM techniques (e.g., CLI and Syslog) lack a
> formal data model.
> Warning:  If the text is a generality, 'of the' is not necessary.
> Suggested change:  "Some"
> 
> Section: 3.5, draft text:
>  A telemetry framework collects together all of the telemetry-related works
> from different sources and working groups within IETF.
> Warning:  Consider using all the.
> Suggested change:  "all the"
> 
> Section: 4.1, draft text:
> Some of the operational states can only be derived from data plane data
> sources such as the interface status and statistics.
> Warning:  If the text is a generality, 'of the' is not necessary.
> Suggested change:  "Some"
> 
> Section: 4.1.3, draft text:
> This raises some challenges to the network data plane devices where the first
> hand data originates.
> Warning:  'first hand' seems to be a compound adjective in front of a noun.
> Use a hyphen: first-hand.
> Suggested change:  "first-hand"
> 
> Section: 4.1.3, draft text:
> While supporting network visibility is important, the telemetry is just an
> auxiliary function, and it should strive to not impede normal traffic
> processing and forwarding (i.e., the forwarding behavior should not be
> altered and the tradeoff between forwarding and telemtry should be well
> balanced).
> Warning:  This word is normally spelled with hyphen.
> Suggested change:  "well-balanced"
> 
> Section: 6, draft text:
> For example, telemetry data can be manipulated to exhaust various network
> resources at each plane as well as the data consumer; falsified or tampered
> data can mislead the decision making and paralyze networks; wrong
> configuration and programming for telemetry is equally harmful.
> Warning:  This word is normally spelled with hyphen.
> Suggested change:  "decision-making"
> 
> Section: 6, draft text:
> Some of the security considerations highlighted above may be minimized or
> negated with policy management of network telemetry.
> Warning:  If the text is a generality, 'of the' is not necessary.
> Suggested change:  "Some"
> 
> Section: A.1.2, draft text:
> gRPC is an [RFC7540] based open source micro service communication
> framework.
> Warning:  This word is normally spelled as one.
> Suggested change:  "microservice"
> 
> Section: A.2.1, draft text:
> The BGP routes (including [RFC7854], [I-D.ietf-grow-bmp-adj-rib-out], and [I-
> D.ietf-grow-bmp-local-rib] are encapsulated in the BMP Route Monitoring
> Message and the BMP Route Mirroring Message, providing both an initial
> table dump and real-time route updates.
> Warning:  Unpaired symbol: ')' seems to be missing
> 
> Section: A.3.1, draft text:
> Since networks offer rich sets of network performance measurement data
> (e.g packet counters), traditional approaches run into limitations.
> Warning:  The abbreviation e.g. (= for example) requires two periods.
> Suggested change:  "e.g.,"
> 
> Section: A.4.1, draft text:
> For example, a sports event takes place and some unexpected movement
> makes it highly interesting and many people connects to sites that are
> reporting on the event.
> Warning:  Consider using an extreme adjective for 'interesting'.
> Suggested change:  "fascinating"
> 
> Section: A.4.1, draft text:
> For example, a sports event takes place and some unexpected movement
> makes it highly interesting and many people connects to sites that are
> reporting on the event.
> Warning:  If 'people' is plural here, don't use the third-person singular 
> verb.
> Suggested change:  "connect"
> 
> Section: A.4.1, draft text:
> Additional types of detector types can be added to the system but they will
> be generally the result of composing the properties offered by these main
> classes.
> Warning:  Use a comma before 'but' if it connects two independent clauses
> (unless they are closely connected and short).
> Suggested change:  "system, but"
> 
> Thanks,
> Rob
> 
> 
> 
> > -----Original Message-----
> > From: Haoyu Song <haoyu.s...@futurewei.com>
> > Sent: 08 October 2021 00:15
> > To: Rob Wilton (rwilton) <rwil...@cisco.com>; draft-ietf-opsawg-
> > ntf....@ietf.org
> > Cc: opsawg@ietf.org; 'opsawg-chairs' <opsawg-cha...@ietf.org>
> > Subject: RE: AD review of draft-ietf-opsawg-ntf-07 [2]
> >
> > Hi Rob,
> >
> > We have updated the draft according to your review suggestions and uploaded
> > the -08 version. In the new revision we believe all your suggestions/questions
> > have been addressed. Please let me know if you have further questions. Thank
> > you very much!
> >
> > Best regards,
> > Haoyu
> >
> >
> > -------------------------------------------------
> > A new version of I-D, draft-ietf-opsawg-ntf-08.txt has been successfully
> > submitted by Haoyu Song and posted to the IETF repository.
> >
> > Name:               draft-ietf-opsawg-ntf
> > Revision:   08
> > Title:              Network Telemetry Framework
> > Document date:      2021-10-07
> > Group:              opsawg
> > Pages:              40
> > URL:                https://www.ietf.org/archive/id/draft-ietf-opsawg-ntf-08.txt
> > Status:             https://datatracker.ietf.org/doc/draft-ietf-opsawg-ntf/
> > Htmlized:           https://datatracker.ietf.org/doc/html/draft-ietf-opsawg-ntf
> > Diff:               https://www.ietf.org/rfcdiff?url2=draft-ietf-opsawg-ntf-08
> >
> >
> > -----Original Message-----
> > From: Haoyu Song
> > Sent: Wednesday, October 6, 2021 9:14 AM
> > To: Rob Wilton (rwilton) <rwil...@cisco.com>; draft-ietf-opsawg-
> > ntf....@ietf.org
> > Cc: opsawg@ietf.org
> > Subject: RE: AD review of draft-ietf-opsawg-ntf-07 [2]
> >
> > Hi Rob,
> >
> > Thank you very much for the review! We'll update the draft as you suggested.
> >
> > Best regards,
> > Haoyu
> >
> > -----Original Message-----
> > From: Rob Wilton (rwilton) <rwil...@cisco.com>
> > Sent: Wednesday, October 6, 2021 3:55 AM
> > To: draft-ietf-opsawg-ntf....@ietf.org
> > Cc: opsawg@ietf.org
> > Subject: RE: AD review of draft-ietf-opsawg-ntf-07 [2]
> >
> > Sigh, this also appears to be truncated in my email client.
> >
> > To be sure that you see all the comments (i.e., to the end of the document),
> > please see the previous attachment. The full email can also be seen in
> > the archives at
> > https://mailarchive.ietf.org/arch/msg/opsawg/WDnVtM_vLm15X28OTEwI9Q6gfx0/
> >
> > Regards,
> > Rob
> >
> >
> > -----Original Message-----
> > From: Rob Wilton (rwilton) <rwil...@cisco.com>
> > Sent: 06 October 2021 11:48
> > To: draft-ietf-opsawg-ntf....@ietf.org
> > Cc: opsawg@ietf.org
> > Subject: AD review of draft-ietf-opsawg-ntf-07 [2]
> >
> > Hi,
> >
> >
> >
> > Here is my belated AD review of draft-ietf-opsawg-ntf-07.txt.  [Text file with
> > comments attached in case this also gets truncated.]
> >
> >
> >
> > I would like to thank you for the effort that you have put into this document,
> > and apologise for my long delay in reviewing it.
> >
> >
> >
> > Broadly, I think that this is a good and useful framework, but in some of the
> > latter parts of the document it seems to give prominence to protocols that I
> > don't think have IETF consensus behind them yet (particularly DNP).  I have
> > flagged specific concerns in comments inline within the document, but I think
> > that the document will have better accuracy/longevity if text about the potential
> > technologies is mostly kept to the appendices.
> >
> >
> >
> > There were quite a lot of cases where the text doesn't scan, or read easily,
> > particularly in the latter sections of this document, although I acknowledge
> > that none of the authors appear to be native English speakers.  Ideally, these
> > sorts of issues would have been highlighted and addressed during WG LC.
> > Although the RFC editor will improve the language of the documents, making
> > the improvements now before IESG review will aid its passage, and hopefully
> > result in a better document when it is published.  I have flagged and proposed
> > alternative text/grammar where possible.  Once you have made the markups
> > and resolved the issues/questions that I have raised then I can run it through a
> > grammar checking tool (Lars will run an equivalent tool during IESG review
> > anyway ...)
> >
> >
> >
> > All of my comments are directly inline, please search for "RW" or "RW:"
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > OPSAWG                                                           H. Song
> >
> > Internet-Draft                                                 Futurewei
> >
> > Intended status: Informational                                    F. Qin
> >
> > Expires: August 23, 2021                                    China Mobile
> >
> >                                                        P. Martinez-Julia
> >
> >                                                                     NICT
> >
> >                                                             L. Ciavaglia
> >
> >                                                                    Nokia
> >
> >                                                                  A. Wang
> >
> >                                                           China Telecom
> >
> >                                                        February 19, 2021
> >
> >
> >
> >
> >
> >                       Network Telemetry Framework
> >
> >                         draft-ietf-opsawg-ntf-07
> >
> >
> >
> > Abstract
> >
> >
> >
> >    Network telemetry is a technology for gaining network insight and
> >
> >    facilitating efficient and automated network management.  It
> >
> >    encompasses various techniques for remote data generation,
> >
> >    collection, correlation, and consumption.  This document describes an
> >
> >    architectural framework for network telemetry, motivated by
> >
> >    challenges that are encountered as part of the operation of networks
> >
> >    and by the requirements that ensue.  Network telemetry, as
> >
> >    necessitated by best industry practices, covers technologies and
> >
> >    protocols that extend beyond conventional network Operations,
> >
> >
> >
> >    Administration, and Management (OAM).  The presented network
> >
> >    telemetry framework promises flexibility, scalability, accuracy,
> >
> >    coverage, and performance.  In addition, it facilitates the
> >
> >    implementation of automated control loops to address both today's and
> >
> >    tomorrow's network operational needs.  This document clarifies the
> >
> >    terminologies and classifies the modules and components of a network
> >
> >    telemetry system from several different perspectives.  The framework
> >
> >    and taxonomy help to set a common ground for the collection of
> >
> >    related work and provide guidance for related technique and standard
> >
> >    developments.
> >
> >
> >
> > RW:
> >
> > I would suggest condensing the abstract to the following, and moving the other
> > text to the introduction if it is not already covered there.
> >
> >
> >
> >    Network telemetry is a technology for gaining network insight and
> >
> >    facilitating efficient and automated network management.  It
> >
> >    encompasses various techniques for remote data generation,
> >
> >    collection, correlation, and consumption.  This document describes an
> >
> >    architectural framework for network telemetry, motivated by
> >
> >    challenges that are encountered as part of the operation of networks
> >
> >    and by the requirements that ensue.  This document clarifies the
> >
> >    terminologies and classifies the modules and components of a network
> >
> >    telemetry system from several different perspectives.  The framework
> >
> >    and taxonomy help to set a common ground for the collection of
> >
> >    related work and provide guidance for related technique and standard
> >
> >    developments.
> >
> >
> >
> >
> >
> > Status of This Memo
> >
> >
> >
> >    This Internet-Draft is submitted in full conformance with the
> >
> >    provisions of BCP 78 and BCP 79.
> >
> >
> >
> >    Internet-Drafts are working documents of the Internet Engineering
> >
> >    Task Force (IETF).  Note that other groups may also distribute
> >
> >    working documents as Internet-Drafts.  The list of current Internet-
> >
> >    Drafts is at https://datatracker.ietf.org/drafts/current/.
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > Song, et al.             Expires August 23, 2021                [Page 1]
> >
> >
> >
> >
> > Internet-Draft         Network Telemetry Framework         February 2021
> >
> >
> >
> >
> >
> >    Internet-Drafts are draft documents valid for a maximum of six months
> >
> >    and may be updated, replaced, or obsoleted by other documents at any
> >
> >    time.  It is inappropriate to use Internet-Drafts as reference
> >
> >    material or to cite them other than as "work in progress."
> >
> >
> >
> >    This Internet-Draft will expire on August 23, 2021.
> >
> >
> >
> > Copyright Notice
> >
> >
> >
> >    Copyright (c) 2021 IETF Trust and the persons identified as the
> >
> >    document authors.  All rights reserved.
> >
> >
> >
> >    This document is subject to BCP 78 and the IETF Trust's Legal
> >
> >    Provisions Relating to IETF Documents
> >
> >    (https://trustee.ietf.org/license-info) in effect on the date of
> >
> >    publication of this document.  Please review these documents
> >
> >    carefully, as they describe your rights and restrictions with respect
> >
> >    to this document.  Code Components extracted from this document must
> >
> >    include Simplified BSD License text as described in Section 4.e of
> >
> >    the Trust Legal Provisions and are provided without warranty as
> >
> >    described in the Simplified BSD License.
> >
> >
> >
> > Table of Contents
> >
> >
> >
> >    1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
> >
> >    2.  Glossary  . . . . . . . . . . . . . . . . . . . . . . . . . .   4
> >
> >    3.  Background  . . . . . . . . . . . . . . . . . . . . . . . . .   6
> >
> >      3.1.  Telemetry Data Coverage . . . . . . . . . . . . . . . . .   7
> >
> >      3.2.  Use Cases . . . . . . . . . . . . . . . . . . . . . . . .   7
> >
> >      3.3.  Challenges  . . . . . . . . . . . . . . . . . . . . . . .   9
> >
> >      3.4.  Network Telemetry . . . . . . . . . . . . . . . . . . . .  10
> >
> >    4.  The Necessity of a Network Telemetry Framework  . . . . . . .  12
> >
> >    5.  Network Telemetry Framework . . . . . . . . . . . . . . . . .  13
> >
> >      5.1.  Top Level Modules . . . . . . . . . . . . . . . . . . . .  14
> >
> >        5.1.1.  Management Plane Telemetry  . . . . . . . . . . . . .  17
> >
> >        5.1.2.  Control Plane Telemetry . . . . . . . . . . . . . . .  17
> >
> >        5.1.3.  Forwarding Plane Telemetry  . . . . . . . . . . . . .  18
> >
> >        5.1.4.  External Data Telemetry . . . . . . . . . . . . . . .  20
> >
> >      5.2.  Second Level Function Components  . . . . . . . . . . . .  21
> >
> >      5.3.  Data Acquisition Mechanism and Type Abstraction . . . . .  22
> >
> >      5.4.  Mapping Existing Mechanisms into the Framework  . . . . .  24
> >
> >    6.  Evolution of Network Telemetry Applications . . . . . . . . .  25
> >
> >    7.  Security Considerations . . . . . . . . . . . . . . . . . . .  26
> >
> >    8.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  27
> >
> >    9.  Contributors  . . . . . . . . . . . . . . . . . . . . . . . .  27
> >
> >    10. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . .  28
> >
> >    11. Informative References  . . . . . . . . . . . . . . . . . . .  28
> >
> >    Appendix A.  A Survey on Existing Network Telemetry Techniques  .  32
> >
> >      A.1.  Management Plane Telemetry  . . . . . . . . . . . . . . .  32
> >
> >        A.1.1.  Push Extensions for NETCONF . . . . . . . . . . . . .  32
> >
> >        A.1.2.  gRPC Network Management Interface . . . . . . . . . .  32
> >
> >      A.2.  Control Plane Telemetry . . . . . . . . . . . . . . . . .  33
> >
> >        A.2.1.  BGP Monitoring Protocol . . . . . . . . . . . . . . .  33
> >
> >      A.3.  Data Plane Telemetry  . . . . . . . . . . . . . . . . . .  33
> >
> >        A.3.1.  The Alternate Marking (AM) technology . . . . . . . .  33
> >
> >        A.3.2.  Dynamic Network Probe . . . . . . . . . . . . . . . .  34
> >
> >        A.3.3.  IP Flow Information Export (IPFIX) protocol . . . . .  35
> >
> >        A.3.4.  In-Situ OAM . . . . . . . . . . . . . . . . . . . . .  35
> >
> >        A.3.5.  Postcard Based Telemetry  . . . . . . . . . . . . . .  35
> >
> >      A.4.  External Data and Event Telemetry . . . . . . . . . . . .  35
> >
> >        A.4.1.  Sources of External Events  . . . . . . . . . . . . .  36
> >
> >        A.4.2.  Connectors and Interfaces . . . . . . . . . . . . . .  37
> >
> >    Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  37
> >
> >
> >
> > 1.  Introduction
> >
> >
> >
> >    Network visibility is the ability of management tools to see the
> >
> >    state and behavior of a network, which is essential for successful
> >
> >    network operation.  Network Telemetry revolves around network data
> >
> >    that can help provide insights about the current state of the
> >
> >    network, including network devices, forwarding, control, and
> >
> >    management planes, and that can be generated and obtained through a
> >
> >    variety of techniques, including but not limited to network
> >
> >    instrumentation and measurements, and that can be processed for
> >
> >    purposes ranging from service assurance to network security using a
> >
> >    wide variety of techniques including machine learning, data analysis,
> >
> >    and correlation.  In this document, Network Telemetry refer to both
> >
> >    the data itself (i.e., "Network Telemetry Data"), and the techniques
> >
> >    and processes used to generate, export, collect, and consume that
> >
> >    data for use by potentially automated management applications.
> >
> >    Network telemetry extends beyond the conventional network Operations,
> >
> >    Administration, and Management (OAM) techniques and expects to
> >
> >    support better flexibility, scalability, accuracy, coverage, and
> >
> >    performance.
> >
> >
> >
> > RW: I suggest 'historical' rather than 'conventional'
> >
> >
> >
> >
> >
> >    However, the term of network telemetry lacks a solid and unambiguous
> >
> >    definition.  The scope and coverage of it cause confusion and
> >
> >    misunderstandings.  It is beneficial to clarify the concept and
> >
> >    provide a clear architectural framework for network telemetry, so we
> >
> >    can articulate the technical field, and better align the related
> >
> >    techniques and standard works.
> >
> >
> >
> > RW: Rather than term of, perhaps 'the term "network telemetry" lacks an
> >
> >     unambiguous definition'.
> >
> >
> >
> >
> >
> >    To fulfill such an undertaking, we first discuss some key
> >
> >    characteristics of network telemetry which set a clear distinction
> >
> >    from the conventional network OAM and show that some conventional OAM
> >
> >    technologies can be considered a subset of the network telemetry
> >
> >    technologies.  We then provide an architectural framework for network
> >
> >    telemetry which includes four modules, each concerned with a
> >
> >    different category of telemetry data and corresponding procedures.
> >
> >    All the modules are internally structured in the same way, including
> >
> >    components that allow data sources to be configured with regard to what
> >
> >    data to generate and how to make that available to client
> >
> >    applications, components that instrument the underlying data sources,
> >
> >    and components that perform the actual rendering, encoding, and
> >
> >    exporting of the generated data.  We show how the network telemetry
> >
> >    framework can benefit the current and future network operations.
> >
> >    Based on the distinction of modules and function components, we can
> >
> >    map the existing and emerging techniques and protocols into the
> >
> >    framework.  The framework can also simplify the tasks for designing,
> >
> >    maintaining, and understanding a network telemetry system.  Finally,
> >
> >    we outline the evolution stages of the network telemetry system and
> >
> >    discuss the potential security concerns.
> >
> >
> >
> >    The purpose of the framework and taxonomy is to set a common ground
> >
> >    for the collection of related work and provide guidance for future
> >
> >    technique and standard developments.  To the best of our knowledge,
> >
> >    this document is the first such effort for network telemetry in
> >
> >    industry standards organizations.
> >
> >
> >
> >
> >
> > 2.  Glossary
> >
> >
> >
> >    Before further discussion, we list some key terminology and acronyms
> >
> >    used in this documents.  We make an intended differentiation between
> >
> >    the terms of network telemetry and OAM.  However, it should be
> >
> >    understood that there is not a hard-line distinction between the two
> >
> >    concepts.  Rather, network telemetry is considered as the extension
> >
> >    of OAM.  It covers all the existing OAM protocols but puts more
> >
> >    emphasis on the newer and emerging techniques and protocols
> >
> >    concerning all aspects of network data from acquisition to
> >
> >    consumption.
> >
> >
> >
> >
> >
> > RW:
> >
> > Nit: "this documents." -> "this document."
> >
> > Nit: "as an extension" rather than "as the extension".
> >
> >
> >
> >    AI:  Artificial Intelligence.  In the network domain, AI refers to the
> >
> >       machine-learning based technologies for automated network
> >
> >       operation and other tasks.
> >
> >
> >
> >    AM:  Alternate Marking, a flow performance measurement method,
> >
> >       specified in [RFC8321].
> >
> >
> >
> >    BMP:  BGP Monitoring Protocol, specified in [RFC7854].
> >
> >
> >
> >    DNP:  Dynamic Network Probe, referring to programmable in-network
> >
> >       sensors for network monitoring and measurement.
> >
> >
> >
> >    DPI:  Deep Packet Inspection, referring to the techniques that
> >
> >       examine packets beyond their L3/L4 headers.
> >
> >
> >
> >    gNMI:  gRPC Network Management Interface, a network management
> >
> >       protocol from OpenConfig Operator Working Group, mainly
> >
> >       contributed by Google.  See [gnmi] for details.
> >
> >
> >
> >    gRPC:  gRPC Remote Procedure Call, an open source high performance RPC
> >
> >       framework that gNMI is based on.  See [grpc] for details.
> >
> >
> >
> >    IPFIX:  IP Flow Information Export Protocol, specified in [RFC7011].
> >
> >
> >
> >    IOAM:  In-situ OAM, a dataplane on-path telemetry technique.
> >
> >
> >
> >    NETCONF:  Network Configuration Protocol, specified in [RFC6241].
> >
> >
> >
> >    NetFlow:  A Cisco protocol for flow record collecting, described in
> >
> >       [RFC3954].
> >
> >
> >
> >    Network Telemetry:  The process and instrumentation for acquiring and
> >
> >       utilizing network data remotely for network monitoring and
> >
> >       operation.  A general term for a large set of network visibility
> >
> >       techniques and protocols, concerning aspects like data generation,
> >
> >       collection, correlation, and consumption.  Network telemetry
> >
> >       addresses the current network operation issues and enables smooth
> >
> >       evolution toward future intent-driven autonomous networks.
> >
> >
> >
> >    NMS:  Network Management System, referring to applications that allow
> >
> >       network administrators manage a network.
> >
> >
> >
> > RW: referring to => refers to applications that allow network administrators to
> > manage a network.
> >
> >
> >
> >
> >
> >
> >
> >    OAM:  Operations, Administration, and Maintenance.  A group of
> >
> >       network management functions that provide network fault
> >
> >       indication, fault localization, performance information, and data
> >
> >       and diagnosis functions.  Most conventional network monitoring
> >
> >       techniques and protocols belong to network OAM.
> >
> >
> >
> >    PBT:  Postcard-Based Telemetry, a dataplane on-path telemetry
> >
> >       technique.
> >
> >
> >
> >    SMIv2:  Structure of Management Information Version 2, specified in
> >
> >       [RFC2578].
> >
> >
> >
> > RW:
> >
> > Is SMIv2 a better reference than MIBs, which readers are more likely to be
> > familiar with?
> >
> >
> >
> >
> >
> >    SNMP:  Simple Network Management Protocol.  Versions 1 and 2 are
> >
> >       specified in [RFC1157] and [RFC3416], respectively.
> >
> >
> >
> >    YANG:  The abbreviation of "Yet Another Next Generation".  YANG is a
> >
> >       data modeling language for the definition of data sent over
> >
> >
> >
> > RW:
> >
> > Nit: Please drop the first sentence, and add a reference to RFC 7950.
> >
> >
> >
> >
> >
> >       network management protocols such as NETCONF and RESTCONF.
> >
> >       YANG is defined in [RFC6020].
> >
> >
> >
> >    YANG ECA:  A YANG model for Event-Condition-Action policies, defined
> >
> >       in [I-D.wwx-netmod-event-yang].
> >
> >
> >
> >    YANG PUSH:  A method to subscribe pushed data from remote YANG
> >
> >       datastore on network devices.  Details are specified in [RFC8641]
> >
> >       and [RFC8639].
> >
> >
> >
> > RW:
> >
> > Perhaps borrow from the abstract in RFC 8641.
> >
> >   "A mechanism that allows subscriber applications to request a
> >
> >    stream of updates from a YANG datastore on a network device".  Details are ...
> >
> >
> >
> >
> >
> > 3.  Background
> >
> >
> >
> >    The term "big data" is used to describe the extremely large volume of
> >
> >    data sets that can be analyzed computationally to reveal patterns,
> >
> >    trends, and associations.  Networks are undoubtedly a source of big
> >
> >    data because of their scale and the volume of network traffic they
> >
> >    forward.  It is easy to see that network operations can benefit from
> >
> >    network big data.
> >
> >
> >
> > RW:
> >
> > Also need to consider privacy.
> >
> >
> >
> > I think that we need to be careful not to imply that the intention here is to
> > read/snoop on the data being carried over the network rather than gather
> > insights into flows.
> >
> >
> >
> >
> >
> >
> >
> >    Today one can access advanced big data analytics capability through a
> >
> >    plethora of commercial and open source platforms (e.g., Apache
> >
> >    Hadoop), tools (e.g., Apache Spark), and techniques (e.g., machine
> >
> >    learning).  Thanks to the advance of computing and storage
> >
> >    technologies, network big data analytics gives network operators an
> >
> >    opportunity to gain network insights and move towards network
> >
> >    autonomy.  Some operators have started to explore the application of
> >
> >    Artificial Intelligence (AI) to make sense of network data.  Software
> >
> >    tools can use the network data to detect and react to network faults,
> >
> >    anomalies, and policy violations, as well as to predict future
> >
> >    events.  In turn, network policy updates for planning, intrusion
> >
> >    prevention, optimization, and self-healing may be applied.
> >
> >
> >
> >    It is conceivable that an autonomic network [RFC7575] is the logical
> >
> >    next step for network evolution following Software Defined Network
> >
> >    (SDN), aiming to reduce (or even eliminate) human labor, make more
> >
> >    efficient use of network resources, and provide better services more
> >
> >    aligned with customer requirements.  Intent-based Networking (IBN)
> >
> >    [I-D.irtf-nmrg-ibn-concepts-definitions] requires network visibility
> >
> >    and telemetry data in order to ensure that the network is behaving as
> >
> >    intended.  Although it takes time to reach the ultimate goal, the
> >
> >    journey has started nevertheless.
> >
> > RW:
> >
> > It would be helpful for the text to link autonomic networking and Intent-based
> > networking, perhaps:
> >
> > The related technique of Intent-based Networking [...] requires ...
> >
> >
> >
> > RW:
> >
> > Not sure that the last sentence of the paragraph is required.
> >
> >
> >
> >
> >
> >    However, while data processing capability has improved and
> >
> >    applications are hungry for more data, the networks lag behind in
> >
> >    extracting and translating network data into useful and actionable
> >
> >    information in efficient ways.  The system bottleneck is shifting
> >
> >    from data consumption to data supply.  Both the number of network
> >
> >    nodes and the traffic bandwidth keep increasing at a fast pace.  The
> >
> >    network configuration and policy change at smaller time slots than
> >
> >    before.  More subtle events and fine-grained data through all network
> >
> >    planes need to be captured and exported in real time.  In a nutshell,
> >
> >    it is a challenge to get enough high-quality data out of the network
> >
> >    in a manner that is efficient, timely, and flexible.  Therefore, we
> >
> >    need to survey the existing technologies and protocols and identify
> >
> >    any potential gaps.
> >
> >
> >
> >    In the remainder of this section, first we clarify the scope of
> >
> >    network data (i.e., telemetry data) concerned in the context.  Then,
> >
> >    we discuss several key use cases for today's and future network
> >
> >    operations.  Next, we show why the current network OAM techniques and
> >
> >    protocols are insufficient for these use cases.  The discussion
> >
> >    underlines the need for new methods, techniques, and protocols which
> >
> >    we assign under the umbrella term - Network Telemetry.
> >
> >
> >
> > RW:
> >
> > We should also include the possibility of extending existing protocols,
> > methods, and techniques.
> >
> >
> >
> >
> >
> > 3.1.  Telemetry Data Coverage
> >
> >
> >
> >    Any information that can be extracted from networks (including data
> >
> >    plane, control plane, and management plane) and used to gain
> >
> >    visibility or as a basis for actions is considered telemetry data.  It
> >
> >    includes statistics, event records and logs, snapshots of state,
> >
> >    configuration data, etc.  It also covers the outputs of any active
> >
> >    and passive measurements [RFC7799].  Specially, raw data can be
> >
> >    processed in-network before being sent to a data consumer.  Such
> >
> >    processed data is also considered telemetry data.  A classification
> >
> >    of telemetry data is provided in Section 5.
> >
> >
> >
> > RW:
> >
> > Specially - I would expand this.  Perhaps: "In some cases, raw data is
> > processed before being sent .."
> >
> > We should also discuss the quality of data, i.e., less, higher-quality data
> > may be better than lots of low-quality data.
> >
> >
> >
> >
> >
> > 3.2.  Use Cases
> >
> >
> >
> >    The following set of use cases is essential for network operations.
> >
> >    While the list is by no means exhaustive, it is enough to highlight
> >
> >    the requirements for data velocity, variety, volume, and veracity in
> >
> >    networks.
> >
> >
> >
> >    o  Security: Network intrusion detection and prevention systems need
> >
> >       to monitor network traffic and activities and act upon anomalies.
> >
> >       Given increasingly sophisticated attack vectors coupled with
> >
> >       increasingly severe consequences of security breaches, new tools
> >
> >       and techniques need to be developed, relying on wider and deeper
> >
> >       visibility into networks.
> >
> >
> >
> > RW:
> >
> > I agree with this, but it might be good to emphasize that the goal is
> >
> > to get to a place where this can be done without any, or only minimal,
> >
> > human intervention.
> >
> >
> >
> >
> >
> >    o  Policy and Intent Compliance: Network policies are the rules that
> >
> >       constraint the services for network access, provide service
> >
> >       differentiation, or enforce specific treatment on the traffic.
> >
> >       For example, a service function chain is a policy that requires
> >
> >       the selected flows to pass through a set of ordered network
> >
> >       functions.  Intent, as defined in
> >
> >
> >
> > RW:
> >
> > constraint => constrain
> >
> >
> >
> >
> >
> >       [I-D.irtf-nmrg-ibn-concepts-definitions], is a set of operational
> >
> >       goals that a network should meet and outcomes that a network is
> >
> >       supposed to deliver, defined in a declarative manner without
> >
> >       specifying how to achieve or implement them.  An intent requires a
> >
> >       complex translation and mapping process before being applied on
> >
> >       networks.  While a policy or an intent is enforced, the compliance
> >
> >       needs to be verified and monitored continuously, relying on
> >
> >       visibility that is provided through network telemetry data, and
> >
> >       any violation needs to be reported immediately.
> >
> >
> >
> > RW:
> >
> > Does it not also rely on visibility of the network to potentially modify
> >
> > the mapping to ensure that the intent remains in force?
> >
> >
> >
> >    o  SLA Compliance: A Service-Level Agreement (SLA) defines the level
> >
> >       of service a user expects from a network operator, which includes
> >
> >       the metrics for the service measurement and remedy/penalty
> >
> >       procedures when the service level misses the agreement.  Users
> >
> >       need to check if they get the service as promised and network
> >
> >       operators need to evaluate how they can deliver the services that
> >
> >       can meet the SLA based on realtime network telemetry data,
> >
> >       including data from network measurements.
> >
> >
> >
> >    o  Root Cause Analysis: Any network failure can be the effect of a
> >
> >       sequence of chained events.  Troubleshooting and recovery require
> >
> >       quick identification of the root cause of any observable issues.
> >
> >       However, the root cause is not always straightforward to identify,
> >
> >       especially when the failure is sporadic and the number of event
> >
> >       messages, both related and unrelated to the same cause, is
> >
> >       overwhelming.  While machine learning technologies can be used for
> >
> >       root cause analysis, it is up to the network to sense and provide the
> >
> >       relevant data to feed into machine learning applications.
> >
> >
> >
> > RW:
> >
> > In these sorts of scenarios, I would expect additional detailed diagnostics
> > information to be requested from the device to figure out the root cause.  Or
> > specifically, I think that this would contain data that wouldn't normally be
> > exported via telemetry.
> >
> >
> >
> >
> >
> >    o  Network Optimization: This covers all short-term and long-term
> >
> >       network optimization techniques, including load balancing, Traffic
> >
> >       Engineering (TE), and network planning.  Network operators are
> >
> >       motivated to optimize their network utilization and differentiate
> >
> >       services for better Return On Investment (ROI) or lower Capital
> >
> >       Expenditures (CAPEX).  The first step is to know the real-time
> >
> >       network conditions before applying policies for traffic
> >
> >       manipulation.  In some cases, micro-bursts need to be detected in
> >
> >       a very short time-frame so that fine-grained traffic control can
> >
> >       be applied to avoid network congestion.  Long-term planning of
> >
> >       network capacity and topology requires analysis of real-world
> >
> >       network telemetry data that is obtained over long periods of time.
> >
> >
> >
> >    o  Event Tracking and Prediction: The visibility into traffic path
> >
> >       and performance is critical for services and applications that
> >
> >       rely on healthy network operation.  Numerous related network
> >
> >       events are of interest to network operators.  For example, network
> >
> >       operators want to learn where and why packets are dropped for an
> >
> >       application flow.  They also want to be warned of issues in
> >
> >       advance so proactive actions can be taken to avoid catastrophic
> >
> >       consequences.
> >
> >
> >
> > 3.3.  Challenges
> >
> >
> >
> >    For a long time, network operators have relied upon SNMP [RFC3416],
> >
> >    Command-Line Interface (CLI), or Syslog to monitor the network.  Some
> >
> >    other OAM techniques as described in [RFC7276] are also used to
> >
> >    facilitate network troubleshooting.  These conventional techniques
> >
> >    are not sufficient to support the above use cases for the following
> >
> >    reasons:
> >
> >
> >
> >    o  Most use cases need to continuously monitor the network and
> >
> >       dynamically refine the data collection in real-time.  The poll-
> >
> >       based low-frequency data collection is ill-suited for these
> >
> >       applications.  Subscription-based streaming data directly pushed
> >
> >       from the data source (e.g., the forwarding chip) is preferred to
> >
> >       provide enough data quantity and precision at scale.
> >
> >
> >
> >    o  Comprehensive data is needed from packet processing engine to
> >
> >       traffic manager, from line cards to main control board, from user
> >
> >       flows to control protocol packets, from device configurations to
> >
> >       operations, and from physical layer to application layer.
> >
> >       Conventional OAM only covers a narrow range of data (e.g., SNMP
> >
> >       only handles data from the Management Information Base (MIB)).
> >
> >       Traditional network devices cannot provide all the necessary
> >
> >       probes.  More open and programmable network devices are therefore
> >
> >       needed.
> >
> >
> >
> >    o  Many application scenarios need to correlate network-wide data
> >
> >       from multiple sources (i.e., from distributed network devices,
> >
> >       different components of a network device, or different network
> >
> >       planes).  A piecemeal solution is often lacking the capability to
> >
> >       consolidate the data from multiple sources.  The composition of a
> >
> >       complete solution, as partly proposed by Autonomic Resource
> >
> >       Control Architecture (ARCA)
> >
> >       [I-D.pedro-nmrg-anticipated-adaptation], will be empowered and
> >
> >       guided by a comprehensive framework.
> >
> >
> >
> >    o  Some of the conventional OAM techniques (e.g., CLI and Syslog)
> >
> >       lack a formal data model.  The unstructured data hinder the tool
> >
> >       automation and application extensibility.  Standardized data
> >
> >       models are essential to support the programmable networks.
> >
> >
> >
> >    o  Although some conventional OAM techniques support data push (e.g.,
> >
> >       SNMP Trap [RFC2981][RFC3877], Syslog, and sFlow), the pushed data
> >
> >       are limited to only predefined management plane warnings (e.g.,
> >
> >       SNMP Trap) or sampled user packets (e.g., sFlow).  Network
> >
> >       operators require the data with arbitrary source, granularity, and
> >
> >       precision which are beyond the capability of the existing
> >
> >       techniques.
> >
> >
> >
> >    o  The conventional passive measurement techniques can either consume
> >
> >       excessive network resources and render excessive redundant data,
> >
> >       or lead to inaccurate results; on the other hand, the conventional
> >
> >       active measurement techniques can interfere with the user traffic
> >
> >       and their results are indirect.  Techniques that can collect
> >
> >       direct and on-demand data from user traffic are more favorable.
> >
> >
> >
> >    These challenges were addressed by newer standards and techniques
> >
> >    (e.g., IPFIX/NetFlow, PSAMP, IOAM, and YANG-Push) and more are
> >
> >    emerging.  These standards and techniques need to be recognized and
> >
> >    accommodated in a new framework.
> >
> >
> >
> > 3.4.  Network Telemetry
> >
> >
> >
> >    Network telemetry has emerged as a mainstream technical term to refer
> >
> >    to the network data collection and consumption techniques.  Several
> >
> >    network telemetry techniques and protocols (e.g., IPFIX [RFC7011] and
> >
> >    gRPC [grpc]) have been widely deployed.  Network telemetry allows
> >
> >    separate entities to acquire data from network devices so that data
> >
> >    can be visualized and analyzed to support network monitoring and
> >
> >    operation.  Network telemetry covers the conventional network OAM and
> >
> >    has a wider scope.  It is expected that network telemetry can provide
> >
> >    the necessary network insight for autonomous networks and address the
> >
> >    shortcomings of conventional OAM techniques.
> >
> >
> >
> >    Network telemetry usually assumes machines as data consumers rather
> >
> >    than human operators.  Hence, the network telemetry can directly
> >
> >    trigger the automated network operation, while in contrast some
> >
> >    conventional OAM tools are designed and used to help human operators
> >
> >    to monitor and diagnose the networks and guide manual network
> >
> >    operations.  Such a proposition leads to very different techniques.
> >
> >
> >
> >    Although new network telemetry techniques are emerging and subject to
> >
> >    continuous evolution, several characteristics of network telemetry
> >
> >    have been well accepted.  Note that network telemetry is intended to
> >
> >    be an umbrella term covering a wide spectrum of techniques, so the
> >
> >    following characteristics are not expected to be held by every
> >
> >    specific technique.
> >
> >
> >
> >    o  Push and Streaming: Instead of polling data from network devices,
> >
> >       telemetry collectors subscribe to streaming data pushed from data
> >
> >       sources in network devices.
> >
> >
> >
> >    o  Volume and Velocity: The telemetry data is intended to be consumed
> >
> >       by machines rather than by human beings.  Therefore, the data
> >
> >       volume can be huge and the processing is optimized for the needs
> >
> >       of automation in real time.
> >
> >
> >
> >    o  Normalization and Unification: Telemetry aims to address the
> >
> >       overall network automation needs.  Efforts are made to normalize
> >
> >       the data representation and unify the protocols, so as to simplify
> >
> >       data analysis and provide integrated analysis across heterogeneous
> >
> >       devices and data sources across a network.
> >
> >
> >
> >    o  Model-based: The telemetry data is modeled in advance which allows
> >
> >       applications to configure and consume data with ease.
> >
> >
> >
> >    o  Data Fusion: The data for a single application can come from
> >
> >       multiple data sources (e.g., cross-domain, cross-device, and
> >
> >       cross-layer) and needs to be correlated to take effect.
> >
> >
> >
> >    o  Dynamic and Interactive: Since network telemetry is meant to be
> >
> >       used in a closed control loop for network automation, it needs to
> >
> >       run continuously and adapt to the dynamic and interactive queries
> >
> >       from the network operation controller.
> >
> >
> >
> >    In addition, an ideal network telemetry solution may also have the
> >
> >    following features or properties:
> >
> >
> >
> >    o  In-Network Customization: The data that is generated can be
> >
> >       customized in network at run-time to cater to the specific need of
> >
> >       applications.  This needs the support of a programmable data plane
> >
> >       which allows probes with custom functions to be deployed at
> >
> >       flexible locations.
> >
> >
> >
> >    o  In-Network Data Aggregation and Correlation: Network devices and
> >
> >       aggregation points can work out which events and what data needs
> >
> >       to be stored, reported, or discarded thus reducing the load on the
> >
> >       central collection and processing points while still ensuring that
> >
> >       the right information is ready to be processed in a timely way.
> >
> >
> >
> >    o  In-Network Processing: Sometimes it is not necessary or feasible
> >
> >       to gather all information to a central point to be processed and
> >
> >       acted upon.  It is possible for the data processing to be done in
> >
> >       network, allowing reactive actions to be taken locally.
> >
> >
> >
> >    o  Direct Data Plane Export: The data originating from the data plane
> >
> >       forwarding chips can be directly exported to the data consumer for
> >
> >       efficiency, especially when the data bandwidth is large and
> >
> >       real-time processing is required.
> >
> >
> >
> >    o  In-band Data Collection: In addition to the passive and active
> >
> >       data collection approaches, the new hybrid approach allows data
> >
> >       to be collected directly for any target flow on its entire forwarding
> >
> >       path [I-D.song-opsawg-ifit-framework].
> >
> >
> >
> >    It is worth noting that a network telemetry system should not be
> >
> >    intrusive to normal network operations by avoiding the pitfall of the
> >
> >    "observer effect".  That is, it should not change the network
> >
> >    behavior and affect the forwarding performance.  Otherwise, the whole
> >
> >    purpose of network telemetry is compromised.
> >
> >
> >
> >    Although in many cases a system for network telemetry involves a
> >
> >    remote data collecting and consuming entity, it is important to
> >
> >    understand that there are no inherent assumptions about how a system
> >
> >    should be architected.  Telemetry data producers and consumers can
> >
> >    work in distributed or peer-to-peer fashions rather than assuming a
> >
> >    centralized data consuming entity.  In such cases, a network node can
> >
> >    be the direct consumer of telemetry data from other nodes.
> >
> >
> >
> > 4.  The Necessity of a Network Telemetry Framework
> >
> >
> >
> > RW: I think that the structure of the document might be better if this was a
> > section 3.5 of the background rather than its own top-level section?
> >
> >
> >
> >    Network data analytics and machine-learning technologies are applied
> >
> >    for network operation automation, relying on abundant and coherent
> >
> >    data from networks.  Data acquisition that is limited to a single
> >
> >    source and static in nature will in many cases not be sufficient to
> >
> >    meet an application's telemetry data needs.  As a result, multiple
> >
> >    data sources, involving a variety of techniques and standards, will
> >
> >    need to be integrated.  It is desirable to have a framework that
> >
> >    classifies and organizes different telemetry data sources and types,
> >
> >    defines different components of a network telemetry system and their
> >
> >    interactions, and helps coordinate and integrate multiple telemetry
> >
> >    approaches across layers.  This allows flexible combinations of data
> >
> >    for different applications, while normalizing and simplifying
> >
> >    interfaces.  In detail, such a framework would benefit application
> >
> >    development for the following reasons:
> >
> >
> >
> >    o  Future networks, autonomous or otherwise, depend on holistic and
> >
> >       comprehensive network visibility.  All the use cases and
> >
> >       applications are better to be supported uniformly and coherently
> >
> >       under a single intelligent agent using an integrated, converged
> >
> >       mechanism and common telemetry data representations wherever
> >
> >       feasible.  Therefore, the protocols and mechanisms should be
> >
> >       consolidated into a minimum yet comprehensive set.  A telemetry
> >
> >       framework can help to normalize the technique developments.
> >
> >
> >
> >    o  Network visibility presents multiple viewpoints.  For example, the
> >
> >       device viewpoint takes the network infrastructure as the
> >
> >       monitoring object from which the network topology and device
> >
> >       status can be acquired; the traffic viewpoint takes the flows or
> >
> >       packets as the monitoring object from which the traffic quality
> >
> >       and path can be acquired.  An application may need to switch its
> >
> >       viewpoint during operation.  It may also need to correlate a
> >
> >       service and its impact on user experience to acquire the
> >
> >       comprehensive information.
> >
> >
> >
> >    o  Applications require network telemetry to be elastic in order to
> >
> >       make efficient use of network resources and reduce the impact of
> >
> >       processing related to network telemetry on network performance.
> >
> >       For example, routine network monitoring should cover the entire
> >
> >       network with a low data sampling rate.  Only when issues arise or
> >
> >       critical trends emerge should telemetry data sources be modified
> >
> >       and telemetry data rates boosted as needed.
> >
> >
> >
> >    o  Efficient data fusion is critical for applications to reduce the
> >
> >       overall quantity of data and improve the accuracy of analysis.
> >
> >
> >
> >    A telemetry framework collects together all of the telemetry-related
> >
> >    works from different sources and working groups within IETF.  This
> >
> >    makes it possible to assemble a comprehensive network telemetry
> >
> >    system and to avoid repetitious or redundant work.  The framework
> >
> >    should cover the concepts and components from the standardization
> >
> >    perspective.  This document describes the modules which make up a
> >
> >    network telemetry framework and decomposes the telemetry system into
> >
> >    a set of distinct components that existing and future work can easily
> >
> >    map to.
> >
> >
> >
> > 5.  Network Telemetry Framework
> >
> >
> >
> >    The top level network telemetry framework partitions the network
> >
> >    telemetry into four modules based on the telemetry data object source
> >
> >    and represents their relationship.  At the next level, the framework
> >
> >    decomposes each module into separate components.  Each of the modules
> >
> >    follows the same underlying structure, with one component dedicated
> >
> >    to the configuration of data subscriptions and data sources, a second
> >
> >    component dedicated to encoding and exporting data, and a third
> >
> >    component instrumenting the generation of telemetry related to the
> >
> >    underlying resources.  Throughout the framework, the same set of
> >
> >    abstract data acquiring mechanisms and data types are applied.  The
> >
> >    two-level architecture with the uniform data abstraction helps
> >
> >    accurately pinpoint a protocol or technique to its position in a
> >
> >    network telemetry system or disaggregate a network telemetry system
> >
> >    into manageable parts.
> >
> >
> >
> >
> >
> > RW: Relationship of telemetry data vs get requests.  I.e., isn't telemetry
> > just pushing rather than pulling data?
> >
> >
> >
> > 5.1.  Top Level Modules
> >
> >
> >
> >    Telemetry can be applied on the forwarding plane, the control plane,
> >
> >    and the management plane in a network, as well as other sources out
> >
> >    of the network, as shown in Figure 1.  Therefore, we categorize the
> >
> >    network telemetry into four distinct modules with each having its own
> >
> >    interface to Network Operation Applications.
> >
> >
> >
> >                    +------------------------------+
> >
> >                    |                              |
> >
> >                    |       Network Operation      |<-------+
> >
> >                    |          Applications        |        |
> >
> >                    |                              |        |
> >
> >                    +------------------------------+        |
> >
> >                         ^      ^           ^               |
> >
> >                         |      |           |               |
> >
> >                         V      |           V               V
> >
> >                    +-----------|---+--------------+  +-----------+
> >
> >                    |           |   |              |  |           |
> >
> >                    | Control Pl|ane|              |  | External  |
> >
> >                    | Telemetry | <--->            |  | Data and  |
> >
> >                    |           |   |              |  | Event     |
> >
> >                    |      ^    V   |  Management  |  | Telemetry |
> >
> >                    +------|--------+  Plane       |  |           |
> >
> >                    |      V        |  Telemetry   |  +-----------+
> >
> >                    | Forwarding    |              |
> >
> >                    | Plane       <--->            |
> >
> >                    | Telemetry     |              |
> >
> >                    |               |              |
> >
> >                    +---------------+--------------+
> >
> >
> >
> >                 Figure 1: Modules in Layer Category of NTF
> >
> >
> >
> > RW:
> >
> > In this diagram, for me at least, I think that it would be more natural to have
> > Management Plane on the left, and Control/ Forwarding Plane on the right.
> >
> >
> >
> >    The rationale of this partition lies in the different telemetry data
> >
> >    objects which result in different data source and export locations.
> >
> >    Such differences have profound implications on in-network data
> >
> >    programming and processing capability, data encoding and transport
> >
> >    protocol, and required data bandwidth and latency.
> >
> >
> >
> > RW:
> >
> > Data can be sent directly, or proxied via the control and management planes.

