Trimming things down where things are clear, not fixing the quoting problems...
> * Abstract
>
> - I think we should avoid referring to some <get> operation. Here is a
>   proposal of a rewrite:
>
>   OLD
>
>      running server available. This document specifies a standard file
>      format for YANG instance data (which follows the syntax and semantic
>      from existing YANG models, re-using the same format as the reply to a
>      <get> operation/request) and annotates it with metadata.
>
>   NEW
>
>      running server available. This document specifies a standard file
>      format for YANG instance data, which follows the syntax and semantic
>      of existing YANG models, and annotates it with metadata.
>
> BALAZS: Others have expressly asked for a reference to "get", but if you
> want I can remove it.

Thanks. So also further below...

> - I fail to see the difference between 'content-schema' and 'content
>   defining YANG module(s)'. The 'content-schema' is already a set of
>   YANG modules. I suggest to remove 'content defining YANG module(s)'
>   as it is not a necessary term. Rewrite all places where the phrase
>   'content defining YANG modules' is used.
>
> BALAZS: A schema is the full set of YANG modules needed to define the
> structure and properties of the instance data (plus features and
> deviations). A "content defining YANG module" is an individual YANG
> module that is part of the content-schema. So the difference is a set
> versus one item. I updated the description to emphasize this difference.

OK. But then what is a non-content defining YANG module? Or are these
schema-defining YANG modules? I still do not get why we need 'content
defining YANG modules' - we did not need that in other specifications so
far that define schemas. So why do we need new terms here?

> - Is it necessary to describe P2 in terms of (presumably) NETCONF
>   operations? I would prefer to have the document written in a
>   protocol agnostic style. Perhaps simply drop "similar to the
>   response of a <get> operation/request".
>
> BALAZS: This is a reference both to NETCONF and RESTCONF.
> It was explicitly asked for by other reviewers.

Well, then the correct wording would be "similar to the response of a
NETCONF <get> operation or the RESTCONF response to a GET method
invocation on the (unified) datastore resource". This sounds complex, and
I still prefer the text to be agnostic to specific operations - in
particular since <get> and the unified datastore have their limitations.
The format simply reuses the already defined data model encoding formats,
i.e., the format has nothing to do with the operations used to retrieve
the data. So I suggest:

   P2  Instance data shall reuse existing encoding rules for YANG
       defined data.

There is no need to refer to specific protocol operations.

> - I do not understand that text about the default attribute. Section
>   4.8.9 defines a query parameter, not an attribute. And I do not
>   know how that fits into content data.
>
> BALAZS: https://tools.ietf.org/html/rfc8040#section-4.8.9:
>
>    "If the "with-defaults" parameter is set to "report-all-tagged", then
>    the server MUST adhere to the default-reporting behavior defined in
>    Section 3.4 of [RFC6243].  Metadata is reported by the server as
>    specified in Section 5.3.  The XML encoding for the "default"
>    attribute sent by the server for default nodes is defined in
>    Section 6 of [RFC6243].  The JSON encoding for the "default"
>    attribute MUST use the same values, as defined in [RFC6243], but
>    encoded according to the rules in [RFC7952].  The module name
>    "ietf-netconf-with-defaults" MUST be used for the "default"
>    attribute."
>
> Here the usage of the default ATTRIBUTE is defined.

I am still confused about the terminology here: an attribute is an XML
way of representing metadata; JSON does this differently. Perhaps some
good examples would clear the confusion.

> - Similarly, I do not understand why implementation specific
>   metadata may be included in the content-data. This seems to be the
>   wrong place, no? Should metadata not go into the header?
> BALAZS: As this might be metadata about the individual instance data
> nodes (e.g. metadata following the principles from RFC 7952), it
> belongs here.

OK, perhaps my confusion is that it was not clear to me what kind of
metadata is meant here...

> - Why MUST XML attributes be ignored? Why is there no text about
>   unknown JSON data, 'attributes' (or annotations)? What should
>   implementations generally do about unknown elements, attributes,
>   objects, arrays, ...? Why are we specific about only one specific
>   case?
>
> BALAZS: Generally we want to allow users/creators to decorate the data
> with additional information that is not standardized. Like YANG
> extensions, these may be useful, but at least should not cause
> problems. XML attributes are often used as metadata, and I was asked to
> list them specifically.
>
> It is not stated what an application should do with additional unknown
> data (XML elements, JSON data) that does not fit the above categories.
> Should we say something about it? IMHO no. We don't want to be too
> restrictive, as there are many potential users with different needs.
> We could state "Users of the instance data MAY discard any other
> unknown data". However, that does not mean much.

I do not understand why there are specific rules for XML encodings but
no equivalent JSON rules. It looks like either the XML rules are not
needed, or equivalent JSON rules are missing if the XML rules are
needed, or there should be an explanation why the different encodings
lead to different results (which is operationally rather surprising). If
we want rules that apply to all encodings, they should be expressed in
an encoding neutral way. The current text and your response leave me
puzzled about what the specification really wants to say here. And do we
have to say something at all?

> - It is unclear what "will be very similar" really means but perhaps
>   this is clarified later. If not, this sentence says nothing in
>   terms of a technical specification.
So what is the meaning of "will be very similar"?

> - The introduction contains several MAYs and MUSTs that are not
>   understandable yet, and they do not seem to belong into an
>   'Introduction' in the first place.
>
> BALAZS: Section 2 (Introduction) contains one 'may':
>
>    "further instance data formats may be specified"
>
> I was specifically asked to include this. Why is this not
> understandable? Where should this be if not in the introduction
> chapter? Section 2 does not contain the word 'must'. Maybe I am not
> understanding your comment.

Hm, I do not recall what made me write this, so let's ignore this
comment.

> - Why is EXTERNAL in all caps but Inline in capitalized form? In
>   the YANG definitions, EXTERNAL seems to be uri. I think we reduce
>   ambiguity by being consistent with how we name things.
>
> BALAZS: OK, EXTERNAL should not be all caps. Here external means that
> the content-schema is defined externally to the instance data set;
> not even a URI is included.

So if I have no case in the content-schema-spec choice, then it is
external, or how does this work? Perhaps define external differently?
Another attempt:

   External method: Do not include the content-schema; the user needs
   to obtain the information through external documents.

I removed "already known" since a tool and human producing an instance
file will in general have no clue what a user of that instance file
already knows.

> - 3.1.1 How are the details specified in the anydata? Perhaps a
>   forward reference might help. What are 'version labels'?
>
> BALAZS: Added reference to example. Version/revision labels are
> defined in draft-verdt-netmod-yang-module-versioning; added as a
> reference. I added them here (only as an example) as they are highly
> relevant to specifying module versions even if they are not agreed in
> Netmod yet. The name was changed from version-label to revision-label
> lately.

Let's use a single term then, say "revision labels" if that is the most
recent one.
Right now, both terms seem to be used.

> - I like to understand why we need several methods to specify the
>   schema. Having N solutions is always bad for interoperability and
>   also for maintainability. Perhaps the WG failed to reach consensus
>   on a single solution. Or there are strong technical reasons - but
>   then they should be clearly stated. What are implementations
>   expected to support, all methods? Or whatever the implementer
>   prefers? How do we achieve interoperability across tools?
>
> BALAZS: Different people in the WG wanted different solutions.
>
> - Some (as I remember you too) asked for a fully flexible solution
>   which can use multiple modules, potentially not even the
>   ietf-yang-library, to define the schema (inline solution).
> - Some asked for a simple solution listing the content schema
>   modules.
> - Some wanted just to use a reference (if any, this is the one I
>   would remove).
> - Some stated that they do not want to define the content-schema at
>   all because it is already known.
>
> So we ended up with 4 methods.

But reaching consensus by doing all four is not necessarily cheap. So
what are compliant tools required to implement? All 4 methods? Whatever
the implementer prefers? Or is there a mandatory to support method
(other than external ;-)? The WG needs to understand the costs of
having N ways to do the same thing.

> - In the second paragraph, I like to see some discussion of snapshot
>   consistency. How much consistency can be expected? Are there
>   indicators for the level of consistency? I would remove the
>   sentence about "valid values can be retrieved at run-time" as this
>   is obvious, but then I am not sure why 'valid' values? Perhaps the
>   authors meant 'current' values?
>
> BALAZS: OK, changed to current. I want to keep the second sentence as
> it describes the duality between the original documented values and
> the current values that can be read at run-time. Consistency is out
> of scope. No indicators are provided.
> It is very much use-case and implementation specific.

In this case, I think it helps to spell out that users cannot assume
that instance data always represents consistent snapshots.

> - How do I implement the "SHOULD be described"? The default is that
>   data can change; only in rare cases data is static. But how does a
>   tool creating instance data know 'when and how' data changes in
>   the future? I suggest to remove the SHOULD. The text saying that
>   instance data is a snapshot is in my view sufficient.
>
> BALAZS: We do not want to specify how the changes should be
> described, but we do want to state that this information should be
> made available. Just a few ideas how this could be done. Provide
>
> - some plain text in the description of the instance data set
> - some additional metadata, e.g. etags or timestamps for the
>   individual data nodes
> - a change indicator in the content defining YANG module itself

I do not know how I implement such a SHOULD. I admit that I do not
understand RFC 2119 language, but a lowercase should would make me feel
better. The concern here is that it is entirely implementation specific
how I make this information available, and hence whether I have
followed the SHOULD or not is rather unclear.

> - This section talks about YANG instance data but it likely should
>   talk about YANG instance data sets.
>
> BALAZS: I think both are acceptable terms here. Naturally, if the
> data changes, the data set containing it also changes.

Well, yes, but the text says "instance data set's description
statement", i.e., the change is documented on the set and not on an
individual instance. But yes, this is a minor nit.

> * Delivery of Instance Data
>
> - Why do we need this SHOULD? I do not think we should use RFC 2119
>   keywords to define how organizations may use the instance data
>   format. My proposal is to delete this entire section.
>
> BALAZS: I will change it to lower case may.
> I was asked to, and I want to state that we want to use instance data
> both for offline delivery of design time information and for run-time
> delivery of other data.

But should this not be stated in the use cases and principles list in
section 2? I think section 5 is a mixture of a use-case concern and a
requirement (oops, principle):

   PX  Instance data sets may be read from or produced by a live server
       [is YANG server the proper term?] or they can be the result of a
       specification or design effort that does not involve a live
       server.

I think the essence of section 5 should be integrated into section 2.
What it says seems misplaced in the middle of the document. (I
personally prefer to talk about objectives rather than principles, but
that may be just me.)

> (The first 3 users of this format all want to use this for early
> delivery of server capabilities. It is for now the dominant use case
> for which the 2119 SHOULD is important.)

I do not think this specification should define SHOULDs for specific
use cases. See my proposal for a possible PX to capture what I think is
the core idea.

> * Backwards Compatibility
>
> - I do not think 'managed entity' is a YANG term.
>
> BALAZS: What term do you propose for something that is managed, like
> an interface or user etc.? I was told 'managed entity' is a generic
> term that is commonly understood. Would "managed item" or "managed
> thing" be better?
>
> - I think this text is use case specific and the items are kind of
>   conflicting with each other (2nd says changing the semantics of a
>   list should lead to a change of the key while the 1st suggests
>   that changing keys may lead to misinterpretation of something
>   being new).
>
> - My proposal is to simply drop this entire section. If use case
>   specific text is needed, add it to the use cases in the appendix.
>
> BALAZS: You don't know how many trouble reports we got in multiple
> use-cases for violating these recommendations.
> While they may not be important for all use-cases, they are important
> for many. Actually we met the problem, or had to avoid it, in all but
> one of the listed use-cases.

This text seems specific to certain use cases or best practices, and as
such I suggest to integrate it into appendix C. I do not think this
advice needs to be part of the technical instance data format
specification. One could even argue that some of this also concerns
config changes to live servers. My issue is that I find this discussion
misplaced - I would like to see the format definition separated from
any guidelines how to use it.

> * YANG Model
>
> - How is the inline-content-schema feature used? Which component
>   does indicate that inline content-schema is supported? Do all
>   implementations have to support simplified-inline? If
>   inline-schema is used, how do I find out which schema formats are
>   supported? The more formats there are, the more interoperability
>   issues will arise.
>
> BALAZS:
> - case inline { is decorated with "if-feature inline-content-schema"

OK

> - feature support is generally indicated as part of the
>   ietf-yang-library

OK

> - simplified-inline is mandatory to support. It is relatively simple,
>   so IMHO not a problem

How do I know whether the feature inline-content-schema is supported in
this case? How much simplification is there really compared to the
inline method if I only list modules in the yang library schema without
deviations etc.? See my earlier point about which schema formats are
mandatory to implement and whether the simplification is worth the
extra code and possible interoperability issues.

> - What do you mean with schema-formats? The YANG schema is not
>   actually included anywhere. If the "inline" case is used, instance
>   data corresponding to the inline-modules is included, not the
>   schema.
> anydata inline-schema {
>   description
>     "Instance data corresponding to the YANG modules
>      specified in the inline-module nodes defining the set
>      of content defining YANG modules for this
>      instance-data-set.";

My understanding is that the inline-module indicates a variant of the
yang library used, and the inline-schema then follows that indicated
yang library variant and provides the schema. Am I entirely wrong here?

> * Security Considerations
>
> - "is designed as a wrapper" - what does this tell me? I suggest to
>   rewrite the first paragraph and to remove this phrase or to
>   explain what it means.
>
> - Why is the header part not security sensitive? Almost all data is
>   security sensitive in certain situations.
>
> BALAZS: IMHO it is a valid and meaningful statement to differentiate
> between security sensitive data like passwords and non-sensitive data
> like a revision date. RFC 8341 states:
>
>    "One of the most important aspects of the data model documentation,
>    and one of the biggest concerns during deployment, is the
>    identification of security-sensitive content."
>
> So the differentiation between sensitive and non-sensitive
> information is important. In your opinion, which part of the header
> data is sensitive?

In today's world, the idea of 'non-sensitive data' is rather difficult.
The name of an instance-data-set can reveal information, the timestamp
can be sensitive, the description can be sensitive, the combination can
be sensitive. I find the claim that instance data may be sensitive but
the header part is not sensitive a very big claim. But since this
section will also be read by SECDIR reviewers, we can see whether they
will raise a point. I certainly would if I were the assigned SECDIR
reviewer.

> - Since instance data files may require protection, is there any
>   recommendation how to do this, e.g., by wrapping everything into a
>   cryptographic message syntax or so?
> It would be important in certain use cases to be able to verify that
> instance data is authentic (i.e., it is signed by the original
> source). In other cases, it may be crucial to protect the instance
> data itself against occasional readers.
>
> BALAZS: File security is an important but really big topic, and I was
> instructed by multiple people to avoid a half-baked discussion on the
> topic.

OK

> - It may be useful to explain that data in instance data sets may
>   have been filtered by access control rules like NACM and that data
>   in instance data sets itself won't be filtered anymore by access
>   control rules like NACM. In other words, if I take snapshots and
>   store them as instance data files, these snapshots may leak
>   information that is otherwise protected. Hence it is important
>   that NACM rules and file access control rules are consistent.
>
> BALAZS: We do not know if the instance data set was originally
> filtered by NACM or not. We don't know if the users on
> NETCONF/RESTCONF/CLI are the same as the users defined in the file
> system, so I fear defining what consistent means would be impossible.
> It is stated that "The same kind of handling should be applied, that
> would be needed for the result of a <get> operation returning the
> same data." IMHO we can't really say more.

Yes, I guess what I was trying to say is that a live server may apply
certain access control rules while instance files may not apply the
same rules. In other words, an instance file obtained from a live
server by joe and passed on to lucy may reveal information that lucy
will not be allowed to see on the live server. By passing around
instance files, information may accidentally leak. Yes, we can't solve
this, but we can point this out.
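To follow up on my remark above that some good examples would clear the
confusion about the "default" attribute, here is roughly what I have in
mind. This is only a sketch; the interface leaf and its value are
invented, and the encodings follow my reading of RFC 6243 Section 6 and
RFC 7952. First the XML encoding, where the "default" attribute from
the with-defaults namespace tags a leaf that carries its default value:

```xml
<!-- sketch: "mtu" is reported with its default value and tagged
     with the RFC 6243 "default" attribute -->
<interface xmlns="urn:example:interfaces">
  <name>eth0</name>
  <mtu xmlns:wd="urn:ietf:params:xml:ns:netconf:default:1.0"
       wd:default="true">1500</mtu>
</interface>
```

And the equivalent JSON encoding, where the annotation is attached via
the RFC 7952 "@" member using the "ietf-netconf-with-defaults" module
name, as RFC 8040 Section 4.8.9 requires:

```json
{
  "example-interfaces:interface": {
    "name": "eth0",
    "mtu": 1500,
    "@mtu": {
      "ietf-netconf-with-defaults:default": true
    }
  }
}
```

If the document showed a pair like this, the XML-attribute vs. JSON
terminology question would largely answer itself.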
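And, to make the cost discussion about the several content-schema
methods more concrete, here is a rough sketch of what I understand the
simplified-inline method to look like. The element and namespace names
are my guess from the discussion and not verified against the current
draft revision; the module names, revision dates, and content are just
examples:

```xml
<instance-data-set
    xmlns="urn:ietf:params:xml:ns:yang:ietf-yang-instance-data">
  <name>acme-router-example</name>
  <content-schema>
    <!-- simplified-inline: just list the content defining modules -->
    <module>ietf-interfaces@2018-02-20</module>
  </content-schema>
  <content-data>
    <interfaces xmlns="urn:ietf:params:xml:ns:yang:ietf-interfaces">
      <interface>
        <name>eth0</name>
        <type xmlns:ianaift="urn:ietf:params:xml:ns:yang:iana-if-type"
            >ianaift:ethernetCsmacd</type>
      </interface>
    </interfaces>
  </content-data>
</instance-data-set>
```

The inline method would instead carry ietf-yang-library instance data
inside inline-schema, which is where the extra implementation cost and
the interoperability questions come in.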
/js

--
Juergen Schoenwaelder           Jacobs University Bremen gGmbH
Phone: +49 421 200 3587         Campus Ring 1 | 28759 Bremen | Germany
Fax:   +49 421 200 3103         <https://www.jacobs-university.de/>

_______________________________________________
netmod mailing list
netmod@ietf.org
https://www.ietf.org/mailman/listinfo/netmod