Hi,

wor...@ariadne.com (Dale R. Worley) wrote:
> Martin Bjorklund <m...@tail-f.com> wrote on Mon, 23 May 2016 15:43:09 +0200 
> (CEST):
> > wor...@ariadne.com (Dale R. Worley) wrote:
> 
> Almost all of these points I'm happy with the authors fixing in the
> manner they suggest.  The points where I have further comment are:
> 
> > > The incompatibilities should be marked whether the old, incompatible
> > > usage always causes an error "at compile time", "at run time", or
> > > changed behavior.
> > 
> > Ok, but I'd like to avoid the term "compile-time" here.
> 
> True...  The distinction I'm looking for is "Incompatible usage will
> cause an error report before the software is put into operation."
> vs. "Incompatible usage will cause an error report when the software is
> in operation." vs. "Incompatible usage will cause a difference in
> behavior without an error report."

The spec defines what is legal YANG.  How this is enforced by tools
must be up to the tools, right?

> > How about:
> > 
> >    The following changes are not backwards compatible with YANG version
> >    1:
> > 
> >    o  Changed the rules for the interpretation of escaped characters in
> >       double quoted strings.  This is a backwards incompatible change
> >       from YANG version 1.  A module that uses a character sequence that
> >       is now illegal must change the string to match the new rules.  See
> >       Section 6.1.3 for details.
> > 
> >    o  An unquoted string cannot contain any single or double quote
> >       characters.  This is a backwards incompatible change from YANG
> >       version 1.  A module that uses such quote characters must change
> >       the string to match the new rules.  See Section 6.1.3 for details.
> 
> I assume that the first two situations will cause an error report before
> the module is put into production.
> 
> >    o  Made noncharacters illegal in the built-in type "string".  This
> >       change affects the run-time behavior of YANG-based protocols.
> 
> What happens if you attempt to use a noncharacter in a string?

This depends on the protocol.  Noncharacters are illegal.  Maybe in
protocol A you can't even encode noncharacters, but in protocol B
you'll get a protocol error from the other side if you send
noncharacters.
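
For concreteness, a sender that wants to catch this before the protocol
does could check values locally.  The sketch below relies on the Unicode
definition of noncharacters (U+FDD0..U+FDEF plus the last two code
points of every plane); the function names are made up for illustration,
and nothing like this is mandated by the spec.

```python
def is_noncharacter(cp: int) -> bool:
    """Return True if code point cp is one of the 66 Unicode noncharacters."""
    in_arabic_block = 0xFDD0 <= cp <= 0xFDEF
    # U+FFFE/U+FFFF, U+1FFFE/U+1FFFF, ..., U+10FFFE/U+10FFFF
    plane_final = (cp & 0xFFFE) == 0xFFFE and cp <= 0x10FFFF
    return in_arabic_block or plane_final


def find_noncharacters(value: str) -> list:
    """Return the indexes in value that hold a noncharacter."""
    return [i for i, ch in enumerate(value) if is_noncharacter(ord(ch))]
```

What to do with a non-empty result (reject locally, or send and let the
peer report a protocol error) is still up to the protocol and the tool.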

> > >    o  leaf: A data node that exists in at most one instance in the data
> > >       tree.  A leaf has a value but no child nodes.
> > > 
> > > I'm not sure that this is correct; a leaf schema node inside a list
> > > schema node can have many instances in a data tree.  The real
> > > criterion is that it has a value but no child nodes.
> > 
> > Would "one instance per parent node" work?
> 
> I think that's correct.  The case I would expect to be tricky is a
> leaf contained in a list, but the XML for lists wraps an element around
> each list member, so each leaf has "one instance per parent node".

Maybe "one instance per parent instance"?  Put into context:

  - container: An interior data node that exists in at most one
    instance per parent instance in the data tree. 

etc.

> It might be worth inserting a definition of "node" as shorthand for
> "data node", as there seems to be no definition of "node" alone, but it
> is commonly used to mean "data node".  In particular, the reader needs
> to be fully aware that "node" is a node in the schema tree, not an
> instantiation thereof.

But saying that "node" means "data node" is not correct.  Sometimes
the text talks about "schema nodes" etc.  I think it is better to
qualify the usage of "node" where necessary to avoid the ambiguity.

> > > It would be useful to insert a note somewhere that all data is either
> > > "configuration" or "state" data.  It's hard to learn that now, because
> > > the terms "configuration data" and "state data" are defined only by
> > > reference to RFC 6241.
> > 
> > But this is not strictly correct.  For example RPC input is neither
> > config nor state.  Section 3 has:
> > 
> >   o data tree: An instantiated tree of any data modeled with YANG,
> >     e.g., configuration data, state data, combined configuration and
> >     state data, RPC or action input, RPC or action output, or
> >     notification.
> 
> Section 4.1 contains a similar statement.
> 
> I think the problem I'm having is with the first sentences of 4.2.3:
> 
>    YANG can model state data, as well as configuration data, based on
>    the "config" statement.
> 
> At first read, this suggests that data is partitioned into
> "configuration data" and "state data".  What's really happening is that
> there are various sorts of data, but some contexts admit a mixture of
> configuration and state data, so the data model definitions of those
> contexts require that individual data items can be flagged as
> configuration vs. state data.  Maybe starting 4.2.3 like this would work:
> 
>    In some contexts, data modeled by YANG can contain mixtures of state
>    data and configuration data, based on the "config" statement.  When a
>    node is tagged with "config false", its subhierarchy is flagged as
>    state data.

I think that this might be a bit confusing (which contexts?  what
exactly is this mixture?) and too detailed for this introductory
section.  I would prefer to keep the original text.

> > > - section 4.2.7
> > > 
> > >    YANG allows the data model to segregate incompatible nodes into
> > >    distinct choices using the "choice" and "case" statements.  The
> > >    "choice" statement contains a set of "case" statements that define
> > >    sets of schema nodes that cannot appear together.  Each "case" may
> > >    contain multiple nodes, but each node may appear in only one "case"
> > >    under a "choice".
> > > 
> > >    When a node from one case is created in the data tree, all nodes from
> > >    all other cases are implicitly deleted.  The server handles the
> > >    enforcement of the constraint, preventing incompatibilities from
> > >    existing in the configuration.
> > > 
> > >    The choice and case nodes appear only in the schema tree but not in
> > >    the data tree.  The additional levels of hierarchy are not needed
> > >    beyond the conceptual schema.
> > > 
> > > This description reads very oddly.  AFAICT, "choice" simply defines a
> > > union type, where each alternative is a container (usually called a
> > > structure or record).  Or rather, it's conceptually a container, but
> > > in an XML instance of data, there are no start-end tags for the
> > > structure as a whole (paralleling the lack of start-end tags for lists
> > > as a whole).  Once you state that, it's clear why there are "sets of
> > > schema nodes that cannot appear together".
> > 
> > I'm worried that if we describe "case" as a special kind of
> > "container" it will be pretty confusing.
> 
> Perhaps so.  But it seems to me that it needs to be pointed out that the
> "choice" and "case" are not materialized as XML nodes themselves; you
> can only tell which case is present by the fact that one or more of its
> elements are present.  Hmmm...  The third paragraph says that, more or
> less.  I think the second paragraph would be clearer if the third
> paragraph was moved before it, and some text added:
> 
>    The choice and case nodes appear only in the schema tree but not in
>    the data tree.  The additional levels of hierarchy are not needed
>    beyond the conceptual schema.  >>The presence of a case is indicated by
>    the presence of one or more of the nodes within it.<<
> 
>    When a node from one case is created in the data tree, all nodes from
>    all other cases are implicitly deleted.  The server handles the
>    enforcement of the constraint, preventing incompatibilities from
>    existing in the configuration.

Ok.
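
To make sure we mean the same thing by the reordered text, here is a
rough sketch of the behaviour it describes.  It assumes a flat dict as
the "data tree" under the choice and a mapping from case name to node
names; both the representation and the function names are hypothetical,
not anything a server is required to do.

```python
from typing import Dict, Optional, Set


def active_case(cases: Dict[str, Set[str]], present: Set[str]) -> Optional[str]:
    """Infer which case is present from the nodes that exist in the data."""
    hits = [name for name, nodes in cases.items() if nodes & present]
    if len(hits) > 1:
        raise ValueError("nodes from more than one case are present")
    return hits[0] if hits else None


def create_node(cases: Dict[str, Set[str]], data: dict, node: str, value) -> None:
    """Create a node; nodes from all other cases are implicitly deleted."""
    owner = next((name for name, nodes in cases.items() if node in nodes), None)
    if owner is not None:
        for name, nodes in cases.items():
            if name != owner:
                for other in nodes:
                    data.pop(other, None)
    data[node] = value
```

Note that neither the choice nor the case shows up as a key in the data;
the case is only visible through the nodes it contains.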

> > > - section 4.2.8
> > > 
> > > It would help to give a general discussion of augmenting.  If one
> > > module augments another, is the augmented data definition part of the
> > > data structures defined by augmenting module or the augmented one?
> > > 
> > > If I've got it right, this module:
> > > 
> > >    module /system/login/user {
> > >      namespace "urn:user";
> > >      prefix "user";
> > > 
> > >      list user {
> > >        key "name";
> > >        leaf name {
> > >          type string;
> > >        }
> > >        leaf full-name {
> > >          type string;
> > >        }
> > >        leaf class {
> > >          type string;
> > >        }
> > >      }
> > >    }
> > > 
> > > gives this as the XML encoding of a typical data tree:
> > > 
> > >      <user xmlns="urn:user">
> > >        <name>alicew</name>
> > >        <full-name>Alice N. Wonderland</full-name>
> > >        <class>drop-out</class>
> > >        <other:uid>1024</other:uid>
> > >      </user>
> > 
> > Did you mean:
> > 
> >       <user xmlns="urn:user">
> >         <name>alicew</name>
> >         <full-name>Alice N. Wonderland</full-name>
> >         <class>drop-out</class>
> >       </user>
> > 
> > ?
> 
> Yes, I made a mistake there.  (I must have.)
> 
> > > whereas the *existence* (in some sense) of this module:
> > > 
> > >    module /system/login/user-augmenter {
> > >      namespace "urn:user-extension";
> > >      prefix "ext";
> > > 
> > >      augment /system/login/user {
> > >        when "class != 'wheel'";
> > >        leaf uid {
> > >          type uint16 {
> > >            range "1000 .. 30000";
> > >          }
> > >        }
> > >      }
> > >    }
> > > 
> > > changes the encoding of the data tree to:
> > > 
> > >      <user xmlns="urn:user" xmlns:ext="urn:user-extension">
> > >        <name>alicew</name>
> > >        <full-name>Alice N. Wonderland</full-name>
> > >        <class>drop-out</class>
> > >        <ext:uid>1024</ext:uid>
> > >      </user>
> > > 
> > > What is it that triggers this action-at-a-distance?
> > 
> > The fact that the server implements both modules.
> 
> It might be helpful to state that explicitly.  Could we add to the
> second paragraph of 4.2.8 this:
> 
>    (( The "augment" statement defines the location in the data model
>    hierarchy where new nodes are inserted, and the "when" statement
>    defines the conditions when the new nodes are valid. ))  When a
>    server implements a module containing an "augment" statement, that
>    implies that the server's implementation of the augmented module
>    contains the additional nodes.

Ok.

> > > - section 5.1.2
> > > 
> > >    YANG allows modeling of data in multiple hierarchies, where data may
> > >    have more than one top-level node.  Models that have multiple top-
> > >    level nodes are sometimes convenient, and are supported by YANG.
> > > 
> > > Any single module seems to define only one data structure, the one
> > > whose elements are the items listed in the module, and Yang doesn't
> > > seem to provide for any sort of "model" definition other than a
> > > module.  Or are the nodes at the top level within the module all
> > > usable as independent data structures?
> > 
> > The term "data structure" isn't used in the document.  YANG allows
> > a module to define more than one subtree.
> 
> OK, I think I see where I was confused.  I think that would have been
> avoided if the 5.1.2 started:
> 
>    YANG allows modeling of data in multiple hierarchies.  Each top-level
>    data node in a module defines an independent hierarchy.  Models that
>    have ...

What does "independent" imply?  They are not necessarily semantically
independent.  Maybe s/independent/separate/ ? 

> > >    NETCONF is capable of carrying any XML content as the payload in the
> > >    <config> and <data> elements.  The top-level nodes of YANG modules
> > >    are encoded as child elements, in any order, within these elements.
> > > 
> > > This isn't very clear.  AFAICT from the example:
> > > 
> > >    NETCONF <config> and <data> elements each contain sequences of
> > >    instances of top-level nodes of the appropriate YANG module.
> > 
> > I am not sure that this proposed text is more clear, maybe b/c I don't
> > see what in the original text is unclear.
> 
> See the preceding item for what I think the problem is.
> 
> > > Given that this example is about a particular module, it would help if
> > > the module namespace and prefix were used correctly in the XML example.
> > 
> > I think the example uses the XML namespace correctly.
> 
> What I was noticing is that the module statement starts:
> 
>      module example-config {
>        yang-version 1.1;
>        namespace "urn:example:config";
>        prefix "co";
> 
>        container system { ... }
>        ...
> 
> And the XML starts:
> 
>      <rpc message-id="101"
>           xmlns="urn:ietf:params:xml:ns:netconf:base:1.0"
>           xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0">
>        <edit-config>
>          <target>
>            <running/>
>          </target>
>          <config>
>            <system xmlns="urn:example:config">
>              <!-- system data here -->
>              ...
> 
> But it seemed to me that the recommended style for the XML has the
> specified prefix "co" used on the data nodes defined in the module:
> 
>            ...
>            <co:system xmlns:co="urn:example:config">
>              <!-- system data here -->
>              ...
> 
> Looking again at 7.1.4 and the examples, it seems that the style of the
> examples is to use an XMLNS prefix only when a data node is from a
> different namespace than the top-level node containing it, i.e., data
> nodes with imported definitions.  I was expecting to see prefixes used
> even for non-imported data nodes.  But reading 7.1.4 suggests that I am
> incorrect there.

We don't make any recommendation on using explicit prefixes or default
namespace.

> > > - section 5.2
> > > 
> > >    YANG modules and submodules are typically stored in files, one module
> > >    or submodule per file.
> > > 
> > > This might be clearer as "... one module or submodule statement per
> > > file."
> > 
> > How is "one module per file" interpreted differently than "one module
> > statement per file"?
> 
> The problem is that "module" is ambiguous.  One meaning is the
> collection of all data definitions that are usable in the NETCONF
> operations.  The other is the contents of the module statement.  In a
> sense, module (1) = module (2) plus zero or more submodules.

Ok, I see what you mean.  This makes sense.  I'll change to "one
module or submodule statement".

> See this in 4.2.1:
> 
>    A module may be divided into submodules, based on the needs of the
>    module owner.  The external view remains that of a single module,
>    regardless of the presence or size of its submodules.
> 
> So one could read the current sentence as meaning that a file could
> contain a module (1) statement and all of the submodule statements that
> are included by the module (1) statement, and so the contents of the
> file as a whole constitutes a module (2).  The question is, does one
> module (1) go in a file, or one module (2) go in a file?
> 
> > >    XML namespaces for private modules are assigned by the organization
> > >    owning the module without a central registry.  Namespace URIs MUST be
> > >    chosen so they cannot collide with standard or other enterprise
> > >    namespaces, for example by using the enterprise or organization name
> > >    in the namespace.
> > > 
> > > I don't see the use of "without a central registry"; haven't you
> > > already said the module is private?  (Or is "without a central
> > > registry" a term of art?)
> > 
> > It could be that all namespaces had to be registered with IANA for
> > example.
> 
> I guess I consider that namespaces registered with IANA or another
> organization are the uncommon case.  But perhaps in network management
> the reverse is the case.

The text simply says that a central registry is not needed.  It seems
you agree with this.

> > > - section 5.5
> > > 
> > > This section talks a great deal about why scoping is done, but isn't
> > > specifically clear what the rules are.  (Traditionally, the rule is
> > > stated as either what range of source text a particular definition is
> > > visible in, or given an identifier use, how the matching definition is
> > > found.)
> > > 
> > >    Scoped definitions MUST NOT shadow definitions at a higher scope.  A
> > >    type or grouping cannot be defined if a higher level in the schema
> > >    hierarchy has a definition with a matching identifier.
> > > 
> > > I think the second sentence is confusing, because (to me) the term
> > > "schema hierarchy" implies the data tree structure, whereas the
> > > scoping rule seems to be intended to be entirely lexical.  In either
> > > case, this needs to be clarified.
> > 
> > The first sentence says:
> > 
> >   Typedefs and groupings may appear nested under many YANG statements,
> >   allowing these to be lexically scoped by the hierarchy under which
> >   they appear.
> > 
> > Maybe it would help to move paragraph 2 and 3 to the end, so that the
> > motivational text is at the end of the section?
> 
> It looks like current paragraphs 1, 4, and 5 are prescriptive, and 2 and
> 3 are motivational.  Putting 1, 4, and 5 together should make them
> clearer.  But since paragraph 2 starts with "Scoping...", it really
> needs to be after 1.  So, yes, moving 2 and 3 to the end seems like the
> clearest order.  Does it make sense to you to put all of the motivation
> at the end of this section?

I prefer to have motivational text before the rules ("why" before
"how").  Hmm.  The preferred order is "what", "why", "how", and that
is really the order we currently have (1 is "what", 2 + 3 is "why" and
4 + 5 is "how").

But obviously this was confusing to you, so we probably need to make
some clarification.

Since the problem was with the term "schema hierarchy", maybe we could
change this to "statement hierarchy" instead, to stress the fact that
it is lexical scoping?

[sorry for missing your original point]

> > > - section 5.6.5
> > > 
> > > I suspect that if a server implements module A, and A imports B, the
> > > server is not required to implement B.  (Even if A references
> > > groupings in B.)  The answer to that should probably be stated
> > > explicitly.
> > 
> > It is not so simple.  The rest of the text defines the rules for when
> > the server is required to implement B.
> 
> Yes, I was wrong there.  (I can't see why I made that mistake.)
> 
> > > Is this the correct way to specify multiple revisions of this module?
> > > Or is this an informal notation for the two module definitions:
> > > 
> > >      module b {
> > >        yang-version 1.1;
> > >        namespace "urn:example:b";
> > >        prefix "b";
> > > 
> > >        revision 2015-04-04;
> > >        revision 2015-01-01;
> > > 
> > >        typedef myenum {
> > >          type enumeration {
> > >            enum zero; // added in 2015-01-01
> > >            enum one;  // added in 2015-04-04
> > >          }
> > >        }
> > > 
> > >        container x {  // added in 2015-01-01
> > >          container y; // added in 2015-04-04
> > >        }
> > >      }
> > > 
> > >      module b {
> > >        yang-version 1.1;
> > >        namespace "urn:example:b";
> > >        prefix "b";
> > > 
> > >        revision 2015-01-01;
> > > 
> > >        typedef myenum {
> > >          type enumeration {
> > >            enum zero; // added in 2015-01-01
> > >          }
> > >        }
> > > 
> > >        container x {  // added in 2015-01-01
> > >        }
> > >      }
> > > 
> > > If it is the latter, it would help the new reader to explain the
> > > convention (as no example of a multi-revision module is given in the
> > > document).
> > 
> > The comments are not required.  There is no correct (mandated) way to
> > specify which changes are done in a new revision.
> 
> What I was concerned with is what does the above module statement do?
> If I understand correctly, it defines the 2015-04-04 revision of b, and
> also tells us that a 2015-01-01 revision exists, but does not define
> that revision.  Implicitly, there must be a .yang file somewhere
> containing the module statement that defines the 2015-01-01 revision of
> b.
> 
> But despite that the text talks about the 2015-01-01 revision ('can
> implement revision "2015-01-01" or "2015-04-04" of module "b"'), that
> file is not included in the example, which is not what I expected, which
> led me to wonder if the 2015-01-01 revision was somehow also defined by
> the presented "module b" statement.  I would have had less trouble with
> the example if the definition of the earlier revision of b was also
> given.

Ok, I see what you mean.  This makes sense.  I'll add 

  module b {
    yang-version 1.1;
    namespace "urn:example:b";
    prefix "b";

    revision 2015-01-01;

    typedef myenum {
      type enumeration {
        enum zero;
      }
    }

    container x {
    }
  }

and

  module c {
    yang-version 1.1;
    namespace "urn:example:c";
    prefix "c";

    revision 2015-02-02;

    typedef bar {
      ...
    }
  }


to the example.


> > > - section 6
> > > 
> > >    YANG modules use the UTF-8 [RFC3629] character encoding.
> > > 
> > > Better to say that YANG modules use the Unicode character set and are
> > > stored in files using the UTF-8 character encoding.
> > 
> > Do you mean:
> > 
> > OLD:
> > 
> >    YANG modules use the UTF-8 [RFC3629] character encoding.
> > 
> >    Legal characters in YANG modules are the Unicode and ISO/IEC 10646
> >    [ISO.10646] characters, including tab, carriage return, and line feed
> >    but excluding the other C0 control characters, the surrogate blocks,
> >    and the noncharacters.  The character syntax is formally defined by
> >    the rule "yang-char" in Section 14.
> > 
> > NEW:
> > 
> >    Legal characters in YANG modules are the Unicode and ISO/IEC 10646
> >    [ISO.10646] characters, including tab, carriage return, and line feed
> >    but excluding the other C0 control characters, the surrogate blocks,
> >    and the noncharacters.  The character syntax is formally defined by
> >    the rule "yang-char" in Section 14.
> > 
> >    YANG modules and submodules are stored in files using the UTF-8
> >    [RFC3629] character encoding.
> 
> Yes, that's right.

Ok, fixed.

> > > This might be a good place to talk about line-breaks in Yang source
> > > files.  It looks from section 14 that line-breaks MUST be either CRLF
> > > or LF.
> > > 
> > > It is worth noting (either here or in 6.1.3) how line-breaks inside
> > > quoted strings are transcribed into the string's value.  As now
> > > written, it seems that the line-break is transcribed identically to
> > > how it is represented in the source.  That means (1) If the source is
> > > recoded with the other type of line break, the semantics of the Yang
> > > code change; and (2) if the source uses line-breaks of one type (CRLF
> > > or LF), only that type can be directly transcribed into string values.
> > > (But regardless of the source line-breaks, an LF can be transcribed
> > > into a double-quoted string with "\n".  But a CRLF cannot be
> > > transcribed into a double-quoted string with escape sequences.  Was
> > > that intended, or was "\r" intended to be legal?)
> 
> Don't forget this.

Adding "\r" would be backwards incompatible with YANG 1.  You can
always get the same effect by using single quoted strings.

> > > - section 6.1
> > > 
> > >    This section details the rules for recognizing tokens from an input
> > >    stream.
> > > 
> > > Generally, language definitions intersperse the narrative text with
> > > the relevant grammar definitions.  Yang's statement grammar is simple
> > > enough that one doesn't need to see the context-free part of the
> > > grammar to understand the narrative for statements.  But when reading
> > > about tokenization, not having the grammar presented at the same time
> > > is quite a burden.  I'd recommend duplicating the relevant productions
> > > from section 14 into the subsections of section 6.
> > > 
> > > There is some sort of exposition problem.  The result of
> > > "tokenization" is that the sequence of characters of the source is
> > > converted into a sequence of "tokens".  Then some subset of the tokens
> > > is discarded as being non-significant (e.g., whitespace and comments),
> > > and the remainder is parsed with a context-free grammar.  Here I can't
> > > figure out what the set of tokens is.  Looking at the grammar in
> > > section 14, it seems to be a context-free grammar on characters.  But
> > > that implies that there is no separate tokenization phase.
> > > 
> > > An example that shows the problems:
> > > 
> > >    mod:ext
> > > 
> > > Is this one token, which is also an extension keyword, or is it a
> > > sequence of three tokens?
> > 
> > The text says:
> > 
> >   A token in YANG is either a keyword, a string, a semicolon (";"), or
> >   braces ("{" or "}").
> > 
> > and:
> > 
> >   A keyword is [...] or a prefix identifier, followed by a colon
> >   (":"), followed by a language extension keyword.
> > 
> > So "mod:ext" is one token.
> 
> Certainly it can be one token.  My question is how do I verify that it is
> not a string?  I think the origin of my confusion here is
> that I haven't spotted a clear syntax for unquoted string.  In most
> programming languages, mod:ext would be parsed as an identifier, a
> colon, and an identifier.  In YANG, identifiers are usually tokenized as
> strings, so I ask whether YANG tokenizes it as a string, a colon, and a
> string.
> 
> Looking at the beginning of 6.1.3, it doesn't appear that an unquoted
> string is forbidden from containing a colon.
> 
> I think that the underlying problem is that I'm not clear on what gets
> tokenized as an unquoted string.

Note that this is legal YANG:

   leaf type {
     type string;
   }
     
I think there are two ways to look at this.  Either we describe the
tokenizer as being context-dependent, or we describe the "argument" in
a "statement" to be a "string or keyword".

In the latter case maybe we can do:

OLD:

  If a string contains any space, tab, or newline characters, a single
  or double quote character, a semicolon (";"), braces ("{" or "}"),
  or comment sequences ("//", "/*", or "*/"), then it MUST be enclosed
  within double or single quotes.

NEW:

  An unquoted string is any sequence of characters that does not start
  with a double or single quote character, is not a keyword, and does
  not contain any space, tab, or newline characters, a single or
  double quote character, a semicolon (";"), braces ("{" or "}"), or
  comment sequences ("//", "/*", or "*/").

In section 6.3 we must also do:

OLD:

   The argument is a string, as defined in Section 6.1.2.

NEW:

   The argument is a string or a keyword, as defined in Section 6.1.2.
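
As a sanity check of the proposed unquoted-string definition, here is a
small sketch of the test a tool might apply.  The function name and the
"keywords" parameter are hypothetical; treating the empty string and CR
as not allowed unquoted is my assumption, not something the proposed
text says.

```python
# Characters that force quoting: space, tab, newline, quotes, ";", "{", "}".
# "\r" is included on the assumption that CRLF counts as a newline.
FORBIDDEN_CHARS = set(" \t\n\r'\";{}")
FORBIDDEN_SEQUENCES = ("//", "/*", "*/")


def can_be_unquoted(s: str, keywords: frozenset = frozenset()) -> bool:
    """Return True if s matches the proposed unquoted-string definition."""
    if not s:
        return False  # assumption: an empty argument has to be quoted
    if s in keywords:
        return False  # a keyword is not a string under the proposal
    if any(ch in FORBIDDEN_CHARS for ch in s):
        return False
    return not any(seq in s for seq in FORBIDDEN_SEQUENCES)
```

With this reading, "mod:ext" passes as an unquoted string unless it is
recognized as a keyword, which is why the 6.3 change to "a string or a
keyword" is needed as well.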



> > > -- it must be an unquoted string.
> > > 
> > >    If a double-quoted string contains a line break followed by space or
> > >    tab characters that are used to indent the text according to the
> > >    layout in the YANG file, this leading whitespace is stripped from the
> > >    string, up to and including the column of the double quote character,
> > >    or to the first non-whitespace character, whichever occurs first.  In
> > >    this process, a tab character is treated as 8 space characters.
> > > 
> > > This description isn't quite careful enough.  Better:
> > > 
> > >    If a double-quoted string contains a line break followed by space or
> > >    tab characters, an initial part of this whitespace is removed from the
> > >    string.  The amount removed is the longest prefix whose width is no
> > >    larger than the width of the prefix of Yang source line containing
> > >    the opening double quote character of the string to and including the
> > >    opening double quote character.  For this purpose, the width of a
> > >    tab character is 8 and the width of any other character is 1.
> > > 
> > > This does assume that all tabs are considered to have width 8, that
> > > is, tabs do not have the usual semantics of "advance to the next
> > > column that is divisible by 8".  That will sometimes cause unexpected
> > > results, e.g., if some source lines start with SPC TAB.  (Consider
> > > that whitespace before a line break is removed, which suggests the
> > > intention is that the value of the string should depend only on its
> > > visual appearance.)
> > > 
> > > Also, we're using the convention that "whitespace" does NOT include CR
> > > or LF, which is not always how the term is used.  Perhaps a definition
> > > of "whitespace" should be put in section 3.
> > > 
> > > There is also the special case:
> > > 
> > >    SPC " LF
> > >    TAB x "
> > > 
> > > Is the initial TAB of the second line to be removed or not?  There is
> > > no whitespace removal in the second line that will exactly reach the
> > > opening double quote.  As I've written it, the TAB is not removed.
> 
> Don't forget this ugly special case.

So, let's follow the rules.  We need to trim to the column of the
double quote character (2).  The second line starts with "space or
tab" so we do whitespace trimming, while treating the tab as 8
spaces.  So from 8 spaces we subtract 2, and get 6 remaining spaces
in front of the "x":

  LF SPC SPC SPC SPC SPC SPC x
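
The same reading as a sketch, in case it helps to compare against your
proposed wording.  It assumes the raw text between the quotes and the
1-based column of the opening double quote are already known, and it
ignores the separate rule about whitespace before a line break; the
function name is made up.

```python
def strip_continuation_indent(text: str, quote_column: int) -> str:
    """Trim leading whitespace on continuation lines of a double-quoted string.

    Each leading tab counts as 8 spaces (not "advance to the next tab stop").
    Up to quote_column columns are removed, or fewer if the first
    non-whitespace character comes earlier.
    """
    first, *rest = text.split("\n")
    out = [first]
    for line in rest:
        body = line.lstrip(" \t")
        leading = line[: len(line) - len(body)]
        width = sum(8 if ch == "\t" else 1 for ch in leading)
        kept = max(0, width - quote_column)
        out.append(" " * kept + body)
    return "\n".join(out)
```

For the special case above, strip_continuation_indent("\n\tx", 2) gives
LF, six spaces, and "x".  Your wording (remove the longest prefix whose
width is no larger than the quote column) would instead leave the TAB in
place, so the two readings really do differ here.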

> > >    Within a double-quoted string (enclosed within " "), a backslash
> > >    character introduces a special character, which depends on the
> > >    character that immediately follows the backslash:
> > > 
> > > s/introduces a special character/introduces a representation of a
> > > special character/
> > 
> > Hmm.  I think the original text is correct.  In the string "x\ny", the
> > backslash introduces the LF character into the string.  What is the
> > representation of LF that it would introduce?
> 
> I would call the character pair \n a "representation" of LF, and the
> first character of that pair, \, "introduces" the representation, in the
> sense "to make (something) known by formal announcement or
> recommendation", because it is the distinctive first character of the
> representation.  There may be an ambiguity in that "introduce" can also
> be used to mean "to add (something) to a system, a mixture, or a
> container", in this case, "to put a newline character into the
> string".

Ok, this makes sense.  I'll do the substitution you suggested:

s/introduces a special character/introduces a representation of a
special character/


> > > - section 7.7.2
> > > 
> > > It appears to be assumed that the server's behavior is the same if (1)
> > > a leaf-list node is specified with zero values, and (2) if the
> > > leaf-list node is absent and has no default values.
> > > 
> > >    Otherwise, if the leaf-list's type has
> > >    a default value, and the leaf-list does not have a "min-elements"
> > >    statement with a value greater than or equal to one, then the leaf-
> > >    list's default value is the type's default value.  In all other
> > >    cases, the leaf-list does not have any default values.
> > > 
> > > I think it would help to s/the type's default value/one instance of
> > > the type's default value/.
> > 
> > I don't understand why this is better?
> 
> If I have an integer leaf, then values of the leaf are integers and it's
> proper to say that the default value of the leaf is 1.  But if I have an
> integer leaf-list, the values of the leaf-list are sequences of
> integers, and it's not quite proper to say that the default value of the
> leaf-list is the integer 1; properly said, the default value is a
> sequence containing one integer, which integer is 1 -- a sequence
> containing one item is not the same as the one item.

Ok; I read your proposal as "an instance of" rather than "one instance
of".  I'll make this change.

> > > Is the final condition correct?  E.g., if the leaf-list's type has a
> > > default value, and it has a min-elements with value 2, then the
> > > leaf-list has no default values
> > 
> > Correct.
> > 
> > > -- which is invalid.
> > 
> > I don't understand what you mean?  An example is this:
> > 
> >   typedef foo {
> >     type int32;
> >     default 42;
> >   }
> > 
> >   leaf-list bar {
> >     type foo;
> >     min-elements 1;
> >   }
> > 
> > In this example, "bar" doesn't have a default value - since
> > min-elements means that the user MUST provide at least one value, and
> > the default value is only used if no value has been provided.
> 
> Let me run through the cases and you can check that I understand them
> correctly:
> 
>    leaf-list bar {
>      type foo;
>    }
>    -or-
>    leaf-list bar {
>      type foo;
>      min-elements 0;
>    }
> 
> The default value is one instance of 42.  If the data tree doesn't
> supply a value, the effect is the same as <bar>42</bar>.
> 
>    leaf-list bar {
>      type foo;
>      min-elements 1;
>    }
> 
> There is no default value.  If the data tree doesn't supply a value,
> then no values are implied.  But that violates the restriction, so the
> data tree must supply one or more values -- if there are no values, that
> violates the restriction and the data tree is invalid.
> 
> Am I correct?

Yes.
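
The three cases above can be written down as a minimal Python sketch.
The function name and the plain-Python representation are hypothetical
and only for illustration; the sketch also assumes that the leaf-list
has no "default" statements of its own, so only the type's default
value is in play.

```python
from typing import List, Optional


def leaf_list_defaults(type_default: Optional[int],
                       min_elements: Optional[int] = None) -> List[int]:
    """Default values of a leaf-list without its own "default" statements.

    Hypothetical helper for illustration; not part of any YANG tool.
    """
    # min-elements >= 1 means the data tree must supply values itself,
    # so the leaf-list has no default values.
    if min_elements is not None and min_elements >= 1:
        return []
    # No default on the type either: nothing to inherit.
    if type_default is None:
        return []
    # Otherwise the default is a sequence holding ONE instance of the
    # type's default value, not the bare value itself.
    return [type_default]


print(leaf_list_defaults(42))      # no min-elements statement
print(leaf_list_defaults(42, 0))   # min-elements 0
print(leaf_list_defaults(42, 1))   # min-elements 1
```

Returning a list rather than a bare integer mirrors the point made
earlier: the default of a leaf-list is a sequence containing one item,
which is not the same thing as the item.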

> > > - section 7.7.7.1
> > > 
> > >    The entries in the list are sorted according to an order determined
> > >    by the system.  The "description" string for the list may suggest an
> > >    order to the server implementor.  If not, an implementation is free
> > >    to sort the entries in the most appropriate order.
> > > 
> > > This is mealy-mouthed about what is required and what is recommended.
> > > Better would be to omit "If not, an implementation is free to sort the
> > > entries in the most appropriate order."
> > 
> > I think the last sentence makes it clear that you cannot assume
> > anything about the order, e.g., that a list that has an int32 as the
> > key is sorted in numerical order.
> 
> I guess what is bothering me is "appropriate", which to me implies that
> there is some standard of appropriateness.  I think I would prefer "to
> sort the entries in any order".

Ok.

> Actually, there is a somewhat subtle problem:  If I say "the system can
> sort them any way it wants", I am asserting that *there is a sorting
> order*.  Which means that if value A is put before value B at one time,
> then if values A and B are in the list at some other time, A will
> precede B.

The next sentence says:

  An implementation SHOULD use the same order for the same data,
  regardless of how the data were created.  Using a deterministic
  order will make comparisons possible using simple tools like "diff".

> I think the way to remove that implication is to replace "sort" with
> "order" -- a change which you devised for 7.7.7.2, I see:

Ok.  (To me, this means the same thing, and the repeated usage of
"order" makes the text a bit clumsy: "The entries in the list are
ordered according to an order defined by the user." -- but if you
think that the usage of "sort" is problematic, it is fine with me to
avoid that word.)

>     The entries in the list are ordered as determined by the system.
>     The "description" string for the list may suggest an order to the
>     server implementor.  But an implementation is free to sort the
>     entries in any other manner.
> 
> > > - section 7.7.7.2
> > > 
> > >    The entries in the list are sorted according to an order defined by
> > >    the user.  This order is controlled by using special XML attributes
> > >    in the <edit-config> request.
> > > 
> > > Actually, the entries aren't "sorted", as there is no requirement that
> > > any particular set of values always has a particular order; the user
> > > can insert 1, 2, 3 at one time, and 3, 2, 1 at another.  The correct
> > > meaning is splicing these two sentences:
> > > 
> > >    The user orders entries in the list by using special XML attributes
> > >    in the <edit-config> request.
> > 
> > I prefer to keep the second sentence, and actually modify it to be:
> > 
> >   In NETCONF, this order is controlled by using special XML attributes
> >   in the <edit-config> request.
> 
> Ah, yes, of course that only applies in NETCONF.
> 
> > Would it help to s/sorted/ordered/ in the first sentence?
> 
> Yes, that is a definite improvement.
> 
> > > - section 7.9.5.  XML Encoding Rules
> > > 
> > >    The choice and case nodes are not visible in XML.
> > > 
> > > Use this sort of statement for lists and leaf-lists, which have no
> > > visible XML for the lists as a whole.
> > 
> > The current text says:
> > 
> >   A list is encoded as a series of XML elements, one for each entry in
> >   the list.
> > 
> > I think this text is clear.  I think it might be a bit confusing to
> > say that a list node is not visible in the XML - it is visible, per
> > entry.
> 
> My difficulty is that by default, I would expect that XML representing a
> "list of things" would have a pattern like
> 
>     <list-of-things>
>         thing
>         thing
>         thing
>     </list-of-things>
> 
> or perhaps 
> 
>     <list-of-things>
>         <thing>...</thing>
>         <thing>...</thing>
>         <thing>...</thing>
>     </list-of-things>
>        
> whereas Yang uses
> 
>     <thing>...</thing>
>     <thing>...</thing>
>     <thing>...</thing>
> 
> that is, there is an element for each list item, but not one for the
> list as a whole.  I don't know why I expect there to be an XML element
> that wraps the whole list, I suspect it is from my background in
> programming language theory.  OTOH, YANG seems to use the opposite
> convention of often avoiding XML elements to wrap aggregate types as a
> whole, and only having XML elements for the component data items.
> (Although containers have enwrapping XML elements.)
> 
> Perhaps this revision would work:
> 
>    A list is encoded as a series of XML elements, one for each entry in
>    the list.  There is no XML element surrounding the list as a whole.

Ok.

> > > - section 7.10
> > > 
> > >    The "anydata" statement is used to represent an unknown set of nodes
> > >    that can be modelled with YANG, except anyxml, ...
> > > 
> > > What is the practical consequence of this restriction?  It's not at
> > > all clear to me which chunks of XML could be modelled with Yang and
> > > which could not.
> > 
> > In practice, this means that if you see an instance of anydata, there
> > is a YANG model behind that chunk somewhere, although you might not
> > know what it is.  In an implementation it means you can use your
> > normal parser (which maybe doesn't handle mixed content and processing
> > instructions etc).
> 
> Ah, yes.  That last sentence is the practical consequence, and yes, it's
> directly implied by the text in the draft.
> 
> > > - section 7.12
> > > 
> > >    A grouping is like a "structure" or a "record" in conventional
> > >    programming languages.
> > > 
> > > Add, "... though no grouping node exists in the XML."
> > 
> > Since we try to decouple the definition of the language from the
> > encodings, I'd prefer to not add this.
> 
> Yes...  What I'm talking about is actually the XML encoding of a uses
> statement.  7.13.3 specifies by omission that the XML for the set of
> nodes in the grouping is not wrapped in a "grouping" element.
>  
> > > - section 7.18
> > > 
> > > It would help to insert a discussion of the global purpose and use of
> > > identities.  In particular, what things can refer to them?  It looks
> > > like only identityref leafs can do so, but I could easily be wrong.
> > > Also, more discussion in the example (7.18.3) would be helpful.
> > > 
> > >    The "identity" statement is used to define a new globally unique,
> > >    abstract, and untyped identity.  Its only purpose is to denote its
> > >    name, semantics, and existence.
> > > 
> > > The two uses of "its" in the second sentence are ambiguous.  I think
> > > you mean "The statement's only purpose is to denote the identity's
> > > name, semantics, and existence."
> > 
> > No, rather "The identity's only purpose...".  The point is that this
> > is just an abstract concept.
> 
> OK, but I think you may want to replace the first "Its" with "The
> identity's".

Ok.

> > > - section 7.21.4
> > > 
> > >    The "reference" statement takes as an argument a string ...
> > > 
> > > Perhaps s/a string/a human-readable string/.
> > 
> > "string" refers to the YANG token "string".  The same wording is used
> > across the document for all arguments.
> 
> I was thinking that it is a string, but in this particular case, it is
> supposed to be human-readable, whereas strings in other contexts aren't
> expected to be.

Ok.  Maybe:

OLD:

  The "reference" statement takes as an argument a string that is used
  to specify a textual cross-reference to an external document,

NEW:

  The "reference" statement takes as an argument a string that is used
  to specify a human-readable cross-reference to an external document,


> > > - section 7.21.5
> > > 
> > > Note that if a data definition has both an "if-feature" and a "when",
> > > then the "if-feature" is tested first.
> > > 
> > >    If the XPath expression references any node that also has associated
> > >    "when" statements, these "when" expressions MUST be evaluated first.
> > >    There MUST NOT be any circular dependencies in these "when"
> > >    expressions.
> > > 
> > > I think this could be better phrased:
> > > 
> > >    If the XPath expression references any node that also has
> > >    associated "when" statements, then the "when" expressions of the
> > >    referenced nodes MUST be evaluated first.  There MUST NOT be any
> > >    circular dependencies among "when" expressions.
> > 
> > Ok to the last sentence.  Do you think that the word "these" in the
> > first sentence is ambiguous?
> 
> I must have thought it was unclear when I read it, otherwise I would not
> have suggested changing it.  But reading it again, I think that there is
> no ambiguity.  Perhaps it would be a little clearer to use 'those "when"
> expressions' rather than 'these "when" expressions'.  (I can't explain
> clearly why "those" seems less ambiguous than "these".)

Ok, as a non-native English speaker I trust you that "those" is better.

> > > - section 9.9
> > > 
> > >    The leafref type is used to declare a constraint on the value space
> > >    of a leaf, based on a reference to a set of leaf instances in the
> > >    data tree.  The "path" substatement (Section 9.9.2) selects a set of
> > >    leaf instances, and the leafref value space is the set of values of
> > >    these leaf instances.
> > > 
> > > The first sentence isn't quite right.  Perhaps:
> > > 
> > >    The value space of a leaf with leafref type is one of the values of
> > >    a set of leaf instances in the data tree.  The "path" substatement
> > >    (Section 9.9.2) selects the set of leaf instances, and the leafref
> > >    value space is the set of values of these leaf instances.
> > 
> > The point is that "leafref" introduces a constraint on valid data.
> > In for example the candidate configuration of NETCONF, it is ok if a
> > leafref leaf points to something that doesn't exist.  However it must
> > exist at commit-time.  See below.
> 
> Let me think this through...  Really, the leafref *type* has the same
> set of values as the type of the nodes its path selects (let's call it
> the base type), but it also has an attached constraint (which must be
> satisfied at the times specified in section 8).  In particular, you do
> not expect the constraint to be satisfied within the data tree of an
> RPC.

Yes.

> By implication, the leafref's value is considered to be a pointer to a
> particular leaf instance, the one with the matching value.  But that
> idea is not embedded in the Yang semantics of leafref types in any way
> (other than the output of the deref function), so the fact that there
> might be more than one matching leaf instance does not matter.
> 
> As stated in 9.9.4 and 9.9.5, the lexical representations of its values
> are the same as those of the referenced nodes.
> 
> How is the leafref's value compared to the values of the referenced
> nodes?  I can see that question getting ugly for the more complex types
> (e.g., anyxml)

You can't have a leafref to an anyxml node; just to a leaf or
leaf-list.

> which do not have canonical forms.  I suspect the
> intention is that values are equal if they have the same canonical form

No, the idea is that they are equal if their *value* is equal,
regardless of the lexical representation.

> -- and that only base types with canonical forms are allowed.
> 
> If I'm reading correctly, the constraint is only applied to
> configuration leafs which have require-instance=true.  Any other leafref
> leaf has no constraint (though its values are required to be compatible
> with the base type).  I'm not sure I'm reading this correctly,
> though.

This is correct.

> But look at the second paragraph of section 9.9:
> 
>    If the leaf with the leafref type represents configuration data, and
>    the "require-instance" property (Section 9.9.3) is "true", the leaf
>    it refers to MUST also represent configuration.  Such a leaf puts a
>    constraint on valid data.  All such nodes MUST reference existing
>    leaf instances or leafs with default values in use (see Section 7.6.1
>    and Section 7.7.2) for the data to be valid.  This constraint is
>    enforced according to the rules in Section 8.
> 
> The second sentence starts "Such a leaf puts a constraint ...".  But
> "such a leaf" must mean "the leaf with the leafref type represents
> configuration data, and the "require-instance" property (Section 9.9.3)
> is "true"".  I suspect that is not the intended meaning.
> 
> In 9.9.2,
> 
>    The "path" expression evaluates to a node set consisting of zero,
>    one, or more nodes.  If the leaf with the leafref type represents
>    configuration data, this node set MUST be non-empty.
> 
> This appears to be a constraint that applies even if
> require-instance=false.  Is that correct?  If so, it represents a
> constraint that is independent of the main constraint (that there is a
> leaf with a matching value), because that constraint is only effective
> if require-instance=true.  That probably should be mentioned in 9.9
> along with the other constraint.

Note that I started a ML thread for this issue
("leafref value space and constraint").

See more below.

> > > - section 9.9.3
> > > 
> > >    If "require-instance" is "true", it means that the instance being
> > >    referred MUST exist for the data to be valid.  This constraint is
> > >    enforced according to the rules in Section 8.
> > > 
> > >    If "require-instance" is "false", it means that the instance being
> > >    referred MAY exist in valid data.
> > > 
> > > What does "the instance being referred" mean?  Perhaps you meant "the
> > > instance being referred to"?
> > 
> > Yes.
> 
> I think this could be made clearer, because it's not quite right to say
> that "the XXX does not exist" -- if it doesn't exist, there is no "the
> XXX".  Perhaps,
> 
>     If "require-instance" is "true", it means that the value of the
>     leafref or instance-identifier leaf must equal the value of one of
>     the leafs in the set that the "path" expression evaluates to.  This
>     constraint is enforced according to the rules in Section 8.
>  
>     If "require-instance" is "false", it means that the value of the
>     leafref or instance-identifier leaf is not constrained in this way.
> 
> Though this constraint only applies if the leafref leaf is configuration
> data (per section 9.9), which limitation needs to be stated in this text
> also.
> 
> There seem to be significant differences between the application of
> require-instance to leafref types and instance-identifier types.  The
> general idea is similar, but the specific semantics are different to the
> point that the instance-identifier case might require a separate
> statement.

I think the semantics is the same - in both cases you have a reference
to some other node.  The syntax is quite different, obviously.

> If I understand correctly, require-instance is a restriction, and so can
> be applied to a derived type which itself has a require-instance
> restriction.  Generally, that sort of restriction requires that the new
> restriction is compatible with the old one.  E.g., in 9.2.4:
> 
>    If a range restriction is applied to an already
>    range-restricted type, the new restriction MUST be equal or more
>    limiting, that is raising the lower bounds, reducing the upper
>    bounds, removing explicit values or ranges, or splitting ranges into
>    multiple ranges with intermediate gaps.
> 
> In the case of require-instance, the new restriction should presumably
> have the same value as that of the base type.  OTOH, that adds no
> expressive power, so perhaps applying a require-instance restriction to
> a derived type can be forbidden.  Or perhaps it is allowed to add
> require-instance=true to a base type with require-instance=false?  That
> override does make the type definition more restrictive, and so is
> compatible with the general idea of restrictions on derived types.

Yes; however, at least with the proposed changes, the *value space* is
the same, regardless of the value of "require-instance".  But setting
this changes the semantic constraints.  I agree that it is a bit
non-intuitive to allow relaxing of such a constraint, but this is how
it is done in YANG 1, so changing this would be backwards incompatible.

> OK, now back to what you said:
> 
> > Right; so what we want to say is:
> > 
> >   - the value space is the same as the value space of the referred
> >     leaf
> > 
> >   - there is an additional constraint if require-instance is true that
> >     the referred to leaf instance must exist
> > 
> > How about
> > 
> > OLD:
> > 
> >    The leafref type is used to declare a constraint on the value space
> >    of a leaf, based on a reference to a set of leaf instances in the
> >    data tree.  The "path" substatement (Section 9.9.2) selects a set of
> >    leaf instances, and the leafref value space is the set of values of
> >    these leaf instances.
> > 
> >    If the leaf with the leafref type represents configuration data, and
> >    the "require-instance" property (Section 9.9.3) is "true", the leaf
> >    it refers to MUST also represent configuration.  Such a leaf puts a
> >    constraint on valid data.  All such nodes MUST reference existing
> >    leaf instances or leafs with default values in use (see Section 7.6.1
> >    and Section 7.7.2) for the data to be valid.  This constraint is
> >    enforced according to the rules in Section 8.
> > 
> > NEW:
> > 
> >    The leafref type is used to declare a constraint on the value space
> >    of a leaf, based on a reference to a set of leaf instances in the
> >    data tree.  The "path" substatement (Section 9.9.2) is used to refer
> >    to another leaf node.  The leafref value space is the value space of
> >    this leaf node.
> > 
> >    If the "require-instance" property is "true", there MUST exist an
> >    instance, or a leaf with a default value in use (see Section 7.6.1
> >    and Section 7.7.2), of the leaf being referred to with the same value
> >    as the leafref value in a valid data tree.
> > 
> >    If the leaf with the leafref type represents configuration data, and
> >    the "require-instance" property (Section 9.9.3) is "true", the leaf
> >    it refers to MUST also represent configuration.
> 
> I think the first paragraph needs to also say that the leafref type is
> sort of a derived type of the type of the leaf node (which is in the
> schema).

The current proposed text is:

   The leafref type is restricted to the value space of some leaf or
   leaf-list node in the schema tree and optionally further restricted
   by corresponding instance nodes in the data tree.  The "path"
   substatement (Section 9.9.2) is used to identify the referred leaf or
   leaf-list node in the schema tree.  The value space of the referring
   node is the value space of the referred node.

   If the "require-instance" property (Section 9.9.3) is "true", there
   MUST exist a node in the data tree, or a node with a default value
   in use (see Section 7.6.1 and Section 7.7.2), of the referred schema
   tree leaf or leaf-list node with the same value as the leafref value
   in a valid data tree.

> "If the "require-instance" property is "true", there MUST exist ..." is
> really "If the "require-instance" property is "true", the constraint is
> that there MUST exist ...".
> 
> For section 9.9.6, I think "an existing address of an interface" was
> probably meant to be "the address of an existing interface".  Also,
> shouldn't there be require-instance restrictions in the example Yang
> code?
> 
>      container default-address {
>        leaf ifname {
>          type leafref {
>            path "../../interface/name";
> >>>      require-instance true;

This is not necessary b/c the default is "true".

>          }
>        }
>        leaf address {
>          type leafref {
>            path "../../interface[name = current()/../ifname]"
>               + "/address/ip";
> >>>      require-instance true;
>          }
>        }
>      }
> 
> > This needs to be verified on the ML (I'll start a separate thread for
> > this).
> 
> Yes...  I get the feeling that I still don't have an accurate
> understanding of what is intended.
> 
> > >    In the XML encoding, a value representing a union data type is
> > >    validated consecutively against each member type, in the order they
> > >    are specified in the "type" statement, until a match is found.  The
> > >    type that matched will be the type of the value for the node that was
> > >    validated.
> > > 
> > > I think a distinction needs to be made here between generating XML and
> > > interpreting XML:
> > > 
> > >    When generating an XML encoding, a value is encoded according to
> > >    the rules of the member type to which the value belongs.  When
> > >    interpreting an XML encoding, a value is validated consecutively
> > >    against each member type, in the order they are specified in the
> > >    "type" statement, until a match is found.  The type that matched
> > >    will be the type of the value for the node that was validated, and
> > >    the encoding is interpreted according to the rules for that type.
> > 
> > Yes, better, but I don't understand what value the last part of the
> > last sentence adds?
> 
> It makes explicit how the value is to be extracted from the encoding.

Fine.
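
As a rough illustration of the "interpreting" half of that text, here
is a minimal Python sketch of first-match union decoding.  The member
types, their parsers, and all names are hypothetical; a real
implementation would derive them from the "type" statement.

```python
import re
from typing import Callable, List, Tuple

# A member type is modelled as (name, parser).  The parser returns the
# decoded value or raises ValueError.  Illustrative names only.
MemberType = Tuple[str, Callable[[str], object]]


def parse_int32(text: str) -> int:
    # Optional sign followed by decimal digits, within the int32 range.
    if not re.fullmatch(r"[+-]?[0-9]+", text):
        raise ValueError("not an integer")
    value = int(text)
    if not -2**31 <= value <= 2**31 - 1:
        raise ValueError("out of int32 range")
    return value


def parse_unbounded(text: str) -> str:
    # Stand-in for an enumeration with the single enum "unbounded".
    if text != "unbounded":
        raise ValueError("not a valid enum")
    return text


def decode_union(text: str, members: List[MemberType]) -> Tuple[str, object]:
    """Try each member type in declaration order; the first match wins."""
    for name, parser in members:
        try:
            # The matching type also decides how the text is interpreted.
            return name, parser(text)
        except ValueError:
            continue
    raise ValueError("value matches no member type of the union")


MEMBERS: List[MemberType] = [("int32", parse_int32),
                             ("enumeration", parse_unbounded)]

print(decode_union("42", MEMBERS))
print(decode_union("unbounded", MEMBERS))
```

The returned pair carries both the matched type and the decoded value,
which is what the last part of the proposed sentence makes explicit.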

> > > - section 10.3.1
> > > 
> > >    If the first argument node is of type leafref, the function returns a
> > >    node set that contains the nodes that the leafref refers to.
> > > 
> > > I *think* that this means the set of nodes that the leafref's XPath
> > > selects, but a plausible alternative is the subset of those nodes
> > > which have the same value as the leafref does.  This hinges on the
> > > meaning of "the leafref refers to a node", which isn't defined
> > > unambiguously in section 9.9.
> > 
> > It is the latter.  How about:
> > 
> > NEW:
> > 
> >    If the first argument node is of type leafref, the function returns
> >    a node set that contains the nodes that the leafref refers to.
> >    Specifically, this set contains the nodes selected by the leafref's
> >    "path" statement (Section 9.9.2) that has the same value as the
> >    first argument node.
> 
> Yes.  (Although "that has the same value" should be "that have the same
> value".)

Fixed.
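
A minimal Python sketch of that reading of deref() for a leafref
argument follows.  The Leaf class and the function name are
hypothetical, and the sketch assumes the nodes selected by the "path"
statement have already been computed and their values decoded, so that
comparison is by value rather than by lexical representation.

```python
from typing import List, NamedTuple


class Leaf(NamedTuple):
    """A leaf instance in the data tree (illustrative model only)."""
    path: str    # instance path, kept only to identify the node
    value: str   # decoded value of the leaf


def deref_leafref(leafref_value: str, selected: List[Leaf]) -> List[Leaf]:
    """Keep the path-selected nodes whose value equals the leafref's value."""
    return [node for node in selected if node.value == leafref_value]


# Nodes a path such as "../../interface/name" might select.
names = [
    Leaf("/interface[name='eth0']/name", "eth0"),
    Leaf("/interface[name='eth1']/name", "eth1"),
]

print(deref_leafref("eth1", names))
print(deref_leafref("eth9", names))   # no matching instance
```

The result is a node set, so an empty list (no matching instance) and
a list with several nodes (more than one match) are both representable.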

> > >    instance-identifier-specification =
> > >                          [require-instance-stmt]
> > > 
> > > Why is require-instance-stmt in [...]?  The only use of
> > > instance-identifier-specification is in type-body-stmts, and the only
> > > use of type-body-stmts is in type-stmt, where it is in [...].  Compare
> > > with numerical-restrictions:
> > > 
> > >    numerical-restrictions = range-stmt
> > > 
> > > --
> > > 
> > >    binary-specification = [length-stmt]
> > 
> > You're right that it isn't necessary.  However, imo it is more clear;
> > in an instance-identifier spec require-instance is optional, but the
> > only numerical-restriction is range.
> 
> As far as I can see, numerical-restrictions, decimal64-specification,
> and instance-identifier-specification are to be used similarly, as they
> are alternatives of type-body-stmts.  But there seems to be no
> uniformity as to whether one of these alternatives can be empty:
> 
>    numerical-restrictions = range-stmt
> 
>    instance-identifier-specification =
>                          [require-instance-stmt]
> 
>    decimal64-specification = ;; these stmts can appear in any order
>                              fraction-digits-stmt
>                              [range-stmt]
> 
> It seems to me that whatever reasoning applies to whether range-stmt is
> optional within numerical-restrictions applies to require-instance-stmt
> within instance-identifier-specification.

You're right.  By this logic we should do:

OLD:

    numerical-restrictions = range-stmt

NEW:

    numerical-restrictions = [range-stmt]

> > > Similarly.
> > > 
> > >    quoted-string       = (DQUOTE string DQUOTE) / (SQUOTE string SQUOTE)
> > > 
> > >    string              = < an unquoted string as returned by >
> > >                          < the scanner, that matches the rule >
> > >                          < yang-string >
> > > 
> > > The handling of "string" is hard to follow and/or inexact.  First, we
> > > need a term for the "value" of a string written in Yang source, so we
> > > can say
> > > 
> > >    prefix-arg-str      = < a string whose ??? matches the rule >
> > >                          < prefix-arg >
> > > 
> > > The text that is now used is not too difficult to understand, but it's
> > > inexact and makes it hard to write specifications about how the value
> > > is determined.
> > 
> > It's not clear to me which problem you want to solve, and also it's
> > not clear exactly what you propose.
> 
> I think the problem is that I don't see the rule for extracting unquoted
> strings from the Yang source.  Certainly, not all yang-string's can be
> written directly in the source as unquoted strings, not even all those
> that satisfy the constraint of 6.1.3, "If a string contains any space,
> tab, or newline characters, a single or double quote character, a
> semicolon (";"), braces ("{" or "}"), or comment sequences ("//", "/*",
> or "*/"), then it MUST be enclosed within double or single quotes."
> 
> E.g., the null string matches yang-string, but it can't be written as an
> unquoted string at all (else tokenizing any source would be ambiguous).
> Similarly, mod:ext matches yang-string and is not required to be quoted
> by 6.1.3, but it seems that those characters in the source are not
> intended to be tokenized as a string.
> 
> > > (Also, there seem to be too many <...>, dividing the narrative text
> > > into two parts.  The natural way to write it would be:
> > > 
> > >    prefix-arg-str      = < a string whose ??? matches the rule
> > >                            prefix-arg >
> > 
> > Agreed, but LF isn't allowed in prose-val in ABNF (see RFC 5234).
> 
> Ah, I hadn't known that <...> was a defined usage.
> 
> > > We need a production that tells exactly the syntax of "an unquoted
> > > string as returned by the scanner".
> > > 
> > > We also need productions for quoted strings.  I can reconstruct these:
> > > 
> > >    quoted-string       = (single-quoted-string / double-quoted-string)
> > >                            *(optsep "+" optsep
> > >                              (single-quoted-string /
> > >                               double-quoted-string))
> > > 
> > >    single-quoted-string = SQUOTE *(yang-char excluding "'") SQUOTE
> > > 
> > > (I don't know a good way to exclude one character from a character set
> > > in ABNF.)
> > > 
> > >    double-quoted-string = DQUOTE *dq-char DQUOTE
> > > 
> > >    dq-char = yang-char excluding BACKSLASH and DQUOTE /
> > >              BACKSLASH ( "n" / "t" / DQUOTE / BACKSLASH )
> > > 
> > > Verify that "optsep" is what is allowed to appear around "+".
> > > 
> > > Needs a comment pointing to Section 6.1.3 as telling how to interpret
> > > quoted-strings.
> 
> The current ABNF doesn't allow for "+" for joining quoted strings.
> Also, it doesn't show that \" can be included in a double quoted string
> to include a literal ", and allows the string contents to continue --
> the current ABNF "DQUOTE string DQUOTE" matches "abcd\", despite that
> the latter is not a proper double-quoted string.

Note that the prose text (within <...>) says "a string that
matches...".  That string can be any YANG token string, for example
one of:

   "hello"
   "he" + "llo"


/martin
