Members of the Routing Area YANG DT have had some discussions about the handling 
of the different variants of regular expressions. The following is the current 
state, and we would like to ask whether this topic can be added to RFC6087bis:

1. Regular Expression Usage
YANG uses regular expressions to restrict string values. Such a restriction can 
be a part of a "pattern" statement or a string matching function. [RFC7950] 
specifies that YANG regular expressions will conform to Appendix F in 
[XSD-TYPES].
YANG models have been implemented in many different environments, and the XSD 
variant of regular expressions is not supported in many of them. There are 
currently more than a dozen popular regular expression variants implemented in 
various environments. While the XSD variant of regular expressions described in 
[RFC7950] remains the preferred standard, a few conventions are prescribed to 
maximize the portability of YANG models between environments.

1.1. Regular Expression Variant Choice Precedence
YANG model designers SHOULD use the most portable syntax whenever possible. 
Provided that XSD compliance is satisfied and there are multiple ways to write 
a given expression, the following precedence SHOULD be used to choose a regular 
expression variant:

o    POSIX base

o    POSIX extended

o    BSD

o    GNU Regular Expression Extensions

o    C++ Regular Expressions with std::regex

o    Others

For example, either \d or [0-9] can be used to match the ASCII digits, and both 
are compliant with [XSD-TYPES] (strictly, \d in [XSD-TYPES] also matches 
decimal digits outside the ASCII range). [0-9] is recommended because it is 
supported by POSIX base, whereas \d is not.

1.2.  Convention Guidelines
1.2.1. Avoid Character Category Escapes
For example, in XSD regular expressions, \d is a Character Category Escape 
denoting the range of digits, i.e., [0-9]. To maximize portability, model 
designers SHOULD use [0-9] instead of \d.
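
The difference between the two forms is observable in Unicode-aware engines. 
As a small illustration, the sketch below assumes Python's standard "re" module, 
which is not one of the variants listed above and is used here only because it 
is easy to run. The helper name "matches" is hypothetical.

```python
import re

# U+0663 ARABIC-INDIC DIGIT THREE: a decimal digit outside the ASCII range.
ARABIC_INDIC_THREE = "\u0663"


def matches(pattern: str, value: str) -> bool:
    """Return True if the whole value matches the pattern.

    fullmatch() mimics the implicit anchoring of YANG patterns.
    """
    return re.fullmatch(pattern, value) is not None


# Both forms accept an ASCII digit.
escape_ascii = matches(r"\d", "3")
bracket_ascii = matches(r"[0-9]", "3")

# Only \d accepts a non-ASCII decimal digit in Python 3 (str patterns).
escape_unicode = matches(r"\d", ARABIC_INDIC_THREE)
bracket_unicode = matches(r"[0-9]", ARABIC_INDIC_THREE)
```

Writing [0-9] makes the accepted value set explicit, so it does not depend on 
how a particular engine interprets \d.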

1.2.2. Avoid Unicode Characters
Unicode characters are allowed in XSD regular expressions, but are not 
supported in the POSIX variants. If possible, model designers SHOULD avoid 
using Unicode character escapes such as \p{L} and \p{N}.
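
The two guidelines above could be checked mechanically. The following is a 
minimal sketch in Python, not an existing tool: the function name 
"find_non_portable_escapes" and the exact set of escapes it reports (\d, \w, \s, 
their negations, and \p{...} or \P{...}) are assumptions made for illustration.

```python
import re
from typing import List

# The first alternative consumes an escaped backslash ("\\" in the pattern)
# so that a literal backslash followed by "d" is not reported as \d.
# The capturing group holds the escapes that the guidelines discourage.
_NON_PORTABLE = re.compile(r"\\\\|(\\[dDwWsS]|\\[pP]\{[^}]*\})")


def find_non_portable_escapes(pattern: str) -> List[str]:
    """Return the discouraged escapes found in a YANG pattern string."""
    return [m.group(1) for m in _NON_PORTABLE.finditer(pattern) if m.group(1)]


if __name__ == "__main__":
    for candidate in (r"\d{1,3}", r"[0-9]{1,3}", r"[\p{L}\p{N}]+"):
        print(candidate, "->", find_non_portable_escapes(candidate))
```

An empty result only means that none of the escapes in this short list were 
found; it does not prove that a pattern is portable.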

1.3. Conversion Tools
Tools can automatically convert regular expressions from one variant to 
another. When a YANG model is implemented in an environment where XSD regular 
expressions are not supported, the recommended approach is to use a conversion 
tool. For example, if needed, anchor position characters, i.e., '^' and '$', 
can be added by a regular expression conversion tool.
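
As a rough illustration of what such a conversion involves, the sketch below 
rewrites an XSD-style pattern for an engine that uses extended-style grouping 
and does not anchor implicitly. It is written in Python only for convenience; 
the function name "xsd_to_anchored" is hypothetical, \d is approximated by the 
ASCII range [0-9], and character class subtraction is not handled. A real 
conversion tool has to cover many more constructs.

```python
def xsd_to_anchored(pattern: str) -> str:
    """Rewrite an XSD-style pattern with explicit anchors (illustrative only).

    Assumptions: \\d is approximated by ASCII digits, and XSD character
    class subtraction such as [a-z-[aeiou]] does not appear in the input.
    """
    out = []
    in_class = False  # True while inside a [...] character class
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt == "d":
                # Inside a class only the range is needed, not the brackets.
                out.append("0-9" if in_class else "[0-9]")
            else:
                out.append(ch + nxt)  # keep other escapes unchanged
            i += 2
            continue
        if ch == "[":
            in_class = True
        elif ch == "]":
            in_class = False
        elif not in_class and ch in "^$":
            # Outside a class, XSD treats ^ and $ as ordinary characters,
            # so they are escaped to keep their literal meaning.
            out.append("\\")
        out.append(ch)
        i += 1
    # The group keeps a top-level alternation (a|b) inside the anchors.
    return "^(" + "".join(out) + ")$"
```

For instance, under these assumptions the pattern \d{1,3} becomes 
^([0-9]{1,3})$.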

1.4. Validation Tools
Tools can be used to validate regular expressions in YANG files. The following 
are some of these tools:

o    YANG W3C Regex Expression Validator: https://yangcatalog.org/yangre

This is an online tool with a web interface.

o    yangre as a part of the libyang package:
https://github.com/CESNET/libyang

This is an open source tool with a command line interface.

Usage:

    yangre [-hvV] -p <regexp1> [-i] [-p <regexp2> [-i] ...] <string>

Returns 0 if string matches the pattern(s), 1 if not and -1 on error.

Options:

  -h, --help              Show this help message and exit.
  -v, --version           Show version number and exit.
  -V, --verbose           Print the processing information.
  -i, --invert-match      Invert-match modifier for the closest preceding
                          pattern.
  -p, --pattern="REGEXP"  Regular expression including the quoting,
                          which is applied the same way as in a YANG module.

Examples:

  pattern "[0-9a-fA-F]*";      -> yangre -p '"[0-9a-fA-F]*"' '1F'
  pattern '[a-zA-Z0-9\-_.]*';  -> yangre -p "'[a-zA-Z0-9\-_.]*'" 'a-b'
  pattern [xX][mM][lL].*;      -> yangre -p '[xX][mM][lL].*' 'xml-encoding'
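
When yangre is driven from a script, the command line and exit status described 
above could be wrapped as follows. This is a sketch in Python that assumes a 
"yangre" executable is available on PATH; the helper names "build_yangre_argv" 
and "yangre_matches" are hypothetical and are not part of libyang.

```python
import subprocess
from typing import List, Sequence, Tuple


def build_yangre_argv(patterns: Sequence[Tuple[str, bool]], value: str) -> List[str]:
    """Build a yangre command line from (quoted_pattern, invert) pairs.

    Each pattern must include the YANG quoting, e.g. '"[0-9a-fA-F]*"'.
    """
    argv = ["yangre"]
    for quoted_pattern, invert in patterns:
        argv += ["-p", quoted_pattern]
        if invert:
            argv.append("-i")  # applies to the closest preceding pattern
    argv.append(value)
    return argv


def yangre_matches(patterns: Sequence[Tuple[str, bool]], value: str) -> bool:
    """Run yangre (assumed to be on PATH) and interpret its exit status."""
    result = subprocess.run(build_yangre_argv(patterns, value))
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    # An exit value of -1 is typically reported as 255 on POSIX systems.
    raise RuntimeError("yangre failed with status %d" % result.returncode)
```

Passing the arguments as a list avoids a second layer of shell quoting on top 
of the YANG quoting that yangre expects inside the pattern argument.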



References

[XSD-TYPES] Biron, P. and A. Malhotra, "XML Schema Part 2: Datatypes Second 
Edition", World Wide Web Consortium Recommendation REC-xmlschema-2-20041028, 
October 2004, <http://www.w3.org/TR/2004/REC-xmlschema-2-20041028>.

[POSIX]   IEEE Std 1003.1-2008, 2016, 
<https://standards.ieee.org/findstds/standard/1003.1-2008.html>.

[BSD]     Regular Expression, <http://www.bsd.org/regexintro.html>.

[C++]     ISO/IEC DIS 14882: Programming Languages - C++, 2017.

Thanks,
- Xufeng
_______________________________________________
netmod mailing list
netmod@ietf.org
https://www.ietf.org/mailman/listinfo/netmod
