On 2020-02-17 23:14, Christian Hopps wrote:
>
>
>> On Feb 17, 2020, at 4:42 PM, Randy Presuhn <randy_pres...@alumni.stanford.edu> wrote:
>>
>> Hi -
>>
>> On 2/17/2020 11:47 AM, Christian Hopps wrote:
>>>> On Feb 17, 2020, at 11:51 AM, Randy Presuhn <randy_pres...@alumni.stanford.edu> wrote:
>>>> Hi -
>>>> On 2/17/2020 3:15 AM, Christian Hopps wrote:
>>>> ...
>>>>> BTW, I did look at the "SHOULD be avoided" (occurs twice that I saw) once dealing with LFs and CRs, which lucky for us are not part of a tag's allowable characters.
>>>> There are lots of other things that complicate life. The YANG string definition circumscribes some of them, but not all.
>>>>> " typedef tag { type string { length "1..max"; pattern '[\S ]+'; } "
>>>> This pattern doesn't make sense to me when I try to understand it using
>>>> https://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#charcter-classes
>>>> It excludes "symbols", but permits, for example, paragraph separators and formatting characters and such delights as zero-width non-joiner. Also, in complementing the "all symbols" category, it seems to me it already permits space, so I don't see why it calls out space again.
>>> The intent was to have the pattern match the description immediately below it: "A tag value is composed of a standard prefix followed by any type 'string' value that does not include carriage return, newline or tab characters." Does this pattern fail in doing that?
>>
>> Yes, what it accomplishes does not match the stated intent.
>
> I'm finding this hard to believe looking at the definition of "\S" which is "everything but space, tab, newline and carriage return" and then adding "space". Seems to match the definition unless we quibble over the prefix (which I don't think we are).

+1
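
For what it's worth, this is easy to check mechanically. Below is a
minimal sketch in Python, assuming the XSD definition of \s as
[#x20\t\n\r]. Python's own \S excludes more whitespace than XSD's, so
the character class is spelled out explicitly rather than reusing \S.
The helper name tag_ok is made up for illustration and is not from the
draft.

```python
import re

# XSD defines \s as [#x20\t\n\r], so \S is [^ \t\n\r], and adding the
# space back ('[\S ]') is equivalent to [^\t\n\r].
# Assumption: Python's \S is NOT used here because it differs from XSD's.
XSD_TAG_CLASS = re.compile(r"[^\t\n\r]+")


def tag_ok(value: str) -> bool:
    # Hypothetical helper: True if value satisfies the XSD pattern '[\S ]+'.
    # XSD patterns are implicitly anchored, hence fullmatch().
    return XSD_TAG_CLASS.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["some tag value", "a\tb", "a\nb", "a\rb", "a\u200cb"]:
        print(repr(candidate), tag_ok(candidate))
```

Tab, newline and carriage return are rejected and everything else
(including zero-width non-joiner) is accepted, which is what the
description asks for.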

>> I suspect you may have intended something like '[\Z ]+'
>> See https://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#charcter-classes
>
> I don't think that's a valid pattern.

+1

> If you are talking about the property categories (where I see 'Z' mentioned as "All separators") then there doesn't appear to be a "lower means include, upper means exclude" relationship. Also it appears that to refer to one of these things the syntax is actually "\P{Z}" or "\p{Z}" not just "Z". So translating maybe that's "[\P{Z} ]"? I see nothing that defines how "catEsc" (\p{}) vs "compEsc" (\P{}) are different, but maybe the upper here means exclude.

The description is right above the definitions:

  The set containing all characters that have property X, can be
  identified with a category escape \p{X}. The complement of this set
  is specified with the category escape \P{X}. ([\P{X}] = [^\p{X}])

So yes, \P{Z} would be the complement of "All Separators", while your
original \S is the complement of \s ([#x20\t\n\r]). I.e. \P{Z} would
exclude "more separators", but is hardly worth the trouble I think -
and it is *not* the "stated intent".
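
To make the difference concrete, here is a rough sketch, again in
Python. The standard re module has no \p{..} support, so the category
escape is emulated with unicodedata, assuming \p{Z} means the Unicode
general categories Zs, Zl and Zp. Both helper names are hypothetical.

```python
import unicodedata


def matches_S_or_space(ch: str) -> bool:
    # '[\S ]': anything except tab, newline and carriage return
    # (XSD \s is [#x20\t\n\r]; the space is added back explicitly).
    return ch not in "\t\n\r"


def matches_PZ_or_space(ch: str) -> bool:
    # '[\P{Z} ]': a space, or any character whose general category
    # is not one of the separator categories Zs, Zl, Zp.
    return ch == " " or not unicodedata.category(ch).startswith("Z")


if __name__ == "__main__":
    samples = [
        ("TAB", "\t"),
        ("LINE FEED", "\n"),
        ("NO-BREAK SPACE", "\u00a0"),
        ("PARAGRAPH SEPARATOR", "\u2029"),
        ("ZERO WIDTH NON-JOINER", "\u200c"),
    ]
    for name, ch in samples:
        print(f"{name:22} [\\S ]={matches_S_or_space(ch)!s:5} "
              f"[\\P{{Z}} ]={matches_PZ_or_space(ch)}")
```

Note that tab, LF and CR are control characters (category Cc), not
separators, so under this reading '[\P{Z} ]+' would let them through
while rejecting no-break space and paragraph separator.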

> I'm more inclined to just ditch any pattern or restriction the more this gets discussed. Let the user do what they want. If they want to include crazy Unicode stuff (almost certainly they don't) then I guess that's what they want.

FWIW, as I already wrote, I think your original pattern is fine (and I
think Randy needs to have a closer look at the section he references).

--Per

> Thanks,
> Chris.
>
>>
>> Randy
>>
>> _______________________________________________
>> netmod mailing list
>> netmod@ietf.org
>> https://www.ietf.org/mailman/listinfo/netmod
>>