Re: [netmod] perfect extraction, tabs, and long lines - oh my

2018-09-26 Thread Juergen Schoenwaelder
On Wed, Sep 26, 2018 at 07:21:49PM +, Kent Watsen wrote:
> 
> 
> > This is not an improvement. Just more complexity and noise.
> 
> Disagree.  The RFC should not mandate that the tool exits; that would
> be overreaching.  How about this simpler language?
> 
> NEW:
> 
>    Scan the artwork for horizontal tab characters.  If any horizontal
>    tab characters appear, either resolve them to space characters or
>    exit, forcing the input provider to convert them first.

Sounds good.

/js

-- 
Juergen Schoenwaelder   Jacobs University Bremen gGmbH
Phone: +49 421 200 3587 Campus Ring 1 | 28759 Bremen | Germany
Fax:   +49 421 200 3103 

___
netmod mailing list
netmod@ietf.org
https://www.ietf.org/mailman/listinfo/netmod


Re: [netmod] perfect extraction, tabs, and long lines - oh my

2018-09-26 Thread Kent Watsen



> This is not an improvement. Just more complexity and noise.

Disagree.  The RFC should not mandate that the tool exits; that would
be overreaching.  How about this simpler language?

NEW:

   Scan the artwork for horizontal tab characters.  If any horizontal
   tab characters appear, either resolve them to space characters or
   exit, forcing the input provider to convert them first.
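
A minimal sketch of the two branches in the NEW text, in Python, may
make the choice concrete.  The function name resolve_tabs and the
tab_stop parameter are assumptions made for illustration; they come
from neither the draft nor the script in its Appendix.

```python
def resolve_tabs(lines, tab_stop=None):
    """Return the artwork without horizontal tabs, or refuse the input.

    tab_stop is a hypothetical parameter: None selects the "exit"
    branch, an integer selects the "resolve to spaces" branch.
    """
    if not any("\t" in line for line in lines):
        return list(lines)  # no tabs present, nothing to resolve
    if tab_stop is None:
        # "exit" branch: the input provider must convert the tabs first
        raise ValueError("artwork contains horizontal tab characters")
    # "resolve" branch: pad each tab out to the next multiple of tab_stop
    return [line.expandtabs(tab_stop) for line in lines]


artwork = ["a\tb", "plain"]
print(resolve_tabs(artwork, tab_stop=4))  # ['a   b', 'plain']
```

Calling resolve_tabs(artwork) without a tab stop raises ValueError,
which a command-line tool would presumably turn into a non-zero exit.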

Kent // co-author





Re: [netmod] perfect extraction, tabs, and long lines - oh my

2018-09-26 Thread Juergen Schoenwaelder
On Wed, Sep 26, 2018 at 05:38:22PM +, Kent Watsen wrote:
> 
> The only change we're envisioning is to the following paragraph from
> Section 6.1 (Automated Folding):
> 
> OLD:
> 
>    Scan the artwork to ensure the horizontal tab character does not
>    appear.  If any horizontal tab character appears, exit (this artwork
>    cannot be folded).
> 
> NEW:
> 
>    Scan the artwork to ensure the horizontal tab character does not
>    appear.  If any horizontal tab character appears, either 1) do
>    something to support tabs in the folded output (out of scope for
>    this draft), 2) do something to convert the horizontal tabs in
>    the input to space characters in the folded output, or 3) exit,
>    forcing the input provider to convert the horizontal tabs to
>    space characters first.
> 
> Note: the "do something" phraseology seems too informal; suggestions
> for tightening it up are welcome.
>

This is not an improvement. Just more complexity and noise.

/js



Re: [netmod] perfect extraction, tabs, and long lines - oh my

2018-09-26 Thread Kent Watsen




> Given that 2) has a)-c), I do not understand what the recommendation
> actually is. The recommendation is hopefully 2a) and we are done.



For the script that we include in the Appendix, the authors wish to
keep the current "2a" behavior and have no plans to change that.

The only change we're envisioning is to the following paragraph from
Section 6.1 (Automated Folding):

OLD:

   Scan the artwork to ensure the horizontal tab character does not
   appear.  If any horizontal tab character appears, exit (this artwork
   cannot be folded).

NEW:

   Scan the artwork to ensure the horizontal tab character does not
   appear.  If any horizontal tab character appears, either 1) do
   something to support tabs in the folded output (out of scope for
   this draft), 2) do something to convert the horizontal tabs in
   the input to space characters in the folded output, or 3) exit,
   forcing the input provider to convert the horizontal tabs to
   space characters first.

Note: the "do something" phraseology seems too informal; suggestions
for tightening it up are welcome.


Kent // co-author




Re: [netmod] perfect extraction, tabs, and long lines - oh my

2018-09-26 Thread Martin Bjorklund
Juergen Schoenwaelder wrote:
> On Wed, Sep 26, 2018 at 04:28:18PM +0100, Robert Wilton wrote:
> > My interpretation:
> > 
> > Option 2 is to disallow tabs in the output, but leave it to the
> > implementation to decide how to handle tabs in the input document, so a
> > script would be allowed to do a, b, or c.
> > 
> > Just supporting "option 2(a)" is the same as "option 1":
> > 
> >   1) RFC disallows TABS in both the source-input and folded-output.
> >  ***This is what we currently have***
> > 
> > Perhaps you mean that you prefer option 1?
> >
> 
> 1) or 2a), since 2b) and 2c) are not implementable in tools that are
> detached from the author and his/her editor.  They would ultimately
> require carrying some metadata around, which means additional
> complexity for no real gain, since the output has no TABs anyway.

I agree.  I prefer option 1.  If you do have tabs in your input,
simply translate them before running the folding algorithm.





/martin



Re: [netmod] perfect extraction, tabs, and long lines - oh my

2018-09-26 Thread Juergen Schoenwaelder
On Wed, Sep 26, 2018 at 04:28:18PM +0100, Robert Wilton wrote:
> My interpretation:
> 
> Option 2 is to disallow tabs in the output, but leave it to the
> implementation to decide how to handle tabs in the input document, so a
> script would be allowed to do a, b, or c.
> 
> Just supporting "option 2(a)" is the same as "option 1":
> 
>   1) RFC disallows TABS in both the source-input and folded-output.
>  ***This is what we currently have***
> 
> Perhaps you mean that you prefer option 1?
>

1) or 2a), since 2b) and 2c) are not implementable in tools that are
detached from the author and his/her editor.  They would ultimately
require carrying some metadata around, which means additional
complexity for no real gain, since the output has no TABs anyway.
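
The metadata point can be illustrated with a small sketch, assuming
Python and its built-in str.expandtabs; the helper name
rendered_widths is made up for this example.  The same line has a
different width depending on the tab stop the reader assumes, and the
text itself does not say which one the author used.

```python
def rendered_widths(line, tab_stops=(4, 8)):
    """Map each assumed tab stop to the rendered width of the line."""
    return {stop: len(line.expandtabs(stop)) for stop in tab_stops}


# One input line, two plausible editor settings, two different widths.
widths = rendered_widths("a\tbb\tccc")
print(widths)  # {4: 11, 8: 19}
```

A tool detached from the author's editor has no way to pick between
these renderings, so whether a line needs folding can depend on a
value it cannot know.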

/js



Re: [netmod] perfect extraction, tabs, and long lines - oh my

2018-09-26 Thread Robert Wilton

My interpretation:

Option 2 is to disallow tabs in the output, but leave it to the 
implementation to decide how to handle tabs in the input document, so a 
script would be allowed to do a, b, or c.


Just supporting "option 2(a)" is the same as "option 1":

  1) RFC disallows TABS in both the source-input and folded-output.
 ***This is what we currently have***

Perhaps you mean that you prefer option 1?

Thanks,
Rob


On 26/09/2018 16:22, Juergen Schoenwaelder wrote:
> On Wed, Sep 26, 2018 at 02:09:52PM +, Kent Watsen wrote:
> > I recommend that we select "option-2" (see bottom).
> >
> > [...]
> >
> >    2) RFC disallows TABS only in the folded-output, per RFC 7991,
> >       leaving it to the folding-logic (the script) to decide if it
> >       wants to:
> >         a) disallow TABS in the source input (curr script does this)
> >         b) detect TABS exist and prompt user for TAB stop info
> >         c) detect TABS and query environment for cur TAB stop info
> >            (but tab-stops may differ in the shell, the text editor,
> >            or whatever was used to create the text, right?)
>
> Given that 2) has a)-c), I do not understand what the recommendation
> actually is. The recommendation is hopefully 2a) and we are done.
>
> /js





Re: [netmod] perfect extraction, tabs, and long lines - oh my

2018-09-26 Thread Robert Wilton




On 26/09/2018 15:09, Kent Watsen wrote:
> I recommend that we select "option-2" (see bottom).

Yes, option 2 seems reasonable to me.

Thanks,
Rob

> - it is easy to do.
> - there's no current support for having tabs in folded output.
> - doing so doesn't preclude an "option-3" someday in the future.
>
> The authors will assume that this is WG consensus if there are no
> objections within a week's time.
>
> Kent // co-author



Re: [netmod] perfect extraction, tabs, and long lines - oh my

2018-09-26 Thread Kent Watsen


I recommend that we select "option-2" (see bottom).

- it is easy to do.
- there's no current support for having tabs in folded output.
- doing so doesn't preclude an "option-3" someday in the future.

The authors will assume that this is WG consensus if there are no 
objections within a week's time.

Kent // co-author


-Original Message-


[new subject line]

It is one thing for an editor to use tabs during the creation of text,
and another to publish text with an expectation that consumers will
render the tabs the same way.  Either the source editor converts tabs
to spaces, which is interoperable today, or it keeps the tabs while
publishing metadata in the text, using some TBD standard, enabling
consumers to use the same tab stops.

If there were a standard enabling the publishing of text including
tabs, it should work for all artwork, not just artwork that has been
folded.  This is similar to the discussion we had before about having
begin/end markers enabling perfect extractions, in that it is also
something that pertains to all artwork, not just artwork that has 
been folded.  

Thus, there are a total of three problems:
  P1: perfect extraction
  P2: tabs
  P3: long lines

Assuming all things were solved problems, and assuming that we always
want perfect extraction, the possible combinations for the occurrence
of the other two problems are:
  - no tabs or long lines
  - tabs, but no long lines
  - long lines, but no tabs
  - tabs and long lines
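
The four combinations above can be told apart mechanically.  The
following Python sketch is illustrative only: the name classify and
the 69-column limit are assumptions, and counting a tab as a single
character deliberately sidesteps the tab-stop question.

```python
def classify(lines, max_cols=69):
    """Report whether P2 (tabs) and P3 (long lines) occur in the artwork.

    max_cols=69 is an assumed limit, chosen only for this example.
    len() counts a tab as one character, which is a simplification.
    """
    has_tabs = any("\t" in line for line in lines)
    has_long = any(len(line) > max_cols for line in lines)
    return has_tabs, has_long


print(classify(["short line"]))            # (False, False)
print(classify(["\tindented", "x" * 80]))  # (True, True)
```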

How are they ordered?  Clearly supporting perfect extractions has 
to be the outermost thing, but what about the other two?   Does it
matter?

Thinking about solutions:

 - the solution for long-lines is to use a header (not a footer)
   because it's believed important to prime readers *before* they
   read the text.

 - the solution for perfect-extraction could be either:
     - use both a header-and-footer marker (low tech)
     - or use either a header or a footer that encodes
       something like a "num lines" value into the
       marker.  (note: footer-only okay since the marker
       is for programmatic processors, not the readers)

 - the solution for tabs could be to use either a header
   or a footer that encodes the tab-stop metadata. (note:
   footer-only okay since the marker is for programmatic
   processors, not the readers)
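
As a rough sketch of the "num lines" idea for perfect extraction, the
Python below wraps artwork in a header that records the line count
and extracts it again.  The marker syntax and the names wrap and
extract are hypothetical; nothing here is taken from the draft.

```python
HEADER_PREFIX = "=== BEGIN ARTWORK lines="  # hypothetical marker syntax
HEADER_SUFFIX = " ==="


def wrap(lines):
    """Prefix the artwork with a header that records its line count."""
    return [f"{HEADER_PREFIX}{len(lines)}{HEADER_SUFFIX}"] + list(lines)


def extract(document_lines):
    """Return exactly the artwork lines that follow the first header."""
    for i, line in enumerate(document_lines):
        if line.startswith(HEADER_PREFIX) and line.endswith(HEADER_SUFFIX):
            count = int(line[len(HEADER_PREFIX):-len(HEADER_SUFFIX)])
            body = document_lines[i + 1:i + 1 + count]
            if len(body) != count:
                raise ValueError("document ends before the artwork does")
            return body
    raise ValueError("no artwork header found")


artwork = ["module example {", "", "}"]
document = ["Some prose."] + wrap(artwork) + ["More prose."]
print(extract(document) == artwork)  # True
```

Because the count is in the header, no footer is needed, and blank
lines inside the artwork are preserved exactly.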


If tabs were to be supported by the folding solution (note: it
doesn't make sense to talk about "folds being supported by the
tabbing solution"), then either:

  a) tabs are handled *before* folding, and the folding-solution 
 is aware of the tab-solution (i.e., it is able to process 
 the metadata).

  - everybody nods ;)

  b) the folding-solution is really a folding+tab solution, that is,
 it has a built-in way of handling tabs (i.e., encoding tab stop
 metadata) independent of how tabs are handled for text that has
 not been folded.

  - this may be technically possible, but we should avoid having
two solutions to solve the tab problem.  We would be better
off solving the tab-problem directly and then use (a).

  c) the folding-solution folds using the source tab stops, but does
 not itself encode metadata about the tab stops, assuming that
 there is a "promise" that the encoding of the metadata will
 occur in a wrapper layer around it.

  - this feels icky, but it seems viable and would possibly
    allow us to proceed with this draft without having to solve
    the tabbing problem now.


Options:

  1) RFC disallows TABS in both the source-input and folded-output.
 ***This is what we currently have***

  2) RFC disallows TABS only in the folded-output, per RFC 7991,
     leaving it to the folding-logic (the script) to decide if it
     wants to:
       a) disallow TABS in the source input (curr script does this)
       b) detect TABS exist and prompt user for TAB stop info
       c) detect TABS and query environment for cur TAB stop info
          (but tab-stops may differ in the shell, the text editor,
          or whatever was used to create the text, right?)

  3) RFC allows TABS in the folded output, and solves it by 
 depending on a tab-solution, as described by (a).

  4) RFC allows TABS in the folded output, but does not solve it,
     as described by (c). This would probably NOT be allowed from
     a standardization perspective.


Moving to (2) would be easy and probably resolves most concerns
here.  

Moving to (3) is possible, but we would do so only to:

 - support non-IETF use cases

 - or pave the way for an rfc7991bis that could depend on the 
   solutions we define here.  

   That is, rfc7991bis could *allow* long-lines and tabs while
   `xml2rfc` applies the solutions being discussed here only
   for when exporting the "plain-text" format (other formats
   may have better ways to support perfect extractions and/or
   not care about long-lines or tabs).

   PS: as a corollary, realize that when we pre-textualizing
   

Re: [netmod] perfect extraction, tabs, and long lines - oh my

2018-09-24 Thread Ladislav Lhotka
Hi,

it is quite funny: we are using XML in an area that it wasn't really
designed for - representation of hierarchical data - but, on the other
hand, we *don't* use XML where it could effectively help us avoid
awkward formatting and extraction problems such as those discussed in
this thread.

Since the beginning, I've been writing YANG source in the YIN XML
notation (and most people find it weird). However, when one uses a
schema-aware editor such as nxml mode of emacs, writing YIN is really a
pleasant experience:

- no need to remember YANG syntax, all statements are autocompleted in a
  context-sensitive way

- no need to care about the prescribed order of substatements (for the
  pyang --ietf check)

- (mostly) no need to care about long lines and whitespace.

A schema-aware editor takes care of the first item and XSLT scripts
of the rest. All the tools are available in my GitHub skeleton project

https://github.com/llhotka/YANG-I-D

And as for extracting YANG modules from I-D text: given that I-D
submission format is XML (xml2rfc), the most natural way for including
YANG modules in an I-D would be to use YIN directly instead of
. Both formatting and extracting the module would then be
absolutely painless - it is a different XML namespace, so tools should
have no problems with it.
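
To make the namespace argument concrete, here is a small Python
sketch using the standard xml.etree.ElementTree module.  The embedded
document and the function name extract_yin_modules are hypothetical;
the sketch only assumes that a YIN module sits somewhere inside the
I-D's XML under the YIN namespace from RFC 7950.

```python
import xml.etree.ElementTree as ET

YIN_NS = "urn:ietf:params:xml:ns:yang:yin:1"  # YIN namespace (RFC 7950)

# Hypothetical I-D source with a YIN module embedded in a section.
DOC = """\
<rfc>
  <middle>
    <section>
      <module xmlns="urn:ietf:params:xml:ns:yang:yin:1" name="example">
        <namespace uri="urn:example"/>
        <prefix value="ex"/>
      </module>
    </section>
  </middle>
</rfc>
"""


def extract_yin_modules(xml_text):
    """Return (name, serialized element) for each YIN module found."""
    root = ET.fromstring(xml_text)
    return [(m.get("name"), ET.tostring(m, encoding="unicode"))
            for m in root.iter(f"{{{YIN_NS}}}module")]


for name, _ in extract_yin_modules(DOC):
    print(name)  # example
```

Selection is by namespace-qualified element name, so no begin/end
markers, line counts, or whitespace conventions are involved.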

I suspect I won't attract many supporters but I couldn't help myself. It
bothers me that we have all the tools and technologies available but -
because the general opinion is that they are hard to use - we instead
struggle with brittle and tricky technicalities like those below. Isn't
it much harder after all?

Lada

Kent Watsen writes:

> [new subject line]
>
> It is one thing for an editor to use tabs during the creation of text,
> and another to publish text with an expectation that consumers will
> render the tabs the same way.  Either the source editor converts tabs
> to spaces, which is interoperable today, or it keeps the tabs while
> publishing metadata in the text, using some TBD standard, enabling
> consumers to use the same tab stops.
>
> If there were a standard enabling the publishing of text including
> tabs, it should work for all artwork, not just artwork that has been
> folded.  This is similar to the discussion we had before about having
> begin/end markers enabling perfect extractions, in that it is also
> something that pertains to all artwork, not just artwork that has 
> been folded.  
>
> Thus, there are a total of three problems:
>   P1: perfect extraction
>   P2: tabs
>   P3: long lines
>
> Assuming all things were solved problems, and assuming that we always
> want perfect extraction, the possible combinations for the occurrence
> of the other two problems are:
>   - no tabs or long lines
>   - tabs, but no long lines
>   - long lines, but no tabs
>   - tabs and long lines
>
> How are they ordered?  Clearly supporting perfect extractions has 
> to be the outermost thing, but what about the other two?   Does it
> matter?
>
> Thinking about solutions:
>
>  - the solution for long-lines is to use a header (not a footer)
>    because it's believed important to prime readers *before* they
>    read the text.
>
>  - the solution for perfect-extraction could be either:
>      - use both a header-and-footer marker (low tech)
>      - or use either a header or a footer that encodes
>        something like a "num lines" value into the
>        marker.  (note: footer-only okay since the marker
>        is for programmatic processors, not the readers)
>
>  - the solution for tabs could be to use either a header
>    or a footer that encodes the tab-stop metadata. (note:
>    footer-only okay since the marker is for programmatic
>    processors, not the readers)
>
>
> If tabs were to be supported by the folding solution (note: it
> doesn't make sense to talk about "folds being supported by the
> tabbing solution"), then either:
>
>   a) tabs are handled *before* folding, and the folding-solution 
>  is aware of the tab-solution (i.e., it is able to process 
>  the metadata).
>
>   - everybody nods ;)
>
>   b) the folding-solution is really a folding+tab solution, that is,
>  it has a built-in way of handling tabs (i.e., encoding tab stop
>  metadata) independent of how tabs are handled for text that has
>  not been folded.
>
>   - this may be technically possible, but we should avoid having
> two solutions to solve the tab problem.  We would be better
> off solving the tab-problem directly and then use (a).
>
>   c) the folding-solution folds using the source tab stops, but does
>  not itself encode metadata about the tab stops, assuming that
>  there is a "promise" that the encoding of the metadata will
>  occur in a wrapper layer around it.
>
>   - this feels icky, but it seems viable and would possibly
>     allow us to proceed with this draft without having to solve
>     the tabbing problem now.
>
>
> Options:
>
>   1) RFC