Re: [Fwd: Re: [RFC] A more extensible/flexible POD (ROUGH-DRAFT)]

Damian Conway Thu, 17 Mar 2005 16:54:01 -0800

[No, I'm not back; I'm just passing by. But I feel that I need to comment on this whole issue]

Even before Brian announced Kwid, I was privately suggesting to Larry that Markdown (http://daringfireball.net/projects/markdown/) was an excellent evolution of mark-up notations and might be well suited to Perl 6. At least...as a second allowable syntax.

And, in my view, Kwid kicks Markdown's butt in terms of its suitability for Perl documentation. POD itself is brilliant and we should certainly not abandon it, but it's critical to remember that POD is just an *interface* (or B<interface>, if you prefer ;-) to Perl's built-in documentation systems. I strongly believe that Kwid is, for many purposes, a cleaner and less-intrusive interface, and I for one will be using it (even if I have to build a kwid2pod translator).

But frankly, I'd rather just be able to write:

=kwid

in place of

=pod

within standard Perl 6.

As for the larger issue of redoing pod, I've appended my notes on where the Design Team left their discussions when last we discussed it. This might spark some ideas (but note that I will not be able to respond to them any time soon -- alas, bread-winning must, for the moment, take precedence over most of my public activities).

Damian

-----cut----------cut----------cut----------cut----------cut-----

There would be a single consistent rule that says that every POD block
(except raw text blocks) has one of the following three equivalent
syntactic forms:

    =begin  TYPE  OPTIONAL_MULTIWORD_LABEL_TO_END_OF_LINE
    BLOCK_CONTENTS_START_HERE_AND_CONTINUE_OVER_MULTIPLE_LINES_UNTIL...
    =end  TYPE  OPTIONAL_SAME_MULTIWORD_LABEL

or:

    =for  TYPE  OPTIONAL_MULTIWORD_LABEL_TO_END_OF_LINE
    BLOCK_CONTENTS_START_HERE_AND_CONTINUE_OVER_MULTIPLE_LINES_UNTIL...
    <first whitespace-only line or next pod directive>

or:

    =TYPE  BLOCK_CONTENTS_START_HERE_AND_CONTINUE_OVER_MULTIPLE_LINES_UNTIL...
    <first whitespace-only line or pod directive>

For example:

    =begin table Table of Contents
        Constants           1
        Variables           10
        Subroutines         33
        Everything else     57
    =end table

    =begin list
    =begin item *
        Doh
    =end item
    =begin item *
        Ray
    =end item
    =begin item *
        Me
    =end item
    =end list

    =begin comment
        This is the most verbose way to write all this
    =end comment

Or equivalently:

    =for table Table of Contents
        Constants           1
        Variables           10
        Subroutines         33
        Everything else     57

    =begin list
    =for item *
        Doh

    =for item *
        Ray

    =for item *
        Me

    =end list

    =for comment
        This is a less verbose way to write all this

Or also equivalently:

    =for table Table of Contents
        Constants           1
        Variables           10
        Subroutines         33
        Everything else     57

    =for list
    =item * Doh
    =item * Ray
    =item * Me

    =comment This is the least verbose way to write all this
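
As a rough illustration of the three equivalent forms, here is a minimal sketch of a block extractor. It is written in Python only because the Perl 6 syntax in these notes is still provisional. The function name extract_pod is borrowed from the notes, but everything else is an assumption made for the example: the helper names, the (type, label, contents) tuple shape, the rule that the abbreviated =TYPE form carries no label, and the fact that =cut and =pod are not treated specially.

```python
import re

# A directive is "=" at the start of a line followed by a word.
DIRECTIVE = re.compile(r"^=(\w+)[ \t]*(.*)$")


def _split_type(rest):
    """Split 'TYPE optional multiword label' into (type, label)."""
    parts = rest.split(None, 1)
    block_type = parts[0] if parts else ""
    label = parts[1].strip() if len(parts) > 1 else ""
    return block_type, label


def _run_end(lines, j):
    """Advance j to the first whitespace-only line or next directive."""
    while j < len(lines) and lines[j].strip() and not DIRECTIVE.match(lines[j]):
        j += 1
    return j


def extract_pod(text):
    """Return a list of (type, label, contents) tuples, one per block."""
    lines = text.splitlines()
    blocks = []
    i = 0
    while i < len(lines):
        line = lines[i]
        if not line.strip():
            i += 1
            continue
        m = DIRECTIVE.match(line)
        if m is None:
            # Raw text: indented runs become "verbatim", others "body".
            j = _run_end(lines, i + 1)
            kind = "verbatim" if line[0].isspace() else "body"
            blocks.append((kind, "", "\n".join(lines[i:j])))
            i = j
            continue
        name, rest = m.group(1), m.group(2).strip()
        if name == "begin":
            # Form 1: contents run to the matching "=end TYPE".
            block_type, label = _split_type(rest)
            depth, j = 1, i + 1
            while j < len(lines):
                inner = DIRECTIVE.match(lines[j])
                if inner and inner.group(1) in ("begin", "end"):
                    if _split_type(inner.group(2))[0] == block_type:
                        depth += 1 if inner.group(1) == "begin" else -1
                        if depth == 0:
                            break
                j += 1
            blocks.append((block_type, label, "\n".join(lines[i + 1:j])))
            i = j + 1
        elif name == "for":
            # Form 2: contents start on the next line.
            block_type, label = _split_type(rest)
            j = _run_end(lines, i + 1)
            blocks.append((block_type, label, "\n".join(lines[i + 1:j])))
            i = j
        elif name == "end":
            i += 1  # stray "=end" with no open block: skipped in this sketch
        else:
            # Form 3: contents start on the directive line itself.
            j = _run_end(lines, i + 1)
            contents = "\n".join([rest] + lines[i + 1:j]).strip()
            blocks.append((name, "", contents))
            i = j
    return blocks
```

Under those assumptions, the three spellings of the comment block above all come out as a block of type "comment" with an empty label, differing only in their contents string.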


POD formatters could then be simply and consistently implemented by
inheriting from a standard Pod::Base class, which would provide a
C<.parse_pod> method that sequentially extracts each block construct (from
whichever of the three syntaxes), including raw text blocks (which are
actually just unlabelled C<=for body> blocks), and raw code blocks
(which are actually just unlabelled C<=for verbatim> blocks).

C<.parse_pod> would be something like:

    multi method parse_pod ($self: Str $from_str) {
        # Get sequence of POD blocks to be parsed
        # Using standard rules...
        my @blocks = $self.extract_pod($from_str);

        # Dispatch each block to be processed by the
        # appropriate method...
        for @blocks -> $block {
            my ($type, $label, $contents) = $block<type label contents>;
            $self.$type($label, $contents);
        }
    }

When each C<.$type()> method is called, both the label and contents would
be passed as simple strings (either of which might, of course, be empty if
the corresponding component had been omitted from the block). The
(multi)method thus selected would then be responsible for
formatting/processing/whatevering the label and contents passed to it:

    method head1 ($label, $contents) {...}
    method head2 ($label, $contents) {...}

    method list ($label, $contents) {...}

    method item ($label, $contents) {...}

    # etc.
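
The dispatch-by-block-type idea can be sketched in a few lines. The following is only an illustration in Python, not the proposed Perl 6 API: the class names PodBase and PlainText are hypothetical, and parse_pod here is assumed to receive already-extracted (type, label, contents) tuples rather than a source string.

```python
class PodBase:
    """Hypothetical base class: dispatches each block to a method named after its type."""

    def parse_pod(self, blocks):
        output = []
        for block_type, label, contents in blocks:
            # Look up the handler by block type, e.g. "head1" -> self.head1
            handler = getattr(self, block_type, None)
            if not callable(handler):
                raise ValueError(f"no handler for POD block type {block_type!r}")
            output.append(handler(label, contents))
        return output


class PlainText(PodBase):
    """A toy formatter that only knows about head1 and item blocks."""

    def head1(self, label, contents):
        return contents.upper()

    def item(self, label, contents):
        return f"  {label} {contents}"


rendered = PlainText().parse_pod([
    ("head1", "", "Title here"),
    ("item", "*", "Doh"),
    ("item", "*", "Ray"),
])
print("\n".join(rendered))
```

A formatter written this way supports a new block type simply by gaining a method of that name. One consequence of dispatching on the raw type name is that a block called "parse_pod" would reach the dispatcher itself, so a real implementation might prefer a prefix or an explicit table of handlers.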

Note that under this scheme the Perl 5 syntax for:

    =head1 Title here

    =head2 Subtitle here

    =head3 Subsubtitle here

    =head4 Subsubsubtitle here

    =item  Bullet  Item text

    =cut

    =pod

would mostly all continue to work (though, of course, C<=cut> and
C<=pod> would actually be dealt with directly within C<.extract_pod>).

The most noticeable change would be that something like:

    =item Bullet

    Text of item here

would now have to be written either as:

    =item  Bullet  Text of item here

(an improvement, I suspect), or as:

    =item  Bullet
    Text of item here

(assuming the .item() method was clever enough to remove leading
 whitespace from the contents), or as:

    =for item  Bullet
    Text of item here

or:

    =begin item Bullet

    Text of item here

    =end item


Of course:

    =over 4
    ...
    =back

would no longer work; they would have to be written something like:

    =begin indent 4
    ...
    =end indent

Or better still, removed entirely and replaced with:

    =begin list
    ...
    =end list

At the moment they're odd-fish: not a mark-up block, but a layout block.
And hence intrinsically evil. ;-)

And if you wanted to *change* how POD is processed by perl6, you'd just
use a C<=use> directive to install your own class:

    =use Pod::Quibble

as the POD handler. That class would probably be derived from Pod::Base
with some polymorphic or multimorphic adjustments to one or more of
C<.extract_pod>, C<.parse_pod>, or the various C<.head1>, C<.head2>,
C<.list>, C<.item>, C<.table>, C<.data>, etc. methods.


We also intend to unify __DATA__ and POD, and make both accessible (at
compile time and run time) to the program.

The single Perl 5 __DATA__ section would become:

    =begin data
    ...
    =end data

and you could define multiple separate data sections (a la Inline::Files)
with:

    =begin data LABEL1
    ...
    =end data

    =begin data LABEL2
    ...
    =end data

    # etc.

Of course, under the syntactic equivalences described above,
you could also write those as:

    =for data LABEL1
    ...

    =for data LABEL2
    ...

    # etc.

or:

    =data LABEL1 ...

    =data LABEL2 ...

    # etc.

These would simply be parsed by the standard Pod::Inline class (or whatever
it's eventually called), running as part of the perl6 parser.

Perl 6 would provide two standard file-scoped variables named
C<%=POD> and C<%=DATA>, which would provide access to all the file-
related metadata:

    %=POD                 --> structured POD object

    %=DATA                --> structured DATA object (part of %=POD)

The "structured POD object" is an object that provides both sequential
and named access (lazily, of course!) to the overall POD structure of the
current file (including any =data sections):

    %=POD<head1>          --> Array of POD objects representing C<=head1>
                              chunks

    %=POD<head1>[$n]      --> structured POD object representing Nth
                              C<=head1> chunk

    %=POD[$n]             --> structured POD object representing Nth
                              C<=head1> chunk (shorthand)

    %=POD[$n].text        --> Text of Nth C<=head1> directive

    %=POD[$n].loc         --> Line range of the Nth C<=head1> directive


    %=POD<head1>[$n]<head2>[$m]
                          --> structured POD object representing the
                              Mth C<=head2> chunk within Nth C<=head1>
                              section

    %=POD[$n]<head2>[$m]  --> structured POD object representing the
                              Mth C<=head2> chunk within Nth C<=head1>
                              section (shorthand)

    %=POD[$n][$m]         --> structured POD object representing the
                              Mth C<=head2> chunk within Nth C<=head1>
                              section (evenshorterhand)


    %=POD<head2>          --> Array of POD objects representing C<=head2>
                              chunks (from all C<=head1> sections)


    %=POD<table>[$t]      --> POD object representing the Tth C<=table>

    %=POD<table>[$t].text --> Caption of the Tth C<=table>

    %=POD<table>[$t].loc  --> Line range of the Tth C<=table>

    %=POD<table>[$t][$r]  --> The Rth row of the Tth C<=table>


    %=POD<html>[$h]       --> POD object representing the Hth C<=begin html>
                              section

    etc.
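
To make the combination of named and positional access concrete, here is a small Python sketch of such a structure. It is an assumption-laden illustration rather than the intended design: the class name PodNode and its attributes are hypothetical, named lookup is assumed to collect all descendants of the given type in document order, positional lookup is assumed to index direct children (taken to be the =head1 chunks at the top level), and nothing here is lazy.

```python
class PodNode:
    """Hypothetical node: one POD chunk with text, line range, and children."""

    def __init__(self, kind, text="", loc=(0, 0), children=None):
        self.kind = kind          # e.g. "head1", "head2", "table"
        self.text = text          # heading text or table caption
        self.loc = loc            # (first_line, last_line)
        self.children = list(children or [])

    def descendants(self, kind):
        """All nested nodes of the given kind, in document order."""
        found = []
        for child in self.children:
            if isinstance(child, PodNode):
                if child.kind == kind:
                    found.append(child)
                found.extend(child.descendants(kind))
        return found

    def __getitem__(self, key):
        # node["head2"] is named access; node[3] is positional access.
        if isinstance(key, str):
            return self.descendants(key)
        return self.children[key]


doc = PodNode("pod", children=[
    PodNode("head1", "Intro", (1, 10), [
        PodNode("head2", "Why", (3, 6)),
    ]),
    PodNode("head1", "Usage", (11, 30), [
        PodNode("head2", "Basics", (12, 20)),
        PodNode("table", "Options", (21, 25), ["-v verbose", "-q quiet"]),
    ]),
])

print(doc["head1"][1].text)                # named, then positional
print(doc[1][0].text)                      # the shorter positional form
print([n.text for n in doc["head2"]])      # head2 chunks from all head1 sections
print(doc["table"][0][1])                  # a row of the first table
```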

Meanwhile, the "DATA hash" would contain the (lazily extracted!) text of
just the C<=data> sections, with the keys of the hash being the names of
the sections. The value of each entry would be an object with stringific
and arrayific overloadings:

    %=DATA                --> Hash of objects representing C<=data>
                              sections, keyed by name

    %=DATA<LABEL1>        --> Data object representing all C<=data LABEL1>
                              sections

    ~ %=DATA<LABEL1>      --> Concatenated text from all C<=data LABEL1>
                              sections

    %=DATA<LABEL1>[$n]    --> Text from only the Nth C<=data LABEL1>
                              section

Of course, in-line data is accessed from within the program far more
frequently than POD is likely to be, so there might also be convenience
bindings of entries in the data hash to named C<$=NAME> variables (much
as $1, $2, etc. are convenience bindings into components of the $/ match
variable):

    $=LABEL2                --> Data object representing all C<=data LABEL2>
                                sections

    ~ $=LABEL2              --> Concatenated text from all C<=data LABEL2>
                                sections

    $=LABEL2[$n]            --> Text from only the Nth C<=data LABEL2>
                                section

"Data objects" would also have an iterator overloading, so that:

    for = $=DATA {...}

would work as expected.
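
The stringific, arrayific, and iterator behaviours of a data object can be mimicked in Python roughly as follows. This is a sketch under stated assumptions: the names DataSection and build_data_hash are hypothetical, sections are supplied as (label, text) pairs, and iteration is assumed to yield the lines of the concatenated text, which the notes do not actually specify.

```python
class DataSection:
    """Hypothetical data object: all =data sections sharing one label."""

    def __init__(self):
        self.chunks = []          # text of each section, in file order

    def __str__(self):
        # Stringification: concatenated text from all sections
        return "".join(self.chunks)

    def __getitem__(self, n):
        # Array-style access: text from only the Nth section
        return self.chunks[n]

    def __iter__(self):
        # Iteration (assumed): line by line over the concatenated text
        return iter(str(self).splitlines())


def build_data_hash(sections):
    """Group (label, text) pairs into a dict of DataSection objects."""
    data = {}
    for label, text in sections:
        data.setdefault(label, DataSection()).chunks.append(text)
    return data


data = build_data_hash([
    ("LABEL1", "alpha\nbeta\n"),
    ("LABEL2", "one\n"),
    ("LABEL1", "gamma\n"),
])

print(str(data["LABEL1"]))        # text of both LABEL1 sections
print(repr(data["LABEL1"][1]))    # only the second LABEL1 section
for line in data["LABEL1"]:
    print(line)
```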
