Re: perlpodspec, draft 1

Ronald J Kimball Mon, 20 Aug 2001 07:33:15 -0700
On Fri, Aug 10, 2001 at 04:28:50AM -0600, Sean M. Burke wrote:

> Pod content is contained in B<pod blocks>.  A pod block starts with a
> line that matches <m/^=[a-zA-Z]/>, and continues up to the next line
> that matches C<m/^=cut/> -- or up to the end of the file, if there is
> no C<m/^=cut/> line.
> 
> =for comment
>  The current perlsyn says:
>  [beginquote]
>    Note that pod translators should look at only paragraphs beginning
>    with a pod directive (it makes parsing easier), whereas the compiler
>    actually knows to look for pod escapes even in the middle of a
>    paragraph.  This means that the following secret stuff will be ignored
>    by both the compiler and the translators.
>       $a=3;
>       =secret stuff
>        warn "Neither POD nor CODE!?"
>       =cut back
>       print "got $a\n";
>    You probably shouldn't rely upon the warn() being podded out forever.
>    Not all pod translators are well-behaved in this regard, and perhaps
>    the compiler will become pickier.
>  [endquote]
>  I think that those paragraphs should just be removed; paragraph-based
>  parsing  seems to have been largely abandoned, because of the hassle
>  with non-empty blank lines messing up what people meant by "paragraph".

I agree with Philip, that paragraph-based parsing should *not* be removed.
The problem of non-empty blank lines is not a reason to remove paragraph-based
parsing; just declare non-empty blank lines to be the same as empty blank
lines.
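
To make that rule concrete, here is a minimal sketch in Python (chosen only for illustration; the function name split_paragraphs is hypothetical and this is not how any existing pod parser is known to work). It treats any line containing only spaces or tabs as a paragraph separator, exactly like an empty line:

```python
def split_paragraphs(text):
    """Split text into paragraphs, treating whitespace-only lines as blank."""
    paragraphs = []
    current = []
    for line in text.splitlines():
        if line.strip() == "":
            # Empty line or "non-empty blank line": both end the paragraph.
            if current:
                paragraphs.append("\n".join(current))
                current = []
        else:
            current.append(line)
    if current:
        paragraphs.append("\n".join(current))
    return paragraphs


# The second line holds a space and a tab, yet still separates paragraphs.
sample = "=head1 NAME\n \t \nfoo - bar\n\n=cut\n"
print(split_paragraphs(sample))
```

With this rule, stray whitespace on a separator line no longer changes what counts as a paragraph.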


>  Even if the "it makes parsing easier" bit were especially true,
>  it wouldn't be worth the confusion of having perl and pod2whatever
>  actually disagree on what can constitute a pod block.

Until pod2whatever can parse Perl, perl and pod2whatever will always
disagree on what can constitute a pod block.

By the way, it would be much easier to add paragraph-based detection of pod
to perl than to add parsing of perl to pod2whatever.


> (In other words, the pod processor for "head1" will apply the same
> processing to "Did You Remember to CE<lt>use strict;>?" that it would
> to an ordinary paragraph -- i.e., interior sequences (like
> "CE<lt>...>" are parsed (and presumably formatted appropriately), and
> whitespace in the form of literal spaces and/or tabs is not significant.

This paragraph should be rewritten.  It's confusing, and there are three
open parens and only one close paren.  :)


> =item *
> 
> A B<verbatim paragraph>.  The first line of this paragraph must be a
> literal space or tab, and this paragraph must not be inside a "=begin
> I<identifier>", ... "=end I<identifier>" sequence where
> "I<identifier>" begins with something other than a colon (":").

The significance of the colon has not been introduced yet, which makes this
sentence even harder to grasp than it already is.


> =item *
> 
> An B<ordinary paragraph>.  An ordinary paragraph is distinguished by
> the fact that the first line matches neither C<m/^=[a-zA-Z]/> nor
> C<m/^ \t/>, I<and> by not being inside a "=begin I<identifier>",
> ... "=end I<identifier>" sequence, where "I<identifier>" begins with
> something other than a colon (":").

Same here.


> =item *
> 
> A B<data paragraph>.  This is a paragraph that I<is> inside a "=begin
> I<identifier>" ... "=end I<identifier>" sequence where
> "I<identifier>" does I<not> begin with a literal colon (":").  In
> some sense, a data paragraph is not part of pod at all, since it's
> not subject to pod parsing; but it is specified here, since pod
> parsers need to be able to call an event for it, or store it in some
> form in a parse tree, or at least just parse I<around> it.

Same here.


What is the name for a paragraph that is inside a =begin/=end sequence with
a colon?


> =head1 About LE<lt>...> Sequences
> 
> As you can tell from a glance at L<perlpod|perlpod>, the LE<lt>...>
> sequence is the most complex sequence in POD.  These points will
> hopefully clarify what it means and how processor should deal with
> it.
> 
> =over
> 
> =item *
> 
> In parsing an LE<lt>...> sequence, pod parsers must note four
> attributes:
> 
> =over
> 
> =item 1
> 
> The link-text.  If there is none, this must be undef.  (E.g., in
> "LE<lt>Perl Functions/perlfunc>", the link-text is "Perl Functions".
> In "LE<lt>Time::HiRes>" and even "LE<lt>|Time::HiRes>", there is no
> link text.  Note that link text may contain formatting.)
> 
> =item 2
> 
> The link-text -- but if there was none, then the text that we'll
> infer in its place.  (E.g., for "LE<lt>Getopt::Std>", the inferred
> link text is "Getopt::Std".)

Attributes 1 and 2 are both "the link-text".  This is confusing.  I would
suggest calling these attributes "the specified link-text" and "the inferred
link-text".

=item 1

The specified link-text.  If there is none, this must be undef.  (E.g., in
"LE<lt>Perl Functions/perlfunc>", the specified link-text is "Perl
Functions".  In "LE<lt>Time::HiRes>" and even "LE<lt>|Time::HiRes>", there
is no specified link-text.  Note that the link-text may contain
formatting.)

=item 2

The inferred link-text.  This is the same as the specified link-text if
there was any.  Otherwise, some reasonable link-text is inferred.  (E.g.,
for "LE<lt>Getopt::Std>", the inferred link-text is "Getopt::Std".)
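
As a rough sketch of how the two attributes relate (Python, for illustration only; link_text_attributes is a hypothetical name, and the sketch assumes the "text|" form, ignoring sections, quoting, and nested formatting):

```python
def link_text_attributes(content):
    """Return (specified, inferred) link-text for the inside of an L<...>.

    Assumption: a "|" separates the link-text from the link target.
    """
    text, sep, target = content.partition("|")
    if not sep:
        # No "|" at all: the whole content is the target.
        specified, target = None, content
    else:
        # An empty text part, as in "|Time::HiRes", counts as unspecified.
        specified = text if text else None
    inferred = specified if specified is not None else target
    return specified, inferred


for content in ["Time::HiRes", "|Time::HiRes", "LWP::Simple|LWP::Simple"]:
    print(content, "->", link_text_attributes(content))
```

The specified link-text is undefined (None here) whenever the author wrote none, while the inferred link-text always has a value.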


> =back
> 
> Pod parsers may also note additional attributes including:
> 
> =over
> 
> =item 5

This part of the spec violates the requirement that all =over sections must
start with 1.  See below.


> Note that you can distinguish URL-links from anything else by the
> fact that they match C<m/^\w+\:[^:]\S+$/>.  So
> C<LE<lt>http://www.perl.comE<gt>> is a URL, but
> C<LE<lt>HTTP::ResponseE<gt>> isn't.

I think that regex should be m/^\w+:[^:\s]\S*$/
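
A quick way to see where the two patterns differ, sketched with Python's re module as a stand-in for Perl (the test strings other than the two from the draft are made up for illustration):

```python
import re

# Pattern from the draft, and the pattern suggested above.
DRAFT = re.compile(r"^\w+\:[^:]\S+$")
PROPOSED = re.compile(r"^\w+:[^:\s]\S*$")


def matches(pattern, link):
    """True if the link content would be classified as a URL."""
    return pattern.search(link) is not None


for link in ["http://www.perl.com", "HTTP::Response", "x: y", "a:b"]:
    print(repr(link), matches(DRAFT, link), matches(PROPOSED, link))
```

Both patterns agree on the two examples from the draft. They differ on "x: y", where the draft pattern lets whitespace follow the colon, and on "a:b", where the draft pattern requires at least two characters after the colon.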




> =item *
> 
> In case of LE<lt>...> sequences with no "text|" part in them,
> formatters have exhibited great variation in actually displaying the
> link or cross reference.  For example, LE<lt>crontab(5)> might render
> as "in the C<crontab(5)> manpage", or "in the C<crontab(5)> manpage"
> or just "C<crontab(5)>".
> 
> It is recommended that processors use as little wording as possible,
> in these cases where the link text has to be inferred.  Render
> "LE<lt>Foo::Bar>" as just "Foo::Bar", and render
> LE<lt>http://www.perl.org> as just "http://www.perl.org".  Section
> links, which are less commonly used, should probably be rendered as
> follows: Render "LE<lt>/Constructor Methods>" as "Constructor
> Methods", and render "LE<lt>Foo::Bar/Constructor Methods>" as
> "Constructor Methods in Foo::Bar"
> 
> Of course, if the authors providing LE<lt><text|...>, then this

s/providing/provide/

> avoids the problem entirely; but the above formatting conventions

s/;/,/

> mean authors won't have to keep using the absurd formatting as in:
> "Whatever can't be done with LE<lt>LWP::Simple|LWP::Simple> should be
> done with LE<lt>LWP::UserAgent|LWP::UserAgent>."

Since the above conventions are just a recommendation (and not even a
"processors should..."), I don't think this keeps authors from having to use
that absurd formatting.  Speaking for myself, if parsers are still allowed
to expand L<LWP::Simple> as "the LWP::Simple manpage", then I will still
write L<LWP::Simple|LWP::Simple> (and still hate having to do it).
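
For reference, the recommended minimal wording could be sketched roughly as follows (Python, for illustration only; inferred_link_text is a hypothetical name, the URL test uses the pattern I suggested above, and the "name/section" split is a simplifying assumption):

```python
import re

# URL test, using the pattern suggested earlier in this message.
URL = re.compile(r"^\w+:[^:\s]\S*$")


def inferred_link_text(target):
    """Render a link target with as little added wording as possible."""
    if URL.search(target):
        return target                 # URLs are shown as-is
    name, slash, section = target.partition("/")
    if not slash:
        return name                   # "Foo::Bar" -> "Foo::Bar"
    if not name:
        return section                # "/Section" -> "Section"
    return f"{section} in {name}"     # "Foo::Bar/Section" -> "Section in Foo::Bar"


for target in ["Foo::Bar", "http://www.perl.org", "/Constructor Methods",
               "Foo::Bar/Constructor Methods"]:
    print(repr(target), "->", repr(inferred_link_text(target)))
```

If every processor were required to behave like this, the doubled LE<lt>LWP::Simple|LWP::Simple> form would no longer be needed.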


> =item *
> 
> An "=over" ... "=back" region containing only
> C<m/^=item\s+\d+\.?\s*$/> lines, each one (or each group of them)
> followed by some number of ordinary/verbatim paragraphs, other nested
> "=over" ... "=back" regions, or "=for..." paragraphs, and
> "=begin"..."=end" sequences.  Note that the numbers must start at one
> in each section, and must proceed in order and without skipping
> numbers.

This requirement was not followed earlier in the specification.  See above.
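
A minimal sketch of a check for that rule (Python, for illustration only; numbering_is_valid is a hypothetical name, and the pattern is the one quoted from the draft):

```python
import re

# Pattern for numbered items, as quoted from the draft.
ITEM_NUMBER = re.compile(r"^=item\s+(\d+)\.?\s*$")


def numbering_is_valid(item_lines):
    """True if the items are numbered 1, 2, 3, ... with no gaps."""
    expected = 1
    for line in item_lines:
        m = ITEM_NUMBER.search(line)
        if m is None or int(m.group(1)) != expected:
            return False
        expected += 1
    return True


print(numbering_is_valid(["=item 1", "=item 2."]))  # starts at 1, in order
print(numbering_is_valid(["=item 5"]))              # does not start at 1
print(numbering_is_valid(["=item 1", "=item 3"]))   # skips a number
```

Under this check, an "=over" region whose first item is "=item 5" is rejected.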


> (Pod processors must tolerate lines like "=item 1" as if they were
> "=item 1.", with the period.)
> 
> =item *
> 
> An "=over" ... "=back" region containing only "=item [text]"
> commands, each one (or each group of them) followed by some number of
> ordinary/verbatim paragraphs, other nested "=over" ... "=back"
> regions, or "=for..." paragraphs, and "=begin"..."=end" regions.
> 
> The text in the "=item [text]" paragraph should not match
> C<m/^=item\s+\d+\.?\s*$/> or C<m/^=item\s+\*\s*$/>, nor should it
> match just C<m/^=item\s*$>.

The third regex is redundant with the second.  :)


> =item *
> 
> Some parsers may also support an additional kind of "=over"
> ... "=back" region: one with no "=item" paragraphs, but I<only>
> ordinary/verbatim paragraphs, and possibly also some nested "=over"
> ... "=back" regions, "=for..." paragraphs, and "=begin"..."=end"
> regions.  Such an itemless "=over" ... "=back" region in pod is
> equivalent in meaning to an "<blockquote>...</blockquote>" element in
> HTML.

This contradicts the rewritten perlpod.pod, which states as a basic rule
for using =item: "Use at least one inside an =over/=back block."


Ronald