On Fri, Aug 10, 2001 at 04:28:50AM -0600, Sean M. Burke wrote:
> Pod content is contained in B<pod blocks>. A pod block starts with a
> line that matches <m/^=[a-zA-Z]/>, and continues up to the next line
> that matches C<m/^=cut/> -- or up to the end of the file, if there is
> no C<m/^=cut/> line.
>
> =for comment
> The current perlsyn says:
> [beginquote]
> Note that pod translators should look at only paragraphs beginning
> with a pod directive (it makes parsing easier), whereas the compiler
> actually knows to look for pod escapes even in the middle of a
> paragraph. This means that the following secret stuff will be ignored
> by both the compiler and the translators.
> $a=3;
> =secret stuff
> warn "Neither POD nor CODE!?"
> =cut back
> print "got $a\n";
> You probably shouldn't rely upon the warn() being podded out forever.
> Not all pod translators are well-behaved in this regard, and perhaps
> the compiler will become pickier.
> [endquote]
> I think that those paragraphs should just be removed; paragraph-based
> parsing seems to have been largely abandoned, because of the hassle
> with non-empty blank lines messing up what people meant by "paragraph".
I agree with Philip that paragraph-based parsing should *not* be removed.
The problem of non-empty blank lines is not a reason to remove paragraph-based
parsing; just declare non-empty blank lines to be the same as empty blank
lines.
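
As a rough illustration of that rule, here is a minimal sketch in Python
(chosen only for brevity; it is not taken from any existing pod parser, and
the name split_paragraphs is made up for this example). It assumes a "blank"
line is any line that is empty or holds only whitespace:

```python
def split_paragraphs(text):
    """Split text into paragraphs, treating whitespace-only lines as blank."""
    paragraphs = []
    current = []
    for line in text.splitlines():
        if line.strip() == "":
            # An empty line and a line holding only spaces/tabs are treated alike:
            # either one ends the paragraph being collected.
            if current:
                paragraphs.append("\n".join(current))
                current = []
        else:
            current.append(line)
    if current:
        paragraphs.append("\n".join(current))
    return paragraphs


# The second line of this sample holds a space and a tab, not nothing.
sample = "=head1 NAME\n \t \nFoo - does things\n\n=cut\n"
print(split_paragraphs(sample))
```

With this rule the line holding " \t " separates "=head1 NAME" from the text
after it, exactly as a truly empty line would.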
> Even if the "it makes parsing easier" bit were especially true,
> it wouldn't be worth the confusion of having perl and pod2whatever
> actually disagree on what can constitute a pod block.
Until pod2whatever can parse Perl, perl and pod2whatever will always
disagree on what can constitute a pod block.
By the way, it would be much easier to add paragraph-based detection of pod
to perl than to add parsing of perl to pod2whatever.
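
To make the disagreement concrete, here is a hedged sketch in Python (for
brevity only) of the two detection strategies applied to the "secret stuff"
example quoted from perlsyn. The function names are invented for this
illustration, the line-based version follows the m/^=[a-zA-Z]/ and m/^=cut/
rule quoted above, and neither function is claimed to match what perl or any
real translator does in every case:

```python
import re

POD_START = re.compile(r"^=[a-zA-Z]")
POD_END = re.compile(r"^=cut")


def pod_lines_by_line(text):
    """Line-based: a pod block may start at any line that looks like a directive."""
    in_pod = False
    found = []
    for line in text.splitlines():
        if not in_pod and POD_START.match(line):
            in_pod = True
        if in_pod:
            found.append(line)
            if POD_END.match(line):
                in_pod = False
    return found


def pod_lines_by_paragraph(text):
    """Paragraph-based: only the first line of a paragraph can start or end pod."""
    in_pod = False
    found = []
    # Paragraphs are separated by one or more blank (or whitespace-only) lines.
    for para in re.split(r"\n(?:[ \t]*\n)+", text.strip("\n")):
        first = para.split("\n", 1)[0]
        if not in_pod and POD_START.match(first):
            in_pod = True
        if in_pod:
            found.extend(para.split("\n"))
            if POD_END.match(first):
                in_pod = False
    return found


sample = "\n".join([
    "$a=3;",
    "=secret stuff",
    'warn "Neither POD nor CODE!?"',
    "=cut back",
    r'print "got $a\n";',
])
print(pod_lines_by_line(sample))       # finds the three "secret" lines
print(pod_lines_by_paragraph(sample))  # finds nothing: one paragraph, starting with code
```

The sample has no blank lines, so the paragraph-based scan sees a single
paragraph that begins with code, while the line-based scan treats the middle
three lines as pod.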
> (In other words, the pod processor for "head1" will apply the same
> processing to "Did You Remember to CE<lt>use strict;>?" that it would
> to an ordinary paragraph -- i.e., interior sequences (like
> "CE<lt>...>" are parsed (and presumably formatted appropriately), and
> whitespace in the form of literal spaces and/or tabs is not significant.
This paragraph should be rewritten. It's confusing, and there are three
open parens and only one close paren. :)
> =item *
>
> A B<verbatim paragraph>. The first line of this paragraph must be a
> literal space or tab, and this paragraph must not be inside a "=begin
> I<identifier>", ... "=end I<identifier>" sequence where
> "I<identifier>" begins with something other than a colon (":").
The significance of the colon has not been introduced yet, which makes this
sentence even harder to grasp than it already is.
> =item *
>
> An B<ordinary paragraph>. An ordinary paragraph is distinguished by
> the fact that the first line matches neither C<m/^=[a-zA-Z]/> nor
> C<m/^ \t/>, I<and> by not being inside a "=begin I<identifier>",
> ... "=end I<identifier>" sequence, where "I<identifier>" begins with
> something other than a colon (":").
Same here.
> =item *
>
> A B<data paragraph>. This is a paragraph that I<is> inside a "=begin
> I<identifier>" ... "=end I<identifier>" sequence where
> "I<identifier>" does I<not> begin with a literal colon (":"). In
> some sense, a data paragraph is not part of pod at all, since it's
> not subject to pod parsing; but it is specified here, since pod
> parsers need to be able to call an event for it, or store it in some
> form in a parse tree, or at least just parse I<around> it.
Same here.
What is the name for a paragraph that is inside a =begin/=end sequence whose
identifier does begin with a colon?
> =head1 About LE<lt>...> Sequences
>
> As you can tell from a glance at L<perlpod|perlpod>, the LE<lt>...>
> sequence is the most complex sequence in POD. These points will
> hopefully clarify what it means and how processor should deal with
> it.
>
> =over
>
> =item *
>
> In parsing an LE<lt>...> sequence, pod parsers must note four
> attributes:
>
> =over
>
> =item 1
>
> The link-text. If there is none, this must be undef. (E.g., in
> "LE<lt>Perl Functions/perlfunc>", the link-text is "Perl Functions".
> In "LE<lt>Time::HiRes>" and even "LE<lt>|Time::HiRes>", there is no
> link text. Note that link text may contain formatting.)
>
> =item 2
>
> The link-text -- but if there was none, then the text that we'll
> infer in its place. (E.g., for "LE<lt>Getopt::Std>", the inferred
> link text is "Getopt::Std".)
Attributes 1 and 2 are both "the link-text". This is confusing. I would
suggest calling these attributes "the specified link-text" and "the inferred
link-text".
=item 1
The specified link-text. If there is none, this must be undef. (E.g., in
"LE<lt>Perl Functions/perlfunc>", the specified link-text is "Perl
Functions". In "LE<lt>Time::HiRes>" and even "LE<lt>|Time::HiRes>", there
is no specified link-text. Note that the link-text may contain
formatting.)
=item 2
The inferred link-text. This is the same as the specified link-text if
there was any. Otherwise, some reasonable link-text is inferred. (E.g.,
for "LE<lt>Getopt::Std>", the inferred link-text is "Getopt::Std".)
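
A minimal sketch of how a parser might compute these two attributes, written
in Python for brevity. It is an illustration only: the function name
link_texts is hypothetical, it assumes the "text|target" form, and it ignores
nested formatting codes and section parts entirely:

```python
def link_texts(content):
    """Return (specified, inferred) link-text for the inside of an L<...>."""
    text, sep, target = content.partition("|")
    if not sep:
        # No "|" at all: nothing was specified, and the whole content is the target.
        return None, content
    # An empty text part, as in "|Time::HiRes", counts as "none specified".
    specified = text if text else None
    inferred = specified if specified is not None else target
    return specified, inferred


for content in ["Perl Functions|perlfunc", "Time::HiRes", "|Time::HiRes", "Getopt::Std"]:
    print(repr(content), link_texts(content))
```

Here None plays the role of undef. The inferred value is never None, which is
the point of keeping the two attributes separate.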
> =back
>
> Pod parsers may also note additional attributes including:
>
> =over
>
> =item 5
This part of the spec violates its own requirement that the numbers in a
numbered =over section must start at 1. See below.
> Note that you can distinguish URL-links from anything else by the
> fact that they match C<m/^\w+\:[^:]\S+$/>. So
> C<LE<lt>http://www.perl.comE<gt>> is a URL, but
> C<LE<lt>HTTP::ResponseE<gt>> isn't.
I think that regex should be m/^\w+:[^:\s]\S*$/
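
A quick way to see where the two patterns differ, sketched in Python (whose
regex syntax agrees with Perl's for these patterns on ASCII input; the sample
strings other than the two from the spec are made up for this comparison):

```python
import re

ORIGINAL = re.compile(r"^\w+\:[^:]\S+$")    # pattern from the draft spec
SUGGESTED = re.compile(r"^\w+:[^:\s]\S*$")  # pattern proposed in this reply


def compare(s):
    """Return (matches original, matches suggested) for one candidate string."""
    return (bool(ORIGINAL.search(s)), bool(SUGGESTED.search(s)))


for s in ["http://www.perl.com", "HTTP::Response", "foo: bar", "a:b"]:
    print(repr(s), compare(s))
```

Both patterns agree on the two examples from the spec. They differ on
"foo: bar", which the original accepts because [^:] can match the space, and
on "a:b", which the original rejects because it demands at least two
characters after the colon.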
> =item *
>
> In case of LE<lt>...> sequences with no "text|" part in them,
> formatters have exhibited great variation in actually displaying the
> link or cross reference. For example, LE<lt>crontab(5)> might render
> as "in the C<crontab(5)> manpage", or "in the C<crontab(5)> manpage"
> or just "C<crontab(5)>".
>
> It is recommended that processors use as little wording as possible,
> in these cases where the link text has to be inferred. Render
> "LE<lt>Foo::Bar>" as just "Foo::Bar", and render
> LE<lt>http://www.perl.org> as just "http://www.perl.org". Section
> links, which are less commonly used, should probably be rendered as
> follows: Render "LE<lt>/Constructor Methods>" as "Constructor
> Methods", and render "LE<lt>Foo::Bar/Constructor Methods>" as
> "Constructor Methods in Foo::Bar"
>
> Of course, if the authors providing LE<lt><text|...>, then this
s/providing/provide/
> avoids the problem entirely; but the above formatting conventions
s/;/,/
> mean authors won't have to keep using the absurd formatting as in:
> "Whatever can't be done with LE<lt>LWP::Simple|LWP::Simple> should be
> done with LE<lt>LWP::UserAgent|LWP::UserAgent>."
Since the above conventions are just a recommendation (and not even a
"processors should..."), I don't think this keeps authors from having to use
that absurd formatting. Speaking for myself, if parsers are still allowed
to expand L<LWP::Simple> as "the LWP::Simple manpage", then I will still
write L<LWP::Simple|LWP::Simple> (and still hate having to do it).
> =item *
>
> An "=over" ... "=back" region containing only
> C<m/^=item\s+\d+\.?\s*$/> lines, each one (or each group of them)
> followed by some number of ordinary/verbatim paragraphs, other nested
> "=over" ... "=back" regions, or "=for..." paragraphs, and
> "=begin"..."=end" sequences. Note that the numbers must start at one
> in each section, and must proceed in order and without skipping
> numbers.
This requirement was not followed earlier in the specification. See above.
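
For what it's worth, the rule is easy to check mechanically. A hedged sketch
in Python (for brevity; the function name numbering_errors is invented here,
and it assumes it is handed the lines of a single, non-nested "=over" ...
"=back" region):

```python
import re

# Same shape as the spec's m/^=item\s+\d+\.?\s*$/, with the number captured.
ITEM_NUMBER = re.compile(r"^=item\s+(\d+)\.?\s*$")


def numbering_errors(lines):
    """Return complaints about =item numbers that do not run 1, 2, 3, ..."""
    errors = []
    expected = 1
    for line in lines:
        m = ITEM_NUMBER.match(line)
        if not m:
            continue  # not a numbered =item line; ignore it
        found = int(m.group(1))
        if found != expected:
            errors.append("expected =item %d, found =item %d" % (expected, found))
        # Resynchronize on what was found so one slip is reported only once.
        expected = found + 1
    return errors


print(numbering_errors(["=item 1", "Some text.", "=item 2."]))
print(numbering_errors(["=item 5", "Some text.", "=item 6"]))
```

The second call is the shape of the "=item 5" list earlier in the spec, and
it is reported as not starting at one.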
> (Pod processors must tolerate lines like "=item 1" as if they were
> "=item 1.", with the period.)
>
> =item *
>
> An "=over" ... "=back" region containing only "=item [text]"
> commands, each one (or each group of them) followed by some number of
> ordinary/verbatim paragraphs, other nested "=over" ... "=back"
> regions, or "=for..." paragraphs, and "=begin"..."=end" regions.
>
> The text in the "=item [text]" paragraph should not match
> C<m/^=item\s+\d+\.?\s*$/> or C<m/^=item\s+\*\s*$/>, nor should it
> match just C<m/^=item\s*$>.
The third regex is redundant with the second. :)
> =item *
>
> Some parsers may also support an additional kind of "=over"
> ... "=back" region: one with no "=item" paragraphs, but I<only>
> ordinary/verbatim paragraphs, and possibly also some nested "=over"
> ... "=back" regions, "=for..." paragraphs, and "=begin"..."=end"
> regions. Such an itemless "=over" ... "=back" region in pod is
> equivalent in meaning to an "<blockquote>...</blockquote>" element in
> HTML.
This contradicts the rewritten perlpod.pod, which states as a basic rule
for using =item: "Use at least one inside an =over/=back block."
Ronald