Re: Synposis 26 - Documentation [alpha draft]

2006-10-23 Thread Christopher J. Madsen
On October 16th Damian Conway wrote: 
  If the contents are not a number, they are interpreted as an upper-case
  Unicode character name, or as a lower-case XHTML entity. For example:

One more problem:  not all XHTML entities are lower-case.  For example:

 &ETH; &THORN; &Eacute; &Theta;

For a complete list, see:

http://www.w3.org/TR/xhtml-modularization/dtd_module_defs.html#a_xhtml_character_entities


I was thinking that we could distinguish them because Unicode character
names are always multiple words, but a quick search turned up ANGLE
(U+2220), so that won't work.

We could special-case ETH and THORN (the only all-uppercase entities)
and require translators to recognize them as entities.

We could allow an ampersand to indicate that it's an entity reference:
E<&ETH> and E<&THORN>.  The ampersand would be optional if the entity
name contains lowercase:  either E<Eacute> or E<&Eacute> would be ok.

We could disallow E<ETH> & E<THORN> and require the Unicode names:
E<LATIN CAPITAL LETTER ETH> & E<LATIN CAPITAL LETTER THORN>.
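
The ampersand option could be sketched roughly as follows in Python. This is an illustration only, not code from any Pod parser: the name `resolve_entity` is hypothetical, numeric contents are assumed to be decimal code points, and Python's `html.entities` table (HTML 4 names) stands in for the XHTML entity list.

```python
import unicodedata
from html.entities import name2codepoint


def resolve_entity(contents: str) -> str:
    """Resolve the contents of an E<...> code to a single character.

    Hypothetical helper: the resolution order below is one reading of the
    "optional ampersand" proposal, not a rule taken from the spec.
    """
    text = contents.strip()
    # Assumption: purely numeric contents are a decimal code point.
    if text.isdigit():
        return chr(int(text))
    # A leading ampersand forces the entity interpretation (e.g. &ETH).
    if text.startswith("&"):
        return chr(name2codepoint[text[1:]])
    # Any lowercase letter rules out an upper-case Unicode character name.
    if any(ch.islower() for ch in text):
        return chr(name2codepoint[text])
    # Otherwise treat the contents as an upper-case Unicode character name.
    return unicodedata.lookup(text)
```

Under this reading, ETH and THORN written without the ampersand would be looked up as Unicode names, so they would have to be written with the ampersand or with their full Unicode names.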

-- 
Chris Madsen[EMAIL PROTECTED]
  --  http://www.pobox.com/~cjm  --



Re: Synposis 26 - Documentation [alpha draft]

2006-10-16 Thread Smylers
On October 7th Damian Conway wrote:

 Before Christmas, as promised!
 
 [DRAFT] Synopsis 26 - Documentation

Thank you for that, Damian!  Apologies for taking a while to respond,
but I wanted to leave reading the document until I had a sufficient
chunk of time to do it justice.  And I was very impressed.

One quibble:

 To include named Unicode or XML entities, use the C<E<>> code.
 
 If the contents are not a number, they are interpreted as an upper-case
 Unicode character name, or as a lower-case XML entity. For example:
 
  Perl 6 makes considerable use of E<laquo> and E<raquo>.

I think the only standard XML entities are C<&lt;>, C<&gt;>, and
C<&amp;>.  Particular XML languages can define further entities which
use that syntax, but they aren't included by default.  However, the
examples you give are HTML entities, defined in the HTML 4 spec:

  http://www.w3.org/TR/REC-html40/sgml/entities.html

Smylers


Re: Synposis 26 - Documentation [alpha draft]

2006-10-16 Thread Danny Brian

On Oct 16, 2006, at 2:51 PM, Smylers wrote:
...


 Perl 6 makes considerable use of E<laquo> and E<raquo>.


I think the only standard XML entities are C<&lt;>, C<&gt;>, and
C<&amp;>.  Particular XML languages can define further entities which
use that syntax, but they aren't included by default.


The default entities are C<&lt;>, C<&gt;>, C<&amp;>, C<&apos;>, and
C<&quot;>.


So glad I could contribute that.

- Danny



Re: Synposis 26 - Documentation [alpha draft]

2006-10-16 Thread Damian Conway

Smylers pointed out (and Danny Brian confirmed):

The default entities are C<&lt;>, C<&gt;>, C<&amp;>, C<&apos;>, and
C<&quot;>.


I *knew* there was a good reason I shun XML! ;-)

Clearly five entities is I<not> going to suffice. The synopsis now reads:

To include named Unicode or XHTML entities, use the C<E<>> code.

If the contents are not a number, they are interpreted as an upper-case
Unicode character name, or as a lower-case XHTML entity. For example:


Thanks for that.

Damian


Re: Synposis 26 - Documentation [alpha draft]

2006-10-14 Thread Damian Conway

Brent wrote:


I've probably been hanging around Web standards nazis for too long,
but can we get a separate code to mark the title of a document that
can't be linked to (say, a book) along the lines of HTML's cite tag?


Hmm. Maybe. Care to nominate a letter for that? C<>, I<>, T<>, and E<> are
all taken already. ;-)




Actually, a couple more link schemes could probably handle my previous 
request:


   L<Perl 6 and Parrot Essentials|urn:isbn:059600737X>
   L<Parrot Magic Cookies in The Perl Review|urn:issn:1553667X/3/0#11>


Why wouldn't that just be:

 L<Perl 6 and Parrot Essentials|isbn:059600737X>
 L<Parrot Magic Cookies in The Perl Review|issn:1553667X/3/0#11>





Oooh, transclusion--shiny.  Perhaps the pipe character can be used to
provide alternative text:

   P<See standard copyright terms in the distribution.|file:/shared/docs/std_copyright.pod>


I like it. The only concern would be that, everywhere else that a pipe is 
valid, the LHS is rendered instead of the RHS. Here it would be reversed. 
Arguably, by that measure, it ought to be:


 P<file:/shared/docs/std_copyright.pod
  |See standard copyright terms in the distribution.>

Of course, you could always argue that the LHS is the text side and the RHS 
the URL side. Hm. I need to think about that a little more.




Also, what about non-textual files?  If I type
P<http://www.perlfoundation.org/images/onion_64x64.png>, will an onion
appear in my Pod document?  That would obviate custom =Image
directives.


That would depend on the renderer. The parser will certainly accept it. I'd 
expect that renderers that can render images would probably do so.



Perldoc provides a mechanism by which you can extend the syntax and 
semantics of your documentation notation: the C<=use> directive.


Um...how can this be made to work?  Are renderers going to have to
know about every possible plugin?  Are plugins going to have to know
about every possible renderer?  Will dogs and cats be living together?


To answer your questions in order: Easy. No. No. Hell no!

The parser doesn't change when you extend syntax and semantics. Plugins can 
only change the syntax of the *contents* of a new block type, not the way the 
parser parses those blocks. For example, to get Markdown syntax and semantics, 
you write:


=use Perldoc::Plugin::Markdown

=begin Markdown

*Markdown* syntax and semantics _in this block_

=end Markdown

The parser would still parse the Markdown block to create a 
Perldoc::Block::Markdown object, even if you hadn't C<=use>'d the module. The
C<=use> merely allows the parser and/or renderer to load the class definition
of the Perldoc::Block::Markdown class, so that the object can be constructed 
correctly (and, presumably, the contents of the block can be interpreted 
meaningfully).


So, in other words, the syntax of Pod blocks is invariant, allowing the parser 
to reduce Pod to a standard internal object stream, which each renderer (and 
any plug-in extension) can do with as it will.
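
As a rough illustration of that separation, here is a minimal Python sketch. All names in it (`Block`, `MarkdownBlock`, `AVAILABLE_PLUGINS`, `parse`) are hypothetical, nesting and configuration are ignored, and an unloaded block type simply falls back to a generic object, which is a simplification of the behaviour described above.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Block:
    """Generic block object, used when no plug-in class has been loaded."""
    typename: str
    lines: list = field(default_factory=list)


class MarkdownBlock(Block):
    """Hypothetical stand-in for the class a Markdown plug-in would supply."""

    def starred_words(self):
        # Toy "interpretation" of the contents: words wrapped in asterisks.
        return re.findall(r"\*(\w+)\*", "\n".join(self.lines))


# Hypothetical plug-in table: module name -> {typename: block class}.
AVAILABLE_PLUGINS = {"Perldoc::Plugin::Markdown": {"Markdown": MarkdownBlock}}


def parse(text):
    """Collect top-level =begin/=end blocks; the block syntax is fixed."""
    loaded = {}      # typename -> class, filled in by =use lines
    blocks = []
    current = None   # block being collected (nesting is ignored here)
    for line in text.splitlines():
        use = re.match(r"=use\s+(\S+)", line)
        begin = re.match(r"=begin\s+(\w+)", line)
        end = re.match(r"=end\s+(\w+)", line)
        if current is None and use:
            loaded.update(AVAILABLE_PLUGINS.get(use.group(1), {}))
        elif current is None and begin:
            current = loaded.get(begin.group(1), Block)(begin.group(1))
        elif current is not None and end and end.group(1) == current.typename:
            blocks.append(current)
            current = None
        elif current is not None:
            current.lines.append(line)
    return blocks
```

The block boundaries are found the same way whether or not the plug-in was loaded; only the class of the resulting object differs.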


I obviously need to make those points clearer in the synopsis. Thanks.



C<=config> specifications are lexically scoped to the block in which
they're specified.


   =config head3 :numbered
   =cut


There is no C<=cut> in Perl 6. And in your example it wasn't needed, BTW,
since Pod reverts to ambient code after each block unless you're nested inside 
a =begin...=end pair.





   method foo($bar, $baz) {
  ...
   }

   =head3 C<foo(>R<bar>C<, >R<baz>C<)>
  ...


Is that =head3 numbered, or is it in a different lexical scope?


Assuming the =cut wasn't there, the =head3 would be numbered, since you'd be 
in the same lexical scope. Lexical scopes are defined by =begin..=end pairs, 
not by the chunking of Pod within ambient code.
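
A minimal Python sketch of that scoping rule, under stated assumptions: the function name `numbered_headings` is hypothetical, only `:numbered` is tracked, directives are assumed to start in the first column, and an inner =begin scope is assumed to inherit the configuration of the enclosing one.

```python
def numbered_headings(lines):
    """Return (heading line, is_numbered) pairs for a list of source lines."""
    scopes = [set()]   # stack of sets: block types configured as :numbered
    found = []
    for line in lines:
        if not line.startswith("="):
            continue                         # ambient code or plain text
        words = line.split()
        directive = words[0]
        if directive == "=begin":
            scopes.append(set(scopes[-1]))   # assumed: inner scope inherits
        elif directive == "=end" and len(scopes) > 1:
            scopes.pop()                     # leaving a =begin..=end pair
        elif directive == "=config" and ":numbered" in words[2:]:
            scopes[-1].add(words[1])
        elif directive.startswith("=head"):
            found.append((line, directive[1:] in scopes[-1]))
    return found
```

Ambient code between the =config and the =head3 never touches the scope stack, so the heading stays numbered; only a matching =end would end the configuration.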




(Actually, I don't see any reference to =cut in this spec.  Is it
still there or not?)


Not. :-)


Damian


Re: Synposis 26 - Documentation [alpha draft]

2006-10-13 Thread Damian Conway

Tim Bunce wrote:

 That's going to cause pain when people using older parsers try to read
 docs written for newer ones. Would a loud warning plus some best-efforts
 fail-safe parsing be possible?

Indeed. And that's an important use-case.

But best-effort is difficult when you're talking about future-compatibility
of core constructs, which these are supposed to be. I guess best-effort
for uppercase (semantic) mark-up is just to map:

=begin UNKNOWN
mumble mumble mumble
=end UNKNOWN

to:

=head1 UNKNOWN

=begin para
mumble mumble mumble
=end para

But it's harder to see how to cope with unknown all-lower directives:

=begin frobnication
...
=end frobnication

=for franistat

=wassname

Especially the last of those, since it might be either an abbreviated
block or a pure directive. I suspect that these should either still be
fatal, or they should warn-and-ignore.
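
The uppercase fallback could be sketched in Python along these lines. This is a sketch only: `KNOWN_UPPER` is a hypothetical subset of recognised names, the function name is made up, and =begin lines that carry configuration information are deliberately left untouched.

```python
import re

# Hypothetical subset of the semantic block names an older parser knows.
KNOWN_UPPER = {"TITLE", "AUTHORS", "VERSION", "SYNOPSIS"}


def degrade_unknown_semantic_blocks(text, known=KNOWN_UPPER):
    """Rewrite unknown all-uppercase blocks as a heading plus a para block."""
    out = []
    for line in text.splitlines():
        # Only bare "=begin NAME" / "=end NAME" lines are handled here.
        m = re.fullmatch(r"=(begin|end)\s+([A-Z]+)\s*", line)
        if m and m.group(2) not in known:
            if m.group(1) == "begin":
                out += [f"=head1 {m.group(2)}", "", "=begin para"]
            else:
                out.append("=end para")
        else:
            out.append(line)
    return "\n".join(out)
```

Unknown all-lowercase names are not touched by this sketch, since it is unclear whether such a name denotes a block or a pure directive.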

Damian


Re: Synposis 26 - Documentation [alpha draft]

2006-10-13 Thread Damian Conway

Jonathan Lang wrote:


If I understand you correctly, the pain to which you're referring
would come from the possibility of a name that's reserved by the newer
version of Pod, but not by the older version.  Wouldn't the simplest
solution be to let a Pod document announce its own version, much like
Perl can?


That would presumably be:

=use 6.0.2

Though it's not quite an exact analogy. If a Perl interpreter isn't recent
enough, it can't really fall back on best attempt to execute a program.
Code is either valid or unusable.

For documentation, even if you don't know how to interpret a particular
mark-up, you can always just display it as raw text and the reader can
still get most of the benefit of it.

It's hard to imagine a circumstance in which a refusal to render Pod:

Perldoc v6.0.2 required--this is only v6.0.1, stopped at S26.pod, line 1

would be preferable to actually rendering that Pod, no matter how badly.

Damian


Re: Synposis 26 - Documentation [alpha draft]

2006-10-13 Thread Brent 'Dax' Royal-Gordon

On 10/7/06, Damian Conway [EMAIL PROTECTED] wrote:

The C<I<>> formatting code specifies that the contained text is
to be set in an I<italic style>


I've probably been hanging around Web standards nazis for too long,
but can we get a separate code to mark the title of a document that
can't be linked to (say, a book) along the lines of HTML's cite tag?


=begin item  :term('C<http:> and C<https:>')
A standard URL. For example:

  This module needs the LAME library
  (available from L<http://www.mp3dev.org/mp3/>)

=end item

=begin item :term<C<file:>>

A filename on the local system. For example:

  Next, edit the config file (L<file:~/.configrc>).

=end item

=begin item :term<C<man:>>

A link to the system man pages. For example:

  This module implements the standard
  Unix L<man:find(1)> facilities.

=end item

=begin item :term<C<doc:>>

A link to some other Perldoc documentation, typically a module or core
Perl documentation. For example:

  You may wish to use L<doc:Data::Dumper> to
  view the results.  See also: L<doc:perldata>.

=end item


Actually, a couple more link schemes could probably handle my previous request:

   L<Perl 6 and Parrot Essentials|urn:isbn:059600737X>
   L<Parrot Magic Cookies in The Perl Review|urn:issn:1553667X/3/0#11>


If a renderer cannot find or access the external data source for a
placement link, it must issue a warning and render the URL directly in
some form. For example:

=begin indent

B<COPYRIGHT>

See: /shared/docs/std_copyright.pod

B<DISCLAIMER>

See: http://www.megagigatera.com/std/disclaimer.txt

=end indent


Oooh, transclusion--shiny.  Perhaps the pipe character can be used to
provide alternative text:

   P<See standard copyright terms in the distribution.|file:/shared/docs/std_copyright.pod>

Also, what about non-textual files?  If I type
P<http://www.perlfoundation.org/images/onion_64x64.png>, will an onion
appear in my Pod document?  That would obviate custom =Image
directives.


Perldoc provides a mechanism by which you can extend the syntax and semantics
of your documentation notation: the C<=use> directive.


Um...how can this be made to work?  Are renderers going to have to
know about every possible plugin?  Are plugins going to have to know
about every possible renderer?  Will dogs and cats be living together?


C<=config> specifications are lexically scoped to the block in which
they're specified.


   =config head3 :numbered
   =cut

   method foo($bar, $baz) {
  ...
   }

   =head3 C<foo(>R<bar>C<, >R<baz>C<)>
  ...

Is that =head3 numbered, or is it in a different lexical scope?

(Actually, I don't see any reference to =cut in this spec.  Is it
still there or not?)

--
Brent 'Dax' Royal-Gordon [EMAIL PROTECTED]
Perl and Parrot hacker


Re: Synposis 26 - Documentation [alpha draft]

2006-10-12 Thread Tim Bunce
On Thu, Oct 12, 2006 at 02:55:57PM +1000, Damian Conway wrote:
 Dave Whipp wrote:
 
 I'm not a great fan of this concept of reservation when there is no 
 mechanism for its enforcement (and this is perl...).
 
 What makes you assume there will be no mechanism for enforcement? The 
 standard Pod parser (of which I have a 95% complete Perl 5 implementation) 
 will complain bitterly--as in cyanide--when unknown pure-upper or 
 pure-lower block names are used.

That's going to cause pain when people using older parsers try to read
docs written for newer ones. Would a loud warning plus some best-efforts
fail-safe parsing be possible?

Tim.

 The whole point of reserving these namespaces is not to prevent users from 
 misusing them, but to ensure that when we eventually get around to using a 
 particular block name, and those same users start screaming about it, we 
 can mournfully point to the passage in the original spec and silently shake 
 our heads. ;-)
 
 Damian


Synposis 26 - Documentation [alpha draft]

2006-10-12 Thread Jonathan Lang

Tim Bunce wrote:

Damian Conway wrote:
 Dave Whipp wrote:
 I'm not a great fan of this concept of reservation when there is no
 mechanism for its enforcement (and this is perl...).

 What makes you assume there will be no mechanism for enforcement? The
 standard Pod parser (of which I have a 95% complete Perl 5 implementation)
 will complain bitterly--as in cyanide--when unknown pure-upper or
 pure-lower block names are used.

That's going to cause pain when people using older parsers try to read
docs written for newer ones.


If I understand you correctly, the pain to which you're referring
would come from the possibility of a name that's reserved by the newer
version of Pod, but not by the older version.  Wouldn't the simplest
solution be to let a Pod document announce its own version, much like
Perl can?

--
Jonathan Dataweaver Lang


Re: Synposis 26 - Documentation [alpha draft]

2006-10-12 Thread Tim Bunce
On Thu, Oct 12, 2006 at 03:57:01PM -0700, Jonathan Lang wrote:
 Tim Bunce wrote:
 Damian Conway wrote:
  Dave Whipp wrote:
  I'm not a great fan of this concept of reservation when there is no
  mechanism for its enforcement (and this is perl...).
 
  What makes you assume there will be no mechanism for enforcement? The
  standard Pod parser (of which I have a 95% complete Perl 5 
 implementation)
  will complain bitterly--as in cyanide--when unknown pure-upper or
  pure-lower block names are used.
 
 That's going to cause pain when people using older parsers try to read
 docs written for newer ones.
 
 If I understand you correctly, the pain to which you're referring
 would come from the possibility of a name that's reserved by the newer
 version of Pod, but not by the older version.

Yes.

 Wouldn't the simplest solution be to let a Pod document announce its
 own version, much like Perl can?

How would that actually help? The old parser still wouldn't know what
new keywords have been added or how to parse them.

Tim.


Re: Synposis 26 - Documentation [alpha draft]

2006-10-11 Thread Damian Conway

Jonathan Lang wrote:


The only thing that I'd like to see changed would be to allow a more
flexible syntax for formatting codes - in particular, I'd rather use
something analogous to the 'embedded comments' described in S02,
replacing the leading # with an appropriate capital letter (as defined
by Unicode) and insisting on a word break just prior to it.


It was a deliberate decision to restrict the delimiters to angles. Unlike 
embedded comments, formatting codes are predominantly embedded in text, not 
code, so it's important to keep them easy-to-locate (i.e. with a consistent 
delimiter) and not to allow too many syntaxes (which increases the chance of 
unintended codes in normal text).


A leading word break is not really practical either, since documenters will 
need to use codes in the middle of words:


PractI<i>se (and then practI<i>ce) saying GarE<ccedil>on!




I'd also prefer a more Wiki-like dialect at some point (e.g.,
'__underlined text__', '_italicized text_' and '*bold*' instead of
'U<underlined text>', 'I<italicized text>' and 'B<bold>'); but that
can wait.


That's Kwid. Which Ingy has proposed as a standard Perldoc dialect.
You'll be able to flip into kwid mode (for Perldoc parsers that support it) 
using:

=begin kwid

=end kwid


Damian


Re: Synposis 26 - Documentation [alpha draft]

2006-10-11 Thread Damian Conway

Dave Whipp wrote:

I'm not a great fan of this concept of reservation when there is no 
mechanism for its enforcement (and this is perl...).


What makes you assume there will be no mechanism for enforcement? The standard 
Pod parser (of which I have a 95% complete Perl 5 implementation) will 
complain bitterly--as in cyanide--when unknown pure-upper or pure-lower block 
names are used.


The whole point of reserving these namespaces is not to prevent users from 
misusing them, but to ensure that when we eventually get around to using a 
particular block name, and those same users start screaming about it, we can 
mournfully point to the passage in the original spec and silently shake our 
heads. ;-)


Damian


Re: Synposis 26 - Documentation [alpha draft]

2006-10-08 Thread Daniel Hulme
I liked it. Just one nit, near the end:

You can also preconfigure L<formatting codes|#Formatting codes>, by
naming them with a pair of angles as a suffix. For example:

 =comment Always allow E<> codes in any (implicit or explicit) V<> code...
 =config V<> :allow<E>

 =comment All code to be italiciized...
                               ^^
 =config C<> :formatted<I>

Note that, even though the code is named using single-angles, the
preconfiguration applies regardless of the actual delimiters used
on subsequent instances of the code.

s/italiciized/italicized/ in the marked place.

-- 
Customer Waiter, waiter! There's a fly in my soup!
Waiter That's not a bug, it's a feature.
http://surreal.istic.org/   It sounded right in my head.


Re: Synposis 26 - Documentation [alpha draft]

2006-10-08 Thread Jonathan Lang

The only thing that I'd like to see changed would be to allow a more
flexible syntax for formatting codes - in particular, I'd rather use
something analogous to the 'embedded comments' described in S02,
replacing the leading # with an appropriate capital letter (as defined
by Unicode) and insisting on a word break just prior to it.

I'd also prefer a more Wiki-like dialect at some point (e.g.,
'__underlined text__', '_italicized text_' and '*bold*' instead of
'U<underlined text>', 'I<italicized text>' and 'B<bold>'); but that
can wait.

Otherwise, looks good.

--
Jonathan Dataweaver Lang


Re: Synposis 26 - Documentation [alpha draft]

2006-10-08 Thread Dave Whipp

Damian Conway wrote:
 Delimited blocks are bounded by C<=begin> and C<=end> markers...
 ...Typenames that are entirely lowercase (for example: C<=begin
 head1>) or entirely uppercase (for example: C<=begin SYNOPSIS>)
 are reserved.

I'm not a great fan of this concept of reservation when there is no 
mechanism for its enforcement (and this is perl...). Typical programmers 
ignore it, just as they ignore similar reservations of the type
"lower-case subroutine names are reserved".


If "use strict" will flag an error for their use, then perhaps "is
reserved" would become "must be predeclared" (imported via =use). Then
any module will be able to add its own typenames, without needing some
distinguishing "this is a core module" trait to enable the typename.
Reservation then simply becomes a note to module authors, not part of
the language specification.


Synposis 26 - Documentation [alpha draft]

2006-10-07 Thread Damian Conway

Before Christmas, as promised!

I have a 95% complete Perl 5 implementation of a parser for this, but it is 
too large to fit in the margin. I may release the beta of that next week, once 
I'm home from my travels.


Damian

-cut--cut--cut--cut--cut-

=for comment
This file is deliberately specified in Perl 6 Pod format
Clearly a Perl 6 -> Perl 5 documentation translator is a high priority ;-)


=head1 TITLE

[DRAFT] Synopsis 26 - Documentation


=head1 AUTHORS

Damian Conway [EMAIL PROTECTED]

Ingy dE<ouml>t Net [EMAIL PROTECTED]


=head1 VERSION

=for table
Maintainer: Damian Conway [EMAIL PROTECTED]
Date:   9 Apr 2005
Last Modified:  7 Oct 2006


=head1 Perldoc

Perldoc is an easy-to-use markup language with a simple, consistent
underlying document object model. Perldoc can be used for writing the
documentation for Perl 5 and Perl 6, and for Perl programs and modules,
as well as for other types of document composition.

Perldoc allows for multiple syntactic I<dialects>, all of which map onto
the same set of standard document objects. The standard dialect is named
L<Pod|#The Pod Dialect>.


=head1 The Pod Dialect

I<Pod> is an evolution of Perl 5's Plain Ol' Documentation (POD) markup.
Compared to Perl 5 POD, Perldoc's Pod dialect is much more uniform,
somewhat more compact, and considerably more expressive.

=head2 General syntactic structure

Pod blocks are specified using I<directives>, which always start with an
C<=> in the first column. Every Pod block directive may be written in
any of three equivalent forms: I<delimited style>, I<paragraph style>,
or I<abbreviated style>.


=head3 Delimited blocks

Delimited blocks are bounded by C<=begin> and C<=end> markers, both of
which are followed by a valid identifierN<A valid identifier is a
sequence of alphanumerics and/or underscores, beginning with an
alphabetic or underscore>, which is the typename of the block. Typenames
that are entirely lowercase (for example: C<=begin head1>) or entirely
uppercase (for example: C<=begin SYNOPSIS>) are reserved.
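
The two rules above can be sketched as simple Python checks. The function names are hypothetical, "alphanumerics" is assumed to mean ASCII letters and digits, and digits and underscores are assumed not to count when deciding whether a name is entirely lowercase or uppercase (so that `head1` qualifies as lowercase, as in the example).

```python
import re

# Assumption: identifiers are ASCII-only.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_typename(name: str) -> bool:
    """True if name is alphanumerics/underscores, not starting with a digit."""
    return IDENTIFIER.fullmatch(name) is not None


def is_reserved(name: str) -> bool:
    """True if every cased character is lowercase, or every one is uppercase."""
    # str.islower()/str.isupper() ignore digits and underscores, but require
    # at least one cased character.
    return name.islower() or name.isupper()
```

Mixed-case names such as `Name` or `Contact` are therefore the ones left free for user-defined blocks.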

After the typename, the rest of the C<=begin> marker line is treated as
configuration information for the block. This information is used in
different ways by different types of blocks, and is specified using
Perl6ish C<:key{value}> or C<< key=>value >> pairs (which must, of
course, be constants since Perldoc is a specification language, not a
programming language).
See L<Synopsis 2|http://dev.perl.org/perl6/doc/design/syn/S02.html#Literals>
for a summary of the Perl 6 pair notation.

The configuration section may be extended over subsequent lines by
starting those lines with an C<=> in the first column followed by a
horizontal whitespace character.

The lines following the opening delimiter and configuration are the data
or contents of the block, which continue until the block's C<=end> marker
line. The general syntax is:

=begin code :allow< R >
 =begin R<BLOCK_TYPE>  R<OPTIONAL CONFIG INFO>
 =  R<OPTIONAL EXTRA CONFIG INFO>
 R<BLOCK CONTENTS>
 =end R<BLOCK_TYPE>
=end code

For example:

 =begin table  :title<Table of Contents>
 Constants   1
 Variables   10
 Subroutines 33
 Everything else 57
 =end table

 =begin Name  :required
 =  :width(50)
 The applicant's full name
 =end Name

 =begin Contact  :optional
 The applicant's contact details
 =end Contact

Note that no blank lines are required around the directives, and blank
lines within the contents are always treated as part of the contents.

Note also that in the following specifications, a blank line is a line that
is either empty or that contains only whitespace characters. That is, a blank
line matches C</^\s*?$/>. Pod uses blank lines, rather than empty lines, as
delimiters (on the principle of least surprise).
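
The difference between a blank line and an empty line can be sketched in Python as follows; the helper names are hypothetical, and each line is assumed to have had its trailing newline removed already.

```python
import re

BLANK = re.compile(r"\s*")


def is_blank(line: str) -> bool:
    """True for a line that is empty or contains only whitespace."""
    return BLANK.fullmatch(line) is not None


def is_empty(line: str) -> bool:
    """True only for a line with no characters at all."""
    return line == ""
```

A line holding a stray tab or trailing spaces looks empty in most editors; treating it as a delimiter too is what avoids the surprise.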


=head3 Paragraph blocks

Paragraph blocks are introduced by a C<=for> marker and terminated by
the next Pod directive or the first blank line (which is I<not>
considered to be part of the block's contents). The C<=for> marker is
followed by the name of the directive and optional
configuration information. The general syntax is:

=begin code :allow< R >
 =for R<BLOCK_TYPE>  R<OPTIONAL CONFIG INFO>
 =  R<OPTIONAL EXTRA CONFIG INFO>
 R<BLOCK DATA>

=end code

For example:

 =for table  :title<Table of Contents>
 Constants   1
 Variables   10
 Subroutines 33
 Everything else 57

 =for Name  :required
 =  :width(50)
 The applicant's full name

 =for Contact  :optional
 The applicant's contact details

Once again, blank lines are not required around the directive (this is a
universal feature of Pod).
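
A minimal Python sketch of how a paragraph block might be read, under stated assumptions: `read_paragraph_block` is a hypothetical helper, directives start in the first column, configuration is kept as raw strings rather than parsed pairs, and any line starting with `=` is taken to be the next directive.

```python
import re


def read_paragraph_block(lines, start):
    """Read the =for block at lines[start].

    Returns (typename, config strings, content lines, index of next line).
    """
    m = re.match(r"=for\s+(\w+)\s*(.*)", lines[start])
    if m is None:
        raise ValueError("not a =for line")
    typename = m.group(1)
    config = [m.group(2)] if m.group(2) else []
    i = start + 1
    # Extra configuration lines: '=' followed by horizontal whitespace.
    while i < len(lines) and re.match(r"=[ \t]", lines[i]):
        config.append(lines[i][1:].strip())
        i += 1
    contents = []
    # Contents run until a blank line or the next directive.
    while i < len(lines) and lines[i].strip() and not lines[i].startswith("="):
        contents.append(lines[i])
        i += 1
    return typename, config, contents, i
```

Because the loop stops at either a blank line or a directive, two =for blocks can follow one another with no blank line between them.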


=head3 Abbreviated blocks

Abbreviated blocks are introduced by an C<'='> sign in the
first column, which is followed immediately by the typename of the
block. The rest of the line is treated as block