<intro>
Hi Mongers,

Part one of this story, which I released last week on this list, was
really provoking people to rethink the way we work now (not!)
The challenge was made by Damian, to design a documentation system which
is easier to use: where code and doc work together, instead of being
orthogonally separated.  Well, the gauntlet has been taken up, resulting
in this 0.0 version; of course incomplete.

Most of the non-inlined features of this design are available for Perl5
via the OODoc distribution.  It is a pity that Perl5 subroutines provide
so little information about how they are to be used: therefore you have
to specify much more explicitly than is required for Perl6.

As I have asked twice before, I would really like to see a demo by Damian
on how to document a small Perl6 class (say 30 lines) using his S26 specs,
including everything that is needed to get it configured in a way that
tools may swallow it.  In return, I will use the design described here to
do the same... for comparison.

           MarkOv
</intro>

======== Documentation of Perl6
=== part2: general syntax, version 0.0

The crucial difference between Damian's S26 design and my opinion about
documentation is that IMO the focus must be on being able to create tools
which produce the best end-user documentation (see part1 of this story),
and the markup language is just one (minor) issue.  Damian's Synopsis S26
does design a markup language, but does not want to define code
interconnection nor predefined markup tags to help the tools.

=== Choosing Symbols

Let's try to avoid reinventing the syntax wheel.  In Perl5 we use POD for
end-user documentation (text surrounded by lines which start with a '=')
and comments to explain the program (texts with a leading '#').  In Perl6,
the same symbols are allocated, and people are used to that.  Let's stick
with them, for now.

Two ways are defined to add documentation fragments to the code:

(1) the POD(5) way:

    =over 4
    =item . one
    =back
    =cut

  As in POD(5), any line which starts with a '=' will start a
  documentation block, and '=cut' ends it.

  Extensions to POD(5)

  - "internal" blank lines are optional; only the line before the first
    line of a documentation fragment must be blank.  This will make the
    documentation much more compact to write, so the code will be more
    visible in the file.

  - each tag is matched with  /^\=+(\w+)(?:\s+|$)/
    By permitting more than one '=' as leader, the author can make
    his/her own visual helpers.  For instance, the '=head1' in the file
    looks as heavy as the '=over', but is of course of much more
    importance.  So: the author may decide to say '====head1'.  The
    extra '=' are not significant.

  - a (long) list of logical markup tags will be used to add information
    about what is being documented.  We will not write
        =item print OPTIONS
    but
        =sub print OPTIONS
    or
        ====sub print OPTIONS        (if we need to)

  - the tag /^\=+end\s+$1/ will also terminate a pod block, like =cut,
    but then with a semantic role as well.

(2) Inlined documentation, like comments

  Many (small) features in the program need to get some documentation,
  for instance, class attributes.  Starting a documentation block, as
  described in (1), for each of them is both a lot of work to type and
  makes it harder to get an overview of the code.

  Without extension of the available symbols for documentation in Perl6,
  our hands are tied.  So, therefore (for the moment) I use
  /(?:^|\s|\;)\#\=/ for user documentation, where /(?:^|\s|;)\#[^=]/ is a
  programmer's comment.

  Inlined docs are automatically linked to the *declaration* above or
  before them:

    .has Point $center;      #= center coordinate of the universe

    method move(float $dx, float $dy, float $dz) {
       #= Jump to a different location, relative to the
       #= current position.
       #=param $dx in parsec

       # of course, time should be included in this
       # interface.  We will do that later. [normal comment]
    }

  The last two lines are code comments.

=== Producing Document fragments

Both inlined and block documentation provide the same features.
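
To make the two recognition rules from "Choosing Symbols" concrete, here
is a minimal sketch of a line classifier.  It is written in Python purely
for illustration (it is not how OODoc works), the function name
classify() is hypothetical, and the regular expressions are my reading of
the rules above: a block tag is one or more '=' at line start followed by
a word, and '#=' marks inlined documentation where a bare '#' marks a
programmer's comment.

```python
import re

# Block tag: one or more '=' at line start, then the tag name, then
# optional text.  Extra leading '=' are not significant.
BLOCK_TAG = re.compile(r"^=+(\w+)(?:\s+(.*))?$")
# Inlined documentation: '#=' at line start or after whitespace or ';'.
# A word glued to '#=' (as in '#=param') is treated as an inline tag name.
INLINE_DOC = re.compile(r"(?:^|[\s;])#=(\w*)\s*(.*)$")
# Programmer's comment: '#' in the same positions, not followed by '='.
COMMENT = re.compile(r"(?:^|[\s;])#(?!=)(.*)$")


def classify(line):
    """Return (kind, tag, text) for one source line (hypothetical helper).

    kind is 'tag', 'doc', 'comment' or 'code'.  This sketch does not
    look inside string literals, so a '#=' in a quoted string would be
    misread; a real tool would work from the parser's output instead.
    """
    m = BLOCK_TAG.match(line)
    if m:
        return ("tag", m.group(1), (m.group(2) or "").strip())
    m = INLINE_DOC.search(line)
    if m:
        return ("doc", m.group(1), m.group(2).strip())
    m = COMMENT.search(line)
    if m:
        return ("comment", "", m.group(1).strip())
    return ("code", "", line.strip())


if __name__ == "__main__":
    for src in ("====sub print OPTIONS",
                "has Point $center;   #= center coordinate",
                "#=param $dx in parsec",
                "  # normal comment"):
        print(classify(src))
```

Working per line like this is only good enough for a demonstration; the
post argues later that the information should come from the Perl6 AST.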
The fragments start in one of three ways (inlined form on the left, block
form on the right):

(a) method print() {...}   # implementation anywhere in the file

    #=method print              =method print
    #=description               =description
    #= line1                    line1
    #= line2                    line2
    #=end method                =end method

(b) method print() {...}   # implementation anywhere in the file

    #=method print              =method print
    #= line1                    line1
    #= line2                    line2
    #=end method                =end method

(c) method print() {...}        method print() {...}
    #= line1
    #= line2                    =description
                                line1
                                line2
                                =cut

When followed by the next method/end section, no "=end" is required.
Just after a container item, we always start with a description.
Clearly, the inlined version of (c) is the most compact.

Also in the following cases, you will have information collected for the
manual pages to be produced later... still without description.

(d) has .$center;
    method print() {...}

The description is always kept in the first lines of a container.
Additional (nested) informational items start with

    #=<name> <parameters>

Their description can start on the same line, or on the next line.
Equivalent are:

    #=param $x
    #=   the horizontal coordinate.

    #=param $x the horizontal coordinate.

    #=param $x the horizontal
    #=   coordinate.

=== Merging with code

One of the targets of this design is to avoid replication of information:
when the program says that a parameter has default '10', then the
documentation shouldn't say '42' (... unless you really want to).

Long example from Apocalypse A06:

    method action ($self: int $x = 10, int ?$y, int ?$z,
                   Adverb +$how, Beneficiary +$for,
                   Location +$at is copy, Location +$toward is copy,
                   Location +$from is copy, Reason +$why,
                   *%named, [EMAIL PROTECTED] ) {...}

Without any intervention, this will produce something comparable to the
following.  Some items are hidden for simplicity, and the (ignored)
additional '=' and blank lines are used to show some structure.

    ====method action
    ===visibility public

    ===call ($self: int $x, int ?$y, int ?$z, OPTIONS, LIST)

    ==param $x
    =type int
    =use required
    =default 10

    ==param $y
    =type int
    =use optional

    ==option $how
    =type Adverb

    ==option $from
    =type Location
    =pass copy

    ... etc ...
    =end call

The purpose of the first phase of the documentation processor is to
generate as much (consistent) information about the (public) interface
as possible.  The back-end manual-page generators will limit the amount
of information they present: it is the user's decision what they want to
read, not the author's!

This automatic extraction process is the most complicated part of the
whole implementation, for sure.  It is wise to have this info not
collected by some POD tool, but directly extracted from the Perl6 AST.

Now a complex example from Apocalypse A06:

    sub swap (*@_ is rw) { @_[0,1] = @_[1,0] };

It could be documented by overruling nearly all generated information.
Explicitly overruling the parameter list with "call" will override the
automatically generated parameter information.  This is especially
useful to merge the details about multi methods/subs.

    sub swap (*@_ is rw) { @_[0,1] = @_[1,0] };
    #= exchange the content of two variables.
    #=call (A, B)
    #=return the reverse list
    #=param A will be replaced by B
    #=param B will be replaced by A
    # this is just to demonstrate programmers comment:
    # we probably should check the number of values passed.

If you like to describe your code before it is used, you need a
reference for the documentation fragment:

    #====sub swap
    #= exchange the content of two variables.
    #=call (A, B)
    #=return the reverse list
    #=param A will be replaced by B
    #=param B will be replaced by A
    sub swap (*@_ is rw) { @_[0,1] = @_[1,0] };
    # we probably should check the number of values passed.

or

    ====sub swap
    exchange the content of two variables.
    =call (A, B)
    =return the reverse list
    =param A will be replaced by B
    =param B will be replaced by A
    =cut          # or =end sub   or   =end sub swap

    sub swap (*@_ is rw) { @_[0,1] = @_[1,0] };
    # we probably should check the number of values passed.

Adding some example to swap, anywhere in the file (probably close by, or
in the same block without need for a reference):

    ====sub swap
    =example
      my ($a, $b) = (10,42);
      swap($a, $b);
      say ":$a:$b:";      # :42:10:
    =end sub

In the same way, you can add descriptions of produced error and warning
messages ('=error', '=warning').

=== Ordering

Each documentation fragment type (both block and inlined) describes a
specific kind of knowledge; therefore it is always defined where it
belongs, either implicitly or explicitly.

When being processed, the document fragments are organized into a tree
("DocTree"), derived from the Perl6 AST (Abstract Syntax Tree).  As
Perl6 code can be distributed in compiled form, this "DocTree" can also
be distributed as a half-product.

Back-end documentation tools will use (one or more of these) DocTrees as
their only source of information to produce user manuals: they should
not (need to) process the source code themselves to collect additional
information.

The created DocTree looks something like this:

    root::
      distribution(MyDist)
        file(MyDist.pm)
          chapter(copyrights)
          chapter(authors)
        package(MyDist)
          manual-data
        class(MyClass)
          inheritance-info
          manual-data

    manual-data::
      chapter(name)
      chapter(description)
      chapter(methods)
        section(constructors)
          method(CLASS, dup)
            call
              option(Debug)
                type(BOOLEAN)
                default(false)
              parameter
            example

You can now add something to a block either explicitly or implicitly.

(1) Explicit is the clearest.  It uses the predefined logical markup
    statements.  Example:

    =chapter METHODS
    =section Constructors
    =method dup
    Duplicate an object.

    The DocTree generator will look up additional information about the
    dup() method automatically, and place that on the explicitly
    indicated spot in the tree as well.
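
As an illustration of explicit placement, the following sketch builds
that part of a DocTree in Python.  Everything in it is an assumption
made for the example: the Node class, the place() helper and the LEVELS
table are hypothetical names, and the "looked up" option data is simply
copied from the DocTree listing above rather than extracted from a real
AST.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One element of the DocTree (hypothetical representation)."""
    kind: str
    name: str = ""
    text: str = ""
    children: list = field(default_factory=list)

    def child(self, kind, name=""):
        """Return the matching child, creating it when it is absent."""
        for c in self.children:
            if c.kind == kind and c.name == name:
                return c
        node = Node(kind, name)
        self.children.append(node)
        return node

    def render(self, depth=0):
        """Return the subtree as a list of indented lines."""
        label = f"{self.kind}({self.name})" if self.name else self.kind
        lines = ["  " * depth + label]
        for c in self.children:
            lines.extend(c.render(depth + 1))
        return lines


# Assumed nesting depth of the structural tags used in the example.
LEVELS = {"chapter": 1, "section": 2, "method": 3}


def place(root, tags):
    """Insert (tag, name, description) triples below root, in order.

    Starting a tag closes all open blocks of equal or deeper level,
    which mimics the automatic closing of nested blocks.
    """
    stack = [root]
    for tag, name, text in tags:
        del stack[LEVELS[tag]:]
        node = stack[-1].child(tag, name)
        node.text = text
        stack.append(node)
    return stack[-1]


manual = Node("manual-data")
dup = place(manual, [("chapter", "METHODS", ""),
                     ("section", "Constructors", ""),
                     ("method", "dup", "Duplicate an object.")])

# Stand-in for what the generator would look up about dup() by itself;
# it lands on the spot the author indicated explicitly.
option = dup.child("call").child("option", "Debug")
option.child("type", "BOOLEAN")
option.child("default", "false")

print("\n".join(manual.render()))
```

Because child() reuses an existing node, author-written and generated
information about the same method end up in one place, whichever of the
two is processed first.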

(2) Implicit reference is a bit tricky: some code constructs found by
    the Perl6 parser will automatically insert tags.  The parser will
    not only insert starting tags, but also ending tags: for instance,
    when a new chapter starts, then all nested block-structures of lower
    and equal level will be closed automatically.

An example which shows implicit and explicit references:

    class Mail::Message {
    #= a general message object.

    =chapter DESCRIPTION
    Implementation of...
    =cut

        method print() {...}
        #= output message header and body.
    }

is equivalent to

    class Mail::Message {

    =class Mail::Message
    a general message object
    =chapter DESCRIPTION
    Implementation of...
    =end chapter
    =chapter METHODS
    =cut

        method print() {...}

    =method print
    output message header and body.
    =call ()
    =end method
    =end chapter
    =end class

    }

As should be clear from the above example: implicit references make life
a lot easier for documentation authors.

I chose to use "=end method" instead of "=method end", to avoid
conflicts with a method or subroutine named "end()".  You may also use
"=end chapter METHODS", in which case the provided name must match.

When there is author-supplied information about a code feature, that
block location will be used to collect all information about the
feature.  If there is no author-supplied info, then the location of the
implementation will be used.  This concept will make it easy to create
the whole manual-page below all code.

The DocTree structural definition [sloppy]

    root:             distribution*
    distribution:     (file|package|class|grammar)*
    file:             chapter*
    package:          chapter*, exporter-info?
    class,grammar:    chapter*, inheritance-info?
    chapter:          description, (section|example|callable)*
    section:          description, (subsection|example|callable)*
    subsection:       description, (subsubsection|example|callable)*
    subsubsection:    description, (example|callable)*
    callable:         method|rule|sub|macro|....
    method,rule,sub:  description,(call|example|report)*
    call:             (param|option)*
    param,option:     name, description?, default?, type?, use?
    description,example: text?
    report:           text?   # describes errors and warnings
    text:             unicode-string
                      # a documentation fragment using markup.

=== The markup language

The text blocks in the DocTree will use POD(5) block markup syntax, so
the users may define their own markup syntax (like Synopsis S26
defines), as long as POD(5) is what gets recorded in the DocTree.

For the sake of better references between documentation elements, POD
needs a more detailed reference syntax:

    A<Some::Module>
        refers to a package/class/grammar with that name

    A<do_something()>
        refers to a sub/method/rule in this resp. package/class/grammar.
        May be available via inheritance from some base class/grammar,
        or from import().

    A<Some::Module::do_something()>
        refers specifically to an element in a different page

    A<do_something(option)>
    A<Some::Module::do_something(option)>
        references to a parameter or named-parameter description.

All of the above define references.  Some back-ends, like UNIX
manual-pages and POD, may not support such fine resolution, and need to
rewrite these links into text.

The destination components do not need to register themselves as anchor
points.  This is very important, because the destination may very well
be part of a different distribution.

=== User documentation

A documentation generating back-end takes the DocTree of one or more
distributions, and uses only the fragments it finds interesting.  That
sub-set of features is converted into end-user manuals, for instance
into traditional POD, man, HTML, XML, or LaTeX.  (User provided)
templates can really simplify this process.
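
To show what a back-end without fine-grained links might do with the
A<...> references, here is a small Python sketch.  The function names
parse_ref() and to_text() are hypothetical, and the wording of the
plain-text replacement is only my assumption; the post does not
prescribe how such links should be rewritten.

```python
import re

# A<...> link, and the two shapes its target can take.
LINK = re.compile(r"A<([^<>]+)>")
CALLABLE = re.compile(r"(?:(\w+(?:::\w+)*)::)?(\w+)\((\w*)\)")
MODULE = re.compile(r"\w+(?:::\w+)*")


def parse_ref(target):
    """Split a reference target into module, callable and option."""
    m = CALLABLE.fullmatch(target)
    if m:
        module, name, option = m.groups()
        return {"module": module, "callable": name,
                "option": option or None}
    if MODULE.fullmatch(target):
        return {"module": target, "callable": None, "option": None}
    raise ValueError(f"unrecognised reference: {target!r}")


def to_text(text):
    """Rewrite every A<...> link as plain text (assumed wording)."""
    def replace(match):
        ref = parse_ref(match.group(1))
        if ref["callable"] is None:
            return ref["module"]
        out = ref["callable"] + "()"
        if ref["option"]:
            out = f"option '{ref['option']}' of {out}"
        if ref["module"]:
            out += f" in {ref['module']}"
        return out
    return LINK.sub(replace, text)


if __name__ == "__main__":
    print(parse_ref("Some::Module::do_something(option)"))
    print(to_text("See A<do_something()> and A<Some::Module>."))
```

Note that nothing here needs the destination to be registered as an
anchor: the target is decomposed from the reference text alone, which
fits the requirement that it may live in a different distribution.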

        developer
           |
           v
    Perl(6) files of a distribution
           |
           + Perl6 compilation
           v
        Perl AST  ---> code
           |
           + fragment collector/splitter
           + markup translator to POD+
           v
        DocTree  (distributable)
           |
           | ,-----< DocTree* additional
           ||
           ++ Manual-page generator (templates/style sheets)
           |
        static manuals (distributable)
           |
           v
        end-user

=== Syntax Alternatives

In the above syntax, only the currently allocated '=' and '#=' symbols
are used.  When additional symbols are made available, a visually
cleaner syntax might be developed.

(0) With current symbols

    method compute()            method compute()
    #= some text                =  some text
    #= and more                 =  and more

  These '#=' are visually quite heavy.  Of course we are used to an
  extensive application of the '#' as comment, but still.  Without '#',
  it is much prettier.

(1) The Python look:

    method compute()
    """ some text
        and more
    """

  There are already so many quotes in a text, that it is confusing.  The
  character is pale, which doesn't feel pleasant.

(2) like attaching a label

    method compute()            method compute($x)
    ` some text                 ` some text
    ` and more                  =param $x
                                ` the starting point

(3) like a line

    method compute()            method compute()
    | some text                 : some text
    | and more                  : and more

There are a zillion other possibilities, which all require a change in
the current Perl6 syntax definition.  It could be a good plan to create
a larger example module, and then experiment with the above suggestions,
within Perl6's boundaries.