Re: [O] [RFC] Org syntax (draft)

Carsten Dominik Thu, 07 Mar 2013 12:48:21 -0800

woooooow, this is awesome Nicolas, thank you!

- Carsten


On 7.3.2013, at 21:37, Nicolas Goaziou <n.goaz...@gmail.com> wrote:

> Hello,
> 
> As discussed a few days ago, here is a document describing the complete
> Org syntax as read by the parser. I also added some comments. I am going
> to put the Org file on Worg, so anyone can update it and fix mistakes.
> 
>                          ━━━━━━━━━━━━━━━━━━━━
>                           ORG SYNTAX (DRAFT)
>                          ━━━━━━━━━━━━━━━━━━━━
> 
> 
> Table of Contents
> ─────────────────
> 
> 1 Headlines and Sections
> 2 Affiliated Keywords
> 3 Greater Elements
> .. 3.1 Greater Blocks
> .. 3.2 Drawers and Property Drawers
> .. 3.3 Dynamic Blocks
> .. 3.4 Footnote Definitions
> .. 3.5 Inlinetasks
> .. 3.6 Plain Lists and Items
> .. 3.7 Tables
> 4 Elements
> .. 4.1 Babel Call
> .. 4.2 Blocks
> .. 4.3 Clock, Diary Sexp and Planning
> .. 4.4 Comments
> .. 4.5 Fixed Width Areas
> .. 4.6 Horizontal Rules
> .. 4.7 Keywords
> .. 4.8 LaTeX Environments
> .. 4.9 Node Properties
> .. 4.10 Paragraphs
> .. 4.11 Table Rows
> 5 Objects
> .. 5.1 Entities and LaTeX Fragments
> .. 5.2 Export Snippets
> .. 5.3 Footnote References
> .. 5.4 Inline Babel Calls and Source Blocks
> .. 5.5 Line Breaks
> .. 5.6 Links
> .. 5.7 Macros
> .. 5.8 Targets and Radio Targets
> .. 5.9 Statistics Cookies
> .. 5.10 Subscript and Superscript
> .. 5.11 Table Cells
> .. 5.12 Timestamps
> .. 5.13 Text Markup
> 
> 
> This document describes Org syntax as it is currently read
> by its parser (Org Elements) and, therefore, by the export framework.
> It also includes a few comments on that syntax.
> 
> A core concept in this syntax is that only headlines and sections are
> context-free[1][2].  Every other syntactical part only exists within
> specific environments.
> 
> Three categories are used to classify these environments: “Greater
> elements”, “elements”, and “objects”, from the broadest scope to the
> narrowest.
> 
> The paragraph is the unit of measurement.  An element defines
> syntactical parts that are at the same level as a paragraph, i.e. which
> cannot contain or be included in a paragraph.  An object is a part that
> could be included in an element.  Greater elements are all parts that
> can contain an element.
> 
> Empty lines belong to the largest element ending before them.  For
> example, in a list, empty lines between items are part of the
> item before them, but empty lines at the end of a list belong to the
> plain list element.
> 
> Unless specified otherwise, case is not significant.
> 
> 
> 1 Headlines and Sections
> ════════════════════════
> 
>  A headline is defined as:
> 
>  ╭────
>  │ STARS KEYWORD PRIORITY TITLE TAGS
>  ╰────
> 
>  STARS is a string starting at column 0 and containing at least one
>  asterisk (and up to `org-inlinetask-min-level' if `org-inlinetask'
>  library is loaded).  It’s the sole compulsory part of a headline.
> 
>  KEYWORD is a TODO keyword, which has to belong to the list defined in
>  `org-todo-keywords'.  Case is significant.
> 
>  PRIORITY is a priority cookie, i.e. a single letter preceded by a hash
>  sign # and enclosed within square brackets.  Case is significant.
> 
>  TITLE can be made of any character but a new line.  However, it is
>  matched only after every other part has been matched.
> 
>  TAGS is made of words containing any alpha-numeric character,
>  underscore, at sign, hash sign or percent sign, and separated with
>  colons.
> 
>  Examples of valid headlines include:
> 
>  ╭────
>  │ *
>  │ 
>  │ ** DONE
>  │ 
>  │ *** Some e-mail
>  │ 
>  │ **** TODO [#A] COMMENT Title :tag:a2%:
>  ╰────
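> 
>  As a rough illustration of the pattern above, here is a minimal Python
>  sketch that splits a headline into its parts.  It is not how the
>  parser is implemented: TODO_KEYWORDS is a hypothetical stand-in for
>  `org-todo-keywords', and HEADLINE_RE and parse_headline are names made
>  up for this example.
> 

```python
import re

# Hypothetical stand-in for `org-todo-keywords' (user-defined in practice).
TODO_KEYWORDS = ("TODO", "DONE")

HEADLINE_RE = re.compile(
    r"^(?P<stars>\*+)"                                    # STARS, column 0
    r"(?: +(?P<keyword>" + "|".join(TODO_KEYWORDS) + r")(?= |$))?"
    r"(?: +\[#(?P<priority>[A-Za-z])\])?"                 # PRIORITY cookie
    r"(?: +(?P<title>.*?))?"                              # TITLE, matched last
    r"(?:[ \t]+(?P<tags>:(?:[A-Za-z0-9_@#%]+:)+))?"       # TAGS
    r"[ \t]*$"
)


def parse_headline(line):
    """Return the parts of a headline as a dict, or None if LINE is not one."""
    m = HEADLINE_RE.match(line)
    if m is None:
        return None
    parts = m.groupdict()
    parts["level"] = len(parts.pop("stars"))
    parts["tags"] = parts["tags"].strip(":").split(":") if parts["tags"] else []
    return parts


print(parse_headline("**** TODO [#A] COMMENT Title :tag:a2%:"))
```

>  The regular expression is case-sensitive, which matches the rule that
>  case is significant for KEYWORD and PRIORITY.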
> 
>  If the first word appearing in the title is `org-comment-keyword', the
>  headline will be considered as “commented”.  If that first word is
>  `org-quote-string', it will be considered as “quoted”.  In both
>  situations, case is significant.
> 
>  If its title is `org-footnote-section', it will be considered as
>  a “footnote section”.  Case is significant.
> 
>  If `org-archive-tag' is one of its tags, it will be considered as
>  “archived”.  Case is significant.
> 
>  A headline contains directly at most one section, followed by any
>  number of headlines.  Only a headline can contain another headline.
> 
>  A section contains directly any greater element or element.  Only
>  a headline can contain a section.  As an exception, text before the
>  first headline in the document also belongs to a section.
> 
>  If a quoted headline contains a section, the latter will be considered
>  as a “quote section”.
> 
>  As an example, consider the following document:
> 
>  ╭────
>  │ An introduction.
>  │ 
>  │ * A Headline 
>  │ 
>  │   Some text.
>  │ 
>  │ ** Sub-Topic 1
>  │ 
>  │ ** Sub-Topic 2
>  │ 
>  │ *** Additional entry 
>  │ 
>  │ ** QUOTE Another Sub-Topic
>  │ 
>  │    Some other text.
>  ╰────
> 
>  Its internal structure could be summarized as:
> 
>  ╭────
>  │ (document
>  │  (section)
>  │  (headline
>  │   (section)
>  │   (headline)
>  │   (headline
>  │    (headline))
>  │   (headline
>  │    (quote-section))))
>  ╰────
> 
> 
> 2 Affiliated Keywords
> ═════════════════════
> 
>  With the exception of [inlinetasks], [items], [planning], [clocks],
>  [node properties] and [table rows], every other element type can be
>  assigned attributes.
> 
>  This is done by adding specific keywords, named “affiliated keywords”,
>  just above the element considered, no blank line allowed.
> 
>  Affiliated keywords are built upon one of the following patterns:
>  “#+KEY: VALUE”, “#+KEY[OPTIONAL]: VALUE” or “#+ATTR_BACKEND: VALUE”.
> 
>  KEY is either “CAPTION”, “HEADER”, “NAME”, “PLOT” or “RESULTS” string.
> 
>  BACKEND is a string constituted of alpha-numeric characters, hyphens
>  or underscores.
> 
>  OPTIONAL and VALUE can contain any character but a new line.  Only
>  keywords in `org-element-dual-keywords' can have an optional value.
> 
>  An affiliated keyword can appear on multiple lines if KEY belongs to
>  `org-element-multiple-keywords' or if its pattern is “#+ATTR_BACKEND:
>  VALUE”.
> 
>  Affiliated keywords whose KEY belongs to `org-element-parsed-keywords'
>  can contain objects in their value and their optional value, if
>  applicable.
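> 
>  The three patterns above can be sketched in Python as follows.  This
>  is only an illustration: AFFILIATED_RE and parse_affiliated are
>  hypothetical names, and the sketch accepts an optional value for any
>  KEY instead of checking `org-element-dual-keywords'.
> 

```python
import re

# KEY values listed in the text above.
AFFILIATED_KEYS = ("CAPTION", "HEADER", "NAME", "PLOT", "RESULTS")

AFFILIATED_RE = re.compile(
    r"^#\+(?:"
    r"(?P<key>" + "|".join(AFFILIATED_KEYS) + r")(?:\[(?P<optional>.*?)\])?"
    r"|ATTR_(?P<backend>[A-Za-z0-9_-]+)"
    r"):[ \t]*(?P<value>.*)$",
    re.IGNORECASE,  # case is not significant unless specified otherwise
)


def parse_affiliated(line):
    """Return key/optional/backend/value for an affiliated keyword line."""
    m = AFFILIATED_RE.match(line)
    if m is None:
        return None
    parts = m.groupdict()
    if parts["key"]:
        parts["key"] = parts["key"].upper()
    return parts


print(parse_affiliated("#+CAPTION[Short]: A long caption"))
print(parse_affiliated("#+ATTR_LATEX: :width 5cm"))
```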
> 
> 
>  [inlinetasks] See section 3.5
> 
>  [items] See section 3.6
> 
>  [planning] See section 4.3
> 
>  [clocks] See section 4.3
> 
>  [node properties] See section 4.9
> 
>  [table rows] See section 4.11
> 
> 
> 3 Greater Elements
> ══════════════════
> 
>  Unless specified otherwise, greater elements can contain directly any
>  other element or greater element except:
> 
>  • elements of their own type,
>  • [node properties], which can only be found in [property drawers],
>  • [items], which can only be found in [plain lists].
> 
> 
>  [node properties] See section 4.9
> 
>  [property drawers] See section 3.2
> 
>  [items] See section 3.6
> 
>  [plain lists] See section 3.6
> 
> 
> 3.1 Greater Blocks
> ──────────────────
> 
>  Greater blocks follow this pattern:
> 
>  ╭────
>  │ #+BEGIN_NAME PARAMETERS
>  │ CONTENTS
>  │ #+END_NAME
>  ╰────
> 
>  NAME can contain any non-whitespace character.
> 
>  PARAMETERS can contain any character, and can be omitted.
> 
>  If NAME is “CENTER”, it will be a “center block”.  If it is “QUOTE”,
>  it will be a “quote block”.
> 
>  If the block is neither a center block, a quote block nor a [block
>  element], it will be a “special block”.
> 
>  CONTENTS can contain any element except another greater block of the
>  same type.
> 
> 
>  [block element] See section 4.2
> 
> 
> 3.2 Drawers and Property Drawers
> ────────────────────────────────
> 
>  Pattern for drawers is:
> 
>  ╭────
>  │ :NAME:
>  │ CONTENTS
>  │ :END:
>  ╰────
> 
>  NAME has to either be “PROPERTIES” or belong to `org-drawers' list.
> 
>  If NAME is “PROPERTIES”, the drawer will become a “property drawer”.
> 
>  In a property drawer, CONTENTS can only contain [node property]
>  elements.  Otherwise it can contain any element but another drawer or
>  property drawer.
> 
>                                  ―――――
> 
>  It would be nice if users didn’t have to register drawer names in
>  `org-drawers' (or through the `#+DRAWERS:' keyword) before using them.
>  Anything starting with `^[ \t]*:\w+:[ \t]*$' and ending with
>  `^[ \t]*:END:[ \t]*$' could be considered as a drawer.  — ngz
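> 
>  A minimal Python sketch of that suggestion, for illustration only.
>  It assumes trailing whitespace on the delimiter lines is optional,
>  and find_drawers is a hypothetical helper, not an Org function.
> 

```python
import re

# Assumption: any amount of trailing whitespace is allowed on both lines.
DRAWER_START = re.compile(r"^[ \t]*:(\w+):[ \t]*$")
DRAWER_END = re.compile(r"^[ \t]*:END:[ \t]*$", re.IGNORECASE)


def find_drawers(lines):
    """Return (name, first_line, last_line) tuples, with 0-based indices."""
    drawers, name, start = [], None, None
    for index, line in enumerate(lines):
        if name is None:
            m = DRAWER_START.match(line)
            # A stray ":END:" outside of a drawer does not open one.
            if m and m.group(1).upper() != "END":
                name, start = m.group(1), index
        elif DRAWER_END.match(line):
            drawers.append((name, start, index))
            name = None
    return drawers


document = ["* H", ":PROPERTIES:", ":ID: abc", ":END:",
            "text", "  :LOGBOOK:  ", "note", "  :end:"]
print(find_drawers(document))
```

>  No registration step is needed here: the drawer name is simply
>  whatever appears between the colons on the opening line.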
> 
> 
>  [node property] See section 4.9
> 
> 
> 3.3 Dynamic Blocks
> ──────────────────
> 
>  Pattern for dynamic blocks is:
> 
>  ╭────
>  │ #+BEGIN: NAME PARAMETERS
>  │ CONTENTS
>  │ #+END:
>  ╰────
> 
>  NAME cannot contain any whitespace character.
> 
>  PARAMETERS can contain any character and can be omitted.
> 
> 
> 3.4 Footnote Definitions
> ────────────────────────
> 
>  Pattern for footnote definitions is:
> 
>  ╭────
>  │ [LABEL] CONTENTS
>  ╰────
> 
>  It must start at column 0.
> 
>  LABEL is either a number or follows the pattern “fn:WORD”, where WORD
>  can contain any word-constituent character, hyphens and underscore
>  characters.
> 
>  CONTENTS can contain any element except another footnote definition.
>  It ends at the next footnote definition, the next headline, two
>  consecutive empty lines or the end of buffer.
> 
> 
> 3.5 Inlinetasks
> ───────────────
> 
>  Inlinetasks are defined by `org-inlinetask-min-level' contiguous
>  asterisk characters starting at column 0, followed by a whitespace
>  character.
> 
>  Optionally, inlinetasks can be ended with a string constituted of
>  `org-inlinetask-min-level' contiguous asterisk characters starting at
>  column 0, followed by a space and the “END” string.
> 
>  Inlinetasks are recognized only after `org-inlinetask' library is
>  loaded.
> 
> 
> 3.6 Plain Lists and Items
> ─────────────────────────
> 
>  Items are defined by a line starting with the following pattern:
>  “BULLET COUNTER-SET CHECK-BOX TAG”, in which only BULLET is mandatory.
> 
>  BULLET is either an asterisk, a hyphen, a plus sign character or
>  follows either the pattern “COUNTER.” or “COUNTER)”.  In any case,
>  BULLET is followed by a whitespace character or line ending.
> 
>  COUNTER can be a number or a single letter.
> 
>  COUNTER-SET follows the pattern [@COUNTER].
> 
>  CHECK-BOX is either a single whitespace character, an “X” character or
>  a hyphen, enclosed within square brackets.
> 
>  TAG follows “TAG-TEXT ::” pattern, where TAG-TEXT can contain any
>  character but a new line.
> 
>  An item ends before the next item, the first line indented less than
>  or equally to its starting line, or two consecutive empty lines.
>  Indentation of lines within other greater elements does not count,
>  nor do inlinetask boundaries.
> 
>  A plain list is a set of consecutive items of the same indentation.
>  It can only directly contain items.
> 
>  If first item in a plain list has a counter in its bullet, the plain
>  list will be an “ordered plain-list”.  If it contains a tag, it will
>  be a “descriptive list”.  Otherwise, it will be an “unordered list”.
>  List types are mutually exclusive.
> 
>  For example, consider the following excerpt of an Org document:
> 
>  ╭────
>  │ 1. item 1
>  │ 2. [X] item 2
>  │    - some tag :: item 2.1
>  ╰────
> 
>  Its internal structure is as follows:
> 
>  ╭────
>  │ (ordered-plain-list
>  │  (item)
>  │  (item
>  │   (descriptive-plain-list
>  │    (item))))
>  ╰────
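> 
>  To make the item pattern and the list-type rule more concrete, here
>  is a small Python sketch.  ITEM_RE and list_type are hypothetical
>  names, and the sketch only looks at a single line: it does not track
>  indentation to find where an item or a list ends.
> 

```python
import re

ITEM_RE = re.compile(
    r"^(?P<indent>[ \t]*)"
    r"(?P<bullet>[-+*]|(?:[0-9]+|[A-Za-z])[.)])"      # BULLET
    r"(?:[ \t]+|$)"
    r"(?:\[@(?P<counter_set>[0-9]+|[A-Za-z])\][ \t]*)?"  # COUNTER-SET
    r"(?:\[(?P<checkbox>[ X-])\][ \t]*)?"             # CHECK-BOX
    r"(?:(?P<tag>.*?)[ \t]+::(?:[ \t]+|$))?"          # TAG
    r"(?P<text>.*)$"
)


def list_type(first_item_line):
    """Classify a plain list from the line starting its first item."""
    m = ITEM_RE.match(first_item_line)
    # Stars at column 0 start a headline, not an item.
    if m is None or (m["bullet"] == "*" and not m["indent"]):
        return None
    if m["bullet"][-1] in ".)":
        return "ordered"
    if m["tag"] is not None:
        return "descriptive"
    return "unordered"


for line in ("1. item 1", "2. [X] item 2", "   - some tag :: item 2.1"):
    print(repr(line), "->", list_type(line))
```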
> 
> 
> 3.7 Tables
> ──────────
> 
>  Tables start at lines beginning with either a vertical bar or the “+-”
>  string followed by plus or minus signs only, assuming they are not
>  preceded with lines of the same type.  These lines can be indented.
> 
>  A table starting with a vertical bar has “org” type.  Otherwise it has
>  “table.el” type.
> 
>  Org tables end at the first line not starting with a vertical bar.
>  Table.el tables end at the first line not starting with either
>  a vertical bar or a plus sign.  Such lines can be indented.
> 
>  An org table can only contain table rows.  A table.el table does not
>  contain anything.
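> 
>  The rule for telling both table types apart from their first line
>  could be sketched like this in Python (table_type is a hypothetical
>  helper written for this example):
> 

```python
import re

# Both kinds of starting lines can be indented.
ORG_TABLE_START = re.compile(r"^[ \t]*\|")
TABLE_EL_START = re.compile(r"^[ \t]*\+-[-+]*[ \t]*$")


def table_type(first_line):
    """Return "org", "table.el", or None if LINE cannot start a table."""
    if ORG_TABLE_START.match(first_line):
        return "org"
    if TABLE_EL_START.match(first_line):
        return "table.el"
    return None


print(table_type("| a | b |"))
print(table_type("  +---+---+"))
```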
> 
> 
> 4 Elements
> ══════════
> 
>  Elements cannot contain any other element.
> 
>  Only [keywords] whose name belongs to
>  `org-element-document-properties', [verse blocks], [paragraphs] and
>  [table rows] can contain objects.
> 
> 
>  [keywords] See section 4.7
> 
>  [verse blocks] See section 4.2
> 
>  [paragraphs] See section 4.10
> 
>  [table rows] See section 4.11
> 
> 
> 4.1 Babel Call
> ──────────────
> 
>  Pattern for babel calls is:
> 
>  ╭────
>  │ #+CALL: VALUE
>  ╰────
> 
>  VALUE is optional.  It can contain any character but a new line.
> 
> 
> 4.2 Blocks
> ──────────
> 
>  Like [greater blocks], pattern for blocks is:
> 
>  ╭────
>  │ #+BEGIN_NAME DATA
>  │ CONTENTS
>  │ #+END_NAME
>  ╰────
> 
>  NAME cannot contain any whitespace character.
> 
>  If NAME is “COMMENT”, it will be a “comment block”.  If it is
>  “EXAMPLE”, it will be an “example block”.  If it is “SRC”, it will be
>  a “source block”.  If it is “VERSE”, it will be a “verse block”.
> 
>  If NAME is a string matching the name of any export back-end loaded,
>  the block will be an “export block”.
> 
>  DATA can contain any character but a new line.  It can be omitted,
>  unless the block is a “source block”.  In this case, it must follow
>  the pattern “LANGUAGE SWITCHES ARGUMENTS”, where SWITCHES and
>  ARGUMENTS are optional.
> 
>  LANGUAGE cannot contain any whitespace character.
> 
>  SWITCHES is made of any number of “SWITCH” patterns, separated by
>  blank characters.
> 
>  A SWITCH pattern is either “-l "FORMAT"”, where FORMAT can contain any
>  character but a double quote and a new line, “-S” or “+S”, where
>  S stands for a single letter.
> 
>  ARGUMENTS can contain any character but a new line.
> 
>  CONTENTS can contain any character, including new lines.  Though it
>  will only contain Org objects if the block is a verse block.
>  Otherwise, contents will not be parsed.
> 
> 
>  [greater blocks] See section 3.1
> 
> 
> 4.3 Clock, Diary Sexp and Planning
> ──────────────────────────────────
> 
>  A clock follows the pattern:
> 
>  ╭────
>  │ CLOCK: TIMESTAMP DURATION
>  ╰────
> 
>  Both TIMESTAMP and DURATION are optional.
> 
>  TIMESTAMP is a [timestamp] object.
> 
>  DURATION follows the pattern:
> 
>  ╭────
>  │ => HH:MM
>  ╰────
> 
>  HH is a number containing any number of digits.  MM is a two-digit
>  number.
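> 
>  For instance, DURATION could be read with the following Python
>  sketch.  DURATION_RE and duration_minutes are hypothetical names, and
>  the clock line in the example is made up.
> 

```python
import re

# "=> HH:MM" at the end of a clock line; HH may have any number of digits.
DURATION_RE = re.compile(r"=>[ \t]+(?P<hh>[0-9]+):(?P<mm>[0-9]{2})[ \t]*$")


def duration_minutes(clock_line):
    """Return DURATION in minutes, or None when DURATION is omitted."""
    m = DURATION_RE.search(clock_line)
    if m is None:
        return None
    return int(m["hh"]) * 60 + int(m["mm"])


line = "CLOCK: [2013-03-07 Thu 09:00]--[2013-03-07 Thu 10:30] =>  1:30"
print(duration_minutes(line))
```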
> 
>  A diary sexp is a line starting at column 0 with “%%(” string.  It can
>  then contain any character besides a new line.
> 
>  A planning is a line filled with at most three INFO parts, where
>  each INFO part follows the pattern:
> 
>  ╭────
>  │ KEYWORD: TIMESTAMP
>  ╰────
> 
>  KEYWORD is a string among `org-deadline-string',
>  `org-scheduled-string' and `org-closed-string'.  TIMESTAMP is
>  a [timestamp] object.
> 
>  Even though a planning element can exist anywhere in a section or
>  a greater element, it will only affect the headline containing the
>  section if it is put on the line following that headline.
> 
> 
>  [timestamp] See section 5.12
> 
> 
> 4.4 Comments
> ────────────
> 
>  A “comment line” starts with a hash sign and a whitespace character
>  or an end of line.
> 
>  Comments can contain any number of consecutive comment lines.
> 
> 
> 4.5 Fixed Width Areas
> ─────────────────────
> 
>  A “fixed-width line” starts with a colon character and a whitespace or
>  an end of line.
> 
>  Fixed width areas can contain any number of consecutive fixed-width
>  lines.
> 
> 
> 4.6 Horizontal Rules
> ────────────────────
> 
>  A horizontal rule is a line made of at least 5 consecutive hyphens.
>  It can be indented.
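> 
>  The three line-based definitions above (comment lines, fixed-width
>  lines and horizontal rules) can be sketched in Python as below.
>  classify_line is a hypothetical helper.  Since the text only mentions
>  indentation for horizontal rules, the sketch does not accept indented
>  comment or fixed-width lines; that is an assumption of the example.
> 

```python
import re

COMMENT_LINE = re.compile(r"^#(?:[ \t]|$)")
FIXED_WIDTH_LINE = re.compile(r"^:(?:[ \t]|$)")
HORIZONTAL_RULE = re.compile(r"^[ \t]*-{5,}[ \t]*$")


def classify_line(line):
    """Return the kind of line-based element LINE belongs to, or None."""
    if COMMENT_LINE.match(line):
        return "comment"
    if FIXED_WIDTH_LINE.match(line):
        return "fixed-width"
    if HORIZONTAL_RULE.match(line):
        return "horizontal-rule"
    return None


for line in ("# a note", ": verbatim text", "  -----", "#+TITLE: x"):
    print(repr(line), "->", classify_line(line))
```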
> 
> 
> 4.7 Keywords
> ────────────
> 
>  Keywords follow the syntax:
> 
>  ╭────
>  │ #+KEY: VALUE
>  ╰────
> 
>  KEY can contain any non-whitespace character, but it cannot be equal
>  to “CALL” or any affiliated keyword.
> 
>  VALUE can contain any character except a new line.
> 
>  If KEY belongs to `org-element-document-properties', VALUE can contain
>  objects.
> 
> 
> 4.8 LaTeX Environments
> ──────────────────────
> 
>  Pattern for LaTeX environments is:
> 
>  ╭────
>  │ \begin{NAME}
>  │ CONTENTS
>  │ \end{NAME}
>  ╰────
> 
>  NAME is constituted of alpha-numeric characters and may end with an
>  asterisk.
> 
>  CONTENTS can contain anything but the “\end{NAME}” string.
> 
> 
> 4.9 Node Properties
> ───────────────────
> 
>  Pattern for node properties is:
> 
>  ╭────
>  │ :PROPERTY: VALUE
>  ╰────
> 
>  PROPERTY can contain any non-whitespace character.  VALUE can contain
>  any character but a new line.
> 
>  Node properties can only exist in [property drawers].
> 
> 
>  [property drawers] See section 3.2
> 
> 
> 4.10 Paragraphs
> ───────────────
> 
>  Paragraphs are the default element, which means that any unrecognized
>  context is a paragraph.
> 
>  Empty lines and other elements end paragraphs.
> 
>  Paragraphs can contain every type of object.
> 
> 
> 4.11 Table Rows
> ───────────────
> 
>  A table row consists of either a vertical bar and any number of
>  [table cells], or a vertical bar followed by a hyphen.
> 
>  In the first case the table row has the “standard” type.  In the
>  second case, it has the “rule” type.
> 
>  Table rows can only exist in [tables].
> 
> 
>  [table cells] See section 5.11
> 
>  [tables] See section 3.7
> 
> 
> 5 Objects
> ═════════
> 
>  Objects can only be found in the following locations:
> 
>  • [affiliated keywords] defined in `org-element-parsed-keywords',
>  • [document properties],
>  • [headline] titles,
>  • [inlinetask] titles,
>  • [item] tags,
>  • [paragraphs],
>  • [table cells],
>  • [table rows], which can only contain table cell objects,
>  • [verse blocks].
> 
>  Most objects cannot contain objects.  Those which can will be
>  specified.
> 
> 
>  [affiliated keywords] See section 2
> 
>  [document properties] See section 4.7
> 
>  [headline] See section 1
> 
>  [inlinetask] See section 3.5
> 
>  [item] See section 3.6
> 
>  [paragraphs] See section 4.10
> 
>  [table cells] See section 5.11
> 
>  [table rows] See section 4.11
> 
>  [verse blocks] See section 4.2
> 
> 
> 5.1 Entities and LaTeX Fragments
> ────────────────────────────────
> 
>  An entity follows the pattern:
> 
>  ╭────
>  │ \NAME POST
>  ╰────
> 
>  where NAME has a valid association in either `org-entities' or
>  `org-entities-user'.
> 
>  POST is the end of line, "{}" string, or a non-alphabetical character.
>  It isn’t separated from NAME by a whitespace character.
> 
>  A LaTeX fragment can follow multiple patterns:
> 
>  ╭────
>  │ \NAME POST
>  │ \(CONTENTS\)
>  │ \[CONTENTS\]
>  │ $$CONTENTS$$
>  │ PRE$CHAR$POST
>  │ PRE$BORDER1 BODY BORDER2$
>  ╰────
> 
>  NAME contains alphabetical characters only and must not have an
>  association in either `org-entities' or `org-entities-user'.
> 
>  POST is the same as for entities.
> 
>  CONTENTS can contain any character but cannot contain “\)” in the
>  second template or “\]” in the third one.
> 
>  PRE is either the beginning of line or a character different from `$'.
> 
>  CHAR is a non-whitespace character different from `.', `,', `?', `;',
>  a single quote or a double quote.
> 
>  POST is any of `-', `.', `,', `?', `;', `:', a single quote, a double
>  quote, a whitespace character and the end of line.
> 
>  BORDER1 is a non-whitespace character different from `.', `;', `,'
>  and `$'.
> 
>  BODY can contain any character except `$', and may not span over
>  more than 3 lines.
> 
>  BORDER2 is any non-whitespace character different from `,', `.' and
>  `$'.
> 
>                                  ―――――
> 
>        It would introduce incompatibilities with previous Org
>        versions, but support for “$…$” (and for symmetry,
>        `$$...$$') constructs ought to be removed.
> 
>        They are slow to parse, fragile, redundant, imply false
>        positives and do not look good in LaTeX output anyway.
>        Even the LaTeX community suggests using `\(...\)' over
>        `$...$'.  — ngz
> 
> 
> 5.2 Export Snippets
> ───────────────────
> 
>  Pattern for export snippets is:
> 
>  ╭────
>  │ @@NAME:VALUE@@
>  ╰────
> 
>  NAME can contain any alpha-numeric character and hyphens.
> 
>  VALUE can contain anything but “@@” string.
> 
> 
> 5.3 Footnote References
> ───────────────────────
> 
>  There are four patterns for footnote references:
> 
>  ╭────
>  │ [MARK]
>  │ [fn:LABEL]
>  │ [fn:LABEL:DEFINITION]
>  │ [fn::DEFINITION]
>  ╰────
> 
>  MARK is a number.
> 
>  LABEL can contain any word constituent character, hyphens and
>  underscores.
> 
>  DEFINITION can contain any character.  Though opening and closing
>  square brackets must be balanced in it.  It can contain any object
>  encountered in a paragraph, even other footnote references.
> 
>  If the reference follows the third pattern, it is called an “inline
>  footnote”.  If it follows the fourth one, i.e. if LABEL is omitted, it
>  is an “anonymous footnote”.
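> 
>  As an illustration, the four patterns can be told apart with the
>  Python sketch below.  The function names are hypothetical, and
>  “standard” is a label invented here for the first two patterns, which
>  the text above does not name.
> 

```python
import re


def brackets_balanced(text):
    """Check that square brackets in TEXT are balanced."""
    depth = 0
    for char in text:
        if char == "[":
            depth += 1
        elif char == "]":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def classify_footnote_reference(ref):
    """Return "standard", "inline", "anonymous", or None if REF is invalid."""
    if len(ref) < 2 or ref[0] != "[" or ref[-1] != "]":
        return None
    inner = ref[1:-1]
    if re.fullmatch(r"[0-9]+", inner):          # [MARK]
        return "standard"
    m = re.fullmatch(r"fn:([\w-]*)(?::(.*))?", inner, re.DOTALL)
    if m is None:
        return None
    label, definition = m.groups()
    if definition is None:                      # [fn:LABEL]
        return "standard" if label else None
    if not brackets_balanced(definition):
        return None
    return "inline" if label else "anonymous"


for ref in ("[1]", "[fn:note]", "[fn:note:see [[link]]]", "[fn::some text]"):
    print(ref, "->", classify_footnote_reference(ref))
```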
> 
> 
> 5.4 Inline Babel Calls and Source Blocks
> ────────────────────────────────────────
> 
>  Inline Babel calls follow any of the following patterns:
> 
>  ╭────
>  │ call_NAME(ARGUMENTS)
>  │ call_NAME[HEADER](ARGUMENTS)[HEADER]
>  ╰────
> 
>  NAME can contain any character besides `(', `)' and “\n”.
> 
>  HEADER can contain any character besides `]' and “\n”.
> 
>  ARGUMENTS can contain any character besides `)' and “\n”.
> 
>  Inline source blocks follow any of the following patterns:
> 
>  ╭────
>  │ src_LANG{BODY}
>  │ src_LANG[OPTIONS]{BODY}
>  ╰────
> 
>  LANG can contain any non-whitespace character.
> 
>  OPTIONS and BODY can contain any character but “\n”.
> 
> 
> 5.5 Line Breaks
> ───────────────
> 
>  A line break consists of a “\\SPACE” pattern at the end of an otherwise
>  non-empty line.
> 
>  SPACE can contain any number of tabs and spaces, including 0.
> 
> 
> 5.6 Links
> ─────────
> 
>  There are 4 major types of links:
> 
>  ╭────
>  │ RADIO                     ("radio" link)
>  │ <PROTOCOL:PATH>           ("angle" link)
>  │ PRE PROTOCOL:PATH2 POST   ("plain" link)
>  │ [[PATH3]DESCRIPTION]      ("regular" link)
>  ╰────
> 
>  RADIO is a string matched by some [radio target].  It can contain
>  [entities], [latex fragments], [subscript] and [superscript] only.
> 
>  PROTOCOL is a string among `org-link-types'.
> 
>  PATH can contain any character but `]', `<', `>' and `\n'.
> 
>  PRE and POST are non-word-constituent characters.  They can also be,
>  respectively, the beginning or the end of a line.
> 
>  PATH2 can contain any non-whitespace character except `(', `)', `<'
>  and `>'.  It must end with a word-constituent character, or any
>  non-whitespace non-punctuation character followed by `/'.
> 
>  DESCRIPTION must be enclosed within square brackets.  It can contain
>  any character but square brackets.  Object-wise, it can contain any
>  object found in a paragraph except a [footnote reference], a [radio
>  target] and a [line break].  It cannot contain another link either,
>  unless it is a plain link.
> 
>  DESCRIPTION is optional.
> 
>  PATH3 is built according to the following patterns:
> 
>  ╭────
>  │ FILENAME           ("file" type)
>  │ PROTOCOL:PATH4     ("PROTOCOL" type)
>  │ id:ID              ("id" type)
>  │ #CUSTOM-ID         ("custom-id" type)
>  │ (CODEREF)          ("coderef" type)
>  │ FUZZY              ("fuzzy" type)
>  ╰────
> 
>  FILENAME is a file name, either absolute or relative.
> 
>  PATH4 can contain any character besides square brackets.
> 
>  ID is constituted of hexadecimal numbers separated with hyphens.
> 
>  CUSTOM-ID, CODEREF and FUZZY can likewise contain any character
>  besides square brackets.
> 
>                                  ―――――
> 
>        I suggest removing angle links.  If one needs spaces in
>        PATH, she can use standard link syntax instead.
> 
>        I also suggest removing the `org-link-types' dependency in
>        PROTOCOL and matching `[a-zA-Z]' instead, for portability.  —
>        ngz
> 
> 
>  [radio target] See section 5.8
> 
>  [entities] See section 5.1
> 
>  [latex fragments] See section 5.1
> 
>  [subscript] See section 5.10
> 
>  [superscript] See section 5.10
> 
>  [footnote reference] See section 5.3
> 
>  [line break] See section 5.5
> 
> 
> 5.7 Macros
> ──────────
> 
>  Macros follow the pattern:
> 
>  ╭────
>  │ {{{NAME(ARGUMENTS)}}}
>  ╰────
> 
>  NAME must start with a letter and can be followed by any number of
>  alpha-numeric characters, hyphens and underscores.
> 
>  ARGUMENTS can contain anything but "}}}" string.  Values within
>  ARGUMENTS are separated by commas.  Non-separating commas have to be
>  escaped with a backslash character.
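> 
>  A minimal Python sketch of this pattern, including the handling of
>  escaped commas.  MACRO_RE, split_arguments and parse_macro are
>  hypothetical names, and trimming whitespace around each value is an
>  assumption of the example, not something stated above.
> 

```python
import re

MACRO_RE = re.compile(
    r"\{\{\{(?P<name>[A-Za-z][A-Za-z0-9_-]*)\((?P<args>.*?)\)\}\}\}",
    re.DOTALL,
)


def split_arguments(arguments):
    """Split ARGUMENTS on commas that are not escaped with a backslash."""
    if arguments == "":
        return []
    values = re.split(r"(?<!\\),", arguments)
    # Assumption: surrounding whitespace is not part of a value.
    return [value.replace("\\,", ",").strip() for value in values]


def parse_macro(text):
    """Return (NAME, [values]) for the first macro in TEXT, or None."""
    m = MACRO_RE.search(text)
    if m is None:
        return None
    return m["name"], split_arguments(m["args"])


print(parse_macro(r"see {{{color(red, a\,b)}}} here"))
```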
> 
> 
> 5.8 Targets and Radio Targets
> ─────────────────────────────
> 
>  Radio targets follow the pattern:
> 
>  ╭────
>  │ <<<CONTENTS>>>
>  ╰────
> 
>  CONTENTS can be any character besides `<', `>' and “\n”.  As far as
>  objects go, it can contain [entities], [latex fragments], [subscript]
>  and [superscript] only.
> 
>  Targets follow the pattern:
> 
>  ╭────
>  │ <<TARGET>>
>  ╰────
> 
>  TARGET can contain any character besides `<', `>' and “\n”.  It cannot
>  contain any object.
> 
> 
>  [entities] See section 5.1
> 
>  [latex fragments] See section 5.1
> 
>  [subscript] See section 5.10
> 
>  [superscript] See section 5.10
> 
> 
> 5.9 Statistics Cookies
> ──────────────────────
> 
>  Statistics cookies follow either pattern:
> 
>  ╭────
>  │ [PERCENT%]
>  │ [NUM1/NUM2]
>  ╰────
> 
>  PERCENT, NUM1 and NUM2 are numbers or the empty string.
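> 
>  Both patterns can be recognized with a short Python sketch like the
>  one below (COOKIE_RE and parse_cookie are hypothetical names).  Empty
>  strings are reported as None.
> 

```python
import re

COOKIE_RE = re.compile(
    r"\[(?:(?P<percent>[0-9]*)%|(?P<num1>[0-9]*)/(?P<num2>[0-9]*))\]"
)


def to_int(digits):
    """Convert a possibly empty string of digits to an int or None."""
    return int(digits) if digits else None


def parse_cookie(text):
    """Return ("percent", n) or ("fraction", n1, n2), or None if no match."""
    m = COOKIE_RE.fullmatch(text)
    if m is None:
        return None
    if m["percent"] is not None:
        return ("percent", to_int(m["percent"]))
    return ("fraction", to_int(m["num1"]), to_int(m["num2"]))


for cookie in ("[50%]", "[1/4]", "[%]", "[/]"):
    print(cookie, "->", parse_cookie(cookie))
```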
> 
> 
> 5.10 Subscript and Superscript
> ──────────────────────────────
> 
>  Pattern for subscript is:
> 
>  ╭────
>  │ CHAR_SCRIPT
>  ╰────
> 
>  Pattern for superscript is:
> 
>  ╭────
>  │ CHAR^SCRIPT
>  ╰────
> 
>  CHAR is any non-whitespace character.
> 
>  SCRIPT can be `*', a string made of word-constituent characters
>  possibly preceded by a plus or a minus sign, or an expression enclosed
>  in parentheses (resp. curly brackets) containing balanced parentheses
>  (resp. curly brackets).
> 
> 
> 5.11 Table Cells
> ────────────────
> 
>  Table cells follow the pattern:
> 
>  ╭────
>  │ CONTENTS|
>  ╰────
> 
>  CONTENTS can contain any character except a vertical bar.
> 
> 
> 5.12 Timestamps
> ───────────────
> 
>  There are seven possible patterns for timestamps:
> 
>  ╭────
>  │ <%%(SEXP)>                                   (diary)
>  │ <DATE TIME REPEATER>                         (active)
>  │ [DATE TIME REPEATER]                         (inactive)
>  │ <DATE TIME REPEATER>--<DATE TIME REPEATER>   (active range)
>  │ <DATE TIME-TIME REPEATER>                    (active range)
>  │ [DATE TIME REPEATER]--[DATE TIME REPEATER]   (inactive range)
>  │ [DATE TIME-TIME REPEATER]                    (inactive range)
>  ╰────
> 
>  SEXP can contain any character except `>' and `\n'.
> 
>  DATE follows the pattern:
> 
>  ╭────
>  │ YYYY-MM-DD DAYNAME
>  ╰────
> 
>  Y, M and D are digits.  DAYNAME can contain any non-whitespace
>  character besides `+', `-', `]', `>', a digit or `\n'.
> 
>  TIME follows the pattern “H:MM”.  H can be one or two digits long and
>  can start with 0.
> 
>  REPEATER follows the pattern:
> 
>  ╭────
>  │ MARK VALUE UNIT
>  ╰────
> 
>  MARK is `+' (cumulate type), `++' (catch-up type) or `.+' (restart
>  type).
> 
>  VALUE is a number.
> 
>  UNIT is a character among `h' (hour), `d' (day), `w' (week), `m'
>  (month), `y' (year).
> 
>  MARK, VALUE and UNIT are not separated by whitespace characters.
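> 
>  As an example, the simple active and inactive patterns (without
>  ranges or diary sexps) could be parsed with the Python sketch below.
>  TIMESTAMP_RE and parse_timestamp are hypothetical names.  The sketch
>  treats DAYNAME, TIME and REPEATER as optional, which is an assumption
>  of the example.
> 

```python
import re

TIMESTAMP_RE = re.compile(
    r"(?P<open>[<\[])"
    r"(?P<date>[0-9]{4}-[0-9]{2}-[0-9]{2})"
    r"(?: +(?P<dayname>[^\s+\-\]>0-9]+))?"
    r"(?: +(?P<time>[0-9]{1,2}:[0-9]{2}))?"
    r"(?: +(?P<mark>\+\+|\.\+|\+)(?P<value>[0-9]+)(?P<unit>[hdwmy]))?"
    r"(?P<close>[>\]])"
)

REPEATER_TYPES = {"+": "cumulate", "++": "catch-up", ".+": "restart"}


def parse_timestamp(text):
    """Parse a single active or inactive timestamp, or return None."""
    m = TIMESTAMP_RE.fullmatch(text)
    if m is None:
        return None
    # Opening and closing delimiters must be of the same kind.
    if (m["open"], m["close"]) not in (("<", ">"), ("[", "]")):
        return None
    repeater = None
    if m["mark"]:
        repeater = (m["mark"], int(m["value"]), m["unit"])
    return {
        "type": "active" if m["open"] == "<" else "inactive",
        "date": m["date"],
        "dayname": m["dayname"],
        "time": m["time"],
        "repeater": repeater,
    }


stamp = parse_timestamp("<2013-03-07 Thu 21:37 +1w>")
print(stamp, REPEATER_TYPES[stamp["repeater"][0]])
```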
> 
> 
> 5.13 Text Markup
> ────────────────
> 
>  Text markup follows the pattern:
> 
>  ╭────
>  │ PRE MARKER CONTENTS MARKER POST
>  ╰────
> 
>  PRE is a whitespace character, `(', `{', a single quote or a double
>  quote.  It can also be a beginning of line.
> 
>  MARKER is a character among `*' (bold), `=' (verbatim), `/' (italic),
>  `+' (strike-through), `_' (underline), `~' (code).
> 
>  CONTENTS is a string following the pattern:
> 
>  ╭────
>  │ BORDER BODY BORDER
>  ╰────
> 
>  BORDER can be any non-whitespace character except `,', a single quote
>  or a double quote.
> 
>  BODY can contain any character but may not span over more than
>  3 lines.
> 
>  BORDER and BODY are not separated by whitespace.
> 
>  CONTENTS can contain any object encountered in a paragraph when markup
>  is “bold”, “italic”, “strike-through” or “underline”.
> 
>  POST is a whitespace character, `-', `.', `,', `:', `!', `?', a single
>  quote, `)', `}' or a double quote.  It can also be an end of line.
> 
>  PRE, MARKER, CONTENTS, MARKER and POST are not separated by whitespace
>  characters.
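> 
>  The rules above can be approximated with a single regular expression.
>  The Python sketch below is only an illustration under the default
>  settings: MARKUP_RE and find_markup are hypothetical names, the quote
>  characters are assumed to be the ASCII single and double quotes, and
>  nested markup is not handled.
> 

```python
import re

MARKERS = {"*": "bold", "=": "verbatim", "/": "italic",
           "+": "strike-through", "_": "underline", "~": "code"}

MARKUP_RE = re.compile(
    r"(?:^|(?<=[\s({'\"]))"                    # PRE
    r"(?P<marker>[*=/+_~])"                    # MARKER
    r"(?P<contents>"
    r"[^\s,'\"]"                               # BORDER
    r"[^\n]*?(?:\n[^\n]*?){0,2}"               # BODY, at most 3 lines
    r"[^\s,'\"]"                               # BORDER
    r")"
    r"(?P=marker)"                             # same MARKER again
    r"(?=[\s\-.,:!?')}\"]|$)",                 # POST
    re.MULTILINE,
)


def find_markup(text):
    """Return (markup type, contents) pairs found in TEXT."""
    return [(MARKERS[m["marker"]], m["contents"])
            for m in MARKUP_RE.finditer(text)]


print(find_markup("This is *bold* and /italic/ text, plus ~code~."))
```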
> 
>                                  ―――――
> 
>        All of this is wrong if `org-emphasis-regexp-components'
>        or `org-emphasis-alist' is modified.
> 
>        This should really be simplified and made persistent
>        (i.e. no defcustom allowed).  Otherwise, portability and
>        parsing are jokes.
> 
>        Also, CONTENTS should be anything within code and verbatim
>        emphasis, by definition.  — ngz
> 
> 
> 
> Footnotes
> ─────────
> 
> [1] In particular, the parser requires stars at column 0 to be quoted
> by a comma when they do not define a headline.
> 
> [2] It also means that only headlines and sections can be recognized
> just by looking at the beginning of the line.
> 
> As a consequence, using `org-element-at-point' or
> `org-element-context' will move up to the parent headline, and parse
> top-down from there until the surrounding context is found.
> 
> 
> 
> Regards,
> 
> -- 
> Nicolas Goaziou
>
