woooooow, this is awesome Nicolas, thank you! - Carsten
On 7.3.2013, at 21:37, Nicolas Goaziou <n.goaz...@gmail.com> wrote:

> Hello,
>
> As discussed a few days ago, here is a document describing the complete
> Org syntax as read by the parser. I also added some comments. I am going
> to put the Org file on Worg, so anyone can update it and fix mistakes.
>
>                         ━━━━━━━━━━━━━━━━━━━━
>                          ORG SYNTAX (DRAFT)
>                         ━━━━━━━━━━━━━━━━━━━━
>
>
> Table of Contents
> ─────────────────
>
> 1 Headlines and Sections
> 2 Affiliated Keywords
> 3 Greater Elements
> .. 3.1 Greater Blocks
> .. 3.2 Drawers and Property Drawers
> .. 3.3 Dynamic Blocks
> .. 3.4 Footnote Definitions
> .. 3.5 Inlinetasks
> .. 3.6 Plain Lists and Items
> .. 3.7 Tables
> 4 Elements
> .. 4.1 Babel Call
> .. 4.2 Blocks
> .. 4.3 Clock, Diary Sexp and Planning
> .. 4.4 Comments
> .. 4.5 Fixed Width Areas
> .. 4.6 Horizontal Rules
> .. 4.7 Keywords
> .. 4.8 LaTeX Environments
> .. 4.9 Node Properties
> .. 4.10 Paragraphs
> .. 4.11 Table Rows
> 5 Objects
> .. 5.1 Entities and LaTeX Fragments
> .. 5.2 Export Snippets
> .. 5.3 Footnote References
> .. 5.4 Inline Babel Calls and Source Blocks
> .. 5.5 Line Breaks
> .. 5.6 Links
> .. 5.7 Macros
> .. 5.8 Targets and Radio Targets
> .. 5.9 Statistics Cookies
> .. 5.10 Subscript and Superscript
> .. 5.11 Table Cells
> .. 5.12 Timestamps
> .. 5.13 Text Markup
>
>
> This document describes Org syntax as it is currently read by its
> parser (Org Elements) and, therefore, by the export framework. It also
> includes a few comments on that syntax.
>
> A core concept in this syntax is that only headlines and sections are
> context-free[1][2]. Every other syntactical part only exists within
> specific environments.
>
> Three categories are used to classify these environments: “Greater
> elements”, “elements”, and “objects”, from the broadest scope to the
> narrowest.
>
> The paragraph is the unit of measurement. An element defines
> syntactical parts that are at the same level as a paragraph, i.e.
> which cannot contain or be included in a paragraph. An object is a part
> that could be included in an element. Greater elements are all parts
> that can contain an element.
>
> Empty lines belong to the largest element ending before them. For
> example, in a list, empty lines between items are part of the item
> before them, but empty lines at the end of a list belong to the
> plain list element.
>
> Unless specified otherwise, case is not significant.
>
>
> 1 Headlines and Sections
> ════════════════════════
>
> A headline is defined as:
>
> ╭────
> │ STARS KEYWORD PRIORITY TITLE TAGS
> ╰────
>
> STARS is a string starting at column 0 and containing at least one
> asterisk (and up to `org-inlinetask-min-level' if the `org-inlinetask'
> library is loaded). It’s the sole compulsory part of a headline.
>
> KEYWORD is a TODO keyword, which has to belong to the list defined in
> `org-todo-keywords'. Case is significant.
>
> PRIORITY is a priority cookie, i.e. a single letter preceded by a hash
> sign # and enclosed within square brackets. Case is significant.
>
> TITLE can be made of any character but a new line. It is, however,
> matched only after every other part has been matched.
>
> TAGS is made of words containing any alpha-numeric character,
> underscore, at sign, hash sign or percent sign, and separated with
> colons.
>
> Examples of valid headlines include:
>
> ╭────
> │ *
> │
> │ ** DONE
> │
> │ *** Some e-mail
> │
> │ **** TODO [#A] COMMENT Title :tag:a2%:
> ╰────
>
> If the first word appearing in the title is `org-comment-keyword', the
> headline will be considered as “commented”. If that first word is
> `org-quote-string', it will be considered as “quoted”. In both
> situations, case is significant.
>
> If its title is `org-footnote-section', it will be considered as
> a “footnote section”. Case is significant.
>
> If `org-archive-tag' is one of its tags, it will be considered as
> “archived”. Case is significant.
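
To make the headline pattern above concrete, here is a minimal Python sketch of how a line could be split into STARS, KEYWORD, PRIORITY, TITLE and TAGS. It is only an illustration, not the parser's code. The constants are assumed stand-ins for `org-todo-keywords', `org-comment-keyword', `org-quote-string' and `org-archive-tag', and `parse_headline' is a hypothetical helper name.

```python
import re

# Assumed stand-ins for `org-todo-keywords', `org-comment-keyword',
# `org-quote-string' and `org-archive-tag'; the real values are user options.
TODO_KEYWORDS = ("TODO", "DONE")
COMMENT_KEYWORD = "COMMENT"
QUOTE_STRING = "QUOTE"
ARCHIVE_TAG = "ARCHIVE"

STARS_RE = re.compile(r"(\*+)(?:[ \t]+(.*))?$")
PRIORITY_RE = re.compile(r"\[#([A-Za-z])\](?:[ \t]+|$)")
TAGS_RE = re.compile(r"(?:^|[ \t])(:[\w@#%]+(?::[\w@#%]+)*:)[ \t]*$")


def parse_headline(line):
    """Split LINE into headline parts, or return None if it is not one."""
    m = STARS_RE.match(line)
    if m is None:
        return None
    rest = (m.group(2) or "").strip()
    info = {"level": len(m.group(1)), "keyword": None, "priority": None}

    # KEYWORD: first word, compared case-sensitively.
    words = rest.split(None, 1)
    if words and words[0] in TODO_KEYWORDS:
        info["keyword"] = words[0]
        rest = words[1] if len(words) > 1 else ""

    # PRIORITY: a cookie such as [#A].
    pm = PRIORITY_RE.match(rest)
    if pm:
        info["priority"] = pm.group(1)
        rest = rest[pm.end():]

    # TAGS: colon-separated words at the end of the line.
    tags = []
    tm = TAGS_RE.search(rest)
    if tm:
        tags = tm.group(1).strip(":").split(":")
        rest = rest[:tm.start()]

    # TITLE: whatever is left once every other part has been matched.
    title = rest.strip()
    first_word = title.split(None, 1)[0] if title else ""
    info.update(title=title, tags=tags,
                commented=first_word == COMMENT_KEYWORD,
                quoted=first_word == QUOTE_STRING,
                archived=ARCHIVE_TAG in tags)
    return info
```

Matching the title last mirrors the remark that TITLE is only matched once every other part has been matched. Calling `parse_headline' on the last example headline should report level 4, keyword TODO, priority A, a commented title and the tags "tag" and "a2%".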
>
> A headline contains directly at most one section, followed by any
> number of headlines. Only a section can contain another section.
>
> A section contains directly any greater element or element. Only
> a headline can contain a section. As an exception, text before the
> first headline in the document also belongs to a section.
>
> If a quoted headline contains a section, the latter will be considered
> as a “quote section”.
>
> As an example, consider the following document:
>
> ╭────
> │ An introduction.
> │
> │ * A Headline
> │
> │ Some text.
> │
> │ ** Sub-Topic 1
> │
> │ ** Sub-Topic 2
> │
> │ *** Additional entry
> │
> │ ** QUOTE Another Sub-Topic
> │
> │ Some other text.
> ╰────
>
> Its internal structure could be summarized as:
>
> ╭────
> │ (document
> │  (section)
> │  (headline
> │   (section)
> │   (headline)
> │   (headline
> │    (headline))
> │   (headline
> │    (quote-section))))
> ╰────
>
>
> 2 Affiliated Keywords
> ═════════════════════
>
> With the exception of [inlinetasks], [items], [planning], [clocks],
> [node properties] and [table rows], every other element type can be
> assigned attributes.
>
> This is done by adding specific keywords, named “affiliated keywords”,
> just above the element considered, no blank line allowed.
>
> Affiliated keywords are built upon one of the following patterns:
> “#+KEY: VALUE”, “#+KEY[OPTIONAL]: VALUE” or “#+ATTR_BACKEND: VALUE”.
>
> KEY is one of the strings “CAPTION”, “HEADER”, “NAME”, “PLOT” or
> “RESULTS”.
>
> BACKEND is a string made of alpha-numeric characters, hyphens or
> underscores.
>
> OPTIONAL and VALUE can contain any character but a new line. Only
> keywords in `org-element-dual-keywords' can have an optional value.
>
> An affiliated keyword can appear on multiple lines if KEY belongs to
> `org-element-multiple-keywords' or if its pattern is “#+ATTR_BACKEND:
> VALUE”.
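
The three affiliated keyword patterns above can be sketched with a single regular expression. This is a rough Python illustration under stated assumptions, not the parser's implementation: `DUAL_KEYS' is an assumed value for `org-element-dual-keywords', leading indentation is assumed to be allowed, and `parse_affiliated' is a hypothetical helper name.

```python
import re

KEYS = ("CAPTION", "HEADER", "NAME", "PLOT", "RESULTS")
# Assumed contents of `org-element-dual-keywords'.
DUAL_KEYS = ("CAPTION", "RESULTS")

AFFILIATED_RE = re.compile(
    r"^[ \t]*#\+(?:"                      # indentation allowed (assumption)
    r"(?P<key>" + "|".join(KEYS) + r")"   # "#+KEY: VALUE"
    r"(?:\[(?P<optional>[^\n]*?)\])?"     # "#+KEY[OPTIONAL]: VALUE"
    r"|ATTR_(?P<backend>[A-Za-z0-9_-]+)"  # "#+ATTR_BACKEND: VALUE"
    r"):[ \t]*(?P<value>[^\n]*)$",
    re.IGNORECASE,                        # case is not significant
)


def parse_affiliated(line):
    """Return (KEY, OPTIONAL, VALUE), or None if LINE does not match."""
    m = AFFILIATED_RE.match(line)
    if m is None:
        return None
    if m.group("backend") is not None:
        return ("ATTR_" + m.group("backend").upper(), None, m.group("value"))
    key = m.group("key").upper()
    optional = m.group("optional")
    # Only dual keywords can have an optional value.
    if optional is not None and key not in DUAL_KEYS:
        return None
    return (key, optional, m.group("value"))
```

For instance, `parse_affiliated("#+CAPTION[Short]: Long caption")' should yield the key, the optional value and the value, while an ordinary keyword such as "#+TITLE: x" is rejected.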
>
> Affiliated keywords whose KEY belongs to `org-element-parsed-keywords'
> can contain objects in their value and their optional value, if
> applicable.
>
>
> [inlinetasks] See section 3.5
>
> [items] See section 3.6
>
> [planning] See section 4.3
>
> [clocks] See section 4.3
>
> [node properties] See section 4.9
>
> [table rows] See section 4.11
>
>
> 3 Greater Elements
> ══════════════════
>
> Unless specified otherwise, greater elements can contain directly any
> other element or greater element except:
>
> • elements of their own type,
> • [node properties], which can only be found in [property drawers],
> • [items], which can only be found in [plain lists].
>
>
> [node properties] See section 4.9
>
> [property drawers] See section 3.2
>
> [items] See section 3.6
>
> [plain lists] See section 3.6
>
>
> 3.1 Greater Blocks
> ──────────────────
>
> Greater blocks consist of the following pattern:
>
> ╭────
> │ #+BEGIN_NAME PARAMETERS
> │ CONTENTS
> │ #+END_NAME
> ╰────
>
> NAME can contain any non-whitespace character.
>
> PARAMETERS can contain any character, and can be omitted.
>
> If NAME is “CENTER”, it will be a “center block”. If it is “QUOTE”,
> it will be a “quote block”.
>
> If the block is neither a center block, nor a quote block, nor a [block
> element], it will be a “special block”.
>
> CONTENTS can contain any element except another greater block of the
> same type.
>
>
> [block element] See section 4.2
>
>
> 3.2 Drawers and Property Drawers
> ────────────────────────────────
>
> Pattern for drawers is:
>
> ╭────
> │ :NAME:
> │ CONTENTS
> │ :END:
> ╰────
>
> NAME has to either be “PROPERTIES” or belong to `org-drawers' list.
>
> If NAME is “PROPERTIES”, the drawer will become a “property drawer”.
>
> In a property drawer, CONTENTS can only contain [node property]
> elements. Otherwise it can contain any element but another drawer or
> property drawer.
>
> ―――――
>
> It would be nice if users didn’t have to register drawer names in
> `org-drawers' (or through the `#+DRAWERS:' keyword) before using them.
> Anything starting with `^[ \t]*:\w+:[ \t]$' and ending with
> `^[ \t]*:END:[ \t]$' could be considered as a drawer. — ngz
>
>
> [node property] See section 4.9
>
>
> 3.3 Dynamic Blocks
> ──────────────────
>
> Pattern for dynamic blocks is:
>
> ╭────
> │ #+BEGIN: NAME PARAMETERS
> │ CONTENTS
> │ #+END:
> ╰────
>
> NAME cannot contain any whitespace character.
>
> PARAMETERS can contain any character and can be omitted.
>
>
> 3.4 Footnote Definitions
> ────────────────────────
>
> Pattern for footnote definitions is:
>
> ╭────
> │ [LABEL] CONTENTS
> ╰────
>
> It must start at column 0.
>
> LABEL is either a number or follows the pattern “fn:WORD”, where WORD
> can contain any word-constituent character, hyphens and underscore
> characters.
>
> CONTENTS can contain any element except another footnote definition.
> It ends at the next footnote definition, the next headline, two
> consecutive empty lines or the end of buffer.
>
>
> 3.5 Inlinetasks
> ───────────────
>
> Inlinetasks are defined by `org-inlinetask-min-level' contiguous
> asterisk characters starting at column 0, followed by a whitespace
> character.
>
> Optionally, inlinetasks can be ended with a string made of
> `org-inlinetask-min-level' contiguous asterisk characters starting at
> column 0, followed by a space and the “END” string.
>
> Inlinetasks are recognized only after the `org-inlinetask' library is
> loaded.
>
>
> 3.6 Plain Lists and Items
> ─────────────────────────
>
> Items are defined by a line starting with the following pattern:
> “BULLET COUNTER-SET CHECK-BOX TAG”, in which only BULLET is mandatory.
>
> BULLET is either an asterisk, a hyphen, a plus sign character or
> follows either the pattern “COUNTER.” or “COUNTER)”. In any case,
> BULLET is followed by a whitespace character or line ending.
>
> COUNTER can be a number or a single letter.
>
> COUNTER-SET follows the pattern [@COUNTER].
>
> CHECK-BOX is either a single whitespace character, a “X” character or
> a hyphen, enclosed within square brackets.
>
> TAG follows “TAG-TEXT ::” pattern, where TAG-TEXT can contain any
> character but a new line.
>
> An item ends before the next item, the first line indented less than
> or equal to its starting line, or two consecutive empty lines.
> Indentation of lines within other greater elements does not count,
> nor do inlinetask boundaries.
>
> A plain list is a set of consecutive items of the same indentation.
> It can only directly contain items.
>
> If the first item in a plain list has a counter in its bullet, the
> plain list will be an “ordered plain-list”. If it contains a tag, it
> will be a “descriptive list”. Otherwise, it will be an “unordered list”.
> List types are mutually exclusive.
>
> For example, consider the following excerpt of an Org document:
>
> ╭────
> │ 1. item 1
> │ 2. [X] item 2
> │    - some tag :: item 2.1
> ╰────
>
> Its internal structure is as follows:
>
> ╭────
> │ (ordered-plain-list
> │  (item)
> │  (item
> │   (descriptive-plain-list
> │    (item))))
> ╰────
>
>
> 3.7 Tables
> ──────────
>
> Tables start at lines beginning with either a vertical bar or the “+-”
> string followed by plus or minus signs only, assuming they are not
> preceded with lines of the same type. These lines can be indented.
>
> A table starting with a vertical bar has “org” type. Otherwise it has
> “table.el” type.
>
> Org tables end at the first line not starting with a vertical bar.
> Table.el tables end at the first line not starting with either
> a vertical bar or a plus sign. Such lines can be indented.
>
> An org table can only contain table rows. A table.el table does not
> contain anything.
>
>
> 4 Elements
> ══════════
>
> Elements cannot contain any other element.
>
> Only [keywords] whose name belongs to
> `org-element-document-properties', [verse blocks], [paragraphs] and
> [table rows] can contain objects.
>
>
> [keywords] See section 4.7
>
> [verse blocks] See section 4.2
>
> [paragraphs] See section 4.10
>
> [table rows] See section 4.11
>
>
> 4.1 Babel Call
> ──────────────
>
> Pattern for babel calls is:
>
> ╭────
> │ #+CALL: VALUE
> ╰────
>
> VALUE is optional. It can contain any character but a new line.
>
>
> 4.2 Blocks
> ──────────
>
> Like [greater blocks], pattern for blocks is:
>
> ╭────
> │ #+BEGIN_NAME DATA
> │ CONTENTS
> │ #+END_NAME
> ╰────
>
> NAME cannot contain any whitespace character.
>
> If NAME is “COMMENT”, it will be a “comment block”. If it is
> “EXAMPLE”, it will be an “example block”. If it is “SRC”, it will be
> a “source block”. If it is “VERSE”, it will be a “verse block”.
>
> If NAME is a string matching the name of any export back-end loaded,
> the block will be an “export block”.
>
> DATA can contain any character but a new line. It can be omitted,
> unless the block is a “source block”. In this case, it must follow
> the pattern “LANGUAGE SWITCHES ARGUMENTS”, where SWITCHES and
> ARGUMENTS are optional.
>
> LANGUAGE cannot contain any whitespace character.
>
> SWITCHES is made of any number of “SWITCH” patterns, separated by
> whitespace.
>
> A SWITCH pattern is either “-l "FORMAT"”, where FORMAT can contain any
> character but a double quote and a new line, “-S” or “+S”, where
> S stands for a single letter.
>
> ARGUMENTS can contain any character but a new line.
>
> CONTENTS can contain any character, including new lines. However, it
> will only contain Org objects if the block is a verse block.
> Otherwise, contents will not be parsed.
>
>
> [greater blocks] See section 3.1
>
>
> 4.3 Clock, Diary Sexp and Planning
> ──────────────────────────────────
>
> A clock follows the pattern:
>
> ╭────
> │ CLOCK: TIMESTAMP DURATION
> ╰────
>
> Both TIMESTAMP and DURATION are optional.
>
> TIMESTAMP is a [timestamp] object.
>
> DURATION follows the pattern:
>
> ╭────
> │ => HH:MM
> ╰────
>
> HH is a number containing any number of digits. MM is a two-digit
> number.
>
> A diary sexp is a line starting at column 0 with “%%(” string. It can
> then contain any character besides a new line.
>
> A planning is a line filled with at most three INFO parts, where
> each INFO part follows the pattern:
>
> ╭────
> │ KEYWORD: TIMESTAMP
> ╰────
>
> KEYWORD is a string among `org-deadline-string',
> `org-scheduled-string' and `org-closed-string'. TIMESTAMP is
> a [timestamp] object.
>
> Even though a planning element can exist anywhere in a section or
> a greater element, it will only affect the headline containing the
> section if it is put on the line following that headline.
>
>
> [timestamp] See section 5.12
>
>
> 4.4 Comments
> ────────────
>
> A “comment line” starts with a hash sign and a whitespace character
> or an end of line.
>
> Comments can contain any number of consecutive comment lines.
>
>
> 4.5 Fixed Width Areas
> ─────────────────────
>
> A “fixed-width line” starts with a colon character and a whitespace or
> an end of line.
>
> Fixed width areas can contain any number of consecutive fixed-width
> lines.
>
>
> 4.6 Horizontal Rules
> ────────────────────
>
> A horizontal rule is a line made of at least 5 consecutive hyphens.
> It can be indented.
>
>
> 4.7 Keywords
> ────────────
>
> Keywords follow the syntax:
>
> ╭────
> │ #+KEY: VALUE
> ╰────
>
> KEY can contain any non-whitespace character, but it cannot be equal
> to “CALL” or any affiliated keyword.
>
> VALUE can contain any character except a new line.
>
> If KEY belongs to `org-element-document-properties', VALUE can contain
> objects.
>
>
> 4.8 LaTeX Environments
> ──────────────────────
>
> Pattern for LaTeX environments is:
>
> ╭────
> │ \begin{NAME}
> │ CONTENTS
> │ \end{NAME}
> ╰────
>
> NAME is made of alpha-numeric characters and may end with an
> asterisk.
>
> CONTENTS can contain anything but the “\end{NAME}” string.
>
>
> 4.9 Node Properties
> ───────────────────
>
> Pattern for node properties is:
>
> ╭────
> │ :PROPERTY: VALUE
> ╰────
>
> PROPERTY can contain any non-whitespace character. VALUE can contain
> any character but a new line.
>
> Node properties can only exist in [property drawers].
>
>
> [property drawers] See section 3.2
>
>
> 4.10 Paragraphs
> ───────────────
>
> Paragraphs are the default element, which means that any unrecognized
> context is a paragraph.
>
> Empty lines and other elements end paragraphs.
>
> Paragraphs can contain every type of object.
>
>
> 4.11 Table Rows
> ───────────────
>
> A table row is made of either a vertical bar and any number of
> [table cells], or a vertical bar followed by a hyphen.
>
> In the first case the table row has the “standard” type. In the
> second case, it has the “rule” type.
>
> Table rows can only exist in [tables].
>
>
> [table cells] See section 5.11
>
> [tables] See section 3.7
>
>
> 5 Objects
> ═════════
>
> Objects can only be found in the following locations:
>
> • [affiliated keywords] defined in `org-element-parsed-keywords',
> • [document properties],
> • [headline] titles,
> • [inlinetask] titles,
> • [item] tags,
> • [paragraphs],
> • [table cells],
> • [table rows], which can only contain table cell objects,
> • [verse blocks].
>
> Most objects cannot contain objects. Those which can will be
> specified.
>
>
> [affiliated keywords] See section 2
>
> [document properties] See section 4.7
>
> [headline] See section 1
>
> [inlinetask] See section 3.5
>
> [item] See section 3.6
>
> [paragraphs] See section 4.10
>
> [table cells] See section 5.11
>
> [table rows] See section 4.11
>
> [verse blocks] See section 4.2
>
>
> 5.1 Entities and LaTeX Fragments
> ────────────────────────────────
>
> An entity follows the pattern:
>
> ╭────
> │ \NAME POST
> ╰────
>
> where NAME has a valid association in either `org-entities' or
> `org-entities-user'.
>
> POST is the end of line, “{}” string, or a non-alphabetical character.
> It isn’t separated from NAME by a whitespace character.
>
> A LaTeX fragment can follow multiple patterns:
>
> ╭────
> │ \NAME POST
> │ \(CONTENTS\)
> │ \[CONTENTS\]
> │ $$CONTENTS$$
> │ PRE$CHAR$POST
> │ PRE$BORDER1 BODY BORDER2$
> ╰────
>
> NAME contains alphabetical characters only and must not have an
> association in either `org-entities' or `org-entities-user'.
>
> POST is the same as for entities.
>
> CONTENTS can contain any character but cannot contain “\)” in the
> second template or “\]” in the third one.
>
> PRE is either the beginning of line or a character different from `$'.
>
> CHAR is a non-whitespace character different from `.', `,', `?', `;',
> `’' or a double quote.
>
> POST is any of `-', `.', `,', `?', `;', `:', `’', a double quote,
> a whitespace character and the end of line.
>
> BORDER1 is a non-whitespace character different from `,', `;', `.'
> and `$'.
>
> BODY can contain any character except `$', and may not span over
> more than 3 lines.
>
> BORDER2 is any non-whitespace character different from `,', `.' and
> `$'.
>
> ―――――
>
>     It would introduce incompatibilities with previous Org
>     versions, but support for “$…$” (and for symmetry,
>     `$$...$$') constructs ought to be removed.
>
>     They are slow to parse, fragile, redundant, cause false
>     positives and do not look good in LaTeX output anyway.
>     Even the LaTeX community suggests using `\(...\)' over
>     `$...$'. — ngz
>
>
> 5.2 Export Snippets
> ───────────────────
>
> Pattern for export snippets is:
>
> ╭────
> │ @@NAME:VALUE@@
> ╰────
>
> NAME can contain any alpha-numeric character and hyphens.
>
> VALUE can contain anything but “@@” string.
>
>
> 5.3 Footnote References
> ───────────────────────
>
> There are four patterns for footnote references:
>
> ╭────
> │ [MARK]
> │ [fn:LABEL]
> │ [fn:LABEL:DEFINITION]
> │ [fn::DEFINITION]
> ╰────
>
> MARK is a number.
>
> LABEL can contain any word-constituent character, hyphens and
> underscores.
>
> DEFINITION can contain any character, though opening and closing
> square brackets must be balanced in it. It can contain any object
> encountered in a paragraph, even other footnote references.
>
> If the reference follows the third pattern, it is called an “inline
> footnote”. If it follows the fourth one, i.e. if LABEL is omitted, it
> is an “anonymous footnote”.
>
>
> 5.4 Inline Babel Calls and Source Blocks
> ────────────────────────────────────────
>
> Inline Babel calls follow any of the following patterns:
>
> ╭────
> │ call_NAME(ARGUMENTS)
> │ call_NAME[HEADER](ARGUMENTS)[HEADER]
> ╰────
>
> NAME can contain any character besides `(', `)' and “\n”.
>
> HEADER can contain any character besides `]' and “\n”.
>
> ARGUMENTS can contain any character besides `)' and “\n”.
>
> Inline source blocks follow any of the following patterns:
>
> ╭────
> │ src_LANG{BODY}
> │ src_LANG[OPTIONS]{BODY}
> ╰────
>
> LANG can contain any non-whitespace character.
>
> OPTIONS and BODY can contain any character but “\n”.
>
>
> 5.5 Line Breaks
> ───────────────
>
> A line break consists of the “\\SPACE” pattern at the end of an
> otherwise non-empty line.
>
> SPACE can contain any number of tabs and spaces, including 0.
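
The line break rule of section 5.5 is small enough to sketch directly. The Python below is a hedged illustration, not the parser's code. `find_line_break' is a hypothetical helper name, and treating a line holding only blanks before the backslashes as "empty" is an assumption about what "otherwise non-empty" means.

```python
import re

# Two backslashes followed by any number of tabs and spaces, at end of line.
LINE_BREAK_RE = re.compile(r"\\\\[ \t]*$")


def find_line_break(line):
    """Return the index of the line break in LINE, or None if there is none."""
    m = LINE_BREAK_RE.search(line)
    # The line must contain something else before the two backslashes.
    if m is None or line[:m.start()].strip() == "":
        return None
    return m.start()
```

A line made only of the two backslashes is rejected, as is a pair of backslashes followed by more text on the same line.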
>
>
> 5.6 Links
> ─────────
>
> There are 4 major types of links:
>
> ╭────
> │ RADIO                     ("radio" link)
> │ <PROTOCOL:PATH>           ("angle" link)
> │ PRE PROTOCOL:PATH2 POST   ("plain" link)
> │ [[PATH3]DESCRIPTION]      ("regular" link)
> ╰────
>
> RADIO is a string matched by some [radio target]. It can contain
> [entities], [latex fragments], [subscript] and [superscript] only.
>
> PROTOCOL is a string among `org-link-types'.
>
> PATH can contain any character but `]', `<', `>' and `\n'.
>
> PRE and POST are non-word-constituent characters. They can be,
> respectively, the beginning or the end of a line.
>
> PATH2 can contain any non-whitespace character except `(', `)', `<'
> and `>'. It must end with a word-constituent character, or any
> non-whitespace non-punctuation character followed by `/'.
>
> DESCRIPTION must be enclosed within square brackets. It can contain
> any character but square brackets. Object-wise, it can contain any
> object found in a paragraph except a [footnote reference], a [radio
> target] and a [line break]. It cannot contain another link either,
> unless it is a plain link.
>
> DESCRIPTION is optional.
>
> PATH3 is built according to the following patterns:
>
> ╭────
> │ FILENAME         ("file" type)
> │ PROTOCOL:PATH4   ("PROTOCOL" type)
> │ id:ID            ("id" type)
> │ #CUSTOM-ID       ("custom-id" type)
> │ (CODEREF)        ("coderef" type)
> │ FUZZY            ("fuzzy" type)
> ╰────
>
> FILENAME is a file name, either absolute or relative.
>
> PATH4 can contain any character besides square brackets.
>
> ID is made of hexadecimal numbers separated with hyphens.
>
> PATH4, CUSTOM-ID, CODEREF and FUZZY can contain any character besides
> square brackets.
>
> ―――――
>
>     I suggest removing angle links. If one needs spaces in
>     PATH, she can use standard link syntax instead.
>
>     I also suggest removing the `org-link-types' dependency in
>     PROTOCOL and matching `[a-zA-Z]' instead, for portability.
>     — ngz
>
>
> [radio target] See section 5.8
>
> [entities] See section 5.1
>
> [latex fragments] See section 5.1
>
> [subscript] See section 5.10
>
> [superscript] See section 5.10
>
> [footnote reference] See section 5.3
>
> [line break] See section 5.5
>
>
> 5.7 Macros
> ──────────
>
> Macros follow the pattern:
>
> ╭────
> │ {{{NAME(ARGUMENTS)}}}
> ╰────
>
> NAME must start with a letter and can be followed by any number of
> alpha-numeric characters, hyphens and underscores.
>
> ARGUMENTS can contain anything but “}}}” string. Values within
> ARGUMENTS are separated by commas. Non-separating commas have to be
> escaped with a backslash character.
>
>
> 5.8 Targets and Radio Targets
> ─────────────────────────────
>
> Radio targets follow the pattern:
>
> ╭────
> │ <<<CONTENTS>>>
> ╰────
>
> CONTENTS can be any character besides `<', `>' and “\n”. As far as
> objects go, it can contain [entities], [latex fragments], [subscript]
> and [superscript] only.
>
> Targets follow the pattern:
>
> ╭────
> │ <<TARGET>>
> ╰────
>
> TARGET can contain any character besides `<', `>' and “\n”. It cannot
> contain any object.
>
>
> [entities] See section 5.1
>
> [latex fragments] See section 5.1
>
> [subscript] See section 5.10
>
> [superscript] See section 5.10
>
>
> 5.9 Statistics Cookies
> ──────────────────────
>
> Statistics cookies follow either pattern:
>
> ╭────
> │ [PERCENT%]
> │ [NUM1/NUM2]
> ╰────
>
> PERCENT, NUM1 and NUM2 are numbers or the empty string.
>
>
> 5.10 Subscript and Superscript
> ──────────────────────────────
>
> Pattern for subscript is:
>
> ╭────
> │ CHAR_SCRIPT
> ╰────
>
> Pattern for superscript is:
>
> ╭────
> │ CHAR^SCRIPT
> ╰────
>
> CHAR is any non-whitespace character.
>
> SCRIPT can be `*', a string made of word-constituent characters
> possibly preceded by a plus or a minus sign, or an expression enclosed
> in parentheses (resp. curly brackets) containing balanced parentheses
> (resp. curly brackets).
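
Going back to the macro pattern of section 5.7, the comma rule for ARGUMENTS can be illustrated with a short Python sketch. It is only one possible reading of the description, not the parser's code. `parse_macro' and `split_arguments' are hypothetical names, and keeping each value verbatim (no whitespace trimming) is an assumption.

```python
import re

# {{{NAME(ARGUMENTS)}}}, where ARGUMENTS cannot contain "}}}".
MACRO_RE = re.compile(
    r"\{\{\{([A-Za-z][A-Za-z0-9_-]*)\(((?:(?!\}\}\}).)*)\)\}\}\}",
    re.DOTALL,
)


def split_arguments(arguments):
    """Split ARGUMENTS on commas, honouring backslash-escaped commas."""
    if arguments == "":
        return []
    values, current = [], []
    i = 0
    while i < len(arguments):
        if arguments[i] == "\\" and arguments[i + 1:i + 2] == ",":
            current.append(",")   # escaped comma: part of the value
            i += 2
        elif arguments[i] == ",":
            values.append("".join(current))   # separating comma
            current = []
            i += 1
        else:
            current.append(arguments[i])
            i += 1
    values.append("".join(current))
    return values


def parse_macro(text):
    """Return (NAME, [values]) for the first macro in TEXT, or None."""
    m = MACRO_RE.search(text)
    if m is None:
        return None
    return m.group(1), split_arguments(m.group(2))
```

With this reading, `{{{poem(red\, blue,green)}}}' has two values, "red, blue" and "green".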
>
>
> 5.11 Table Cells
> ────────────────
>
> Table cells follow the pattern:
>
> ╭────
> │ CONTENTS|
> ╰────
>
> CONTENTS can contain any character except a vertical bar.
>
>
> 5.12 Timestamps
> ───────────────
>
> There are seven possible patterns for timestamps:
>
> ╭────
> │ <%%(SEXP)>                                  (diary)
> │ <DATE TIME REPEATER>                        (active)
> │ [DATE TIME REPEATER]                        (inactive)
> │ <DATE TIME REPEATER>--<DATE TIME REPEATER>  (active range)
> │ <DATE TIME-TIME REPEATER>                   (active range)
> │ [DATE TIME REPEATER]--[DATE TIME REPEATER]  (inactive range)
> │ [DATE TIME-TIME REPEATER]                   (inactive range)
> ╰────
>
> SEXP can contain any character except `>' and `\n'.
>
> DATE follows the pattern:
>
> ╭────
> │ YYYY-MM-DD DAYNAME
> ╰────
>
> Y, M and D are digits. DAYNAME can contain any non-whitespace
> character besides `+', `-', `]', `>', a digit or `\n'.
>
> TIME follows the pattern “H:MM”. H can be one or two digits long and
> can start with 0.
>
> REPEATER follows the pattern:
>
> ╭────
> │ MARK VALUE UNIT
> ╰────
>
> MARK is `+' (cumulate type), `++' (catch-up type) or `.+' (restart
> type).
>
> VALUE is a number.
>
> UNIT is a character among `h' (hour), `d' (day), `w' (week), `m'
> (month), `y' (year).
>
> MARK, VALUE and UNIT are not separated by whitespace characters.
>
>
> 5.13 Text Markup
> ────────────────
>
> Text markup follows the pattern:
>
> ╭────
> │ PRE MARKER CONTENTS MARKER POST
> ╰────
>
> PRE is a whitespace character, `(', `{', `’' or a double quote. It can
> also be a beginning of line.
>
> MARKER is a character among `*' (bold), `=' (verbatim), `/' (italic),
> `+' (strike-through), `_' (underline), `~' (code).
>
> CONTENTS is a string following the pattern:
>
> ╭────
> │ BORDER BODY BORDER
> ╰────
>
> BORDER can be any non-whitespace character except `,', `’' or
> a double quote.
>
> BODY can contain any character but may not span over more than
> 3 lines.
>
> BORDER and BODY are not separated by whitespace.
>
> CONTENTS can contain any object encountered in a paragraph when markup
> is “bold”, “italic”, “strike-through” or “underline”.
>
> POST is a whitespace character, `-', `.', `,', `:', `!', `?', `’',
> `)', `}' or a double quote. It can also be an end of line.
>
> PRE, MARKER, CONTENTS, MARKER and POST are not separated by whitespace
> characters.
>
> ―――――
>
>     All of this is wrong if `org-emphasis-regexp-components'
>     or `org-emphasis-alist' are modified.
>
>     This should really be simplified and made persistent
>     (i.e. no defcustom allowed). Otherwise, portability and
>     parsing are jokes.
>
>     Also, CONTENTS should be anything within code and verbatim
>     emphasis, by definition. — ngz
>
>
>
> Footnotes
> ─────────
>
> [1] In particular, the parser requires stars at column 0 to be quoted
> by a comma when they do not define a headline.
>
> [2] It also means that only headlines and sections can be recognized
> just by looking at the beginning of the line.
>
> As a consequence, using `org-element-at-point' or
> `org-element-context' will move up to the parent headline, and parse
> top-down from there until the context around point is found.
>
>
>
> Regards,
>
> --
> Nicolas Goaziou
>