Re: Bug: inconsistent escaping of coderef regexp
Hi Nicolas,

I've included the simplest patch I could come up with for the divergence in behavior between org-babel-tangle-single-block and org-link-search. I think there are two new threads that I need to create. One is about how to make it possible to specify what should be removed along with the coderef (i.e. the coderef prefix); the other is the addition of header arguments that provide the same functionality as switches.

Best,
Tom

> This is already conflating the two. I'd like to solve the issue at hand
> without having header args interfere at all.
>
> This can happen later, after a discussion on the ML.

Ok. I've included the simplest version of the fix, which is to use org-src-coderef-regexp in org-babel-tangle-single-block.

> Would you mind answering my questions first? I still don't follow you
> about the coderef prefix/regexp.

https://code.orgmode.org/bzg/org-mode/src/2d78ea57cfad1ddc3e993c949daf117b76315170/lisp/org-src.el#L882

That line defines a hardcoded regular expression for matching coderefs. The coderef prefix is the first =[ \t]*= and the coderef regexp is the fully formatted version of that format string. Neither of those can currently be specified by the user. The user should not be able to specify the coderef regexp, because it is too easy to specify a regexp that will not work correctly and because the format string is needed to make org-link-search work for named coderefs (otherwise you wind up trying to replace .+ in the coderef regexp, which is a nightmare). The coderef prefix is something that should probably be configurable by the user so that empty comments are not left in the file.
I also looked into detecting the comment character for the language in question, but that is significantly more difficult even using (with-temp-buffer (funcall lang-mode) comment-start) because not all languages have sane comment start values and comment-start is not complete, so we would need a way to manually specify what to exclude anyway.

From c30913da6b1c8d6be3670a59ae867df019505af3 Mon Sep 17 00:00:00 2001
From: Tom Gillespie
Date: Wed, 7 Apr 2021 12:29:01 -0700
Subject: [PATCH] lisp/ob-tangle.el: Fix coderef removal during tangling

* lisp/ob-tangle.el (org-babel-tangle-single-block): Regularize
behavior when removing coderefs during tangling. This fixes an issue
where trailing whitespace would be retained when coderefs were removed
for tangling.
---
 lisp/ob-tangle.el | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/lisp/ob-tangle.el b/lisp/ob-tangle.el
index aa0373ab8..4c0c3132d 100644
--- a/lisp/ob-tangle.el
+++ b/lisp/ob-tangle.el
@@ -414,9 +414,8 @@ non-nil, return the full association list to be used by
          (src-lang (nth 0 info))
          (params (nth 2 info))
          (extra (nth 3 info))
-         (cref-fmt (or (and (string-match "-l \"\\(.+\\)\"" extra)
-                            (match-string 1 extra))
-                       org-coderef-label-format))
+         (coderef (nth 6 info))
+         (cref-regexp (org-src-coderef-regexp coderef))
          (link (let ((l (org-no-properties (org-store-link nil
                  (and (string-match org-link-bracket-re l)
                       (match-string 1 l
@@ -445,8 +444,7 @@ non-nil, return the full association list to be used by
                 (funcall assignments-cmd params))
               (when (string-match "-r" extra)
                 (goto-char (point-min))
-                (while (re-search-forward
-                        (replace-regexp-in-string "%s" ".+" cref-fmt) nil t)
+                (while (re-search-forward cref-regexp nil t)
                   (replace-match "")))
               (run-hooks 'org-babel-tangle-body-hook)
               (buffer-string
--
2.26.3
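As a rough illustration of why the regexp should be built from the format string rather than by substituting ".+" into it, here is a minimal, self-contained sketch. The helpers my/coderef-regexp and my/strip-coderefs are hypothetical names invented for this note; they are not the Org functions, and the label pattern is an assumption. The point is only that the format is quoted first, the %s placeholder becomes a label pattern, and the leading whitespace is removed together with the coderef.

```elisp
;; Hypothetical helpers, not part of Org. They mirror the idea described
;; above: quote the format string, then substitute a label pattern for %s.
(defun my/coderef-regexp (fmt)
  "Return a regexp matching a coderef written with format string FMT.
The optional whitespace before the coderef is part of the match."
  (concat "[ \t]*"
          (replace-regexp-in-string
           "%s" "[-a-zA-Z0-9_]+" (regexp-quote fmt) t t)
          "[ \t]*$"))

(defun my/strip-coderefs (body fmt)
  "Return BODY with coderefs in format FMT removed.
Whitespace before each coderef is removed too, so no trailing
blanks are left behind on the line."
  (with-temp-buffer
    (insert body)
    (goto-char (point-min))
    (let ((re (my/coderef-regexp fmt)))
      (while (re-search-forward re nil t)
        (replace-match "")))
    (buffer-string)))

;; Example call with the default-style "(ref:%s)" format.
(my/strip-coderefs "(+ 1 2)   (ref:sum)\n(message \"hi\")\n" "(ref:%s)")
```

Because the prefix is folded into the match, the line that carried the coderef ends right after the code, which is the trailing-whitespace behavior the commit message describes.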
Re: [Patch] to correctly sort the items with emphasis marks in a list
Hi Greg, seq cannot be used because it is not available in older versions of emacs that org still supports. When support for those older versions is dropped then seq could be used. Best, Tom
Re: Concerns about community contributor support
Hi Tim, David, and Gustav,

I am fairly certain that with only a few exceptions it is possible to specify a context free grammar for org syntax, followed by a second pass that deals specifically with markup and a few other forms, notably the reassembly of things like plain lists. This is possible because most org constructs are line oriented.

Just a note that the linked parser.rkt [0] is indeed a BNF describing org syntax in the same style as a bison/yacc grammar.

One of the reasons why I set out to work on this was precisely so that there could be a reference that could be consulted by the community when questions about extended org come up. There are proposals for new syntax that appear on this list with terrifying frequency, and they are routinely shot down or simply ignored for good reason. However, it is hard to communicate that to enthusiastic contributors who have an immediate use case that they want to solve and share and who are unlikely to be aware of side effects. Having a grammar where such issues can be tested empirically should provide a significant safeguard while also making it easier for contributors to play with the grammar and see the issues.

In all my work on the grammar I have found maybe 2 or 3 places where the grammar could be "extended", but it isn't so much extended as it is regularized, where some parts of org already parse a more complex grammar while other very similar parts choose not to. Overall, not parsing certain forms in certain situations adds complexity rather than reducing it.

The situation for contribution is further complicated by the fact that the elisp implementation of org mode is internally inconsistent in its behavior with regard to the syntax, so great care has to be taken if someone tries to make an argument based on the behavior of one component.
All this to say that the need for a conservative approach to changes and extensions, combined with the internally inconsistent behavior of different parts of the elisp implementation, means that the introduction of new features is extremely difficult because it is hard to predict the consequences on other parts of org. Overcoming this is why I started working on the grammar, because in the absence of a formal spec for what org should do, it is very hard to make changes to what it is currently doing without having nasty side effects.

Best!
Tom

0. https://github.com/tgbugs/laundry/blob/next/laundry/parser.rkt note the upcoming path change (which I will note in the original thread when it happens).

PS I'm planning to reply to the main thread as well. My short take is that finding a dedicated and responsive maintainer who can take over from Bastien is a high priority. The only other thing that might help is to have some way to track outstanding and closed patches, issues, etc. that is more accessible than trawling through years' worth of posts on this mailing list, but that is a can of worms that has already been shot down multiple times.
Re: [PATCH] ob-tangle.el: Speed up tangling
Hi Sébastien,

The temp -> rename approach is good, but you should probably use make-temp-file to create the file to reduce the risk of collisions/race conditions. For example, as (make-temp-file (concat file-name ".tangling")).

I think that the location of condition-case is ok, but I wonder what would happen if something were to fail before entering it? I think that only a subset of the files would be tangled, but they would all have their correct modes, so I think that is ok.

I also think that the message to the user should probably not be changed right now. While it might be useful for debugging, if someone is tangling to a large number of files then the filenames/paths are going to flood messages, so I would leave it out of this patch, and possibly submit it as another patch for a separate discussion.

Best!
Tom
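The temp -> rename suggestion can be sketched roughly as follows. This is only an illustration under stated assumptions: my/write-file-atomically is a hypothetical helper, not code from the patch under review, and it uses only standard file primitives (make-temp-file, write-region, rename-file).

```elisp
;; Hypothetical helper, not part of ob-tangle.el.
(defun my/write-file-atomically (file-name contents)
  "Write CONTENTS to FILE-NAME via a temporary file and a rename.
The temporary file is created by `make-temp-file', so its name is
unique and it exists before anything is written to it."
  (let ((tmp (make-temp-file (concat file-name ".tangling"))))
    (condition-case err
        (progn
          ;; A string as START means: write that string, ignore END.
          (write-region contents nil tmp nil 'silent)
          ;; Replace the target only after the write has succeeded.
          (rename-file tmp file-name t))
      (error
       ;; Clean up the temporary file, then re-signal the error.
       (when (file-exists-p tmp) (delete-file tmp))
       (signal (car err) (cdr err))))
    file-name))
```

With an absolute FILE-NAME the temporary file lands next to the target, so a failed write leaves the previous target untouched.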
Re: [PATCH] ob-tangle.el: Speed up tangling
Hi Sébastien,

Some comments while looking over this (will report back when I have tested it out as well).

This is a section of the ob export functionality that I have been looking for on and off for quite a while because it is responsible for some bad and insecure behavior. I think that some of your changes may have fixed/improved this as a side effect. I don't know whether it is worth doing anything about the issues in this patch, but since we are here, I think they are worth mentioning.

All of the issues that I'm aware of are related to what happens if tangling fails part way through the process.

First, your patch already fixes a major issue, which is that the modes of all files would not be set if any one of them failed to tangle.

Next, during the process the existing file is deleted prior to tangling, which means that it cannot be restored if tangling fails. It would be better if the old file were moved to a temporary location and then deleted on success or restored on failure. This likely requires wrapping the bits that can fail in unwind-protect and restoring on failure or fully deleting at the end on success.

The next issue is that setting the tangle mode should happen before the file is written: an empty file should be created, the mode should then be set, and the contents of the file should be written only after the mode has been set. This involves a bit of reordering of operations in lines 124-126 of your patch. This ordering of operations prevents security issues related to race conditions and potential errors being raised during write-region (though again, your changes already make the tangling code much more secure by setting the modes on each file immediately after writing, instead of how it works currently, where if any other block encounters an error then no modes are set).

Best!
Tom

On Sun, Apr 18, 2021 at 12:23 AM Sébastien Miquel wrote:
>
> Hi,
>
> The attached patch modifies the ~org-babel-tangle~ function to avoid a
> quadratic behavior in the number of blocks tangled to a single file.
>
> Tangling an org buffer with 200 blocks to 5 different files yields a
> 25 % speedup.
>
>
> * lisp/ob-tangle.el (org-babel-tangle-collect-blocks): Group
> collected blocks by tangled file name.
> (org-babel-tangle): Avoid quadratic behavior in number of blocks.
>
> --
> Sébastien Miquel
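The two suggestions in the reply (keep the old file until the new one is complete, and set the mode before writing any contents) could be combined along these lines. This is a sketch under assumptions, not the patch itself: my/replace-file-safely is a hypothetical helper and the use of the `excl' flag to create the empty file is one possible choice.

```elisp
;; Hypothetical helper, not part of ob-tangle.el.
(defun my/replace-file-safely (file-name contents mode)
  "Replace FILE-NAME with CONTENTS, setting MODE before writing.
An existing FILE-NAME is moved aside first and restored if any
step fails."
  (let ((backup (and (file-exists-p file-name)
                     (make-temp-file (concat file-name ".orig"))))
        (done nil))
    ;; Move the old file out of the way instead of deleting it.
    (when backup (rename-file file-name backup t))
    (unwind-protect
        (progn
          ;; 1. Create an empty file; `excl' errors if it already exists.
          (write-region "" nil file-name nil 'silent nil 'excl)
          ;; 2. Set the mode while the file is still empty.
          (set-file-modes file-name mode)
          ;; 3. Only now write the real contents.
          (write-region contents nil file-name nil 'silent)
          (setq done t))
      (when backup
        (if done
            (delete-file backup)                 ; success: drop the old copy
          (rename-file backup file-name t))))    ; failure: restore it
    file-name))
```

Since the contents are written only after step 2, there is no window in which they sit on disk with default permissions.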
Re: Properties on buffer level
You should be able to run C-c C-c on #+property: directives before the first headline and they will be updated without reloading the buffer. Best, Tom
Bug: doc string for "org-end-of-meta-data"
Hello everybody, I believe the last paragraph of the doc string for the function "org-end-of-meta-data" contains an error. That one-sentence paragraph currently reads: When FULL is non-nil but not t, skip planning information, clocking lines and only non-regular drawers, i.e. properties and logbook drawers. I believe that should be "regular drawers," not "non-regular drawers." IMO, the last paragraph could be clearer were it rewritten as follows: When FULL is non-nil but not t, skip only planning information, clocking lines and regular drawers, i.e. properties and logbook drawers. If any non-regular drawers exist and do not follow the two regular drawers, stop at the first non-regular drawer instead. I believe that this expansion of the paragraph corrects the error and adds coverage of a rare case. Many thanks to all the developers of Org-mode. -- Tom Davey t...@tomdavey.com New York NY USA
Re: [org-cite] citations in property drawers?
> That would be a terrible idea. Exporters are not required to handle all > data contained in properties drawers, so this may introduce errors, > e.g., when trying to number citations. I agree completely. You can't export something that has no anchor in text that would be rendered. Maybe I misunderstood the original question, because there is no way that a citation or footnote could be exported from there, so I think in your conception text that follows the format of the citations or footnotes isn't actually a citation or footnote unless it exports as such. Best, Tom
RE: Bug: doc string for "org-end-of-meta-data"
Hi Marco,

You make sense. What you propose to substitute is easier to understand and concise:

    When FULL is non-nil but not t, skip planning information,
    properties, clocking lines and logbook drawers.

Thank you!

--
Tom Davey
t...@tomdavey.com
New York NY USA

-----Original Message-----
From: Marco Wahl
Sent: Wednesday, September 15, 2021 5:04 PM
To: Tom Davey
Cc: 'emacs-org list'
Subject: Re: Bug: doc string for "org-end-of-meta-data"

Hello Tom,

> I believe the last paragraph of the doc string for the function
> "org-end-of-meta-data" contains an error. That one-sentence paragraph
> currently reads:
>
>     When FULL is non-nil but not t, skip planning information,
>     clocking lines and only non-regular drawers, i.e. properties
>     and logbook drawers.
>
> I believe that should be "regular drawers," not "non-regular drawers."
> IMO, the last paragraph could be clearer were it rewritten as follows:
>
>     When FULL is non-nil but not t, skip only planning information,
>     clocking lines and regular drawers, i.e. properties and logbook
>     drawers. If any non-regular drawers exist and do not follow the
>     two regular drawers, stop at the first non-regular drawer instead.
>
> I believe that this expansion of the paragraph corrects the error and
> adds coverage of a rare case.

I think the use of the word "regular" is not a good idea in the documentation of org-end-of-meta-data.

I could not find any occurrence of the term "regular drawer" in the org-info manual. There is a section where the property drawer is called "special".

In conclusion I'd say that the logic of the recent documentation is okay with "regular" meaning "non-special".

Finally I propose to remove completely the categorisation due to "regular" from the documentation. Which reads:

    When FULL is non-nil but not t, skip planning information,
    properties, clocking lines and logbook drawers.

WDYT?
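For readers who want to check the three cases of FULL themselves, a small experiment along these lines may help. It is a sketch only: my/meta-data-end is a hypothetical helper and the sample entry (including the CUSTOM drawer) is made up. No expected positions are claimed here, since the exact stopping points depend on the Org version in use.

```elisp
(require 'org)

;; Hypothetical helper for experimenting with `org-end-of-meta-data'.
(defun my/meta-data-end (full)
  "Return the position where `org-end-of-meta-data' stops for FULL.
The entry used here is an invented example with a property drawer,
a logbook drawer and one other drawer."
  (with-temp-buffer
    (org-mode)
    (insert "* Heading\n"
            ":PROPERTIES:\n:ID: abc\n:END:\n"
            ":LOGBOOK:\n- Note taken\n:END:\n"
            ":CUSTOM:\nsome text\n:END:\n"
            "Body text\n")
    (goto-char (point-min))
    (org-end-of-meta-data full)
    (point)))

;; Compare nil, a non-nil value other than t, and t.
(mapcar #'my/meta-data-end '(nil other t))
```

If the docstring is accurate, each successive value of FULL should skip at least as much as the previous one.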
Re: [org-cite] citations in property drawers?
> I understand the problem, but the solution should not be: "let's pretend
> export does not exist".

From my perspective any org object that is not in a section that allows org objects could in principle be parsed as such, but it would not be in the core of the grammar, and it would also have to parse to something that did not trigger side effects related to export. Allowing org objects to appear at arbitrary places in the grammar is definitely not a good idea because in many senses they cannot actually be those objects. Maybe the syntax could be the same, but they would have to be "shadow objects" or something like that?

Best,
Tom
Re: [org-cite] citations in property drawers?
Hi Bruce, I could certainly imagine using it, and I don't see any issue with doing it from the point of view of the grammar. Footnotes can appear in a property drawer without issue, though obviously they don't export. One question though since I may have missed this in the other threads is cite: allowed without the square brackets? Either way, org element just parses the value to a string and it is up to any consuming application to parse the node property further. Best! Tom On Thu, Sep 9, 2021 at 11:45 AM Bruce D'Arcus wrote: > > Just bumping this. > > Another question about where to allow cite elements. > > On Fri, Aug 20, 2021 at 4:18 PM Bruce D'Arcus wrote: > > > > So this is a tentative request/question; I'm not really sure the best > > approach here. > > > > This is based on discussion with one of the org-roam-bibtex developers > > about what the proper way to indicate an org-roam note is a > > bibliographic note; e.g. a note about a bibliographic source. > > > > Traditionally in org-roam, that is in a property drawer; like: > > > > :ROAM_REFS: cite:wallace-wells2019 > > > > That is using org-ref syntax there. > > > > So the obvious question is should one just put an org-cite citation > > there to do the same thing? > > > > Right now, the answer is clearly no, since they aren't allowed in > > property drawers. > > > > But perhaps they should be, just as any link can be? > > > > Except if they are, I recognize, they need to be treated as special > > cases; e.g ignored for the purposes of export and such. > > > > WDYT? > > > > Bruce >
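To make the point about consuming applications concrete: the property value comes back as a plain string, and anything citation-like inside it has to be picked apart by the consumer. The sketch below assumes an org-cite style value in ROAM_REFS; my/roam-refs-keys and its key pattern are hypothetical and deliberately simplistic, not how org-roam or org-cite do it.

```elisp
(require 'org)

;; Hypothetical consumer-side parser for a property value.
(defun my/roam-refs-keys (value)
  "Return the citation keys found in the property string VALUE.
A key is taken to be whatever follows an @ up to whitespace,
a semicolon or a closing bracket."
  (let ((start 0) keys)
    (while (string-match "@\\([^];[:space:]]+\\)" value start)
      (push (match-string 1 value) keys)
      (setq start (match-end 0)))
    (nreverse keys)))

;; Org hands back the node property as a string; parsing is up to us.
(with-temp-buffer
  (org-mode)
  (insert "* Note\n:PROPERTIES:\n:ROAM_REFS: [cite:@wallace-wells2019]\n:END:\n")
  (goto-char (point-min))
  (my/roam-refs-keys (org-entry-get (point) "ROAM_REFS")))
```

Nothing here touches export, which matches the observation that text in a property drawer is never rendered as a citation.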
Re: A requires/provides approach to linking source code blocks
We have been receiving many new feature suggestions and requests coming in for org babel. I think that Tim's suggestion is the right one. Nearly all of these need to be implemented as an extension first and tested independently. Further, even if this is done, it should be clear that there is zero expectation that such extensions will be incorporated. Once I wrap up the formal grammar for org, one of the next things I plan to work on is a clear specification for org babel. This is critical because so many of the suggestions that come in deal with individuals' specific problems and thus fail to account for how such features interact with existing features and how the newly proposed feature would block some other features in the future, confuse users, etc. Such suggestions also often fail to account for increased complexity, nor have they been exposed to a sufficient number of examples to reveal fundamental ambiguities in how they could be interpreted. The issues with variable behavior between ob langs for :pre :post :prologue :epilogue etc. are already enough to keep us busy for quite some time. With regard to this thread in particular, it is of some interest, but there are fundamental issues, including the fact that certain languages (e.g. racket) expect module code to exist somewhere on the file system. There are ways around many of these issues, in fact there are likely many ways around any individual issue, so org babel needs to systematically consider the issues and provide a clear specification, or at least a guide for how such cases should be handled. To give an example from one of my continual pain points: I start writing python or racket in an org src block and then I want it to be a library so that it can be reused by other code both inside and outside the org file without having to resort to noweb. What is the best way to handle this? I don't know. 
Right now I tangle things and resort to awful hacks for the reuse-in-this-org-file case, but I'm guessing there is a better generic solution which would allow _any_ org block to be exported as a library instead of nowebbed in. Before jumping for any particular suggestion for how to handle this we need to explore the diversity of cases that various ob langs present, so that we can find a solution that will work for all of them. After all, packaging code to a library for reuse is an extremely common pattern that org babel should be able to abstract, but it is a major undertaking, not just the addition of a keyword here and there. In short I suggest that we issue a general moratorium on new org babel feature suggestions until we can stabilize what we already have and provide a clear specification for correct behavior. Until we have that spec we could encourage users to create extensions that implement those features. Best, Tom PS The other next thing that I am working on might be another way out for this particular feature request. Namely, it is simplifying and extending org keyword syntax so that new keywords (with options) and associated keywords can be specified using keyword syntax within a single org file. This would make it possible to get useful high level keyword behavior in a single file without burdening the core implementation with more special cases for associated keywords, and it would allow users to write small elisp functions that could do some of what is suggested here, all without need to add anything to the core org implementation.
Re: [PATCH] Rename headline to heading
Hi André, Thanks for taking a first pass at this. I think that this patch is difficult to review. Could you break it into two separate patches, one for documentation (non-code, e.g. docstring and comment) changes and one for code changes? That way we could more easily see where we may need to mitigate the kind of issues Maxim noticed. Best! Tom
Re: bug: Error handling in source blocks.
I will also chime in here to say that managing output streams and errors for babel is a major new feature that I am interested in. The issue, as Tim points out, is that there is a lot of complexity lurking here due to the fact that certain languages have fundamentally different capabilities and ways of handling or not handling errors, and of running code (on arbitrary hosts) in the first place. What works for one will almost certainly not work for another. Take for example ob-lisp where there is already built in error handling in emacs itself. Compare that with python where someone would likely need to implement a special PYTHONBREAKPOINT entrypoint or something like that, if it were possible at all. I have had a draft of a document on what I called "babel regularization" for well over a year now, but it is not in a state that would be productive to share due to the sheer number of ob-langs and systems affected and the need to be able to clearly catalog and articulate the diversity of existing behaviors. If you dig through old conversations on this list you will find a discussion of the default behavior for ob-shell :returns values vs output as the default, we were barely able to agree on which principles should be followed to make the decision. In that case we were lucky that there was already a way for users to set their desired behavior in their init file or in a setup file or in the file itself. How to handle errors will be much more complex, in part because it will touch on what ob-lang implementations are able to overwrite and/or must provide in order to actually function. At the moment there are practically no constraints. Lots of work to do here, so grateful for a report on the variability in the behavior of the existing system. Best! Tom
Re: [Concept talk] Org-connector
Hi Sébastien, I think you are probably looking for org-sync which implements exactly this functionality. You would need to write a new backend for your particular ticketing system, but github, bit bucket, and redmine backends already exist and can serve as an example. Best, Tom https://orgmode.org/worg/org-contrib/gsoc2012/student-projects/org-sync/tutorial/
Re: Expanding how the new cite syntax is used to include cross-references - thoughts?
In general I like John's suggestion. Org link syntax can be made to do nearly anything because it is possible to bind link actions to arbitrary elisp functions (I have used them to create buttons that run source blocks for some of my non-technical colleagues).

The grouping of cross references under org-cite seems reasonable to me, and I would love it if they could handle arbitrary references, e.g. to hypothesis web annotation links or org-capture links.

Actually, having written this now, I think that both solutions have their own use cases. Org cite is clearly about providing evidence for, or a scholarly reference for, something, and critically it can embed some metadata about that reference in the document as a citation or perhaps as an excerpt (an extension of what org-ref does now when the cursor is over a reference?). Regular links do not provide any way to embed metadata within the document; they are purely pointers.

That being said, it seems that there are a number of use cases where org-ref links are simply internal document links that can point to an element with a specific #+name: and no embedded information about the target is needed. However, I think it would be a mistake to use up equation/eq and table/tbl or figure/fig prefixes for references that are internal to org, because it implicitly limits/collides with the #+link: keyword.

Best,
Tom
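The "button that runs a source block" idea mentioned above might look roughly like this. The link type "run" and the function my/org-run-block-follow are hypothetical names chosen for illustration; this is one possible way to wire it up, not a description of the setup the message refers to.

```elisp
(require 'org)
(require 'ob-core)

;; Hypothetical follow function for a made-up "run" link type.
(defun my/org-run-block-follow (name &optional _arg)
  "Execute the source block called NAME in the current buffer."
  (let ((pos (org-babel-find-named-block name)))
    (unless pos
      (user-error "No source block named %s" name))
    (save-excursion
      (goto-char pos)
      (org-babel-execute-src-block))))

;; Register the link type so that [[run:my-block][Run it]] acts as a button.
(org-link-set-parameters "run" :follow #'my/org-run-block-follow)
```

A link such as [[run:my-block][Run it]] would then execute the block carrying #+name: my-block when followed, subject to the usual babel confirmation settings.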
[PATCH] lisp/ox-html.el: Restore org-svg class
Hi,

This patch restores the addition of class="org-svg" to svg images during html export.

Best!
Tom

From 4363eec0913ccd0d05ecf3d6346208c62d3597f8 Mon Sep 17 00:00:00 2001
From: Tom Gillespie
Date: Fri, 30 Jul 2021 20:53:07 -0700
Subject: [PATCH] lisp/ox-html.el: Restore org-svg class.

* lisp/ox-html.el (org-html--format-image): Restore org-svg class.
d96e8975791bd3b1a5f8fdb75609d73f134dc831 removed the org-svg class
which is necessary even when using tags otherwise svg images will
render at absurdly large sizes.
---
 lisp/ox-html.el | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lisp/ox-html.el b/lisp/ox-html.el
index bd6771a76..f25a9731e 100644
--- a/lisp/ox-html.el
+++ b/lisp/ox-html.el
@@ -1707,7 +1707,9 @@ a communication channel."
                 (org-html-encode-plain-text
                  (org-find-text-property-in-string 'org-latex-src source))
               (file-name-nondirectory source)))
-      attributes))
+      (if (string= "svg" (file-name-extension source))
+          (org-combine-plists '(:class "org-svg") attributes '(:fallback nil))
+        attributes)))
     info))
 
 (defun org-html--textarea-block (element)
--
2.31.1
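The effect of the added expression can be tried in isolation. The wrapper my/svg-image-attributes below is a hypothetical name that simply lifts the new `if' form out of the patch; the argument order matters because org-combine-plists lets later plists override earlier ones.

```elisp
(require 'org-macs)

;; Hypothetical wrapper around the expression added by the patch.
(defun my/svg-image-attributes (source attributes)
  "Return ATTRIBUTES, adding the org-svg class when SOURCE is an SVG file.
The default class comes first, so a :class in ATTRIBUTES wins."
  (if (string= "svg" (file-name-extension source))
      (org-combine-plists '(:class "org-svg") attributes '(:fallback nil))
    attributes))

;; SVG without a user class: the default class is supplied.
(plist-get (my/svg-image-attributes "fig.svg" '(:width "300")) :class)

;; SVG with a user class: the user's value overrides the default.
(plist-get (my/svg-image-attributes "fig.svg" '(:class "mine")) :class)
```

For non-SVG sources the attribute list is passed through unchanged.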
Re: Help requested: Support for basic Org mode support in tools outside of Emacs
Hi Karl,

Great initiative. For many of the things in the table you will probably want to link to the underlying library. For example, for github and gitlab there is https://github.com/wallyqs/org-ruby (which I have been trying to find time to submit fixes to). I've linked a couple of relevant threads and repos.

Best!
Tom

python https://github.com/novoid/Memacs
python https://github.com/karlicoss/orgparse
python https://github.com/bjonnh/PyOrgMode
racket https://github.com/tgbugs/laundry/tree/next
racket https://github.com/jeapostrophe/org-mode
racket https://github.com/antoineB/org-mode

See https://github.com/tgbugs/laundry/blob/next/laundry/cursed.org for an org file that github fails to render

clojure https://github.com/200ok-ch/org-parser/blob/master/resources/org.ebnf

https://orgmode.org/list/ca+g3_pobab1qx1zv8q9sjfh4khuhvmanxp3xo7__6eosdxk...@mail.gmail.com/
https://orgmode.org/list/ca+g3_pnj6pekqv+twfkwbd778xhw9wsfx+kjjhjsoreplhu...@mail.gmail.com/

On Tue, Aug 3, 2021 at 11:46 AM Greg Minshall wrote:
>
> Karl,
>
> orgtbl-query is a script for querying tables in .org files. it doesn't
> do any special text formatting.
>
> https://gitlab.com/minshall/orqtbl-query
>
> cheers, Greg
>
Re: [PATCH] lisp/ox-html.el: Restore org-svg class
Bumping this patch for 9.5. On Fri, Jul 30, 2021 at 8:59 PM Tom Gillespie wrote: > > Hi, >This patch restores the addition of class="org-svg" to svg images > during html export. Best! > Tom
Re: Org lint and named source blocks
> Should we allow syntax like #+KEYWORD:value to be correct or do we
> require a whitespace/space after colon all the time?

The spec as written is ambiguous/silent on this issue. In my work on the laundry tokenizer and grammar I have found keyword syntax to be a thorny issue, and I strongly suggest that for the time being we either make no ruling on this or we state that the colon that ends the keyword should be followed by a space as a precautionary measure.

The safe thing to do is to always require whitespace after the colon because it guarantees correct interpretation. Requiring whitespace after the colon simplifies the grammar; however, it means that you can't compact keyword lines, and it induces an annoying failure mode where lines with missing spaces are no longer keywords.

However, it is technically possible to make keywords work without the whitespace, so long as there is at least one whitespace prior to the next colon (but not contained in square brackets, e.g. #+key:lol[ a b c ]:value is a well formed keyword under a slightly generalized grammar). The problem is that we would like to make keyword syntax fully closed, and I need a bit more time to get that worked out before any definitive conclusions are drawn.

The complexity of the generalized keyword syntax can be seen here:
https://github.com/tgbugs/laundry/blob/5a396bef98d9a3cd9ee929f21cd47612dd6cb1ac/laundry/lex-abbrev.rkt#L107-L249

Best,
Tom
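To see what the conservative rule (whitespace required after the colon) means in practice, here is a small sketch. The regexp my/strict-keyword-re and the function my/strict-keyword are hypothetical and much simpler than either the Org implementation or the laundry grammar; they only illustrate the failure mode where a missing space means the line is no longer a keyword.

```elisp
;; Hypothetical strict rule: the key ends at the first colon, and a
;; value, if present, must be separated from the colon by whitespace.
(defconst my/strict-keyword-re
  "\\`[ \t]*#\\+\\([^ \t\n:]+\\):\\(?:[ \t]+\\(.*\\)\\)?\\'")

(defun my/strict-keyword (line)
  "Return (KEY . VALUE) if LINE is a keyword under the strict rule, else nil.
VALUE is nil when nothing follows the colon."
  (when (string-match my/strict-keyword-re line)
    (cons (match-string 1 line) (match-string 2 line))))

(my/strict-keyword "#+TITLE: Hello")          ; accepted
(my/strict-keyword "#+TITLE:Hello")           ; rejected, no space
(my/strict-keyword "#+key:lol[ a b c ]:value") ; rejected under this rule
```

The third line is exactly the kind of input that the generalized grammar mentioned above would accept and the strict rule would not.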
Re: how to org-babel-detangle with nested noweb?
Hi Edgar,

Detangling of nested noweb blocks tangled using :comments noweb is broken at the moment. There are some deep bugs that need to be worked out, and last time I looked at the code my conclusion was that it would be better to do a complete rewrite starting from a new specification of the behavior, along with some gnarly test cases to ensure that everything works as expected.

Best!
Tom
Re: Empty headline titles unsupported: Bug?
Hi Bastien, I am strongly in favor of this change. It simplifies the grammar significantly, and from my work on the laundry lexer and parser, I'm 99% certain that the current behavior is a bug that is the result of gobbling the space after the stars in the headline. The correct implementation peeks 1 char ahead for the space, and then starts parsing again starting with the space. This is because tags MUST be preceded by a space, so if you incorrectly gobble the space after the stars then that space cannot be used as the start for tags. Best, Tom
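The peek-instead-of-gobble idea can be sketched on single heading lines. This is an illustration only: my/parse-heading is a hypothetical function, the tag character set is an assumption, and the real lexer works on a token stream rather than with regexps. The key detail is that the space after the stars is checked but left in the text that is parsed next, so it can still serve as the space that must precede the tags.

```elisp
(require 'subr-x)

;; Hypothetical single-line heading parser.
(defun my/parse-heading (line)
  "Return (LEVEL TITLE TAGS) for heading LINE, or nil if it is not a heading.
The space after the stars is only peeked at, not consumed."
  (when (and (string-match "\\`\\*+" line)
             (< (match-end 0) (length line))
             (eq (aref line (match-end 0)) ?\s))   ; peek one char ahead
    (let* ((level (match-end 0))
           ;; REST still begins with the space that follows the stars.
           (rest (substring line level))
           (tag-start (string-match
                       "[ \t]+:\\([[:alnum:]_@#%:]+\\):[ \t]*\\'" rest))
           (tags (and tag-start
                      (split-string (match-string 1 rest) ":" t)))
           (title (string-trim
                   (if tag-start (substring rest 0 tag-start) rest))))
      (list level title tags))))

;; A heading with an empty title and one tag.
(my/parse-heading "* :tag:")
```

Had the space been consumed together with the stars, the remaining text ":tag:" would have no preceding space and could not be recognized as tags.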
Re: [PATCH] Accept more :tangle-mode specification forms
I strongly oppose this patch. It adds far too much complexity to the org grammar. Representation of numbers is an extremely nasty part of nearly every language, and I suggest that org steer well clear of trying to formalize this. With an eye to future portability I suggest that no special cases be given to something as important for security as tangle mode without very careful consideration.

Emacs lisp closures have clear semantics in Org and the number syntax is clear. If users are concerned about the verbosity of (identity #o0600) they could go with the shorter (or #o0600).

Best,
Tom
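For reference, both spellings mentioned above are ordinary Emacs Lisp expressions that the reader turns into the same integer, so a header argument such as :tangle-mode (or #o0600) carries the same value as :tangle-mode (identity #o0600). A quick check, with my/mode-string as a hypothetical helper name:

```elisp
;; Hypothetical helper: show a file mode integer as an octal string.
(defun my/mode-string (mode)
  "Return MODE formatted in octal."
  (format "%o" mode))

;; The reader parses #o0600 as an octal integer; `identity' and `or'
;; just return it unchanged.
(list (my/mode-string (identity #o0600))
      (my/mode-string (or #o0600))
      (= (identity #o0600) (or #o0600) 384))
```

No new number syntax is needed in Org for this to work, which is the point being made in the reply.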
Re: [PATCH] Don't fill displayed equations
> do not see a reason for idiosyncrasy that markup intended to add LaTeX > snippet that looks like exactly as LaTeX commands for this purpose and > even actually preserved during export to LaTeX should have different > semantics for Org parser. The answer is that \[ \] can only occur inside paragraphs. The issues here are exactly the same as the issues for inline footnotes. Org gives us a bit more power, but not the full power because Org is Org, not Latex. Making \[ \] available outside of a paragraph would be a massive breaking change. In Timothy's original example he is narrowly skirting the syntax to allow that all to remain a single paragraph, but stick in a newline anywhere and boom, no more paragraph, no more equation. I guess one thing I'm missing/not understanding is when/why people want to use \[ \] instead of full #+begin_export latex block? Best, Tom
Re: Comments break up a paragraph when writing one-setence-per-line
A general comment (heh) here. This is not a bug and not easily fixed. Line comments are their own top level element distinct from paragraphs. If you need something that fits in a paragraph you can use @@comment:@@ at the start of a line. I agree that it is annoying, but Org line comment syntax also only works if it starts the line, so the behavior diverges from traditional code comments. It may make sense to update the docs to call them "line comments" instead of just comments. One area where we could almost certainly do better is in how line comments break up the flow of text. I'm not sure there will ultimately be much we can do about it, but it is worth investigating. Best, Tom
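The difference between a line comment and an inline @@comment:@@ snippet can be seen by asking the parser which elements it finds. This is a small sketch; my/element-types is a hypothetical helper and the sample text is made up.

```elisp
(require 'org)
(require 'org-element)

;; Hypothetical helper: list paragraph and comment elements in TEXT.
(defun my/element-types (text)
  "Parse TEXT as Org and return the types of its paragraphs and comments."
  (with-temp-buffer
    (org-mode)
    (insert text)
    (org-element-map (org-element-parse-buffer) '(paragraph comment)
      #'org-element-type)))

;; A line comment is its own element and splits the paragraph in two.
(my/element-types "First sentence.\n# a line comment\nSecond sentence.\n")

;; An inline snippet stays inside a single paragraph.
(my/element-types "First sentence.\n@@comment: a note@@\nSecond sentence.\n")
```

Evaluating the two forms side by side shows why one-sentence-per-line writing is affected by line comments but not by the inline form.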
Re: [PATCH] Don't fill displayed equations
Hi Timothy, > │ \[ > │ not part of a paragraph > │ \] My point is that that parses first as a paragraph (check org-element-at-point). \[ and \] would be meaningless if it did not first parse as a paragraph. > I also don’t see how footnotes are analogous, as footnotes are placed in the > middle of a line of text. Inline footnotes [fn:: can span multiple lines] but can't contain empty lines because the empty line ends the paragraph that they are contained in. > org-latex-preview :) But surely #+begin_export latex works with org-latex-preview? If not then that would be a feature request to org-latex-preview yes? Best! Tom
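The suggestion to check org-element-at-point can be carried out as follows. my/element-type-at-start is a hypothetical helper; the two sample strings are the displayed-equation form under discussion and the environment form.

```elisp
(require 'org)
(require 'org-element)

;; Hypothetical helper: which element starts at the top of TEXT?
(defun my/element-type-at-start (text)
  "Return the type of the Org element at the beginning of TEXT."
  (with-temp-buffer
    (org-mode)
    (insert text)
    (goto-char (point-min))
    (org-element-type (org-element-at-point))))

;; \[ ... \] on its own lines still starts a paragraph element.
(my/element-type-at-start "\\[\nx = 1\n\\]\n")

;; \begin ... \end is an element in its own right.
(my/element-type-at-start "\\begin{equation*}\nx = 1\n\\end{equation*}\n")
```

The first form is a paragraph whose content happens to be a LaTeX fragment object, which is why a blank line inside it ends both the paragraph and the equation.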
Re: [PATCH] Don't fill displayed equations
Some thoughts.

> Maybe you are right and Tom was actually assuming \begin{equation*}, not
> #+begin_export latex.

Correct. My bad on that one.

> Just as Timothy, I believe that \begin{equation*} is unnecessary verbose
> when \[ works *mostly* in a similar way.

\begin{equation*} is absolutely required if you want to be able to include newlines because \[ and \begin are not similar at all as far as parsing is concerned. From the spec:

https://orgmode.org/worg/dev/org-syntax.html#LaTeX_Environments
> CONTENTS can contain anything but the “\end{NAME}” string.

The spec is not completely accurate since latex environments can't contain a new heading, but the point is that latex environments are elements, whereas \[ \] is an object.

> If I understand correctly, making \[ \] available outside paragraph
> would mean that it becomes a new element (currently \[\] is a
> latex-fragment object).

Correct. Promoting \[ to an element would mean every \ in an org file becomes a stop word. Also, since full fledged latex environments already exist to serve this purpose I find it hard to justify, especially given that Org tries to give a clear indication of when a block structure is starting and ending.

> Isn't the whole point of the \[ ... \], \( ... \), $ ... $, $$ ... $$,
> and \begin{env} ... \end{env} and constructs in Org to be consistent
> with LaTeX?

For \begin and \end yes. For the others no. In general it would be to make it possible to express things using latex-like syntax that would otherwise require Org to come up with some new and different syntax. These are values that may be translated to latex, but they exist inside a larger syntax that is decidedly not latex, and thus they only have a meaningful translation to latex if they exist as well formed Org.

As a side note, the $ syntax is slated to be deprecated and removed.
https://orgmode.org/worg/dev/org-syntax.html#Entities_and_LaTeX_Fragments > It would introduce incompatibilities with previous Org versions, but > support for $...$ (and for symmetry, $$...$$) constructs ought to be removed. > Indeed, it will be a breaking change. I'm actually fairly certain that such a change should never be made due to the recent changes in org link syntax. Specifically given how \[ is used for escapes in links. https://orgmode.org/manual/Link-Format.html This means that the only place you could reliably use \[ is at the start of a new line preceded only by whitespace. However, if this were to happen then pretty much every org document that uses \[ \] is at risk of being broken because something that was once a single paragraph will now be multiple paragraphs. If you need multiline use \begin \end, that is what they are there for, and they fit better with org's general extensible approach to blocks. I would dearly love to be able to have a single shorthand for src blocks that worked inline and standalone, but the complexity that it would induce is just not worth it. Same thing for \[ \]. It seems simple until you get down to accounting for all the edge cases that it would induce in the grammar. Consider the case where you have something like \[ something something more content more content [[www.example.com/\]oops][evil link]] \] I've seen enough cases that are similar to this in the existing implementation that have inconsistent behavior that I can safely say that this one would too. Not to mention that I can think of at least 3 different cases that will all have slightly different behavior that is inexplicable to users at best and infuriating at worst. \[ a b \] \[ a b \] a \[ b c \] d etc. There are plenty more variants that would all be subtly different depending on the exact way such a thing were implemented. In short. Just not worth it.
Re: [PATCH] Don't fill displayed equations
> Does anybody have any other thoughts? From time to time I encounter random patterns that I don't want to be reformatted during a fill operation. Maybe a custom variable like org-fill-paragraph-skip-regexp or similar that could be set by the user? For Timothy's use case he would set it to the regexp provided in the original patch? Not sure how much of the implementation in the patch is dependent on that particular regexp, but a general solution that could even be set per org file might be a very useful new feature. Best! Tom
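A rough sketch of the skip-regexp idea, written in Python rather than Emacs Lisp so that it can be run on its own. The names `fill_paragraphs` and `skip_regexp` are made up for illustration and nothing here is Org's actual fill code. It only shows the intended behavior: paragraphs matching the user-supplied regexp are left untouched, everything else is refilled.

```python
import re
import textwrap


def fill_paragraphs(text, skip_regexp=None, width=70):
    """Refill blank-line-separated paragraphs, skipping those that match.

    skip_regexp is a hypothetical user option, standing in for something
    like the org-fill-paragraph-skip-regexp suggested in this mail.
    """
    skip = re.compile(skip_regexp) if skip_regexp else None
    filled = []
    for paragraph in text.split("\n\n"):
        if skip is not None and skip.search(paragraph):
            # Leave matching paragraphs exactly as the user wrote them.
            filled.append(paragraph)
        else:
            filled.append(textwrap.fill(paragraph, width=width))
    return "\n\n".join(filled)


if __name__ == "__main__":
    sample = "some prose\nbroken oddly\n\n\\[\na + b\n\\]"
    # Skip any paragraph that starts with a displayed-equation opener.
    print(fill_paragraphs(sample, skip_regexp=r"^\\\["))
```

A real implementation would have to work on Org's own notion of a paragraph rather than on blank-line splitting, which is the part that may depend on the regexp in the original patch.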
Re: Org lint and named source blocks
> By the way, wouldn't it be better to use tree-sitter rather than > something else for the format grammar? Not really since we are going to need more than one implementation using a parser generator to avoid baking implementation specific details into the spec by accident. This is true for more than just the grammar as well. The complexity of tokenization, parsing, expanding, etc, for Org means that we are going to need multiple implementations to nail the behavior for any formal spec. That said, we definitely want a TS implementation at some point. See https://github.com/tgbugs/laundry/issues/1 for a recent discussion about ways forward. The implementation I'm working on should translate to TS without too much work since both brag and tree sitter describe LR variants. There may be some subtle differences, but nothing fundamental. The issue for me is that I don't have the bandwidth to get started with a full tree sitter implementation, especially because it is going to need a custom scanner, and because you're effectively on your own when it comes to reconstructing the output of the AST into the actual internal representation of an Org file. I also have no idea how to deal with nested parsers in tree sitter. I have some ideas about how it might be done, but nothing concrete (see the linked issue for more on that). Best, Tom
Re: [PATCH] Accept more :tangle-mode specification forms
> I'd like to understand these objections better. Aren't you overstating what is at issue? Yes, after hitting send I realized I overstated my position a bit. In the meantime the comments in this thread are encouraging, however I have finally figured out what I was really trying to say. tl;dr file permission modes are not universal and should thus not be part of the Org implementation. Org itself knows nothing about files or permissions; those belong to the system that Org is running on. Therefore, so long as we make it abundantly clear that the value for :tangle-mode is not expected to be portable and that it is always up to the user to ensure correct behavior, then we are ok. I'm not happy about this conclusion from a security perspective, but it isn't really worse than the situation we have right now. As many have pointed out, the grammar itself will not be affected. However, other parts of the spec will. In general my objective is to try to reduce the number of special cases that an org implementation has to know about and delegate them to something else. However in this case it is a bit tricky because of the security implications and due to the fact that octal modes for file permissions are NOT universal and should not be expected to be universal! I actually think that my gut reaction was correct, but was expressed in the wrong way. Unix file modes are not universal and should thus not be encoded as part of a portable document format. This means that it is up to the user to know what representation is suitable. Right now that representation is delegated to Emacs, because Emacs handles file permissions for Org, and Emacs' language for modes is octal. There are some octal modes that do not translate on Windows, and cannot be correctly set. There will (hopefully) be some happy day in the future where there is an operating system that will run Org babel where octal file modes do not exist at all! 
Therefore I suggest that we do not enshrine a particularly obscure way of expressing file modes into Org itself. Right now Org is confined to Emacs' representations, which in a sense protects Org from becoming too ossified by bad designs of the past --- Emacs can keep all that for us! If we want a more user friendly syntax for this I would suggest that we do something like what has been done for Org babel :results, i.e. like :tangle-mode read write execute, unfortunately that does not compose well at all with user, group, and other and becomes exceedingly verbose. Final conclusion, after all that rambling, is that I'd actually be ok with any of the solutions proposed, so long as it is clear that :tangle-mode will always be implementation dependent, and may or may not be meaningful depending on which operating system you are using. Unfortunate for security, but I don't see any way around that. The best we could do for security would be for implementations to test the file modes after tangling to ensure that they match, which is more important I think. That said, reducing the number of forms as Eric suggests would be a happy medium. Best! Tom
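To make the "test the file modes after tangling" suggestion concrete, here is a minimal sketch in Python. It is not how ob-tangle works; `tangle_with_mode` is a hypothetical helper that writes a file, requests a mode, and then reads the mode back from the file system to see whether the request took effect.

```python
import os
import stat
import tempfile


def tangle_with_mode(path, body, mode):
    """Write BODY to PATH, request MODE, and report what was actually set.

    Returns (matches, actual_mode). On systems without Unix-style
    permission bits the two can differ, which is the case worth warning
    the user about.
    """
    with open(path, "w") as handle:
        handle.write(body)
    os.chmod(path, mode)
    # Read the permission bits back instead of trusting the request.
    actual = stat.S_IMODE(os.stat(path).st_mode)
    return actual == mode, actual


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "secret.sh")
    matches, actual = tangle_with_mode(target, "echo hi\n", 0o600)
    print(matches, oct(actual))
```

On a typical Unix file system the requested and actual modes agree. On Windows, `os.chmod` only honors the read-only flag, so the check is expected to fail for a mode such as 0o600 and the implementation could warn instead of silently continuing.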
Re: Org lint and named source blocks
Thanks for the pointer! The actual point of contact seems to be https://github.com/milisims/tree-sitter-org. Good to find another group that is working on this. Best, Tom
Re: On zero width spaces and Org syntax
one, and probably would already have been done, and I suspect people might use it. There are very few syntax changes that reduce the complexity for Org (though there are some). The rest have major costs, both in implementation time, and in disruption of workflows, and hunting down of edge cases, and total complexity. The burden of proof for syntax changes lies squarely with the individual(s) suggesting the change to show that it can be done without disrupting the existing implementation and without inducing complexity and changing the interpretation of existing documents. I say this as someone who has at least one major syntax change suggestion in the pipeline. Requesting a syntax change is among the most deeply invasive and complex things that can be done. I know that syntax is also the most obvious to users, it is their interface to the format after all! However, each individual shares that interface with thousands of other people. The maintainers have to speak for those thousands who never read, much less respond on this mailing list, and that almost always means that the response will be one that is decidedly conservative. I don't mean to be dismissive of the suggestion, but a lot of time is spent on this list walking back ideas that have not had sufficient time put into understanding what the unintended consequences would be, so I wouldn't say that it is irresponsible, I would say instead that it lacks sufficient rigor and depth to be seriously considered. If you can add those to this proposal (e.g. in the form of a patch) then I suspect it would get a much warmer reception. Best, Tom
Re: Some commentary on the Org Syntax document
Hi Timothy, Replies in line. Some things might seem a bit out of order because I responded from bottom to top. Best, Tom > from heading to bed, so to quote Pascal "I have only made this letter > longer because I have not had the time to make it shorter". Likewise, and I've heard it as Mark Twain :D > I think a a big problem is the mix of implicit and explicit information. > Some components are rigorously specified in terms of the characters they > may contain, elements and objects that are recognised inside them, and > even the order in which different parts of the pattern are parsed. I agree completely. > As mentioned originally, the current Dynamic Blocks description doesn't > even mention the CONTENTS part of the pattern, and relies on the reader > inferring that it operates similarly to the CONTENTS part of Drawers. Indeed this should be fixed. > Forcing the reader to start making inferences like this is a treacherous > path, and I think I can blame for some of the other issues I've > experienced. Take for instance the "surely X can't contain a newline?" > comments I've made. In the Node Properties and Entities descriptions you > have statements along the lines of "X can contain any character [...] > except a newline". In my mind this then sets up the reader to interpret > a similar statement without the "except a newline" clause to mean that > newlines are permitted. I agree completely and had almost the exact same experience as you when I was working on it. As I mention below, my responses were to illustrate why the explicit information is missing, not to suggest that it should be left out. We should definitely work to make everything more explicit so that future readers don't have to go through the same issues we have. > I'm also thinking that the term "element" is overworked in the document. > It's basically pulling tripple duty: you have Elements, Greater > Elements, and elements which are Elements and/or Greater Elements . In extreme agreement. > 3. 
Section Technically This isn't part of the syntax, rather it is part of elisp Org mode's internal representation. I'm not sure I would even mention sections at all, because they have to do with the interpretation of the syntax. In a section on the internal representation for Org sections definitely belong, but they are incidental. That said, I suspect we will find that they are useful for talking about the behavior of the file under transformation, e.g. "headings are not reordered when pressing M-up or M-down, sections are reordered" this allows us to make it possible to talk about an Org implementation that has commands that allow one to switch the headings without moving their associated sections. > 5. (Greater Element / Element) There are issues here with forms that are part of the syntax vs forms that are part of the intermediate representation. A line based parser for Org syntax that assembles greater blocks after the fact and a parser that uses arbitrary lookahead to truncate on headings won't have the exact same surface syntax, however they will both have an equivalent in their intermediate representation that corresponds to a greater block. Again, very deep in implementation details here, but trying to force things like sections into the syntax hierarchy seems confusing to me. > 7. Object Paragraph element maybe? Might seem odd for heading titles to have paragraph scope, but on the other hand it certainly simplifies the explanation of the grammar. And you can put an inline footnote in a heading title. > 8. Pattern / Form Don't know what to make of this one. Like "Term" these are incredibly generic. > 9. Term Use of "Term" is super confusing to me. > We could say call (1) Components, (7) Units, (6) Objects, (5) Element or > Object (why not spell it out to avoid telling people to remember > something). I'm not sure we are ready to specify this. One way that we might try to manage this would be to create a taxonomy of element types, e.g. 
top-level elements, paragraph elements, etc. This would be consistent with the fact that the elisp implementation of org-element has all of these as an instance of element. > I could have put more thought into this, but it should do for > illustrating my line of thinking. Let me know if you have any good > ideas. Let's leave the terminology as is right now. I'm expecting that there will be quite a few new terms that we will want to introduce and we will want to separate syntax and intermediate representation. With progress on using org-element for fontification and on laundry we should be able to come up with language that can be used to distinguish between concepts that are needed for syntax, (tokens, parser) and for intermediate representations. Things like basic syntax highlighting need only the langua
Re: Concrete suggestions to improve Org mode third-party integration :: an afterthought following Karl Voit's Orgdown proposal
Hi all, I have a much longer mail in the works, a quick one for now. I think it is a major strategic mistake to exclude discussions about interoperability from this list. As Bastien pointed out in his talk at Emacsconf there is only a single list for both users and developers. Discussion about interoperability with tools for working with Org are entirely valid subjects for the user list. Obviously help and support for other tools is not valid for the list, but questions about interoperability or incorrectness of some external tool should always be valid. We must provide strong technical leadership for all tools that want to work with Org syntax otherwise we risk it spiraling out of control. Forcing discussions off list will split the community and I think the fact that Karl's work made it to this list so late in the process shows the danger of trying to exclude certain discussions. I follow this list, I keep the community up to date with my work, I have no idea where to look for other Org related discussions, nor frankly do I have time to look for them. I suspect I am not alone in this. Whether a certain portion of the Org community likes it or not, there is another portion for whom Org syntax already has a life beyond Org mode (e.g. academic papers and computation notebook style workflows). For some workflows documents written in Org syntax are a primary exchange format and format of record, not just an internal format from which documents for sharing are generated. The plain text nature of Org syntax and the freedom that it enables also means freedom from Emacs. Empowering users to own and control their own data to use with their own tools is the whole point. The fact that this means that it works outside Emacs is a critical feature for many data preservation use cases. Enough for now. Best! Tom
Re: Org-syntax: Intra-word markup
Hi all, After a bunch of rambling (see below if interested), I think I have a solution that should work for everyone. The key realization is that what we really want is the ability to have a "parse me separately" type of syntax. This meets the intra-word syntax needs and might meet some other needs as well. The solution is to make @@org:...@@ a "parse me separately" block! It nearly works that way already too! To minimize typing we could have @@:...@@ the empty type default to org. This seems like a winner to me. The syntax for it already exists and won't conflict. It requires relatively minimal additional typing, the implication is clear, and there are other places where such behavior could be useful. This syntax seems like a winner to me @@org:/hello/@@world @@:/hello/@@world You can also do things like #+begin_src org I want a number in this number@@org:src_elisp{(+ 1 2)}@@word! #+end_src Which would render to #+begin_src org I want a number in this number3word! #+end_src Thoughts? Best! Tom --- rambling below - > This idea reminds me a bit of Scribble/Racket where every document is > just inverted code, which makes it possible to insert arbitrary Racket > code in your prose... I will say, despite some of my comments elsewhere, that I think exploring certain features of Scribble syntax for use in Org mode would simplify certain parts of the syntax immensely. For example various inline blocks are an absolute pain to parse because they allow nested delimiters /if they are matched/. The implementation of the /if they are matched/ clause is currently a nasty hack which generates a regular expression that can only actually handle nesting to depth 3. Actually implementing the recursive grammar adds a lot of complexity to the syntax and is hard to get right. It would be vastly simpler to use Scribble's |<{hello }} world}>| style syntax and always terminate at the first matching delimiter. 
I'm sure that this would break some Org files, but it would make dealing with latex fragments and inline source blocks and inline footnotes SO much simpler. Matching an arbitrary number of angle brackets does add some complexity, but it is tiny compared to the complexity of enforcing matched parens and their failure cases especially because many of the places where nesting is required probably only see use of the nesting feature in a tiny fraction of all cases. One other reason why this is attractive is that all the instances where nested delimiters can appear on a line are preceded by some non-whitespace character. This means that using the pipe syntax does not conflict with table syntax! Now the question comes. If we could implement this for delimiters, could we also implement something similar for markup? The issue with the proposed markup outside delimiter inside approach is that it will change existing behavior for files that want the delimiters to be included in the markup, i.e. /{oops}/ becoming /oops/ is bad. A second issue is that putting the delimiter inside the markup cannot work for verbatim and code ={oops}= is ={oops}= no matter what. Therefore the solution is not uniform across all types of markup. We need another solution that works for all types of markup. What if we put the "start arbitrary markup" char outside the markup? Say something like |/ital/|icks? Or what if we went whole hog and used |{/ital/}|ics and made the |{...}| syntax trigger a generalized feature where the contents of the |{...}| block are parsed by themselves and can abut any other text? This would be generally useful in a variety of situations beyond just intra-word markup. What are the issues with this approach? The first issue is that there is a conflict with table syntax if we were to use the pipe character because markup can appear at the start of a line. 
The second issue is that it might be confusing for users if |{}| also worked like {} when in the context of latex elements or inline src blocks, or maybe that is ok because |{}| never renders as text. Hrm. Ok. Second issue resolved, but what to do about the first? If we want generalized "parse this by itself" syntax so that we can write hello|{/world/}|ok, then we need a solution that can appear at the start of a line. So we can't use pipe because that is always a table line even if a zero width space is put before it ;). What other options do we have? How about #+|{/hello/}|world for the start of a line? As long as there is no trailing colon it isn't a keyword, so it could work ... except that if someone reflows the text and it is no longer at the start of a line then the syntax breaks. That is to say using #+| at the start of a line is not uniform, so we can't take that approach. What other chars do we have at our disposal? Hrm. How about @@? Could we use that? What happens if we use @@org:/hello/@@world? Or maybe if we want to minimize the number of chars we could do @@:/hello/@@world and have the empty prefix in @@ blocks mean org?
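For the "terminate at the first matching delimiter" idea from the rambling above, a small sketch in Python. This is neither Scribble's reader nor anything in Org; `read_block` is a hypothetical helper. The opener is a pipe, any number of < characters, then {, and the closer is built from the opener, so no nesting depth has to be tracked.

```python
import re

# Opener: a pipe, zero or more "<", then "{". The number of "<"
# characters decides what the closer has to look like.
OPENER = re.compile(r"\|(<*)\{")


def read_block(text, start=0):
    """Return (body, end_index) for the first |<{...}>| style block.

    The block ends at the FIRST occurrence of the closer derived from
    the opener. Braces inside the body are never counted or matched.
    Returns None when there is no opener or the block is unterminated.
    """
    opened = OPENER.search(text, start)
    if opened is None:
        return None
    closer = "}" + ">" * len(opened.group(1)) + "|"
    end = text.find(closer, opened.end())
    if end == -1:
        return None
    return text[opened.end():end], end + len(closer)


if __name__ == "__main__":
    print(read_block("|<{hello }} world}>| and more"))
    print(read_block("hello|{/world/}|ok"))
```

The unbalanced }} in the first example is plain body text because only the exact closer }>| ends the block. The cost is that a body containing its own closer needs a longer opener, which is the trade against the current matched-nesting rules.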
Re: Org-syntax: Intra-word markup
> Since org is a valid export backend though, perhaps this behaviour should be > reserved for @@:…@@, i.e. no export backend, which I think semantically fits > fairly nicely. This ends up being even more convenient than I initially realized. The current spec for export snippets is ambiguous when it says "NAME can contain any alpha-numeric character and hyphens" but the implementation behavior requires that "any" means "at least one" and is implemented using the + regex operator. What this means is that @@:...@@ syntax is not actually used in Org at all at the moment and renders as plain text. I agree that we need to avoid @@org:..@@ because it has legitimate uses. Making a back-end of empty string valid for parse separately syntax thus makes @@ syntax more regular overall, and allows @@:...@@ to be processed separately because it currently never enters the export snippet processing. This is important because export snippets do not seem to be easily accessible to earlier phases of the org-export machinery, i.e. there isn't a nice centralized place to preprocess @@org:...@@ even if we wanted to. On the other hand @@:...@@ isn't processed at all. I could be missing something in the org export code though. It will take a bit of work to get this behavior implemented I think, but it doesn't seem to have any conflicts. Some users may have set the empty backend to expand manually via org-export-snippet-translation-alist, but as long as we give org-export-snippet-translation-alist priority and warn people that setting "" manually will disable the new functionality then there shouldn't be any disruption. The behavior also sort of matches what we would want the empty string to be in this case, which is "all backends" and of course the only markup that makes sense for "all backends" is org itself! Best, Tom
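The "any" versus "at least one" point can be seen with a toy pattern. This is a simplified Python regular expression written for illustration, not the one org-element uses: with + on NAME the @@:...@@ form never matches, so it falls through as plain text, and only a * would let the empty back-end in.

```python
import re

# Simplified stand-ins for the export snippet pattern (assumption: the
# real Org regexp differs in detail, only the quantifier matters here).
NAME_AT_LEAST_ONE = re.compile(r"@@([-A-Za-z0-9]+):(.*?)@@")
NAME_MAY_BE_EMPTY = re.compile(r"@@([-A-Za-z0-9]*):(.*?)@@")


def snippet(pattern, text):
    """Return (name, value) for the first snippet in TEXT, or None."""
    found = pattern.search(text)
    return found.groups() if found else None


if __name__ == "__main__":
    print(snippet(NAME_AT_LEAST_ONE, "@@html:<b>@@bold"))
    print(snippet(NAME_AT_LEAST_ONE, "@@:/hello/@@world"))
    print(snippet(NAME_MAY_BE_EMPTY, "@@:/hello/@@world"))
```

Under the first pattern the empty-name form is simply not an export snippet, which is why it is free to be claimed for "parse separately" syntax.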
Re: Parens matching errors in org-babel code blocks
Definitely a known issue. No easy way to fix it without someone doing a deep pass on syntax propertization I think. I have a version of rainbow delimiters mode that tries to work around this at least for font locking, but it is severely broken and has some nasty quadratic performance issues in large files. I'll have to look into the proposed solution that Tim mentions, I may have missed it (unless it was the solution for <> that John mentions in the linked thread, in which case that one is not sufficient). Here is a discussion from back in April. https://lists.gnu.org/archive/html/emacs-orgmode/2021-04/msg00031.html Best, Tom
Re: [PATCH] Accept more :tangle-mode specification forms
Hi Timothy, The confusion with 755 and "755" could lead to security issues in cases like 600 vs "600" vs #o600. The need to protect against the 600 case is fairly important, however I don't think there is anything we can do about it, because someone might want to enter their modes as base 10 integers. If we were to prepend every integer with #o (or setting the radix to 8 when reading this particular field) before passing it to org-babel-parse-header-arguments then it would be impossible to use base 10 integers unless they were provided in the #10r600 form (Emacs doesn't support #d600 notation). I think the best bet is to change the radix for bare integers to 8 when reading that particular header, however I don't know how complex that would be to implement. If we don't want to change the radix to 8 then here are some suggestions. If #o0600 already parses correctly, then I suggest we leave things as is. Adding complexity just to drop the leading # seems wasteful. We may want to warn or raise an error if someone uses a value such as the base 10 integer 600 which does not map to the usual expected octal codes so that they don't silently get bad file modes that could leave files readable to the world. Best, Tom
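To show what goes wrong with a bare 600, a short Python sketch of the arithmetic. The helpers `read_mode` and `permission_string` are invented for this example and have nothing to do with org-babel-parse-header-arguments. Reading the digits with radix 8 gives the mode the user almost certainly meant, while the base 10 integer 600 is octal 1130, a quite different set of permission bits.

```python
def read_mode(digits, radix=8):
    """Read a bare run of digits as a file mode using RADIX."""
    return int(digits, radix)


def permission_string(mode):
    """Render the low nine permission bits of MODE as rwxrwxrwx."""
    flags = "rwxrwxrwx"
    return "".join(
        flag if mode & (1 << (8 - position)) else "-"
        for position, flag in enumerate(flags)
    )


if __name__ == "__main__":
    intended = read_mode("600")        # digits read as octal
    accident = read_mode("600", 10)    # digits read as decimal
    print(oct(intended), permission_string(intended))
    print(oct(accident), permission_string(accident))
```

Read as octal the mode is rw------- (owner read and write only). Read as decimal it is 0o1130, which sets the sticky bit and gives --x-wx--- in the low nine bits, so the owner cannot even read the file. A warning on values like this would catch the mistake before a file is written with an unintended mode.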
Re: "Orgdown", the new name for the syntax of Org-mode
> I believe (IMHO) that it does not make much sense to separately name the > Org Mode syntax (as a markup language). That would only generate > confusion among users. This is unfortunately not the case. Conflating Org mode which is an Emacs major mode with Org syntax is a major communication barrier that leads to confusion for anyone trying to implement a tool based on Org syntax. For example I couldn't just call my implementation of an org-mode-like package for Racket "Org mode" because it is not an Emacs major mode. The absence of a name for Org syntax hampers search and discovery. I'm happy to keep using the multi-word term Org syntax, but I have found a practical need to distinguish the surface syntax from the Emacs major mode to reduce confusion for technical users. Best, Tom PS Another brainstormed name: Orgsyn?
Re: "Orgdown", the new name for the syntax of Org-mode
I had jokingly suggested "orgup" to have a more positive feeling (up instead of down) than markdown. I'm not sure orgdown will be any more confusing than some other name. It could imply a version of the org syntax that uses markdown surface syntax, but it seems that that would probably be called org flavored markdown by the existing conventions in the markdown community. Best, Tom
Re: Some commentary on the Org Syntax document
Hi Timothy, Replies in line. Best! Tom On Thu, Dec 2, 2021 at 1:32 AM Timothy wrote: > > Hi All (& Nicolas in particular again), > > With my recent efforts to write a parser based on > <https://orgmode.org/worg/dev/org-syntax.html>, I’ve developed a few thoughts > on > that document. Hopefully, they can lead to some improvements and > clarifications. > > > > As a general comment, in many places the Org Syntax document states what > characters a component can contain, but not what objects/elements. This feels > like a bit of a hole in the current specifications. This is indeed confusing because there are some implicit constraints that are not listed because they never come up. For example, you cannot have two newlines inside an inline footnote because the two newlines break the paragraph and the thing that appears to be an inline footnote is just plain text that is never terminated. Ensuring that font locking is in sync with org-element and org-export is critical to ensure that users know what will actually happen. > > > Sections > > > Heading > ─── > > ⁃ Ok, so `TITLE' can have any character but a newline, but what Org > components can it contain? > I’m going to assume any object? Via org-element-object-restrictions it is standard-set-no-line-break which is all objects except citation-reference, table-cell, and line-break. > > > Affiliated Keywords > ═══ > > > Greater Elements > > > Greater blocks > ── > > ⁃ It is not explained what is ment by a “special block” > ⁃ Aren’t lines starting with `#+' also quoted by a comma? > > > Drawers and Property Drawers > > > ⁃ “Contents can contain any element but another drawer” > • Does “any element” mean “any Element or Greater Element” Any element that does not have greater precedence, so the only exclusion would be a heading. > > Dynamic Blocks > ── > > ⁃ It is not specified what `CONTENTS' may be Implicitly follows the same rules as drawers, no headings and no nesting of dynamic blocks. Text should be added that states this explicitly. 
> ⁃ Surely `PARAMETERS' cannot contain a newline? Termination by newline is implicit in the example, but the text is confusing. > Plain Lists and Items > ─ > > ⁃ It is not completely clear what content an item may have. > I assume any Object? By my reading it may contain anything, objects and elements, except for a heading, but that is already implied by the de-indent. To quote from the docs: An item ends before the next item, the first line less or equally indented than its starting line, or two consecutive empty lines. Indentation of lines within other greater elements do not count, neither do inlinetasks boundaries. This makes plain lists one of the most complex elements to parse. > > Tables > ── > > ⁃ Surely newlines are not allowed in `FORMULAS' No newlines are implicit in the use of "lines" but still confusing. > > Elements > > > Clocks > ── > > Two allowed forms are listed, but are all four of the below allowed or only > two? > ┌ > │ CLOCK: INACTIVE-TIMESTAMP > │ CLOCK: INACTIVE-TIMESTAMP DURATION > │ CLOCK: INACTIVE-TIMESTAMP-RANGE > │ CLOCK: INACTIVE-TIMESTAMP-RANGE DURATION > └ No. Only the two are allowed. An inactive timestamp alone is a starting point, adding a duration without the end point means that there is no way to check that the range and duration match. > All the best, > Timothy
Re: Org-syntax: Intra-word markup
I don't mean to be a wet blanket, but the edge cases for the current markup syntax are already hard enough to implement correctly, to the point where different parts of Org mode are inconsistent. Intra-word markup isn't viable because there simply isn't any sane way to parse something like *hello world*/hrm/oh no*. The other issue is that this will degrade parsing performance because almost every character could precede the start of a markup section. I recommend anyone suggesting solutions try to implement something that can parse the markup unambiguously with lots of nasty test cases. You will likely find that it is impossible to consistently tokenize markup, and that you have to hand write a whole bunch of heuristics, making Org syntax even harder to implement correctly. Any solution that suggests extending how =/*~+_ can be used gets a hard no from me. I could see teaching other exporters how to interpret \emph{hello}world, but trying to have any sane behavior for something like why *hello*world oh no a wild asterisk* is not worth it. Best, Tom
Re: Orgdown: negative feedback & attempt of a root-cause analysis (was: "Orgdown", the new name for the syntax of Org-mode)
Karl, The exact naming of a thing is nearly always the most contentious step in trying to promulgate it. In my own field we can easily get all parties to agree on a definition, but they refuse to budge on a name. As others have said, I wouldn't worry about kibitzing over the name. I would however worry about the larger negative reaction. From my perspective I think the issue is that there are many efforts working toward a formalized specification for Org syntax and Org mode functionality, and some of those stakeholders who have invested significant effort may feel blindsided by a public declaration announcing Orgdown because they were not consulted and not made aware that you were working on it. I appreciate the amount of work that you have put in, I have devoted hundreds of hours to working on an alternate implementation of org in Racket that uses a formal EBNF in hopes that others will be able to use it as a guide and as a way to talk formally about how Org parsers and implementations should behave. It would thus be easy for me to say that your approach has put the cart before the horse, because there are countless nuances in the specification for Org syntax which must be addressed before any levels of org compliance can be specified, otherwise the behavior between levels will be inconsistent. If I were to say this, it would not be fair to you at all. The ideas and motivation for Orgdown are vital and important. You have put in enormous thought and effort, all because you care about Org and want to see it succeed. The issue is that any shared specification for Org syntax is fundamentally about how to coordinate as a community. The way that Orgdown was presented to the community feels (to me) like it is being imposed top down or coming from an individual source, not from an open and visible community process (the subject of your original email reads as a declaration in English, and thus can be quite off putting, though I know that was not the intention). 
I personally haven't bothered with promulgation because I think that we are not technically ready as a community to approach outreach to other developers in a way that we can succeed. The good news is that all of this can co-exist if we want it to, but we need to be clear about our objectives as a community. To me these objectives are as follows (and I would love to hear from others about additional or alternate objectives). 1. To never fracture Org syntax so as to avoid the nightmare of markdown flavors. (This means being able to say clearly as a community that a parser is out of compliance and that it is up to the user to fix their files. The ruby org parser used by Github is a major issue here.) 2. To provide a clear specification for what graceful degradation looks like when parsing Org syntax if a parser does not support some portion of that syntax (e.g. should property drawer lines be excluded or rendered as plain text?). 3. Provide a solid basis on which further formal specification can be built. (My interests in particular are around providing consistent semantics for org-babel blocks across languages so that babel implementations can clearly communicate what runtime features they support.) The approach for Orgdown can absolutely meet all three of these objectives, however in its current form Orgdown1 is not sufficiently well specified to avoid fracturing the syntax. This is because Org syntax is extremely complex (even the elisp implementation of Org mode is internally inconsistent) and there are edge cases where behavior will diverge if parsing of even the simplest elements is not fully specified. There are many ways to remedy this, however they require a more formal approach. A number of us are working to build technical foundations for such a formal approach, but I do not think that any of those projects are ready to be used to specify discrete levels of Org syntax parsing compliance. 
If I may, I would suggest that an Orgdown0 is something that could be well specified, but it would avoid parsing of markup altogether and only deal with the major element types. Parsing paragraphs and all the org objects is not something that can be done piecemeal. There are too many interactions between different parts of the syntax, and in some cases the existing specification desperately needs to be revisited due to the complexity that it induces or because it is underspecified. Of course this would make Orgdown0 fairly useless as a replacement for markdown, but at least it would be a start. Best, Tom
Re: noweb and shell heredocs
Hi Łukasz, One workaround that is fairly reliable is to prefix the names of the blocks to be nowebbed with an &. So #+name: block-name becomes #+name: &block-name. Then you reference it as <<&block-name>> and the heredoc syntax is broken. Best, Tom
Re: Formal syntax for org-cite
Hi Timothy, Thanks for putting this together. Comments in line. Best! Tom For reference here is the tokenizer pattern I use in laundry at the moment. There are a number of issues with it ... https://github.com/tgbugs/laundry/blob/5a396bef98d9a3cd9ee929f21cd47612dd6cb1ac/laundry/lex-abbrev.rkt#L896-L913 > Citation syntax is currently not documented, but from the implementation > it looks something like this: > #+begin_example > [cite CITESTYLE: GLOBALPREFIX KEYCITES GLOBALSUFFIX] > #+end_example There is potential confusion here because =[cite= does not have to be followed by a space (rather, cannot be). The top level syntax is =[cite= terminating at the first occurrence of =]=. I think we may also need to include a note that no whitespace is allowed either? It will only be recognized within paragraph context (e.g. headings, paragraphs, and other places where org objects can appear). Stating that up front would clarify that the rest of the syntax described here is how to determine whether the citation is well formed/how to parse it. > =KEY= can be made of any word-constituent character, =-=, =.=, =:=, =?=, > =!=, =`=, ='=, =/=, =*=, =@=, =+=, =|=, =(=, =)=, ={=, =}=, =<=, =>=, > =&=, =_=, =^=, =$=, =#=, =%=, =%=, or =~=. You have a duplicated =%= here. > I have not yet confirmed what =KEYPREFIX= and =KEYSUFFIX= may contain, > but as a starting point, any of the characters allowed in =KEY= except > =@= plus whitespace would seem fairly safe. =KEYSUFFIX= must start with > a whitespace character to be able to be differentiated from =KEY=. I don't think we can allow whitespace here? > =CITESTYLE= consists of a main =STYLE= and any number of =VARIANT=s > (including zero), prefixed by forwards slashes in the following pattern > #+begin_example > /STYLE/VARIANT/VARIANT/VARIANT > #+end_example Need clarification on empty styles, e.g. [cite//:] > "cite" and =CITESTYLE=, =KEYCITES= and =GLOBALSUFFIX= are /not/ > separated by whitespace. 
Neither are =KEYPREFIX=, =@KEY=, or =KEYSUFFIX= > separated by whitespace. I may be missing something, but this is confusing with respect to the statement about =KEYSUFFIX= and whitespace made above.
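To make the =KEY= character set quoted above easier to reason about, here is a rough Python sketch of a key matcher built from that list. It is only an illustration, not the org-cite or laundry implementation: "word-constituent" is approximated by Python's \w (an assumption, since Emacs syntax tables differ), and the names KEY_PUNCT, KEY_RE and find_keys are hypothetical.

```python
import re

# Punctuation taken from the quoted description of KEY (duplicate % dropped).
KEY_PUNCT = "-.:?!`'/*@+|(){}<>&_^$#%~"

# Assumption: Emacs "word-constituent" is approximated by \w here.
KEY_RE = re.compile(r"@([\w" + re.escape(KEY_PUNCT) + r"]+)")


def find_keys(citation: str) -> list:
    """Return every KEY that follows an @ in a citation string."""
    return KEY_RE.findall(citation)


if __name__ == "__main__":
    # Whitespace, ';' and ']' are not KEY characters, so they end a key.
    print(find_keys("[cite/t:see @doe2020 p. 3;@smith:1999]"))
```

Because whitespace is not in the set, a suffix that begins with whitespace is cleanly separated from the key, which is the point of the =KEYSUFFIX= remark in the quoted draft.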
Re: [PATCH] ob-core: tangle check library of babel after current buffer
Pinging on this to see if anyone can test it so that it can be merged. Tom On Wed, Jun 16, 2021 at 4:29 PM Tom Gillespie wrote: > > Hi, >This is a patch that fixes tangling behavior when a block has been > ingested into the library of babel and then modified. Best! > Tom
Re: Headings and Headlines
I enthusiastically support changing the documentation to use heading. I use heading in my formal grammar because I have found there are more ways that it can be modified and remain grammatically correct when used in English sentences. The internal implementation in elisp still refers to headlines, but changing the docs would be a good first step. Best! Tom
RE: Timestamp parsing inside node properties and other contexts out of org-element-object-restrictions (was: [BUG] Agenda no longer works for timestamps inside properties drawer [9.5.2 (release_9.5.2-
Hi Tim, Thanks for these thoughtful comments. I agree that the Org developers (to whom I, as a mere user, owe enormous thanks) must be wary before making changes to how timestamps are handled. This argues, I would say, for keeping what I believe was the status quo since at least Org version 9.4.4: Agenda views would display entries with active timestamps in property drawers. That has been my historical experience. Tim, has your historical experience been different? In the invoicing example you give, were the timestamps in the properties drawer active, or inactive? I have just verified with a simple test that Org version 9.4.4, which was shipped with Emacs 27.2 I believe, does display entries with an active timestamp as the value of a property in the ordinary :PROPERTIES: drawer. That's the situation I'm calling the "status quo." I'm wondering if my experience coincides with the experience of others. Here's the simple entry that will be shown on the Week/Day Agenda view in 9.4.4: * TODO Test of active timestamps :PROPERTIES: :Created: <2022-03-22 Tue 18:30> :END: And note this: adding a second active timestamp to the test entry, e.g., to accompany a SCHEDULED: keyword, results in the entry appearing on the Agenda twice, as would be expected: * TODO Test of active timestamps SCHEDULED: <2022-03-22 Tue 18:30> :PROPERTIES: :Created: <2022-03-22 Tue 18:30> :END: This second example shows why the variable org-agenda-skip-additional-timestamps-same-entry is valuable. I rarely want an entry to display twice on the same day. 
Tom Davey -- Tom Davey t...@tomdavey.com New York NY USA -Original Message- From: Emacs-orgmode On Behalf Of Tim Cross Sent: Tuesday, March 22, 2022 5:10 PM To: Ihor Radchenko Cc: Ignacio Casso ; emacs-orgmode@gnu.org; t...@tomdavey.com; Nicolas Goaziou Subject: Re: Timestamp parsing inside node properties and other contexts out of org-element-object-restrictions (was: [BUG] Agenda no longer works for timestamps inside properties drawer [9.5.2 (release_9.5.2-24-g668205 @ /home/ignacio/repos/emacs/lisp/org/)]) Ihor Radchenko writes: > Ihor Radchenko writes: > >> After further reading the source code, I figured that agenda is, in >> fact, supposed to handle timestamps inside property drawers. Optional >> arguments for org-at-timestamp-p imply that, in agenda specifically, >> timestamps inside node properties are considered timestamps despite >> they are not being parsed as timestamps by org-element. > > Even though I fixed the reported issue with agenda not showing > headings with matching timestamps inside property drawers, this > situation is revealing a big inconsistency in Org mode's handling of timestamps. > > org-at-timestamp-p usage implies that Org syntax for timestamps is not > only context-dependent, but also depends on current command! > > org-at-timestamp-p is called with non-nil argument in a number of > functions in Org: > - org-clock-timestamps-change > - org-mouse-delete-timestamp > - org-mouse-context-menu > - org-follow-timestamp-link > - org-get-repeat > - org-auto-repeat-maybe > - org-time-stamp > - ... many more in org.el > > So, depending on the current command, Org may on may not treat objects > matching org-ts-regexp-both as timestamps. > > This situation complicates syntax and makes org-element unreliable > when dealing with Org buffers. > > Should we just simply allow timestamps to be a part of node property > values? 
Should we _not_ treat timestamp-looking text outside their > allowed contexts (like quotes, source blocks, etc) as timestamps? > I think we have to be very wary here. I can see any changes here causing lots of breakage for people. I know for my own use case, I use timestamps a lot in property drawers and various source blocks. I never want any of them showing up in my agenda. As an example, I was recently working for a company which required that you put a timestamp in both a file header and in comments. The format they used was pretty much the same as an org-mode active timestamp. I use org mode to tangle the source files I write (as well as manage my client data, such as todos, invoicing, contacts etc), so these files are searched for agenda items, but I do not want any of those timestamps causing lines in my agenda views. These timestamps are most commonly found in source and example blocks. I think the only time an org timestamp should be recognised in a source block is when that source block is an org-mode source block. I don't think they should ever be 'recognised' in example blocks. In addition, my invoicing solution, which is based on org, uses timestamps to track invoice periods etc. None of these should ever appear in the agenda. This information is typically tracked in property drawers. Unfortunately, I think org times
RE: [BUG] Agenda no longer works for timestamps inside properties drawer [9.5.2 (release_9.5.2-24-g668205 @ /home/ignacio/repos/emacs/lisp/org/)]
Ihor writes: > I personally see allowing timestamps (and links) inside property values as a > useful feature. > Would it be of interest for other users? Yes, it's a quite useful feature. For years, via my Capture templates, I've been adding a property named :Created: to the properties drawer as follows: :PROPERTIES: :Created: <2022-03-06 Sun 22:42> :END: Now, in 9.5.2, literally hundreds of entries that formerly appeared on the built-in Agenda views cannot be easily found. Regards to all, Tom PS The variable 'org-agenda-skip-additional-timestamps-same-entry seemed expressly made for my use case, to clean up same-day clutter in entries with a TODO timestamp. -- Tom Davey t...@tomdavey.com New York NY USA -Original Message- From: Emacs-orgmode On Behalf Of Ihor Radchenko Sent: Saturday, March 12, 2022 7:29 AM To: Ignacio Casso Cc: emacs-orgmode@gnu.org Subject: Re: [BUG] Agenda no longer works for timestamps inside properties drawer [9.5.2 (release_9.5.2-24-g668205 @ /home/ignacio/repos/emacs/lisp/org/)] Ignacio Casso writes: > In Emacs 27.2, with an up to date version of org from ELPA (9.5.2), > org-agenda considers timestamps that appear in property drawers, so > the entry below appears in the daily agenda view. > > * Heading > :PROPERTIES: > :timestamp: <2022-03-12 sáb> > :END: > > However, in the latest Emacs version built from source (29.0.50), with > the built-in version of org (also 9.5.2, but the latest release, I > assume), this is no longer the case and that entry does not appear in > the agenda view. > > I know that maybe it's unorthodox, but I have some org files that rely > in the previous behavior, with entries like the following: > > * Some friend > :PROPERTIES: > :birth-date: <1994-03-12 sáb +1y> > :END: > > Is this a bug? If it's not, can someone point me to the functions I > need to touch to restore the previous behavior? Or maybe I should stop > doing this and start moving those timestamps out of the properties > drawer in my files? 
What you see in the new Org version is not a bug. Property values are treated as plain text by Org. In the older versions, agenda code did not rely on Org's internal parsing and matched timestamps in places where timestamps are not allowed (inside code blocks, for example). See https://orgmode.org/list/20220101122409.ga29...@itccanarias.org Dear all, I was unable to find a place in manual describing that timestamps cannot be placed inside property values: >> A timestamp is a specification of a date (possibly with a time or a >> range of times) in a special format, either ‘<2003-09-16 Tue>’ or >> ‘<2003-09-16 Tue 09:39>’ or ‘<2003-09-16 Tue 12:00-12:30>’(1). A >> timestamp can appear anywhere in the headline or body of an Org tree >> entry. Its presence causes entries to be shown on specific dates in >> the agenda (see *note Weekly/daily agenda::). We distinguish: I personally see allowing timestamps (and links) inside property values as a useful feature. Would it be of interest for other users? In any case, we should probably clarify manual in this regard. Best, Ihor
RE: [BUG] Agenda no longer works for timestamps inside properties drawer [9.5.2 (release_9.5.2-24-g668205 @ /home/ignacio/repos/emacs/lisp/org/)]
Ignacio writes: > I've located the line in org-agenda.el responsible of the new behavior, > and the following patch seems to fix it. I suggest it is incorporated > into the repository, maybe with a variable org-agenda-skip-timestamps- > in-properties-drawer defaulting to t if not everyone agrees. I second that suggestion for the repository! Thanks very much for the patch. I think you are correct in supposing that when Emacs 28.1 is released, many Org users who upgrade will be mystified at the new timestamp behavior and will spend time without success trying to figure out what changed. Perhaps the new variable you propose, org-agenda-skip-timestamps-in-properties-drawer, should default to nil to preserve the historical behavior? -- Tom Davey t...@tomdavey.com New York NY USA -Original Message- From: Emacs-orgmode On Behalf Of Ignacio Casso Sent: Monday, March 21, 2022 7:21 PM To: t...@tomdavey.com Cc: emacs-orgmode@gnu.org Subject: Re: [BUG] Agenda no longer works for timestamps inside properties drawer [9.5.2 (release_9.5.2-24-g668205 @ /home/ignacio/repos/emacs/lisp/org/)] >> What you see in the new Org version is not a bug. Property values are >> treated as plain text by Org. >> >> I was unable to find a place in manual describing that timestamps >> cannot be placed inside property values: >> I personally see allowing timestamps (and links) inside property values as a useful feature. >> Would it be of interest for other users? > > Yes, it's a quite useful feature. For years, via my Capture templates, I've been adding a property named :Created: to the properties drawer as follows: > > :PROPERTIES: > :Created: <2022-03-06 Sun 22:42> > :END: > > Now, in 9.5.2, literally hundreds of entries that formerly appeared on the built-in Agenda views cannot be easily found. 
It seems that I'm not the only one using this unintended feature in previous versions of org, and probably there will be many others who don't use the latest version of org and have not noticed yet but will have the same problem when they upgrade. I think that even if timestamps were never intended to be used inside property drawers before, the fact that it worked for a long time and nothing in the documentation suggested otherwise makes it a de facto feature, even if unintended, and should be preserved. I've located the line in org-agenda.el responsible of the new behavior, and the following patch seems to fix it. I suggest it is incorporated into the repository, maybe with a variable org-agenda-skip-timestamps-in-properties-drawer defaulting to t if not everyone agrees.
Re: [BUG] Make SVG + LaTeX work by default [9.5.2 (release_9.5.2-9-g7ba24c @ /Users/salutis/src/emacs/nextstep/Emacs.app/Contents/Resources/lisp/org/)]
I do not think we can add -shell-escape by default because it is an arbitrary code execution vector. It might be good to add a setting in org that would do the right thing without requiring a user to understand the arcana of latex cli options though. Best, Tom
Re: Suggestion: convert dispatchers to use transient
The backward compatibility requirements for org mean that it won't be possible to replace the existing implementation for quite a while. That said, I imagine that having optional transient dispatchers for users on newer versions of emacs would be appreciated. Best, Tom
Re: Org Syntax Specification
Hi Ihor, Thank you very much for the detailed responses. Let me start with some context. 1. A number of the comments that I made fall into the brainstorming category, so they don't need to make their way into the document at this time. I agree that it is critical for this document to capture how org is parsed right now and that we should not put the pie-in-the-sky changes in until the behavior of org-element matches (if such a change is made at all). 2. Though I haven't been hacking on it, I fully intend to contribute test cases and exploratory work on org-element in the future, so please don't interpret some of what I am writing as requests for other people to write code (unless they want to :) 3. When I say grammar in this context I mean specifically an eBNF that generates a LALR(1) or LR(1) parser. This is narrower than the definition used in the document, which includes things that have to be implemented in the tokenizer, or in a pass after the grammar has been applied, or are related to some other aspect beyond the pure surface syntax. 4. A number of my comments are about the structure of the document more than the structure of the syntax or the implementation. I think that most of them are trying to ask whether we want to clearly delineate pure surface syntax from semantics to make the document easier to understand. More replies in line. Best! Tom > As for your other comments, you seem to be suggesting a number of > changes to the existing Org syntax. Some of them looks fine, some are > not. However, please keep in mind that we have to deal with back > compatibility, third party compatibility, and not breaking existing Org > documents unless we have a very strong justification. I suggest to > branch a number of new threads from here for each concrete suggestion > where you want to make changes to Org syntax, as opposed to just > document wording. Otherwise, this discussion will become a total mess. Agreed. 
I put many of these in here as notes from my experiences; I will branch those off into separate discussions so that we don't pollute this thread. > Nope. Sections are actually elements. See =org-element-all-elements=. I realized this at a slightly later date but missed cleaning up this comment. See my response on section vs segment below. > I disagree. Nesting rules are the important part of syntax. We have > restrictions on what elements can be inside other element. The same > patterns are not recognised in Org depending on their nesting. For > example, links that you put into property drawers are not considered > link objects. When I wrote this comment I was still confused about sections. I think discussion of nesting in most contexts is ok, but there are some cases where nesting cannot be determined from the grammar, and there I think we need to make a distinction. In my thinking I separate the context-sensitive nature of parsing from the nesting structure of the resulting s-expressions, org elements, etc. The most obvious example of this is that the s-expression representation for headings nests based on the level of the heading, but heading level cannot be determined by the grammar so it must be reconstructed from a flat sequence of headings that have varying level. > Again I disagree. While your idea about table cells is reasonable > (similar for citation-references inside citations), I am against > decoupling Org syntax from org-element implementation. In > org-element.el, table-cells are just yet another object. If we make > things in org-element and syntax document out of sync, confusion and > errors will follow during future maintenance. Org element treats all elements and objects as a single homogeneous type. This is fine. However, to help people understand the syntax it seems easier to define things in a positive way so that we don't say "all except these two." 
Therefore, despite the fact that the implementation of org-element treats table rows and cells no different from any other node in the parse tree, we don't need to burden the reader with that information at this point in time, and could provide that information as an implementation note for cells. I think the other issue I was having here is that the spec for tables is spread all over the place, and it would be much easier to understand and implement if it were all in one place. > This actually reads slightly confusing. "Blank lines separate paragraphs > and other elements" sounds like blank lines are only relevant > before/after paragraphs. However, there are also footnote references and > lists. Maybe we can try something like: > > Blank lines can be used to indicate end of some elements. > > "can" because a single blank line usually does not separate anything. I think your version is quite a bit more readable. Can we list the set of all the elements t
Re: Problem when tangling source blocks with custom coderefs
Hi Luis, I don't think you are doing anything wrong. IIRC the portion of the patch that allowed the customization to propagate to the tangled code was not included. Given that I am no longer the only one who is looking for/expecting this behavior, maybe it is worth revisiting the decision. The simplest fix right now would be to prepend your coderef with the python comment symbols # |hello| so that at the very least it won't break your tangled files. I would like to see this implemented, so let's see what Nicolas has to say. Best! Tom
Re: Org Syntax Specification
Hi Timothy, I have attached a patch with some modifications and a bunch of comments (as footnotes). More replies in line. Thank you for all your work on this! Tom > Marking this as depreciated would have no effect on Org’s current behaviour, > but we could: > > Mark as depreciated now-ish > Add a utility to convert from TeX-style to LaTeX-style > Add org lint/fortification warnings > A while later (half a decade? more?) actually remove support In favor of this. There are good alternatives for this now. > The other component of the syntax which feels particularly awkward to me is > source block switches. They seem a bit odd, and since arguments exist, > completely redundant. Extremely in favor of removing switches. There are so many better ways to do this now that aren't like some eldritch unix horror crawling up out of the abyss and into the eBNF :) From 3527331f02e593ec6ba6cb4c8bde3f64de3ad216 Mon Sep 17 00:00:00 2001 From: Tom Gillespie Date: Mon, 17 Jan 2022 19:34:21 -0500 Subject: [PATCH] Tom's comments and modifications to org syntax edited I removed any mention of markdown because it is a distraction in this document and is not something we want anyone attending to here. I change "top level section" to "zeroth section" which I think is more consistent terminology because level is often used to refer to the depth of parsing at any given point in the file and the top level refers to anything that can be parsed without context. Zeroth makes it clear that we are talking about the actual zeroth occurrence of a section in a file/buffer/stream. --- dev/org-syntax-edited.org | 399 +++--- 1 file changed, 331 insertions(+), 68 deletions(-) diff --git a/dev/org-syntax-edited.org b/dev/org-syntax-edited.org index c3259473..2e99070d 100644 --- a/dev/org-syntax-edited.org +++ b/dev/org-syntax-edited.org @@ -19,9 +19,7 @@ under the GNU General Public License v3 or later. 
Org is a plaintext format composed of simple, yet versatile, forms which represent formatting and structural information. It is designed to be both intuitive to use, and capable of representing complex -documents. Like [[https://datatracker.ietf.org/doc/html/rfc7763][Markdown]], Org may be considered a lightweight markup -language. However, while Markdown refers to a collection of similar -syntaxes, Org is a single syntax. +documents. This document describes and comments on Org syntax as it is currently read by its parser (=org-element.el=) and, therefore, by the export @@ -32,14 +30,13 @@ framework. ** Objects and Elements The components of this syntax can be divided into two classes: -"[[#Objects][objects]]" and "[[#Elements][elements]]". To better understand these classes, -consider the paragraph as a unit of measurement. /Elements/ are -syntactic components that exist at the same or greater scope than a -paragraph, i.e. which could not be contained by a paragraph. -Conversely, /objects/ are syntactic components that exist with a smaller -scope than a paragraph, and so can be contained within a paragraph. - -Elements can be stratified into "[[#Headings][headings]]", "[[#Sections][sections]]", "[[#Greater_Elements][greater +"[[#Elements][elements]]" and "[[#Objects][objects]]". Elements are +syntactic components that have the same priority as or greater +priority than a paragraph. Objects are syntactic components that are +only recognized inside a paragraph or other paragraph-like elements +such as heading titles. + +Elements are further divided into "[[#Headings][headings]]", "[[#Sections][sections]]"[fn::sections are not elements], "[[#Greater_Elements][greater elements]]", and "[[#Lesser_Elements][lesser elements]]", from broadest scope to narrowest. Along with objects, these sub-classes define categories of syntactic environments. 
Only [[#Headings][headings]], [[#Sections][sections]], [[#Property_Drawers][property drawers]], and @@ -52,7 +49,12 @@ elements that cannot contain any other elements. As such, a paragraph is considered a lesser element. Greater elements can themselves contain greater elements or lesser elements. Sections contain both greater and lesser elements, and headings can contain a section and -other headings. +other headings. [fn:tom2:I would not discuss strata here because it is +not related to the syntax of the document. It is related to how that +syntax is interpreted by org mode. The strata are nesting rules that +are independent of the syntax, and discussing that here in the syntax +document is confusing, because the nesting is not something that can be +parsed directly because it depends on the number of asterisks.] ** The minimal and standard sets of objects @@ -60,25 +62,33 @@ To simplify references to common collections of objects, we define two useful sets. The
Re: call blocks as a function from inside elisp code
Hi George, Here is an example of how I call nested elisp and python. The python block is an input argument to the elisp block in this case, but the python block could be called directly as well. I'm not sure how to pass arguments to the block from inside elisp via org-babel-eval though, that seems like it would require some deeper tampering/advising of functions. Best, Tom

https://github.com/SciCrunch/sparc-curation/blame/master/docs/queries.org#L1704-L1707

#+begin_src elisp :results none :exports none
(ow-babel-eval "neru-simplified")
#+end_src

The implementation I use is included below and is sourced from https://github.com/tgbugs/orgstrap/blob/bc981b957967be8d872c08be9ba7f2dbde5caf1d/ow.el#L786-L803

(defun ow-babel-eval (block-name &optional universal-argument)
  "Use to confirm running a chain of dependent blocks starting with BLOCK-NAME.
This retains single confirmation at the entry point for the block."
  ;; TODO consider a header arg for a variant of this in org babel proper
  (interactive "P")
  (let ((org-confirm-babel-evaluate (lambda (_l _b) nil))) ;; FIXME TODO set messages buffer size to nil
    (save-excursion
      (when (org-babel-find-named-block block-name)
        ;; goto won't raise an error which results in the block where
        ;; `ow-confirm-once' is being used being called an infinite
        ;; number of times and blowing the stack
        (org-babel-goto-named-src-block block-name)
        (unwind-protect
            (progn
              ;; FIXME optionally raise errors on failure here !?
              (advice-add #'org-babel-insert-result :around #'ow--results-silent)
              (org-babel-execute-src-block))
          (advice-remove #'org-babel-insert-result #'ow--results-silent))))))

(defun ow--results-silent (fun &rest args)
  "Whoever named the original version of this has a strange sense of humor."
  ;; so :results silent, which is what org babel calls between vars
  ;; set automatically is completely broken when one block calls another
  ;; there likely needs to be an internal gensymed value that babel blocks
  ;; can pass to each other so that a malicious user cannot actually silence
  ;; values, along with an option to still print, but until then we have this
  (let ((result (car args))
        (result-params (cadr args)))
    (if (member "silent" result-params)
        result
      (apply fun args))))
Re: [PATCH] Add support for $…$ latex fragments followed by a dash
> The change is local and minor. We can't know that. Consider for example someone that has the following line somewhere in their files. #+begin_src org I spent $20 on food and was paid$-10 dollars by friends so I am down $10. #+end_src Yes =paid$-10= is probably a typo that should have a space in between, but it could still be in a file and cause an issue. The more likely case would be of someone that has $ in the name of a variable that also uses dashes. For example if I have a list of variable names such as #+begin_src org Text a $A_BASH_VAR Text b some-$-lisp-var #+end_src The proposed change would break any file with a pattern like this. We have no way of seeing every org file that users have written so we don't know the extent of the impact, and thus have to assume that there would be some impact. Making such a change with an unknown blast radius in the midst of considering removing support for that syntax altogether is inviting disaster. Best, Tom
Re: [PATCH] Add support for $…$ latex fragments followed by a dash
> The attached patch adds support for $…$ latex fragments followed by a > dash, such as $n$-th. Unfortunately this falls into the realm of changes to syntax. The current behavior is not a bug and is working as specified because hyphen-minus (U+002D) does not count as punctuation for the purposes of org syntax. We should specify which chars count as punctuation in the syntax doc. As noted by Eric, \(\) has no such restrictions. From https://orgmode.org/worg/dev/org-syntax.html#Entities_and_LaTeX_Fragments > POST is any punctuation (including parentheses and quotes) or space > character, or the end of line. Best, Tom
Re: Description list with " :: " in the tag.
Thanks! -- Tom Alexander On Sat, Sep 9, 2023, at 5:06 AM, Ihor Radchenko wrote: > "Tom Alexander" writes: > >> Emacs version: 29.1 >> Org-mode version: 163bafb43dcc2bc94a2c7ccaa77d3d1dd488f1af >> >> Found a conflict between the documentation and the parser behavior. The >> org-mode documentation[1] for description list items says that TAG '[...] >> does not contain the substring " :: "' >> >> Using this sample document, I have created a plain list item with a tag that >> contains that substring by wrapping it in a verbatim block: >> ``` >> - =foo :: bar= :: baz >> ``` >>(item >> ... >> ((1 0 "- " nil nil "=foo :: bar=" 23)) >> ... >> It seems that "TAG-TEXT" is not just text but it can include objects and >> those objects can include the substring " :: ". > > It is simpler. > Everything after the bullet and before the last " :: " is considered as > tag. Everything after the last " :: " is description. > Then, tag and description are parsed, allowing objects inside. > > org-syntax document is inaccurate here - it says that the _first_ " :: " > is used as tag:description delimiter, not the _last_. > > I do not see any benefit changing the current parser. So, we probably > need to update org-syntax document instead. > > -- > Ihor Radchenko // yantar92, > Org mode contributor, > Learn more about Org mode at <https://orgmode.org/>. > Support Org development at <https://liberapay.com/org-mode>, > or support my work at <https://liberapay.com/yantar92>
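The rule Ihor describes (tag is everything before the last " :: ", description is everything after it) can be sketched in a few lines of Python. This is a simplified illustration and not the org-element code: it assumes the bullet has already been stripped, ignores counters and checkboxes, and split_description_item is a hypothetical name.

```python
from typing import Optional, Tuple


def split_description_item(text: str) -> Tuple[Optional[str], str]:
    """Split item text (bullet already removed) into (tag, description).

    The split happens at the LAST " :: ", so the tag itself may contain
    earlier occurrences of that substring.
    """
    tag, sep, description = text.rpartition(" :: ")
    if not sep:
        # No separator at all: an ordinary list item without a tag.
        return None, text
    return tag, description


if __name__ == "__main__":
    print(split_description_item("=foo :: bar= :: baz"))
```

With the sample item from the report, the tag comes out as =foo :: bar= and the description as baz, which matches the parser output quoted above. Objects inside the tag would then be parsed in a second step.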
Fixed width areas not allowing tab after leading colon.
The documentation for fixed width areas states: A “fixed-width line” starts with a colon character (:) and either a whitespace character or the immediate end of the line. Using the test document: ``` :foo ``` parses as a paragraph instead of a fixed-width area: ``` (org-data (:standard-properties [1 1 1 7 7 0 nil org-data nil nil nil 3 7 nil # nil nil nil] :path nil :CATEGORY nil) (section (:standard-properties [1 1 1 7 7 0 nil first-section nil nil nil 1 7 nil # nil nil #0]) (paragraph (:standard-properties [1 1 1 7 7 0 nil top-comment nil nil nil nil nil nil # nil nil #1]) #(": foo\n" 0 6 (:parent #2) ``` This happens in a document in worg: https://git.sr.ht/~bzg/worg/tree/74e80b0f7600801b1d1594542602394c085cc2f9/item/org-contrib/org-bom.org#L499 Emacs version: GNU Emacs 29.1 (build 1, x86_64-pc-linux-musl) Org-mode version: c703541ffcc14965e3567f928de1683a1c1e33f6 (latest in git) Fixed-width area documentation: https://orgmode.org/worg/org-syntax.html#Fixed_Width_Areas -- Tom Alexander
Re: [DISCUSSION] May we recognize everything like [[protocol:uri]] as a non-fuzzy link? (was: [BUG] URI handling is overly complicated and nonstandard [9.6.7 (N/A @ /gnu/store/mg7223g8mw90lccp6mm5g6f3
This is a timely discussion. I have been thinking about how to deal with prefixes defined by the #+link: keyword, which is directly related to this question. I think the following might be a solution that also avoids the issue brought up by Arne. The original "bug" cannot be resolved because bare URIs have syntax that conflicts with Org syntax. However, I think we can do better than directing users to org-link-set-parameters. My suggestion is as follows. Schemes/prefixes defined by the #+link: keyword can be used without surrounding syntax markers but may not contain spaces etc. To support this, Org parsers should always parse prefix:suffix as a _putative_ link which must then be checked against a list of known schemes that are either built in or have been declared by the user to indeed be legitimate schemes. In the tel: case, the way to solve the original bug is simply to add the line #+link: tel tel: which would tell Org that e.g. tel:555-555- is a real URI, and that it should expand to itself. At the same time this solution would avoid Arne's issue (which I also have in some of my documents where I use fig: and tbl: as prefixes in names and reference them via [[fig:figure-name]]) because the parser would only treat prefix: in an internal link as a scheme if it is defined explicitly by the user in a #+link: keyword or in their init.el. Thoughts? Tom
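To make the proposal concrete, here is a minimal Python sketch of the "putative link" check. Everything in it is an assumption for illustration: the names `BUILTIN_SCHEMES`, `declared_schemes` and `is_link`, the particular built-in set, and the simplified scheme regexp are not taken from Org.

```python
import re
from typing import Set

# Hypothetical built-in scheme list; the real set lives in Org's link parameters.
BUILTIN_SCHEMES = {"http", "https", "file", "mailto"}

# prefix:suffix with no whitespace in the suffix (simplified).
PUTATIVE_LINK = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):(\S+)$")


def declared_schemes(document: str) -> Set[str]:
    """Collect prefixes declared by "#+link: PREFIX TARGET" lines."""
    found = set()
    for line in document.splitlines():
        m = re.match(r"^#\+link:\s+(\S+)\s+\S+", line, flags=re.IGNORECASE)
        if m:
            found.add(m.group(1))
    return found


def is_link(text: str, document: str) -> bool:
    """Treat prefix:suffix as a link only if the prefix is a known scheme."""
    m = PUTATIVE_LINK.match(text)
    if not m:
        return False
    return m.group(1) in BUILTIN_SCHEMES | declared_schemes(document)
```

Under these assumptions a document containing `#+link: tel tel:` makes `tel:` strings links, while `fig:figure-name` stays an ordinary internal target because `fig` was never declared.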
Re: [RFC] Quoting property names in tag/property matches [Was: [BUG?] Matching tags: & operator no more implicit between tags and special property]
Ignore the previous message. I see that this was about matching tags not about specifying them. Best, Tom
Re: [RFC] Quoting property names in tag/property matches [Was: [BUG?] Matching tags: & operator no more implicit between tags and special property]
Without wading too far into this, why do we need escape syntax for this? The only character that might need an escape would be colon :, but my reading of the syntax doc is that colon : will immediately terminate the property, so we would update the doc to make it clear that property names cannot contain a colon. As written, if there is an issue with the minus sign in property names then that is a bug, but I feel like I might be missing something? Tom
Re: Description list with " :: " in the tag.
I've written a patch (attached) with my proposed wording changes to the documentation, should I be starting another thread or does dropping it here work best? I do not have commit access so I'd need someone with such authority to do the last bit. -- Tom Alexander From 20addaa5ab7d4e9420ade1125c2a337345ecdd31 Mon Sep 17 00:00:00 2001 From: Tom Alexander Date: Wed, 13 Sep 2023 18:19:05 -0400 Subject: [PATCH] org-syntax.org: Fix definition of description list tags. Description lists support objects in their tags and they support the substring " :: ". --- org-syntax.org | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/org-syntax.org b/org-syntax.org index 123fc232..3046e26c 100644 --- a/org-syntax.org +++ b/org-syntax.org @@ -470,9 +470,10 @@ BULLET COUNTER-SET CHECK-BOX TAG CONTENTS + CHECK-BOX (optional) :: A single whitespace character, an =X= character, or a hyphen enclosed by square brackets (i.e. =[ ]=, =[X]=, or =[-]=). + TAG (optional) :: An instance of the pattern =TAG-TEXT ::= where - =TAG-TEXT= represents a string consisting of non-newline characters - that does not contain the substring =" :: "= (two colons surrounded by - whitespace, without the quotes). + =TAG-TEXT= is one of more objects from the standard set so long as + they do not contain a newline character, until the last occurrence + of the substring =" :: "= (two colons surrounded by whitespace, + without the quotes). + CONTENTS (optional) :: A collection of zero or more elements, ending at the first instance of one of the following: - The next item. -- 2.42.0
Description list with " :: " in the tag.
Emacs version: 29.1 Org-mode version: 163bafb43dcc2bc94a2c7ccaa77d3d1dd488f1af Found a conflict between the documentation and the parser behavior. The org-mode documentation[1] for description list items says that TAG '[...] does not contain the substring " :: "' Using this sample document, I have created a plain list item with a tag that contains that substring by wrapping it in a verbatim block: ``` - =foo :: bar= :: baz ``` Which parses to: ``` (org-data (:standard-properties [1 1 1 23 23 0 nil org-data nil nil nil 3 23 nil # nil nil nil] :path nil :CATEGORY nil) (section (:standard-properties [1 1 1 23 23 0 nil first-section nil nil nil 1 23 nil # nil nil #0]) (plain-list (:standard-properties [1 1 1 23 23 0 nil top-comment nil nil nil nil nil nil # nil ((1 0 "- " nil nil "=foo :: bar=" 23)) #1] :type descriptive) (item (:standard-properties [1 1 19 23 23 0 (:tag) item nil nil nil nil nil nil # nil ((1 0 "- " nil nil "=foo :: bar=" 23)) #2] :bullet "- " :checkbox nil :counter nil :pre-blank 0 :tag ((verbatim (:standard-properties [3 nil nil nil 15 0 nil nil nil nil nil nil nil nil # nil nil #3] :value "foo :: bar" (paragraph (:standard-properties [19 19 19 23 23 0 nil nil nil nil nil nil nil nil # nil nil #3]) #("baz\n" 0 4 (:parent #4))) ``` It seems that "TAG-TEXT" is not just text but it can include objects and those objects can include the substring " :: ". [1] https://orgmode.org/worg/org-syntax.html#Items -- Tom Alexander
Document-level properties incorrect and/or missing based on preceding blank lines and/or comments
Emacs version: Emacs 29.1 Org-mode version: e1569918cc94253650781e83a09695739c93352f (latest in git) The org-mode syntax document[1] says that property drawers can exist in the zeroth section with the format: ``` BEGINNING-OF-FILE BLANK-LINES COMMENT PROPERTYDRAWER ``` Using this test document: ``` :PROPERTIES: :FOO:bar :END: ``` I correctly get the foo property in the top-level org-data ``` (org-data (:standard-properties [1 1 1 33 33 0 nil org-data nil nil nil 32 33 nil # nil nil nil] :path nil :FOO "bar" :CATEGORY nil) ``` But now there are two separate issues: ### Issue 1 Putting a comment before it makes the value for the foo property incorrect (seems to be grabbing an earlier string slice): ``` # baz :PROPERTIES: :FOO:bar :END: ``` ``` (org-data (:standard-properties [1 1 1 39 39 0 nil org-data nil nil nil 38 39 nil # nil nil nil] :path nil :FOO "O: " :CATEGORY nil) ``` Interestingly, looking farther down the AST, the value for foo is properly set in the node-property, just not the org-data: ``` (node-property (:standard-properties [20 20 nil nil 33 0 nil node-property nil nil nil nil nil nil # nil nil #2] :key "FOO" :value "bar")) ``` ### Issue 2 Putting any blank lines before it makes the foo property not appear in org-data at all ``` :PROPERTIES: :FOO:bar :END: ``` ``` (org-data (:standard-properties [1 1 2 34 34 0 nil org-data nil nil nil 4 34 nil # nil nil nil] :path nil :CATEGORY nil) ``` Looking farther down the AST it seems the property-drawer became a regular drawer [1] https://orgmode.org/worg/org-syntax.html#Property_Drawers -- Tom Alexander pgp: https://fizz.buzz/pgp.asc
Re: Inconsistent text markup handling when double-nesting markers
> Fixed, on main. Thanks! -- Tom Alexander pgp: https://fizz.buzz/pgp.asc
Incorrect quantity of en-spaces
The org-mode syntax document describes entities as: > \NAME POST > \NAME{} > Where NAME and POST are not separated by a whitespace character. and POST is defined as: > Either the end of line or a non-alphabetic character. So using the test document: ``` \_ Foo ``` (a backslash, underscore, three spaces, and then the word Foo) I would expect to get only 2 en-spaces but I am getting 3. Looking at org-entities, an underscore with 2 spaces gets 2 en-spaces, whereas an underscore with 3 spaces gets 3 en-spaces, but if we match all 3 spaces as NAME then POST becomes invalid because "F" is neither the end of the line nor a non-alphabetic character, so we can only match the first two spaces as NAME. emacs version: 29.1 org-mode version: 9bbc21df84d507e568a3ebd17e105cdb9e163784 (latest in git) -- Tom Alexander pgp: https://fizz.buzz/pgp.asc
Re: Clock becomes a paragraph by prefixing with not-really-affiliated-keyword
Thanks! -- Tom Alexander pgp: https://fizz.buzz/pgp.asc
Re: Comments following not-really-affiliated keywords are becoming paragraphs
Thanks! -- Tom Alexander pgp: https://fizz.buzz/pgp.asc
Re: Fixed width areas not allowing tab after leading colon.
Thanks! -- Tom Alexander On Sun, Sep 17, 2023, at 5:48 AM, Ihor Radchenko wrote: > "Tom Alexander" writes: > >> The documentation for fixed width areas states: A “fixed-width line” starts >> with a colon character (:) and either a whitespace character or the >> immediate end of the line. >> ... >> Fixed-width area documentation: >> https://orgmode.org/worg/org-syntax.html#Fixed_Width_Areas > > org-syntax.html is not accurate here. The parser only allows ": " (colon > followed by space) and no other variant. > > Fixed, on master. > https://git.sr.ht/~bzg/worg/commit/a42f57ac > >> This happens in a document in worg: >> https://git.sr.ht/~bzg/worg/tree/74e80b0f7600801b1d1594542602394c085cc2f9/item/org-contrib/org-bom.org#L499 > > Fixed, on master. > https://git.sr.ht/~bzg/worg/commit/0c8d5679 > > -- > Ihor Radchenko // yantar92, > Org mode contributor, > Learn more about Org mode at <https://orgmode.org/>. > Support Org development at <https://liberapay.com/org-mode>, > or support my work at <https://liberapay.com/yantar92>
Re: [PATCH] Re: Description list with " :: " in the tag.
Sorry for the delay, I've been busy in the IRLs. I've updated the patch to reflect that the parser grabs the text before the last " :: " and then parses it as objects. The new patch is attached. -- Tom Alexander On Thu, Sep 14, 2023, at 7:24 AM, Ihor Radchenko wrote: > "Tom Alexander" writes: > >> I've written a patch (attached) with my proposed wording changes to >> the documentation, should I be starting another thread or does >> dropping it here work best? > > You can just modify subject with [PATCH], as I did. > >> ... I do not have commit access so I'd need >> someone with such authority to do the last bit. > > Sure. > >> + =TAG-TEXT= is one of more objects from the standard set so long as >> + they do not contain a newline character, until the last occurrence >> + of the substring =" :: "= (two colons surrounded by whitespace, >> + without the quotes). > > It does not fully represent what is going on - Org parser is top-down > and does not parse objects before it is done parsing the descriptive > list item. So, > > - *foo :: bar* does not actually contain bold markup > > Rather it is "* foo" tag + "bar* does not actually contain bold markup" > description. > > What happens is that the parser splits the first line of the item by the > last " :: " and only then proceeds with parsing the tag and description > using standard set of objects: > > - <> :: > > -- > Ihor Radchenko // yantar92, > Org mode contributor, > Learn more about Org mode at <https://orgmode.org/>. > Support Org development at <https://liberapay.com/org-mode>, > or support my work at <https://liberapay.com/yantar92> From c8812bf7d81dc824d8ecf2c03368f58884773ddf Mon Sep 17 00:00:00 2001 From: Tom Alexander Date: Wed, 13 Sep 2023 18:19:05 -0400 Subject: [PATCH] org-syntax.org: Fix definition of description list tags. Description lists support objects in their tags and they support the substring " :: ".
--- org-syntax.org | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/org-syntax.org b/org-syntax.org index 123fc232..fc5e9a37 100644 --- a/org-syntax.org +++ b/org-syntax.org @@ -470,9 +470,10 @@ BULLET COUNTER-SET CHECK-BOX TAG CONTENTS + CHECK-BOX (optional) :: A single whitespace character, an =X= character, or a hyphen enclosed by square brackets (i.e. =[ ]=, =[X]=, or =[-]=). + TAG (optional) :: An instance of the pattern =TAG-TEXT ::= where - =TAG-TEXT= represents a string consisting of non-newline characters - that does not contain the substring =" :: "= (two colons surrounded by - whitespace, without the quotes). + =TAG-TEXT= is the text up until the last occurrence of of the + substring =" :: "= (two colons surrounded by whitespace, without the + quotes) on that line. =TAG-TEXT= is then parsed with the standard + set of objects. + CONTENTS (optional) :: A collection of zero or more elements, ending at the first instance of one of the following: - The next item. -- 2.42.0
Subscript with parenthesis
The org-mode documentation[1] states that the SCRIPT portion of the subscript/superscript is either an asterisk, the standard set of objects wrapped in balanced curly braces, or an optional sign followed by "Either the empty string, or a string consisting of any number of alphanumeric characters, commas, backslashes, and dots". But I'm seeing the following test document parse as containing a subscript despite using parentheses, which I do not think matches any of the above criteria: ``` foo_(bar) ``` [1] https://orgmode.org/worg/org-syntax.html#Subscript_and_Superscript -- Tom Alexander
Consecutive plain list items of different types
The org-mode documentation[1] states for plain lists that: > List types are mutually exclusive at the same level of indentation, if both > types are present consecutively then they parse as separate lists. First, a minor nit-pick: "both" is probably not the correct word here since there are 3 types of lists, not two (unordered, ordered, and descriptive). I'd go with "multiple" instead IMO. But more importantly, based on that description I would expect the following test document to parse into three separate plain lists, but it parses as a single plain list with 3 items: ``` 1. foo - bar - lorem :: ipsum ``` [1] https://orgmode.org/worg/org-syntax.html#Plain_Lists -- Tom Alexander pgp: https://fizz.buzz/pgp.asc
Re: Subscript with parenthesis
Some additional things I'm noticing: - when using parentheses, :use-brackets-p is nil, so they're not equivalent to curly braces. - it does not support objects inside the parentheses, just plain text, which again means they're not equivalent to braces. - it, however, seems to require that the parentheses are balanced because this test document does NOT contain a subscript: ``` foo_(b(ar) ``` which is closer to the curly braces requirement since that seems to be the only part of the subscript/superscript documentation that mentions needing balance. -- Tom Alexander
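The balance requirement observed above can be illustrated with a small Python sketch. It only models the depth counting implied by the two test documents (`foo_(bar)` versus `foo_(b(ar)`); the function name `balanced_paren_script` is hypothetical and this is not how the Org parser is implemented.

```python
from typing import Optional


def balanced_paren_script(text: str) -> Optional[str]:
    """Return the "(...)" group after the first "_(" if balanced, else None."""
    start = text.find("_(")
    if start == -1:
        return None
    depth = 0
    for i in range(start + 1, len(text)):
        ch = text[i]
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                # Matching close for the opening parenthesis found.
                return text[start + 1 : i + 1]
    return None  # ran out of text while still nested: unbalanced
```

Under this model `foo_(bar)` has a candidate script `(bar)`, while `foo_(b(ar)` never returns to depth zero and so yields no subscript.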
[PATCH] Add backslash to list of POST characters for text markup
Backslash appears to be supported. To test I used the following test document: ``` foo ~bar~\& baz ``` This happens in a document in worg: https://git.sr.ht/~bzg/worg/tree/ae64e1a54185232d4ebdcab174d8d4319ffd564d/org-release-notes.org#L The ampersand was chosen for the test document since that is not a supported POST character, to make sure backslash was not simply escaping the next character. In the documentation I wrote out the word "backslash" in parenthesis to disambiguate between backslash and escaping the following comma. Patch is attached. -- Tom Alexander pgp: https://fizz.buzz/pgp.asc From 098434680b5e3942acc00684a47389f2cdab6208 Mon Sep 17 00:00:00 2001 From: Tom Alexander Date: Thu, 21 Sep 2023 21:14:33 -0400 Subject: [PATCH] Add backslash to list of POST characters for text markup. Backslash appears to be supported. To test I used the following test document: ``` foo ~bar~\& baz ``` The ampersand was chosen since that is not a supported POST character, to make sure backslash was not simply escaping the next character. In the documentation I wrote out the word "backslash" in parenthesis to disambiguate between backslash and escaping the following comma. --- org-syntax.org | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/org-syntax.org b/org-syntax.org index c5299741..8f0f9b0c 100644 --- a/org-syntax.org +++ b/org-syntax.org @@ -249,9 +249,9 @@ discarded. This also applies to single-line elements. :This paragraph will not contain :a long sequence of spaces before "a". -: +: :This paragraph does not have leading spaces according to the parser. -: +: :#+begin_src emacs-lisp : (+ 1 2) :#+end_src @@ -1742,7 +1742,7 @@ whitespace characters. verbatim) or a series of objects from the standard set. In both cases, CONTENTS may not begin or end with whitespace. + [[#Special_Tokens][POST]] :: Either a whitespace character, =-=, =.=, =,=, =;=, =:=, =!=, =?=, ='=, =)=, =}=, - =[=, ="=, or the end of a line. 
+ =[=, ="=, =\= (backslash), or the end of a line. *Examples* -- 2.42.0
COUNTER-SET for alphabetical ordered lists ignored for utf-8 exporter
It seems that COUNTER-SET[1] is not being honored when exporting to utf-8 for alphabetical lists even though it is honored for numeric lists. When exporting to html, COUNTER-SET is honored for both. Test document: ``` # An ordered list starting at 3 1. [@3] foo # An ordered list starting at 1 m. bar # An ordered list starting at 11 m. [@k] baz ``` Launching emacs with: (Setting org-list-allow-alphabetical is necessary or else the alphabetical lists will become paragraphs) ``` emacs -q --eval '(setq org-list-allow-alphabetical t)' /tmp/test.org ``` When exporting to html you get (edited to remove whitespace for clarity): ``` foo bar baz ``` But when exporting to utf-8 you get: (whitespace removed again) ``` 3. foo m. bar m. baz ``` Whereas I would expect: (whitespace removed again) ``` 3. foo m. bar k. baz ``` On a slightly related note: it seems the COUNTER-SET[1] allows single-letter values even when org-list-allow-alphabetical is nil. I don't think that is going to hurt anyone but I figured I should mention it in case it's a bug (test doc: `1. [@k] foo` is a plain list starting at 11 even when org-list-allow-alphabetical is nil). [1] https://orgmode.org/worg/org-syntax.html#Items Emacs 29.1, Org-mode version 9.7-pre (release_9.6.8-781-gc70354) -- Tom Alexander pgp: https://fizz.buzz/pgp.asc
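For reference, the mapping between a single-letter COUNTER-SET value and its position (k is 11, as in the `1. [@k] foo` example) can be sketched in Python. The helper names `counter_set_value` and `alphabetical_bullet` are hypothetical and only cover plain digits and single ASCII letters; this is not the exporter's code.

```python
def counter_set_value(counter: str) -> int:
    """Map a COUNTER-SET value such as "3" or "k" to its numeric position."""
    if counter.isdigit():
        return int(counter)
    if len(counter) == 1 and counter.isascii() and counter.isalpha():
        # "a" is position 1, "b" is 2, ... "k" is 11.
        return ord(counter.lower()) - ord("a") + 1
    raise ValueError(f"unsupported counter: {counter!r}")


def alphabetical_bullet(position: int) -> str:
    """Render a 1-based position as a lowercase alphabetical bullet."""
    if not 1 <= position <= 26:
        raise ValueError("position out of range for a single letter")
    return chr(ord("a") + position - 1) + "."
```

Under this mapping the third item of the test document would be rendered as `k.` rather than `m.`, which matches the output expected in the report.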
Re: Subscript with parenthesis
> Not true. I tried > > b^(*asd*) and bold inside superscript does get parsed. Ah thanks for double-checking! You're right, that is getting parsed. Not sure what test document I was using to make me think objects didn't work inside the parenthesis. -- Tom Alexander pgp: https://fizz.buzz/pgp.asc
Lesser blocks allowing unescaped lines
This happens in worg at: https://git.sr.ht/~bzg/worg/tree/ba6cda890f200d428a5d68e819eef15b5306055f/exporters/ox-docstrings.org#L2490 The documentation for lesser blocks[1] states: > Lines beginning with an asterisk or `#+` must be quoted by a comma (`,*`, > `,#+`). However, the following test document parses as a lesser block despite containing a line starting with an unescaped #+: ``` #+CATEGORY: foo #+begin_src text #+CATEGORY: bar #+end_src ``` which parses as: ``` (org-data (:standard-properties [1 1 1 60 60 0 nil org-data nil nil nil 3 60 nil # nil nil nil] :path nil :CATEGORY "foo") (section (:standard-properties [1 1 1 60 60 0 nil first-section nil nil nil 1 60 nil # nil nil #0]) (keyword (:standard-properties [1 1 nil nil 17 0 nil top-comment nil nil nil nil nil nil # nil nil #1] :key "CATEGORY" :value "foo")) (src-block (:standard-properties [17 17 nil nil 60 0 nil nil nil nil nil nil nil nil # nil nil #1] :language "text" :switches nil :parameters nil :number-lines nil :preserve-indent nil :retain-labels t :use-labels t :label-fmt nil :value "#+CATEGORY: bar\n" ``` whereas I would expect this to be ``` (section (keyword :key "CATEGORY" :value "foo") (paragraph "#+begin_src text") (keyword :key "CATEGORY" :value "bar") (paragraph "#+end_src") ) ``` This test document shows that lines with an unescaped "*" do break up the lesser block: ``` * foo #+begin_src text * bar #+end_src ``` [1] https://orgmode.org/worg/org-syntax.html#Blocks -- Tom Alexander pgp: https://fizz.buzz/pgp.asc
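The quoting rule quoted from the documentation (lines beginning with `*` or `#+` get a leading comma) can be sketched as below. This Python version is a simplification under stated assumptions: the name `comma_quote` is hypothetical, leading whitespace is not handled, and lines that already start with a comma are not re-quoted.

```python
import re

# Lines that start with "*" or "#+" (simplified: no leading whitespace).
_NEEDS_QUOTE = re.compile(r"^(\*|#\+)", flags=re.MULTILINE)


def comma_quote(block_body: str) -> str:
    """Prefix each line starting with "*" or "#+" with a comma."""
    return _NEEDS_QUOTE.sub(r",\1", block_body)
```

Applied to the body of the first test document, `#+CATEGORY: bar` would become `,#+CATEGORY: bar`, which is the form the documentation asks for.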
Re: Extra paragraphs incorrectly spawning when ":end:" appears.
Same problem occurs with this sample document: ``` foo #+BEGIN: bar baz ``` which parses as: ``` (section (paragraph "foo\n") (paragraph "#+BEGIN: bar\nbaz\n") ) ``` Again, no blank lines and no non-paragraph elements but the single paragraph got split in two. -- Tom Alexander pgp: https://fizz.buzz/pgp.asc
Extra paragraphs incorrectly spawning when ":end:" appears.
This test document has 1 paragraph: ``` foo bar baz ``` which parses as: ``` (section (paragraph "foo\nbar\nbaz\n") ) ``` This test document should have 1 paragraph but org-mode is parsing it as 2: ``` foo :end: baz ``` which parses as: ``` (section (paragraph "foo\n") (paragraph ":end:\nbaz\n") ) ``` The paragraph documentation[1] states that: > Empty lines and other elements end paragraphs. But the document contains no empty lines and we can see in the output that it only contains paragraphs. [1] https://orgmode.org/worg/org-syntax.html#Paragraphs -- Tom Alexander pgp: https://fizz.buzz/pgp.asc
[PATCH] Clarify that REST is not supported on the start TIME in a time-range timestamp.
If REST is included in the first TIME on a time-range timestamp then the entire timestamp becomes a single range-less timestamp. To test I used the following test document: ``` [1970-01-01 Thu 8:15-13:15foo] [1970-01-01 Thu 8:15foo-13:15] ``` The first line parses as a timerange from 8:15-13:15. The second line parses as a single timestamp at 8:15. -- Tom Alexander pgp: https://fizz.buzz/pgp.asc From b1114e983d961d48e1d837b8d2ad209a976a5417 Mon Sep 17 00:00:00 2001 From: Tom Alexander Date: Mon, 2 Oct 2023 17:35:28 -0400 Subject: [PATCH] * org-syntax.org (Timestamps): Clarify that REST is not supported on the start TIME in a time-range timestamp. If REST is included in the first TIME on a time-range timestamp then the entire timestamp becomes a single range-less timestamp. To test I used the following test document: ``` [1970-01-01 Thu 8:15-13:15foo] [1970-01-01 Thu 8:15foo-13:15] ``` The first line parses as a timerange from 8:15-13:15. The second line parses as a single timestamp at 8:15. --- org-syntax.org | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/org-syntax.org b/org-syntax.org index c2061431..0c326ba8 100644 --- a/org-syntax.org +++ b/org-syntax.org @@ -1686,9 +1686,10 @@ -MM-DD DAYNAME - DAYNAME (optional) :: A string consisting of non-whitespace characters except =+=, =-=, =]=, =>=, a digit, or =\n=. + TIME (optional) :: An instance of the pattern =H:MMREST= where =H= - represents a one to two digit number (and can start with =0=), and =M= - represents a single digit. =REST= can contain anything but =\n= or - closing bracket. + represents a one to two digit number (and can start with =0=), and + =M= represents a single digit. =REST= can contain anything but =\n= + or closing bracket. =REST= cannot exist on the start TIME in a + time-range timestamp (the patterns with =TIME-TIME=). + REPEATER-OR-DELAY (optional) :: An instance of the following pattern: #+begin_example MARK VALUE UNIT -- 2.42.0
Re: [PATCH] Clarify that REST is not supported on the start TIME in a time-range timestamp.
Potentially related, org-mode is accepting this malformed timestamp from[1]: ``` <2016-02-14 Sun ++y> ``` The org-mode documentation[2] only includes REST with TIME, defining TIME as "H:MMREST". The above does not have any TIME but it accepts the timestamp anyway: ``` (timestamp :type active :range-type nil :raw-value "<2016-02-14 Sun ++y>" :year-start 2016 :month-start 2 :day-start 14 :hour-start nil :minute-start nil :year-end 2016 :month-end 2 :day-end 14 :hour-end nil :minute-end nil ) ``` Perhaps that grammar is wrong and REST needs to be separated from TIME? [1] https://github.com/howardabrams/pdx-emacs-hackers/blob/bfb7bd640fdf0ce3def21f9fc591ed35d776b26d/workshops/org-mode-gtd-feature-demo.org#L183 [2] https://orgmode.org/worg/org-syntax.html#Timestamps -- Tom Alexander pgp: https://fizz.buzz/pgp.asc
Re: Extra paragraphs incorrectly spawning when ":end:" appears.
Hmm thanks, that makes sense. I guess a post-processing step to merge adjacent paragraphs wouldn't work either since that wouldn't stitch together objects like the bold in this test document without re-parsing the entire paragraph: ``` foo *bar :end: baz* ``` Oh well. -- Tom Alexander pgp: https://fizz.buzz/pgp.asc
Re: [PATCH] Add backslash to list of POST characters for text markup
Thanks! -- Tom Alexander pgp: https://fizz.buzz/pgp.asc On Fri, Sep 22, 2023, at 5:29 AM, Ihor Radchenko wrote: > "Tom Alexander" writes: > >> Backslash appears to be supported. To test I used the following test >> document: >> ``` >> foo ~bar~\& baz >> ``` > > Thanks! > You are right. > Applied, onto master, with minor amendments to the commit message. > https://git.sr.ht/~bzg/worg/commit/ba6cda89 > > -- > Ihor Radchenko // yantar92, > Org mode contributor, > Learn more about Org mode at <https://orgmode.org/>. > Support Org development at <https://liberapay.com/org-mode>, > or support my work at <https://liberapay.com/yantar92>
Comments following not-really-affiliated keywords are becoming paragraphs
Emacs version: 29.1 Org-mode version: e1569918cc94253650781e83a09695739c93352f (latest in git) Test document: ``` #+CAPTION: foo # bar ``` This parses as a paragraph with the caption of foo and the body of "# bar" when it should parse as a regular keyword followed by a comment. Relevant org-syntax[1] bit: > a keyword with the same KEY as an affiliated keyword may occur so long as it > is not immediately preceding a valid element that can be affiliated. For > example, an instance of #+caption: hi followed by a blank line will be parsed > as a keyword, not an affiliated keyword. [1] https://orgmode.org/worg/org-syntax.html#Keywords -- Tom Alexander pgp: https://fizz.buzz/pgp.asc
Org-mode starting with 37d6bde27 errors out parsing org-mode/testing/examples/pub/a.org
Steps to reproduce: 1. Build emacs 29.1 2. Build org-mode with revision 37d6bde27fe228cdadcb5cdaa09287872a508777 3. Run the following: ``` emacs -q --no-site-file --no-splash --batch --eval "(progn (require 'org) (setq vc-handled-backends nil) (find-file-read-only \"org-mode/testing/examples/pub/a.org\") (org-mode) (message \"%s\" (pp-to-string (org-element-parse-buffer))) )" ``` I've attached a Dockerfile that reproduces the issue. Just throw that in a directory and run `docker build -t temp .` to see it fail. Change the `ARG ORG_VERSION=` line to `ac108a3ac1b332bf27ff2984a9cf26af3744185d` to see it succeed. Error message: ``` File mode specification error: (void-function org-export--list-bound-variables) Error: void-function (org-export--list-bound-variables) mapbacktrace(#f(compiled-function (evald func args flags) #)) debug-early-backtrace() debug-early(error (void-function org-export--list-bound-variables)) org-export--list-bound-variables() org-element--generate-copy-script(# :copy-unreadable do-not-check :drop-visibility t :drop-narrowing t :drop-contents t :drop-locals nil) org-element-copy-buffer(:to-buffer # :drop-visibility t :drop-narrowing t :drop-contents t :drop-locals nil) org-element-parse-secondary-string("<2014-03-04 Tue>" (bold citation code entity export-snippet inline-babel-call inline-src-block italic line-break latex-fragment link macro radio-target statistics-cookie strike-through subscript superscript target timestamp underline verbatim)) org-macro--find-date() org-macro--collect-macros() org-macro-initialize-templates() org-mode() (progn (require 'org) (setq vc-handled-backends nil) (find-file-read-only "/input/home/talexander/git/org-mode/testing/examples/pub/a.org") (org-mode) (message "%s" (pp-to-string (org-element-parse-buffer command-line-1(("--no-splash" "--eval" "(progn\n (require 'org)\n (setq vc-handled-backends nil)\n (find-file-read-only \"/input/home/talexander/git/org-mode/testing/examples/pub/a.org\")\n (org-mode)\n 
(message \"%s\" (pp-to-string (org-element-parse-buffer)))\n)")) command-line() normal-top-level() Symbol’s function definition is void: org-export--list-bound-variables ``` -- Tom Alexander pgp: https://fizz.buzz/pgp.asc Dockerfile Description: Binary data
Keyword becoming a paragraph based on optval
Emacs version: 29.1 Org-mode version: f3de4c3e041e0ea825b5b512dc0db37c78b7909e (latest in git) This test document parses as a keyword: ``` #+CAPTION[*foo*]: baz ``` but this test document parses as a paragraph: ``` #+CAPTION[*foo* bar]: baz ``` -- Tom Alexander pgp: https://fizz.buzz/pgp.asc
Clock becomes a paragraph by prefixing with not-really-affiliated-keyword
This test document correctly parses as a clock: ``` CLOCK: [2023-04-21 Fri 19:43] ``` This test document incorrectly parses as a paragraph: ``` #+NAME: foo CLOCK: [2023-04-21 Fri 19:43] ``` -- Tom Alexander pgp: https://fizz.buzz/pgp.asc
Re: Keyword becoming a paragraph based on optval
> Note that _affiliated keyword_ has an optional form of Ah, that was what I was missing, thanks! -- Tom Alexander pgp: https://fizz.buzz/pgp.asc
Re: Clarify that REST is not supported on the start TIME in a time-range timestamp.
> As for the problem with REST you raised, I am inclined to remove it from > syntax doc for the time being - it only creates more confusion, > unfortunately. Makes sense, thanks. Is there anything we do to mark patches as rejected? I removed [PATCH] from the subject line. -- Tom Alexander pgp: https://fizz.buzz/pgp.asc
Re: Lesser blocks allowing unescaped lines
Thank you! Makes sense. -- Tom Alexander pgp: https://fizz.buzz/pgp.asc
Re: Incorrect quantity of en-spaces
> This appears to be a special case, not documented on org-syntax page. Sounds good, thanks! -- Tom Alexander pgp: https://fizz.buzz/pgp.asc
Inconsistent text markup handling when double-nesting markers
I used the following test document: ``` __foo__ **foo** ``` I'd expect the two to behave the same but the first one parses as: ``` (paragraph "_" (subscript "foo") "__" ) ``` Whereas the second parses as: ``` (paragraph (bold (bold "foo" ) ) ) ``` This pattern happens in worg at [2]. Looking at the description for text markup in the syntax document[1], I don't see any reason the first wouldn't be parsed as an underline: 1. PRE: valid because it is the beginning of a line 2. MARKER: valid underscore 3. CONTENTS: valid. Series of objects from standard set includes both subscript and text markup, so regardless of how we parse the interior, it's valid. Also cannot begin or end with whitespace but there is no whitespace in the CONTENTS. 4. MARKER: valid underscore 5. POST: Only valid if we extend the underline to the 2nd underscore so it ends at the end of the line. But the 2nd line shows us that having copies of the marker inside the CONTENTS is fine so I see two possible expected parses of the CONTENTS: 4a. (underline "foo") 4b. ((subscript "foo") (plain-text "_")) I also ran the following test document to further prove that having copies of the marker inside the CONTENTS is fine: ``` *foo*bar* ``` which parses as (bold "foo*bar"). So the only way the top line would fail to parse as an underline is if it matched the first closing underscore as closing the underline, but that would be invalid because underscore is not a valid POST character and invalid copies of the closing marker are ignored as proven by both "**foo**" and "*foo*bar*". [1] https://orgmode.org/worg/org-syntax.html#Emphasis_Markers [2] https://git.sr.ht/~bzg/worg/tree/ba6cda890f200d428a5d68e819eef15b5306055f/org-contrib/babel/intro.org#L117 -- Tom Alexander pgp: https://fizz.buzz/pgp.asc
Re: COUNTER-SET for alphabetical ordered lists ignored for utf-8 exporter
Thanks!

> We aim to reduce config-dependent Org syntax in the long term.

That's wonderful news! Sometimes this stuff can really surprise you. For example, the structure of the document created by running `echo "1. foo\n 1.bar\n1.baz\n\t1.lorem"` changes based on the user's **tab-width**!!

If tab-width is less than 8 then this is:

```text
1. foo
   1. bar
      1. baz
   2. lorem
```

If tab-width is 8 then this is:

```text
1. foo
   1. bar
      1. baz
      2. lorem
```

and if tab-width is greater than 8 this is:

```text
1. foo
   1. bar
      1. baz
         1. lorem
```

Absolute madness! I always considered tab-width to be a personal aesthetic choice and not something that would functionally change how documents other people wrote will be parsed.

Idk if it's been discussed, but personally if I were given dictatorship over org-mode I would take all of these emacs variables that are defined outside of the document, and instead of having them influence org-mode directly, I would *only* use them to pre-populate values for in-buffer settings templates. For example, if a user had set `org-odd-levels-only` then I wouldn't have that impact ANY existing document they open, but if they open a new document then I would have it auto-insert `#+STARTUP: odd` at the top of the fresh document. Otherwise it seems like org-mode is unsuitable for multi-person collaboration without dictating the contents of everyone's `.emacs` file.

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc
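To make the last proposal concrete, here is a rough sketch of the idea, not an existing Org feature: a hook function that copies a global setting into a fresh, empty document as an in-buffer keyword. `my/org-seed-startup-keywords` is a hypothetical name; `org-odd-levels-only` and `#+STARTUP: odd` are the variable and keyword mentioned above.

```elisp
(require 'org)

(defun my/org-seed-startup-keywords ()
  "Mirror the user's global settings into an empty buffer as keywords.
Hypothetical helper: non-empty buffers (existing documents) are left alone."
  (when (and (zerop (buffer-size)) org-odd-levels-only)
    (insert "#+STARTUP: odd\n")))

;; Run whenever Org mode starts in a buffer; only brand-new files are empty.
(add-hook 'org-mode-hook #'my/org-seed-startup-keywords)
```

This only covers the "pre-populate" half. For the proposal to hold, Org itself would also have to stop consulting the global value when it opens existing documents, which this sketch does not attempt.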
Re: Clarification on blank lines following list items
Thank you so much for explaining all of that! There is some good information there I was missing. I think the most important bit I was missing is the post-blank stuff. I was only looking at begin->end but I think digging into the post-blank is what makes this consistent.

I've got 2 separate questions:

1. Is the following statement true? "Two elements can count the same character in their post-blank?" I am seeing dual-ownership of the post-blank in the examples below, but at the same time if I put a plain-list inside a footnote definition, the footnote definition ends up with sole custody of the post-blank.

2. I'm still not sure about some behavior I'm seeing. I think it would be easiest to see if we focus on exactly 1 blank line:

```
1. bar
2. baz
< this blank line here
ipsum
```

In this example, the blank line gets counted in the post-blank for the plain-list but not for the item:

```
plain-list: post-blank 1 | begin 1 end 16 | contents-begin 1 contents-end 15
  item: post-blank 0 | begin 1 end 8 | contents-begin 4 contents-end 8
    paragraph: post-blank 0 | begin 4 end 8 | contents-begin 4 contents-end 8
  item: post-blank 0 | begin 8 end 15 | contents-begin 11 contents-end 15
    paragraph: post-blank 0 | begin 11 end 15 | contents-begin 11 contents-end 15
paragraph: post-blank 0 | begin 16 end 22 | contents-begin 16 contents-end 22
```

but if we take that plain-list and nest it inside another plain-list:

```
1. foo
   1. bar
   2. baz
< this blank line here
2. lorem
ipsum
```

The blank line gets counted as a post-blank for both the item "foo" and the item "baz":

```
plain-list: post-blank 0 | begin 1 end 38 | contents-begin 1 contents-end 38
  item: post-blank 1 | begin 1 end 29 | contents-begin 4 contents-end 28
    paragraph: post-blank 0 | begin 4 end 8 | contents-begin 4 contents-end 8
    plain-list: post-blank 0 | begin 8 end 29 | contents-begin 8 contents-end 29
      item: post-blank 0 | begin 8 end 18 | contents-begin 14 contents-end 18
        paragraph: post-blank 0 | begin 14 end 18 | contents-begin 14 contents-end 18
      item: post-blank 1 | begin 18 end 29 | contents-begin 24 contents-end 28
        paragraph: post-blank 0 | begin 24 end 28 | contents-begin 24 contents-end 28
  item: post-blank 0 | begin 29 end 38 | contents-begin 32 contents-end 38
    paragraph: post-blank 0 | begin 32 end 38 | contents-begin 32 contents-end 38
paragraph: post-blank 0 | begin 38 end 44 | contents-begin 38 contents-end 44
```

Meaning the post-blank did this movement:

```
plain-list: post-blank 0
  item: post-blank 1           <---<<<-\
    paragraph: post-blank 0            |
    plain-list: post-blank 0       >>--|
      item: post-blank 0               |
        paragraph: post-blank 0        |
      item: post-blank 1       <---<---/
        paragraph: post-blank 0
  item: post-blank 0
    paragraph: post-blank 0
paragraph: post-blank 0
```

Question ---> So why is the item "baz" gaining a post-blank instead of the inner plain-list (bar baz) keeping that post-blank? I would expect it to instead be:

```
            plain-list: post-blank 0
              item: post-blank 1
                paragraph: post-blank 0
here ->         plain-list: post-blank 1
                  item: post-blank 0
                    paragraph: post-blank 0
not here ->       item: post-blank 0
                    paragraph: post-blank 0
              item: post-blank 0
                paragraph: post-blank 0
            paragraph: post-blank 0
```

I re-did both test cases using greater blocks and lesser blocks instead of paragraphs to make sure it wasn't that historical exception at the end of your email, and the post-blank behavior was exactly the same.

--
Tom Alexander
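For reference, tables like the ones above can be generated with a few lines of Emacs Lisp. This is a minimal sketch under the assumption that only plain lists, items and paragraphs are of interest; `my/org-dump-post-blank` is a hypothetical helper name, and the values it reports may vary with the Org version.

```elisp
(require 'org)
(require 'org-element)

(defun my/org-dump-post-blank (text)
  "Return (TYPE POST-BLANK BEGIN END) for each list, item and paragraph in TEXT.
Hypothetical helper for reproducing the tables above."
  (with-temp-buffer
    (insert text)
    ;; Turn on Org syntax without running user hooks or startup options.
    (let ((org-inhibit-startup t))
      (delay-mode-hooks (org-mode)))
    ;; Parse down to elements only; objects are not needed here.
    (org-element-map (org-element-parse-buffer 'element)
        '(plain-list item paragraph)
      (lambda (el)
        (list (org-element-type el)
              (org-element-property :post-blank el)
              (org-element-property :begin el)
              (org-element-property :end el))))))

;; The single-level example: two items, one blank line, then a paragraph.
(my/org-dump-post-blank "1. bar\n2. baz\n\nipsum\n")
```

Entries come back in document order with parents before their children, which matches the order of the listings in this message.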
Re: [BUG] inline src blocks in caption of not-inline src blocks do not execute
Confirming fixed. Thanks!

PS: A new issue arises, however, caused by 487f39efa68fa2d857f8d446d1c4b3a3b3e3f482, which is that it is now confusing to get the {{{results(=value=)}}} macro without verbatim, which is what :results drawer meant in that context. I expect that change will break things for a number of people beyond myself. A bit of reading the code revealed that setting :wrap t makes it possible to get {{{results(value)}}} again, but line 2647 [1] seems to indicate that drawer is an expected and valid value for inline :results.

1.
    ((or (member "drawer" result-params)
         ;; Stay backward compatible with <7.9.2
         (member "wrap" result-params))
     (goto-char beg) (when (org-at-table-p) (org-cycle))
     (funcall wrap ":results:" ":end:" 'no-escape nil
              "{{{results(" ")}}}"))
Re: [PATCH] ob-tangle.el: restore :tangle closure nil behavior
> My confusion about you patch comes from the fact that
>
> #+begin_src emacs-lisp :tangle (if (= 1 1) "yes")
> 2
> #+end_src
>
> works just fine on main.

It appears to work fine on main, but that is because what is actually happening behind the scenes is that in the test (unless (or (string= src-tfile "no") ...) ...) the actual comparison is (string= "(if (= 1 1) \"yes\")" "no"), which appears to work, but is not comparing the result of the closure, only its string value.

> I admit that I don't fully understand your use case.

I want to use a closure to conditionally control whether a block will tangle. If I hardcode :tangle no, then :var x=(error "oops") will not evaluate. The (error "oops") is a placeholder for a number of things that will result in an error if the condition for :tangle (when condition "file-name") is not satisfied.

> Something like (org-babel-get-heading-arg :tangle info/params)

I need to go to bed, because I definitely started on an implementation of that and then forgot about it as a potential solution. Yes, this seems like a better and clearer way to go about it.

> May you please elaborate?

Disregard, your suggestion clarified what you meant, and in that case, yes we can consolidate.
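The difference between comparing the closure's text and comparing its value can be seen in isolation. The sketch below is only an illustration: `my/tangle-disabled-p` is a hypothetical helper, and treating a nil result as "do not tangle" is an assumption made here, not something taken from ob-tangle.el. `org-babel-read` is the existing function that evaluates strings that look like Lisp forms.

```elisp
(require 'ob-core)

(defun my/tangle-disabled-p (raw &optional evaluate)
  "Return non-nil when the :tangle value RAW means \"do not tangle\".
Hypothetical helper.  With EVALUATE non-nil, RAW is first passed through
`org-babel-read', which evaluates strings that look like Lisp forms.
Assumption: a nil result is treated the same as \"no\"."
  (let ((value (if evaluate (org-babel-read raw) raw)))
    (or (null value) (equal value "no"))))

(let ((raw "(when (= 1 2) \"file-name\")"))
  (list (my/tangle-disabled-p raw)       ; looks at the unevaluated text
        (my/tangle-disabled-p raw t)))   ; looks at the closure's value
;; => (nil t)
```

The first check sees only a string that is not "no", so the block would be treated as tangling to a file; the second sees the nil that the form actually returns.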
Re: [BUG] inline src blocks in caption of not-inline src blocks do not execute
> It was a slip when the patch was applied.
> See the table of :results params vs. expected output that Nicolas
> provided in
> https://list.orgmode.org/orgmode/87zjbqrapy@nicolasgoaziou.fr/
>
> Or maybe I miss something.
>
> May you please explain more about {{{results(=value=)}}} problem?
> Isn't it sufficient to do src_elisp[:results verbatim]{'value}
> {{{results(=value=)}}}?

The issue is the opposite I think. Currently the default value (i.e. absent) for :results does not produce {{{results(value)}}} as suggested, and instead produces {{{results(=value=)}}}. This means that without :results drawer there isn't an obvious way to get {{{results(value)}}} because you can't e.g. use [:results default], or if a user overrides the default value for inline header args at file level then they have no way to reset to the default.

It looks like there used to be an option [:results wrap] which was deprecated a _very_ long time ago. [:results drawer] replaced that, and while there is some confusion about the name (because there is no actual drawer in an inline result) the behavior was meant to replace the old :results wrap behavior, where the name does make sense since {{{results(value)}}} does "wrap" the value.

I think that covers it, but let me know if something doesn't make sense.

Best,
Tom
Re: [PATCH] ob-tangle.el: restore :tangle closure nil behavior
On Wed, Aug 16, 2023 at 2:09 AM Ihor Radchenko wrote:
>
> Tom Gillespie writes:
>
> > Subject: [PATCH] ob-tangle.el: restore :tangle closure evaluation before
> >  eval
> >  info
> > This patch fixes a bug where header arguments like :tangle (or "no")
> > were treated as if they were tangling to a file named "(or \"no\")".
> > As a result, org-bable would call org-babel-get-src-block-info with
> > 'no-eval set to nil, causing parameters to be evaluated despite the
> > fact that when :tangle no or equivalent is set, the other parameters
> > should never be evaluated.
>
> What do you mean by "restore"? Were it evaluated in the past?
> May you please provide a reproducer?

Hrm. I think I may have mixed two commit lines. It is the case that :tangle closures used to work, but you are right, the historical behavior when tangling closures meant that all parameters were evaluated (tested with the block below in 27 and 28).

#+begin_src elisp :var value=(error "oops") :tangle (or "no")
value
#+end_src

My use case is that I have blocks that I want to tangle that set :var from e.g. the library of babel, which isn't always loaded, but which also is not required if :tangle is no.

> > -(defun org-babel-tangle--unbracketed-link (params)
> > +(defun org-babel-tangle--unbracketed-link (params
> > info-was-evaled)
>
> This is not acceptable. Taking care about evaluating INFO should be done
> in a single place instead of adding checks across the babel code. If we
> go the proposed way, I expect a number of bugs appearing when somebody
> forgets to change the eval check in some place.

I don't like the solution either. I see two potential alternatives.

1. change the structure of the info list to indicate whether it has already been evaluated
2. always call org-babel-read on (cdr (assq :tangle params)) even if it may already have been evaluated which can lead to some unexpected and potentially nasty results.

I don't think we can consolidate evaluating parameters into one place in the general case because there are order dependencies where a setting in one param header should mask others (as is the case here). In principle we could consolidate them, but I think that would add significant complexity because we would have to push all the logic for handling whether a given ordering restriction applies inside that location. e.g. if I have a block set :eval (if ev "yes" "no") it would be bad form to evaluate the parameters before determining whether the :eval closure evaluates to "yes" or "no". Should that go inside org-process-params, or should it be handled locally by e.g. org-babel-tangle and org-babel-execute-src-block separately?

Thoughts?
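The ordering dependency described above can be sketched on a plain alist. This is a simplified illustration, not how ob-core.el processes header arguments: `my/read-params-gated` is a hypothetical helper, the alist values are whole Lisp forms stored as strings, and only `org-babel-read` is an existing Org function.

```elisp
(require 'ob-core)

(defun my/read-params-gated (params gate)
  "Evaluate the values in alist PARAMS, but look at the GATE key first.
Hypothetical helper.  If GATE evaluates to \"no\", return only that pair
and leave every other value unevaluated."
  (let ((gate-value (org-babel-read (cdr (assq gate params)))))
    (if (equal gate-value "no")
        (list (cons gate gate-value))
      (mapcar (lambda (pair)
                (cons (car pair)
                      ;; Reuse the gate's value so it is evaluated only once.
                      (if (eq (car pair) gate)
                          gate-value
                        (org-babel-read (cdr pair)))))
              params))))

;; The :var form is never evaluated because :tangle resolves to "no".
(my/read-params-gated '((:var . "(error \"oops\")")
                        (:tangle . "(or \"no\")"))
                      :tangle)
;; => ((:tangle . "no"))
```

Whether such a gate belongs in one central place or in each caller is exactly the open question; the sketch only shows that the gating parameter has to be read before the rest.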